At the 2016 tc-worker workweek the Taskcluster Platform team laid out our core design principles. The four key principles are:
- Self-service
- Robustness
- Enable rapid change
- Community friendliness
These are all under an umbrella we call Getting Things Built™. None of our work matters unless it works! Read further for a slightly expanded list of principles!
- Task Isolation
- API-driven UI Tools
- Extensibility
- Granular Security
- Clearly-defined interfaces
- Separation of concerns
- Scalability
- Correctness
- Minimal Self-hosting
  - Use managed services, e.g. S3, Azure Storage
  - Don't self-host mutable services
- Stateless services
- 12-factor applications
- Agility
- Clearly-defined interfaces
- Microservices
- Separation of concerns
- Transparency
- Granular Security
- Public by Default
- Self-Service
- Changes are made in an open fashion, considering all (real and potential) users of the platform.
  - In particular, we strive to implement general solutions even when a single user has a very specific requirement. For example, although Firefox CI is the dominant user of Taskcluster, implementations of features are never Firefox-specific.
Here are a few practical principles we follow when developing and reviewing changes to Taskcluster:
- Services do not share code - no service ever `require`s (or `import`s) code from another service. When necessary, common code is factored out into libraries (under `libraries/` and in packages named `taskcluster-lib-...`).
- Services are tiered - the "platform" services are interdependent and not expected to work without each other. For example, the queue service will fail if the auth service is down. The "core" and "integration" services depend on platform services, but the reverse is not the case. For example, the secrets service will fail if the queue service is down, but the queue service will continue running when the secrets service is down.
- Services own their database tables - each database table belongs to a single service, which has write access to that table. In general, other services needing that data should prefer to get it via "normal" REST API calls. In limited cases where that data is required for database-level operations, a service can be granted read-only access to another service's tables. For example, as of this writing the worker-manager service can read the queue's `queue_workers` table to determine whether a worker is quarantined. Such cross-service data sharing should be minimal.
- Database access has lots of rules - special care is required around database design. Important user-facing features such as downtime-free upgrades depend on adherence to these rules. See the taskcluster-lib-postgres and db documentation for details.
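
The tiering rule above can be illustrated with a small dependency check. This is only a sketch in Python, not code from Taskcluster: the `TIER` and `ALLOWED` tables and the `tier_violations` helper are hypothetical, placing the secrets service in the "core" tier is an assumption, and so is the simplification that non-platform services depend only on platform services.

```python
# Hypothetical tier assignment for the three services named in the text.
# Putting "secrets" in the "core" tier is an assumption of this sketch.
TIER = {
    "auth": "platform",
    "queue": "platform",
    "secrets": "core",
}

# Tiers that a service in a given tier may depend on (assumed simplification:
# everything may depend on platform services, and nothing else).
ALLOWED = {
    "platform": {"platform"},
    "core": {"platform"},
    "integration": {"platform"},
}


def tier_violations(dependencies):
    """Return (service, dependency) pairs that break the tiering rule."""
    violations = []
    for service, deps in dependencies.items():
        for dep in deps:
            if TIER[dep] not in ALLOWED[TIER[service]]:
                violations.append((service, dep))
    return violations


# Platform services may depend on each other; secrets may depend on platform.
ok = {"queue": ["auth"], "auth": ["queue"], "secrets": ["queue", "auth"]}

# A platform service depending on a non-platform service breaks the rule.
bad = {"queue": ["secrets"]}

print(tier_violations(ok))
print(tier_violations(bad))
```

With these assumed tiers, the first dependency map passes and the second reports the queue's dependency on secrets, matching the example in which the queue keeps running while the secrets service is down.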