Description
Modify the /api/v2/monitor/health response to surface the state of every instance of any horizontally scaled / replicated component. Given that all the components described in the current response are horizontally scalable, it would be much more useful to return a list of state payloads for each component. For example, in a high-availability setup running 2 schedulers, it would be useful to see the status information for both schedulers in the response.
A very rough outline of what the proposed API response might look like is below:
{
  "metadatabase": {
    "status": "string"
  },
  "scheduler": [
    {
      "hostname": "scheduler-1",
      "status": "healthy",
      "latest_scheduler_heartbeat": "scheduler-1 latest heartbeat here"
    },
    {
      "hostname": "scheduler-2",
      "status": "healthy",
      "latest_scheduler_heartbeat": "scheduler-2 latest heartbeat here"
    }
  ],
  "triggerer": [
    ...
  ],
  "dag_processor": [
    ...
  ]
}
Concretely, I believe the most noteworthy changes necessary for this would be:
- Adding a hostname field to the (Scheduler|Triggerer|DagProcessor)InfoResponse models in airflow.api_fastapi.core_api.datamodels.monitor
- Changing the fields of the HealthInfoResponse model to be a list of the corresponding per-field models.
- Adding an index on the hostname column in the jobs table (this particular change would benefit a number of other things unrelated to this ticket).
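To make the first two changes more concrete, here is a minimal sketch of the reshaped models. It uses standard-library dataclasses as a stand-in for the real models so that it runs on its own; it is not the actual Airflow code. The latest_triggerer_heartbeat and latest_dag_processor_heartbeat names are assumed from the latest_*_heartbeat pattern, BaseInfoResponse is assumed as the shared base, and unhealthy_hosts is a hypothetical helper added only for illustration.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field


@dataclass
class BaseInfoResponse:
    # Shared base (assumed): every component reports a status.
    status: str | None = None


@dataclass
class SchedulerInfoResponse(BaseInfoResponse):
    hostname: str | None = None  # proposed new field
    latest_scheduler_heartbeat: str | None = None


@dataclass
class TriggererInfoResponse(BaseInfoResponse):
    hostname: str | None = None  # proposed new field
    latest_triggerer_heartbeat: str | None = None  # name assumed


@dataclass
class DagProcessorInfoResponse(BaseInfoResponse):
    hostname: str | None = None  # proposed new field
    latest_dag_processor_heartbeat: str | None = None  # name assumed


@dataclass
class HealthInfoResponse:
    metadatabase: BaseInfoResponse
    # Proposed change: one entry per running instance instead of a single object.
    scheduler: list[SchedulerInfoResponse] = field(default_factory=list)
    triggerer: list[TriggererInfoResponse] = field(default_factory=list)
    dag_processor: list[DagProcessorInfoResponse] = field(default_factory=list)


def unhealthy_hosts(replicas: list) -> list:
    """Hypothetical helper: hostnames of replicas whose status is not 'healthy'."""
    return [r.hostname for r in replicas if r.status != "healthy"]


# Example: two schedulers, one of which is unhealthy.
health = HealthInfoResponse(
    metadatabase=BaseInfoResponse(status="healthy"),
    scheduler=[
        SchedulerInfoResponse(status="healthy", hostname="scheduler-1"),
        SchedulerInfoResponse(status="unhealthy", hostname="scheduler-2"),
    ],
)
payload = asdict(health)  # nested dicts/lists, shaped like the outline above
```

With the list shape, a consumer can tell which replica is failing (here, unhealthy_hosts(health.scheduler) picks out scheduler-2) rather than seeing a single aggregated status.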
Use case/motivation
Currently, as of version 3.1.7, responses from the /api/v2/monitor/health endpoint offer a single status and latest_*_heartbeat field per Airflow component. This is fine for any Airflow deployment which has a single scheduler, a single triggerer, and so on. However, the information from this endpoint does not provide reliable insight into the system's health if one or more of those components are horizontally scaled.
Related issues
Tangentially, the feature proposed here could make it easier to implement #17191, but, to be clear, that ticket describes a different feature.
Are you willing to submit a PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct