Color code viable/strict blocking jobs differently on HUD #6239
Conversation
}

// Source of truth for these jobs is in https://github.com/pytorch/pytorch/blob/main/.github/workflows/update-viablestrict.yml#L26
const viablestrict_blocking_jobs_patterns = [
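For context, a minimal sketch of what this hardcoded list might contain, assuming the required checks named in update-viablestrict.yml (pull, trunk, lint, linux-binary); the exact regexes here are illustrative, not the merged code:

```typescript
// Illustrative only: regexes matching the job prefixes that gate viable/strict
// for pytorch/pytorch, per .github/workflows/update-viablestrict.yml.
const viablestrict_blocking_jobs_patterns = [
  /^pull/i,
  /^trunk/i,
  /^lint/i,
  /^linux-binary/i,
];
```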
This logic has a small issue: it wrongly marks memory leak check jobs as viable/strict blocking. Also, this would only work for PyTorch; other repos have different logic there, e.g. https://github.com/pytorch/executorch/blob/main/.github/workflows/update-viablestrict.yml#L23. It would be nice to have this for all repos, but that might be tricky to implement while this list is hardcoded here, so maybe just limit this feature to PyTorch for a start?
There are some pull jobs in that memory leak check, oddly enough. Any idea how those are triggered?
The closest match for those are these distributed jobs, but if those are indeed the jobs running, then the memory leak check condition is buried somewhere deep. @clee2000 might know...
Oh, memory leak check jobs are just regular pull and trunk jobs running once per day with the memory leak check turned on. They are triggered by this cron schedule https://github.com/pytorch/pytorch/blob/main/.github/workflows/pull.yml#L14, which is picked up and set by https://github.com/pytorch/pytorch/blob/main/.github/scripts/filter_test_configs.py#L603-L606
`mem_leak_check` and its cousin `rerun_disable_test` jobs are supplementary and should not block the viable/strict update. I vaguely remember that we had logic to exclude them back in the Rockset days. I don't see people paying attention to `mem_leak_check` jobs at all; maybe we should kill it @clee2000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested the query on ClickHouse, giving it as input a commit that failed the mem_leak_check on pull.
Those failed commits do indeed show up in the results.
While I agree that this is not behavior we want, I think today memleak checks that affect pull requests do indeed become viable/strict blocking. It's generally not an issue for us though because only one commit a day will have that check run.
But I'll change this feature to only work on pytorch/pytorch for now
> While I agree that this is not behavior we want, I think today memleak checks that affect pull requests do indeed become viable/strict blocking. It's generally not an issue for us though because only one commit a day will have that check run.
>
> But I'll change this feature to only work on pytorch/pytorch for now
Sounds good! I’m pretty sure we ignored mem leak check in the past, so it’s a regression (likely due to CH migration). I could take a look
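As a rough illustration of the exclusion being discussed (a sketch only; the helper name and the exact patterns are assumptions, not the HUD's actual code), the supplementary jobs could be filtered out before the blocking patterns are applied:

```typescript
// Sketch: treat supplementary jobs (mem_leak_check / rerun_disable_test) as
// non-blocking even when their names match the viable/strict patterns.
const supplementary_job_patterns = [/mem_leak_check/i, /rerun_disable_test/i];

function isViableStrictBlocking(jobName: string, patterns: RegExp[]): boolean {
  if (supplementary_job_patterns.some((p) => p.test(jobName))) {
    return false;
  }
  return patterns.some((p) => p.test(jobName));
}
```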
  [key: string]: RegExp[];
};

// TODO: Move this to a config file
https://github.com/pytorch/pytorch/blob/main/.github/pytorch-probot.yml seems to be a natural place for this, but I agree that we could do it later
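Until it moves to a config file, a minimal sketch of a per-repo map matching the `{ [key: string]: RegExp[] }` shape above, populated only for pytorch/pytorch for now (the key and regexes are illustrative assumptions):

```typescript
// Sketch: blocking-job regexes keyed by "owner/repo"; unknown repos get none.
const viablestrictBlockingJobsPatterns: { [key: string]: RegExp[] } = {
  "pytorch/pytorch": [/^pull/i, /^trunk/i, /^lint/i, /^linux-binary/i],
};

function blockingPatternsForRepo(repoFullName: string): RegExp[] {
  return viablestrictBlockingJobsPatterns[repoFullName] ?? [];
}
```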
LGTM! Let's also do a quick post about this to let folks know about the change. Any thoughts?
@ZainRizvi, one minor note: to improve UX, it would be helpful to add information to the tooltip about what this new border coding means, for people who are not familiar with the change.
A small experiment. The idea is to make it easier to see when failing jobs are blocking viable/strict upgrades.
This would have helped notice:
Adds a thin border around grouped jobs on the hud home page when there's a viable/strict blocking job failing inside them.
I didn't add a border to the individual jobs (the non-grouped view), since it seemed like it might feel more noisy than helpful. But I can add it if folks feel otherwise.
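For illustration, a minimal sketch of what the border (and the tooltip suggested above) could look like on a grouped cell; this is not the actual HUD component, and the props, helper, and styling are assumptions:

```tsx
import React from "react";

// Sketch: outline a grouped-job cell when a failing job inside it blocks
// viable/strict, and explain the border via the tooltip.
function GroupedJobCell(props: {
  failedJobNames: string[];
  isViableStrictBlocking: (jobName: string) => boolean;
  children?: React.ReactNode;
}) {
  const blocksViableStrict = props.failedJobNames.some((name) =>
    props.isViableStrictBlocking(name)
  );
  return (
    <div
      style={blocksViableStrict ? { border: "1px solid red" } : undefined}
      title={
        blocksViableStrict
          ? "A failing job in this group is blocking the viable/strict upgrade"
          : undefined
      }
    >
      {/* existing grouped-job contents render here */}
      {props.children}
    </div>
  );
}

export default GroupedJobCell;
```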