gh-129236: Use stackpointer in free threaded GC #129240
The stack pointers in interpreter frames are nearly always valid now, so use them when visiting each thread's frame. For now, don't collect objects with deferred references in the rare case that we see a frame with a NULL stack pointer.
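A minimal sketch of the traversal described above, not the code from this PR. The `toy_frame` struct, `visit_thread_stack`, and `count_visit` are hypothetical stand-ins for illustration; CPython's real `_PyInterpreterFrame` and `_PyStackRef` types are more involved. The walk covers the slots from the start of `localsplus` up to `stackpointer`, and reports frames whose stack pointer is NULL so that a caller could fall back to conservative behavior.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Hypothetical, simplified frame; not CPython's real layout. */
typedef struct toy_frame {
    struct toy_frame *previous;    /* next older frame, or NULL */
    void **stackpointer;           /* one past the last live slot, or NULL */
    void *localsplus[TOY_SLOTS];   /* locals followed by the evaluation stack */
} toy_frame;

typedef void (*slot_visitor)(void *obj, void *arg);

/* Visit every live, non-NULL slot of every frame in the chain.
 * Returns the number of frames skipped because stackpointer was NULL. */
static int
visit_thread_stack(toy_frame *top, slot_visitor visit, void *arg)
{
    int skipped = 0;
    for (toy_frame *f = top; f != NULL; f = f->previous) {
        void **sp = f->stackpointer;
        if (sp == NULL) {
            /* The extent of the live stack is unknown: skip this frame
             * and let the caller decide how conservative to be. */
            skipped++;
            continue;
        }
        assert(sp >= f->localsplus && sp <= f->localsplus + TOY_SLOTS);
        for (void **slot = f->localsplus; slot < sp; slot++) {
            if (*slot != NULL) {
                visit(*slot, arg);
            }
        }
    }
    return skipped;
}

/* Example visitor: counts how many slots were visited. */
static void
count_visit(void *obj, void *arg)
{
    (void)obj;
    ++*(int *)arg;
}
```

In this model, a non-zero return value plays the role of "we saw a frame with a NULL stack pointer", which is the case where the description says objects with deferred references are not collected.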
{
_Py_FOR_EACH_TSTATE_BEGIN(interp, p) {
for (_PyInterpreterFrame *f = p->current_frame; f != NULL; f = f->previous) {
PyObject *executable = PyStackRef_AsPyObjectBorrow(f->f_executable);
if (executable == NULL || !PyCode_Check(executable)) {
if (f->owner >= FRAME_OWNED_BY_INTERPRETER) {
I also updated this to match the condition in Python/gc.c.
Benchmark: 1% faster. The most extreme results are:
The impactful part of this change is that stack marking (from GH-128807) is much more effective now because we are also looking at non-deferred references. From some separate benchmarking, I think that skipping the extra frame initialization has a much smaller effect (close to 0.1%).
Python/gc_free_threading.c
Outdated
_PyStackRef *top = f->stackpointer;
if (top == NULL) {
// GH-129236: The stackpointer may be NULL. Skip this frame
Do we know the cases when stackpointer can be NULL here?
I see it can be because of PyStackRef_CLOSE, perhaps add that here too or hint to check gc_visit_thread_stacks_mark_alive?
I've updated the comment.
This looks okay to me (although I'm not very familiar with how the stack currently works in ceval). Perhaps this comment can be expanded, since it took me a while to convince myself that it should work.
As I understand: the object can have one or more deferred references to it that are not accounted for in the refcnt value and will also not be traced by tp_traverse. By adding one to the "gc refs" value, we prevent the GC from incorrectly concluding this object is part of a garbage cycle (all other refs to it are accounted for).
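To make that reasoning concrete, here is a toy model, not CPython's implementation. The `toy_obj` struct and the functions `compute_gc_refs` and `externally_reachable` are hypothetical names invented for this sketch, and it assumes every outgoing reference points at another object in the tracked set. It copies each refcount into a scratch "gc refs" value, subtracts the references found by traversal, and then adds one for each object seen through a deferred reference on a thread's stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified object; not CPython's PyObject. */
typedef struct toy_obj {
    int refcnt;             /* counted references only; deferred refs excluded */
    int gc_refs;            /* scratch value used during a collection */
    struct toy_obj *ref;    /* one outgoing reference, as traversal would report */
} toy_obj;

static void
compute_gc_refs(toy_obj **objs, size_t n, toy_obj **stack_refs, size_t n_stack)
{
    /* Step 1: start from the reference count. */
    for (size_t i = 0; i < n; i++) {
        objs[i]->gc_refs = objs[i]->refcnt;
    }
    /* Step 2: subtract references that come from other tracked objects. */
    for (size_t i = 0; i < n; i++) {
        if (objs[i]->ref != NULL) {
            objs[i]->ref->gc_refs--;
        }
    }
    /* Step 3: a deferred reference on the stack is not in refcnt and is
     * not reported by traversal, so add one to keep the object alive. */
    for (size_t j = 0; j < n_stack; j++) {
        stack_refs[j]->gc_refs++;
    }
}

/* An object with gc_refs > 0 is referenced from outside the tracked set. */
static bool
externally_reachable(const toy_obj *o)
{
    return o->gc_refs > 0;
}
```

For a two-object cycle where each object's only counted reference comes from the other, both end up with a "gc refs" value of zero and look like garbage. If one of them is also held by a deferred reference on the stack, the increment in step 3 leaves it at one, so it is treated as a root. A real collector would then also mark everything reachable from that root; this sketch stops at identifying roots.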
Co-authored-by: Kumar Aditya <[email protected]>
@nascheme - I've updated the comment and logic in
@nascheme, would you please take another look at this?
I'm not sure about the expectation that
This looks correct to me, provided (as Neil said) the stack pointer is either always correct or NULL when the gc runs.
_PyInterpreterFrame.stackpointer in free threaded GC when traversing threads' stacks #129236