Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-115999: Enable free-threaded specialization of LOAD_CONST #129365

Merged
merged 2 commits into from
Jan 29, 2025

Conversation

Yhg1s
Copy link
Member

@Yhg1s Yhg1s commented Jan 27, 2025

Enable free-threaded specialization of LOAD_CONST, which is an unconditional specialization without stats bookkeeping, so it's a simple opcode replace. (It could even be a relaxed store if we didn't have to worry about instrumentation.)

@Yhg1s
Copy link
Member Author

Yhg1s commented Jan 27, 2025

I'm not sure why LOAD_CONST was already checked off on the issue page. The test fails for free-threaded builds without the accompanying bytecode change.

@mpage
Copy link
Contributor

mpage commented Jan 27, 2025

I'm not sure why LOAD_CONST was already checked off on the issue page.

I think specializing LOAD_CONST during bytecode execution was a recent change: #128708. Previously, it happened when the bytecode was quickened during creation of the code object.

@Yhg1s Yhg1s marked this pull request as ready for review January 27, 2025 19:30
@Yhg1s Yhg1s requested a review from markshannon as a code owner January 27, 2025 19:30
@Yhg1s
Copy link
Member Author

Yhg1s commented Jan 27, 2025

Looks like about 1% gain from this, but given noise levels it could be a wash. Still, considering way more things are currently immortal in the free-threaded build it's probably a bigger win than in the regular build :D

https://github.com/facebookexperimental/free-threading-benchmarking/blob/main/results/bm-20250127-3.14.0a4%2B-c92089c-NOGIL/bm-20250127-vultr-x86_64-Yhg1s-more_spec-3.14.0a4%2B-c92089c-vs-base.svg

uint8_t expected = LOAD_CONST;
_Py_atomic_compare_exchange_uint8(
&this_instr->op.code, &expected,
_Py_IsImmortal(obj) ? LOAD_CONST_IMMORTAL : LOAD_CONST_MORTAL);
Copy link
Member

@Fidget-Spinner Fidget-Spinner Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want a bigger win, perhaps you could allow deferred references as well. They behave the same as immortal objects if they're live on the stack.

Considering many LOAD_CONST are deferred objects, that should be a nice win in perf.

That can come in another PR though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's probably a good idea. Running some tests on the testsuite, about 75% of all LOAD_CONST invocations involved immortal objects, 25% were deferred, and the remainder is less than 1% (in the free-threaded build. Not nearly as many immortal objects in a regular build). I suspect we'll get fewer immortal objects over time, and most of them will probably be deferred instead, I guess?

Now the question is: do I add PyStackRef_FromPyObjectDeferred(), or do I change PyStackRef_FromPyObjectImmortal() to accept deferred objects as well? :)

Copy link
Contributor

@mpage mpage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@Yhg1s Yhg1s merged commit 5c930a2 into python:main Jan 29, 2025
70 checks passed
@Yhg1s Yhg1s deleted the more-spec branch January 29, 2025 00:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants