Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support free-threaded Python 3.13 #925

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

ngoldbaum
Copy link
Contributor

Towards fixing #913.

  • Adds pytest-run-parallel as a test dependency. This does a good job of detecting thread-unsafe use of global state but doesn't catch thread safety issues due to sharing mutable state stored in objects. I marked one test as thread-unsafe because it uses the warnings module, which isn't thread-safe.

  • Declares that the rust PyO3 module exprted in _bcrypt doesn't use the GIL. It looks like the rust code exports python functions and doesn't define any objects with state. Am I missing something?

  • Sets up free-threaded CI using quansight-labs/setup-python.

It looks like the existing tests don't use threading at all. Should I add some multithreaded tests? Or is pytest-run-parallel executing the existing tests in parallel sufficient?

After this I'll look at updating the wheel-building and uploading 313t wheels.

@ngoldbaum ngoldbaum force-pushed the support-free-threading branch from dcb6071 to 1d8f655 Compare November 25, 2024 19:39
@ngoldbaum
Copy link
Contributor Author

ngoldbaum commented Nov 25, 2024

It looks like the lack of Python3.7 support in pytest-run-parallel is a little annoying to deal with in tox (or at least I don't see how to deal with it straightforwardly given that we need to make a code modification to the tests).

Let me see how hard it is to support Python 3.7 in pytest-run-parallel.

@alex
Copy link
Member

alex commented Nov 25, 2024 via email

@reaperhulk
Copy link
Member

I think we probably do want some explicit threading testing separate from the use of threading in the test harness itself. A quick lazy glance at the stats suggests we're ~5-7% Python 3.7 downloads per day. On pyca/cryptography that'd be too high, but this project sees less change so I think I'd be okay with dropping it.

@ngoldbaum
Copy link
Contributor Author

I think we probably do want some explicit threading testing separate from the use of threading in the test harness itself.

Can you share a little more what you had in mind here? Maybe several threads hashing and checking passwords based on shared bytes or hashes? Is there any state stored anywhere? It looks like everything is a pure function.

@ngoldbaum
Copy link
Contributor Author

I think I'd be okay with dropping it.

Great, I'll look at dropping Python 3.7 in a separate PR.

@reaperhulk
Copy link
Member

Can you share a little more what you had in mind here? Maybe several threads hashing and checking passwords based on shared bytes or hashes? Is there any state stored anywhere? It looks like everything is a pure function.

Yeah there's no state, but your proposed idea is what I'd do. Just mimicking what a multi-threaded workload would look like with hash gen, verification, and kdf operations.

@ngoldbaum ngoldbaum force-pushed the support-free-threading branch from fed9d38 to ba2bcd4 Compare November 26, 2024 19:37
@ngoldbaum
Copy link
Contributor Author

I added a test that uses threading, let me know if you have any suggestions.

It also looks like PyO3 0.23.2 might have broken setuptools-rust on Windows?

@ngoldbaum
Copy link
Contributor Author

@davidhewitt
Copy link

I think PyO3/pyo3#4733 might fix it, want to try setting that git branch here and see if resolves?

@ngoldbaum ngoldbaum force-pushed the support-free-threading branch from 994a54e to 9187561 Compare December 6, 2024 17:53
@ngoldbaum
Copy link
Contributor Author

@reaperhulk it looks like all the upstream issues have been sorted out and this is ready now.

self.id_ = id_
self.salt = bcrypt.gensalt(4)
self.hash_ = bcrypt.hashpw(pw, self.salt)
self.key = bcrypt.kdf(pw, self.salt, 32, 50)
Copy link
Contributor Author

@ngoldbaum ngoldbaum Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose these parameters to make this run in a reasonable time. Still, it's a very slow test compared with the rest of the test suite.

@ngoldbaum
Copy link
Contributor Author

ping @reaperhulk - any chance you can look at this PR again?

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last thing to consider is when we want to add this to the wheel builder?

@@ -494,3 +498,41 @@ def test_2a_wraparound_bug():
)
== b"$2a$04$R1lJ2gkNaoPGdafE.H.16.1MKHPvmKwryeulRe225LKProWYwt9Oi"
)


@pytest.mark.thread_unsafe()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this thread unsafe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to use pytest.mark.parallel_threads(1) instead of thread_unsafe(). They're equivalent but the former communicates better what I want. Since the test is already spawning a bunch of threads and is slow it seemed incorrect to me to also run it under pytest-run-parallel.

tests/test_bcrypt.py Show resolved Hide resolved
tests/test_bcrypt.py Show resolved Hide resolved
Comment on lines 519 to 525
# use UUIDs as both ID and passwords
num_users = 50
ids = [get_id() for _ in range(num_users)]
pws = {id_: get_id() for id_, _ in zip(ids, range(num_users))}

user_creator = ThreadPoolExecutor(max_workers=4)

def create_user(id_, pw):
return id_, User(id_, pw)

creator_futures = [
user_creator.submit(create_user, id_, pw) for id_, pw in pws.items()
]

users = [future.result() for future in creator_futures]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like there's way too much going on relative to what we're actually trying to test. At base we're trying to to run all the bcrypt functions concurrently.

So I think this can simplify to:

futures = [user_creator.submit(create_user, uuid.uuid4()) for _ in range(num_users)]

remoe the unused id field from User, and then just use the attributes, rather than the pws dict?

@ngoldbaum
Copy link
Contributor Author

add this to the wheel builder?

I was planning to do it in a followup. Happy to do it here if you'd prefer to have it all ready in one PR.

@alex
Copy link
Member

alex commented Jan 14, 2025

Seperate PR for wheel builder is fine/better.

@ngoldbaum ngoldbaum force-pushed the support-free-threading branch from c65835f to 1a26622 Compare January 14, 2025 16:49
@ngoldbaum ngoldbaum requested a review from alex January 14, 2025 16:57
user_creator = ThreadPoolExecutor(max_workers=4)
pws = [uuid.uuid4().bytes for _ in range(num_users)]

futures = [user_creator.submit(User, pw) for pw in pws]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need a user class at all, just a function that returns a tuple of values.

def check(self, pw):
return bcrypt.checkpw(pw, self.hash_)

# use UUIDs as both ID and passwords
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment no longer makes a ton of sense

Comment on lines +504 to +505
# this test spawns threads and is slow, so don't run it in many threads
@pytest.mark.parallel_threads(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be honest, I'm really struggling to understand the pytest-run-parallel docs on what this marker does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I'm setting --parallel-threads=10 in the call to pytest in tox. WIth that setting, for each test, pytest-run-parallel creates a thread pool that, by default, simultaneously executes every test in the test suite in 10 threads.

pytest-run-parallel also additionally lets you repeatedly run the test for a number of iterations in each spawned thread. By default it runs the test once in each thread. I'm not using that feature in this PR.

The pytest.mark.parallel_threads(1) decorator overrides the global default we're setting in the tox call to pytest and tells pytest-run-parallel to spawn a single thread for this test. I'm also not overriding the default iterations, so it runs the test once, just like if pytest-run-parallel wasn't turned on in the test session.

I think the pytest-run-parallel docs might be clearer if the readme included some example code using concurrent.futures.ThreadPoolExecutor that is equivalent to what pytest-run-parallel is doing. Thanks for the feedback that the docs are unclear.

passenv =
RUSTUP_HOME
commands =
coverage run -m pytest --strict-markers {posargs}
coverage run -m pytest --parallel-threads=10 --strict-markers {posargs}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 10? Is there any way to just get ncpus?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants