Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock on Provider init #1299

Open
chrfwow opened this issue Jan 23, 2025 · 3 comments
Open

Deadlock on Provider init #1299

chrfwow opened this issue Jan 23, 2025 · 3 comments
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed

Comments

@chrfwow
Copy link
Contributor

chrfwow commented Jan 23, 2025

In my newest PR, which replaces a busy wait with a wait/notify construct, I discovered that one can run into a deadlock when emitting events during the initialization.
In more detail, the thread that initializes a FlagdProvider starts the initialization of the flagResolver, and then waits for an event announcing that the provider is ready.
When this ready event is received in the FlagdProvider, it calls EventProvider#emitProviderReady(…), which then deadlocks.

Steps to Reproduce

I wrote a very basic provider, which always deadlocks on initialization. It can be found here.
I made my original discovery in this PR. It can be triggered by running the e2e tests. Notice that the test will keep printing

[Thread-228] WARN dev.openfeature.contrib.providers.flagd.resolver.process.storage.FlagStore - Failed to convey OK satus, queue is full
[Thread-228] WARN dev.openfeature.contrib.providers.flagd.resolver.process.model.FlagParser - Invalid flag configuration: $.flags.equal-greater-lesser-version-flag: must be valid to one and only one schema, but 0 are valid
[Thread-228] WARN dev.openfeature.contrib.providers.flagd.resolver.process.storage.FlagStore - Failed to convey OK satus, queue is full
[Thread-228] WARN dev.openfeature.contrib.providers.flagd.resolver.process.model.FlagParser - Invalid flag configuration: $.flags.equal-greater-lesser-version-flag: must be valid to one and only one schema, but 0 are valid

to the console. The warning of the invalid flag config can be ignored, this is expected in the test. The warning Failed to convey OK satus, queue is full, however, is an indicator of the problem, as the thread that is supposed to drain this queue is deadlocked.

Expected Behavior

The EventProvider should never deadlock, no matter under which conditions the emit (or emitProviderReady) method is called. When I wrap the call to the emitProviderReady()method inside the onReady() method in the FlagdProvider in a new thread, the deadlock is prevented.

Possible Fix

EventProvider#emit(…) should not directly call this.onEmit.accept(this, event, details);, instead it should store the call in some thread safe queue, which is drained by another thread.

@toddbaert
Copy link
Member

toddbaert commented Jan 27, 2025

I think we should consider this an improvement or even a bug and fix it as you describe. If you are willing, please open a PR with a fix (otherwise I'll just add some labels to this issue to make it more visible to the community).

Your PR could also include a very simple deadlocking test provider that would cause a test timeout of there was a regression in your improvement.

@toddbaert toddbaert added bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed labels Jan 30, 2025
@toddbaert
Copy link
Member

I've added labels. This is still open to anyone.

@sideshowcoder
Copy link

I got started on this today, but not sure I 100% grasped the reason for the issue, yet. But porting the test case provided over seemed like a useful start, so sharing as a Draft.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants