@rmosolgo rmosolgo commented Dec 5, 2025

Fixes #5463

rmosolgo commented Dec 5, 2025

Sad, it seems like the test suite hangs now...

@rmosolgo force-pushed the async-dataloader-deadlock-fix branch from b04fa42 to 983d1fb on December 5, 2025 18:26

rmosolgo commented Dec 5, 2025

OK, I think the problem was the Fiber / Task nesting. I reworked it so that each pass over the set of pending dataloader fibers spins up a new job. This avoids any mixing between Async primitives and plain Ruby Fibers. It broke some of the fiber-counting tests, though -- I'll sort them out in the morning.

@iyotov-havelock, could you try this branch in your app's test suite and in development? You can bundle it with:

gem "graphql", github: "rmosolgo/graphql-ruby", ref: "async-dataloader-deadlock-fix"

Please let me know how it goes!

@rmosolgo changed the title from "Use top-level Sync instead of plain Fiber for AsyncDataloader" to "Don't nest Async primitives inside plain Ruby fibers" on Dec 6, 2025

@iyotov-havelock

Hey @rmosolgo, thanks for checking this issue out.

I tested it with my 'real' app in development; sadly, there's still an issue there:
[screenshot of the error; image not preserved]

I also tested it with the demo app (https://github.com/iyotov-havelock/async-dataloader-issue). The "No live threads left. Deadlock?" error no longer appears, but the GraphQL response is now {"errors":[{"message":"undefined method 'nfields' for nil"}]}. That error does not occur when using GraphQL::Dataloader instead of GraphQL::Dataloader::AsyncDataloader.
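
For readers following along, the comparison above comes down to which dataloader the schema installs. A minimal configuration sketch, assuming a schema class named MySchema (a hypothetical name; the demo app's actual schema class is not shown in this thread) and that the graphql and async gems are in the bundle:

```ruby
require "graphql"

# Hypothetical schema class, for illustration only.
class MySchema < GraphQL::Schema
  # Plain Fiber-based dataloader (the configuration that did not show the error):
  # use GraphQL::Dataloader

  # Async-based dataloader (the configuration under test in this PR):
  use GraphQL::Dataloader::AsyncDataloader
end
```

Swapping the commented line for the active one is the only change needed to compare the two behaviors.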

rmosolgo commented Dec 6, 2025

I saw that error, too, and assumed that GraphQL was "working," since it called code as expected. Inspecting the dataset, I see that it was passed the right values:

#<Sequel::Dataset::_Subclass: "SELECT * FROM \"foo_models\" WHERE (\"id\" IN (1, 2))">

My hunch is that Async + Sequel aren't playing nice here ... but do you have a reason to think that AsyncDataloader in particular is the cause of the issue?

Could you please share the full stack trace (NOT a screenshot) from the error in your application? Maybe some lines there will be a clue.

Also, if you can modify the replication app so that it deadlocks on this branch, I'd be happy to keep debugging on that.


UPDATE:

I got the example to run successfully by adding the fiber_concurrency extension:

diff --git a/config/database.rb b/config/database.rb
index 16f3b03..7ec45a8 100644
--- a/config/database.rb
+++ b/config/database.rb
@@ -1,4 +1,5 @@
 require 'sequel'
+Sequel.extension :fiber_concurrency

 DB = Sequel.connect(
   adapter: 'postgres',

Have you tried that extension in your app? I found it by searching randomly around the Sequel repo for the word "Fiber"...

With that addition, I get:

curl -X POST http://localhost:9292/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ getAllFoos { id name barModels { id name } } getAllBars { id name fooModel { id name } } }"}'

{"data":{"getAllFoos":[{"id":"1","name":"First Foo","barModels":[{"id":"1","name":"Bar 1 for Foo 1"},{"id":"2","name":"Bar 2 for Foo 1"},{"id":"3","name":"Bar 3 for Foo 1"}]},{"id":"2","name":"Second Foo","barModels":[{"id":"4","name":"Bar 1 for Foo 2"},{"id":"5","name":"Bar 2 for Foo 2"}]}],"getAllBars":[{"id":"1","name":"Bar 1 for Foo 1","fooModel":{"id":"1","name":"First Foo"}},{"id":"2","name":"Bar 2 for Foo 1","fooModel":{"id":"1","name":"First Foo"}},{"id":"3","name":"Bar 3 for Foo 1","fooModel":{"id":"1","name":"First Foo"}},{"id":"4","name":"Bar 1 for Foo 2","fooModel":{"id":"2","name":"Second Foo"}},{"id":"5","name":"Bar 2 for Foo 2","fooModel":{"id":"2","name":"Second Foo"}}]}}
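
One possible reading of why the extension helps, offered as an assumption rather than a diagnosis: Sequel's documentation describes fiber_concurrency as switching the default concurrency primitive from Thread.current to Fiber.current. If checked-out connections are tracked per thread, two fibers running on the same thread can end up holding the same connection. The toy pool below is not Sequel's implementation; ToyPool and connections_seen_by_two_fibers are made-up names that only illustrate the difference between the two keys.

```ruby
require "fiber" # needed for Fiber.current on Rubies older than 3.1

# Toy connection pool, NOT Sequel's implementation. The `key` lambda
# decides who "owns" a checked-out connection.
class ToyPool
  def initialize(key)
    @key = key
    @checked_out = {}
    @next_id = 0
  end

  # Return the connection already held by the current owner,
  # or allocate a new one for it.
  def hold
    @checked_out[@key.call] ||= "conn-#{@next_id += 1}"
  end
end

# Run two fibers on the current thread and report which connection each got.
def connections_seen_by_two_fibers(pool)
  2.times.map { Fiber.new { pool.hold }.resume }
end

by_thread = connections_seen_by_two_fibers(ToyPool.new(-> { Thread.current }))
by_fiber  = connections_seen_by_two_fibers(ToyPool.new(-> { Fiber.current }))

puts by_thread.inspect # both fibers share one connection
puts by_fiber.inspect  # each fiber gets its own connection
```

With the thread key, both fibers are handed the same connection, which could plausibly let one fiber's query interfere with another's result. With the fiber key, each fiber gets its own.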

Development

Successfully merging this pull request may close these issues.

Deadlock in AsyncDataloader with latest async versions