refactor: Refactor Download Management to Prevent Task Starvation#1713
Barbarella6666666 wants to merge 17 commits into
Conversation
Downloads were starved because the asyncio scheduler prioritized the many scraping tasks over download tasks in the same event loop. Introduce an `asyncio.Queue` as the handoff mechanism so download coroutines are dispatched independently of scraping pressure.
A single dispatcher task drains the `asyncio.Queue` and spawns each download as its own `asyncio.Task`. This preserves the original concurrency model, where semaphores (server/domain/global) control parallelism, while ensuring downloads are no longer starved by scraping tasks.
Replace the downloads `TaskGroup` context manager with the new dispatcher task. On shutdown, a `None` sentinel is sent to the queue so the dispatcher drains remaining downloads before exiting. The `TaskGroups` dataclass no longer needs a `downloads` field.
`wait_until_scrape_is_done` is a UI notification, not a download. It was piggybacking on the downloads `TaskGroup`; now it runs as an independent `asyncio.create_task` so it does not occupy the download queue.
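The handoff described above can be sketched as a minimal, self-contained model (`_pending_downloads`, `_download_dispatcher`, and the trivial `_download` here are simplified stand-ins for the PR's actual code, not the real implementation):

```python
import asyncio


class Downloader:
    """Toy model of the queue + dispatcher handoff from this PR."""

    def __init__(self) -> None:
        self._pending_downloads: asyncio.Queue = asyncio.Queue()
        self.finished: list[str] = []

    async def _download(self, url: str) -> None:
        await asyncio.sleep(0)  # real code would await the HTTP transfer here
        self.finished.append(url)

    async def _download_dispatcher(self) -> None:
        active: set[asyncio.Task] = set()
        while True:
            coro = await self._pending_downloads.get()
            if coro is None:  # sentinel: scraping is done
                break
            task = asyncio.create_task(coro)
            active.add(task)
            task.add_done_callback(active.discard)
        # Drain whatever is still running before exiting.
        if active:
            await asyncio.gather(*active, return_exceptions=True)


async def main() -> list[str]:
    dl = Downloader()
    dispatcher = asyncio.create_task(dl._download_dispatcher())
    # Scrapers hand coroutines to the queue instead of a TaskGroup.
    for url in ("a", "b", "c"):
        await dl._pending_downloads.put(dl._download(url))
    await dl._pending_downloads.put(None)  # shutdown sentinel
    await dispatcher
    return dl.finished


print(sorted(asyncio.run(main())))
```

Because the dispatcher is its own task, downloads get scheduled as soon as they are enqueued, regardless of how many scraping coroutines are competing for the loop.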
Yeah, this is the approach I mentioned on #1702. I didn't explain why, but the reason I didn't want to use a queue is that we lose all the error handling that the task group does for us on cancellation (which we never do, but the user could hit Ctrl+C).
This will work though, so I'm not completely against it. I will do a proper review and give it a try later today.
We will need to add logic to cancel all tasks in the queue on error to get similar behavior to a taskgroup (maybe use a taskgroup within?) to prevent the classic …
Ideally, we should use a queue of …
NTFSvolume
left a comment
This is still not getting the desired behavior for me. A bunch of pages get scraped before downloads start. I think we need eager tasks for this to work.
Also, as I expected, it is swallowing any `CTRL + C` I hit.
```python
while True:
    coro = await self._pending_downloads.get()
    if coro is None:
        break
    task = asyncio.create_task(coro)
    active.add(task)
    task.add_done_callback(active.discard)

if active:
    await asyncio.gather(*active, return_exceptions=True)
```
You should wait on the done event. We can actually use a `TaskGroup`:
```diff
-while True:
-    coro = await self._pending_downloads.get()
-    if coro is None:
-        break
-    task = asyncio.create_task(coro)
-    active.add(task)
-    task.add_done_callback(active.discard)
-if active:
-    await asyncio.gather(*active, return_exceptions=True)
+while not self._done.is_set():
+    coro = await self._pending_downloads.get()
+    if coro is None:
+        break
+    task = asyncio.create_task(coro)
+    active.add(task)
+    task.add_done_callback(active.discard)
+if active:
+    async with asyncio.TaskGroup() as tg:
+        for pending in active:
+            tg.create_task(pending)
```
```diff
@@ -211,7 +227,8 @@ async def wait_until_scrape_is_done() -> None:
         (crawler.DOMAIN, count) for crawler in self._factory if (count := len(crawler._scraped_items))
     )

-    self.create_download_task(wait_until_scrape_is_done())
+    task = asyncio.create_task(wait_until_scrape_is_done())
+    background_tasks.add(task)
```
We can move the `hide_scrape_panel` call to `_download_dispatcher`.
Added a commit so we can simulate downloads without actually downloading anything. I tested with …
Check `self._done.is_set()` in the loop condition instead of `while True`. Drain remaining queued coroutines after the loop exits. Use `TaskGroup` instead of `asyncio.gather` for the final active tasks, restoring proper error propagation and Ctrl+C handling.
The dispatcher already knows when scraping is done (it receives the `None` sentinel). Move the UI callback there instead of using a standalone task. Collect `url_count` stats after the scrape loop in `run()`.
On Python 3.12+, set `asyncio.eager_task_factory` on the event loop so that tasks created via `asyncio.create_task()` begin executing immediately instead of waiting for the next scheduler cycle. This is the key fix for downloads not starting while scraping is active. On Python 3.11, fall back to default behavior (no eager scheduling available).
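A minimal demonstration of the scheduling difference (the `child`/`order` names exist only for this demo): with the eager factory installed, a task runs synchronously up to its first suspension point inside `create_task()` itself, instead of waiting for the next loop iteration.

```python
import asyncio
import sys


async def main() -> str:
    loop = asyncio.get_running_loop()
    if sys.version_info >= (3, 12):
        # Eager tasks begin executing immediately inside create_task().
        loop.set_task_factory(asyncio.eager_task_factory)

    order: list[str] = []

    async def child() -> None:
        order.append("child")

    asyncio.create_task(child())
    order.append("parent")
    await asyncio.sleep(0)  # lets a lazily-scheduled child run (3.11 path)
    return ",".join(order)


# On 3.12+ with the eager factory this prints "child,parent";
# on 3.11 (lazy scheduling) it prints "parent,child".
print(asyncio.run(main()))
```

This is exactly why downloads were starved: a lazily-created download task only ran once the loop got around to it, and the scraping tasks kept the loop busy.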
On `CancelledError` or `KeyboardInterrupt`, cancel all active download tasks and close any unawaited coroutines still in the queue to avoid "coroutine was never awaited" warnings.
`asyncio.shield()` returns a `Future`, not a coroutine, so `tg.create_task(asyncio.shield(task))` raises `TypeError`. The `active` set contains `Task`s that are already running, not coroutines, so we simply gather them directly.
The `_fake_download` method simulated download progress with a fixed 10GB size, preventing actual downloads from running. Removed the method, its call in `_download`, and the `random` import that was only used by it.
The `finally` block always drained all active downloads before exiting, effectively swallowing `KeyboardInterrupt`. Split into `except`/`else`/`finally`: on error or Ctrl+C, cancel the dispatcher and propagate immediately; on normal exit, drain the queue and wait for downloads to finish.
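The resulting control flow looks roughly like this (a simplified sketch with illustrative names; the dispatcher here awaits jobs sequentially rather than spawning tasks, to keep the error-handling shape visible):

```python
import asyncio


async def run(names: list[str], fail: bool = False) -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    done: list[str] = []

    async def job(name: str) -> None:
        done.append(name)

    async def dispatcher() -> None:
        # Consume queued download coroutines until the shutdown sentinel.
        while (coro := await queue.get()) is not None:
            await coro

    task = asyncio.create_task(dispatcher())
    try:
        for name in names:  # stands in for the scrape loop
            await queue.put(job(name))
            if fail:
                raise KeyboardInterrupt  # simulate Ctrl+C mid-scrape
    except (KeyboardInterrupt, asyncio.CancelledError):
        task.cancel()  # propagate immediately instead of draining
        raise
    else:
        await queue.put(None)  # normal exit: signal shutdown...
        await task             # ...and drain remaining downloads
    return done


print(asyncio.run(run(["a", "b"])))
```

The key point is that draining only happens in the `else` branch, so a Ctrl+C during scraping is no longer swallowed by a drain in `finally`.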
Instead of relying on `eager_task_factory` (Python 3.12+ only), spawn a fixed pool of download workers that consume from the queue. The number of workers is read from the `max_simultaneous_downloads` config. This works on Python 3.11+ and respects the user's concurrency limit.
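That worker-pool approach (which the review below rejects) looks roughly like this sketch; `run_pool`, `worker`, and `fake_download` are illustrative names, with `max_simultaneous_downloads` mirroring the config option:

```python
import asyncio


async def run_pool(jobs, max_simultaneous_downloads: int = 3) -> list:
    queue: asyncio.Queue = asyncio.Queue()
    results: list = []

    async def worker() -> None:
        # Each worker consumes download coroutines until its sentinel.
        while (coro := await queue.get()) is not None:
            results.append(await coro)

    workers = [
        asyncio.create_task(worker())
        for _ in range(max_simultaneous_downloads)
    ]
    for job in jobs:
        await queue.put(job)
    for _ in workers:
        await queue.put(None)  # one sentinel per worker
    await asyncio.gather(*workers)
    return results


async def fake_download(n: int) -> int:
    await asyncio.sleep(0)
    return n


print(sorted(asyncio.run(run_pool([fake_download(i) for i in range(5)]))))
```

Note that concurrency is now capped by the number of workers first, before any per-domain semaphore is even reached, which is the ordering problem raised next.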
The download queue counter only counted items waiting on semaphores inside the downloader. Items still in the pending queue waiting for a worker were not counted, causing the UI to show the worker count instead of the actual queue size.
We can't use workers either, because then the number of concurrent downloads has the most relevance in the concurrency decision, when it should be the least relevant.
For example, if we use a config max of 10 concurrent downloads and the first 10 downloads CDL finds are from bunkr, all from the same bunkr CDN, CDL will only download 1 file at a time.
All the download workers will be blocked by the bunkr lock. Downloads from other sites won't start until all downloads from bunkr are finished.
This is the current logic to start downloads: `cyberdrop-dl/cyberdrop_dl/downloader/downloader.py`, lines 119 to 130 at d56554d. The locks go from most restrictive (narrow scope) to least restrictive (wide scope). Putting all downloads on the same queue with a fixed number of workers will make downloads from different sites block each other.
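A runnable illustration of that blocking effect under the worker model (the semaphores and names are simplified stand-ins for CDL's real per-domain locks, not its actual code): with a 1-permit bunkr semaphore and only two workers, a download from another site cannot start until a bunkr download finishes and frees a worker, even though the other site's lock was free the whole time.

```python
import asyncio


async def main() -> list[str]:
    log: list[str] = []
    bunkr_cdn = asyncio.Semaphore(1)   # narrow per-CDN limit
    other_site = asyncio.Semaphore(1)

    async def download(name: str, sem: asyncio.Semaphore) -> None:
        async with sem:  # narrowest lock acquired first, as in the real ordering
            log.append(f"start {name}")
            await asyncio.sleep(0.01)
            log.append(f"end {name}")

    queue: asyncio.Queue = asyncio.Queue()
    for item in [("bunkr-1", bunkr_cdn), ("bunkr-2", bunkr_cdn),
                 ("other-1", other_site)]:
        queue.put_nowait(item)
    for _ in range(2):
        queue.put_nowait(None)  # one sentinel per worker

    async def worker() -> None:
        while (item := await queue.get()) is not None:
            await download(*item)

    await asyncio.gather(worker(), worker())  # only 2 workers
    return log


log = asyncio.run(main())
# "other-1" only starts after "bunkr-1" ends: both workers were tied up
# by bunkr items, one downloading and one parked on the bunkr semaphore.
assert log.index("start other-1") > log.index("end bunkr-1")
```

With the dispatcher model, every queued download becomes its own task, so `other-1` would start immediately and only bunkr items would wait on the bunkr lock.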
I'm thinking of just dropping Python 3.11 support. We can force eager tasks with 3.12+ and just let the event loop manage them as best as it can, instead of trying to fix this ourselves. @jbsparrow, do you think it's OK if we drop 3.11?
Honestly, I think that this is the best solution. It's unfortunate to drop support for a Python version, but I think our users are fine with upgrading. We don't want to implement some hacky solution for a few versions when we will inevitably be upgrading eventually anyway. This solution is much simpler and easier for us. I think it should be implemented in v10.0, along with the database migration system. I will work on that more today and upload the branch.
That is the best solution for sure.
This is my take on trying to solve #1702
Replaces the `TaskGroup` in `scrape_mapper.py` with an unbounded `asyncio.Queue` and a background dispatcher task. This architectural shift ensures that file downloads are scheduled immediately, solving the starvation issues encountered when scraping high volumes of data.
Instead of batching downloads within a `TaskGroup`, we now pipe them into an internal queue. A dedicated dispatcher task consumes this queue in a loop, spawning a new independent `asyncio.Task` for every download.
Discarded Approaches