feat: Parallelise Model Loading #360

vovw · 2024-10-17T21:18:51Z

Test using:

exo --preload-models llama-3.2-1b,llama-3.1-8b

AlexCheema · 2024-10-18T01:09:21Z

Almost what I envisioned - only thing I would change is to preload after the preemptive download. We don't want to download all possible model shards, only the relevant one.
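
The ordering suggested here can be sketched roughly as follows. This is only an illustration under stated assumptions: `download_shard`, `load_shard`, and `preload_relevant` are hypothetical stand-ins, not exo's actual API. The idea is to download only the shards this node needs, then preload those shards concurrently.

```python
import asyncio

# Hypothetical stand-ins; none of these are exo's real functions.
async def download_shard(model_id: str) -> str:
    await asyncio.sleep(0.01)  # simulate network I/O
    return f"{model_id}-shard"

async def load_shard(shard: str) -> str:
    await asyncio.sleep(0.01)  # simulate loading weights into memory
    return f"loaded:{shard}"

async def preload_relevant(model_ids: list[str]) -> list[str]:
    # Step 1: download only the shards relevant to this node.
    shards = await asyncio.gather(*(download_shard(m) for m in model_ids))
    # Step 2: preload the downloaded shards concurrently.
    return await asyncio.gather(*(load_shard(s) for s in shards))

result = asyncio.run(preload_relevant(["llama-3.2-1b", "llama-3.1-8b"]))
print(result)
```

`asyncio.gather` returns results in the order the awaitables were passed, so the loaded shards line up with the requested model IDs.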

vovw · 2024-10-18T09:51:48Z

@AlexCheema I think I got it. Can you review the changes?

exo/main.py

vovw · 2024-10-19T23:08:37Z

Tested on an M3 Pro.

AlexCheema · 2024-11-15T07:33:45Z

exo/main.py

+      current_shard = preemptively_start_download(str(uuid.uuid4()), json.dumps({
+        "type": "node_status",
+        "status": "start_process_prompt",
+        "shard": shard.to_dict()
+      }))


doesn't this need to be awaited?
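
To illustrate why this matters, here is a small sketch that assumes `preemptively_start_download` is a coroutine function (the thread does not confirm this). The stub below is hypothetical. Calling a coroutine function without `await` yields a coroutine object, which is always truthy, so a check like `if current_shard:` would pass even when no shard is returned.

```python
import asyncio
import inspect

# Hypothetical stub; assumes the real function is declared with `async def`.
async def preemptively_start_download_stub():
    return None  # pretend there is no shard to preload

async def main():
    not_awaited = preemptively_start_download_stub()  # coroutine object, body not run yet
    is_coro = inspect.iscoroutine(not_awaited)
    truthy = bool(not_awaited)  # a coroutine object is truthy regardless of its eventual result
    not_awaited.close()  # close it explicitly to avoid a "never awaited" warning
    awaited = await preemptively_start_download_stub()  # the actual return value
    return is_coro, truthy, awaited

is_coro, truthy, awaited = asyncio.run(main())
print(is_coro, truthy, awaited)
```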

AlexCheema · 2024-11-15T07:34:01Z

exo/main.py

+        "shard": shard.to_dict()
+      }))
+      if current_shard:
+        await node.preload_models([current_shard])


doesn't preemptively_start_download already call preload_models?

vovw · 2024-11-16T15:43:49Z

Closed this one because the PR is more than a month old. I synced my fork, and the new PR is at #466.

vovw mentioned this pull request Oct 17, 2024

[BOUNTY - $100] Parallelise Model Loading #202

Open

AlexCheema requested changes Oct 18, 2024

vovw requested a review from AlexCheema October 19, 2024 23:08

AlexCheema requested changes Nov 15, 2024

vovw closed this Nov 16, 2024

vovw force-pushed the main branch from e914db5 to 3d0e2f1 Compare November 16, 2024 13:23

vovw mentioned this pull request Nov 16, 2024

Parallelise Model Loading #466

Open
