feat: add multi-GPU–friendly default for vLLM/Unsloth engine setup #483
Summary
Default vLLM/Unsloth tensor_parallel_size to the number of visible GPUs (respects CUDA_VISIBLE_DEVICES) so multi-GPU setups don’t start with an unset/invalid TP world size.
Still allows explicit overrides via _internal_config["engine_args"].
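For instance, an explicit setting keeps taking precedence over the computed default (the config shape below is hypothetical; only the engine_args key is named in this PR):

```python
# Force TP=2 regardless of how many GPUs are visible; the GPU-count
# default only applies when tensor_parallel_size is not set here.
_internal_config = {"engine_args": {"tensor_parallel_size": 2}}
```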
Motivation
On multi-GPU H100 machines, vLLM was crashing during initialization when tensor parallelism wasn't configured; a sensible default prevents starting with TP=0/None.
Details
In get_model_config, set engine_args["tensor_parallel_size"] = torch.cuda.device_count() (or 1 if CUDA unavailable) before merging user overrides.
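A minimal sketch of the intended defaulting logic, assuming a get_model_config shape like the one below (the signature and return structure are illustrative; only engine_args and tensor_parallel_size come from this PR):

```python
import torch

def get_model_config(model_name: str, _internal_config: dict | None = None) -> dict:
    _internal_config = _internal_config or {}

    # torch.cuda.device_count() already respects CUDA_VISIBLE_DEVICES,
    # so this counts only the GPUs visible to this process. Fall back
    # to 1 when CUDA is unavailable (device_count() == 0) so vLLM never
    # starts with TP=0/None.
    visible_gpus = torch.cuda.device_count()
    engine_args = {"tensor_parallel_size": visible_gpus if visible_gpus > 0 else 1}

    # Merge user overrides last so an explicit tensor_parallel_size in
    # _internal_config["engine_args"] still wins over the default.
    engine_args.update(_internal_config.get("engine_args", {}))

    return {"model": model_name, "engine_args": engine_args}
```

Applying user overrides after the default is set keeps the change backward compatible: existing configs that pin tensor_parallel_size behave exactly as before.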