
Conversation

@WarningRan
Collaborator

Summary

Update the inference driver to resolve a TypeError caused by the deprecation of the prompt_token_ids keyword argument in recent vLLM versions.

Changes

  • Input Refactoring: Switched from passing raw token lists via prompt_token_ids=... to the modern prompts parameter, using dict-style prompts of the form [{"prompt_token_ids": [...]}, ...] (see the sketch after this list).
  • Generator Update: Enhanced RandomGenerator to yield vLLM-compliant prompt dicts directly, reducing conversion overhead during benchmarking (a simplified generator sketch follows the comparison table below).
  • API Alignment: Aligned LLM.generate calls with the Sequence[PromptType] specification to ensure future-proof compatibility.
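
For illustration, a minimal sketch of the call-site migration. The model name, variable names, and sampling settings are placeholders rather than the driver's actual code:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")           # placeholder model
sampling_params = SamplingParams(max_tokens=16)
token_id_batches = [[1, 2, 3], [4, 5, 6]]      # pre-tokenized inputs

# Before (legacy): raises TypeError on recent vLLM releases
# outputs = llm.generate(prompt_token_ids=token_id_batches,
#                        sampling_params=sampling_params)

# After (modern): wrap each token list in a dict under "prompt_token_ids"
outputs = llm.generate(
    prompts=[{"prompt_token_ids": ids} for ids in token_id_batches],
    sampling_params=sampling_params,
)
```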

Comparison

| Feature   | Before (Legacy)               | After (Modern)                                    |
|-----------|-------------------------------|---------------------------------------------------|
| Argument  | prompt_token_ids=inputs       | prompts=[{"prompt_token_ids": i} for i in inputs] |
| Data Type | NumPy array / raw list        | List of dictionaries (PromptType)                 |
| Stability | Throws TypeError in new vLLM  | Fully compatible with 0.6.0+                      |
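
A rough sketch of the generator-side change described in the Changes list. RandomGenerator here is a simplified stand-in; the real class and its constructor arguments may differ:

```python
import random
from typing import Iterator

class RandomGenerator:
    """Simplified stand-in for the benchmark's random prompt generator."""

    def __init__(self, vocab_size: int, prompt_len: int, num_prompts: int):
        self.vocab_size = vocab_size
        self.prompt_len = prompt_len
        self.num_prompts = num_prompts

    def __iter__(self) -> Iterator[dict]:
        # Yield dicts in the shape LLM.generate expects, so the driver can
        # pass them straight through as the `prompts` argument.
        for _ in range(self.num_prompts):
            token_ids = [random.randrange(self.vocab_size) for _ in range(self.prompt_len)]
            yield {"prompt_token_ids": token_ids}

# Usage (assuming llm and sampling_params are set up as in the earlier sketch):
#   prompts = list(RandomGenerator(vocab_size=32000, prompt_len=64, num_prompts=8))
#   outputs = llm.generate(prompts=prompts, sampling_params=sampling_params)
```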

Notes

  • Tested Environment: This change has been verified on the Spyre vLLM image.
  • Compatibility: While optimized for Spyre, the logic follows standard vLLM protocols and is expected to be portable to other backends.

GitHub Issue(s) to reference

Reminders

  • Should this PR be noted in the Changelog?
