Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change ChatCompletionChunk to align with "OpenAI Chat Completions str… #3003

Closed
wants to merge 2 commits into from

Conversation

sywangyi
Copy link
Contributor

…eaming API"

When stream_options: {"include_usage": true} is included, choices is None only for the last chunk, and usage is always None except for the last chunk.

see https://cookbook.openai.com/examples/how_to_stream_completions#4-how-to-get-token-usage-data-for-streamed-chat-completion-response
@OlivierDehaene OR @Narsil

…eaming API"

When stream_options: {"include_usage": true} is included, choices is None only for the last chunk, and usage is always None except for the last chunk.

Signed-off-by: Wang, Yi A <[email protected]>
@sywangyi
Copy link
Contributor Author

details please refer discussion in vllm-project/vllm#12394

Signed-off-by: Wang, Yi A <[email protected]>
@drbh drbh mentioned this pull request Feb 10, 2025
@Narsil
Copy link
Collaborator

Narsil commented Feb 10, 2025

What's broken otherwise ?

Our baseline is if it breaks the OpenAI client we try to fix it (depending on how much it's tying to internals it's sometimes hard to be ISO compatible), and otherwise we try to minimize the amount of bits we send (that are not useful) and therefore we don't send what's not needed.

We don't look that much as the actual payload they send/receive as they tend to change them.

@Narsil
Copy link
Collaborator

Narsil commented Mar 10, 2025

Closing this as TGI doesn't need to output None all the time. The OpenAI client will put them back, and adding unecessary values (and there's already some that are still there) is just wasted bandwidth.

@Narsil
Copy link
Collaborator

Narsil commented Mar 10, 2025

This has been merged in a different form in the follow-up PR: #3007

The core idea is the same, but lots of testing code was modified to make the tests more consistent a checking for the actual behavior of this flag (plus simplifying some internal structures).

@Narsil Narsil closed this Mar 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants