Mike/ait token streaming OpenAI sdk #3074
Conversation
coderabbitai commented:

Important: Review skipped. Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI to enable them.
Force-pushed from a06f5d1 to 52b2e41
Force-pushed from 52b2e41 to 3411525
Force-pushed from 3411525 to b7116ee
Uses numbered steps in a tutorial-like format. Simplifies the code and descriptions of the event streaming model, restricting to the relevant context for this tutorial. Improves copy and code for readability.
Indicates use of the message-per-token pattern.
paddybyers
left a comment
lgtm
> </Code>
>
> <Aside data-type="note">
> This guide uses version 4.x of the OpenAI SDK. Some details of interacting with the OpenAI SDK may diverge from those given here if using a different major version.
Suggested change:
- This guide uses version 4.x of the OpenAI SDK. Some details of interacting with the OpenAI SDK may diverge from those given here if using a different major version.
+ This guide uses version 4.x of the OpenAI SDK. Some details of interacting with the OpenAI SDK may differ from those given here if using a different major version.
> </Code>
>
> <Aside data-type="note">
> This is only a representative example for a simple "text in, text out" use case and may not reflect the exact sequence of events that you observe from the OpenAI API. It also does not describe response generation errors or refusals. For complete details on all event types and their properties, see [OpenAI Streaming events](https://platform.openai.com/docs/api-reference/responses-streaming/response).
Suggested change:
- This is only a representative example for a simple "text in, text out" use case and may not reflect the exact sequence of events that you observe from the OpenAI API. It also does not describe response generation errors or refusals. For complete details on all event types and their properties, see [OpenAI Streaming events](https://platform.openai.com/docs/api-reference/responses-streaming/response).
+ This is only an illustrative example for a simple "text in, text out" use case and may not reflect the exact sequence of events that you observe from the OpenAI API. It also does not describe response generation errors or refusals. For complete details on all event types and their properties, see [OpenAI Streaming events](https://platform.openai.com/docs/api-reference/responses-streaming/response).
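For context, here's a minimal sketch of what consuming that event stream looks like with the 4.x SDK; the model name and prompt are placeholders, and the exact event sequence you observe may differ:

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const stream = await openai.responses.create({
  model: "gpt-4o-mini", // illustrative model choice
  input: "Say hello",
  stream: true,
});

for await (const event of stream) {
  switch (event.type) {
    case "response.created":
      // The response has started; event.response.id identifies it.
      console.log("started:", event.response.id);
      break;
    case "response.output_text.delta":
      // One chunk of output text.
      process.stdout.write(event.delta);
      break;
    case "response.completed":
      // The response finished successfully.
      console.log("\ncompleted");
      break;
  }
}
```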
> - Publishes a `stop` event when the response completes
>
> <Aside data-type="note">
> Ably messages are published without `await` to maximize throughput. Ably maintains message ordering even without awaiting each publish. For more information, see [Publishing tokens](/docs/ai-transport/features/token-streaming/message-per-token#publishing).
How do we recommend checking for error responses from the publish?
There's a separate ticket to address this: https://ably.atlassian.net/browse/AIT-238
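Until that lands, one pattern that would work (a sketch with hypothetical names; it also assumes `responseId` travels under `extras.headers`) is to attach a `.catch()` to each fire-and-forget publish so failures still surface:

```typescript
import * as Ably from "ably";

const ably = new Ably.Realtime({ key: process.env.ABLY_API_KEY! });
const channel = ably.channels.get("ai:response"); // hypothetical channel name

// Publish without awaiting to keep token throughput high; Ably preserves
// per-connection publish order regardless. The .catch() turns a failed
// publish into a log line (or a retry/abort signal) instead of an
// unhandled promise rejection.
function publishToken(token: string, responseId: string): void {
  channel
    .publish({ name: "token", data: token, extras: { headers: { responseId } } })
    .catch((err) => {
      console.error(`publish failed for response ${responseId}:`, err);
    });
}
```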
> ### Publishing concurrent responses <a id="multiple-publishers"/>
>
> The implementation uses `responseId` in message extras to correlate tokens with their originating response. This enables multiple publishers to stream different responses concurrently on the same channel, with each subscriber correctly tracking all responses independently.
Suggested change:
- The implementation uses `responseId` in message extras to correlate tokens with their originating response. This enables multiple publishers to stream different responses concurrently on the same channel, with each subscriber correctly tracking all responses independently.
+ The implementation uses `responseId` in message `extras` to correlate tokens with their originating response. This enables multiple publishers to stream different responses concurrently on the same channel, with each subscriber correctly tracking all responses independently.

(Should we be quoting `extras`, `name` etc. with backticks?)
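To make the correlation concrete, here's a subscriber sketch keyed on `responseId` (assuming it sits under `extras.headers`; the channel name is hypothetical). Each in-flight response accumulates independently, so concurrent streams on one channel never interleave into each other:

```typescript
import * as Ably from "ably";

const ably = new Ably.Realtime({ key: process.env.ABLY_API_KEY! });
const channel = ably.channels.get("ai:response"); // hypothetical channel name

// Buffer each in-flight response separately, keyed by responseId.
const inFlight = new Map<string, string>();

await channel.subscribe((message) => {
  const responseId = message.extras?.headers?.responseId;
  if (typeof responseId !== "string") return;

  switch (message.name) {
    case "start":
      inFlight.set(responseId, "");
      break;
    case "token":
      inFlight.set(responseId, (inFlight.get(responseId) ?? "") + message.data);
      break;
    case "stop":
      console.log(`response ${responseId}:`, inFlight.get(responseId));
      inFlight.delete(responseId);
      break;
  }
});
```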
> This guide shows you how to stream AI responses from OpenAI's [Responses API](https://platform.openai.com/docs/api-reference/responses) over Ably using the [message-per-token pattern](/docs/ai-transport/features/token-streaming/message-per-token). Specifically, it implements the [explicit start/stop events approach](/docs/ai-transport/features/token-streaming/message-per-token#explicit-events), which publishes each response token as an individual message, along with explicit lifecycle events to signal when responses begin and end.
>
> Using Ably to distribute tokens from the OpenAI SDK enables you to broadcast AI responses to thousands of concurrent subscribers with reliable message delivery and ordering guarantees. This approach decouples your AI inference from client connections, enabling you to scale independently and handle reconnections gracefully while ensuring each client receives the complete response stream with all tokens delivered in order.
Suggested change:
- Using Ably to distribute tokens from the OpenAI SDK enables you to broadcast AI responses to thousands of concurrent subscribers with reliable message delivery and ordering guarantees. This approach decouples your AI inference from client connections, enabling you to scale independently and handle reconnections gracefully while ensuring each client receives the complete response stream with all tokens delivered in order.
+ Using Ably to distribute tokens from the OpenAI SDK enables you to broadcast AI responses to thousands of concurrent subscribers with reliable message delivery and ordering guarantees, ensuring that each client receives the complete response stream with all tokens delivered in order. This approach decouples your AI inference from client connections, enabling you to scale each independently and handle reconnections gracefully.
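To make the flow concrete, here's a rough end-to-end sketch of the explicit start/stop approach described above (channel name, model, and the `extras.headers` placement are assumptions, not necessarily the guide's actual code):

```typescript
import OpenAI from "openai";
import * as Ably from "ably";

const openai = new OpenAI();
const ably = new Ably.Realtime({ key: process.env.ABLY_API_KEY! });
const channel = ably.channels.get("ai:response"); // hypothetical channel name

// Relay one model response over Ably using explicit start/stop events
// around the per-token messages. Publishes are deliberately not awaited;
// see the note on throughput above.
async function relayResponse(prompt: string): Promise<void> {
  const stream = await openai.responses.create({
    model: "gpt-4o-mini", // illustrative model choice
    input: prompt,
    stream: true,
  });

  let responseId = "";
  for await (const event of stream) {
    switch (event.type) {
      case "response.created":
        responseId = event.response.id;
        void channel.publish({ name: "start", extras: { headers: { responseId } } });
        break;
      case "response.output_text.delta":
        void channel.publish({ name: "token", data: event.delta, extras: { headers: { responseId } } });
        break;
      case "response.completed":
        void channel.publish({ name: "stop", extras: { headers: { responseId } } });
        break;
    }
  }
}
```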
> - [`response.created`](https://platform.openai.com/docs/api-reference/responses-streaming/response/created): Signals the start of a response. Contains `response.id` to correlate subsequent events.
>
> - [`response.output_item.added`](https://platform.openai.com/docs/api-reference/responses-streaming/response/output_item/added): Indicates a new output item. If `item.type === "message"` the item contains model response text; other types may be specified, such as `"reasoning"` for internal reasoning tokens. The `item.id` can be used to filter which tokens to stream. The `output_index` indicates the position of this item in the response's output array.
> The `item.id` can be used to filter which tokens to stream.

I don't quite get this line. I think you're saying "The `item.id` will be present on all events relating to this item, so you can use it to filter which tokens are streamed to the clients", but there's probably a better way of saying it.
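Something like this is how I read it (a sketch; the event fields follow the OpenAI streaming reference, everything else, including the helper name and the event union type, is my assumption):

```typescript
import OpenAI from "openai";

// Remember the id of the first "message" output item; subsequent text
// delta events carry item_id, so deltas from other item types (e.g.
// "reasoning") can be filtered out before streaming to clients.
let messageItemId: string | null = null;

function textDeltaFor(event: OpenAI.Responses.ResponseStreamEvent): string | null {
  if (event.type === "response.output_item.added" && event.item.type === "message") {
    messageItemId = event.item.id;
  } else if (event.type === "response.output_text.delta" && event.item_id === messageItemId) {
    return event.delta; // a token belonging to the message item
  }
  return null; // not user-visible message text
}
```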
Description
This PR refactors the OpenAI SDK guide for the message-per-token streaming pattern.
Checklist