Mike/ait token streaming OpenAI sdk #3074
@@ -0,0 +1,381 @@
---
title: "Guide: Stream OpenAI responses using the message-per-token pattern"
meta_description: "Stream tokens from the OpenAI Responses API over Ably in realtime."
meta_keywords: "AI, token streaming, OpenAI, Responses API, AI transport, Ably, realtime"
---

This guide shows you how to stream AI responses from OpenAI's [Responses API](https://platform.openai.com/docs/api-reference/responses) over Ably using the [message-per-token pattern](/docs/ai-transport/features/token-streaming/message-per-token). Specifically, it implements the [explicit start/stop events approach](/docs/ai-transport/features/token-streaming/message-per-token#explicit-events), which publishes each response token as an individual message, along with explicit lifecycle events to signal when responses begin and end.

Using Ably to distribute tokens from the OpenAI SDK enables you to broadcast AI responses to thousands of concurrent subscribers with reliable message delivery and ordering guarantees. This approach decouples your AI inference from client connections, enabling you to scale independently and handle reconnections gracefully while ensuring each client receives the complete response stream with all tokens delivered in order.
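On the wire, the explicit-events variant of the pattern amounts to a simple message sequence per response. The sketch below illustrates that shape; the event names (`start`, `token`, `stop`) are illustrative choices, not a fixed Ably convention, and the `extras.responseId` field follows the correlation approach discussed later in this pull request.

```javascript
// Illustrative only: build the channel-message sequence for one streamed
// response under the explicit start/stop events pattern. Each token is its
// own message, bracketed by lifecycle events, and every message carries the
// response id in extras so subscribers can correlate them.
function buildMessageSequence(responseId, tokens) {
  return [
    { name: 'start', extras: { responseId } },
    ...tokens.map((t) => ({ name: 'token', data: t, extras: { responseId } })),
    { name: 'stop', extras: { responseId } },
  ];
}
```

For example, a three-token response produces five messages: one `start`, three `token` messages in order, and one `stop`.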

<Aside data-type="note">
To discover other approaches to token streaming, including the [message-per-response](/docs/ai-transport/features/token-streaming/message-per-response) pattern, see the [token streaming](/docs/ai-transport/features/token-streaming) documentation.
</Aside>

## Prerequisites <a id="prerequisites"/>

To follow this guide, you need:
- Node.js 20 or higher
- An OpenAI API key
- An Ably API key

Useful links:
- [OpenAI developer quickstart](https://platform.openai.com/docs/quickstart)
- [Ably JavaScript SDK getting started](/docs/getting-started/javascript)
Create a new NPM package, which will contain the publisher and subscriber code:

<Code>
```shell
mkdir ably-openai-example && cd ably-openai-example
npm init -y
```
</Code>

Install the required packages using NPM:

<Code>
```shell
npm install openai@^4 ably@^2
```
</Code>

<Aside data-type="note">
This guide uses version 4.x of the OpenAI SDK. Some details of interacting with the OpenAI SDK may differ from those given here if using a different major version.
</Aside>
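With the packages installed, the publisher side can be sketched roughly as follows. This is an assumption-laden outline rather than the guide's final code: the helper name `streamResponseToChannel`, the model name, and the `start`/`token`/`stop` event names are illustrative, and the OpenAI client and Ably channel are injected as parameters so the relay logic can be exercised without API keys. The event types handled (`response.created`, `response.output_text.delta`, `response.completed`) are from the Responses API streaming events.

```javascript
// Hypothetical publisher: stream one response from the OpenAI Responses API
// and relay each text token over an Ably channel, bracketed by explicit
// lifecycle events. Clients are passed in so the logic is testable with stubs.
async function streamResponseToChannel(openaiClient, channel, prompt) {
  const stream = await openaiClient.responses.create({
    model: 'gpt-4o-mini', // illustrative model choice
    input: prompt,
    stream: true,
  });

  let responseId = null;
  for await (const event of stream) {
    switch (event.type) {
      case 'response.created':
        // The response id arrives first; announce the start of the stream.
        responseId = event.response.id;
        await channel.publish({ name: 'start', extras: { responseId } });
        break;
      case 'response.output_text.delta':
        // Each text delta becomes one channel message.
        await channel.publish({
          name: 'token',
          data: event.delta,
          extras: { responseId },
        });
        break;
      case 'response.completed':
        // Signal that no further tokens will follow for this response.
        await channel.publish({ name: 'stop', extras: { responseId } });
        break;
    }
  }
}
```

In real use the injected values would come from the two SDKs, e.g. `new OpenAI({ apiKey })` and an Ably realtime client's `channels.get(...)`; error handling for failed publishes is deliberately omitted here.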
> The `item.id` can be used to filter which tokens to stream.

I don't quite get this line. I think you're saying "The `item.id` will be present on all events relating to this item, so you can use it to filter which tokens are streamed to the clients", but there's probably a better way of saying it.
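The behaviour being discussed can be sketched as a small filter (hypothetical helper; `item_id` is the field that the Responses API's item-scoped streaming events carry):

```javascript
// Hypothetical filter: keep only the text deltas belonging to one output
// item. Every item-scoped streaming event carries the item's id as
// `item_id`, so it can be used to select which tokens to stream onward.
function tokensForItem(events, itemId) {
  return events
    .filter((e) => e.type === 'response.output_text.delta' && e.item_id === itemId)
    .map((e) => e.delta);
}
```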
Suggested change (replace "a representative example" with "an illustrative example"):

> This is only an illustrative example for a simple "text in, text out" use case and may not reflect the exact sequence of events that you observe from the OpenAI API. It also does not describe response generation errors or refusals. For complete details on all event types and their properties, see [OpenAI Streaming events](https://platform.openai.com/docs/api-reference/responses-streaming/response).
How do we recommend checking for error responses from the publish?
There's a separate ticket to address this: https://ably.atlassian.net/browse/AIT-238
Suggested change (quote `extras` with backticks):

> The implementation uses `responseId` in message `extras` to correlate tokens with their originating response. This enables multiple publishers to stream different responses concurrently on the same channel, with each subscriber correctly tracking all responses independently.

(Should we be quoting `extras`, `name` etc with backticks?)
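The correlation being described can be sketched on the subscriber side as an accumulator keyed by `extras.responseId` (illustrative only; the `start`/`token`/`stop` event names are assumptions, not fixed by the docs):

```javascript
// Illustrative subscriber-side tracker: several responses may be streaming
// concurrently on one channel, so tokens are accumulated per responseId.
class ResponseTracker {
  constructor() {
    this.inProgress = new Map(); // responseId -> array of tokens so far
    this.completed = new Map();  // responseId -> assembled full text
  }

  handleMessage(message) {
    const { responseId } = message.extras;
    if (message.name === 'start') {
      this.inProgress.set(responseId, []);
    } else if (message.name === 'token') {
      this.inProgress.get(responseId)?.push(message.data);
    } else if (message.name === 'stop') {
      const tokens = this.inProgress.get(responseId) ?? [];
      this.completed.set(responseId, tokens.join(''));
      this.inProgress.delete(responseId);
    }
  }
}
```

Because every message carries its own `responseId`, interleaved tokens from different publishers are routed to the right accumulator without any channel-level coordination.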