@@ -130,7 +130,7 @@ while (page) {
</Code>

<Aside data-type="note">
-Live messages may arrive via the subscription while you are still processing historical messages. Your agent should handle this by queueing live messages and processing them only after all historical messages have been processed.
+Live messages can arrive via the subscription while you are still processing historical messages. Your agent should handle this by queueing live messages and processing them only after all historical messages have been processed.
</Aside>
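
A minimal sketch of this catch-up flow is shown below, assuming an existing `ably` Realtime client, an `ai:response-stream` channel, and a synchronous application-level `processMessage` handler:

<Code>
```javascript
// Queue live messages while historical messages are still being processed.
const channel = ably.channels.get('ai:response-stream');

const liveQueue = [];
let caughtUp = false;

// Subscribe first so the attach point is fixed before history is read.
await channel.subscribe((message) => {
  if (caughtUp) {
    processMessage(message);
  } else {
    liveQueue.push(message);
  }
});

// Page through history up to the attach point, oldest first.
let page = await channel.history({ untilAttach: true, direction: 'forwards' });
while (page) {
  for (const message of page.items) {
    processMessage(message);
  }
  page = page.hasNext() ? await page.next() : null;
}

// Drain any live messages queued during catch-up, then switch to live processing.
while (liveQueue.length > 0) {
  processMessage(liveQueue.shift());
}
caughtUp = true;
```
</Code>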

This pattern provides guaranteed continuity between historical and live message processing by ensuring that:
@@ -3,7 +3,7 @@ title: Message per response
meta_description: "Stream individual tokens from AI models into a single message over Ably."
---

-Token streaming with message-per-response is a pattern where every token generated by your model is appended to a single Ably message. Each complete AI response then appears as one message in the channel history while delivering live tokens in realtime. This uses [Ably Pub/Sub](/docs/basics) for realtime communication between agents and clients.
+Token streaming with message-per-response is a pattern where every token generated by your model for a given response is appended to a single Ably message. Each complete AI response then appears as one message in the channel history while delivering live tokens in realtime. This uses [Ably Pub/Sub](/docs/basics) for realtime communication between agents and clients.

This pattern is useful for chat-style applications where you want each complete AI response stored as a single message in history, making it easy to retrieve and display multi-response conversation history. Each agent response becomes a single message that grows as tokens are appended, allowing clients joining mid-stream to catch up efficiently without processing thousands of individual tokens.

@@ -22,10 +22,10 @@ Standard Ably message [size limits](/docs/platform/pricing/limits#message) apply

## Enable appends <a id="enable"/>

-Message append functionality requires the "Message annotations, updates, deletes, and appends" [channel rule](/docs/channels#rules) enabled for your channel or [namespace](/docs/channels#namespaces).
+Message append functionality requires "Message annotations, updates, deletes and appends" to be enabled in a [channel rule](/docs/channels#rules) associated with the channel.

<Aside data-type="important">
When the "Message updates and deletes" channel rule is enabled, messages are persisted regardless of whether or not persistence is enabled, in order to support the feature. This may increase your usage since [we charge for persisting messages](https://faqs.ably.com/how-does-ably-count-messages).
When the "Message updates and deletes" channel rule is enabled, messages are persisted irrespective of whether or not persistence has also been explicitly enabled. This will be reflected in increased usage since [we charge for persisting messages](https://faqs.ably.com/how-does-ably-count-messages).
</Aside>

To enable the channel rule:
@@ -34,7 +34,7 @@ To enable the channel rule:
2. Navigate to the "Configuration" > "Rules" section from the left-hand navigation bar.
3. Choose "Add new rule".
4. Enter a channel name or namespace pattern (e.g. `ai:*` for all channels starting with `ai:`).
-5. Select the "Message annotations, updates, deletes, and appends" rule from the list.
+5. Select the "Message annotations, updates, deletes and appends" option from the list.
6. Click "Create channel rule".

The examples on this page use the `ai:` namespace prefix, which assumes you have configured the rule for `ai:*`.
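
For example, a client might open a channel in that namespace as follows (the API key and channel name are placeholders):

<Code>
```javascript
import * as Ably from 'ably';

// Any channel whose name starts with "ai:" picks up the rule
// configured for the "ai:*" namespace.
const ably = new Ably.Realtime({ key: 'your-ably-api-key' });
const channel = ably.channels.get('ai:response-stream');
```
</Code>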
@@ -102,7 +102,7 @@ Subscribers receive different message actions depending on when they join and ho

- `message.create`: Indicates a new response has started (i.e. a new message was created). The message `data` contains the initial content (often empty or the first token). Store this as the beginning of a new response using `serial` as the identifier.
- `message.append`: Contains a single token fragment to append. The message `data` contains only the new token, not the full concatenated response. Append this token to the existing response identified by `serial`.
-- `message.update`: Contains the complete response up to that point. The message `data` contains the full concatenated text so far. Replace the entire response content with this data for the message identified by `serial`. This action occurs when the channel needs to resynchronize the full message state, such as after a client [resumes](/docs/connect/states#resume) from a transient disconnection.
+- `message.update`: Contains the whole response up to that point. The message `data` contains the full concatenated text so far. Replace the entire response content with this data for the message identified by `serial`. This action occurs when the channel needs to resynchronize the full message state, such as after a client [resumes](/docs/connect/states#resume) from a transient disconnection.

<Code>
```javascript
@@ -139,7 +139,7 @@ When clients connect or reconnect, such as after a page refresh, they often need
The message per response pattern enables efficient client state hydration without needing to process every individual token and supports seamlessly transitioning from historical responses to live tokens.

<Aside data-type="note">
-For temporary disconnections, Ably's automatic [connection recovery](docs/connect/states#connection-state-recovery) ensures that clients receive all missed tokens in order.
+For brief disconnections, Ably's automatic [connection recovery](/docs/connect/states#connection-state-recovery) ensures that clients receive all missed tokens in order immediately on reconnection, and no additional client action is needed. The sections that follow describe actions that can be taken in response to longer disconnections, or situations such as a page refresh where client state is lost.
</Aside>

### Using rewind for recent history <a id="rewind"/>
@@ -275,7 +275,7 @@ for await (const event of stream) {
</Code>

<Aside data-type="note">
-When appending tokens, include the `extras` with all headers to preserve them on the message. If you omit `extras` from an append operation, any existing headers will be removed. If you include `extras`, the headers completely replace any previous headers. This is the same [mixin behavior](/docs/messages/updates-deletes) used for message updates and deletes.
+When appending tokens, include the `extras` with all headers to preserve them on the message. If you omit `extras` from an append operation, any existing headers will be removed. If you include `extras`, the headers completely supersede any previous headers. This is the same [mixin behavior](/docs/messages/updates-deletes) used for message updates and deletes.
</Aside>
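
A rough sketch of that behavior follows. The `tokens` async iterator, the `responseId` value, and the header names are assumptions, and `appendToken` is a hypothetical helper standing in for whatever append call your agent uses:

<Code>
```javascript
// Build the complete extras object once and pass it on every append.
const extras = { headers: { responseId, model: 'gpt-4o' } }; // illustrative header names

// Create the message that the response will be appended to.
await channel.publish({ name: 'response', data: '', extras });

for await (const token of tokens) {
  // Passing the full extras with each append keeps the headers intact;
  // omitting extras would strip them, and a partial object would replace them.
  await appendToken(channel, { data: token, extras }); // hypothetical append wrapper
}
```
</Code>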

#### Hydrate using rewind
@@ -3,12 +3,12 @@ title: Message per token
meta_description: "Stream individual tokens from AI models as separate messages over Ably."
---

-Token streaming with message-per-token is a pattern where every token generated by your model is published as its own Ably message. Each token then appears as one message in the channel history. This uses [Ably Pub/Sub](/docs/basics) for realtime communication between agents and clients.
+Token streaming with message-per-token is a pattern where every token generated by your model is published as an independent Ably message. Each token then appears as one message in the channel history. This uses [Ably Pub/Sub](/docs/basics) for realtime communication between agents and clients.

This pattern is useful when clients only care about the most recent part of a response and you are happy to treat the channel history as a short sliding window rather than a full conversation log. For example:

- **Backend-stored responses**: The backend writes complete responses to a database and clients load those full responses from there, while Ably is used only to deliver live tokens for the current in-progress response.
-- **Live transcription, captioning, or translation**: A viewer who joins a live stream only needs the last few tokens for the current "frame" of subtitles, not the entire transcript so far.
+- **Live transcription, captioning, or translation**: A viewer who joins a live stream only needs sufficient tokens for the current "frame" of subtitles, not the entire transcript so far.
- **Code assistance in an editor**: Streamed tokens become part of the file on disk as they are accepted, so past tokens do not need to be replayed from Ably.
- **Autocomplete**: A fresh response is streamed for each change a user makes to a document, with only the latest suggestion being relevant.

@@ -116,7 +116,7 @@ for await (const event of stream) {

#### Subscribe to tokens

-Use the `responseId` header in message extras to correlate tokens. The `responseId` allows you to group tokens belonging to the same response and correctly handle token delivery for multiple responses, even when delivered concurrently.
+Use the `responseId` header in message extras to correlate tokens. The `responseId` allows you to group tokens belonging to the same response and correctly handle token delivery for distinct responses, even when delivered concurrently.

<Code>
```javascript
@@ -236,7 +236,7 @@ await channel.subscribe('stop', (message) => {
When clients connect or reconnect, such as after a page refresh, they often need to catch up on tokens that were published while they were offline or before they joined. Ably provides several approaches to hydrate client state depending on your application's requirements.

<Aside data-type="note">
-For temporary disconnections, Ably's automatic [connection recovery](docs/connect/states#connection-state-recovery) ensures that clients receive all missed tokens in order.
+For brief disconnections, Ably's automatic [connection recovery](/docs/connect/states#connection-state-recovery) ensures that clients receive all missed tokens in order immediately on reconnection, and no additional client action is needed. The sections that follow describe actions that can be taken in response to longer disconnections, or situations such as a page refresh where client state is lost.
</Aside>

<Aside data-type="note">