Commit db135e2

Docs add NVIDIA Magpie Multilingual tts service
1 parent 79dc6c9 commit db135e2

2 files changed: +181 -18 lines changed

server/services/supported-services.mdx

+19-18
@@ -52,24 +52,25 @@ description: "AI services integrated with Pipecat and their setup requirements"

 ## Text-to-Speech

-| Service | Setup |
-| -------------------------------------------------- | -------------------------------------- |
-| [Amazon Polly](/server/services/tts/aws) | `pip install "pipecat-ai[aws]"` |
-| [Azure](/server/services/tts/azure) | `pip install "pipecat-ai[azure]"` |
-| [Cartesia](/server/services/tts/cartesia) | `pip install "pipecat-ai[cartesia]"` |
-| [Deepgram](/server/services/tts/deepgram) | `pip install "pipecat-ai[deepgram]"` |
-| [ElevenLabs](/server/services/tts/elevenlabs) | `pip install "pipecat-ai[elevenlabs]"` |
-| [Fish](/server/services/tts/fish) | `pip install "pipecat-ai[fish]"` |
-| [Google](/server/services/tts/google) | `pip install "pipecat-ai[google]"` |
-| [Groq](/server/services/tts/groq) | `pip install "pipecat-ai[groq]"` |
-| [LMNT](/server/services/tts/lmnt) | `pip install "pipecat-ai[lmnt]"` |
-| [Neuphonic](/server/services/tts/neuphonic) | `pip install "pipecat-ai[neuphonic]"` |
-| [NVIDIA FastPitch](/server/services/tts/fastpitch) | `pip install "pipecat-ai[riva]"` |
-| [OpenAI](/server/services/tts/openai) | `pip install "pipecat-ai[openai]"` |
-| [Piper](/server/services/tts/piper) | No dependencies required |
-| [PlayHT](/server/services/tts/playht) | `pip install "pipecat-ai[playht]"` |
-| [Rime](/server/services/tts/rime) | `pip install "pipecat-ai[rime]"` |
-| [XTTS](/server/services/tts/xtts) | `pip install "pipecat-ai[xtts]"` |
+| Service | Setup |
+| --------------------------------------------------------- | -------------------------------------- |
+| [Amazon Polly](/server/services/tts/aws) | `pip install "pipecat-ai[aws]"` |
+| [Azure](/server/services/tts/azure) | `pip install "pipecat-ai[azure]"` |
+| [Cartesia](/server/services/tts/cartesia) | `pip install "pipecat-ai[cartesia]"` |
+| [Deepgram](/server/services/tts/deepgram) | `pip install "pipecat-ai[deepgram]"` |
+| [ElevenLabs](/server/services/tts/elevenlabs) | `pip install "pipecat-ai[elevenlabs]"` |
+| [Fish](/server/services/tts/fish) | `pip install "pipecat-ai[fish]"` |
+| [Google](/server/services/tts/google) | `pip install "pipecat-ai[google]"` |
+| [Groq](/server/services/tts/groq) | `pip install "pipecat-ai[groq]"` |
+| [LMNT](/server/services/tts/lmnt) | `pip install "pipecat-ai[lmnt]"` |
+| [Neuphonic](/server/services/tts/neuphonic) | `pip install "pipecat-ai[neuphonic]"` |
+| [NVIDIA FastPitch](/server/services/tts/fastpitch) | `pip install "pipecat-ai[riva]"` |
+| [NVIDIA Magpie Multilingual](/server/services/tts/magpie) | `pip install "pipecat-ai[riva]"` |
+| [OpenAI](/server/services/tts/openai) | `pip install "pipecat-ai[openai]"` |
+| [Piper](/server/services/tts/piper) | No dependencies required |
+| [PlayHT](/server/services/tts/playht) | `pip install "pipecat-ai[playht]"` |
+| [Rime](/server/services/tts/rime) | `pip install "pipecat-ai[rime]"` |
+| [XTTS](/server/services/tts/xtts) | `pip install "pipecat-ai[xtts]"` |

 ## Speech-to-Speech

server/services/tts/magpie.mdx

+162
@@ -0,0 +1,162 @@
---
title: "NVIDIA Magpie"
description: "Text-to-speech service implementation using NVIDIA’s Magpie model"
---

## Overview

`MagpieTTSService` converts text to speech using NVIDIA's Riva `magpie-tts-multilingual` model. It provides high-quality text-to-speech synthesis with configurable voice options, including multilingual voices.

## Installation

To use `MagpieTTSService`, install the required dependencies:

```bash
pip install "pipecat-ai[riva]"
```

You'll also need to set up your NVIDIA API key as an environment variable: `NVIDIA_API_KEY`
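
A minimal sketch of reading the key at startup, assuming you want to fail fast when it is missing. The helper name `load_nvidia_api_key` is hypothetical and not part of Pipecat; only the `NVIDIA_API_KEY` variable name comes from the text above.

```python
import os
from typing import Mapping, Optional


def load_nvidia_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the NVIDIA API key from the environment, or raise if unset.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    env = os.environ if env is None else env
    key = env.get("NVIDIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError("NVIDIA_API_KEY is not set")
    return key
```

The returned string can then be passed as the `api_key` constructor parameter described under Configuration.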

## Configuration

### Constructor Parameters

<ParamField path="api_key" type="str" required>
  Your NVIDIA API key
</ParamField>

<ParamField path="server" type="str" default="grpc.nvcf.nvidia.com:443">
  NVIDIA Riva server address
</ParamField>

<ParamField path="voice_id" type="str" default="English-US.Female-1">
  Voice identifier to use for synthesis
</ParamField>

<ParamField path="sample_rate" type="int" default="None">
  Output audio sample rate in Hz
</ParamField>

<ParamField
  path="function_id"
  type="str"
  default="0149dedb-2be8-4195-b9a0-e57e0e14f972"
>
  NVIDIA function identifier for the TTS service
</ParamField>

<ParamField path="params" type="InputParams" default="InputParams()">
  Additional configuration parameters (language and quality)
</ParamField>

### InputParams

<ParamField path="language" type="Language" default="Language.EN_US">
  The language for TTS generation
</ParamField>

<ParamField path="quality" type="int" default="20">
  Quality level for the generated audio
</ParamField>
61+
62+
## Input
63+
64+
The service accepts text input through its TTS pipeline.
65+
66+
## Output Frames
67+
68+
### TTSStartedFrame
69+
70+
Signals the start of audio generation.
71+
72+
### TTSAudioRawFrame
73+
74+
Contains generated audio data:
75+
76+
<ParamField path="audio" type="bytes">
77+
Raw audio data chunk
78+
</ParamField>
79+
80+
<ParamField path="sample_rate" type="int">
81+
Audio sample rate
82+
</ParamField>
83+
84+
<ParamField path="num_channels" type="int">
85+
Number of audio channels (1 for mono)
86+
</ParamField>
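
To illustrate how these three fields relate, the sketch below estimates the playback duration of one chunk. It assumes 16-bit (2-byte) PCM samples, which this page does not state, and `chunk_duration_seconds` is a hypothetical helper rather than a Pipecat function.

```python
def chunk_duration_seconds(
    audio: bytes,
    sample_rate: int,
    num_channels: int = 1,
    bytes_per_sample: int = 2,  # assumption: 16-bit PCM
) -> float:
    """Estimate how long a raw audio chunk plays, in seconds."""
    if sample_rate <= 0 or num_channels <= 0 or bytes_per_sample <= 0:
        raise ValueError("sample_rate, num_channels and bytes_per_sample must be positive")
    frame_size = num_channels * bytes_per_sample  # bytes per sample frame
    return len(audio) / frame_size / sample_rate
```

Under that assumption, a 32,000-byte mono chunk at 16,000 Hz holds 16,000 sample frames, which is one second of audio.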
87+
88+
### TTSStoppedFrame
89+
90+
Signals the completion of audio generation.
91+
92+
## Methods
93+
94+
See the [TTS base class methods](/server/base-classes/speech#ttsservice) for additional functionality.
95+
96+
## Language Support
97+
98+
Magpie TTS primarily supports English with various regional accents:
99+
100+
| Language Code | Description | Service Codes |
101+
| ---------------- | --------------- | ------------- |
102+
| `Language.EN_US` | English (US) | `en-US` |
103+
| `Language.ES-US` | Spanish (US) | `es-US` |
104+
| `Language.FR-FR` | French (France) | `fr-FR` |
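
The mapping in the table can be sketched as a plain lookup. The `Language` enum below is a small stand-in defined only for this example (the real one is imported from `pipecat.transcriptions.language`), and `to_service_code` is a hypothetical helper, not the service's actual conversion logic.

```python
from enum import Enum
from typing import Optional


class Language(str, Enum):
    """Stand-in for Pipecat's Language enum, limited to the table above."""

    EN_US = "en-US"
    ES_US = "es-US"
    FR_FR = "fr-FR"


# Language member -> code sent to the service, as listed in the table
SERVICE_CODES = {
    Language.EN_US: "en-US",
    Language.ES_US: "es-US",
    Language.FR_FR: "fr-FR",
}


def to_service_code(language: Language) -> Optional[str]:
    """Return the service code for a language, or None if it is not listed."""
    return SERVICE_CODES.get(language)
```

Returning `None` for an unlisted language lets the caller decide whether to fall back to a default or raise an error.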
105+
106+
## Usage Example
107+
108+
```python
109+
from pipecat.services.riva.tts import MagpieTTSService
110+
from pipecat.transcriptions.language import Language
111+
112+
# Configure service
113+
tts = MagpieTTSService(
114+
api_key="your-nvidia-api-key",
115+
voice_id="Magpie-Multilingual.ES-US.Male.Male-1",
116+
params=MagpieTTSService.InputParams(
117+
language=Language.ES_US,
118+
quality=20
119+
)
120+
)
121+
122+
# Use in pipeline
123+
pipeline = Pipeline([
124+
...,
125+
llm,
126+
tts,
127+
transport.output(),
128+
])
129+
```
130+
131+
## Frame Flow
132+
133+
```mermaid
134+
graph TD
135+
A[TextFrame] --> B[MagpieTTSService]
136+
B --> C[TTSStartedFrame]
137+
B --> D[TTSAudioRawFrame]
138+
B --> E[TTSStoppedFrame]
139+
B --> F[ErrorFrame]
140+
```
141+
142+
## Metrics Support
143+
144+
The service supports metrics collection:
145+
146+
- Time to First Byte (TTFB)
147+
- TTS usage metrics
148+
- Processing duration
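
As a rough illustration of what a TTFB measurement captures, the sketch below times the gap between requesting a stream and receiving its first chunk. `fake_tts_stream` and `measure_ttfb` are hypothetical stand-ins; Pipecat collects these metrics internally and its implementation may differ.

```python
import asyncio
import time
from typing import AsyncIterator, Tuple


async def fake_tts_stream(delay: float = 0.05) -> AsyncIterator[bytes]:
    """Stand-in for a streaming TTS response: three chunks after a delay."""
    await asyncio.sleep(delay)
    for _ in range(3):
        yield b"\x00" * 320


async def measure_ttfb(stream: AsyncIterator[bytes]) -> Tuple[float, int]:
    """Return (seconds until the first chunk, total bytes received)."""
    start = time.perf_counter()
    ttfb = None
    total = 0
    async for chunk in stream:
        if ttfb is None:
            # First chunk has arrived: record the elapsed time once.
            ttfb = time.perf_counter() - start
        total += len(chunk)
    if ttfb is None:
        raise ValueError("stream produced no audio")
    return ttfb, total


ttfb, total_bytes = asyncio.run(measure_ttfb(fake_tts_stream()))
```

Processing duration would be measured the same way, taking the elapsed time after the last chunk instead of the first.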
149+
150+
## Audio Processing
151+
152+
- Processes audio through the Riva API
153+
- Generates mono audio output
154+
- Handles asynchronous audio streaming
155+
- Configurable sampling rate
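
A minimal sketch of collecting streamed mono chunks into a WAV container with Python's standard `wave` module. It assumes 16-bit PCM samples and a 16 kHz default rate, neither of which is stated above, and `chunks_to_wav` is a hypothetical helper.

```python
import io
import wave
from typing import Iterable


def chunks_to_wav(chunks: Iterable[bytes], sample_rate: int = 16000) -> bytes:
    """Pack raw mono PCM chunks into an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)  # mono, as noted in the list above
        wav.setsampwidth(2)  # assumption: 16-bit samples
        wav.setframerate(sample_rate)
        for chunk in chunks:
            wav.writeframes(chunk)
    return buffer.getvalue()
```

Pass the same sample rate that the service was configured with, otherwise playback speed and pitch will be off.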
156+
157+
## Notes
158+
159+
- Uses NVIDIA's Riva AI Services platform
160+
- Streams audio in chunks
161+
- Requires valid NVIDIA API key
162+
- Thread-safe processing with asyncio
