Commit 4da154b
feat: audio speech to text partition (#4264)
## Summary
Enables partitioning of WAV audio files into document elements by
transcribing them with a configurable speech-to-text (STT) agent, which
defaults to Whisper.
Closes #4029
## Changes
- New `partition_audio()` and routing for `FileType.WAV` so `partition()`
  supports audio.
- Pluggable STT layer: `SpeechToTextAgent` interface and
  `SpeechToTextAgentWhisper` implementation.
- Optional extra `audio` in `pyproject.toml` (`openai-whisper`); `all-docs`
  includes `audio`.
- Config: `STT_AGENT` (and `STT_AGENT_MODULES_WHITELIST`) for choosing the
  STT implementation.
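
The pluggable layer and its config-driven selection can be pictured with a minimal sketch. This is not the code added by the commit: the `transcribe` method, the `EchoAgent` stand-in, the `load_agent` helper, and the value formats of `STT_AGENT` (dotted `module.ClassName`) and `STT_AGENT_MODULES_WHITELIST` (comma-separated module names) are all assumptions made for illustration.

```python
import importlib
import sys
import types
from abc import ABC, abstractmethod
from typing import Mapping, Optional


class SpeechToTextAgent(ABC):
    """Hypothetical minimal interface: audio file in, transcript text out."""

    @abstractmethod
    def transcribe(self, audio_path: str, language: Optional[str] = None) -> str:
        """Return the transcript of the audio file at `audio_path`."""


class EchoAgent(SpeechToTextAgent):
    """Stand-in agent so the sketch runs without a real STT model."""

    def transcribe(self, audio_path: str, language: Optional[str] = None) -> str:
        return f"[transcript of {audio_path}, language={language or 'auto'}]"


def load_agent(env: Mapping[str, str]) -> SpeechToTextAgent:
    """Instantiate the agent named by STT_AGENT if its module is whitelisted.

    Assumed formats (not confirmed by the commit): STT_AGENT is a dotted
    "module.ClassName" path and STT_AGENT_MODULES_WHITELIST is a
    comma-separated list of module names.
    """
    module_name, _, class_name = env["STT_AGENT"].rpartition(".")
    whitelist = {
        name.strip()
        for name in env.get("STT_AGENT_MODULES_WHITELIST", "").split(",")
        if name.strip()
    }
    # Refuse to import anything outside the whitelist.
    if module_name not in whitelist:
        raise ValueError(f"module {module_name!r} is not whitelisted")
    agent_cls = getattr(importlib.import_module(module_name), class_name)
    if not (isinstance(agent_cls, type) and issubclass(agent_cls, SpeechToTextAgent)):
        raise TypeError(f"{class_name} does not implement SpeechToTextAgent")
    return agent_cls()


# Register a throwaway module so the demo works as a single file.
demo_module = types.ModuleType("demo_stt")
demo_module.EchoAgent = EchoAgent
sys.modules["demo_stt"] = demo_module

agent = load_agent(
    {"STT_AGENT": "demo_stt.EchoAgent", "STT_AGENT_MODULES_WHITELIST": "demo_stt"}
)
print(agent.transcribe("file.wav", language="en"))
```

Checking the module name against a whitelist before importing keeps a config value from triggering the import of arbitrary code, which is presumably the motivation for pairing the two settings.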
## Usage
Run `pip install "unstructured[audio]"`, then call `partition("file.wav")` or
`partition_audio("file.wav", language="en")`.

1 parent 6aeb74f commit 4da154b
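
The usage described in the commit message could be wrapped as follows. This is a sketch under assumptions: the `transcribe_wav` and `is_wav` helpers are hypothetical, the extension check is a convenience added here rather than something the library requires, and the call only works once the `audio` extra is installed.

```python
from pathlib import Path
from typing import List


def is_wav(path: str) -> bool:
    """Cheap extension check; the library does its own file-type detection."""
    return Path(path).suffix.lower() == ".wav"


def transcribe_wav(path: str) -> List[str]:
    """Partition a WAV file and return the text of each resulting element."""
    if not is_wav(path):
        raise ValueError(f"expected a .wav file, got {path!r}")
    # Imported lazily so this module loads even without the audio extra;
    # requires: pip install "unstructured[audio]"
    from unstructured.partition.auto import partition

    elements = partition(path)
    return [element.text for element in elements]
```

With the extra installed, `transcribe_wav("file.wav")` would return the transcribed text of each element; the exact segmentation depends on the STT agent in use.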
File tree
21 files changed (+1015, -41 lines changed)
- test_unstructured
- file_utils
- partition
- unstructured
- documents
- file_utils
- partition
- common
- utils
- speech_to_text
- staging