v0.16.0 #433

ZachNagengast · 2026-03-03T02:49:23Z

ZachNagengast
Mar 3, 2026
Maintainer

Highlights

This release introduces TTSKit, a brand-new optional library that brings high-quality text-to-speech on-device, using the latest Core ML features such as MLState and MLTensor for optimal inference on the Apple Neural Engine.

With this first release, we're launching the Qwen3-TTS CustomVoice models (0.6B and 1.7B) with instruction control, with more to come in future releases (including voice cloning).

Download, load, generate, and stream playback in 3 lines of code:

import TTSKit

let ttsKit = try await TTSKit()
try await ttsKit.play(text: "Hello from TTSKit!")
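
The snippet above uses `try await`, so it has to run in an async context. As a minimal sketch, assuming you are calling from synchronous code such as a button handler, the same two calls can be wrapped in a `Task` with error handling. The `speak` helper is hypothetical; only `TTSKit()` and `play(text:)` are taken from these notes.

```swift
import TTSKit

// Hypothetical helper (not part of TTSKit): bridges from synchronous code
// into the async API shown in the release notes.
func speak(_ text: String) {
    Task {
        do {
            // Creating the instance downloads and loads the model, per the notes above.
            // A real app would likely create it once and reuse it.
            let ttsKit = try await TTSKit()
            try await ttsKit.play(text: text)
        } catch {
            // Surface download, load, or generation failures instead of dropping them.
            print("TTS failed: \(error)")
        }
    }
}
```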

Key Features

- Real-time adaptive streaming
  - Plays audio while it is still generating, for the fastest time from text input to first audio buffer output.
  - .auto mode adapts to the device's inference speed for consistent, smooth playback.
- 9 built-in voices
- 10 languages
- Style instruction support (1.7B model only)
- Automatic chunking for long-form inputs
- Audio file export in WAV/M4A format with optional metadata
- Modular protocol-based architecture (6 swappable Core ML components) for easy customization and future model adoption
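
To illustrate the idea behind automatic chunking for long-form inputs, here is a minimal sketch that packs whole sentences into chunks under a character limit. It is not TTSKit's implementation: `splitSentences`, `chunkText`, and the `maxLength` parameter are hypothetical names, and the actual library may split on tokens or other boundaries.

```swift
import Foundation

/// Hypothetical helper: naive sentence splitter that cuts after ".", "!" or "?".
/// It will also split inside abbreviations and decimals, which a real chunker should avoid.
func splitSentences(_ text: String) -> [String] {
    var sentences: [String] = []
    var current = ""
    for character in text {
        current.append(character)
        if ".!?".contains(character) {
            let trimmed = current.trimmingCharacters(in: .whitespacesAndNewlines)
            if !trimmed.isEmpty { sentences.append(trimmed) }
            current = ""
        }
    }
    // Keep any trailing text that has no terminator.
    let rest = current.trimmingCharacters(in: .whitespacesAndNewlines)
    if !rest.isEmpty { sentences.append(rest) }
    return sentences
}

/// Hypothetical helper: greedily packs sentences into chunks of at most `maxLength`
/// characters. A single sentence longer than `maxLength` is kept whole.
func chunkText(_ text: String, maxLength: Int) -> [String] {
    var chunks: [String] = []
    var current = ""
    for sentence in splitSentences(text) {
        if current.isEmpty {
            current = sentence
        } else if current.count + 1 + sentence.count <= maxLength {
            current += " " + sentence
        } else {
            chunks.append(current)
            current = sentence
        }
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}
```

Splitting at sentence boundaries keeps each chunk a natural unit of speech, which is one plausible reason a TTS pipeline would chunk this way before generating audio.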

See the new TTSKit section in the README.md for full API docs, model selection, and advanced usage.

CLI

Try it out with the following command:

swift run -c release whisperkit-cli tts --text "Hello from TTSKit" --play

Also available via Homebrew upon release:

brew install whisperkit-cli
whisperkit-cli tts --text "Hello from TTSKit" --play

The CLI gives full control over speaker, language, model variant, style, temperature, chunking strategy, compute units, seed for reproducibility, and more.

Example App

Along with the CLI, we're also releasing a new example app for developers to reference when building TTSKit into their apps. It features real-time waveform visualization, model management, persistent audio file history with metadata, and multi-platform support.

More information about running this app is available in the example's README.md.

Architecture Changes

- New shared ArgmaxCore target for common utilities.
- TTSKit ships as an optional product in the same Swift package (no breaking changes to existing WhisperKit code).

.target(
    name: "YourApp",
    dependencies: [
        "WhisperKit", // speech-to-text
        "TTSKit",     // text-to-speech
    ]
),
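
For context, a complete manifest around that target might look like the sketch below. Several parts are assumptions rather than statements from this release: the package URL (the repo is expected to be renamed), the platform minimums (chosen because MLState requires macOS 15 / iOS 18; check the package's own manifest), and the explicit `.product(name:package:)` form, used here because TTSKit is a product of a package with a different name.

```swift
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    name: "YourApp",
    // Assumption: minimums picked to match MLState availability, not taken from the release notes.
    platforms: [.macOS("15.0"), .iOS("18.0")],
    dependencies: [
        // Assumption: current repository URL; it may change when the repo is renamed.
        .package(url: "https://github.com/argmaxinc/WhisperKit.git", from: "0.16.0"),
    ],
    targets: [
        .executableTarget(
            name: "YourApp",
            dependencies: [
                .product(name: "WhisperKit", package: "WhisperKit"), // speech-to-text
                .product(name: "TTSKit", package: "WhisperKit"),     // text-to-speech
            ]
        ),
    ]
)
```

Apps that only need transcription can leave out the TTSKit product, since it is optional.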

The repo will be renamed to reflect the new multi-kit architecture in an upcoming release.

Thank you to @naykutguven and @shura-v for the excellent improvements, listed below, that landed in this release ahead of TTSKit 🚀

What's Changed

Update doc for prewarm by @chen-argmax in #387
Pin Xcode version as 26 for Github workflows by @naykutguven in #386
AudioProcessor: fix teardown to avoid StartIO/thread warnings on some Bluetooth devices by @shura-v in #402
Mute-style input suppression without pausing AVAudioEngine by @shura-v in #401
Add TTSKit with Qwen3-TTS support by @ZachNagengast in #425

New Contributors

@shura-v made their first contribution in #402

Full Changelog: v0.15.0...v0.16.0

This discussion was created from the release v0.16.0.
