| @MainActor | ||
| @Observable | ||
| final class ViewModel: @unchecked Sendable { |
There was a problem hiding this comment.
I would break this down into smaller view models if it grows too long, e.g. DownloadViewModel vs. TTSViewModel.
| @@ -1,5 +1,5 @@ | |||
| { | |||
There was a problem hiding this comment.
why do we need to change this?
There was a problem hiding this comment.
Had an old swift-transformers resolved
| /// Thin wrapper around `os_unfair_lock` that exposes a Swift-friendly | ||
| /// `withLock` helper. This lock is non-reentrant and optimized for low | ||
| /// contention, matching the semantics of Core Foundation's unfair lock. | ||
| public final class UnfairLock: @unchecked Sendable { |
There was a problem hiding this comment.
I think we want to make this class name generic to future-proof it for Swift 6; os_unfair_lock doesn't seem to be the recommended way to lock in Swift 6.
Probably rename it to Mutex so we can reimplement it with the actual Swift Mutex (from the Synchronization module) later.
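now
import os
public final class Mutex: @unchecked Sendable {
    private let lock = OSAllocatedUnfairLock()
    public init() {}
    // Not @inlinable: an inlinable body cannot reference the private lock.
    public func withLock<T>(_ body: () throws -> T) rethrows -> T {
        // withLockUnchecked accepts closures and results that are not Sendable.
        try lock.withLockUnchecked(body)
    }
}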
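later
import Synchronization
// Synchronization.Mutex requires macOS 15 / iOS 18 or newer.
@available(macOS 15.0, iOS 18.0, *)
public final class Mutex<Value>: Sendable {
    private let mutex: Synchronization.Mutex<Value>
    public init(_ value: sending Value) {
        self.mutex = Synchronization.Mutex(value)
    }
    public func withLock<T>(_ body: (inout sending Value) throws -> sending T) rethrows -> sending T {
        try mutex.withLock(body)
    }
}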
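A minimal sketch of what the generic name buys, assuming an `NSLock`-backed stand-in (the name `ValueMutex` and the demo function are hypothetical, not part of ArgmaxCore). The lock owns the protected value and only exposes it inside `withLock`, which has the same call shape as the wrapper sketched above, so the backing lock could be swapped later without touching call sites.

```swift
import Foundation

/// Hypothetical stand-in: owns its value and only exposes it inside `withLock`.
/// NSLock is used purely for portability; the backing lock is an
/// implementation detail that could be replaced later.
public final class ValueMutex<Value>: @unchecked Sendable {
    private let lock = NSLock()
    private var value: Value

    public init(_ value: Value) {
        self.value = value
    }

    /// Runs `body` with exclusive, in-place access to the protected value.
    public func withLock<T>(_ body: (inout Value) throws -> T) rethrows -> T {
        lock.lock()
        defer { lock.unlock() }
        return try body(&value)
    }
}

/// Increments a shared counter from several threads and returns the total.
func runCounterDemo(workers: Int, perWorker: Int) -> Int {
    let counter = ValueMutex(0)
    DispatchQueue.concurrentPerform(iterations: workers) { _ in
        for _ in 0..<perWorker {
            // The whole read-modify-write happens under the lock.
            counter.withLock { $0 += 1 }
        }
    }
    return counter.withLock { $0 }
}
```

Because callers never hold the value outside the closure, there is no unlocked read-modify-write to get wrong.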
There was a problem hiding this comment.
should we consider adding another package under ArgmaxCore? like ArgmaxCore/CoreML
| /// | ||
| /// Downloads only the files matching the configured component variants. | ||
| /// Files are cached locally by the Hub library. | ||
| open class func download( |
There was a problem hiding this comment.
should we decouple model download from TTSKit? ArgmaxCore could provide a downloader for this
There was a problem hiding this comment.
Yep have some todos relating to this
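One way the decoupling could look, as a rough sketch only: every name below (`DownloadRequest`, `ModelDownloading`, `DryRunDownloader`) is hypothetical and not an existing ArgmaxCore or Hub API. The idea is that a Kit describes which component variants it needs, and a shared downloader decides which repository files match.

```swift
import Foundation

/// Hypothetical description of what a shared downloader is asked to fetch.
struct DownloadRequest {
    /// Directory prefixes to keep, e.g. one per selected component variant.
    var componentPrefixes: [String]

    /// Returns only the repository files that live under a requested prefix.
    func filter(_ repoFiles: [String]) -> [String] {
        repoFiles.filter { file in
            componentPrefixes.contains { prefix in
                // Normalize to a trailing slash so "a" does not match "ab/".
                file.hasPrefix(prefix.hasSuffix("/") ? prefix : prefix + "/")
            }
        }
    }
}

/// Hypothetical protocol a core module could expose so that each Kit
/// does not have to own its download logic.
protocol ModelDownloading {
    func download(_ request: DownloadRequest, from repoFiles: [String]) throws -> [String]
}

/// Test double that "downloads" by returning the filtered file list.
struct DryRunDownloader: ModelDownloading {
    func download(_ request: DownloadRequest, from repoFiles: [String]) throws -> [String] {
        request.filter(repoFiles)
    }
}
```

A real implementation would be asynchronous and would return local URLs; the synchronous dry run is only meant to show where the variant filtering would live.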
Sources/TTSKit/TTSModels.swift
Outdated
| // Copyright © 2026 Argmax, Inc. All rights reserved. | ||
| import Accelerate | ||
| @_exported import ArgmaxCore |
| ) | ||
| XCTAssertGreaterThan(result.audio.count, 0, "Audio samples should be non-empty") | ||
| XCTAssertGreaterThan(result.audioDuration, 1.0, "Expect at least 1s of speech") |
There was a problem hiding this comment.
will seed guarantee the audio length is always deterministic?
There was a problem hiding this comment.
Yup, Apple docs recommend using this method https://developer.apple.com/documentation/swift/randomnumbergenerator#Conforming-to-the-RandomNumberGenerator-Protocol
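For context, a minimal sketch of a seedable generator conforming to `RandomNumberGenerator`. SplitMix64 is used here for illustration only; the sampler TTSKit actually uses is not shown in this thread, and `SeededGenerator` and `drawTokens` are hypothetical names. The same seed reproduces the same random draws; whether the audio length is also identical additionally depends on the model outputs themselves being deterministic.

```swift
/// Illustrative seedable generator (SplitMix64); not TTSKit's sampler.
struct SeededGenerator: RandomNumberGenerator {
    private var state: UInt64

    init(seed: UInt64) {
        self.state = seed
    }

    mutating func next() -> UInt64 {
        // Standard SplitMix64 step: advance the state, then mix the bits.
        state &+= 0x9E37_79B9_7F4A_7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58_476D_1CE4_E5B9
        z = (z ^ (z >> 27)) &* 0x94D0_49BB_1331_11EB
        return z ^ (z >> 31)
    }
}

/// Draws `count` pseudo-random token ids from a generator seeded with `seed`.
func drawTokens(seed: UInt64, count: Int, vocabularySize: Int = 1024) -> [Int] {
    var rng = SeededGenerator(seed: seed)
    return (0..<count).map { _ in
        Int.random(in: 0..<vocabularySize, using: &rng)
    }
}
```

Calling `drawTokens(seed:count:)` twice with the same seed returns the same array, which is the property a seeded test relies on.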
| // For licensing see accompanying LICENSE.md file. | ||
| // Copyright © 2024 Argmax, Inc. All rights reserved. | ||
| import ArgmaxCore |
There was a problem hiding this comment.
I think we would want to break these tests down into isolated per-class tests.
e.g. 1: TTSKitTest.swift injects a Config with mocked components and verifies that
TTSKitTest.generateSpeech interacts with the components correctly, tasks are created, etc.
e.g. 2: Qwen3TTSGenerateTaskTest.swift injects mocked components and verifies that run interacts with them correctly.
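A small sketch of the mock-injection style being suggested. The protocols and types below (`TokenGenerating`, `Vocoding`, `SpeechPipeline`) are hypothetical stand-ins; the real TTSKit component protocols and their signatures are not shown in this thread.

```swift
/// Hypothetical component protocols, standing in for the real ones.
protocol TokenGenerating {
    func tokens(for text: String) -> [Int]
}

protocol Vocoding {
    func audio(from tokens: [Int]) -> [Float]
}

/// Hypothetical orchestrator that receives its components by injection.
struct SpeechPipeline {
    let generator: any TokenGenerating
    let vocoder: any Vocoding

    func generateSpeech(text: String) -> [Float] {
        vocoder.audio(from: generator.tokens(for: text))
    }
}

/// Mock that records the text it was asked to tokenize.
final class MockGenerator: TokenGenerating {
    private(set) var receivedTexts: [String] = []

    func tokens(for text: String) -> [Int] {
        receivedTexts.append(text)
        return [1, 2, 3]
    }
}

/// Mock that records the tokens it was asked to turn into audio.
final class MockVocoder: Vocoding {
    private(set) var receivedTokens: [[Int]] = []

    func audio(from tokens: [Int]) -> [Float] {
        receivedTokens.append(tokens)
        return tokens.map { Float($0) }
    }
}
```

A test then builds the pipeline from the mocks, calls `generateSpeech`, and asserts on what each mock recorded, without loading any Core ML model.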
| /// owns its own sampler (derived seed) so concurrent tasks don't share RNG state. | ||
| /// Model components are shared read-only references - `MLModel.prediction()` is | ||
| /// thread-safe. The class is `@unchecked Sendable` to permit `open` subclassing. | ||
| open class TTSGenerateTask: @unchecked Sendable, TTSGenerating { |
There was a problem hiding this comment.
Should the class be renamed to Qwen3TTSGenerateTask? Ditto for the other files under Qwen3TTS.
| /// Serializes access to a value with an `os_unfair_lock` so mutation stays | ||
| /// thread-safe. Useful for properties on types marked `@unchecked Sendable`. | ||
| @propertyWrapper | ||
| public struct PropertyLock<Value: Codable & Sendable>: Sendable, Codable { |
There was a problem hiding this comment.
TL;DR: @ZachNagengast @chen-argmax guys, this doesn't make reference or value type properties truly thread safe.
I was playing around with this, trying to move the Sendable and Codable conformances outside, and ran some checks on both the current implementation and mine. I ran the snippet below with these variations:
- Reference type property Ref
- Value type property Ref
- Plain property of type Int
None of them was safe. Locking the accessors isn't enough; we need to wrap the mutations themselves in locks.
import Foundation
final class Ref: Codable, @unchecked Sendable {
    var count: Int
    init(count: Int = 0) { self.count = count }
    enum CodingKeys: String, CodingKey { case count }
    required init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        self.count = try c.decode(Int.self, forKey: .count)
    }
    func encode(to encoder: Encoder) throws {
        var c = encoder.container(keyedBy: CodingKeys.self)
        try c.encode(count, forKey: .count)
    }
}
final class Holder: @unchecked Sendable {
    // Property wrapper under test (defined elsewhere).
    @TranscriptionPropertyLock var ref = Ref()
}
@main
struct Main {
    static func main() async {
        let workers = max(2, ProcessInfo.processInfo.activeProcessorCount * 2)
        let perWorker = 50_000
        let expected = workers * perWorker
        print("workers=\(workers), perWorker=\(perWorker), expected=\(expected)")
        for run in 1...10 {
            let holder = Holder()
            await withTaskGroup(of: Void.self) { group in
                for _ in 0..<workers {
                    group.addTask {
                        for _ in 0..<perWorker {
                            // Read-modify-write through the wrapper: the race under test.
                            holder.ref.count += 1
                        }
                    }
                }
            }
            let final = holder.ref.count
            print("run \(run): expected=\(expected) actual=\(final)")
        }
    }
}
There was a problem hiding this comment.
The approach used in AudioProcessor PR (also the WhisperKit PR) works. Snippet below:
import Foundation
import os.lock
@usableFromInline
final class UnfairLock: @unchecked Sendable {
    @usableFromInline
    var lock = os_unfair_lock()
    @inlinable
    func withLock<T>(_ body: () throws -> T) rethrows -> T {
        os_unfair_lock_lock(&lock)
        defer { os_unfair_lock_unlock(&lock) }
        return try body()
    }
}
final class Ref {
    var count = 0
}
final class HolderInt: @unchecked Sendable {
    private let stateLock = UnfairLock()
    private var countStorage = 0
    var count: Int {
        get {
            stateLock.withLock { countStorage }
        }
        set {
            stateLock.withLock { countStorage = newValue }
        }
    }
    func increment() {
        stateLock.withLock {
            countStorage += 1
        }
    }
}
final class HolderRef: @unchecked Sendable {
    private let stateLock = UnfairLock()
    private let refStorage = Ref()
    var refCount: Int {
        stateLock.withLock { refStorage.count }
    }
    func incrementRef() {
        stateLock.withLock {
            refStorage.count += 1
        }
    }
}
@main
struct Main {
    static func main() async {
        let workers = max(2, ProcessInfo.processInfo.activeProcessorCount * 2)
        let perWorker = 50_000
        let expected = workers * perWorker
        print("[Int] workers=\(workers), perWorker=\(perWorker), expected=\(expected)")
        for run in 1...10 {
            let holder = HolderInt()
            await withTaskGroup(of: Void.self) { group in
                for _ in 0..<workers {
                    group.addTask {
                        for _ in 0..<perWorker {
                            holder.increment()
                        }
                    }
                }
            }
            let final = holder.count
            print("[Int] run \(run): expected=\(expected) actual=\(final)")
        }
        print("[Ref] workers=\(workers), perWorker=\(perWorker), expected=\(expected)")
        for run in 1...10 {
            let holder = HolderRef()
            await withTaskGroup(of: Void.self) { group in
                for _ in 0..<workers {
                    group.addTask {
                        for _ in 0..<perWorker {
                            holder.incrementRef()
                        }
                    }
                }
            }
            let final = holder.refCount
            print("[Ref] run \(run): expected=\(expected) actual=\(final)")
        }
    }
}
There was a problem hiding this comment.
This is a valid concern: essentially, if the wrapped property itself has another property, a read/write of that nested property won't be thread safe.
e.g. this is thread safe
holder.ref = otherRef
this is not thread safe
holder.ref.count += 1
@ZachNagengast we may want to add documentation for this wrapper.
There was a problem hiding this comment.
I think it isn't safe to use it for pure value type properties (e.g. Int) either. We probably need to use _modify instead of set.
I am checking these resources:
There was a problem hiding this comment.
Making a note in this PR but will leave the fix to a followup 👍
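For the follow-up, a rough sketch of the `_modify` idea raised in this thread. This is an assumption about how it could look, not the planned fix: `LockedBox` and `runModifyDemo` are hypothetical names, `NSLock` stands in for the unfair lock, and `_modify` is an underscored, unofficial accessor.

```swift
import Foundation

/// Sketch only: relies on the underscored, unofficial `_modify` accessor.
final class LockedBox<Value>: @unchecked Sendable {
    private let lock = NSLock()
    private var storage: Value

    init(_ value: Value) {
        self.storage = value
    }

    var value: Value {
        get {
            lock.lock()
            defer { lock.unlock() }
            return storage
        }
        _modify {
            // The lock is held for the caller's whole in-place mutation, so
            // `box.value += 1` is one critical section, not get-then-set.
            lock.lock()
            defer { lock.unlock() }
            yield &storage
        }
    }
}

/// Increments the boxed Int from several threads and returns the total.
func runModifyDemo(workers: Int, perWorker: Int) -> Int {
    let box = LockedBox(0)
    DispatchQueue.concurrentPerform(iterations: workers) { _ in
        for _ in 0..<perWorker {
            box.value += 1
        }
    }
    return box.value
}
```

This only helps value types. When the stored value is a class instance, `holder.ref.count += 1` merely reads the reference through the getter and mutates the object outside the lock, so that case still needs an explicit locked mutation method.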
I am trying to run the 1.7B model on a MacBook Air M1. The 0.6B version worked fine, but with the 1.7B model it first specializes the model for the device, then loads it, and while generating it stops and throws this error: Unable to compute the prediction using ML Program. It can be an invalid input data or broken/unsupported model.
chen-argmax
left a comment
There was a problem hiding this comment.
Approved with a comment to add doc to PropertyLock.
WhisperKit is expanding into text-to-speech!
TTSKit adds a new library for on-device text-to-speech using Core ML-accelerated Qwen3-TTS models (CustomVoice 0.6B and 1.7B in this first release) with real-time streaming playback on Apple Silicon. In this first PR, we're introducing the library into the WhisperKit package (WhisperKit will be renamed to reflect the new multi-Kit nature of Argmax Open-source SDK) as an optional import to add real-time TTS capabilities with a state-of-the-art open-source model, either on its own or as a complement to WhisperKit speech-to-text.
This PR is still in the final phases of development, but here are a few highlights:
TTSKit Library
TextProjecting, CodeEmbedding, MultiCodeEmbedding, CodeDecoding, MultiCodeDecoding, SpeechDecoding) for plugging in new model backends.
TTSPlaybackStrategy.auto) that measures first-step latency to pre-buffer just enough audio.
Example usage playing audio in real-time out of the default speaker:
New target: ArgmaxCore
CLI
tts that can be used like this:
swift run whisperkit-cli tts --text "Hello from TTSKit" --play
TTSKit Example app
Roadmap
We plan to continue to add support for state-of-the-art models and improve inference latency for TTSKit over the next few weeks. The immediate follow-ups are the voice cloning feature from Qwen3-TTS and a 2x reduction in time-to-first-byte (TTFB), so that this on-device project consistently achieves sub-100 ms TTFB, providing a latency edge over cloud deployments of the same model. In the meantime, we encourage anyone reading this to check out this PR, give it a spin, and let us know how it goes!
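To make the pre-buffering idea behind `TTSPlaybackStrategy.auto` from the highlights more concrete, here is a back-of-the-envelope model. It is an illustration under simplifying assumptions (constant per-step latency, constant audio per step, known step count), not TTSKit's actual logic; `stepsToPrebuffer` and its parameters are hypothetical.

```swift
/// Illustrative model only; not TTSKit's actual buffering logic.
/// Returns how many generation steps to wait for before starting playback
/// so that playback never runs out of audio.
func stepsToPrebuffer(stepLatency: Double, audioPerStep: Double, totalSteps: Int) -> Int {
    precondition(stepLatency > 0 && audioPerStep > 0 && totalSteps > 0)

    // Generation keeps up with playback: one step of audio is enough.
    if stepLatency <= audioPerStep { return 1 }

    // Chunk i is ready at i * stepLatency. If playback starts after k steps,
    // chunk i is needed at k * stepLatency + (i - 1) * audioPerStep.
    // The last chunk (i = N) is the tightest constraint, which gives
    // k >= N - (N - 1) * audioPerStep / stepLatency.
    let n = Double(totalSteps)
    let k = n - (n - 1) * audioPerStep / stepLatency
    return min(totalSteps, max(1, Int(k.rounded(.up))))
}
```

For example, with 100 ms per step, 80 ms of audio per step, and 100 steps, the model asks for 21 steps of pre-buffering; when a step is faster than the audio it produces, a single step suffices.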