migrate from tool to event for LLM prefill progress #1459
base: main
Conversation
📝 Walkthrough

Adds an LLMEvent enum and wires an event-emission pathway from the local-LLM server to the Tauri frontend. ServerState now accepts and stores an emitter, propagating progress via LLMEvent::Progress. start_server injects the emitter. Specta is configured to include the event type. Streaming no longer carries progress; it's emitted separately.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
participant Frontend
participant TauriApp as Tauri App
participant Ext as start_server (ext.rs)
participant Server as ServerState (server.rs)
participant Provider as LocalProvider
participant Model as Model Runtime
participant EventBus as tauri_specta::Event
Frontend->>TauriApp: Invoke start_server
TauriApp->>Ext: start_server()
note right of Ext: Create emitter closure that calls<br/>EventBus.emit(LLMEvent)
Ext->>Server: ServerState::new(emitter, model_manager)
Server->>Provider: Construct with emitter
Frontend->>TauriApp: Request chat completions
TauriApp->>Server: chat_completions(...)
Server->>Provider: build_chat_completion_response(progress_fn=emitter, ...)
Provider->>Model: Generate tokens with progress callback
loop Progress updates
Model-->>Provider: progress(f64)
Provider-->>Server: progress_fn(f64)
Server-->>EventBus: emit LLMEvent::Progress(value)
EventBus-->>Frontend: progress event
end
Model-->>Provider: final result
Provider-->>Server: stream deltas (no progress in stream)
Server-->>TauriApp: streaming response
TauriApp-->>Frontend: streaming response
```
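To make the progress loop above concrete, the hookup between the provider callback and the event bus presumably reduces to something like this (a minimal sketch; `emitter` and `progress_fn` are the names used in the diagram, and `LLMEvent` is the PR's event type, so the exact fields in the code may differ):

```rust
use std::sync::Arc;

// Hypothetical glue implied by the diagram: every progress callback from the
// model runtime is forwarded to the frontend as a typed LLMEvent.
fn make_progress_fn(
    emitter: Arc<dyn Fn(LLMEvent) + Send + Sync>,
) -> impl Fn(f64) + Send + Sync {
    move |progress: f64| emitter(LLMEvent::Progress(progress))
}
```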
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Pre-merge checks (1 passed, 2 warnings)

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Tip: 👮 Agentic pre-merge checks are now available in preview! Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Example:

```yaml
reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Flag potential breaking changes that are not documented:
          1. Identify changes to public APIs/exports, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints (including removed/renamed items and changes to types, required params, return values, defaults, or behavior).
          2. Ignore purely internal/private changes (e.g., code not exported from package entry points or marked internal).
          3. Verify documentation exists: a "Breaking Change" section in the PR description and updates to CHANGELOG.md.
```

✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/local-llm/src/server.rs (1)
444-497: Tool-call chunk index is incorrect; use a per-tool-call counter, not the stream item index.

`.enumerate()` counts all events (text, progress, tool), so the index you emit can jump or overflow spec expectations. Track a local counter for tool calls instead. Apply this diff:
```diff
-    let stream = Box::pin(
-        source_stream
-            .enumerate()
-            .map(move |(index, event)| {
+    let mut tool_idx: u32 = 0;
+    let stream = Box::pin(
+        source_stream
+            .map(move |event| {
                 let delta_template = empty_stream_response_delta.clone();
                 let response_template = base_stream_response_template.clone();

                 match event {
                     StreamEvent::Response(llama_response) => match llama_response {
                         hypr_llama::Response::TextDelta(chunk) => {
                             Some(Ok(CreateChatCompletionStreamResponse {
                                 choices: vec![ChatChoiceStream {
                                     index: 0,
                                     delta: ChatCompletionStreamResponseDelta {
                                         content: Some(chunk),
                                         ..delta_template
                                     },
                                     finish_reason: None,
                                     logprobs: None,
                                 }],
                                 ..response_template
                             }))
                         }
                         hypr_llama::Response::Reasoning(_) => None,
                         hypr_llama::Response::ToolCall { name, arguments } => {
+                            // OpenAI spec: index is per-tool-call sequence within the message, starting at 0.
+                            let idx = tool_idx;
+                            tool_idx = tool_idx.saturating_add(1);
                             Some(Ok(CreateChatCompletionStreamResponse {
                                 choices: vec![ChatChoiceStream {
                                     index: 0,
                                     delta: ChatCompletionStreamResponseDelta {
                                         tool_calls: Some(vec![
                                             ChatCompletionMessageToolCallChunk {
-                                                index: index.try_into().unwrap_or(0),
+                                                index: idx,
                                                 id: Some(uuid::Uuid::new_v4().to_string()),
                                                 r#type: Some(ChatCompletionToolType::Function),
                                                 function: Some(FunctionCallStream {
                                                     name: Some(name),
                                                     arguments: Some(
                                                         serde_json::to_string(&arguments)
                                                             .unwrap_or_default(),
                                                     ),
                                                 }),
                                             },
                                         ]),
                                         ..delta_template
                                     },
                                     finish_reason: None,
                                     logprobs: None,
                                 }],
                                 ..response_template
                             }))
                         }
                     },
                     StreamEvent::Progress(progress) => {
                         progress_fn(progress);
                         None
                     }
                 }
             })
             .filter_map(|x| async move { x }),
     );
```
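The essence of the suggested fix, shown in isolation: a mutable counter captured by the mapping closure that only advances on tool-call items, so text and progress items cannot skew the per-tool-call index (a self-contained illustration, not code from the PR):

```rust
// Only "tool" items advance the counter; other items leave it untouched.
fn main() {
    let items = ["text", "tool", "progress", "tool"];
    let mut tool_idx: u32 = 0;
    let indexed: Vec<Option<u32>> = items
        .iter()
        .map(|item| {
            if *item == "tool" {
                let idx = tool_idx;
                tool_idx = tool_idx.saturating_add(1);
                Some(idx)
            } else {
                None
            }
        })
        .collect();
    assert_eq!(indexed, vec![None, Some(0), None, Some(1)]);
}
```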
🧹 Nitpick comments (3)
plugins/local-llm/src/events.rs (1)
34-38: Event shape looks good; consider documenting expected range.

If progress is normalized (0.0–1.0), add a short doc comment so frontend semantics stay consistent across providers.
```diff
 #[derive(serde::Serialize, Clone, specta::Type, tauri_specta::Event)]
 pub enum LLMEvent {
+    /// Normalized generation progress in [0.0, 1.0].
     #[serde(rename = "progress")]
     Progress(f64),
 }
```
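If the range does get documented as [0.0, 1.0], clamping at the emit site is one cheap way to keep the contract honest even when a backend reports a slightly out-of-range value (an illustrative sketch, not part of the PR; `emit_progress` is a hypothetical helper):

```rust
// Illustrative helper: guarantees the documented [0.0, 1.0] range at the boundary.
fn emit_progress(emitter: &dyn Fn(LLMEvent), raw: f64) {
    emitter(LLMEvent::Progress(raw.clamp(0.0, 1.0)));
}
```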
plugins/local-llm/src/server.rs (2)
236-244: Avoid unbounded progress buffering; coalesce with a `watch` channel.

Progress can be high-frequency; `mpsc::unbounded_channel` risks unbounded growth. `watch` keeps only the latest value.

```diff
-    let (progress_sender, mut progress_receiver) = mpsc::unbounded_channel::<f64>();
+    let (progress_sender, mut progress_receiver) = tokio::sync::watch::channel::<f64>(0.0);
     let (response_stream, cancellation_token) = model.generate_stream_with_callback(
         request,
         Box::new(move |v| {
-            let _ = progress_sender.send(v);
+            let _ = progress_sender.send(v);
         }),
     )?;

     let mixed_stream = async_stream::stream! {
         tokio::pin!(response_stream);

         loop {
             tokio::select! {
                 response_result = response_stream.next() => {
                     match response_result {
                         Some(response) => yield StreamEvent::Response(response),
                         None => break,
                     }
                 },
-                progress_result = progress_receiver.recv() => {
-                    match progress_result {
-                        Some(progress) => yield StreamEvent::Progress(progress),
-                        None => {}
-                    }
-                }
+                _ = progress_receiver.changed() => {
+                    yield StreamEvent::Progress(*progress_receiver.borrow());
+                }
             }
         }
     };
```

Also applies to: 245-264, 240-243, 256-261
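For context on why `watch` fits here: the channel keeps only the most recent value, so a burst of callbacks between two awaits collapses into a single event carrying the latest progress. A standalone sketch of that behavior (assumes the `tokio` crate with the `sync`, `macros`, and `rt` features):

```rust
#[tokio::main]
async fn main() {
    let (tx, mut rx) = tokio::sync::watch::channel(0.0_f64);

    // Several rapid updates; each send overwrites the stored value.
    for p in [0.1, 0.2, 0.3] {
        tx.send(p).unwrap();
    }

    // A single wake-up observes only the latest value.
    rx.changed().await.unwrap();
    assert_eq!(*rx.borrow(), 0.3);
}
```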
169-179: Progress events lack correlation; add a request id to avoid ambiguity under concurrency.

If multiple chats run concurrently, `LLMEvent::Progress(f64)` is ambiguous. Consider emitting `LLMEvent::Progress { id, value }` where `id` matches the response id.

Would you like a follow-up patch that threads a `Uuid` from `build_chat_completion_response` into `progress_fn` and extends `LLMEvent` accordingly?

Also applies to: 327-335
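If that follow-up lands, the variant could carry the correlation id directly. A sketch of what the suggestion describes (field names are illustrative; the derives are copied from the existing enum):

```rust
// Hypothetical correlated shape; `id` would match the chat completion response id.
#[derive(serde::Serialize, Clone, specta::Type, tauri_specta::Event)]
pub enum LLMEvent {
    #[serde(rename = "progress")]
    Progress { id: String, value: f64 },
}
```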
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- plugins/local-llm/src/events.rs (1 hunks)
- plugins/local-llm/src/ext.rs (2 hunks)
- plugins/local-llm/src/lib.rs (1 hunks)
- plugins/local-llm/src/server.rs (9 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{js,ts,tsx,rs}
⚙️ CodeRabbit configuration file
**/*.{js,ts,tsx,rs}:
1. Do not add any error handling. Keep the existing one.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".
Files:
plugins/local-llm/src/events.rs
plugins/local-llm/src/lib.rs
plugins/local-llm/src/ext.rs
plugins/local-llm/src/server.rs
🧬 Code graph analysis (2)
plugins/local-llm/src/lib.rs (2)
plugins/local-stt/src/lib.rs (1): tauri_specta (34-58)
plugins/template/src/lib.rs (1): tauri_specta (26-32)
plugins/local-llm/src/ext.rs (1)
plugins/local-llm/src/server.rs (2)
new (47-56), new (151-156)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: ci (windows, windows-latest)
- GitHub Check: ci (macos, macos-14)
🔇 Additional comments (5)
plugins/local-llm/src/lib.rs (1)
43-44: Specta events wiring LGTM.

Type-gen for LLMEvent will now flow to TS. No further changes needed here.
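For reference, the usual tauri-specta v2 pattern for getting an event type into the generated TypeScript bindings looks roughly like this (a general sketch of the pattern, not the exact lib.rs diff):

```rust
// Registering the event with the specta builder so its type flows into the
// generated TypeScript bindings.
let builder = tauri_specta::Builder::<tauri::Wry>::new()
    .events(tauri_specta::collect_events![crate::events::LLMEvent]);
```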
plugins/local-llm/src/server.rs (2)
41-56: Emitter plumbing is clean and decouples progress from the SSE stream.

Passing an Arc<Fn(LLMEvent)> through ServerState and LocalProvider is a solid separation for side-channel progress.
Also applies to: 150-156, 166-178
218-229: Filtering out `progress_update` tool calls at source is the right move.

Prevents polluting the streamed delta with internal control signals.
plugins/local-llm/src/ext.rs (2)
8-8: Import for trait method resolution looks correct.

Bringing `tauri_specta::Event` into scope is required so `event.emit(&handle)` resolves; no unused import here.
198-204: Emitter closure already satisfies Send + Sync + 'static.

AppHandle in this Tauri version implements Send and Sync (as asserted in the crate's tests and docs), so the closure meets `Fn(LLMEvent) + Send + Sync + 'static`. No changes needed.
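Concretely, the bound in question is satisfied by a closure of roughly this shape (an assumed sketch of the `start_server` wiring; `make_emitter` and the exact emitter type alias are illustrative, and `LLMEvent` is the PR's event type):

```rust
use std::sync::Arc;
use tauri_specta::Event;

// AppHandle: Send + Sync, so the capturing closure meets
// Fn(LLMEvent) + Send + Sync + 'static.
fn make_emitter(handle: tauri::AppHandle) -> Arc<dyn Fn(LLMEvent) + Send + Sync + 'static> {
    Arc::new(move |event: LLMEvent| {
        // Forward the typed event to the frontend; errors are intentionally ignored here.
        let _ = event.emit(&handle);
    })
}
```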
No description provided.