
Conversation

yujonglee
Contributor

No description provided.


coderabbitai bot commented Sep 9, 2025

📝 Walkthrough

Adds an LLMEvent enum and wires an event-emission pathway from the local-LLM server to the Tauri frontend. ServerState now accepts and stores an emitter, propagating progress via LLMEvent::Progress. start_server injects the emitter. Specta is configured to include the event type. Streaming no longer carries progress; it’s emitted separately.

Changes

| Cohort / File(s) | Summary of Changes |
| --- | --- |
| Event type introduction<br/>plugins/local-llm/src/events.rs | New public enum LLMEvent (derives Serialize, Clone, specta::Type, tauri_specta::Event) with variant Progress(f64) serialized as "progress". |
| Server emitter and progress flow<br/>plugins/local-llm/src/server.rs | ServerState now stores `emitter: Arc<dyn Fn(LLMEvent) + Send + Sync>`; ServerState::new signature updated to accept the emitter. LocalProvider receives the emitter; progress is routed via LLMEvent::Progress. build_chat_completion_response adds a progress_fn param; streaming no longer embeds progress. Mock updated with a no-op progress callback. |
| Tauri integration and wiring<br/>plugins/local-llm/src/ext.rs | Adds tauri_specta::Event import. Captures the app handle and defines a closure that emits LLMEvent to the frontend; passes the emitter into ServerState::new. Control flow otherwise unchanged. |
| Specta builder configuration<br/>plugins/local-llm/src/lib.rs | Specta builder now registers events::LLMEvent via events(...) to generate event types. No public API changes here. |
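To make the wiring above concrete, here is a minimal sketch of the emitter plumbing, assuming only the shapes named in the table; `report_progress` and the reduced derive list are illustrative, not the PR's actual code.

```rust
use std::sync::Arc;

// Simplified stand-in for the enum in events.rs (the real one also derives
// serde::Serialize, specta::Type, and tauri_specta::Event).
#[derive(Clone)]
pub enum LLMEvent {
    Progress(f64),
}

// Sketch of the server-side plumbing in server.rs; all other fields are elided
// and `report_progress` is a hypothetical helper.
pub struct ServerState {
    emitter: Arc<dyn Fn(LLMEvent) + Send + Sync>,
}

impl ServerState {
    pub fn new(emitter: Arc<dyn Fn(LLMEvent) + Send + Sync>) -> Self {
        Self { emitter }
    }

    // Progress from the model runtime goes through the emitter instead of the SSE stream.
    fn report_progress(&self, value: f64) {
        (self.emitter)(LLMEvent::Progress(value));
    }
}
```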

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Frontend
  participant TauriApp as Tauri App
  participant Ext as start_server (ext.rs)
  participant Server as ServerState (server.rs)
  participant Provider as LocalProvider
  participant Model as Model Runtime
  participant EventBus as tauri_specta::Event

  Frontend->>TauriApp: Invoke start_server
  TauriApp->>Ext: start_server()
  note right of Ext: Create emitter closure that calls<br/>EventBus.emit(LLMEvent)
  Ext->>Server: ServerState::new(emitter, model_manager)
  Server->>Provider: Construct with emitter
  Frontend->>TauriApp: Request chat completions
  TauriApp->>Server: chat_completions(...)
  Server->>Provider: build_chat_completion_response(progress_fn=emitter, ...)
  Provider->>Model: Generate tokens with progress callback
  loop Progress updates
    Model-->>Provider: progress(f64)
    Provider-->>Server: progress_fn(f64)
    Server-->>EventBus: emit LLMEvent::Progress(value)
    EventBus-->>Frontend: progress event
  end
  Model-->>Provider: final result
  Provider-->>Server: stream deltas (no progress in stream)
  Server-->>TauriApp: streaming response
  TauriApp-->>Frontend: streaming response
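As a rough sketch of steps 2–3 in the diagram, the emitter closure built in start_server could look roughly like this; `make_emitter` is a hypothetical helper (the PR builds the closure inline), and the `event.emit(&handle)` call relies on `tauri_specta::Event` being in scope, as the review below notes.

```rust
use std::sync::Arc;
use tauri_specta::Event; // needed so `emit` resolves on the derived event type

#[derive(serde::Serialize, Clone, specta::Type, tauri_specta::Event)]
pub enum LLMEvent {
    #[serde(rename = "progress")]
    Progress(f64),
}

// Hypothetical helper; in the PR the closure is defined inline in start_server (ext.rs).
fn make_emitter<R: tauri::Runtime>(
    app: tauri::AppHandle<R>,
) -> Arc<dyn Fn(LLMEvent) + Send + Sync> {
    Arc::new(move |event: LLMEvent| {
        // Emit failures are ignored in this sketch; the real code keeps its existing error handling.
        let _ = event.emit(&app);
    })
}
```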

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • LLM inference input token process tracking #1086 — Also adds progress event handling in local-LLM server paths (introduces a Progress event and updates response-building signatures), closely mirroring the emitter-based progress propagation in this PR.

Pre-merge checks (1 passed, 2 warnings)

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description Check | ⚠️ Warning | The pull request description is empty and does not provide any information about the changes, leaving reviewers without context for the modifications introduced. | Please add a summary of the key changes, such as introducing the LLMEvent enum, wiring the event emitter in start_server, updating ServerState and LocalProvider to emit progress events, and adjusting the Specta builder configuration. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 11.11%, which is insufficient; the required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title succinctly summarizes the primary change of migrating the LLM prefill progress mechanism from a tool-based approach to an event-driven model, aligning directly with the core modifications in the changeset. |

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Up to 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Flag potential breaking changes that are not documented:
          1. Identify changes to public APIs/exports, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints (including removed/renamed items and changes to types, required params, return values, defaults, or behavior).
          2. Ignore purely internal/private changes (e.g., code not exported from package entry points or marked internal).
          3. Verify documentation exists: a "Breaking Change" section in the PR description and updates to CHANGELOG.md.

Please share your feedback with us on this Discord post.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch llm-prefill-event

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/local-llm/src/server.rs (1)

444-497: Tool-call chunk index is incorrect; use a per-tool-call counter, not the stream item index.

.enumerate() counts every stream event (text, progress, tool), so the tool_calls index you emit can skip values or exceed what the spec expects for a per-tool-call sequence. Track a local counter for tool calls instead.

Apply this diff:

-        let stream = Box::pin(
-            source_stream
-                .enumerate()
-                .map(move |(index, event)| {
+        let mut tool_idx: u32 = 0;
+        let stream = Box::pin(
+            source_stream
+                .map(move |event| {
                     let delta_template = empty_stream_response_delta.clone();
                     let response_template = base_stream_response_template.clone();

                     match event {
                         StreamEvent::Response(llama_response) => match llama_response {
                             hypr_llama::Response::TextDelta(chunk) => {
                                 Some(Ok(CreateChatCompletionStreamResponse {
                                     choices: vec![ChatChoiceStream {
                                         index: 0,
                                         delta: ChatCompletionStreamResponseDelta {
                                             content: Some(chunk),
                                             ..delta_template
                                         },
                                         finish_reason: None,
                                         logprobs: None,
                                     }],
                                     ..response_template
                                 }))
                             }
                             hypr_llama::Response::Reasoning(_) => None,
                             hypr_llama::Response::ToolCall { name, arguments } => {
+                                // OpenAI spec: index is per-tool-call sequence within the message, starting at 0.
+                                let idx = tool_idx;
+                                tool_idx = tool_idx.saturating_add(1);
                                 Some(Ok(CreateChatCompletionStreamResponse {
                                     choices: vec![ChatChoiceStream {
                                         index: 0,
                                         delta: ChatCompletionStreamResponseDelta {
                                             tool_calls: Some(vec![
                                                 ChatCompletionMessageToolCallChunk {
-                                                    index: index.try_into().unwrap_or(0),
+                                                    index: idx,
                                                     id: Some(uuid::Uuid::new_v4().to_string()),
                                                     r#type: Some(ChatCompletionToolType::Function),
                                                     function: Some(FunctionCallStream {
                                                         name: Some(name),
                                                         arguments: Some(
                                                             serde_json::to_string(&arguments)
                                                                 .unwrap_or_default(),
                                                         ),
                                                     }),
                                                 },
                                             ]),
                                             ..delta_template
                                         },
                                         finish_reason: None,
                                         logprobs: None,
                                     }],
                                     ..response_template
                                 }))
                             }
                         },
                         StreamEvent::Progress(progress) => {
                             progress_fn(progress);
                             None
                         }
                     }
                 })
                 .filter_map(|x| async move { x }),
         );
🧹 Nitpick comments (3)
plugins/local-llm/src/events.rs (1)

34-38: Event shape looks good; consider documenting expected range.

If progress is normalized (0.0–1.0), add a short doc comment so frontend semantics stay consistent across providers.

 #[derive(serde::Serialize, Clone, specta::Type, tauri_specta::Event)]
 pub enum LLMEvent {
+    /// Normalized generation progress in [0.0, 1.0].
     #[serde(rename = "progress")]
     Progress(f64),
 }
plugins/local-llm/src/server.rs (2)

236-244: Avoid unbounded progress buffering; coalesce with watch channel.

Progress can be high-frequency; mpsc::unbounded_channel risks unbounded growth. watch keeps only the latest value.

-        let (progress_sender, mut progress_receiver) = mpsc::unbounded_channel::<f64>();
+        let (progress_sender, mut progress_receiver) = tokio::sync::watch::channel::<f64>(0.0);

         let (response_stream, cancellation_token) = model.generate_stream_with_callback(
             request,
             Box::new(move |v| {
-                let _ = progress_sender.send(v);
+                let _ = progress_sender.send(v);
             }),
         )?;

         let mixed_stream = async_stream::stream! {
             tokio::pin!(response_stream);

             loop {
                 tokio::select! {
                     response_result = response_stream.next() => {
                         match response_result {
                             Some(response) => yield StreamEvent::Response(response),
                             None => break,
                         }
                     },
-                    progress_result = progress_receiver.recv() => {
-                        match progress_result {
-                            Some(progress) => yield StreamEvent::Progress(progress),
-                            None => {}
-                        }
-                    }
+                    _ = progress_receiver.changed() => {
+                        yield StreamEvent::Progress(*progress_receiver.borrow());
+                    }
                 }
             }
         };

Also applies to: 245-264, 240-243, 256-261


169-179: Progress events lack correlation; add a request id to avoid ambiguity under concurrency.

If multiple chats run concurrently, LLMEvent::Progress(f64) is ambiguous. Consider emitting LLMEvent::Progress { id, value } where id matches the response id.

Would you like a follow-up patch that threads a Uuid from build_chat_completion_response into progress_fn and extends LLMEvent accordingly?

Also applies to: 327-335
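One possible shape for that suggestion, sketched below; the field names and the String id type are illustrative, and this variant would replace the current Progress(f64).

```rust
// Hedged sketch of a correlated progress event; field names are illustrative.
#[derive(serde::Serialize, Clone, specta::Type, tauri_specta::Event)]
pub enum LLMEvent {
    #[serde(rename = "progress")]
    Progress {
        // Matches the chat completion response id so the frontend can attribute
        // progress to the right in-flight request.
        id: String,
        value: f64,
    },
}
```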

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af3e094 and 13307d2.

📒 Files selected for processing (4)
  • plugins/local-llm/src/events.rs (1 hunks)
  • plugins/local-llm/src/ext.rs (2 hunks)
  • plugins/local-llm/src/lib.rs (1 hunks)
  • plugins/local-llm/src/server.rs (9 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{js,ts,tsx,rs}

⚙️ CodeRabbit configuration file

**/*.{js,ts,tsx,rs}: 1. Do not add any error handling. Keep the existing one.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".

Files:

  • plugins/local-llm/src/events.rs
  • plugins/local-llm/src/lib.rs
  • plugins/local-llm/src/ext.rs
  • plugins/local-llm/src/server.rs
🧬 Code graph analysis (2)
plugins/local-llm/src/lib.rs (2)
plugins/local-stt/src/lib.rs (1)
  • tauri_specta (34-58)
plugins/template/src/lib.rs (1)
  • tauri_specta (26-32)
plugins/local-llm/src/ext.rs (1)
plugins/local-llm/src/server.rs (2)
  • new (47-56)
  • new (151-156)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: ci (windows, windows-latest)
  • GitHub Check: ci (macos, macos-14)
🔇 Additional comments (5)
plugins/local-llm/src/lib.rs (1)

43-44: Specta events wiring LGTM.

Type-gen for LLMEvent will now flow to TS. No further changes needed here.
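For reference, the registration described here typically looks something like the following in tauri-specta v2; the surrounding builder setup (commands, plugin name) is elided and may differ from the actual lib.rs.

```rust
// Hedged sketch; assumes tauri-specta v2's collect_events! macro and that
// LLMEvent lives in the crate's `events` module as in this PR.
fn specta_builder() -> tauri_specta::Builder<tauri::Wry> {
    tauri_specta::Builder::<tauri::Wry>::new()
        .events(tauri_specta::collect_events![crate::events::LLMEvent])
}
```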

plugins/local-llm/src/server.rs (2)

41-56: Emitter plumbing is clean and decouples progress from the SSE stream.

Passing an Arc<Fn(LLMEvent)> through ServerState and LocalProvider is a solid separation for side-channel progress.

Also applies to: 150-156, 166-178


218-229: Filtering out progress_update tool calls at source is the right move.

Prevents polluting the streamed delta with internal control signals.
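As a loose illustration of what filtering at the source can look like (simplified stand-in types, not the PR's actual request structures):

```rust
// Simplified stand-in; the real code filters the internal progress_update tool
// out of the request/stream handling in server.rs.
struct ToolCall {
    name: String,
}

fn strip_progress_tool(calls: Vec<ToolCall>) -> Vec<ToolCall> {
    calls
        .into_iter()
        .filter(|call| call.name != "progress_update")
        .collect()
}
```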

plugins/local-llm/src/ext.rs (2)

8-8: Import for trait method resolution looks correct.

Bringing tauri_specta::Event into scope is required so event.emit(&handle) resolves; no unused import here.


198-204: Emitter closure already satisfies Send + Sync + 'static.
AppHandle in this Tauri version implements Send and Sync (as asserted in the crate’s tests and docs), so the closure meets Fn(LLMEvent) + Send + Sync + 'static. No changes needed.
