migrate from tool to event for LLM prefill progress #1459
base: main
Conversation
📝 Walkthrough

Adds an LLMEvent enum and wires an event-emission pathway from the local-LLM server to the Tauri frontend. ServerState now accepts and stores an emitter, propagating progress via LLMEvent::Progress. start_server injects the emitter. Specta is configured to include the event type. Streaming no longer carries progress; it's emitted separately.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
participant Frontend
participant TauriApp as Tauri App
participant Ext as start_server (ext.rs)
participant Server as ServerState (server.rs)
participant Provider as LocalProvider
participant Model as Model Runtime
participant EventBus as tauri_specta::Event
Frontend->>TauriApp: Invoke start_server
TauriApp->>Ext: start_server()
note right of Ext: Create emitter closure that calls<br/>EventBus.emit(LLMEvent)
Ext->>Server: ServerState::new(emitter, model_manager)
Server->>Provider: Construct with emitter
Frontend->>TauriApp: Request chat completions
TauriApp->>Server: chat_completions(...)
Server->>Provider: build_chat_completion_response(progress_fn=emitter, ...)
Provider->>Model: Generate tokens with progress callback
loop Progress updates
Model-->>Provider: progress(f64)
Provider-->>Server: progress_fn(f64)
Server-->>EventBus: emit LLMEvent::Progress(value)
EventBus-->>Frontend: progress event
end
Model-->>Provider: final result
Provider-->>Server: stream deltas (no progress in stream)
Server-->>TauriApp: streaming response
TauriApp-->>Frontend: streaming response
```
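To make the progress loop above concrete, the hookup between the provider callback and the event bus presumably reduces to something like this (a minimal sketch; `emitter` and `progress_fn` are the names used in the diagram, and `LLMEvent` is the PR's event type, so the exact fields in the code may differ):

```rust
use std::sync::Arc;

// Hypothetical glue implied by the diagram: every progress callback from the
// model runtime is forwarded to the frontend as a typed LLMEvent.
fn make_progress_fn(
    emitter: Arc<dyn Fn(LLMEvent) + Send + Sync>,
) -> impl Fn(f64) + Send + Sync {
    move |progress: f64| emitter(LLMEvent::Progress(progress))
}
```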
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Pre-merge checks (1 passed, 2 warnings)

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Tip: 👮 Agentic pre-merge checks are now available in preview! Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Example:

```yaml
reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Flag potential breaking changes that are not documented:
          1. Identify changes to public APIs/exports, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints (including removed/renamed items and changes to types, required params, return values, defaults, or behavior).
          2. Ignore purely internal/private changes (e.g., code not exported from package entry points or marked internal).
          3. Verify documentation exists: a "Breaking Change" section in the PR description and updates to CHANGELOG.md.
```

✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/local-llm/src/server.rs (1)
444-497: Tool-call chunk index is incorrect; use a per-tool-call counter, not the stream item index.

`.enumerate()` counts all events (text, progress, tool), so the index you emit can jump or overflow spec expectations. Track a local counter for tool calls instead. Apply this diff:
```diff
-    let stream = Box::pin(
-        source_stream
-            .enumerate()
-            .map(move |(index, event)| {
+    let mut tool_idx: u32 = 0;
+    let stream = Box::pin(
+        source_stream
+            .map(move |event| {
                 let delta_template = empty_stream_response_delta.clone();
                 let response_template = base_stream_response_template.clone();

                 match event {
                     StreamEvent::Response(llama_response) => match llama_response {
                         hypr_llama::Response::TextDelta(chunk) => {
                             Some(Ok(CreateChatCompletionStreamResponse {
                                 choices: vec![ChatChoiceStream {
                                     index: 0,
                                     delta: ChatCompletionStreamResponseDelta {
                                         content: Some(chunk),
                                         ..delta_template
                                     },
                                     finish_reason: None,
                                     logprobs: None,
                                 }],
                                 ..response_template
                             }))
                         }
                         hypr_llama::Response::Reasoning(_) => None,
                         hypr_llama::Response::ToolCall { name, arguments } => {
+                            // OpenAI spec: index is per-tool-call sequence within the message, starting at 0.
+                            let idx = tool_idx;
+                            tool_idx = tool_idx.saturating_add(1);
                             Some(Ok(CreateChatCompletionStreamResponse {
                                 choices: vec![ChatChoiceStream {
                                     index: 0,
                                     delta: ChatCompletionStreamResponseDelta {
                                         tool_calls: Some(vec![
                                             ChatCompletionMessageToolCallChunk {
-                                                index: index.try_into().unwrap_or(0),
+                                                index: idx,
                                                 id: Some(uuid::Uuid::new_v4().to_string()),
                                                 r#type: Some(ChatCompletionToolType::Function),
                                                 function: Some(FunctionCallStream {
                                                     name: Some(name),
                                                     arguments: Some(
                                                         serde_json::to_string(&arguments)
                                                             .unwrap_or_default(),
                                                     ),
                                                 }),
                                             },
                                         ]),
                                         ..delta_template
                                     },
                                     finish_reason: None,
                                     logprobs: None,
                                 }],
                                 ..response_template
                             }))
                         }
                     },
                     StreamEvent::Progress(progress) => {
                         progress_fn(progress);
                         None
                     }
                 }
             })
             .filter_map(|x| async move { x }),
     );
```
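The essence of the suggested fix, shown in isolation: a mutable counter captured by the mapping closure that only advances on tool-call items, so text and progress items cannot skew the per-tool-call index (a self-contained illustration, not code from the PR):

```rust
// Only "tool" items advance the counter; other items leave it untouched.
fn main() {
    let items = ["text", "tool", "progress", "tool"];
    let mut tool_idx: u32 = 0;
    let indexed: Vec<Option<u32>> = items
        .iter()
        .map(|item| {
            if *item == "tool" {
                let idx = tool_idx;
                tool_idx = tool_idx.saturating_add(1);
                Some(idx)
            } else {
                None
            }
        })
        .collect();
    assert_eq!(indexed, vec![None, Some(0), None, Some(1)]);
}
```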
🧹 Nitpick comments (3)
plugins/local-llm/src/events.rs (1)
34-38: Event shape looks good; consider documenting expected range.

If progress is normalized (0.0–1.0), add a short doc comment so frontend semantics stay consistent across providers.
```diff
 #[derive(serde::Serialize, Clone, specta::Type, tauri_specta::Event)]
 pub enum LLMEvent {
+    /// Normalized generation progress in [0.0, 1.0].
     #[serde(rename = "progress")]
     Progress(f64),
 }
```
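If the range does get documented as [0.0, 1.0], clamping at the emit site is one cheap way to keep the contract honest even when a backend reports a slightly out-of-range value (an illustrative sketch, not part of the PR; `emit_progress` is a hypothetical helper):

```rust
// Illustrative helper: guarantees the documented [0.0, 1.0] range at the boundary.
fn emit_progress(emitter: &dyn Fn(LLMEvent), raw: f64) {
    emitter(LLMEvent::Progress(raw.clamp(0.0, 1.0)));
}
```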
plugins/local-llm/src/server.rs (2)
236-244: Avoid unbounded progress buffering; coalesce with a `watch` channel.

Progress can be high-frequency; `mpsc::unbounded_channel` risks unbounded growth. `watch` keeps only the latest value.

```diff
-    let (progress_sender, mut progress_receiver) = mpsc::unbounded_channel::<f64>();
+    let (progress_sender, mut progress_receiver) = tokio::sync::watch::channel::<f64>(0.0);
     let (response_stream, cancellation_token) = model.generate_stream_with_callback(
         request,
         Box::new(move |v| {
-            let _ = progress_sender.send(v);
+            let _ = progress_sender.send(v);
         }),
     )?;

     let mixed_stream = async_stream::stream! {
         tokio::pin!(response_stream);

         loop {
             tokio::select! {
                 response_result = response_stream.next() => {
                     match response_result {
                         Some(response) => yield StreamEvent::Response(response),
                         None => break,
                     }
                 },
-                progress_result = progress_receiver.recv() => {
-                    match progress_result {
-                        Some(progress) => yield StreamEvent::Progress(progress),
-                        None => {}
-                    }
-                }
+                _ = progress_receiver.changed() => {
+                    yield StreamEvent::Progress(*progress_receiver.borrow());
+                }
             }
         }
     };
```

Also applies to: 245-264, 240-243, 256-261
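For context on why `watch` fits here: the channel keeps only the most recent value, so a burst of callbacks between two awaits collapses into a single event carrying the latest progress. A standalone sketch of that behavior (assumes the `tokio` crate with the `sync`, `macros`, and `rt` features):

```rust
#[tokio::main]
async fn main() {
    let (tx, mut rx) = tokio::sync::watch::channel(0.0_f64);

    // Several rapid updates; each send overwrites the stored value.
    for p in [0.1, 0.2, 0.3] {
        tx.send(p).unwrap();
    }

    // A single wake-up observes only the latest value.
    rx.changed().await.unwrap();
    assert_eq!(*rx.borrow(), 0.3);
}
```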
169-179: Progress events lack correlation; add a request id to avoid ambiguity under concurrency.

If multiple chats run concurrently, `LLMEvent::Progress(f64)` is ambiguous. Consider emitting `LLMEvent::Progress { id, value }` where `id` matches the response id.

Would you like a follow-up patch that threads a `Uuid` from `build_chat_completion_response` into `progress_fn` and extends `LLMEvent` accordingly?

Also applies to: 327-335
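If that follow-up lands, the variant could carry the correlation id directly. A sketch of what the suggestion describes (field names are illustrative; the derives are copied from the existing enum):

```rust
// Hypothetical correlated shape; `id` would match the chat completion response id.
#[derive(serde::Serialize, Clone, specta::Type, tauri_specta::Event)]
pub enum LLMEvent {
    #[serde(rename = "progress")]
    Progress { id: String, value: f64 },
}
```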
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- plugins/local-llm/src/events.rs (1 hunks)
- plugins/local-llm/src/ext.rs (2 hunks)
- plugins/local-llm/src/lib.rs (1 hunks)
- plugins/local-llm/src/server.rs (9 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{js,ts,tsx,rs}
⚙️ CodeRabbit configuration file
**/*.{js,ts,tsx,rs}:
1. Do not add any error handling. Keep the existing one.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".
Files:
plugins/local-llm/src/events.rs
plugins/local-llm/src/lib.rs
plugins/local-llm/src/ext.rs
plugins/local-llm/src/server.rs
🧬 Code graph analysis (2)
plugins/local-llm/src/lib.rs (2)
plugins/local-stt/src/lib.rs (1): tauri_specta (34-58)
plugins/template/src/lib.rs (1): tauri_specta (26-32)
plugins/local-llm/src/ext.rs (1)
plugins/local-llm/src/server.rs (2)
new (47-56), new (151-156)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: ci (windows, windows-latest)
- GitHub Check: ci (macos, macos-14)
🔇 Additional comments (5)
plugins/local-llm/src/lib.rs (1)
43-44: Specta events wiring LGTM.

Type-gen for LLMEvent will now flow to TS. No further changes needed here.
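For reference, the usual tauri-specta v2 pattern for getting an event type into the generated TypeScript bindings looks roughly like this (a general sketch of the pattern, not the exact lib.rs diff):

```rust
// Registering the event with the specta builder so its type flows into the
// generated TypeScript bindings.
let builder = tauri_specta::Builder::<tauri::Wry>::new()
    .events(tauri_specta::collect_events![crate::events::LLMEvent]);
```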
plugins/local-llm/src/server.rs (2)
41-56: Emitter plumbing is clean and decouples progress from the SSE stream.

Passing an Arc<Fn(LLMEvent)> through ServerState and LocalProvider is a solid separation for side-channel progress.
Also applies to: 150-156, 166-178
218-229: Filtering out `progress_update` tool calls at source is the right move.

Prevents polluting the streamed delta with internal control signals.
plugins/local-llm/src/ext.rs (2)
8-8: Import for trait method resolution looks correct.

Bringing `tauri_specta::Event` into scope is required so `event.emit(&handle)` resolves; no unused import here.
198-204: Emitter closure already satisfies Send + Sync + 'static.

AppHandle in this Tauri version implements Send and Sync (as asserted in the crate's tests and docs), so the closure meets `Fn(LLMEvent) + Send + Sync + 'static`. No changes needed.
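Concretely, the bound in question is satisfied by a closure of roughly this shape (an assumed sketch of the `start_server` wiring; `make_emitter` and the exact emitter type alias are illustrative, and `LLMEvent` is the PR's event type):

```rust
use std::sync::Arc;
use tauri_specta::Event;

// AppHandle: Send + Sync, so the capturing closure meets
// Fn(LLMEvent) + Send + Sync + 'static.
fn make_emitter(handle: tauri::AppHandle) -> Arc<dyn Fn(LLMEvent) + Send + Sync + 'static> {
    Arc::new(move |event: LLMEvent| {
        // Forward the typed event to the frontend; errors are intentionally ignored here.
        let _ = event.emit(&handle);
    })
}
```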
No description provided.