Skip to content

feat: add renderer diagnostics observability#392

Open
Astro-Han wants to merge 11 commits intodevfrom
codex/i389-renderer-diagnostics
Open

feat: add renderer diagnostics observability#392
Astro-Han wants to merge 11 commits intodevfrom
codex/i389-renderer-diagnostics

Conversation

@Astro-Han
Copy link
Copy Markdown
Owner

@Astro-Han Astro-Han commented May 2, 2026

Summary

Adds a default-on renderer diagnostics pipeline for session rendering issues:

  • Records sanitized renderer diagnostics from the session UI into a bounded Electron main-process JSONL log.
  • Adds diagnostics export through Help -> Export Diagnostics Log..., session export, and problem reports.
  • Emits session submit, view-state, timeline, scroll, composer dock, visibility, perf, and derived incident events.
  • Adds truncation/failure handling so diagnostics never block app behavior, session export, or problem report creation.
  • Adds unit and E2E coverage for the diagnostics flow and send/scroll stability.

Why

Issue #389 needs better observability for recurring session jump-to-top and render flicker bugs. The goal is to stop relying only on Computer Use and visual inspection by giving manually shared problem reports/session exports enough structured renderer evidence to diagnose these failures.

Related Issue

Closes #389

Human Review Status

Pending. A human should make the final merge decision after reviewing the final diff and verification evidence.

Review Focus

Please focus on:

  • Privacy/sanitization boundaries for renderer diagnostics fields.
  • Whether diagnostics export and problem report attachment fail open correctly.
  • Whether the renderer instrumentation is scoped enough and does not perturb session scrolling/rendering.
  • Desktop menu copy/i18n and platform impact on macOS/Windows.

Risk Notes

  • New local diagnostic JSONL is written under Electron user data and retained with size/time caps.
  • The Help menu has one visible native-menu copy change: Export Diagnostics Log... / 导出诊断日志....
  • The saved diagnostics file is user-chosen via native save dialog; problem reports/session exports can include a capped diagnostics slice.
  • No dependencies, migrations, credentials, deletion behavior, or generated source files were introduced.
  • Labels: maintainer labeling requested.

How To Verify

Desktop focused tests: 74 passed, 0 failed
bun --cwd packages/desktop-electron test src/main/renderer-diagnostics.test.ts src/main/server-client.test.ts src/main/problem-report.test.ts src/main/feedback.test.ts src/main/ipc-window-config.test.ts src/main/menu-template.test.ts src/main/menu-labels.test.ts

App focused tests: 676 passed, 0 failed
bun --cwd packages/app test src/context/renderer-diagnostics.test.ts src/pages/session/use-session-scroll-dock.test.ts src/pages/session/session-view-controller.test.ts

E2E focused tests: 2 passed, 0 failed
bun --cwd packages/app playwright test e2e/session/session-renderer-diagnostics.spec.ts e2e/session/session-send-stability.spec.ts --reporter=line

Desktop typecheck: passed
bun --cwd packages/desktop-electron typecheck

App typecheck: passed
bun --cwd packages/app typecheck

Screenshots or Recordings

Not captured. The visible UI change is a native menu item and native save dialog title; coverage is via menu/i18n tests. Manual visual follow-up is still useful on macOS/Windows native menus.

Checklist

  • Human review status is stated above as pending, approved, or not required
  • I linked the related issue, or stated why there is no issue
  • This PR has type, scope, and priority labels, or I requested maintainer labeling
  • I described the review focus and any meaningful risks
  • I listed the relevant verification steps and the key result for each
  • I did not introduce unrelated refactors, dependencies, generated files, or file changes beyond the stated scope
  • I manually checked visible UI or copy changes when needed, with screenshots or recordings
  • I considered macOS and Windows impact for desktop, packaging, updater, signing, paths, shell, or permissions changes
  • I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, generated content, or local file changes when relevant
  • I reviewed the final diff for unrelated changes and suspicious dependency changes
  • I am targeting dev, and my PR title and commit messages use Conventional Commits in English

Summary by CodeRabbit

  • New Features

    • "Export Diagnostics Log" menu command and renderer API to record/export renderer diagnostics.
  • Enhancements

    • Diagnostics included in problem reports and session export payloads.
    • Session view now emits scroll, timeline, visibility and performance samples plus derived incident signals.
    • Composer/dock resize and scroll metrics are reported for improved session telemetry.
  • Tests

    • New end-to-end and unit tests covering capture, sanitization, retention, slicing, export, incident detection, and related UI behaviors.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 2, 2026

📝 Walkthrough

Walkthrough

Adds end-to-end renderer diagnostics: browser emitter, incident detection and perf sampling, component instrumentation (session/timeline/scroll/dock/submit), preload + IPC wiring, a main-process sanitizer/recorder with retention/slicing/export, session-export/problem-report integration, menu/export UI, and E2E + unit tests.

Changes

Renderer diagnostics end-to-end

Layer / File(s) Summary
Types & API contracts
packages/app/src/context/platform.tsx, packages/app/src/app.tsx, packages/app/src/desktop-api.ts, packages/desktop-electron/src/preload/types.ts
Introduce RendererDiagnosticInput / RendererDiagnosticsExportResult; extend Platform / Window.api / ElectronAPI with emitRendererDiagnostic and exportDiagnosticsLog.
Browser diagnostics emitter & incident detection
packages/app/src/context/renderer-diagnostics.ts, packages/app/src/context/renderer-diagnostics.test.ts
Add emitter that injects monotonic_ms, forwards to window.api.emitRendererDiagnostic if available, detects incidents (scroll-jump, timeline remount/visible-clear), and provides session perf sampling (FPS/jank/longtasks/layout-shift/heap) with cleanup.
App instrumentation
packages/app/src/components/prompt-input/submit.ts, packages/app/src/pages/session.tsx, packages/app/src/pages/session/message-timeline.tsx
Emit diagnostics: session.action.submit, session.view.state, session.timeline.mount/unmount/visible, and session.scroll.sample with message/part/viewport/scroll metrics.
Scroll-dock wiring
packages/app/src/pages/session/use-session-scroll-dock.ts, packages/app/src/pages/session/use-session-timeline-interaction.ts, packages/app/src/pages/session/use-session-scroll-dock.test.ts
Add optional onDockHeightChange callback; compute previous/current dock height plus scrollTop/distanceFromBottom and forward via timeline interaction to emit diagnostics; tests capture and assert events.
E2E capture probe
packages/app/e2e/session/session-renderer-diagnostics.spec.ts
Playwright E2E test injects a page-init shim to capture renderer emits, seeds session turns, scrolls timeline, sends a visible prompt, asserts scroll-anchor stability, viewport metric bounds, and diagnostic event presence/counts.
Preload & IPC wiring
packages/desktop-electron/src/preload/index.ts, packages/desktop-electron/src/main/ipc.ts, packages/desktop-electron/src/main/ipc-window-config.test.ts
Expose emitRendererDiagnostic and exportDiagnosticsLog on preload API; add IPC handlers (renderer-diagnostics:record, renderer-diagnostics:export, rendererDiagnosticsSlice) and forward windowID context.
Main-process recorder & utilities
packages/desktop-electron/src/main/renderer-diagnostics.ts, packages/desktop-electron/src/main/renderer-diagnostics.test.ts
New recorder: allowlist sanitizer, per-event byte limits, URL/token scrubbing, protected-event preservation, high-frequency rate-limiting, JSONL persistence, retention trimming (time+bytes), serialized writes, slicing/selectors and export generation.
Export & session integration
packages/desktop-electron/src/main/server-client.ts, packages/desktop-electron/src/main/server-client.test.ts, packages/desktop-electron/src/main/index.ts
Attach diagnostics slice into session export JSON when applicable; initialize recorder at app launch; add menu-driven diagnostics export flow and IPC/export handlers.
Problem report & feedback integration
packages/desktop-electron/src/main/problem-report.ts, packages/desktop-electron/src/main/problem-report.test.ts, packages/desktop-electron/src/main/feedback.ts, packages/desktop-electron/src/main/feedback.test.ts
Include renderer diagnostics slice in problem report payload/summary; implement truncation of non-protected diagnostic events to meet max-bytes and handle fetch/write failures with fallbacks.
Menu labels & templates
packages/desktop-electron/src/main/menu-labels.ts, packages/desktop-electron/src/main/menu-template.ts, packages/desktop-electron/src/main/menu-template.test.ts, packages/desktop-electron/src/main/menu-labels.test.ts
Add localized Export Diagnostics Log... menu entries and wire menu template to call exportDiagnosticsLog.

Sequence Diagram

sequenceDiagram
    participant Renderer as Browser Renderer
    participant Emitter as Renderer Emitter
    participant Preload as Preload API (window.api)
    participant IPC as IPC Main
    participant Recorder as Diagnostics Recorder
    participant File as Local JSONL Storage
    participant Report as Problem Report Builder

    Renderer->>Emitter: emit(session.timeline.mount / scroll.sample / perf.sample / action.submit)
    Emitter->>Emitter: normalize(monotonic_ms), detect incidents
    Emitter->>Preload: window.api.emitRendererDiagnostic(event)
    Preload->>IPC: invoke "renderer-diagnostics:record" (includes windowID)
    IPC->>Recorder: record(sanitized event)
    Recorder->>File: append JSONL line (serialize + retention flush)
    Recorder-->>Recorder: rate-limit / cap by bytes / retention

    Note over Report,IPC: During session export or problem report
    Report->>IPC: request renderer diagnostics slice (sessionID, windowID, maxBytes)
    IPC->>Recorder: slice(...)
    Recorder->>Report: RendererDiagnosticsSlice (ok/truncated/missing/expired)
    Report->>File: include slice in payload and save report
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested Labels

enhancement, P2, app, desktop

Poem

🐇 I hopped and watched the timeline scroll,

I counted mounts and frames that toll,
Tiny signals, stamped in time,
Saved for when a bug must climb,
Logs snugly tucked — a rabbit's scroll.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add renderer diagnostics observability' clearly and concisely describes the main change—adding a diagnostics observability pipeline for renderer issues, following Conventional Commits format.
Description check ✅ Passed The PR description covers all required template sections: clear summary, motivation (issue #389), related issue link, human review status, review focus, risk notes, verification steps with test results, and completed checklist.
Linked Issues check ✅ Passed The PR fulfills all in-scope coding objectives from issue #389: implements JSONL logging, captures session UI diagnostics, integrates into session export and problem reports, provides export UI, enforces privacy via sanitization, and includes comprehensive unit and E2E test coverage.
Out of Scope Changes check ✅ Passed All changes are scoped to renderer diagnostics implementation. The PR respects out-of-scope constraints: no screenshots/DOM snapshots, no text/credentials, no cloud telemetry, and maintains backward-compatible single-file session export.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/i389-renderer-diagnostics

Review rate limit: 8/10 reviews remaining, refill in 7 minutes and 56 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a comprehensive renderer diagnostics system for the desktop application, enabling the capture and recording of performance metrics, scroll behavior, and session state transitions. The implementation includes a main-process recorder that persists events to a JSONL log with retention and byte-capping logic, alongside a renderer-side emitter that integrates with SolidJS components. Feedback focuses on critical performance issues in the log pruning logic, specifically the $O(N^2)$ complexity of the retention flush and the excessive disk I/O caused by pruning on every event. Additionally, the data sanitizer regex is noted to be overly aggressive, potentially stripping valid technical identifiers like domain-based provider names.

Comment thread packages/desktop-electron/src/main/renderer-diagnostics.ts Outdated
Comment thread packages/desktop-electron/src/main/renderer-diagnostics.ts Outdated
Comment thread packages/desktop-electron/src/main/renderer-diagnostics.ts Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/desktop-electron/src/main/problem-report.ts (1)

491-499: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate the new truncation field in isTruncation().

parseProblemReportPayload() can now return a Payload whose truncation.omittedRendererDiagnosticsBytes is missing or non-numeric, because the guard never checks it.

Suggested fix
 function isTruncation(value: unknown): value is Payload["truncation"] {
   if (!isRecord(value)) return false
   return (
     isFiniteNumber(value.omittedMessages) &&
     isFiniteNumber(value.omittedLogBytes) &&
     isFiniteNumber(value.omittedSessionInfoBytes) &&
     isFiniteNumber(value.omittedFailedExportErrorBytes) &&
+    isFiniteNumber(value.omittedRendererDiagnosticsBytes) &&
     isFiniteNumber(value.omittedDiagnosticsBytes)
   )
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/desktop-electron/src/main/problem-report.ts` around lines 491 - 499,
The type guard isTruncation currently omits checking
omittedRendererDiagnosticsBytes, so update isTruncation (the function checking
Payload["truncation"]) to also require
isFiniteNumber(value.omittedRendererDiagnosticsBytes); ensure the guard returns
false if that property is missing or non-numeric by adding it alongside the
other isFiniteNumber checks so parseProblemReportPayload can't produce a Payload
with an invalid truncation field.
🧹 Nitpick comments (2)
packages/desktop-electron/src/main/ipc-window-config.test.ts (1)

20-26: ⚡ Quick win

Avoid locking this test to src/main/ipc.ts as the wiring anchor.

This test currently enshrines string-based channel registration inside ipc.ts, which makes future migration to module-level register*Ipc() bootstrap wiring harder. Prefer asserting the bootstrap integration point instead of file-content substrings.

Based on learnings: In Astro-Han/pawwork (packages/desktop-electron/src/main/), the bootstrap entry (packages/desktop-electron/src/main/index.ts) should directly call each module’s exported register*Ipc() function and should not centralize sub-module IPC registrations through src/main/ipc.ts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/desktop-electron/src/main/ipc-window-config.test.ts` around lines 20
- 26, The test is brittle because it asserts string substrings from
src/main/ipc.ts instead of verifying the bootstrap wiring; update
packages/desktop-electron/src/main/ipc-window-config.test.ts to assert that the
bootstrap entry (src/main/index.ts) invokes the module-level registration
function(s) (e.g., registerRendererDiagnosticsIpc or the generic register*Ipc
exports) rather than checking for specific channel strings; modify the test to
import the bootstrap (index.ts) and spy/mock the exported register*Ipc
function(s) from the diagnostics module (or assert that index.ts imports and
calls registerRendererDiagnosticsIpc) so the test validates integration at the
bootstrap level and not file content.
packages/app/src/app.tsx (1)

50-51: ⚡ Quick win

Align Window.api typing with diagnostics export surface.

Line 91 adds emitRendererDiagnostic, but the paired exportDiagnosticsLog method (already exposed on preload API) is missing from this interface. Adding it here keeps app-side typing consistent and avoids drift.

♻️ Suggested typing alignment
-import type { RendererDiagnosticInput } from "@/context/platform"
+import type { RendererDiagnosticInput, RendererDiagnosticsExportResult } from "@/context/platform"
...
     api?: {
       setDesktopContext?: (context: DesktopContext) => Promise<void>
       emitRendererDiagnostic?: (event: RendererDiagnosticInput) => Promise<void>
+      exportDiagnosticsLog?: () => Promise<RendererDiagnosticsExportResult>
       getAboutInfo?: () => Promise<AboutInfo>

Also applies to: 89-99

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/app/src/app.tsx` around lines 50 - 51, The Window.api type is
missing the exportDiagnosticsLog method causing app-side typing drift; update
the Window.api interface (the same area that currently declares
emitRendererDiagnostic) to also include exportDiagnosticsLog with the same
signature as the preload API—use the existing RendererDiagnosticInput type if
appropriate and ensure the method name and parameter/return types match the
exported preload surface so both emitRendererDiagnostic and exportDiagnosticsLog
are declared consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/app/e2e/session/session-renderer-diagnostics.spec.ts`:
- Around line 39-44: Replace the outright replacement of
win.api.emitRendererDiagnostic with a wrapper that delegates to the existing
implementation: capture the original function (e.g., const originalEmit =
win.api?.emitRendererDiagnostic), then assign a new async emitRendererDiagnostic
that calls await originalEmit?.call(win.api, event) (if present) and still
pushes a deep-cloned event into win.__pawwork_renderer_diagnostics; keep other
win.api properties untouched and preserve async/return behavior so a broken
preload/IPC handler is still exercised by the test while recording the event.

In `@packages/app/src/components/prompt-input/submit.ts`:
- Around line 523-540: The diagnostics are reading mutable state after calling
removeCommentItems() and clearInput(), and the fire-and-forget
emitRendererDiagnostic(...) silently drops rejections; capture metrics into
local variables (e.g., const commentCount = input.commentCount(), promptLen =
input.promptLength(currentPrompt), imageCount = images.length) before calling
removeCommentItems() / clearInput(), then call emitRendererDiagnostic with those
captured values and handle async failures (await the promise and wrap in
try/catch or attach a .catch to log/report errors) so failures aren’t ignored;
update references to messageID, session.id, model.providerID/model.modelID as
before when building the payload.

In `@packages/app/src/context/renderer-diagnostics.ts`:
- Around line 169-185: The emitted fps should be the real frames-per-second over
the sample period, not raw frameCount; in flush() replace fps: frameCount with a
computed value framesPerSecond = Math.round(frameCount / (sampleDurationMs /
1000)), where sampleDurationMs is the elapsed milliseconds since the previous
flush (track a lastFlushTimestamp set when diagnostics start and update it at
the end of flush(), initialize it on start); use framesPerSecond in the event
payload (and keep rounding/undefined handling consistent with other metrics).

In `@packages/app/src/pages/session/message-timeline.tsx`:
- Around line 739-755: The scroll diagnostic currently calls
emitRendererDiagnostic on every scroll event (hot path), causing excessive IPC
and jank; change it to throttle via requestAnimationFrame (or an RAF-backed
boolean guard) so emitRendererDiagnostic(...) runs at most once per animation
frame instead of per event, and clamp the computed distance_from_bottom (max -
el.scrollTop) to a non-negative value before including it in data; update the
same call site that uses emitRendererDiagnostic, visibleRangeData(),
props.hasScrollGesture(), props.scroll, and staging.isStaging() to use the
RAF-throttled emitter and the clamped distance metric.

In `@packages/app/src/pages/session/use-session-scroll-dock.ts`:
- Around line 172-179: The diagnostics hook input.onDockHeightChange is
currently invoked directly and can throw, breaking resize/scroll recovery; wrap
the invocation of input.onDockHeightChange (inside the dockHeight !==
previousDockHeight branch) in a try/catch so any exception is caught and does
not propagate, and log the error (e.g., via console.error or existing logger)
while keeping the conditional call (input.onDockHeightChange?.(...)) behavior
intact so diagnostics failures are fail-open.

In `@packages/desktop-electron/src/main/feedback.ts`:
- Around line 221-226: Wrap the await deps.rendererDiagnostics(context) call in
a timeout guard (e.g., Promise.race) so it cannot hang indefinitely; if the
timeout elapses, treat it as a handled error by calling
deps.onHandledError?.("renderer diagnostics slice timed out", timeoutError) and
set rendererDiagnostics = emptyRendererDiagnosticsSlice("write_failed", new
Date(generatedAt)) — mirror the fail-open behavior used for session export; use
a short configurable timeout constant (e.g., TIMEOUT_MS) and reference
deps.rendererDiagnostics, deps.onHandledError, emptyRendererDiagnosticsSlice,
and generatedAt when implementing.

In `@packages/desktop-electron/src/main/renderer-diagnostics.ts`:
- Around line 446-468: The record() function still writes events to disk even
when options.disabled is true; add an early check at the top of record (before
sanitizing/enqueueing) to honor options.disabled and immediately return { ok:
false, reason: "disabled" } (matching slice() behavior) so no further processing
(sanitizeRendererDiagnosticEvent, enqueueWrite, appendFile, flushRetention)
occurs when disabled.

---

Outside diff comments:
In `@packages/desktop-electron/src/main/problem-report.ts`:
- Around line 491-499: The type guard isTruncation currently omits checking
omittedRendererDiagnosticsBytes, so update isTruncation (the function checking
Payload["truncation"]) to also require
isFiniteNumber(value.omittedRendererDiagnosticsBytes); ensure the guard returns
false if that property is missing or non-numeric by adding it alongside the
other isFiniteNumber checks so parseProblemReportPayload can't produce a Payload
with an invalid truncation field.

---

Nitpick comments:
In `@packages/app/src/app.tsx`:
- Around line 50-51: The Window.api type is missing the exportDiagnosticsLog
method causing app-side typing drift; update the Window.api interface (the same
area that currently declares emitRendererDiagnostic) to also include
exportDiagnosticsLog with the same signature as the preload API—use the existing
RendererDiagnosticInput type if appropriate and ensure the method name and
parameter/return types match the exported preload surface so both
emitRendererDiagnostic and exportDiagnosticsLog are declared consistently.

In `@packages/desktop-electron/src/main/ipc-window-config.test.ts`:
- Around line 20-26: The test is brittle because it asserts string substrings
from src/main/ipc.ts instead of verifying the bootstrap wiring; update
packages/desktop-electron/src/main/ipc-window-config.test.ts to assert that the
bootstrap entry (src/main/index.ts) invokes the module-level registration
function(s) (e.g., registerRendererDiagnosticsIpc or the generic register*Ipc
exports) rather than checking for specific channel strings; modify the test to
import the bootstrap (index.ts) and spy/mock the exported register*Ipc
function(s) from the diagnostics module (or assert that index.ts imports and
calls registerRendererDiagnosticsIpc) so the test validates integration at the
bootstrap level and not file content.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: fc7e3280-b9b7-4318-ad00-ed3ff5d3994d

📥 Commits

Reviewing files that changed from the base of the PR and between a689708 and dc1a25b.

📒 Files selected for processing (29)
  • packages/app/e2e/session/session-renderer-diagnostics.spec.ts
  • packages/app/src/app.tsx
  • packages/app/src/components/prompt-input/submit.ts
  • packages/app/src/context/platform.tsx
  • packages/app/src/context/renderer-diagnostics.test.ts
  • packages/app/src/context/renderer-diagnostics.ts
  • packages/app/src/desktop-api.ts
  • packages/app/src/pages/session.tsx
  • packages/app/src/pages/session/message-timeline.tsx
  • packages/app/src/pages/session/use-session-scroll-dock.test.ts
  • packages/app/src/pages/session/use-session-scroll-dock.ts
  • packages/app/src/pages/session/use-session-timeline-interaction.ts
  • packages/desktop-electron/src/main/feedback.test.ts
  • packages/desktop-electron/src/main/feedback.ts
  • packages/desktop-electron/src/main/index.ts
  • packages/desktop-electron/src/main/ipc-window-config.test.ts
  • packages/desktop-electron/src/main/ipc.ts
  • packages/desktop-electron/src/main/menu-labels.test.ts
  • packages/desktop-electron/src/main/menu-labels.ts
  • packages/desktop-electron/src/main/menu-template.test.ts
  • packages/desktop-electron/src/main/menu-template.ts
  • packages/desktop-electron/src/main/problem-report.test.ts
  • packages/desktop-electron/src/main/problem-report.ts
  • packages/desktop-electron/src/main/renderer-diagnostics.test.ts
  • packages/desktop-electron/src/main/renderer-diagnostics.ts
  • packages/desktop-electron/src/main/server-client.test.ts
  • packages/desktop-electron/src/main/server-client.ts
  • packages/desktop-electron/src/preload/index.ts
  • packages/desktop-electron/src/preload/types.ts

Comment thread packages/app/e2e/session/session-renderer-diagnostics.spec.ts
Comment thread packages/app/src/components/prompt-input/submit.ts Outdated
Comment thread packages/app/src/context/renderer-diagnostics.ts
Comment thread packages/app/src/pages/session/message-timeline.tsx Outdated
Comment thread packages/app/src/pages/session/use-session-scroll-dock.ts
Comment thread packages/desktop-electron/src/main/feedback.ts
Comment thread packages/desktop-electron/src/main/renderer-diagnostics.ts
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/desktop-electron/src/main/renderer-diagnostics.ts`:
- Around line 266-279: parseEventLine currently returns any object-shaped data
from disk; update parseEventLine (the function returning
RendererDiagnosticEvent) to re-sanitize fields before returning: validate
event.name and level against the known allowlists (e.g., permitted event name
strings and allowed level values) and reject if not in those sets, and rebuild
the returned data object by copying only the allowed keys from the original
value.data (using your established allowlist for data fields) so no unexpected
properties are passed through to exportRendererDiagnosticsLog() or slice(); keep
all other existing shape checks and return undefined on any mismatch.
- Around line 510-517: The high-frequency throttling uses Date.now(), causing
divergence from the recorder's injected clock; change the timestamp source in
the block that computes `current` (where `key`, `current`, `previous`,
`lastHighFrequency`, and `highFrequencyIntervalMs` are used) to use the
recorder's injected time function (e.g., `options.now()` or the same `now` used
elsewhere) instead of `Date.now()` so rate limiting and retention follow the
same clock as the rest of the recorder.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a37a79ff-b098-402b-92c8-1d8837496a2b

📥 Commits

Reviewing files that changed from the base of the PR and between a7f1aec and 1373c4b.

📒 Files selected for processing (11)
  • packages/app/e2e/session/session-renderer-diagnostics.spec.ts
  • packages/app/src/app.tsx
  • packages/app/src/components/prompt-input/submit.ts
  • packages/app/src/context/renderer-diagnostics.ts
  • packages/app/src/pages/session/message-timeline.tsx
  • packages/app/src/pages/session/use-session-scroll-dock.ts
  • packages/desktop-electron/src/main/feedback.test.ts
  • packages/desktop-electron/src/main/feedback.ts
  • packages/desktop-electron/src/main/problem-report.ts
  • packages/desktop-electron/src/main/renderer-diagnostics.test.ts
  • packages/desktop-electron/src/main/renderer-diagnostics.ts
✅ Files skipped from review due to trivial changes (3)
  • packages/app/src/app.tsx
  • packages/desktop-electron/src/main/renderer-diagnostics.test.ts
  • packages/app/src/components/prompt-input/submit.ts
🚧 Files skipped from review as they are similar to previous changes (6)
  • packages/app/src/pages/session/use-session-scroll-dock.ts
  • packages/desktop-electron/src/main/feedback.ts
  • packages/app/e2e/session/session-renderer-diagnostics.spec.ts
  • packages/desktop-electron/src/main/feedback.test.ts
  • packages/desktop-electron/src/main/problem-report.ts
  • packages/app/src/context/renderer-diagnostics.ts

Comment on lines +266 to +279
function parseEventLine(line: string): RendererDiagnosticEvent | undefined {
try {
const value = JSON.parse(line) as RendererDiagnosticEvent
if (!value || typeof value !== "object") return undefined
if (typeof value.time !== "string") return undefined
if (typeof value["event.name"] !== "string") return undefined
if (typeof value.app_launch_id !== "string") return undefined
if (typeof value.window_id !== "string") return undefined
if (!value.data || typeof value.data !== "object" || Array.isArray(value.data)) return undefined
return value
} catch {
return undefined
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Re-sanitize JSONL lines before re-exporting them.

parseEventLine() currently trusts any parsed object with object-shaped data, so a tampered or stale log line can bypass the ingest allowlist here and get copied into exportRendererDiagnosticsLog() / slice() verbatim. Revalidate event.name and level, and rebuild data from the allowlist before returning an event read from disk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/desktop-electron/src/main/renderer-diagnostics.ts` around lines 266
- 279, parseEventLine currently returns any object-shaped data from disk; update
parseEventLine (the function returning RendererDiagnosticEvent) to re-sanitize
fields before returning: validate event.name and level against the known
allowlists (e.g., permitted event name strings and allowed level values) and
reject if not in those sets, and rebuild the returned data object by copying
only the allowed keys from the original value.data (using your established
allowlist for data fields) so no unexpected properties are passed through to
exportRendererDiagnosticsLog() or slice(); keep all other existing shape checks
and return undefined on any mismatch.

Comment on lines +510 to +517
if (highFrequencyEvents.has(sanitized["event.name"])) {
const key = `${sanitized.window_id}:${sanitized["event.name"]}`
const current = Date.now()
const previous = lastHighFrequency.get(key)
if (previous !== undefined && current - previous < highFrequencyIntervalMs) {
return { ok: false as const, reason: "rate_limited" as const }
}
lastHighFrequency.set(key, current)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the injected clock for high-frequency throttling too.

Line 512 uses Date.now() while the rest of the recorder uses options.now. That makes timestamps/retention and rate limiting diverge under fake clocks or any non-default time source, so high-frequency events can be dropped or accepted at the wrong times.

Suggested fix
-        const current = Date.now()
+        const current = now().getTime()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (highFrequencyEvents.has(sanitized["event.name"])) {
const key = `${sanitized.window_id}:${sanitized["event.name"]}`
const current = Date.now()
const previous = lastHighFrequency.get(key)
if (previous !== undefined && current - previous < highFrequencyIntervalMs) {
return { ok: false as const, reason: "rate_limited" as const }
}
lastHighFrequency.set(key, current)
if (highFrequencyEvents.has(sanitized["event.name"])) {
const key = `${sanitized.window_id}:${sanitized["event.name"]}`
const current = now().getTime()
const previous = lastHighFrequency.get(key)
if (previous !== undefined && current - previous < highFrequencyIntervalMs) {
return { ok: false as const, reason: "rate_limited" as const }
}
lastHighFrequency.set(key, current)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/desktop-electron/src/main/renderer-diagnostics.ts` around lines 510
- 517, The high-frequency throttling uses Date.now(), causing divergence from
the recorder's injected clock; change the timestamp source in the block that
computes `current` (where `key`, `current`, `previous`, `lastHighFrequency`, and
`highFrequencyIntervalMs` are used) to use the recorder's injected time function
(e.g., `options.now()` or the same `now` used elsewhere) instead of `Date.now()`
so rate limiting and retention follow the same clock as the rest of the
recorder.

Copy link
Copy Markdown
Owner Author

@Astro-Han Astro-Han left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR #392 Code Review: Renderer Diagnostics Observability

Overall this is a high-quality PR with clear architecture, comprehensive tests, and solid fail-open design. The main concerns are around type safety boundaries, performance details, and code organization.

P0 — Blocking (must fix)

  1. eventMatchesSession accesses data.*_session_id fields not in the allowlist for non-transition events — silent logic failure.
  2. PerformanceObserver uses deprecated entryTypes parameter — will trigger warnings in newer Chromium.
  3. flushRetention has a race condition when called concurrently (not serialized through enqueueWrite).

P1 — Serious (strongly recommended)

  1. stringField URL regex is overly broad and drops legitimate endpoint_kind values.
  2. capEvents has O(n²) JSON stringify overhead during truncation loops.
  3. session.timeline.visible effect dependency tracking creates large strings on every render.
  4. requestAnimationFrame in onScroll may fire after unmount.
  5. rendererDiagnosticsSlice IPC handler declares unused directory parameter.
  6. Truncation loop can delete protected events if all remaining are protected.

P2 — Moderate (recommended)

  1. highFrequencyIntervalMs default 250ms may miss fast scroll jumps.
  2. FPS calculation is inaccurate when tab is backgrounded.
  3. emitRendererDiagnostic silently swallows all errors even in dev.
  4. Menu placement of Export Diagnostics is inconsistent with conditional Report Problem.
  5. SliceInput.now is required in type but omitted in public API.
  6. route_ready and visible_ready always have identical values.

P3 — Minor / Style

16–24. Code organization (566-line file), test splitting, indentation consistency, threshold tuning, unit assumptions, and type safety nits.

See inline comments for specific locations and suggestions.

function parseEventLine(line: string): RendererDiagnosticEvent | undefined {
try {
const value = JSON.parse(line) as RendererDiagnosticEvent
if (!value || typeof value !== "object") return undefined
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0: The allowlist for data fields only includes from/to_*_session_id on session.identity.transition events. For other event types, accessing data.from_route_session_id etc. returns undefined silently. This means eventMatchesSession silently fails to match for non-transition events that happen to carry these fields in their raw input (which get stripped by sanitizeData).

Fix: Either add these fields to all event allowlists that need session matching, or guard with Object.hasOwn(data, key) before comparing.

const before = visibleCounts.get(sessionKey) ?? 0
const during = renderedCount(event)
visibleCounts.set(sessionKey, during)
if (before > 0 && during === 0) {
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0: PerformanceObserver.observe({ entryTypes: ["longtask"] }) uses the deprecated entryTypes parameter. Modern browsers recommend type: "longtask" with buffered: true. This will trigger console warnings in newer Chromium and may be removed in the future.

Fix: Change to observe({ type: "longtask", buffered: true }) and same for layout-shift. Keep the try/catch for fallback.

diagnostics: {
status,
event_count: capped.events.length,
incident_count: capped.events.filter(isIncident).length,
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0: flushRetention is not serialized through enqueueWrite. If called concurrently (e.g. external direct call + internal record trigger), two reads can race against each other and the later write can overwrite the earlier one's retained events.

Fix: Route flushRetention through enqueueWrite or add an explicit mutex so only one flush runs at a time.

"jank_count",
"long_task_max_ms",
"long_task_block_ms",
"cls",
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: The URL regex /\b[a-z0-9.-]+\.[a-z]{2,}(?:\/|\?|:)/i is overly broad. It drops legitimate endpoint_kind values like api.example.com/v1 even though they are in the allowlist. Users with custom endpoints will silently lose this field.

Fix: Either restrict endpoint_kind to a known enum ("prompt", "continue", "edit") at the call site, or make the URL regex more specific (require protocol or www. prefix).

export function emptyRendererDiagnosticsSlice(status: RendererDiagnosticsStatus, now: Date): RendererDiagnosticsSlice {
return {
status,
source: "renderer-diagnostics",
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: capEvents calls jsonBytes(selected) on every loop iteration, which internally does JSON.stringify. This is O(n²) stringification overhead when truncating large event lists.

Fix: Pre-compute per-event JSON byte counts and maintain a running total. Only recalculate when an event is removed.

})
})

onCleanup(() => {
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: rendered().join("\u0000") creates a large string on every render to track dependency changes. With 1000+ messages this string concatenation is non-trivial overhead and runs frequently.

Fix: Use on(rendered, ...) to track the array reference directly, or if content-level comparison is truly needed, use a lightweight hash (e.g. rendered().length) instead of full stringification.

name: "session.scroll.sample",
route_session_id: params.id,
visible_session_id: props.sessionID,
timeline_session_id: props.sessionID,
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: The requestAnimationFrame callback can fire after component unmount. onCleanup at line 261 cancels the frame, but if unmount happens between onScroll setting scrollSampleFrame and the RAF firing, the callback still executes and accesses potentially stale props/params.

Fix: Add a mounted ref checked inside the RAF callback, or use an AbortController tied to component lifecycle.

maxBytes: number
}) => Promise<RendererDiagnosticsSlice>
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: rendererDiagnosticsSlice declares a directory: string parameter but it is never used in the implementation (see index.ts:1269-1274). This creates a misleading contract for callers.

Fix: Remove the unused directory parameter from the type, or if it is meant for future use, add a TODO comment explaining its intended purpose.

while (bytes(output) > maxBytes && events.length > 0) {
const removeIndex = events.findIndex((event) => !isProtectedRendererDiagnosticEvent(event))
events.splice(removeIndex >= 0 ? removeIndex : 0, 1)
omittedRendererDiagnosticsBytes = Math.max(0, jsonBytes(original.events) - jsonBytes(events))
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: If all remaining events are protected (incidents + transitions), removeIndex becomes -1 and events.splice(0, 1) deletes a protected event. The loop then continues until events.length === 0, potentially discarding all incidents.

Fix: When no non-protected events remain, break the loop and mark status as truncated with a note that protected events were also removed.

path: string
destination: string
maxBytes?: number
now?: Date
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Default highFrequencyIntervalMs of 250ms may miss fast scroll jumps that resolve within a single animation frame (16ms). A jump-to-top followed by immediate auto-scroll correction could complete before the next sample window.

Fix: Consider lowering to 100ms for scroll events, or switch to event-driven sampling (record on significant scroll delta, not just time).

}

export function createSessionPerformanceDiagnostics(input: {
routeSessionID: Accessor<string | undefined>
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: When the tab is backgrounded, requestAnimationFrame is throttled to ~1fps. The FPS calculation will report artificially low values, potentially triggering false jank/incident alerts.

Fix: Skip the flush sample when document.visibilityState === 'hidden', or tag the sample with visibility: hidden so downstream analysis can filter it out.

image_count: submittedImageCount,
comment_count: submittedCommentCount,
},
}).catch(() => {})
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: void emitRendererDiagnostic(...).catch(() => {}) silently swallows all errors, including misconfigured IPC channels or missing APIs. In dev mode this could hide integration issues for a long time.

Fix: Add a one-time console.warn in dev mode when the API is missing or the call throws, so developers notice misconfiguration early.

@@ -157,6 +159,7 @@ export function buildWindowsMenuTemplate(options: BuildMenuOptions): MenuItemTem
if (feedbackEnabled) {
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: exportDiagnosticsLog is unconditionally added to the Help menu, while reportProblem is conditional on feedbackEnabled. If feedback is disabled, the menu shows Export Diagnostics + GitHub Issue without Report Problem, which feels inconsistent.

Fix: Consider placing Export Diagnostics next to Report Problem (both conditional), or in a separate Tools submenu.

source: "renderer-diagnostics",
generated_at: (input.now ?? new Date()).toISOString(),
diagnostics: {
status,
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: SliceInput requires now: Date but the public slice() method uses Omit<SliceInput, "appLaunchID" | "now">. This creates a confusing type where the internal type has a required field that callers never provide.

Fix: Split into SliceInput (public, without now/appLaunchID) and InternalSliceInput (extends with required internals), or rename for clarity.

emit: emitRendererDiagnostic,
})

createEffect(() => {
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: route_ready and visible_ready both use timelineMessagesReady(), so they are always identical. If the intent is to distinguish route-level readiness from visible/timeline readiness, this does not achieve that.

Fix: Either use different signals (e.g. routeReady() vs timelineMessagesReady()) or collapse to a single ready field.

@@ -0,0 +1,566 @@
import { appendFile, mkdir, readFile, rename, rm, stat, writeFile } from "node:fs/promises"
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: This file is 566 lines and contains four distinct concerns: sanitization, recording, slicing, and export. Per AGENTS.md guidance, files growing past a few hundred lines should prompt extraction into cohesive helpers.

Fix: Split into renderer-diagnostics/sanitizer.ts, recorder.ts, slice.ts, and exporter.ts.

"incident.session_timeline_remount": ["timeline_mount_count", "timeline_unmount_count"],
"incident.session_visible_messages_cleared": ["before_count", "during_count", "after_count"],
"incident.session_layout_shift": ["cls", "phase"],
"incident.session_jank_burst": ["long_task_max_ms", "frame_gap_ms", "phase"],
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: numberField accepts any finite number without range validation. Negative monotonic_ms or extremely large values (e.g. 1e308) will pass through and could corrupt downstream analytics.

Fix: Add a reasonable range guard: value >= 0 && value < 1e15.

distance_from_bottom: distanceFromBottom,
client_height: clientHeight,
user_scrolled: userScrolled ?? false,
},
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: detectSessionScrollJumpToTop uses Math.max(100, clientHeight / 2) as threshold. On a 4K display (clientHeight ~1200), the threshold becomes 600px. A user scrolled to 500px from bottom (common in long conversations) would be falsely flagged as a jump-to-top.

Fix: Use a relative threshold capped at a reasonable maximum, e.g. Math.min(200, clientHeight * 0.3).

routeSessionID: Accessor<string | undefined>
visibleSessionID: Accessor<string | undefined>
timelineSessionID: Accessor<string | undefined>
emit?: (event: RendererDiagnosticInput) => Promise<void> | void
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: usedJSHeapSize / 1024 / 1024 assumes the value is in bytes. The performance.memory API is Chrome-specific and documented as bytes, but this assumption should be explicit.

Fix: Add a comment confirming the byte assumption, or record raw bytes to avoid ambiguity.

clearTimeout(timer)
}
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: New code uses 4-space indentation while the rest of this file uses 2-space. The project disables Biome formatter, so this inconsistency will persist.

Fix: Match the existing 2-space indentation in server-client.ts.

@@ -477,6 +541,9 @@ function wireMenu() {
reportProblem: () => {
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: menuLabel(focusedMenuLocale(), ...) assumes focusedMenuLocale() always returns a valid MenuLocale. While unlikely, the type system allows undefined.

Fix: Add a fallback: menuLabel(focusedMenuLocale() ?? 'en', ...).

},
rendererDiagnostics: {
status: "ok" as const,
source: "renderer-diagnostics" as const,
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: The test data uses ts instead of time for the event timestamp field. While the test passes because parseProblemReportPayload is lenient, copying this data for other tests could cause type mismatches.

Fix: Use time to match the RendererDiagnosticEvent type definition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Task] Add renderer diagnostics for session jump and flicker observability

1 participant