fix(parser): keep last occurrence of streaming message id within file (#110) #155
ousamabenyounes wants to merge 1 commit into getagentseal:main
…getagentseal#110)

Claude Code streams an assistant response across several JSONL writes that share the same `message.id`: an early `message_start` (empty content), optional mid-stream updates, and a final `message_stop` carrying the `tool_use` blocks plus authoritative usage. `groupIntoTurns` deduplicates by id keeping the FIRST occurrence, so `tool_use` (MCP, Agent, EnterPlanMode, …) and final token counts in the last entry were silently dropped.

Add a within-file pre-pass `dedupeStreamingMessageIds` in `parseSessionFile` that keeps the LAST occurrence of each message id. Cross-file dedup against `seenMsgIds` in `groupIntoTurns` stays keep-first-seen (it serves a different purpose: avoiding double-counting when the same session appears under several project dirs).

Adds `tests/parser-streaming-dedup.test.ts` covering streaming dedup, mixed user/assistant entries, no-id passthrough, and ordering between distinct ids.

Co-Authored-By: Ora Studio <[email protected]>
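The keep-last pre-pass described in the commit message can be sketched roughly as follows. This is an illustration, not the code from `src/parser.ts`: the `JournalEntry` shape is a hypothetical simplification, and the choice to emit each surviving entry at the position of its last occurrence is an assumption the source does not confirm.

```typescript
// Hypothetical, simplified shape of one parsed JSONL line.
// The real entry type in src/parser.ts may carry more fields.
interface JournalEntry {
  type: "user" | "assistant";
  message?: { id?: string; content?: unknown[] };
}

// Keep only the LAST occurrence of each message.id within one file.
// Entries without an id pass through untouched.
// Assumption: a surviving entry stays at the position of its last occurrence.
function dedupeStreamingMessageIds(entries: JournalEntry[]): JournalEntry[] {
  // First pass: remember the index of the final write for every id.
  const lastIndexById = new Map<string, number>();
  entries.forEach((entry, index) => {
    const id = entry.message?.id;
    if (id !== undefined) lastIndexById.set(id, index);
  });

  // Second pass: drop every earlier write that shares an id with a later one.
  return entries.filter((entry, index) => {
    const id = entry.message?.id;
    return id === undefined || lastIndexById.get(id) === index;
  });
}

// Example: msg_1 is written twice; only the second write has the tool_use block.
const entries: JournalEntry[] = [
  { type: "user", message: { content: ["run the tests"] } },
  { type: "assistant", message: { id: "msg_1", content: [] } },
  { type: "assistant", message: { id: "msg_1", content: [{ type: "tool_use", name: "Bash" }] } },
  { type: "assistant", message: { id: "msg_2", content: [{ type: "text", text: "done" }] } },
];
const deduped = dedupeStreamingMessageIds(entries);
```

In this example `deduped` holds three entries: the user entry, the final `msg_1` write with its `tool_use` block, and `msg_2`. The empty first write of `msg_1` is the only one removed.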
Hi, `dedupeStreamingMessageIds` keeps only the last occurrence of each `message.id`, so `parseApiCall`/`groupIntoTurns` use the later streaming update's timestamp (typically that of `message_stop`) as the call timestamp. Because the code explicitly treats the first assistant call timestamp as the time the cost was incurred, this can wrongly include or exclude turns in `dateRange` filtering and shift day bucketing near midnight.

Severity: action required | Category: correctness

How to fix: Preserve first timestamp per id

Found by Qodo code review
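One possible reading of "preserve first timestamp per id" is sketched below: keep the content of the last write, but stamp it with the timestamp of the first write for that id. This is not taken from the PR. `StreamedEntry` and `dedupeKeepLastWithFirstTimestamp` are hypothetical names, and the sketch assumes every entry has an ISO-8601 `timestamp` string.

```typescript
// Hypothetical entry shape for illustration; assumes a timestamp on every line.
interface StreamedEntry {
  timestamp: string;
  message?: { id?: string; content?: unknown[] };
}

// Keep the last write per message.id, but carry over the FIRST timestamp seen
// for that id, so date filtering and day bucketing still use the start of the call.
function dedupeKeepLastWithFirstTimestamp(entries: StreamedEntry[]): StreamedEntry[] {
  const firstTimestampById = new Map<string, string>();
  const lastIndexById = new Map<string, number>();

  entries.forEach((entry, index) => {
    const id = entry.message?.id;
    if (id === undefined) return;
    if (!firstTimestampById.has(id)) firstTimestampById.set(id, entry.timestamp);
    lastIndexById.set(id, index);
  });

  const result: StreamedEntry[] = [];
  entries.forEach((entry, index) => {
    const id = entry.message?.id;
    if (id === undefined) {
      result.push(entry); // no id: pass through unchanged
      return;
    }
    if (lastIndexById.get(id) !== index) return; // superseded by a later write
    // Copy instead of mutating, so the caller's input stays intact.
    result.push({ ...entry, timestamp: firstTimestampById.get(id) ?? entry.timestamp });
  });
  return result;
}

// Example: a response that starts before midnight and finishes after it.
const streamed: StreamedEntry[] = [
  { timestamp: "2025-01-01T23:59:50Z", message: { content: ["hi"] } },
  { timestamp: "2025-01-01T23:59:58Z", message: { id: "msg_1", content: [] } },
  { timestamp: "2025-01-02T00:00:03Z", message: { id: "msg_1", content: [{ type: "tool_use", name: "Agent" }] } },
];
const merged = dedupeKeepLastWithFirstTimestamp(streamed);
```

Here the surviving `msg_1` entry keeps its `tool_use` block but carries the `2025-01-01T23:59:58Z` timestamp, so it would still be bucketed on January 1.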
Summary
Fixes #110.
Claude Code writes the same `message.id` multiple times to the session JSONL as a response streams in. Only the final write carries the `tool_use` blocks (MCP servers, Agent, EnterPlanMode, …) and the authoritative token counts. The existing dedup in `groupIntoTurns` keeps the first occurrence by id, so for every streamed turn:

- `tool_use` blocks for Agent / EnterPlanMode / Bash are dropped

Fix
Add a within-file pre-pass `dedupeStreamingMessageIds` in `parseSessionFile` that keeps the last occurrence of each `message.id`. The cross-file dedup in `groupIntoTurns` (against `seenMsgIds`) is correct for what it does (avoiding double-counting when the same session appears under multiple project dirs) and stays keep-first-seen.

The two concerns are now handled separately:
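- `parseSessionFile` (new): within-file dedup, keeps the last occurrence of each `message.id`
- `groupIntoTurns` (unchanged): cross-file dedup against `seenMsgIds`, keeps the first seen

Verification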
- `tests/parser-streaming-dedup.test.ts` fails on the unfixed code (verified by stashing the fix and re-running)
- `npx tsc --noEmit` clean
- `{}`-init maps in `src/parser.ts` or `src/providers/` (semgrep guard)

Files changed
- `src/parser.ts`: `dedupeStreamingMessageIds`; called from `parseSessionFile` before `groupIntoTurns`
- `tests/parser-streaming-dedup.test.ts`

Vibe Coded by Ousama Ben Younes
Developed With Ora Studio (Claude Code)