feat(logging): server-side client log + crash capture (#106) by jaylfc · Pull Request #1436 · jaylfc/taOS

jaylfc · 2026-06-25T11:59:37Z

What

A backend sink for browser/PWA logs and crashes, since a PWA has no console anyone can read.

ClientLogStore (new): bounded SQLite store (ring-buffer prune to the most recent rows; message/stack length-capped).
POST /api/client-logs (authenticated): record a client-side log line (level in fatal/error/warn/info/debug, message, source, url, stack; user_agent captured server-side).
GET /api/client-logs (admin only): most recent entries, optional level filter (logs may carry stack traces + URLs).
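
The store described above can be pictured with a minimal sketch. This is not the PR's `ClientLogStore`: it is synchronous, uses the standard-library `sqlite3` module, keeps only a few columns, and the class name, the `max_rows` parameter, and the two length caps are assumptions made for illustration. The 2000-row default is the figure quoted later in the review.

```python
import sqlite3
import uuid
from datetime import datetime, timezone
from typing import Optional

MAX_MESSAGE_LEN = 4000   # hypothetical cap; the PR does not state the value
MAX_STACK_LEN = 16000    # hypothetical cap
VALID_LEVELS = ("fatal", "error", "warn", "info", "debug")


class ClientLogStoreSketch:
    """Synchronous stand-in for a bounded client-log store."""

    def __init__(self, path: str = ":memory:", max_rows: int = 2000) -> None:
        self._max_rows = max_rows
        self._db = sqlite3.connect(path)
        self._db.row_factory = sqlite3.Row
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS client_logs ("
            "id TEXT PRIMARY KEY, created_at TEXT NOT NULL, level TEXT NOT NULL, "
            "message TEXT NOT NULL, stack TEXT NOT NULL)"
        )

    def create(self, level: str, message: str, stack: str = "") -> dict:
        if level not in VALID_LEVELS:
            raise ValueError(f"invalid level: {level!r}")
        rec = {
            "id": str(uuid.uuid4()),
            "created_at": datetime.now(timezone.utc).isoformat(),
            "level": level,
            "message": message[:MAX_MESSAGE_LEN],  # length-capped
            "stack": stack[:MAX_STACK_LEN],        # length-capped
        }
        self._db.execute(
            "INSERT INTO client_logs VALUES (:id, :created_at, :level, :message, :stack)",
            rec,
        )
        # Ring-buffer prune: keep only the newest max_rows rows (by insert order).
        self._db.execute(
            "DELETE FROM client_logs WHERE rowid NOT IN ("
            "SELECT rowid FROM client_logs ORDER BY rowid DESC LIMIT ?)",
            (self._max_rows,),
        )
        self._db.commit()
        return rec

    def list_recent(self, level: Optional[str] = None, limit: int = 100) -> list:
        sql = "SELECT * FROM client_logs"
        params: list = []
        if level:
            sql += " WHERE level = ?"
            params.append(level)
        sql += " ORDER BY rowid DESC LIMIT ?"
        params.append(limit)
        return [dict(row) for row in self._db.execute(sql, params).fetchall()]
```

A POST handler would call `create(...)` with the validated body and a GET handler would call `list_recent(level=...)`; the routing and authentication layers are left out here.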

Why

Jay: 'I cant read console logs from a pwa and neither will other users be able to. we need full logging and debug so we can chase the errors.' This is the substrate; once it lands and the Pi updates, a crash like the Messages app failure records itself and is readable server-side instead of needing a console.

Scope

Backend only. The front-end wiring (error-boundary + global handler POSTing here) is a deliberate follow-up PR. Wired into app.py (construct/init/close/state) + routes + the conftest client fixture, mirroring feedback_store.

Tests

10 new (store: create/list/level-filter/truncation/ring-buffer cap; routes: 201 post, admin list, level filter, invalid level, empty message, non-admin can post but not read). Feedback + observatory route suites still green (18).

A PWA has no readable console, so a front-end crash is invisible to the user and to us. Add a backend sink: POST /api/client-logs (authenticated) records a browser-side error/warn/info/debug/fatal line; GET /api/client-logs (admin) returns the most recent entries, optionally filtered by level. ClientLogStore is bounded (ring-buffer prune to the most recent rows; message/stack length-capped). This is the substrate for chasing crashes like the Messages app failure; the front-end error-boundary wiring is a follow-up.

gitar-bot · 2026-06-25T12:01:29Z

+    ) -> dict:
+        assert self._db is not None
+        item_id = str(uuid.uuid4())
+        created_at = datetime.now(timezone.utc).isoformat()


💡 Edge Case: Ring-buffer prune/ordering relies on created_at with no tie-breaker

created_at is a datetime.now(timezone.utc).isoformat() string and is the sole sort key used both for the ring-buffer prune (ORDER BY created_at DESC LIMIT MAX_ROWS) and for list_recent ordering. Two inserts that land in the same microsecond (plausible under the documented "crash loop posting on every render" scenario) produce identical created_at values, making the relative ordering of those rows undefined. At the MAX_ROWS boundary this means the prune can drop an arbitrary one of the tied rows, and list_recent can return tied rows in an unstable order. Functionally minor since the cap is approximate, but it weakens the "keep the most recent N" guarantee.

Suggested fix: add the primary key as a secondary sort key (e.g. ORDER BY created_at DESC, id DESC) in both the prune subquery and list_recent, or store a monotonically increasing INTEGER PRIMARY KEY / rowid and order by that.
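
A small sketch of the tie-breaker idea, using the standard-library `sqlite3` module. The three-column table and the fixed timestamp are assumptions chosen to force a tie; the rows here use SQLite's implicit `rowid`, which is the second option the comment mentions. Ordering by `id DESC` would also be deterministic, but with UUID ids it would not reflect insert order.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE client_logs (id TEXT PRIMARY KEY, created_at TEXT, message TEXT)"
)

# Three rows deliberately sharing one timestamp, as a crash loop might produce.
same_ts = "2026-06-25T12:00:00.000001+00:00"
for n, item_id in enumerate(["c", "a", "b"]):
    db.execute(
        "INSERT INTO client_logs VALUES (?, ?, ?)", (item_id, same_ts, f"line {n}")
    )


def newest(order_by: str, limit: int) -> list:
    """Return the messages of the newest rows under the given ORDER BY clause."""
    sql = f"SELECT message FROM client_logs ORDER BY {order_by} LIMIT ?"
    return [row[0] for row in db.execute(sql, (limit,))]


# created_at alone leaves the tied rows in an unspecified order.
# Adding rowid as a secondary key makes the order total: latest insert first.
stable = newest("created_at DESC, rowid DESC", 2)
```

With the tie-breaker in place, the same `ORDER BY` can be used in both the prune subquery and the listing query so that the two agree on which rows count as newest.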

gitar-bot · 2026-06-25T12:01:30Z

+        # Ring-buffer prune: drop everything older than the newest MAX_ROWS.
+        await self._db.execute(
+            """
+            DELETE FROM client_logs WHERE id NOT IN (
+                SELECT id FROM client_logs ORDER BY created_at DESC LIMIT ?
+            )
+            """,
+            (MAX_ROWS,),
+        )
+        await self._db.commit()


💡 Performance: Ring-buffer prune runs a full-table DELETE on every insert

create() issues the DELETE FROM client_logs WHERE id NOT IN (SELECT id ... ORDER BY created_at DESC LIMIT ?) on every single insert, even when the table is well below MAX_ROWS (the common case). The NOT IN subquery materializes and scans the table on each write, which is pure overhead for the typical small-table case and adds write amplification exactly in the crash-loop scenario this endpoint is meant to absorb. Since any authenticated user can POST and there is no rate limiting, this amplifies the cost of a flood.

Suggested fix: only prune occasionally rather than every insert (e.g. probabilistically, every Nth insert, or when a cheap COUNT(*) exceeds MAX_ROWS by some slack), or prune by timestamp/rowid threshold instead of NOT IN.
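
One way to read the suggestion is sketched below: check the row count first, and only when it overshoots the cap by some slack, delete everything below a rowid threshold. The function name, the `PRUNE_SLACK` value, and the use of the implicit `rowid` are assumptions for illustration, not the PR's code.

```python
import sqlite3

MAX_ROWS = 2000     # cap quoted in the review
PRUNE_SLACK = 200   # hypothetical headroom before a prune is triggered


def prune_if_needed(
    db: sqlite3.Connection, max_rows: int = MAX_ROWS, slack: int = PRUNE_SLACK
) -> int:
    """Trim the table to the newest max_rows rows once it overshoots by slack.

    Returns the number of rows deleted (0 when no prune was needed).
    """
    (count,) = db.execute("SELECT COUNT(*) FROM client_logs").fetchone()
    if count <= max_rows + slack:
        return 0
    # rowid of the oldest row that should survive the prune.
    (keep_from,) = db.execute(
        "SELECT rowid FROM client_logs ORDER BY rowid DESC LIMIT 1 OFFSET ?",
        (max_rows - 1,),
    ).fetchone()
    # A range comparison on rowid replaces the NOT IN subquery.
    cur = db.execute("DELETE FROM client_logs WHERE rowid < ?", (keep_from,))
    db.commit()
    return cur.rowcount


# Usage: a tiny table with a small cap so the effect is visible.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE client_logs (id TEXT PRIMARY KEY, message TEXT)")
for n in range(10):
    demo.execute("INSERT INTO client_logs VALUES (?, ?)", (str(n), f"line {n}"))
deleted = prune_if_needed(demo, max_rows=5, slack=2)
```

The trade-off is that the table may temporarily hold up to `max_rows + slack` rows, which fits the comment's remark that the cap is approximate.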

gitar-bot · 2026-06-25T12:01:31Z

Code Review 👍 Approved with suggestions 0 resolved / 2 findings

Adds a backend SQLite-based log sink for browser crash capture and telemetry. Add a tie-breaker to the ring-buffer pruning and optimize the deletion logic to avoid full-table scans on every insert.

💡 Edge Case: Ring-buffer prune/ordering relies on created_at with no tie-breaker

📄 tinyagentos/client_log_store.py:67 📄 tinyagentos/client_log_store.py:92-99 📄 tinyagentos/client_log_store.py:113 📄 tinyagentos/client_log_store.py:118

💡 Performance: Ring-buffer prune runs a full-table DELETE on every insert

📄 tinyagentos/client_log_store.py:91-100

kilo-code-bot · 2026-06-25T12:01:52Z

+        await self._db.execute(
+            """
+            DELETE FROM client_logs WHERE id NOT IN (
+                SELECT id FROM client_logs ORDER BY created_at DESC LIMIT ?


WARNING: Ring-buffer prune ORDER BY created_at DESC LIMIT ? has no tiebreaker. created_at is datetime.now(timezone.utc).isoformat() (microsecond precision), so concurrent inserts, or any two writes in the same microsecond, share a value and SQLite returns rows in arbitrary order from the inner SELECT. The DELETE then keeps a non-deterministic subset of the 2000 "newest" rows — under load the cap effectively becomes "2000 rows in some order". Add a deterministic tiebreaker (e.g. ORDER BY created_at DESC, id DESC LIMIT ?).

kilo-code-bot · 2026-06-25T12:01:52Z

+
+
+class ClientLogIn(BaseModel):
+    level: str


WARNING: ClientLogIn.level: str is unconstrained — any string is accepted at the Pydantic layer. The actual whitelist check (level in VALID_LEVELS) happens in the handler at line 33, which is fine, but a missing/typed-differently level (e.g. an int, or a client sending null) will 500 inside body.level.strip().lower() rather than producing a clean 400. Use Literal["fatal","error","warn","info","debug"] (or a Pydantic field_validator) so validation happens in the model and the contract is explicit.
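
The idea of making the allowed set part of the contract can be sketched with the standard library alone. The `Level` alias and `parse_level` helper below are hypothetical names; in the PR's setting the same `Literal` type would be declared on the Pydantic model field, as the comment proposes, rather than checked by hand.

```python
from typing import Literal, get_args

# The five levels named in the PR description.
Level = Literal["fatal", "error", "warn", "info", "debug"]
VALID_LEVELS = frozenset(get_args(Level))


def parse_level(raw: object) -> str:
    """Return a normalized level, or raise ValueError for anything else.

    Non-string input is rejected up front so that .strip() is never
    called on a value that lacks it.
    """
    if not isinstance(raw, str):
        raise ValueError("level must be a string")
    lvl = raw.strip().lower()
    if lvl not in VALID_LEVELS:
        raise ValueError(f"level must be one of {sorted(VALID_LEVELS)}")
    return lvl
```

A handler that maps `ValueError` to a 4xx response then has a single failure path for every malformed level.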

kilo-code-bot · 2026-06-25T12:01:52Z

+    )
+    return rec
+
+


SUGGESTION: Two minor inconsistencies on the GET handler:

level: str | None = None is unconstrained — a query like ?level[]=error will reach lvl.strip() and raise AttributeError → 500. FastAPI will accept a list, so guard it (e.g. level: str = "" plus the empty check, or Query(...) with the right type).

An empty ?level= is silently treated as "no filter" ("".strip().lower() is falsy, so the if lvl and lvl not in VALID_LEVELS check is skipped), while ?level=bogus returns 400. Either reject empty explicitly or document the behavior so it's not a surprise.
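
A sketch of the "reject empty explicitly" option, assuming the query parameter arrives either as `None` (absent) or as a string. The helper name is hypothetical and the framework wiring is omitted.

```python
from typing import Optional

VALID_LEVELS = frozenset({"fatal", "error", "warn", "info", "debug"})


def parse_level_filter(raw: Optional[str]) -> Optional[str]:
    """Map the optional ?level= query value to a filter.

    None (parameter absent) means "no filter". Any supplied value,
    including the empty string, must normalize to a valid level.
    """
    if raw is None:
        return None
    lvl = raw.strip().lower()
    if lvl not in VALID_LEVELS:
        # ?level= and ?level=bogus now fail the same way.
        raise ValueError(f"level must be one of {sorted(VALID_LEVELS)}")
    return lvl
```

Documenting the current behavior instead would be the other option the comment offers; this sketch only shows the stricter variant.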

kilo-code-bot · 2026-06-25T12:01:52Z

+    stack: str = ""
+
+
+@router.post("/api/client-logs", status_code=201)


WARNING: POST /api/client-logs is authenticated but not rate-limited, and the sink is a shared 2000-row ring buffer. Any logged-in user can flood the endpoint to evict other users' recent errors / crashes from the table before an admin can read them — a deliberate-log-spam DoS against debuggability. Consider a per-user rate limit (and/or a per-user partition in the ring buffer, since the admin view is keyed by level not by user_id).

kilo-code-bot · 2026-06-25T12:01:52Z

+import pytest
+
+from tinyagentos.auth_context import CurrentUser, current_user
+


SUGGESTION: There is no end-to-end test verifying that the User-Agent request header is captured server-side and persisted on the row. The store accepts and truncates user_agent (see ClientLogStore.create and test_long_message_and_stack_are_truncated in the store tests), and the route passes request.headers.get("user-agent", "") — but no test exercises that path. Add an assertion that user_agent round-trips through the POST.

kilo-code-bot · 2026-06-25T12:01:53Z

+
+
+@pytest.mark.asyncio
+async def test_long_message_and_stack_are_truncated(store):


SUGGESTION: test_long_message_and_stack_are_truncated only covers message and stack. The store also caps source (200), url (1000), and user_agent (500) — none of those truncation behaviors are tested. Add a sibling assertion that over-long source / url / user_agent are truncated to their respective MAX_*_LEN caps.
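
The shape of such a check is sketched below against a stand-in `clip` function, since the real test would run against the async store fixture. The caps are the ones quoted in the comment; the constant names and `clip` itself are assumptions.

```python
# Caps quoted in the review comment; the constant names are assumed.
MAX_SOURCE_LEN = 200
MAX_URL_LEN = 1000
MAX_USER_AGENT_LEN = 500

CAPS = {
    "source": MAX_SOURCE_LEN,
    "url": MAX_URL_LEN,
    "user_agent": MAX_USER_AGENT_LEN,
}


def clip(record: dict) -> dict:
    """Hypothetical stand-in for the store's per-field length capping."""
    return {k: (v[: CAPS[k]] if k in CAPS else v) for k, v in record.items()}


def check_caps() -> None:
    """Feed each capped field an over-long value and assert the stored length."""
    oversized = {name: "x" * (cap + 50) for name, cap in CAPS.items()}
    stored = clip(oversized)
    for name, cap in CAPS.items():
        assert len(stored[name]) == cap, name
```

Looping over the cap table keeps the three assertions in one place, so a new capped field only needs one new entry.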

kilo-code-bot · 2026-06-25T12:02:04Z

Code Review Summary

Status: 2 Issues Found | Recommendation: Address before merge

Overview

Severity	Count
CRITICAL	0
WARNING	0
SUGGESTION	2

CRITICAL

None

WARNING

None

SUGGESTION

File	Line	Issue
`tests/test_routes_client_logs.py`	53	New `test_post_is_rate_limited_per_user` only exercises a single user, so it does not verify that `RateLimiter` buckets are independent per `user_id`
`tinyagentos/routes/client_logs.py`	22	Module-level `_post_limiter` singleton has an unbounded `RateLimiter._buckets` dict (one `TokenBucket` per distinct `user_id`, never evicted)

Previous Issues — Re-verified on Current HEAD

File	Line	Previous	Status
`tinyagentos/client_log_store.py`	95	Ring-buffer prune `ORDER BY created_at DESC` has no tiebreaker	Still open — unchanged in this increment
`tinyagentos/routes/client_logs.py`	26	`ClientLogIn.level: str` unconstrained	Still open — unchanged in this increment
`tinyagentos/routes/client_logs.py`	33	POST `/api/client-logs` not rate-limited	Resolved — `b2dc838b` adds per-user `_post_limiter` (capacity=30, refill 1/sec) and a 429 on exhaustion; covered by new `test_post_is_rate_limited_per_user`
`tinyagentos/routes/client_logs.py`	62	GET `level` param unconstrained + `?level=` silently no-ops	Still open — unchanged in this increment
`tests/test_routes_client_logs.py`	5	No end-to-end test that `User-Agent` header round-trips to `user_agent` column	Still open — unchanged in this increment
`tests/test_client_log_store.py`	39	Truncation test covers `message`/`stack` only	Still open — unchanged in this increment

Files Reviewed (2 files changed in this increment)

tinyagentos/routes/client_logs.py - 1 issue
tests/test_routes_client_logs.py - 1 issue

Previous Review Summary (commit f302ed1)

Current summary above is authoritative. Previous snapshots are kept for context only.

Status: 6 Issues Found | Recommendation: Address before merge

Overview

Severity	Count
CRITICAL	0
WARNING	3
SUGGESTION	3

WARNING

File	Line	Issue
`tinyagentos/client_log_store.py`	95	Ring-buffer prune `ORDER BY created_at DESC LIMIT ?` has no tiebreaker; concurrent inserts with identical microsecond `created_at` make the cap non-deterministic
`tinyagentos/routes/client_logs.py`	20	`ClientLogIn.level: str` is unconstrained; non-string values 500 inside `.strip().lower()` instead of returning 400
`tinyagentos/routes/client_logs.py`	27	POST is authenticated but not rate-limited; a single user can flood the shared 2000-row ring buffer to evict other users' recent errors (DoS against debuggability)

SUGGESTION

File	Line	Issue
`tinyagentos/routes/client_logs.py`	52	GET `level` param is unconstrained (list param 500s on `.strip()`) and `?level=` is silently treated as no-filter while `?level=invalid` returns 400
`tests/test_routes_client_logs.py`	5	No end-to-end test asserts that the `User-Agent` request header round-trips into the persisted `user_agent` column
`tests/test_client_log_store.py`	39	Truncation test covers `message`/`stack` only; `source` (200), `url` (1000), `user_agent` (500) caps are untested

Files Reviewed (7 files)

tests/conftest.py - 0 issues
tests/test_client_log_store.py - 1 issue
tests/test_routes_client_logs.py - 1 issue
tinyagentos/app.py - 0 issues
tinyagentos/client_log_store.py - 1 issue
tinyagentos/routes/__init__.py - 0 issues
tinyagentos/routes/client_logs.py - 3 issues

Reviewed by minimax-m3 · Input: 39.6K · Output: 6.2K · Cached: 382.5K

Without a cap, a crash loop or one user could flood the shared 2000-row ring buffer and evict everyone else's recent errors (a DoS against debuggability). Add a per-user token bucket (burst ~30, ~1/sec sustained); only valid writes are limited, so malformed requests still get a clean 400 and never touch the buffer.
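
The per-user token bucket described above can be sketched as follows. The PR's `RateLimiter` and `TokenBucket` are not shown in this page, so the class names, the `allow` method, and the injectable `clock` parameter are assumptions; only the capacity of 30 and the refill rate of 1 per second come from the text.

```python
import time
from typing import Callable, Dict


class TokenBucketSketch:
    """One bucket: holds up to `capacity` tokens, refilled continuously."""

    def __init__(self, capacity: float, refill_per_second: float, now: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start full so a burst is allowed
        self.updated = now

    def try_take(self, now: float) -> bool:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiterSketch:
    """Keeps an independent bucket per key (here: per user id)."""

    def __init__(
        self,
        capacity: float = 30,
        refill_per_second: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._buckets: Dict[str, TokenBucketSketch] = {}

    def allow(self, key: str) -> bool:
        now = self._clock()
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = TokenBucketSketch(self._capacity, self._refill, now)
            self._buckets[key] = bucket
        return bucket.try_take(now)
```

In a handler, a `False` from `allow(user_id)` would translate to a 429, and calling it only after the body has validated keeps malformed requests from consuming tokens. Passing a fake `clock` makes the refill behavior testable without sleeping.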

kilo-code-bot · 2026-06-25T12:10:14Z

+
+
+@pytest.mark.asyncio
+async def test_post_is_rate_limited_per_user(client, monkeypatch):


SUGGESTION: The test only exercises a single user, so it does not verify that buckets are independent per user. The whole point of RateLimiter(key=user.user_id) is that user A being throttled does not block user B — and the same _post_limiter singleton is shared across all users. Add a sibling assertion: with capacity=2, two posts as user A then two as user B should both succeed (and only a third post by either user is rejected).

kilo-code-bot · 2026-06-25T12:10:14Z

+# Per-user token bucket: a crash loop (or one user) must not be able to flood the
+# shared ring buffer and evict everyone else's recent errors. Allows a burst of
+# ~30 lines (a noisy crash) then ~1/sec sustained.
+_post_limiter = RateLimiter(capacity=30, refill_per_second=1.0)


SUGGESTION: _post_limiter is a module-level singleton whose RateLimiter._buckets dict grows without bound — every distinct user.user_id adds a TokenBucket that is never evicted. For a self-hosted single-process controller with a stable user set this is fine in practice, but under churn (test users, ephemeral sessions, user-id rotation) the dict can grow unboundedly for the life of the process. Consider either (a) documenting this trade-off next to the singleton, or (b) bounding the registry (e.g. an LRU cap) to prevent the latent memory leak.
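
Option (b) could look roughly like the sketch below, which caps the number of tracked keys with an LRU policy built on `collections.OrderedDict`. The class name, `max_keys`, and the `allow` method are assumptions, and the bucket state is stored as a plain tuple to keep the example short.

```python
import time
from collections import OrderedDict
from typing import Callable


class BoundedRateLimiterSketch:
    """Token-bucket limiter whose per-key registry is capped at max_keys."""

    def __init__(
        self,
        capacity: float = 30,
        refill_per_second: float = 1.0,
        max_keys: int = 1024,  # hypothetical bound on tracked users
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._capacity = float(capacity)
        self._refill = refill_per_second
        self._max_keys = max_keys
        self._clock = clock
        # key -> (tokens, last_seen); insertion order doubles as LRU order.
        self._buckets: "OrderedDict[str, tuple]" = OrderedDict()

    def allow(self, key: str) -> bool:
        now = self._clock()
        # Unknown keys start with a full bucket.
        tokens, last = self._buckets.pop(key, (self._capacity, now))
        tokens = min(self._capacity, tokens + max(0.0, now - last) * self._refill)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[key] = (tokens, now)  # re-insert as most recently used
        while len(self._buckets) > self._max_keys:
            self._buckets.popitem(last=False)  # evict least recently used
        return allowed

    def __len__(self) -> int:
        return len(self._buckets)
```

The cost of bounding is that an evicted user comes back with a full bucket, so `max_keys` should comfortably exceed the number of users active at once.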

gitar Edge-Case on #1436: list ordering and the ring-buffer prune used the created_at ISO string as the sole sort key, so same-microsecond rows under a crash loop tied and made both non-deterministic. Order and retain by rowid (monotonic insert order) as the tie-breaker, and prune via an indexed rowid comparison instead of a per-insert full-table NOT IN scan (this also covers the deferred performance nit). Adds a tie-broken prune+order test.

jaylfc enabled auto-merge (squash) June 25, 2026 11:59

jaylfc added this to TinyAgentOS Roadmap Jun 25, 2026

github-project-automation Bot moved this to Todo in TinyAgentOS Roadmap Jun 25, 2026

gitar-bot Bot reviewed Jun 25, 2026

kilo-code-bot Bot reviewed Jun 25, 2026

jaylfc added a commit that referenced this pull request Jun 25, 2026

docs(status): record 2026-06-25 session (#1433/#1434/#1435 merged, #1436 inflight, auth incident #135, bot-budget)

2a6515d

jaylfc mentioned this pull request Jun 25, 2026

feat(logging): front-end crash capture + in-OS Logs viewer (#106) #1437

Merged

jaylfc merged commit a0e44f1 into dev Jun 25, 2026
8 checks passed

github-project-automation Bot moved this from Todo to Done in TinyAgentOS Roadmap Jun 25, 2026

jaylfc mentioned this pull request Jun 25, 2026

fix(logging): rowid tie-breaker for client-log ring buffer (#137) #1438

Merged
