
feat: nanochat worker — Karpathy's LLM pipeline on iii-engine#3

Open
rohitg00 wants to merge 12 commits into main from
feat/nanochat-worker

Conversation


@rohitg00 rohitg00 commented Mar 29, 2026

Summary

A Python worker that brings Karpathy's nanochat onto the iii engine: 20 functions covering the full LLM pipeline, tested end-to-end. You can train a model, load it, generate text, chat, persist conversations, evaluate benchmarks, and manage checkpoints.

Training runs the actual nanochat scripts via a pre-forked subprocess launcher (the fork happens before iii connects, so the WebSocket stays intact). Inference, eval, and tokenization run in-process. All state goes through iii primitives, and all handlers are typed with Pydantic for automatic schema extraction.

nanochat source included as a git submodule at nanochat-upstream/.

What's included

  • nanochat/worker.py - 1,016 lines, 20 functions, 20 HTTP triggers, 5 state scopes
  • nanochat/pyproject.toml - Dependencies with [build-system]
  • nanochat/README.md - Full documentation with E2E test results and architecture
  • nanochat/nanochat-upstream - Git submodule pointing to karpathy/nanochat
  • registry/index.json - Updated with nanochat entry
  • image-resize/src/manifest.rs - Fixed pre-existing test failure

Functions

Function What it does
nanochat.chat.complete Chat completion with session persistence
nanochat.chat.stream Token-by-token generation
nanochat.chat.history Conversation history from iii state
nanochat.model.load Load checkpoint into memory
nanochat.model.status Model config and parameter count
nanochat.model.sample Raw text generation
nanochat.tokenizer.encode Text to BPE tokens
nanochat.tokenizer.decode Tokens to text
nanochat.tools.execute Python code execution
nanochat.train.tokenizer Train BPE tokenizer (runs scripts/tok_train.py)
nanochat.train.base Pretrain GPT from scratch (runs scripts/base_train.py)
nanochat.train.sft SFT with full task mixture (runs scripts/chat_sft.py)
nanochat.train.rl GRPO on GSM8K (runs scripts/chat_rl.py)
nanochat.train.status Training progress from iii state
nanochat.eval.core CORE benchmark (calls evaluate_core)
nanochat.eval.loss Bits-per-byte (calls evaluate_bpb)
nanochat.eval.chat ChatCORE eval (calls run_chat_eval)
nanochat.checkpoint.save Save model to disk
nanochat.checkpoint.list List checkpoints
nanochat.health Worker health check

E2E test results

Trained a 2-layer 1.9M param GPT on CPU (5 steps), loaded it through the worker, ran inference:

1. Load model   -> 1,966,134 params, 2 layers, 128 dim
2. Sample       -> generates text
3. Chat         -> completion with session tracking (26 tokens)
4. History      -> 1 session stored in iii state
5. Tokenizer    -> encode/decode roundtrip
6. Tools        -> print(42) = 42
7. Model status -> full config visible
8. Health       -> worker alive after all operations

8/8 passed

Architecture decisions

Pre-forked subprocess launcher. fork() from inside iii-sdk handlers corrupts the WebSocket on macOS. The worker forks a child process before connecting to iii. Training handlers send jobs to the child via a Pipe. The child runs Popen safely. Results come back with stdout lines that get parsed for metrics and pushed to iii state.
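A minimal sketch of that pattern, assuming nothing about iii-sdk's API (the job format and the "metric:" marker convention are illustrative, not the worker's actual protocol): the launcher child is created before any network connection exists and receives jobs over a multiprocessing Pipe.

```python
import multiprocessing as mp
import subprocess
import sys

def launcher(conn):
    # Child process: runs one subprocess per job received over the pipe,
    # parses stdout for "metric:" lines, and sends results back.
    while True:
        job = conn.recv()
        if job is None:  # shutdown sentinel
            break
        proc = subprocess.run(
            [sys.executable, "-c", job["code"]],
            capture_output=True, text=True,
        )
        metrics = [ln for ln in proc.stdout.splitlines() if ln.startswith("metric:")]
        conn.send({"returncode": proc.returncode, "metrics": metrics})

# Use the fork start method explicitly: the point of the pattern is that the
# child is forked while the process has no live sockets to corrupt.
ctx = mp.get_context("fork")
parent_conn, child_conn = ctx.Pipe()
child = ctx.Process(target=launcher, args=(child_conn,), daemon=True)
child.start()  # the child exists BEFORE the worker would open its WebSocket

# ...the worker would connect to iii here; training handlers then do:
parent_conn.send({"code": "print('metric: loss=1.23')"})
result = parent_conn.recv()
parent_conn.send(None)
child.join()
```

The parent never calls Popen itself; it only talks to the pre-forked child over the pipe, which is what keeps the later WebSocket safe.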

Async handlers with trigger_async. All handlers that read/write iii state use async def and await trigger_async. GPU operations run synchronously within the async handler.

GPUState.snapshot(). All handlers copy model/tokenizer/engine/meta under lock before using them. Prevents races during concurrent model.load calls.
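The snapshot idea in plain Python (this GPUState is a simplified stand-in for the worker's class, with strings as placeholders for the model and tokenizer): copy every field under one lock acquisition, then work with the copies without holding the lock.

```python
import threading
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Snapshot:
    model: Optional[Any]
    tokenizer: Optional[Any]
    meta: dict

class GPUState:
    def __init__(self):
        self._lock = threading.Lock()
        self.model = None
        self.tokenizer = None
        self.meta = {}

    def load(self, model, tokenizer, meta):
        # model.load replaces all fields atomically.
        with self._lock:
            self.model, self.tokenizer, self.meta = model, tokenizer, dict(meta)

    def snapshot(self) -> Snapshot:
        # One lock acquisition copies everything; handlers then use the
        # snapshot freely while a concurrent load swaps the live fields.
        with self._lock:
            return Snapshot(self.model, self.tokenizer, dict(self.meta))

state = GPUState()
state.load("model-v1", "tok-v1", {"params": 1_966_134})
snap = state.snapshot()
state.load("model-v2", "tok-v2", {"params": 42})  # concurrent reload
assert snap.model == "model-v1"  # the earlier snapshot is unaffected
```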

safe() wrapper. Every handler is wrapped to catch exceptions, log server-side, and return clean error dicts. Prevents WebSocket crashes from unhandled exceptions.

SDK v0.10.0 patterns. Pydantic type hints for auto schema extraction. TelemetryOptions. Service hierarchy. No sleep between registrations.

Also included in this branch: proof, an iii worker that scans code changes and verifies them in a real Chromium browser using snapshot-driven accessibility testing.

25 registered functions:
- 14 browser tools (navigate, snapshot, click, type, screenshot,
  console logs, network requests, performance metrics, raw Playwright
  exec, assertions, CDP discovery, cookie injection)
- 11 pipeline functions (scan, coverage, execute, report, run,
  replay, flows, history, enqueue, cleanup)

8 HTTP endpoints for REST access.

Uses iii primitives throughout:
- All inter-function calls via iii.trigger()
- State for reports and saved flows
- Streams for real-time progress
- Queue + DLQ for CI runs with auto-retry
- Logger with OTel tracing

Default mode: Claude Code or Codex as the agent (no API key).
Automated mode: Anthropic API for headless CI (needs ANTHROPIC_API_KEY).

1,506 lines across 8 TypeScript files.

coderabbitai bot commented Mar 29, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds two new workers: a Python nanochat worker (model loading, chat, tokenizer, eval, SFT training, checkpoints, health) and a TypeScript proof worker (Playwright browser automation, Anthropic agent orchestration, git scan/coverage, replay, and persistence).

Changes

Cohort / File(s) Summary
nanochat docs & packaging
nanochat/README.md, nanochat/pyproject.toml, .gitmodules, nanochat/nanochat-upstream
Adds README describing APIs/state scopes/triggers, pyproject with package metadata/deps/entrypoint, git submodule config and upstream pointer update.
nanochat worker implementation
nanochat/worker.py
New Python worker: GPUState, safe handler wrapper, model load/status/sample, chat (streaming & batch), session persistence, tokenizer encode/decode, sandbox exec, checkpoint save/list, eval (CORE/loss/chat), queued SFT training with state progress, health endpoint, and many Pydantic schemas.
registry
registry/index.json
Adds nanochat registry entry with metadata, default_config, language, supported targets, and version.
proof docs & packaging
proof/README.md, proof/package.json, proof/tsconfig.json
Adds proof README, npm package metadata (Playwright/iii-sdk deps, postinstall Chromium), scripts, and TypeScript config.
proof types & tooling
proof/src/types.ts, proof/src/tools.ts
Introduces TS types (RunReport, StepResult, BrowserSession, etc.), tool definitions, and Anthropic tool mapping.
proof git scan & coverage
proof/src/context.ts
Implements git diff scanning (unstaged/staged/branch/commit) and heuristic test-coverage analysis resolving imports to source files.
proof agent & prompt builder
proof/src/agent.ts, proof/src/prompt.ts
Adds Anthropic-driven agent loop (runAgent) that parses step markers, invokes tools via iii triggers, streams progress, and builds user prompts from diffs/files/coverage.
proof browser automation & cookies
proof/src/browser.ts, proof/src/cookies.ts
Playwright session management with ARIA snapshot ref mapping, navigate/click/type/select/press handlers, screenshot/console/network/perf capture, sandboxed exec, cookie extraction/injection for Chrome/Firefox, replay and cleanup.
proof orchestration worker
proof/src/worker.ts
New TS worker registering browser lifecycle/tools, scan/coverage/execute/run/replay/report/flows/history/cleanup handlers, queue enqueueing, single-active-run enforcement, state/stream persistence, and HTTP triggers.
proof build/test & misc
proof/src/..., proof/ files
New TypeScript source tree, build/test scripts (tsc, vitest), dev deps, and CI/usage notes.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant III as iii Engine
    participant Nano as nanochat Worker
    participant GPU as GPUState
    participant Model as nanochat.engine

    Client->>III: POST /nanochat/chat.complete
    III->>Nano: invoke chat complete handler
    Nano->>GPU: acquire lock / ensure model loaded
    alt model not loaded
        GPU->>Model: load_model(source, device)
        Model-->>GPU: model + tokenizer
    end
    Nano->>Model: generate / stream tokens
    Model-->>Nano: tokens / final output
    Nano->>III: state::set session history
    Nano-->>Client: ChatCompleteOutput
sequenceDiagram
    participant Client
    participant III as iii Engine
    participant Proof as proof Worker
    participant Browser as Playwright
    participant Agent as Anthropic
    participant Tools as iii Tools

    Client->>III: POST /proof/run
    III->>Proof: trigger proof::run
    Proof->>III: trigger proof::scan
    alt no changes
        Proof-->>Client: {status: "skip"}
    else changes found
        Proof->>III: trigger proof::coverage
        Proof->>III: trigger proof::execute
        III->>Proof: launch browser session
        Proof->>Browser: new BrowserSession
        Proof->>Agent: runAgent(...) via trigger callback
        loop iterations
            Agent->>Anthropic: messages.create (SYSTEM_PROMPT + user)
            alt tool_use found
                Agent->>III: trigger browser tool fn
                III->>Browser: perform action (click/navigate/...)
                Browser-->>III: snapshot/result
                III-->>Agent: tool_result
            end
            Agent->>III: stream::set progress
        end
        Agent-->>Proof: RunReport
        Proof->>III: state::set report / save flow
        Proof->>Browser: closeAll()
        Proof-->>Client: RunReport
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through diffs and code tonight,

nanochat chats and models take flight,
proof clicks pages, snapshots gleam,
agents and browsers chase the dream,
two new workers, nimble and bright.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 6.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: nanochat worker — Karpathy's LLM pipeline on iii-engine' accurately and concisely summarizes the main change: adding a new nanochat worker that integrates Karpathy's LLM pipeline into the iii-engine.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

🧹 Nitpick comments (4)
nanochat/worker.py (3)

55-64: Intentional broad exception handling is acceptable here.

The safe() wrapper's purpose is to prevent WebSocket crashes by converting any unhandled exception into an error response. This is a valid pattern for worker handlers where stability is critical.

Consider using functools.wraps for cleaner attribute preservation:

♻️ Optional: Use functools.wraps
+import functools
+
 def safe(fn):
     """Wrap async handler so unhandled exceptions return error dicts, never crash the WebSocket."""
+    @functools.wraps(fn)
     async def wrapper(data):
         try:
             return await fn(data)
         except Exception as e:
             return {"error": str(e), "traceback": traceback.format_exc()}
-    wrapper.__name__ = fn.__name__
-    wrapper.__annotations__ = fn.__annotations__
     return wrapper
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/worker.py` around lines 55 - 64, The safe() wrapper currently copies
name and annotations manually; replace that with functools.wraps to preserve
metadata more robustly: import functools, then decorate wrapper with
`@functools.wraps`(fn) (keeping wrapper async and returning the same error dict
behavior), and remove the manual assignments to wrapper.__name__ and
wrapper.__annotations__; ensure traceback.format_exc() remains used so
error/traceback are still returned.

489-489: Prefix unused variable with underscore.

The meta variable from load_model() is unpacked but never used in this function.

♻️ Fix unused variable
-        model, tokenizer, meta = load_model(inp.source, device, "base", model_tag=inp.model_tag, step=inp.step)
+        model, tokenizer, _meta = load_model(inp.source, device, "base", model_tag=inp.model_tag, step=inp.step)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/worker.py` at line 489, The unpacked return from load_model(...)
currently assigns to model, tokenizer, meta but meta is unused; change the
unpacking to model, tokenizer, _ = load_model(inp.source, device, "base",
model_tag=inp.model_tag, step=inp.step) (or model, tokenizer, _meta = ...) to
prefix the unused third value with an underscore so linters know it is
intentionally unused and no other code paths are affected.

430-432: Specify encoding when opening files.

For cross-platform consistency, explicitly specify UTF-8 encoding.

♻️ Add encoding
-    with open(tasks_yaml) as f:
+    with open(tasks_yaml, encoding="utf-8") as f:
         tasks = yaml.safe_load(f)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/worker.py` around lines 430 - 432, When opening the tasks_yaml file
before calling yaml.safe_load, specify the file encoding for cross-platform
consistency by passing UTF-8 to the open call (i.e. update the open(tasks_yaml)
usage so the file is opened with encoding="utf-8"); keep the same with-context
and yaml.safe_load call and ensure variable names tasks and tasks_yaml remain
unchanged.
proof/README.md (1)

113-127: Add language specifier to code blocks.

Lines 113 and 230 have fenced code blocks without language specifiers, which triggers MD040 lint warnings. For ASCII diagrams, use text or plaintext.

♻️ Fix markdown lint warnings
-```
+```text
 proof::scan          git diff → changed files, commits

And similarly for line 230:

-```
+```text
 ┌──────────────────────────────────────────┐
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proof/README.md` around lines 113 - 127, Update the two fenced ASCII diagram
blocks in README.md (the block that begins with "proof::scan          git diff →
changed files, commits" and the block that starts with the box drawing
"┌──────────────────────────────────────────┐") to include a language specifier
by adding "text" to the opening fence so the blocks become fenced as text code
blocks; this will silence the MD040 lint warnings while preserving the diagrams.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nanochat/pyproject.toml`:
- Around line 1-22: Create a package initializer file nanochat/__init__.py (e.g.
define __version__ = "0.1.0" or leave empty) so the nanochat module is a proper
package, and update the console script entry point in pyproject.toml from
"worker:main" to "nanochat.worker:main" to reference the worker module inside
the package; also add a PEP 517/518 [build-system] section to pyproject.toml
(include a minimal requires list such as setuptools and wheel and set
build-backend to setuptools.build_meta) so builds comply with build-system
metadata requirements.

In `@nanochat/README.md`:
- Around line 161-174: The fenced code block in README.md is missing a language
tag which causes markdownlint and syntax highlighting to fail; update the
opening fence from ``` to something like ```text or ```console (or ```log) so
the block is labeled (e.g., change the block showing nanochat
health/tokenizer/chat outputs to ```text) and save the file.
- Around line 78-80: Update the README to remove or reword the claim that
nanochat.tools.execute is "sandboxed" and instead explicitly state that the
worker runs arbitrary Python code using in-process exec() (referencing
nanochat.tools.execute and the worker implementation), note this does NOT
provide strong isolation, and add a clear operator warning recommending running
the worker in a hardened/isolated environment (e.g., separate VM/container,
unprivileged user, syscall filters) or replacing exec() with a true sandboxed
executor before exposing to untrusted inputs; apply this change to the
high-level description and the later implementation note that currently mentions
exec().

In `@nanochat/worker.py`:
- Line 657: The print statement using an unnecessary f-string prefix
(print(f"[nanochat] 13 functions, 13 triggers (12 HTTP + 1 queue)")) should be
changed to a regular string literal; locate the print call in worker.py (the
line that prints "[nanochat] 13 functions, 13 triggers (12 HTTP + 1 queue)") and
remove the leading f so it becomes print("[nanochat] 13 functions, 13 triggers
(12 HTTP + 1 queue)").

In `@proof/src/agent.ts`:
- Around line 35-36: The runStatus should not default to "pass" — change the
initialization of runStatus to a non-passing default (e.g., "error") and ensure
runStatus is only set to "pass" when a RUN_COMPLETED marker is observed;
explicitly set runStatus to "fail" when processing ASSERTION_FAILED or any tool
failure markers; update any code that synthesizes a fallback step (and usages
around recordedActions) to rely on the adjusted runStatus so a missing
RUN_COMPLETED does not yield a false "pass" (check all handling around
runStatus, RUN_COMPLETED, ASSERTION_FAILED, and recordedActions including the
other occurrences noted).

In `@proof/src/browser.ts`:
- Around line 129-149: The refs collapse when multiple elements share the same
role+name because refMap only stores {role,name} and resolveRef always uses
.first(); update the ref population (where refMap.set is called) to also capture
the element's ordinal (e.g., index among page.getByRole(role, { name })) or a
stable locator snapshot and store it in RefEntry (e.g., add an index/ordinal
property), and then change resolveRef to locate all matches and select the
stored ordinal (use locator.nth(entry.index) or equivalent) instead of .first();
keep a safe fallback if index is missing to preserve current behavior.
- Around line 237-249: The current handlePlaywrightExec uses AsyncFunction and
executes untrusted code in the worker process (see AsyncFunction and fn), so
change it to run the provided code inside the browser page context via
Playwright's page.evaluate to sandbox execution; serialize session.refMap
entries and pass them into page.evaluate, recreate the ref(id) helper inside the
browser context (using the serialized role/name entries) and then create and
invoke the async function from the code string inside page.evaluate (so no
host-side AsyncFunction or access to process/dynamic import), keeping function
name handlePlaywrightExec and ref as the identifying symbols to locate and
replace the implementation.

In `@proof/src/cookies.ts`:
- Around line 65-69: The SQLite SELECT that builds cookies (currently using
WHERE host_key LIKE '%${domain...}') is too permissive and should instead match
either the exact host or any subdomain (e.g., host_key = 'example.com' OR
host_key LIKE '%.example.com'); update the query in cookies.ts where
execFileAsync is called (the SELECT name, value, host_key as domain, ...) to use
host_key = '<domain>' OR host_key LIKE '.<domain>' pattern (escaping domain
properly), and apply the identical predicate change to the corresponding Firefox
query block referenced around lines 130-134 so both Chromium and Firefox imports
only match exact hosts or their subdomains.
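The tightened predicate can be sanity-checked with a small sqlite3 sketch (an illustrative stand-in, not the cookies.ts code; the host_key column mirrors Chromium's cookie store and the rows are made up):

```python
import sqlite3

# Toy cookie store with one row per matching case.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cookies (name TEXT, value TEXT, host_key TEXT)")
db.executemany(
    "INSERT INTO cookies VALUES (?, ?, ?)",
    [
        ("sid", "a", "example.com"),      # exact host    -> should match
        ("sid", "b", ".example.com"),     # domain cookie -> should match
        ("sid", "c", "app.example.com"),  # subdomain     -> should match
        ("sid", "d", "notexample.com"),   # unrelated host the old '%example.com' pattern wrongly matched
    ],
)

domain = "example.com"
# Parameterized query also removes the string-interpolation risk of the
# original; LIKE wildcards inside `domain` would still need escaping for
# full robustness.
rows = db.execute(
    "SELECT value FROM cookies WHERE host_key = ? OR host_key LIKE ?",
    (domain, f"%.{domain}"),
).fetchall()
print(rows)
```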

In `@proof/src/worker.ts`:
- Around line 125-133: The handlers registered for "proof::scan" and
"proof::coverage" forward an untrusted input.cwd into
scanChanges/analyzeTestCoverage which is later passed to simpleGit, recursive
walks and file reads; change these handlers to ignore or validate input.cwd and
always resolve to a server-controlled safe root (e.g., repository root or a
configured workspace base) before calling scanChanges/analyzeTestCoverage.
Specifically, in the registerFunction blocks for id "proof::scan" and
"proof::coverage" (and the other similar blocks at the noted locations), replace
use of input.cwd with a sanitized value: compute path.resolve(baseRoot,
input.cwd ?? ".") and then assert the resolved path startsWith(baseRoot) (or
simply use baseRoot/untrusted cwd removed), and pass that sanitizedRoot to
scanChanges/analyzeTestCoverage; also update context.ts usage around
simpleGit/file reads to accept only validated roots.
- Around line 39-58: acquireRun() and releaseRun() must be exception-safe: wrap
the launch/interaction sequence in a try/finally so releaseRun() always runs
even if launchBrowser, cookie injection, autoDiscoverCdp(), or any subsequent
step throws; move cookie injection and any other pre-launch work inside the
protected block after acquireRun() and call releaseRun() from the finally; do
the same for the close path so closeBrowser(input.runId) is invoked inside try
and releaseRun() is called in finally to avoid leaving activeRunId set; adjust
the implementations around launchBrowser, closeBrowser, acquireRun, releaseRun
and any cookie injection logic accordingly.

---

Nitpick comments:
In `@nanochat/worker.py`:
- Around line 55-64: The safe() wrapper currently copies name and annotations
manually; replace that with functools.wraps to preserve metadata more robustly:
import functools, then decorate wrapper with `@functools.wraps`(fn) (keeping
wrapper async and returning the same error dict behavior), and remove the manual
assignments to wrapper.__name__ and wrapper.__annotations__; ensure
traceback.format_exc() remains used so error/traceback are still returned.
- Line 489: The unpacked return from load_model(...) currently assigns to model,
tokenizer, meta but meta is unused; change the unpacking to model, tokenizer, _
= load_model(inp.source, device, "base", model_tag=inp.model_tag, step=inp.step)
(or model, tokenizer, _meta = ...) to prefix the unused third value with an
underscore so linters know it is intentionally unused and no other code paths
are affected.
- Around line 430-432: When opening the tasks_yaml file before calling
yaml.safe_load, specify the file encoding for cross-platform consistency by
passing UTF-8 to the open call (i.e. update the open(tasks_yaml) usage so the
file is opened with encoding="utf-8"); keep the same with-context and
yaml.safe_load call and ensure variable names tasks and tasks_yaml remain
unchanged.

In `@proof/README.md`:
- Around line 113-127: Update the two fenced ASCII diagram blocks in README.md
(the block that begins with "proof::scan          git diff → changed files,
commits" and the block that starts with the box drawing
"┌──────────────────────────────────────────┐") to include a language specifier
by adding "text" to the opening fence so the blocks become fenced as text code
blocks; this will silence the MD040 lint warnings while preserving the diagrams.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4b6e37ad-d249-4952-ad2a-fb3e7bd79fe3

📥 Commits

Reviewing files that changed from the base of the PR and between a89c809 and 0213bb2.

📒 Files selected for processing (15)
  • nanochat/README.md
  • nanochat/pyproject.toml
  • nanochat/worker.py
  • proof/README.md
  • proof/package.json
  • proof/src/agent.ts
  • proof/src/browser.ts
  • proof/src/context.ts
  • proof/src/cookies.ts
  • proof/src/prompt.ts
  • proof/src/tools.ts
  • proof/src/types.ts
  • proof/src/worker.ts
  • proof/tsconfig.json
  • registry/index.json

Comment on lines +1 to +22
[project]
name = "iii-nanochat"
version = "0.1.0"
description = "nanochat LLM worker for iii-engine — train, fine-tune, evaluate, and chat with GPT models"
license = "Apache-2.0"
requires-python = ">=3.10"
dependencies = [
"iii-sdk>=0.10.0",
"torch>=2.0",
"pydantic>=2.0",
"tiktoken",
"tokenizers",
"datasets",
"pyarrow",
"psutil",
]

[project.optional-dependencies]
train = ["wandb"]

[project.scripts]
iii-nanochat = "worker:main"

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if worker.py is inside a nanochat package or at root level
fd -t f 'worker.py' | head -5

# Check for __init__.py to verify package structure
fd -t f '__init__.py' | head -10

Repository: iii-hq/workers

Length of output: 76


🏁 Script executed:

#!/bin/bash
# Check if nanochat/__init__.py exists
ls -la nanochat/__init__.py 2>/dev/null && echo "Found" || echo "Not found"

# List the contents of nanochat directory
ls -la nanochat/ | head -20

# Double-check the exact path of worker.py
find nanochat -name "worker.py" -type f

Repository: iii-hq/workers

Length of output: 385


Create nanochat/__init__.py and fix console script entry point.

The nanochat package is missing __init__.py, which means nanochat is not a proper Python package. This will cause the console script entry point "worker:main" to fail at runtime. Create an __init__.py file in the nanochat directory and update the entry point to "nanochat.worker:main".

Additionally, add the missing [build-system] section for PEP 517/518 compliance:

🔧 Required fixes

Create nanochat/__init__.py (can be empty or with version info):

# nanochat/__init__.py
__version__ = "0.1.0"

In pyproject.toml:

+[build-system]
+requires = ["setuptools>=61.0"]
+build-backend = "setuptools.build_meta"
+
 [project]

Update the console script entry point:

 [project.scripts]
-iii-nanochat = "worker:main"
+iii-nanochat = "nanochat.worker:main"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:

[project]
name = "iii-nanochat"
version = "0.1.0"
description = "nanochat LLM worker for iii-engine — train, fine-tune, evaluate, and chat with GPT models"
license = "Apache-2.0"
requires-python = ">=3.10"
dependencies = [
"iii-sdk>=0.10.0",
"torch>=2.0",
"pydantic>=2.0",
"tiktoken",
"tokenizers",
"datasets",
"pyarrow",
"psutil",
]

[project.optional-dependencies]
train = ["wandb"]

[project.scripts]
iii-nanochat = "worker:main"

After:

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "iii-nanochat"
version = "0.1.0"
description = "nanochat LLM worker for iii-engine — train, fine-tune, evaluate, and chat with GPT models"
license = "Apache-2.0"
requires-python = ">=3.10"
dependencies = [
"iii-sdk>=0.10.0",
"torch>=2.0",
"pydantic>=2.0",
"tiktoken",
"tokenizers",
"datasets",
"pyarrow",
"psutil",
]

[project.optional-dependencies]
train = ["wandb"]

[project.scripts]
iii-nanochat = "nanochat.worker:main"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/pyproject.toml` around lines 1 - 22, Create a package initializer
file nanochat/__init__.py (e.g. define __version__ = "0.1.0" or leave empty) so
the nanochat module is a proper package, and update the console script entry
point in pyproject.toml from "worker:main" to "nanochat.worker:main" to
reference the worker module inside the package; also add a PEP 517/518
[build-system] section to pyproject.toml (include a minimal requires list such
as setuptools and wheel and set build-backend to setuptools.build_meta) so
builds comply with build-system metadata requirements.

Author

Fixed. Added [build-system] section with setuptools backend (PEP 517/518).

Comment on lines +129 to +149
if (isInteractive || isContent) {
refCounter++;
const ref = `e${refCounter}`;
outputLine += ` [ref=${ref}]`;
refMap.set(ref, { role, name: name ?? "" });
}

outputLines.push(outputLine);
}

return outputLines.join("\n");
}

export function resolveRef(
ref: string,
refMap: Map<string, RefEntry>,
page: Page,
) {
const entry = refMap.get(ref);
if (!entry) throw new Error(`Ref "${ref}" not found in current snapshot. Take a new snapshot.`);
return page.getByRole(entry.role as any, { name: entry.name }).first();

⚠️ Potential issue | 🟠 Major

Snapshot refs are not stable when labels repeat.

The ref map only stores { role, name }, and resolveRef() always calls .first(). On pages with repeated labels like two "Delete" buttons, distinct refs collapse onto the same element and the agent will click/type into the wrong target.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proof/src/browser.ts` around lines 129 - 149, The refs collapse when multiple
elements share the same role+name because refMap only stores {role,name} and
resolveRef always uses .first(); update the ref population (where refMap.set is
called) to also capture the element's ordinal (e.g., index among
page.getByRole(role, { name })) or a stable locator snapshot and store it in
RefEntry (e.g., add an index/ordinal property), and then change resolveRef to
locate all matches and select the stored ordinal (use locator.nth(entry.index)
or equivalent) instead of .first(); keep a safe fallback if index is missing to
preserve current behavior.

Author

Not part of the nanochat worker changes. This is pre-existing code in the proof worker.

Comment on lines +237 to +249
export async function handlePlaywrightExec(
code: string,
session: BrowserSession,
): Promise<unknown> {
const { page, context, browser } = session;
const ref = (id: string) => {
const entry = session.refMap.get(id);
if (!entry) throw new Error(`Ref "${id}" not found`);
return page.getByRole(entry.role as any, { name: entry.name }).first();
};
const AsyncFunction = Object.getPrototypeOf(async () => {}).constructor;
const fn = new AsyncFunction("page", "context", "browser", "ref", code);
return fn(page, context, browser, ref);

⚠️ Potential issue | 🔴 Critical

browser_exec is host-side code execution.

new AsyncFunction(...) runs arbitrary JavaScript inside the worker process, not inside an isolated page sandbox. Because proof/src/worker.ts, Lines 106-107 register this as proof::browser::exec, any caller that can reach the worker gets access to process, dynamic imports, and the local host.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proof/src/browser.ts` around lines 237 - 249, The current
handlePlaywrightExec uses AsyncFunction and executes untrusted code in the
worker process (see AsyncFunction and fn), so change it to run the provided code
inside the browser page context via Playwright's page.evaluate to sandbox
execution; serialize session.refMap entries and pass them into page.evaluate,
recreate the ref(id) helper inside the browser context (using the serialized
role/name entries) and then create and invoke the async function from the code
string inside page.evaluate (so no host-side AsyncFunction or access to
process/dynamic import), keeping function name handlePlaywrightExec and ref as
the identifying symbols to locate and replace the implementation.

Author

Not part of the nanochat worker changes. This is pre-existing code in the proof worker.

Comment on lines +65 to +69
const { stdout } = await execFileAsync("sqlite3", [
"-json",
cookieDbPath,
`SELECT name, value, host_key as domain, path, expires_utc, is_secure, is_httponly, samesite FROM cookies WHERE host_key LIKE '%${domain.replace(/'/g, "''")}'`,
]);
@coderabbitai coderabbitai bot Mar 29, 2026

⚠️ Potential issue | 🟠 Major

Use exact/subdomain matches when selecting cookies.

LIKE '%${domain}' also matches unrelated suffixes such as notexample.com when the target is example.com, so this can import third-party cookies into the Playwright context. Match the exact host plus .${domain} subdomains instead.

🔒 Suggested query shape
+    const escapedDomain = domain.replace(/'/g, "''");
     const { stdout } = await execFileAsync("sqlite3", [
       "-json",
       cookieDbPath,
-      `SELECT name, value, host_key as domain, path, expires_utc, is_secure, is_httponly, samesite FROM cookies WHERE host_key LIKE '%${domain.replace(/'/g, "''")}'`,
+      `SELECT name, value, host_key as domain, path, expires_utc, is_secure, is_httponly, samesite FROM cookies
+       WHERE host_key = '${escapedDomain}'
+          OR host_key LIKE '%.${escapedDomain}'`,
     ]);

Apply the same predicate to the Firefox query.

Also applies to: 130-134

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proof/src/cookies.ts` around lines 65 - 69, The SQLite SELECT that builds
cookies (currently using WHERE host_key LIKE '%${domain...}') is too permissive
and should instead match either the exact host or any subdomain (e.g., host_key
= 'example.com' OR host_key LIKE '%.example.com'); update the query in
cookies.ts where execFileAsync is called (the SELECT name, value, host_key as
domain, ...) to use host_key = '<domain>' OR host_key LIKE '.<domain>' pattern
(escaping domain properly), and apply the identical predicate change to the
corresponding Firefox query block referenced around lines 130-134 so both
Chromium and Firefox imports only match exact hosts or their subdomains.
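The safer predicate amounts to an exact-host-or-subdomain match. The check it encodes can be sketched host-side like this (`hostMatches` is an illustrative helper, not code from the PR):

```typescript
// Mirrors the suggested SQL predicate:
//   host_key = 'example.com' OR host_key LIKE '%.example.com'
// so "notexample.com" no longer slips through the old LIKE '%example.com'.
function hostMatches(hostKey: string, domain: string): boolean {
  // Chromium stores subdomain-scoped cookies with a leading dot, e.g. ".example.com".
  const host = hostKey.startsWith(".") ? hostKey.slice(1) : hostKey;
  return host === domain || host.endsWith("." + domain);
}
```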

Author

Not part of the nanochat worker changes. This is pre-existing code in the proof worker.


@rohitg00, understood — the concern is in the proof worker and outside the scope of this nanochat PR. Would you like me to open a follow-up GitHub issue to track the domain-matching fix in proof/src/cookies.ts (both the Chrome and Firefox query blocks) so it isn't lost?


🧠 Learnings used
Learnt from: rohitg00
Repo: iii-hq/workers PR: 3
File: nanochat/worker.py:0-0
Timestamp: 2026-03-30T09:59:20.981Z
Learning: In `nanochat/worker.py`, `fn_tools_execute` uses in-process `exec()` instead of `subprocess.run` or `multiprocessing.Process` because forking corrupts the asyncio event loop in the iii-sdk WebSocket connection. This is a known, accepted trade-off documented in the README under Known Issues. For production deployments with untrusted code, a separate sandboxed worker is required. The function description intentionally reads "Execute Python code" (not "sandboxed") to reflect the lack of real isolation.

Comment on lines +39 to +58
iii.registerFunction({ id: "proof::browser::launch" }, async (input) => {
const { runId, headed, cdp } = input;
acquireRun(runId);
let cdpUrl: string | undefined;
if (cdp === "auto") {
cdpUrl = (await autoDiscoverCdp()) ?? undefined;
} else if (cdp) {
cdpUrl = cdp;
}
await launchBrowser(runId, headed, cdpUrl);
logger.info("Browser launched", { runId, headed, cdp: cdpUrl ?? "none" });
return { runId, launched: true };
});

iii.registerFunction({ id: "proof::browser::close" }, async (input) => {
const result = await closeBrowser(input.runId);
releaseRun();
logger.info("Browser closed", { runId: input.runId });
return result;
});

⚠️ Potential issue | 🟠 Major

Make the run lock and browser cleanup exception-safe.

acquireRun() happens before launch succeeds, cookie injection happens before the try/finally, and releaseRun() runs only after closeBrowser() succeeds. A failure in any of those paths can leave activeRunId stuck and force manual proof::cleanup.

🧯 Suggested structure
 iii.registerFunction({ id: "proof::browser::launch" }, async (input) => {
   const { runId, headed, cdp } = input;
   acquireRun(runId);
-  let cdpUrl: string | undefined;
-  if (cdp === "auto") {
-    cdpUrl = (await autoDiscoverCdp()) ?? undefined;
-  } else if (cdp) {
-    cdpUrl = cdp;
-  }
-  await launchBrowser(runId, headed, cdpUrl);
-  logger.info("Browser launched", { runId, headed, cdp: cdpUrl ?? "none" });
-  return { runId, launched: true };
+  try {
+    let cdpUrl: string | undefined;
+    if (cdp === "auto") {
+      cdpUrl = (await autoDiscoverCdp()) ?? undefined;
+    } else if (cdp) {
+      cdpUrl = cdp;
+    }
+    await launchBrowser(runId, headed, cdpUrl);
+    logger.info("Browser launched", { runId, headed, cdp: cdpUrl ?? "none" });
+    return { runId, launched: true };
+  } catch (err) {
+    releaseRun();
+    throw err;
+  }
 });
 
 iii.registerFunction({ id: "proof::browser::close" }, async (input) => {
-  const result = await closeBrowser(input.runId);
-  releaseRun();
-  logger.info("Browser closed", { runId: input.runId });
-  return result;
+  try {
+    const result = await closeBrowser(input.runId);
+    logger.info("Browser closed", { runId: input.runId });
+    return result;
+  } finally {
+    releaseRun();
+  }
 });
 
 iii.registerFunction({ id: "proof::execute" }, async (input) => {
   const { diff, files, base_url, instruction, runId, headed, commits, coverage, cdp, cookies } = input;
-  await iii.trigger({
-    function_id: "proof::browser::launch",
-    payload: { runId, headed, cdp },
-  });
-
-  if (cookies) {
-    await iii.trigger({
-      function_id: "proof::cookies::inject",
-      payload: { url: base_url },
-    });
-  }
-
   try {
+    await iii.trigger({
+      function_id: "proof::browser::launch",
+      payload: { runId, headed, cdp },
+    });
+    if (cookies) {
+      await iii.trigger({
+        function_id: "proof::cookies::inject",
+        payload: { url: base_url },
+      });
+    }
     const trigger = iii.trigger.bind(iii);
     return await runAgent(trigger, diff, files, base_url, runId, instruction, commits, coverage);
   } finally {
     await iii.trigger({
       function_id: "proof::browser::close",

Also applies to: 135-159

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proof/src/worker.ts` around lines 39 - 58, acquireRun() and releaseRun() must
be exception-safe: wrap the launch/interaction sequence in a try/finally so
releaseRun() always runs even if launchBrowser, cookie injection,
autoDiscoverCdp(), or any subsequent step throws; move cookie injection and any
other pre-launch work inside the protected block after acquireRun() and call
releaseRun() from the finally; do the same for the close path so
closeBrowser(input.runId) is invoked inside try and releaseRun() is called in
finally to avoid leaving activeRunId set; adjust the implementations around
launchBrowser, closeBrowser, acquireRun, releaseRun and any cookie injection
logic accordingly.

Author

Not part of the nanochat worker changes. This is pre-existing code in the proof worker.

Comment on lines +125 to +133
iii.registerFunction({ id: "proof::scan" }, async (input) => {
logger.info("Scanning changes", { target: input.target ?? "unstaged" });
return scanChanges(input.target, input.cwd, input.main_branch, input.commit_hash);
});

iii.registerFunction({ id: "proof::coverage" }, async (input) => {
logger.info("Analyzing test coverage", { files: input.files?.length });
return analyzeTestCoverage(input.files ?? [], input.cwd);
});

⚠️ Potential issue | 🔴 Critical

Don't let HTTP callers choose an arbitrary filesystem root.

cwd is forwarded straight into scanChanges() and analyzeTestCoverage(). In proof/src/context.ts, Lines 20-21 and 88-101 then feed that into simpleGit(...), recursive directory walks, and file reads, so proof::run and proof::coverage can inspect any accessible checkout or directory on the host.

Also applies to: 211-229, 336-342

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proof/src/worker.ts` around lines 125 - 133, The handlers registered for
"proof::scan" and "proof::coverage" forward an untrusted input.cwd into
scanChanges/analyzeTestCoverage which is later passed to simpleGit, recursive
walks and file reads; change these handlers to ignore or validate input.cwd and
always resolve to a server-controlled safe root (e.g., repository root or a
configured workspace base) before calling scanChanges/analyzeTestCoverage.
Specifically, in the registerFunction blocks for id "proof::scan" and
"proof::coverage" (and the other similar blocks at the noted locations), replace
use of input.cwd with a sanitized value: compute path.resolve(baseRoot,
input.cwd ?? ".") and then assert the resolved path startsWith(baseRoot) (or
simply use baseRoot/untrusted cwd removed), and pass that sanitizedRoot to
scanChanges/analyzeTestCoverage; also update context.ts usage around
simpleGit/file reads to accept only validated roots.
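The containment check the prompt asks for can be sketched as follows (`resolveSafeCwd` and the `baseRoot` parameter are illustrative names, not the worker's actual API):

```typescript
import * as path from "path";

// Resolve a caller-supplied cwd against a server-controlled root and
// reject anything that escapes it (e.g. "../../etc").
function resolveSafeCwd(baseRoot: string, cwd?: string): string {
  const resolved = path.resolve(baseRoot, cwd ?? ".");
  if (resolved !== baseRoot && !resolved.startsWith(baseRoot + path.sep)) {
    throw new Error(`cwd escapes workspace root: ${cwd}`);
  }
  return resolved;
}
```

The `resolved !== baseRoot` special case matters because `startsWith(baseRoot + path.sep)` alone would reject the root itself.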

Author

Not part of the nanochat worker changes. This is pre-existing code in the proof worker.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
nanochat/README.md (2)

115-128: ⚠️ Potential issue | 🟡 Minor

Add a language tag to the fenced code block.

Line 115 should specify a language (for markdownlint MD040 and proper rendering).

Suggested doc fix

-```
+```text
 OK   nanochat.health              {"status": "ok", "model_loaded": false}
 ...
 10/10 responded, 0 crashes

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/README.md` around lines 115 - 128, Change the fenced code block
that begins with the line "OK   nanochat.health {"status": "ok",
"model_loaded": false}" so the opening backticks include a language tag (e.g.,
text or console) to satisfy markdownlint MD040 and ensure correct rendering;
update the opening fence only and leave the block content and closing fence
unchanged.

---

78-80: ⚠️ Potential issue | 🟠 Major

Avoid calling `tools.execute` "sandboxed" when it runs in-process `exec()`.

Lines 78-80 currently overstate isolation and can mislead operators. Align the wording with the actual risk model and add a hard warning about untrusted input exposure.

Suggested doc fix

-**nanochat.tools.execute**:`POST /nanochat/tools/execute`
-
-Executes arbitrary Python code in a sandboxed environment. Returns stdout, stderr, success status, and any errors. This mirrors nanochat's built-in tool use (calculator, code execution) that models learn during SFT training.
+**nanochat.tools.execute**: `POST /nanochat/tools/execute`
+
+Executes arbitrary Python code using in-process `exec()` with stdout/stderr capture. Returns stdout, stderr, success status, and any errors. This mirrors nanochat's built-in tool use (calculator, code execution) that models learn during SFT training.
+⚠️ This is **not** a strong isolation boundary. Run this worker in a hardened isolated environment (dedicated VM/container, unprivileged user, syscall/network restrictions) or replace with a true sandbox before exposing to untrusted inputs.


Also applies to: 138-139

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/README.md` around lines 78 - 80, Update the README text for
nanochat.tools.execute to remove the word "sandboxed" and clarify that execution
uses in-process Python exec(), which does not fully isolate processes or OS
resources; explicitly state the risks (full access to process memory, file
system, network, and ability to run arbitrary code) and add a prominent warning
that untrusted input must never be passed to nanochat.tools.execute. Locate and
update both occurrences referring to nanochat.tools.execute (the description
near "POST /nanochat/tools/execute" and the repeated mention around the later
lines) so the wording accurately reflects the risk model and includes the hard
warning for operators.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @nanochat/README.md:

  • Around line 48-49: Fix missing spaces after colons in README prose by
    replacing instances of "word:next" with "word: next"; specifically update the
    sentence containing "Every handler uses Pydantic type hints for automatic
    request/response schema extraction:the engine knows the exact input/output shape
    of every function" to insert a space after the colon, and likewise correct the
    similar colon-spacing issues referenced around the phrases on lines 56-57,
    113-114, and 130 (search for any colon immediately followed by a letter and add
    a space).

Duplicate comments:
In @nanochat/README.md:

  • Around line 115-128: Change the fenced code block that begins with the lines
    starting "OK nanochat.health {"status": "ok",
    "model_loaded": false}" so the opening backticks include a language tag (e.g.,
    text or console) to satisfy markdownlint MD040 and ensure correct
    rendering; update the opening fence only and leave the block content and closing
    fence unchanged.
  • Around line 78-80: Update the README text for nanochat.tools.execute to remove
    the word "sandboxed" and clarify that execution uses in-process Python exec(),
    which does not fully isolate processes or OS resources; explicitly state the
    risks (full access to process memory, file system, network, and ability to run
    arbitrary code) and add a prominent warning that untrusted input must never be
    passed to nanochat.tools.execute. Locate and update both occurrences referring
    to nanochat.tools.execute (the description near "POST /nanochat/tools/execute"
    and the repeated mention around the later lines) so the wording accurately
    reflects the risk model and includes the hard warning for operators.



---

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: `802897be-471c-4082-83ba-c9d0029e7916`

📥 Commits

Reviewing files that changed from the base of the PR and between 0213bb23e374fdf7ac9a69abf2dc35c822fa7b43 and 3d437be5e05cfe99323003a467c29fbc68e89fa3.

📒 Files selected for processing (1)
  • `nanochat/README.md`


Covers every nanochat capability through iii primitives:

Training (all queued):
- train.tokenizer: BPE tokenizer training from dataset
- train.base: Base pretraining with depth scaling, LR scheduling,
  checkpoint saving, FP8 support, periodic BPB evaluation
- train.sft: SFT with real task mixture (SmolTalk, MMLU, GSM8K,
  SimpleSpelling, SpellingBee), warmdown scheduling, checkpoint saving
- train.rl: GRPO reinforcement learning on GSM8K with advantage
  weighting, multi-sample rollouts, checkpoint saving

Evaluation:
- eval.core: Real CORE benchmark via base_eval.evaluate_core()
- eval.loss: BPB on configurable split with batch size control
- eval.chat: ChatCORE evaluation (generative + categorical tasks)

New functions:
- model.sample: Raw text generation from loaded model
- checkpoint.save: Save current model to disk
- checkpoint.list: List available checkpoints by source

All training progress, eval results, and checkpoints tracked
through iii state scopes.
- Added karpathy/nanochat as git submodule at nanochat-upstream/
- Training functions run the actual nanochat scripts as subprocesses
  (100% fidelity: gradient accum, Muon optimizer, best-fit packing,
  full task mixture, GRPO, DDP, FP8, checkpoint saving)
- eval.chat uses the real run_chat_eval dispatcher (all 6 tasks)
- tools.execute uses in-process exec (subprocess crashes iii WebSocket)
- 893 lines, 20 functions, 12/12 tested, 0 crashes
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (3)
nanochat/README.md (2)

81-84: ⚠️ Potential issue | 🟠 Major

Do not describe tools.execute as sandboxed.

The README still says “sandboxed environment,” but the implementation note confirms in-process exec(). This is a security-boundary mismatch and can mislead operators.

Also applies to: 141-142

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/README.md` around lines 81 - 84, The README incorrectly describes
nanochat.tools.execute as running in a "sandboxed environment"; update the
README text for nanochat.tools.execute to remove the word "sandboxed" and
explicitly state that the implementation uses in-process exec() (i.e., not a
secure sandbox), add a clear security warning about arbitrary code execution
risk and recommended operator mitigations (disable the endpoint or restrict
access), and make the same wording change for the other occurrence mentioned
(lines 141-142) so both descriptions match the actual nanochat.tools.execute
behavior.

118-131: ⚠️ Potential issue | 🟡 Minor

Add a language to the fenced output block.

This block is currently unlabeled; markdownlint/syntax highlighting will continue to fail.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/README.md` around lines 118 - 131, The unlabeled fenced code block
that starts with lines like "OK   nanochat.health {"status": "ok",
"model_loaded": false}" should be given a language identifier (for example
```text or ```console) to satisfy markdownlint and enable proper syntax
highlighting; update the fenced block in README.md that contains the nanochat
status/outputs to begin with a language label instead of just ```.
nanochat/worker.py (1)

117-120: ⚠️ Potential issue | 🔴 Critical

tools.execute is unsandboxed and ignores declared timeout.

This endpoint executes arbitrary Python with full builtins, and timeout is never enforced. The registered description also says “in sandbox,” which is inaccurate for the current implementation.

Suggested minimal mitigation
- ("nanochat.tools.execute", fn_tools_execute, "Execute Python code in sandbox", "http", {"api_path": "/nanochat/tools/execute", "http_method": "POST"}),
+ ("nanochat.tools.execute", fn_tools_execute, "Execute Python code (WARNING: unsandboxed; trusted input only)", "http", {"api_path": "/nanochat/tools/execute", "http_method": "POST"}),

Also applies to: 424-433, 796-796

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/worker.py` around lines 117 - 120, The ExecuteCodeInput model and
tools.execute currently run arbitrary Python unsandboxed and ignore the timeout;
update the execution path (the tools.execute handler and any helper that runs
user code) to run user code in a separate process (e.g., subprocess or
multiprocessing.Process) and enforce ExecuteCodeInput.timeout by passing it to
subprocess.run or using Process.join with a timeout and terminating the child if
exceeded; additionally restrict the execution environment by removing dangerous
builtins and not importing modules in the child (or run a dedicated sandboxed
interpreter) and update the registered description text from “in sandbox” to
reflect the actual sandboxing level until a full sandbox is implemented.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nanochat/README.md`:
- Line 35: Replace the incorrect dependency install command 'pip install -r
nanochat-upstream/pyproject.toml' with an editable package install; change the
README to instruct users to run 'cd nanochat-upstream && pip install -e .' (or
an equivalent pip install .) so pip reads the package metadata instead of
treating a pyproject.toml as a requirements file.

In `@nanochat/worker.py`:
- Around line 53-58: The decorator safe currently returns full tracebacks to
callers; change wrapper in safe to log the full exception and traceback
server-side (e.g., use the app logger and logger.exception or
logger.error(traceback.format_exc())) and remove traceback from the response,
returning a sanitized error envelope such as {"error": "Internal server error"}
or {"error": str(e)} without any traceback; update any imports/usage accordingly
in the safe/wrapper code paths.
- Around line 224-239: GPUState.load currently updates
model/tokenizer/engine/meta/source/model_tag/device under self._lock but
multiple handlers (fn_chat_complete, fn_chat_stream, fn_model_load,
fn_model_status, fn_eval_core, fn_eval_loss, fn_eval_chat) read those fields
without locking, risking races; fix by acquiring the same self._lock in each
read path (or at start of each handler) to either (a) copy model, tokenizer,
engine, meta, source, model_tag, device into local variables while holding the
lock and then release it before use, or (b) wrap the entire read/usage critical
section in with self._lock, ensuring all accesses use the protected references
and avoiding mixed-version/state reads; update all listed functions to use
self._lock consistently.
- Around line 472-492: fn_train_tokenizer (and the other async handlers
fn_train_base, fn_train_sft, fn_train_rl) currently call the blocking
_run_nanochat_script() directly and block the event loop; change each call to
run in a thread by awaiting asyncio.to_thread(_run_nanochat_script,
"scripts.tok_train", args, run_id, "tokenizer") (and the corresponding
script/name/args for the other handlers) so the blocking subprocess work is
offloaded, and add an import for asyncio if it's not already present.

---

Duplicate comments:
In `@nanochat/README.md`:
- Around line 81-84: The README incorrectly describes nanochat.tools.execute as
running in a "sandboxed environment"; update the README text for
nanochat.tools.execute to remove the word "sandboxed" and explicitly state that
the implementation uses in-process exec() (i.e., not a secure sandbox), add a
clear security warning about arbitrary code execution risk and recommended
operator mitigations (disable the endpoint or restrict access), and make the
same wording change for the other occurrence mentioned (lines 141-142) so both
descriptions match the actual nanochat.tools.execute behavior.
- Around line 118-131: The unlabeled fenced code block that starts with lines
like "OK   nanochat.health {"status": "ok", "model_loaded": false}" should be
given a language identifier (for example ```text or ```console) to satisfy
markdownlint and enable proper syntax highlighting; update the fenced block in
README.md that contains the nanochat status/outputs to begin with a language
label instead of just ```.

In `@nanochat/worker.py`:
- Around line 117-120: The ExecuteCodeInput model and tools.execute currently
run arbitrary Python unsandboxed and ignore the timeout; update the execution
path (the tools.execute handler and any helper that runs user code) to run user
code in a separate process (e.g., subprocess or multiprocessing.Process) and
enforce ExecuteCodeInput.timeout by passing it to subprocess.run or using
Process.join with a timeout and terminating the child if exceeded; additionally
restrict the execution environment by removing dangerous builtins and not
importing modules in the child (or run a dedicated sandboxed interpreter) and
update the registered description text from “in sandbox” to reflect the actual
sandboxing level until a full sandbox is implemented.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a15d26e2-e7dd-477d-9032-aa0e20e6f6a0

📥 Commits

Reviewing files that changed from the base of the PR and between 3d437be and 2345132.

📒 Files selected for processing (4)
  • .gitmodules
  • nanochat/README.md
  • nanochat/nanochat-upstream
  • nanochat/worker.py
✅ Files skipped from review due to trivial changes (2)
  • nanochat/nanochat-upstream
  • .gitmodules

Training functions now parse nanochat's stdout line-by-line and push
structured metrics to iii state as they arrive. Any worker can poll
nanochat.train.status and see live:

- step, loss, learning rate multiplier, tokens/sec, MFU, epoch
- validation BPB (when eval runs)
- CORE metric scores (base training)
- ChatCORE + ChatCORE_cat scores (SFT)
- average reward + sequence length (RL)
- pass@k accuracy (RL eval)

Eval results also written to nanochat:evals with run_id + step.

Parser handles all 6 nanochat output patterns:
- base/sft step line (step/total, loss, lrm, dt, tok/sec, mfu)
- Validation bpb line
- CORE metric line
- ChatCORE line
- RL average reward line
- RL pass@k line

937 lines, 20 functions, 12/12 tested, 0 crashes.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
nanochat/worker.py (1)

380-380: Minor: Prefer list unpacking over concatenation.

♻️ Suggested refactor
-        tokens = [bos] + gpu.tokenizer.encode(inp.prompt) if inp.prompt else [bos]
+        tokens = [bos, *gpu.tokenizer.encode(inp.prompt)] if inp.prompt else [bos]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/worker.py` at line 380, The current tokens construction uses list
concatenation; change it to list unpacking to be more idiomatic and slightly
more efficient: when building tokens (variable tokens) use [bos,
*gpu.tokenizer.encode(inp.prompt)] if inp.prompt else [bos], referencing tokens,
bos, gpu.tokenizer.encode, and inp.prompt in the update.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nanochat/worker.py`:
- Around line 424-433: fn_tools_execute currently calls exec with full
__builtins__ and ignores ExecuteCodeInput.timeout, leaving execution unsafe and
unbounded; change it to either (A) enforce a hard timeout and limit builtins:
run the exec in a restricted globals dict (do not pass {"__builtins__":
__builtins__}) and only include a small allowlist of safe callables, execute the
code in a worker thread (e.g., asyncio.to_thread) and wrap that in
asyncio.wait_for(timeout=inp.timeout) to enforce the timeout, or (B) if true
sandboxing isn't feasible, update the function/docstring that claims "sandbox"
to explicitly warn not to run untrusted code and still implement the timeout via
asyncio.wait_for or signal.alarm; reference fn_tools_execute, ExecuteCodeInput,
inp.timeout, and the current exec call to locate and fix the code.
- Around line 801-805: The current filename parsing in the block that builds
steps for each tag (variables tag_path, tag_dir and the steps list appended to
checkpoints) assumes files are exactly "model_STEP.pt" and will raise
IndexError/ValueError on malformed files; fix by filtering and parsing more
defensively: iterate over os.listdir(tag_path), only consider names that match a
strict pattern (e.g., regex r"^model_(\d+)\.pt$") or otherwise wrap the int
extraction in try/except to skip non-matching files, collect valid integers,
sort them, and append the resulting steps list to checkpoints; ensure malformed
filenames are ignored rather than bubbling exceptions.
- Around lines 349-350: the current logic sets device = inp.device or
autodetect_device_type(), which treats the string "auto" as a valid truthy
device and bypasses autodetection. Update the selection so that if inp.device
is None or equals "auto" (case-insensitive) you call autodetect_device_type(),
and otherwise use inp.device; then pass the resolved device to gpu.load(...)
(refer to inp.device, autodetect_device_type, device, and gpu.load).

---

Nitpick comments:
In `@nanochat/worker.py`:
- Line 380: the tokens construction uses list concatenation; switch to list
unpacking, which is more idiomatic and slightly more efficient: build tokens as
[bos, *gpu.tokenizer.encode(inp.prompt)] if inp.prompt else [bos] (referencing
tokens, bos, gpu.tokenizer.encode, and inp.prompt).
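The nitpick in isolation, with `bos` and a stand-in encoder as hypothetical placeholders for the worker's real tokenizer:

```python
bos = 1  # hypothetical BOS token id

def encode(s):
    # Stand-in for gpu.tokenizer.encode; real BPE encoding differs.
    return [ord(c) for c in s]

def build_tokens(prompt):
    # Unpacking builds the list in one pass instead of concatenating two lists.
    return [bos, *encode(prompt)] if prompt else [bos]
```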

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eee41651-aa5c-4e37-b147-4c347d28f49c

📥 Commits

Reviewing files that changed from the base of the PR and between 2345132 and 86a1c43.

📒 Files selected for processing (1)
  • nanochat/worker.py

- safe() no longer leaks tracebacks to callers; logs server-side,
  returns only {"error": "..."} (was: Major, L58)
- GPUState.snapshot() for thread-safe reads; all handlers use locals
  from snapshot instead of reading gpu.* fields directly (was: Critical, L239)
- Training subprocess runs in asyncio.to_thread via split into
  _run_subprocess_blocking + _run_training (was: Major, L568)
- tools.execute description changed from "sandboxed" to "in-process,
  not sandboxed" since exec() with __builtins__ is not a real sandbox
  (was: Major, L414)
- f-string without placeholders already fixed in previous commit (L657)
- registry device config: changed "auto" to null (matches worker's
  autodetect behavior when device is omitted)
- checkpoint listing: catch ValueError on malformed filenames instead
  of crashing (e.g. model_backup.pt no longer breaks the parser)
- exec() sandboxing: acknowledged as known limitation, documented
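A wrapper matching the safe() behavior described above could look like the sketch below. It is hypothetical, not the worker's actual implementation: the logger name and the generic error payload are assumptions.

```python
import inspect
import logging

logger = logging.getLogger("worker")

def safe(handler):
    """Log the full traceback server-side; return only a generic error
    payload to callers. Supports both sync and async handlers."""
    async def wrapper(data):
        try:
            result = handler(data)
            if inspect.isawaitable(result):
                result = await result  # async handlers return an awaitable
            return result
        except Exception:
            # Traceback stays in the server logs, never in the response.
            logger.exception("handler %s failed", getattr(handler, "__name__", "?"))
            return {"error": "internal error"}
    return wrapper
```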
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
nanochat/worker.py (3)

357-358: ⚠️ Potential issue | 🟠 Major

Handle "auto" explicitly when resolving device.

Line 357 treats "auto" as a valid truthy device and skips autodetection. Normalize and resolve "auto" (case-insensitive) to autodetect_device_type().

🔧 Suggested fix

```diff
-    device = inp.device or autodetect_device_type()
+    device = (
+        inp.device
+        if inp.device and inp.device.strip().lower() != "auto"
+        else autodetect_device_type()
+    )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/worker.py` around lines 357 - 358, The device resolution treats a
string "auto" as truthy and skips autodetection; update the logic around device
= inp.device or autodetect_device_type() to detect when inp.device
(case-insensitive) equals "auto" and in that case call autodetect_device_type();
ensure you normalize inp.device (e.g., lower()) before comparison and then pass
the resolved device into gpu.load(inp.source, device, model_tag=inp.model_tag,
step=inp.step) so gpu.load never receives the literal "auto".

434-442: ⚠️ Potential issue | 🟠 Major

Enforce timeout in fn_tools_execute to avoid unbounded handler blocking.

ExecuteCodeInput.timeout is currently unused, so user code can run indefinitely and stall this handler path.

🔧 Suggested mitigation

```diff
+import asyncio
@@
 async def fn_tools_execute(data: ExecuteCodeInput) -> dict:
     inp = ExecuteCodeInput.model_validate(data) if isinstance(data, dict) else data
     stdout_buf, stderr_buf = io.StringIO(), io.StringIO()
+    def _run_exec():
+        with contextlib.redirect_stdout(stdout_buf), contextlib.redirect_stderr(stderr_buf):
+            exec(inp.code, {"__builtins__": __builtins__}, {})
     try:
-        with contextlib.redirect_stdout(stdout_buf), contextlib.redirect_stderr(stderr_buf):
-            exec(inp.code, {"__builtins__": __builtins__}, {})
+        await asyncio.wait_for(asyncio.to_thread(_run_exec), timeout=inp.timeout)
         return {"success": True, "stdout": stdout_buf.getvalue(), "stderr": stderr_buf.getvalue(), "error": None}
+    except asyncio.TimeoutError:
+        return {"success": False, "stdout": stdout_buf.getvalue(), "stderr": stderr_buf.getvalue(), "error": "execution timed out"}
     except Exception as e:
         return {"success": False, "stdout": stdout_buf.getvalue(), "stderr": stderr_buf.getvalue(), "error": str(e)}
```

```bash
#!/bin/bash
# Verify timeout field definition and whether execution path enforces it.
rg -n --type=py 'class ExecuteCodeInput|timeout|fn_tools_execute|asyncio\.wait_for|to_thread' nanochat/worker.py -C2
```

Based on learnings: in-process exec() is an accepted trade-off here due to iii-sdk WebSocket/event-loop corruption with forked execution models; this comment is only about missing timeout enforcement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/worker.py` around lines 434 - 442, fn_tools_execute currently
ignores ExecuteCodeInput.timeout allowing user code to run indefinitely; enforce
the timeout by running the exec body in a cancellable task or thread and
wrapping it with asyncio.wait_for using inp.timeout (fallback to a sensible
default if None). Specifically, update fn_tools_execute to validate inp.timeout
from ExecuteCodeInput, run the exec call (the block that captures
stdout_buf/stderr_buf and calls exec(inp.code,...)) inside asyncio.to_thread or
an async wrapper, and await that wrapped call inside
asyncio.wait_for(timeout=inp.timeout). Ensure exceptions from timeout are
handled to return {"success": False, "stdout": ..., "stderr": ..., "error":
"timeout"} and preserve existing stdout/stderr capture and other exception
handling for non-timeout errors.

834-837: ⚠️ Potential issue | 🟡 Minor

Make checkpoint step parsing defensive against non-conforming filenames.

Current split/index parsing can raise on unexpected files in checkpoint directories and fail listing.

🔧 Suggested fix

```diff
+                import re
-                steps = sorted([
-                    int(f.split("_")[1].split(".")[0])
-                    for f in os.listdir(tag_path) if f.startswith("model_") and f.endswith(".pt")
-                ])
+                steps = []
+                for f in os.listdir(tag_path):
+                    m = re.match(r"^model_(\d+)\.pt$", f)
+                    if m:
+                        steps.append(int(m.group(1)))
+                steps.sort()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nanochat/worker.py` around lines 834 - 837, The checkpoint step parsing in
the steps list comprehension can crash on non-conforming filenames; change the
logic that builds steps (currently iterating os.listdir(tag_path) and parsing
with split) to be defensive: iterate filenames from os.listdir(tag_path), match
each against a regex like r"^model_(\d+)\.pt$" (or try/except around int
extraction), only convert the captured group to int for matches, and skip any
non-matching files so that the resulting steps list contains only valid integers
before sorting.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nanochat/worker.py`:
- Around line 241-245: The snapshot method in GPUState only returns (model,
tokenizer, engine, meta, source, device) but misses model_tag and callers still
read gpu.* directly; update snapshot(self) to include self.model_tag in its
returned tuple (e.g., (model, tokenizer, engine, meta, source, device,
model_tag)) and then change all downstream usages (places that currently read
gpu.model_tag or other gpu.* fields directly—e.g., the call sites around the
former Line 374, Lines ~801-815, and Lines ~847-850) to use the single atomic
GPUState.snapshot() return instead of mixing snapshot() with direct attribute
reads so metadata cannot become inconsistent during concurrent
nanochat.model.load operations.
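The 7-tuple snapshot the comment asks for might look like the sketch below. The class shape is an illustrative reconstruction from the review comment; the real GPUState in worker.py may differ.

```python
import threading
from dataclasses import dataclass, field
from typing import Any, Optional, Tuple

@dataclass
class GPUState:
    model: Any = None
    tokenizer: Any = None
    engine: Any = None
    meta: dict = field(default_factory=dict)
    source: Optional[str] = None
    device: Optional[str] = None
    model_tag: Optional[str] = None
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def snapshot(self) -> Tuple[Any, Any, Any, dict, Optional[str], Optional[str], Optional[str]]:
        # One atomic read under the lock: callers get a consistent 7-tuple
        # even while a concurrent nanochat.model.load mutates the state.
        with self._lock:
            return (self.model, self.tokenizer, self.engine, self.meta,
                    self.source, self.device, self.model_tag)
```

Handlers then unpack the tuple into locals and never touch gpu.* fields directly, so a load that swaps the model mid-request cannot produce mixed metadata.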


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 29460c28-b37a-4bff-95f9-e3ea49b5f3e3

📥 Commits

Reviewing files that changed from the base of the PR and between 86a1c43 and 270b6b8.

📒 Files selected for processing (1)
  • nanochat/worker.py

worker.py:
- snapshot() now returns model_tag (7-tuple), all callers updated
- zero direct gpu.* reads outside snapshot/load/main
- tokenizer handlers use snapshot for thread-safe tokenizer access

README.md:
- pip install -r pyproject.toml -> cd nanochat-upstream && pip install -e .
- tools.execute description: "sandboxed" -> "in-process, not sandboxed"
- add language label to test output code fence
- fix missing spaces after colons throughout

pyproject.toml:
- add [build-system] section (PEP 517/518)
Training:
- Pre-forked child process (fork before iii connects) runs Popen
  safely without corrupting the WebSocket. Uses multiprocessing
  with explicit fork context.
- Training handlers send jobs via Pipe, child runs nanochat scripts,
  results come back with stdout lines for metric parsing.
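The pre-forked launcher described above can be sketched like this. It is a sketch under stated assumptions: _training_child and start_training_launcher are hypothetical names, and the job/result payload shapes are illustrative, not the worker's actual protocol.

```python
import multiprocessing as mp
import subprocess

def _training_child(conn):
    # Runs in a child forked before iii connects, so Popen here cannot
    # touch the parent's WebSocket or event loop.
    while True:
        job = conn.recv()  # e.g. {"argv": ["python", "scripts/base_train.py", ...]}
        if job is None:
            break
        proc = subprocess.run(job["argv"], capture_output=True, text=True)
        conn.send({
            "returncode": proc.returncode,
            # stdout lines go back to the parent for metric parsing
            "stdout_lines": proc.stdout.splitlines(),
        })

def start_training_launcher():
    # Explicit fork context: this must run before the worker opens its
    # WebSocket, or the child would inherit live connection state.
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    child = ctx.Process(target=_training_child, args=(child_conn,), daemon=True)
    child.start()
    return parent_conn, child
```

Training handlers would then send a job dict over parent_conn and block (in a thread) on the reply; sending None shuts the child down cleanly.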

Bug fixes:
- model.load: pass torch.device (not string), use phase="eval"
- chat.complete: conversation format is {"messages": [...]} not [...]
  (nanochat's render_conversation expects this)
- model.sample: generate_batch takes tokens directly, not [tokens]
- safe() handles both sync and async handlers

E2E test results (2-layer GPT, 5 steps, CPU):
  Load model  -> 1,966,134 params, 2 layers, 128 dim
  Sample      -> generates text (gibberish from minimal training)
  Chat        -> completion with session tracking in iii state
  History     -> 1 session stored
  Tokenizer   -> encode/decode roundtrip
  Tools       -> code execution (7*6=42)
  Status      -> full model config
  Health      -> worker alive through all operations
