Lightweight AI assistant framework in Rust with pluggable LLM providers (Ollama, llama.cpp, vLLM) and built-in tool calling.
- Build: `cargo build`
- Test: `cargo test --workspace`
- Lint: `cargo clippy --workspace -- -D warnings`
- Format check: `cargo fmt --all -- --check`
- Formatting: `cargo fmt --all`
- Install: `make install` (builds release + copies to `~/.local/bin`)
- Run: `cargo run` (Ollama default) or `cargo run -- --llama-cpp` / `--vllm`

Three crates in a Cargo workspace:
- `tinyharness-lib`: Core library (providers, tools, sessions, context, skills, tokens). No terminal I/O.
- `tinyharness-ui`: UI library (ANSI output, confirmation prompts, diff display, command input).
- `TinyHarness`: Binary CLI (agent loop, slash commands, tool dispatch, setup).

Key modules in `tinyharness-lib`:
- `provider/`: Provider trait, `OllamaProvider` (raw SSE, Gemini signatures), `LlamaCppProvider`/`VllmProvider` (shared OpenAI-compat internals)
- `tools/`: 15 tools (ls, read, write, edit, grep, glob, run, web_search, web_fetch, switch_mode, question, auto_compact, invoke_skill, screenshot), registration in `register_defaults()`, mode-based filtering
- `session.rs`: JSONL persistence, auto-save every 5 messages
- `context.rs`: Workspace metadata + instruction file discovery (`TINYHARNESS.md` → `.tinyharness.md` → `AGENTS.md` → `CLAUDE.md`)
- `skill.rs`: Skill discovery from `~/.config/tinyharness/skills/` and `.tinyharness/skills/`
- `mode.rs`: Agent modes with `.md` system prompts
- `config/mod.rs`: SettingsStore, ProviderKind, OllamaThinkType

Key modules in the `TinyHarness` binary:
- `src/agent/`: Agent loop, tool execution, safety checks, display, multi-line input, provider setup
- `src/commands/`: 22+ slash commands (mode, model, sessions, compact, init, context, files, image, skill, settings, help, etc.), `CommandRegistry` and `async_command!` macro

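The instruction file discovery order listed for `context.rs` can be read as a priority list. The sketch below is one possible reading, not the actual `context.rs` code: the function name `find_instruction_file` is hypothetical, and it assumes the first existing file wins and the rest are ignored.

```rust
use std::path::{Path, PathBuf};

/// Candidate instruction files, in the order given for `context.rs`.
const INSTRUCTION_FILES: [&str; 4] = [
    "TINYHARNESS.md",
    ".tinyharness.md",
    "AGENTS.md",
    "CLAUDE.md",
];

/// Returns the first candidate that exists as a regular file in `root`.
/// Assumption: the first match wins; later candidates are not merged in.
pub fn find_instruction_file(root: &Path) -> Option<PathBuf> {
    INSTRUCTION_FILES
        .iter()
        .map(|name| root.join(name))
        .find(|path| path.is_file())
}
```

Under this reading, a workspace that contains both `AGENTS.md` and `CLAUDE.md` resolves to `AGENTS.md`.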
- Rust edition 2024
- Core logic (`tinyharness-lib`) must not use terminal I/O, ANSI codes, or rustyline
- Use `serde` + `schemars` for serialization and tool schema generation
- Prefer `Pin<Box<dyn Future>>` over `async-trait` to keep the dependency tree small
- Error handling: `Result<T, String>` for user-facing, `Result<T, Box<dyn Error>>` for internal
- Minimize dependencies; avoid adding new crates when existing ones suffice
- `#[macro_export]` macros (`extract_args!`) live at `tinyharness_lib` root, not inside `tools`
- Tool categories: `ReadOnly` (auto-executed), `Destructive` (requires confirmation), `Signal` (handled specially by agent loop)

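To illustrate the `Pin<Box<dyn Future>>` convention above, here is a minimal sketch of an async trait method written without `async-trait`. Every name in it (`BoxFuture`, `ChatProvider`, `EchoProvider`, `block_on`) is hypothetical and is not the real `Provider` trait; the tiny `block_on` exists only so the example is self-contained, and a real binary would use its async runtime instead.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Boxed, sendable future (hypothetical alias).
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical provider trait: the method returns a boxed future
/// instead of being declared `async fn` through `async-trait`.
pub trait ChatProvider {
    fn complete<'a>(&'a self, prompt: &'a str) -> BoxFuture<'a, Result<String, String>>;
}

pub struct EchoProvider;

impl ChatProvider for EchoProvider {
    fn complete<'a>(&'a self, prompt: &'a str) -> BoxFuture<'a, Result<String, String>> {
        // The async block is boxed and pinned by hand.
        Box::pin(async move {
            if prompt.is_empty() {
                Err("empty prompt".to_string())
            } else {
                Ok(format!("echo: {prompt}"))
            }
        })
    }
}

/// Waker that does nothing; enough for futures that never truly suspend.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Demo-only executor: polls the future in a loop until it is ready.
pub fn block_on<T>(mut fut: BoxFuture<'_, T>) -> T {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

The trade-off is a little boilerplate per method (`Box::pin(async move { ... })`) in exchange for one less proc-macro dependency.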
- `main.rs` → parse CLI, create provider, health check, auto-select model, collect workspace context, initialize prompts, register tools, load/create session, build command registry, enter `run_agent_loop()`
- Agent loop: read input (or `--prompt`), dispatch slash commands, send messages to provider, stream response, handle tool calls
- Signal tools (`switch_mode`, `question`, `auto_compact`, `invoke_skill`) bypass generic tool execution and are handled inline
- Destructive tools prompt for confirmation (`run` always prompts and cannot be auto-accepted); ReadOnly tools run immediately
- Tool results are batched into a single `Role::Tool` message, appended to conversation
- Auto-save session every 5 messages; flush on mode switch, session switch, exit

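The dispatch rules above can be sketched as a single decision function. This is an illustrative reading, not the code in `src/agent/`: the `Dispatch` enum, the `dispatch` function, and the `auto_accept` flag (modelling a user who chose 'a') are all hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolCategory {
    ReadOnly,
    Destructive,
    Signal,
}

/// What the agent loop does with a tool call (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Dispatch {
    RunNow,
    AskUser,
    HandleInline,
}

/// Decide how a tool call proceeds.
/// `auto_accept` models the user having enabled auto-accept mode.
pub fn dispatch(name: &str, category: ToolCategory, auto_accept: bool) -> Dispatch {
    match category {
        // Signal tools bypass generic execution.
        ToolCategory::Signal => Dispatch::HandleInline,
        // ReadOnly tools run immediately.
        ToolCategory::ReadOnly => Dispatch::RunNow,
        // Only `write` and `edit` honor auto-accept; `run` never does.
        ToolCategory::Destructive if auto_accept && matches!(name, "write" | "edit") => {
            Dispatch::RunNow
        }
        ToolCategory::Destructive => Dispatch::AskUser,
    }
}
```

For example, `dispatch("run", ToolCategory::Destructive, true)` still yields `Dispatch::AskUser`.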
| Mode | Tools | Purpose |
|---|---|---|
| casual | web_search, web_fetch | Chat with web access |
| planning | ReadOnly + Signal tools | Analyze, plan, escalate to agent |
| agent | All 15 tools | Full development access |
| research | Same as planning (research-focused prompt) | Web research, then escalate |
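
The mode table above amounts to a filter over the tool registry. A minimal sketch follows, assuming each tool carries a name and a category; `Mode`, `ToolInfo`, `is_available`, and `tools_for` are hypothetical names, not the actual API in `tools/` or `mode.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Casual,
    Planning,
    Agent,
    Research,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolCategory {
    ReadOnly,
    Destructive,
    Signal,
}

/// Minimal tool descriptor (hypothetical).
pub struct ToolInfo {
    pub name: &'static str,
    pub category: ToolCategory,
}

/// Whether `tool` is exposed to the model in `mode`, following the table.
pub fn is_available(mode: Mode, tool: &ToolInfo) -> bool {
    match mode {
        // Casual: web access only.
        Mode::Casual => matches!(tool.name, "web_search" | "web_fetch"),
        // Planning and research share a tool set: ReadOnly + Signal.
        Mode::Planning | Mode::Research => {
            matches!(tool.category, ToolCategory::ReadOnly | ToolCategory::Signal)
        }
        // Agent: everything.
        Mode::Agent => true,
    }
}

/// Names of the tools available in `mode`, in registry order.
pub fn tools_for(mode: Mode, registry: &[ToolInfo]) -> Vec<&'static str> {
    registry
        .iter()
        .filter(|tool| is_available(mode, tool))
        .map(|tool| tool.name)
        .collect()
}
```

With a registry holding `read` (assumed ReadOnly), `write` (Destructive), and `switch_mode` (Signal), planning mode would expose `read` and `switch_mode` but not `write`.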

- `cargo test --workspace` runs all tests
- `tinyharness-lib` has good coverage; `tinyharness-ui` and binary crate have limited coverage (see `todo/01-testing-gaps.md`)
- Use `tempfile` for test isolation; tool tests must not touch the real filesystem
- Run specific test: `cargo test <test_name>`
- Run per crate: `cargo test -p tinyharness-lib`, `cargo test -p TinyHarness`, `cargo test -p tinyharness-ui`

- Provider startup: All providers run a health check (Ollama calls `list_local_models`). If saved model is unavailable, auto-select picks the first available with a warning.
- Ollama specifics: Own raw SSE parser (not ollama-rs streaming) to handle native and OpenAI-compatible formats; captures Gemini `thought_signature` from tool responses and re-injects them; fixes serialization quirks (lowercases tool type, injects `name` in tool results).
- System prompts: Assembled from `header.md` + `<mode>.md` for Agent/Planning/Research; Casual is self-contained. Prompts are refreshed on mode switch, file pinning changes, skill activation, and `/refresh`.
- Command safety (`src/agent/safety.rs`): Prefix matching with word boundaries, deny list priority, strips redirections before matching; rejects `;`, `&`, `|`, `$()`, backticks, newlines. Redirections like `2>&1` are auto-accepted if base command is safe.
- Confirmation: `run` tool cannot be auto-accepted even with 'a' (auto-accept mode); only `write` and `edit` can.
- Compaction: `/compact` uses single-pass for ≤200 intermediate messages, cascading (chunk+merge) for larger sessions.
- Context warnings: Load warnings at 70%/90% thresholds based on last known token count (estimation).
- Session files: JSONL (metadata line first, then message lines); malformed lines silently skipped on load; stored in `~/.local/share/tinyharness/sessions/`.
- Web tools: `web_search` and `web_fetch` use `https://ollama.com/api/web_search` and require an Ollama API key set via `/apikey`.
- Ctrl+C: Interrupts current LLM generation; second Ctrl+C exits immediately.
- Configuration: Set via `--config` (interactive setup), stored as JSON in `~/.config/tinyharness/settings.json`. Persistent prompts are seeded from embedded defaults into `~/.config/tinyharness/prompts/`.
- Image attachments: Base64 data URIs, used by multimodal models; set via `/image`.
- `async_command!` macro: Registers commands that need `provider.lock().await`.
- `CommandResult` variants: `SwitchSession`, `RenameSession`, `Init`, `SkillUse`, `SkillUnload` carry data back to the agent loop.
- `CommandContext` holds shared mutable state: provider, mode, file context, session ID, skill registry, active skills, pending images, thinking toggle, compaction token usage.
- `extract_args!` macro exported at `tinyharness_lib` root, not in `tools`.

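The command safety rules listed above can be sketched roughly as follows. This is not the code in `src/agent/safety.rs`: the `ALLOW` and `DENY` lists, the function names, and the token-based redirection stripping are all assumptions made for illustration, and the sketch ignores quoting and redirections glued to other words (such as `ls>out.txt`).

```rust
/// Hypothetical allow and deny prefixes; the real lists are not shown in this document.
const ALLOW: &[&str] = &["ls", "git", "cargo test"];
const DENY: &[&str] = &["git push"];

/// Drop redirection tokens such as `2>&1`, `>out.txt`, or `> out.txt`.
fn strip_redirections(cmd: &str) -> String {
    let mut kept = Vec::new();
    let mut skip_next = false;
    for token in cmd.split_whitespace() {
        if skip_next {
            // This token is the target of a bare `>` / `<` operator.
            skip_next = false;
            continue;
        }
        // Ignore a leading file-descriptor number, as in `2>`.
        let op = token.trim_start_matches(|c: char| c.is_ascii_digit());
        if op.starts_with('>') || op.starts_with('<') {
            // A bare operator takes the next token as its target.
            skip_next = op.chars().all(|c| c == '>' || c == '<');
            continue;
        }
        kept.push(token);
    }
    kept.join(" ")
}

/// Prefix match that must end on a word boundary (`git` matches `git status`, not `gitk`).
fn has_prefix(cmd: &str, prefix: &str) -> bool {
    match cmd.strip_prefix(prefix) {
        Some(rest) => rest.is_empty() || rest.starts_with(' '),
        None => false,
    }
}

/// True if the command passes every check in this sketch.
pub fn is_command_safe(raw: &str) -> bool {
    // Newlines are rejected before tokenizing, since tokenizing would hide them.
    if raw.contains('\n') {
        return false;
    }
    let cmd = strip_redirections(raw);
    // Chaining, backgrounding, pipes, and command substitution are rejected.
    const FORBIDDEN: [&str; 5] = [";", "&", "|", "$(", "`"];
    if FORBIDDEN.iter().any(|s| cmd.contains(s)) {
        return false;
    }
    // Deny list takes priority over the allow list.
    if DENY.iter().any(|p| has_prefix(&cmd, p)) {
        return false;
    }
    ALLOW.iter().any(|p| has_prefix(&cmd, p))
}
```

Stripping redirections before the `&` check is what lets `cargo test --workspace 2>&1` pass while `ls & rm x` is still rejected.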
After making changes, run in order:
1. `cargo fmt --all`
2. `cargo clippy --workspace -- -D warnings`
3. `cargo test --workspace`
4. `cargo build`