feat: entire prompts search - searchable prompt history from checkpoints by AasheeshLikePanner · Pull Request #1211 · entireio/cli

AasheeshLikePanner · 2026-05-14T04:04:24Z

Summary:

Implements entire prompts: a local, offline-first command group for searching the prompts behind your checkpoint history. This is the "search" feature from the roadmap. It surfaces the why behind commits, not just the what.

What's in this PR

Four commands:

entire prompts search "cache decision"     # full-text search with filters
entire prompts list                         # recent prompts, newest first
entire prompts show <checkpoint-id>        # full prompt text for a checkpoint
entire prompts index --rebuild / --status  # manage the local index

Filters on search: --agent, --branch, --kind, --after, --files, --limit, --json

How it works

On every commit, the PostCommit hook appends a new entry to .entire/prompts/index.ndjson, a gitignored newline-delimited JSON file inside the repo's .entire directory that can be appended to without rewriting. There is no external service and no database, and it works offline.
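As a rough illustration of the append-and-load cycle described above, here is a minimal sketch using only the standard library. The Entry struct, its field names, and the helper names are assumptions made for this example; the actual schema in the PR has more fields and is not reproduced here.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// Entry is a hypothetical index record, not the schema used in this PR.
type Entry struct {
	CheckpointID string `json:"checkpoint_id"`
	Agent        string `json:"agent"`
	Prompt       string `json:"prompt"`
}

// appendEntry writes one JSON object plus a newline to the end of path,
// creating the file and its parent directory if they do not exist yet.
func appendEntry(path string, e Entry) error {
	if err := os.MkdirAll(filepath.Dir(path), 0o750); err != nil {
		return fmt.Errorf("create index dir: %w", err)
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return fmt.Errorf("open index: %w", err)
	}
	defer f.Close()
	line, err := json.Marshal(e)
	if err != nil {
		return fmt.Errorf("encode entry: %w", err)
	}
	if _, err := f.Write(append(line, '\n')); err != nil {
		return fmt.Errorf("append entry: %w", err)
	}
	return nil
}

// loadEntries reads the whole index into memory, one entry per line.
func loadEntries(path string) ([]Entry, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open index: %w", err)
	}
	defer f.Close()
	var out []Entry
	sc := bufio.NewScanner(f)
	sc.Buffer(make([]byte, 0, 64*1024), 4*1024*1024) // prompts can be long
	for sc.Scan() {
		if len(sc.Bytes()) == 0 {
			continue // tolerate blank lines
		}
		var e Entry
		if err := json.Unmarshal(sc.Bytes(), &e); err != nil {
			return nil, fmt.Errorf("decode entry: %w", err)
		}
		out = append(out, e)
	}
	return out, sc.Err()
}

// roundTrip appends two entries to a throwaway index and reads them back.
func roundTrip() ([]Entry, error) {
	dir, err := os.MkdirTemp("", "prompts-index")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, ".entire", "prompts", "index.ndjson")
	samples := []Entry{
		{CheckpointID: "fdc9780864bb", Agent: "claude", Prompt: "add prompts command group"},
		{CheckpointID: "7a862f395125", Agent: "claude", Prompt: "fix lint errors"},
	}
	for _, e := range samples {
		if err := appendEntry(path, e); err != nil {
			return nil, err
		}
	}
	return loadEntries(path)
}

func main() {
	entries, err := roundTrip()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(entries), "entries loaded")
}
```

Because each append only adds one line, existing entries are never rewritten, which is the property the NDJSON format is chosen for here.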

On search, the index loads into memory and each entry gets scored:

Exact phrase match → +10
All query tokens found → +5
Any token found → +1
Term density bonus → up to +2
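The weights above can be sketched as a single scoring function. This is an illustration under stated assumptions, not the code in rank.go: it treats the tiers as additive, counts "any token found" once rather than per token, and defines density as matched token occurrences divided by the prompt's token count, scaled to at most +2. The function name and signature are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// score rates one prompt against a query using the weights listed above.
// promptTokens and queryTokens are assumed to be tokenized already;
// phrase is the raw query phrase used for the exact-match check.
func score(prompt string, promptTokens, queryTokens []string, phrase string) float64 {
	if len(promptTokens) == 0 || len(queryTokens) == 0 {
		return 0
	}
	counts := make(map[string]int, len(promptTokens))
	for _, t := range promptTokens {
		counts[t]++
	}
	matched, occurrences := 0, 0
	for _, q := range queryTokens {
		if n := counts[q]; n > 0 {
			matched++
			occurrences += n
		}
	}
	if matched == 0 {
		return 0 // no query token appears in the prompt
	}
	total := 1.0 // any token found
	if matched == len(queryTokens) {
		total += 5 // all query tokens found
	}
	if phrase != "" && strings.Contains(strings.ToLower(prompt), strings.ToLower(phrase)) {
		total += 10 // exact phrase match
	}
	density := float64(occurrences) / float64(len(promptTokens))
	if density > 1 {
		density = 1 // repeated query tokens must not push the bonus past +2
	}
	return total + 2*density
}

func main() {
	tokens := []string{"cache", "decision", "api"}
	prompt := "Cache decision for the API"
	// 1 + 5 + 10 + 2*(2/3), prints 17.33
	fmt.Printf("%.2f\n", score(prompt, tokens, []string{"cache", "decision"}, "cache decision"))
	// 1 + 2*(1/3), prints 1.67
	fmt.Printf("%.2f\n", score(prompt, tokens, []string{"cache", "redis"}, "cache redis"))
}
```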

Tokenization runs NFC Unicode normalization → lowercase → word-boundary split → stopword filter → Porter stemmer, so a query for "caching" matches prompts containing "cache", "cached", or "caches".
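To show the shape of that pipeline with the standard library only, the sketch below takes the stemmer as a function argument. Two things are deliberately simplified and should be read as assumptions: the NFC step is omitted (in practice it would come first, for example via golang.org/x/text/unicode/norm), and toyStem is a crude suffix stripper standing in for the Porter stemmer, not an implementation of it. The stopword list is also illustrative.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stopwords is a tiny illustrative list; the list used by the PR is not shown.
var stopwords = map[string]bool{
	"a": true, "an": true, "the": true, "is": true, "of": true, "to": true, "and": true,
}

// tokenize lowercases text, splits it on any rune that is not a letter or
// digit, drops stopwords, and applies stem to every remaining word.
func tokenize(text string, stem func(string) string) []string {
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	tokens := make([]string, 0, len(words))
	for _, w := range words {
		if stopwords[w] {
			continue
		}
		tokens = append(tokens, stem(w))
	}
	return tokens
}

// toyStem is a stand-in, NOT the Porter algorithm: it strips the first
// matching suffix from a short list so that related word forms collapse.
func toyStem(w string) string {
	for _, suf := range []string{"ing", "ed", "es", "e", "s"} {
		if len(w) > len(suf)+2 && strings.HasSuffix(w, suf) {
			return strings.TrimSuffix(w, suf)
		}
	}
	return w
}

func main() {
	fmt.Println(tokenize("Caching the API decisions", toyStem)) // prints [cach api decision]
	fmt.Println(toyStem("cache") == toyStem("cached"))          // prints true
}
```

Swapping toyStem for a real stemmer only changes the function passed in; the rest of the pipeline stays the same.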

Queries are sanitised before tokenizing: regex metacharacters are stripped, a minimum length of 2 characters is enforced, and quoted phrases are extracted for exact matching.
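A minimal sketch of that sanitising step follows. The type and function names are hypothetical, and two details are assumptions: the 2-character minimum is applied per term (the description does not say whether it applies per term or to the whole query), and the set of stripped metacharacters is a guess at a reasonable list.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// parsedQuery is a hypothetical result type for this example.
type parsedQuery struct {
	Phrases []string // text found inside double quotes, kept for exact matching
	Terms   []string // remaining words of at least 2 characters
}

var (
	quoted = regexp.MustCompile(`"([^"]+)"`)
	// meta matches regex metacharacters plus any stray double quote.
	meta = regexp.MustCompile(`["\\.+*?()\[\]{}^$|]`)
)

// clean strips metacharacters, collapses whitespace, and lowercases.
func clean(s string) string {
	s = meta.ReplaceAllString(s, "")
	return strings.ToLower(strings.Join(strings.Fields(s), " "))
}

// parseQuery pulls quoted phrases out first, then splits what is left
// into terms and drops anything shorter than 2 characters.
func parseQuery(raw string) parsedQuery {
	var q parsedQuery
	for _, m := range quoted.FindAllStringSubmatch(raw, -1) {
		if p := clean(m[1]); len([]rune(p)) >= 2 {
			q.Phrases = append(q.Phrases, p)
		}
	}
	rest := quoted.ReplaceAllString(raw, " ")
	for _, w := range strings.Fields(clean(rest)) {
		if len([]rune(w)) >= 2 {
			q.Terms = append(q.Terms, w)
		}
	}
	return q
}

func main() {
	q := parseQuery(`"cache decision" redis.* a (ttl)`)
	fmt.Println(q.Phrases) // prints [cache decision]
	fmt.Println(q.Terms)   // prints [redis ttl]
}
```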

File locking on writes: O_CREATE|O_EXCL for atomic acquisition, 3 retries with 50ms backoff for concurrent PostCommit hooks, stale lock detection (>30s) to survive crashes.
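The locking scheme above could look roughly like the sketch below. It is not the code in store.go: the lock file name, the fixed (non-growing) 50ms wait, counting "3 retries" as three attempts in total, and the helper names are all assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

const (
	maxAttempts = 3                     // attempts before giving up
	backoff     = 50 * time.Millisecond // wait between attempts
	staleAfter  = 30 * time.Second      // older locks are treated as abandoned
)

// acquireLock creates lockPath with O_CREATE|O_EXCL, which fails if the file
// already exists, so only one process can hold the lock at a time. It returns
// a release function that removes the lock file.
func acquireLock(lockPath string) (func(), error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
		if err == nil {
			f.Close()
			return func() { os.Remove(lockPath) }, nil
		}
		if !errors.Is(err, os.ErrExist) {
			return nil, fmt.Errorf("create lock: %w", err)
		}
		// The lock is held. If it is older than staleAfter, assume the
		// owner crashed, remove it, and try again immediately.
		if info, statErr := os.Stat(lockPath); statErr == nil && time.Since(info.ModTime()) > staleAfter {
			os.Remove(lockPath)
			continue
		}
		time.Sleep(backoff)
	}
	return nil, errors.New("index lock is busy")
}

// demo takes the lock, checks that a second attempt fails while it is held,
// then releases and reacquires it. It reports whether the second attempt failed.
func demo() (bool, error) {
	dir, err := os.MkdirTemp("", "lock-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	lock := filepath.Join(dir, "index.lock") // hypothetical lock file name

	release, err := acquireLock(lock)
	if err != nil {
		return false, err
	}
	_, busyErr := acquireLock(lock) // gives up after three attempts
	release()

	release2, err := acquireLock(lock)
	if err != nil {
		return busyErr != nil, err
	}
	release2()
	return busyErr != nil, nil
}

func main() {
	blocked, err := demo()
	fmt.Println("second acquire blocked:", blocked, "error:", err)
}
```

One caveat of this simple approach: two processes that both see a stale lock can race on removing it, so a production version may want an extra check (for example, recording the owner's PID in the lock file).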

Architecture decisions

NDJSON over SQLite — appendable without full rewrites, no CGO, human-readable, portable
Porter stemmer (github.com/kljensen/snowball) — one new pure-Go dependency, zero CGO
Local index, not cloud — prompts are personal context, should stay local and work offline
PostCommit hook integration — index updates happen transparently, no user action needed

Tests

19 tests across rank_test.go and store_test.go:

Tokenizer: stemming, stopwords, unicode normalization, special chars
Query parser: basic, phrase extraction, regex stripping, min length
Scorer: exact phrase, all tokens, term density
Search: ranking, empty query, filter application
Store: concurrent writes, single entry, empty slice, lock contention

Benchmarks:

Tokenize: ~0.1ms
Search 1K entries: 5.6ms (target <100ms)
Index load 1K entries: 2.8ms (target <50ms)

Tested live against 4 real checkpoints, 94 prompts, 98KB index. Search, list, show all working end-to-end.

Known gaps

Three things are stubbed or incomplete — none affect the core feature working:

ReviewPrompt for agent_review kind not yet wired — entries with kind: agent_review fall back to empty prompt text. Will fix in follow-up.
TokenCount, ParentCheckpointID, SubagentDepth fields exist in the schema but aren't populated from metadata yet.
entire prompts index --verify flag exists but is a no-op placeholder.

Implements 'entire prompts' command group:
- search: Keyword search with filters
- list: List recent prompts
- show: Display full prompt for checkpoint
- index: Manage index

Auto-rebuilds index on first search. Integrates with PostCommit hook for incremental updates.

Entire-Checkpoint: fdc9780864bb

Comprehensive doc covering:
- What was implemented (commands, files)
- Logic flow (index building, search, incremental updates)
- Algorithm details (tokenizer, scorer, locking)
- Data structures
- How to test
- Known limitations
- Future improvements
- Architecture diagram

- Fixed error wrapping (wrapcheck)
- Added NFC unicode normalization to Tokenize
- Added query guard for special characters
- Fixed file permissions (gosec)
- Added nil check handling

Remaining: 12 lint issues (mostly style)

Tests added:
- TestTokenize_stemming, stopwords, unicode, specialChars
- TestParseQuery_basic, phrase, specialChars, tooShort
- TestScore_exactPhrase, allTokens, termDensity
- TestSearch_returnsRanked, emptyQuery, filters
- BenchmarkSearch1K: 5.6ms for 1K entries (target <100ms)

All tests pass.

Implements offline-first, searchable prompt history from checkpoint data:
- Add entire prompts search/list/show/index commands
- Build NDJSON index from checkpoint metadata
- Tokenize with Porter stemmer and NFC normalization
- Weighted scoring: phrase(+10), all tokens(+5), any(+1), density(*2)
- File locking with retry and stale detection

Fixes:
- Replace bubble sort O(n²) with sort.Slice O(n log n)
- Add 3-retry lock with backoff to prevent data loss
- Add stale lock detection for crash recovery

Follow-ups (documented):
- ReviewPrompt not wired (agent_review kind)
- --verify flag is placeholder
- TokenCount/ParentCheckpointID/SubagentDepth not populated

Entire-Checkpoint: 7a862f395125

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

AasheeshLikePanner added 9 commits May 14, 2026 09:21

Fix lint errors and add unicode normalization

2cf5506


Update implementation documentation with test results

5294829

Remove feature context doc (internal only)

2683185

Remove .entire/prompts/index.ndjson from git (should be gitignored)

f886a3f

Fix .gitignore: allow .entire subdirs, restore .entire files from main

d8c6de2

Entire-Checkpoint: 7a862f395125

AasheeshLikePanner requested a review from a team as a code owner May 14, 2026 04:04

claude Bot reviewed May 14, 2026

AasheeshLikePanner wants to merge 9 commits into entireio:main from AasheeshLikePanner:feature/searchable-prompts
