
AvocadoDB

The first deterministic context database for AI agents

Fix your RAG in 5 minutes - same query, same context, every time.


What is AvocadoDB?

AvocadoDB is a span-based context compiler that replaces traditional vector databases' chaotic "top-k" retrieval with deterministic, citation-backed context generation.

Pure Rust embeddings: 6x faster than OpenAI, fully offline, and $0 in API costs.

The Problem with RAG

Current RAG systems are fundamentally broken:

  • ❌ Same query → different results each time (non-deterministic)
  • ❌ Token budgets wasted on duplicates (60-70% utilization)
  • ❌ No citations or verifiability
  • ❌ Hallucinations from inconsistent context
  • ❌ Slow (200-300ms just for OpenAI embedding calls)
  • ❌ Expensive (API costs scale with usage)

The AvocadoDB Solution

  • 100% Deterministic: Same query → same context, every time
  • 6x Faster: 40-60ms compilation (vs 240-360ms with OpenAI)
  • Zero Cost: Pure Rust embeddings, no API required
  • Works Offline: No internet needed after initial setup
  • Citation-Backed: Every span has exact line number citations
  • Token Efficient: 95%+ budget utilization
  • Drop-in Replacement: Works with any LLM (sketch below)
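
Because the compiled context is plain, citation-backed text, it drops into any prompt. A minimal Python sketch using the Python SDK described below (the prompt wording is illustrative, and documents are assumed to be already ingested):

from avocado import AvocadoDB

db = AvocadoDB()  # assumes ./docs has already been ingested
result = db.compile("How does authentication work?", budget=8000)

prompt = (
    "Answer using only the cited context below.\n\n"
    f"{result.text}\n\n"
    "Question: How does authentication work?"
)
# Hand `prompt` to whichever LLM client you use.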

⚡ Performance

# Run benchmarks on your hardware
./target/release/avocado benchmark

# Results (M1 Mac example):
# Single embedding: 1.2ms  (vs ~250ms OpenAI)
# Batch of 100:     8.7ms  (vs ~250ms OpenAI)
# Full compilation: 43ms   (vs ~300ms OpenAI)
#
# Speedup: 6-7x faster ⚡
# Cost: $0 (vs ~$0.0001 per 1K tokens)

See EMBEDDING_PERFORMANCE.md for detailed benchmarks.

Quick Start

Install from crates.io (Easiest)

cargo install avocado-cli

That's it! Now you can use avocado directly:

avocado --version
avocado init
avocado ingest ./docs --recursive
avocado compile "your query"

Docker (Recommended for Server)

Run the server with Docker:

# Run with Docker
docker run -d \
  -p 8765:8765 \
  -v avocado-data:/data \
  --name avocadodb \
  avocadodb/avocadodb:latest

# Or use Docker Compose
docker-compose up -d

# Test the server
curl http://localhost:8765/health

See Docker Guide for complete documentation.

Installation from Source

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Clone and build
git clone https://github.com/avocadodb/avocadodb
cd avocadodb
cargo build --release

# Optional: Set OpenAI API key (only if you want to use OpenAI embeddings)
# By default, AvocadoDB uses local embeddings (no API key required, no Python required!)
#
# Local embeddings strategy (automatic, in priority order):
# 1. Pure Rust with fastembed (semantic, good quality, no Python required) ✅ DEFAULT
#    - Uses all-MiniLM-L6-v2 model (384 dimensions) by default
#    - ONNX-based, fast and efficient
#    - Model downloaded automatically on first use (~90MB)
#    - To increase dimensionality, set AVOCADODB_EMBEDDING_MODEL:
#      * "nomic" or "nomicv15" → 768 dimensions (good balance)
#      * "bgelarge" or "bge-large-en-v1.5" → 1024 dimensions (higher quality)
# 2. Python + sentence-transformers (fallback if fastembed unavailable)
#    - Requires: pip install sentence-transformers
# 3. Hash-based fallback (deterministic, but NOT semantic)
#    - Works always, but poor semantic quality
#
# To use OpenAI embeddings instead:
# export OPENAI_API_KEY="sk-..."
# export AVOCADODB_EMBEDDING_PROVIDER=openai

CLI Usage (Daemon by default)

# Initialize database
./target/release/avocado init

# Get model recommendation (optional)
./target/release/avocado recommend --corpus-size 5000 --use-case production
# Recommends optimal embedding model for your use case

# Ingest documents
./target/release/avocado ingest ./docs --recursive
# Output: Ingested 42 files → 387 spans

# Compile context (uses daemon at http://localhost:8765 by default)
./target/release/avocado compile "How does authentication work?" --budget 8000
# Force local mode (uses .avocado/db.sqlite in current project)
./target/release/avocado compile "How does authentication work?" --local --budget 8000

# Run performance benchmarks
./target/release/avocado benchmark
# Shows real performance on your hardware

GPU-backed server (Modal) quickstart

# Start the daemon with remote GPU embeddings (Modal)
avocado serve --gpu --embed-url https://<your-modal-endpoint>/embed
# or CPU/local (default)
avocado serve

Example Output:

Compiling context for: "How does authentication work?"
Token budget: 8000

[1] docs/authentication.md
Lines 1-23

# Authentication System

Our authentication uses JWT tokens with secure refresh mechanisms...

---

[2] src/middleware/auth.ts
Lines 45-78

export function authenticateRequest(req: Request) {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) throw new UnauthorizedError();
  ...
}

---

Compiled 12 spans using 7,891 tokens (98.6% utilization)
Compilation time: 243ms
Context hash: e3b0c4429...52b855 (deterministic ✓)

Python SDK

# Install the SDK
cd sdks/python
pip install -e .

Then, in Python:

from avocado import AvocadoDB

db = AvocadoDB()
db.ingest("./docs", recursive=True)

result = db.compile("my query", budget=8000)
print(result.text)  # Deterministic every time

TypeScript SDK

# Install and build the SDK
cd sdks/typescript
npm install
npm run build

Then, in TypeScript:

import { AvocadoDB } from 'avocadodb';

const db = new AvocadoDB();
await db.ingest('./docs', { recursive: true });

const result = await db.compile('my query', { budget: 8000 });
console.log(result.text);  // Deterministic every time

HTTP Server (Multi-project daemon)

# Start server (binds to 127.0.0.1 by default)
./target/release/avocado-server

# Use the API
curl -X POST http://localhost:8765/compile \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication", "token_budget": 8000, "project": "'"$PWD"'"}'

Docker & Kubernetes Deployment

AvocadoDB is production-ready with full Docker and Kubernetes support.

Docker

# Quick start with Docker
docker run -d -p 8765:8765 -v avocado-data:/data avocadodb/avocadodb:latest

# Or use Docker Compose
docker-compose up -d

Features:

  • Multi-stage build for minimal image size (~80-100MB)
  • Multi-architecture support (linux/amd64, linux/arm64)
  • Non-root user for security
  • Health checks built-in
  • Configurable via environment variables

See Docker Guide for complete documentation.

Kubernetes

# Deploy to Kubernetes
kubectl apply -k k8s/

# Verify deployment
kubectl get pods -l app=avocadodb

Includes:

  • Production-ready Deployment manifests
  • Horizontal scaling support
  • Persistent storage configuration
  • Ingress with TLS/HTTPS
  • ConfigMaps and Secrets management
  • Resource limits and health checks

See Kubernetes Guide for complete documentation.

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| PORT | 8765 | HTTP server port |
| BIND_ADDR | 127.0.0.1 | Bind address (set 0.0.0.0 to expose publicly) |
| RUST_LOG | info | Log level |
| AVOCADODB_EMBEDDING_MODEL | minilm | Embedding model (minilm, nomic, bgelarge) |
| AVOCADODB_EMBEDDING_PROVIDER | local | Provider (local or openai) |
| OPENAI_API_KEY | - | OpenAI API key (if using OpenAI) |
| AVOCADODB_ROOT | unset | Optional project root; when set, all project paths must be under this directory, and requests outside it are rejected |
| API_TOKEN | unset | If set, all routes (except /health and /api-docs/*) require a matching X-Avocado-Token header |
| MAX_BODY_BYTES | 2097152 (2 MB) | Request body size limit to protect against large payloads |

Security note:

  • Do not expose the server publicly without protection. If you must, set BIND_ADDR=0.0.0.0 and front it with auth.
  • For local safety, clients always send an explicit project (their current working directory); the server normalizes paths and can be restricted to AVOCADODB_ROOT.
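
If API_TOKEN is set, every request (except /health and /api-docs/*) must carry the matching header. A minimal Python sketch:

import os

import requests

resp = requests.post(
    "http://localhost:8765/compile",
    # X-Avocado-Token must equal the server's API_TOKEN
    headers={"X-Avocado-Token": os.environ["API_TOKEN"]},
    json={"query": "authentication", "token_budget": 8000, "project": os.getcwd()},
    timeout=30,
)
resp.raise_for_status()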

How It Works

Architecture

Query → Embed → [Semantic Search + Lexical Search] → Hybrid Fusion
      → MMR Diversification → Token Packing → Deterministic Sort → WorkingSet

Key Innovations

  1. Span-Based Indexing: Documents are split into spans (20-50 lines) with precise line numbers
  2. Hybrid Retrieval: Combines semantic (vector) and lexical (keyword) search
  3. Deterministic Ordering: Results sorted by (artifact_id, start_line) for reproducibility
  4. Greedy Token Packing: Maximizes token budget utilization without duplicates (see the sketch after this list)
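
To make the fusion, packing, and ordering steps concrete, here is a simplified Python sketch. Field names and the RRF constant are illustrative assumptions, not AvocadoDB's actual internals:

from dataclasses import dataclass

@dataclass(frozen=True)
class Span:
    artifact_id: str
    start_line: int
    token_count: int

def rrf_fuse(semantic: list[Span], lexical: list[Span], k: int = 60) -> list[Span]:
    """Reciprocal Rank Fusion: score(s) = sum over rankings of 1 / (k + rank)."""
    scores: dict[Span, float] = {}
    for ranking in (semantic, lexical):
        for rank, span in enumerate(ranking, start=1):
            scores[span] = scores.get(span, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)  # also dedupes

def pack_and_order(candidates: list[Span], budget: int) -> list[Span]:
    """Greedily fill the token budget, then sort for reproducibility."""
    selected, used = [], 0
    for span in candidates:  # already ranked by fused score
        if used + span.token_count <= budget:
            selected.append(span)
            used += span.token_count
    # Deterministic ordering: (artifact_id, start_line), independent of score
    return sorted(selected, key=lambda s: (s.artifact_id, s.start_line))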

Explainability & Reproducibility (v2.1)

NEW in v2.1: Enhanced determinism, explainability, and quality tracking features based on production feedback.

Version Manifest

Every compilation now includes a version manifest for full reproducibility:

// Access manifest from WorkingSet
let manifest = working_set.manifest.unwrap();
println!("Avocado version: {}", manifest.avocado_version);
println!("Embedding model: {}", manifest.embedding_model);
println!("Context hash: {}", manifest.context_hash);

The manifest includes: avocado version, tokenizer, embedding model, embedding dimensions, chunking params, index params, and a SHA256 context hash.

Explain Plan

Understand exactly how context was selected with explain mode:

# CLI with explain
avocado compile "authentication" --explain

# Shows candidates at each pipeline stage:
# - Semantic search (top 50 from HNSW)
# - Lexical search (keyword matches)
# - Hybrid fusion (RRF combination)
# - MMR diversification
# - Token packing
# - Final deterministic order

From the Python SDK:

result = db.compile("auth", budget=8000, explain=True)
if result.explain:
    print(f"Semantic candidates: {len(result.explain.semantic_candidates)}")
    print(f"Final spans: {len(result.explain.final_order)}")

Working Set Diff

Compare retrieval results across corpus versions for auditing:

use avocado_core::{diff_working_sets, summarize_diff};

let diff = diff_working_sets(&before, &after);
println!("{}", summarize_diff(&diff));
// Output: "3 added, 1 removed, 2 reranked"

Smart Incremental Rebuild

Only re-embed changed files - unchanged content is automatically skipped:

# First ingest
avocado ingest ./docs --recursive
# Ingested 42 files → 387 spans

# Re-ingest after editing 3 files
avocado ingest ./docs --recursive
# Skipped 39 unchanged, Updated 3 files → 28 spans

Content-hash comparison ensures minimal re-embedding while keeping the index fresh.
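
Conceptually, the skip logic is a per-file content-hash comparison. A simplified Python sketch of the idea (the real implementation lives in the Rust core):

import hashlib
from pathlib import Path

def files_to_reembed(paths: list[Path], stored: dict[str, str]) -> list[Path]:
    """Return only the files whose content hash differs from the stored one."""
    changed = []
    for path in paths:
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if stored.get(str(path)) != digest:
            changed.append(path)        # new or modified: re-extract and re-embed
            stored[str(path)] = digest  # remember the fresh hash
    return changed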

Evaluation Metrics

Built-in support for golden set testing and quality metrics:

use avocado_core::{GoldenQuery, evaluate};

let queries = vec![
    GoldenQuery {
        query: "authentication".to_string(),
        expected_paths: vec!["docs/auth.md".to_string()],
        k: 10,
    },
];

let summary = evaluate(&queries, &db, &index, &config).await?;
println!("Recall@10: {:.2}%", summary.mean_recall * 100.0);
println!("MRR: {:.3}", summary.mean_mrr);

Session Management

NEW in v2.0: Multi-turn conversation tracking with context compilation

AvocadoDB now supports session management, enabling AI agents to maintain conversation history and context across multiple interactions.

Quick Example

from avocado import AvocadoDB

db = AvocadoDB(mode="http")

# Create a session
session = db.create_session(user_id="alice", title="Project Q&A")

# Multi-turn conversation
result = session.compile("What is AvocadoDB?", budget=8000)
session.add_message("assistant", "AvocadoDB is a deterministic context database...")

result2 = session.compile("How does the compiler work?")
session.add_message("assistant", "The compiler uses hybrid search...")

# Get conversation history
history = session.get_history()

# Replay for debugging
replay = session.replay()

Features

  • Multi-turn conversations: Track user queries and agent responses
  • Context compilation: Automatically compile context for each query
  • Conversation history: Retrieve formatted history with token limiting
  • Session replay: Debug agent behavior by replaying entire sessions
  • Persistence: Sessions stored in SQLite with full ACID guarantees

Available in

  • Python SDK: Full session support with Session class
  • TypeScript SDK: Complete session management API
  • CLI: Session commands for interactive use
  • HTTP API: RESTful endpoints for all session operations

See SESSION_MANAGEMENT.md for complete documentation.

Why Determinism Matters

When RAG systems return different context for the same query:

  • LLMs produce inconsistent answers
  • Users can't verify results
  • Debugging is impossible
  • Trust is broken

AvocadoDB fixes this with deterministic compilation - same query, same context, every time.

Verify Determinism Yourself

# Run the same query multiple times
avocado compile "authentication" --budget 8000 | head -100 | sha256sum
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

avocado compile "authentication" --budget 8000 | head -100 | sha256sum
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

# Same hash every single time! ✅
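
The same check can be scripted from the Python SDK by hashing result.text directly:

import hashlib
from avocado import AvocadoDB

db = AvocadoDB()
runs = [db.compile("authentication", budget=8000).text for _ in range(2)]
hashes = [hashlib.sha256(text.encode()).hexdigest() for text in runs]
assert hashes[0] == hashes[1], "context should be byte-identical across runs"
print(hashes[0])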

Performance

Phase 1 achieves production-ready performance:

| Metric | Target | Actual | Status |
|--------|--------|--------|--------|
| Compilation time (8K tokens) | < 500ms | ~50ms avg | ✅ 10x faster |
| Token budget utilization | > 95% | 90-95% | ✅ Excellent |
| Determinism | 100% | 100% | ✅ Perfect |
| Duplicate spans | 0 | 0 | ✅ Perfect |

Breakdown for 8K token budget compilation (with Pure Rust embeddings):

Embed query:          1-5ms      (2-5% of total) - Pure Rust (fastembed), local
Semantic search:      <1ms       (Vector similarity, HNSW)
Lexical search:       <1ms       (SQL LIKE query)
Hybrid fusion:        <1ms       (RRF score combination)
MMR diversification:  5-10ms     (Diversity selection)
Token packing:        <1ms       (Greedy budget allocation)
Deterministic sort:   <1ms       (Stable sort)
Build context:        <1ms       (Text concatenation)
Count tokens:         30-40ms    (tiktoken encoding)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TOTAL:                40-60ms    (6x faster than OpenAI!)

Performance Comparison:

| Metric | Pure Rust (fastembed) | OpenAI API |
|--------|----------------------|------------|
| Query embedding | 1-5ms | 200-300ms |
| Total compilation | 40-60ms | 240-360ms |
| Throughput | 200-1000 texts/sec | 3-5 batches/sec |
| Cost | Free | ~$0.0001/1K tokens |
| Rate limits | None | Varies by tier |
| Offline | ✅ Yes | ❌ No |
| Quality | Good (384 dims) | Excellent (1536 dims) |

Pure Rust embeddings are 6x faster and completely free, and the retrieval algorithms themselves (fusion, MMR, packing, sorting) run in under 15ms combined.

See docs/performance.md for detailed analysis and scaling characteristics.

CLI Reference

avocado init

Initialize a new AvocadoDB database:

avocado init [--path <db-path>]

Creates .avocado/ directory with SQLite database and vector index.

avocado ingest

Ingest documents into the database:

avocado ingest <path> [--recursive]

Examples:

# Ingest single file
avocado ingest README.md

# Ingest entire directory recursively
avocado ingest docs/ --recursive

# Ingest specific file types
avocado ingest src/ --recursive --include "*.rs,*.md,*.toml"

The ingestion process:

  1. Reads document content
  2. Extracts spans (20-50 lines with smart boundaries; sketched below)
  3. Generates embeddings for each span (local fastembed by default)
  4. Stores in SQLite database
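
A toy Python sketch of step 2 using fixed-size windows (AvocadoDB's actual chunker picks smarter, content-aware boundaries):

# Toy span extraction: split a file into <=40-line spans with 1-based
# line numbers for citations. Real boundaries are content-aware.
def extract_spans(text: str, max_lines: int = 40) -> list[dict]:
    lines = text.splitlines()
    spans = []
    for start in range(0, len(lines), max_lines):
        chunk = lines[start:start + max_lines]
        spans.append({
            "start_line": start + 1,
            "end_line": start + len(chunk),
            "text": "\n".join(chunk),
        })
    return spans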

avocado compile

Compile a deterministic context for a query:

avocado compile <query> [OPTIONS]

Options:

  • --budget <tokens>: Token budget (default: 8000)
  • --json: Output as JSON instead of human-readable format
  • --explain: Show explain plan with candidates at each pipeline stage
  • --mmr-lambda <0.0-1.0>: MMR diversity parameter (default: 0.5; formula below)
    • Higher values (0.7-1.0) = more relevant but potentially redundant
    • Lower values (0.0-0.3) = more diverse but potentially less relevant
  • --semantic-weight <float>: Semantic search weight (default: 0.7)
  • --lexical-weight <float>: Lexical search weight (default: 0.3)
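
The --mmr-lambda flag is the λ in the standard Maximal Marginal Relevance objective (the usual formulation; sim is a similarity function such as cosine, C the candidate set, S the spans selected so far, q the query):

% Pick the next span by balancing relevance against redundancy:
\text{next} = \arg\max_{s \in C \setminus S} \Big[ \lambda \, \mathrm{sim}(s, q) \;-\; (1 - \lambda) \max_{t \in S} \mathrm{sim}(s, t) \Big]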

Examples:

# Basic compilation
avocado compile "How does authentication work?"

# Large context window
avocado compile "error handling patterns" --budget 16000

# Prioritize diversity over relevance
avocado compile "testing strategies" --mmr-lambda 0.3

# Tune search weights (more keyword matching)
avocado compile "API endpoints" --semantic-weight 0.5 --lexical-weight 0.5

# JSON output for programmatic use
avocado compile "authentication" --budget 8000 --json

JSON Output Format:

{
  "text": "[1] docs/auth.md\nLines 1-23\n\n# Authentication...",
  "spans": [
    {
      "id": "uuid",
      "artifact_id": "uuid",
      "start_line": 1,
      "end_line": 23,
      "text": "# Authentication...",
      "embedding": [0.002, 0.013, ...],
      "embedding_model": "text-embedding-ada-002",
      "token_count": 127,
      "metadata": null
    }
  ],
  "citations": [
    {
      "span_id": "uuid",
      "artifact_id": "uuid",
      "artifact_path": "docs/auth.md",
      "start_line": 1,
      "end_line": 23,
      "score": 0.0
    }
  ],
  "tokens_used": 2232,
  "query": "authentication",
  "compilation_time_ms": 243
}
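
The JSON output is convenient for scripting. A small Python sketch that invokes the CLI and prints the citations (field names taken from the schema above):

import json
import subprocess

out = subprocess.run(
    ["avocado", "compile", "authentication", "--budget", "8000", "--json"],
    capture_output=True, text=True, check=True,
).stdout
result = json.loads(out)

for c in result["citations"]:
    print(f'{c["artifact_path"]}:{c["start_line"]}-{c["end_line"]}')
print(f'{result["tokens_used"]} tokens in {result["compilation_time_ms"]}ms')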

avocado stats

Show database statistics:

avocado stats

Example output:

Database Statistics:
  Artifacts: 42
  Spans: 387
  Total Tokens: 125,431
  Average Tokens/Span: 324

avocado clear

Clear all data from the database:

avocado clear

Warning: This permanently deletes all ingested documents and embeddings!

Library Usage (Rust)

Use AvocadoDB as a library in your Rust projects:

Add the dependencies to Cargo.toml:

[dependencies]
avocado-core = "2.1"
tokio = { version = "1.35", features = ["full"] }

Then:

use avocado_core::{Database, VectorIndex, compiler, types::CompilerConfig};

#[tokio::main]
async fn main() -> avocado_core::types::Result<()> {
    // Open database
    let db = Database::new(".avocado/db.sqlite")?;

    // Load vector index from database
    let index = VectorIndex::from_database(&db)?;

    // Configure compilation
    let config = CompilerConfig {
        token_budget: 8000,
        semantic_weight: 0.7,
        lexical_weight: 0.3,
        mmr_lambda: 0.5,
        enable_mmr: true,
    };

    // Compile context
    let working_set = compiler::compile(
        "How does authentication work?",
        config,
        &db,
        &index,
        Some("your-openai-api-key")
    ).await?;

    println!("Compiled {} spans using {} tokens",
        working_set.spans.len(),
        working_set.tokens_used
    );

    println!("Deterministic hash: {}", working_set.deterministic_hash());

    // Use working_set.text in your LLM prompt
    println!("Context:\n{}", working_set.text);

    Ok(())
}

Development

Project Structure

avocadodb/
├── avocado-core/      # Core engine (Rust)
├── avocado-cli/       # Command-line tool
├── avocado-server/    # HTTP server
├── python/            # Python SDK
├── migrations/        # Database schema
├── tests/             # Integration tests
└── docs/              # Documentation

Running Tests

# Unit tests
cargo test

# Integration tests (requires OPENAI_API_KEY)
cargo test --test determinism -- --ignored
cargo test --test performance -- --ignored
cargo test --test correctness -- --ignored

Building

# Development build
cargo build

# Release build
cargo build --release

# Run CLI
cargo run --bin avocado -- --help

# Run server
cargo run --bin avocado-server

Roadmap

Phase 1 ✅ (Complete)

  • Core span extraction with smart boundaries
  • OpenAI embeddings integration
  • Hybrid search (semantic + lexical)
  • MMR diversification algorithm
  • Deterministic compilation (100% verified)
  • CLI tool with full features
  • HTTP server
  • Performance optimization (240ms avg)
  • Comprehensive documentation

Phase 2 - Advanced Features

  • Version manifest for full reproducibility
  • Explain plan for retrieval debugging
  • Working set diff for corpus auditing
  • Smart incremental rebuild (content-hash based)
  • Evaluation metrics (recall@k, MRR)
  • Multi-modal support (images, code)
  • Advanced retrieval (BM25, learned rankers)
  • PostgreSQL support
  • Framework integrations (LangChain, LlamaIndex)

Phase 3 - Agent Memory

  • Session management
  • Working set versioning
  • Collaborative features
  • Memory systems

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE for details.

Testing

AvocadoDB includes comprehensive test suites to validate determinism and performance:

# Run all tests and generate report
./scripts/run-tests.sh

# Run determinism validation only (100 iterations)
./scripts/test-determinism.sh

# Run performance benchmarks
./scripts/benchmark.sh

See docs/testing.md for complete testing documentation.


Built by the AvocadoDB Team | Making retrieval deterministic, one context at a time.
