Migration Guide

This document provides step-by-step instructions for migrating between different AutoMem configurations.

Heads up for existing deployments: The default embedding dimension is now 1024d (voyage-4) for new installs. If your Qdrant collection uses a different dimension (e.g. 3072d from text-embedding-3-large or 768d from text-embedding-3-small), no action is needed: VECTOR_SIZE_AUTODETECT=true (the default) automatically adopts your existing collection dimension on startup. To explicitly pin your dimension, set VECTOR_SIZE=<your-dimension> in your .env. To enforce strict matching (fail on mismatch), set VECTOR_SIZE_AUTODETECT=false.
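The startup behavior described above can be read as a small decision rule. The following is a minimal sketch of that rule, not AutoMem's actual implementation: the function name `resolve_vector_size` and the exception `VectorSizeMismatch` are hypothetical, and the inputs are assumed to be already parsed from VECTOR_SIZE, VECTOR_SIZE_AUTODETECT, and the existing collection.

```python
from typing import Optional


class VectorSizeMismatch(RuntimeError):
    """Hypothetical error for strict mode when the dimensions disagree."""


def resolve_vector_size(
    configured: int,
    existing: Optional[int],
    autodetect: bool = True,
) -> int:
    """Return the dimension to use on startup.

    configured: value of VECTOR_SIZE (1024 for new installs).
    existing:   dimension of the current Qdrant collection, or None if
                no collection exists yet.
    autodetect: value of VECTOR_SIZE_AUTODETECT.
    """
    if existing is None or existing == configured:
        return configured
    if autodetect:
        # Adopt whatever the existing collection already uses.
        return existing
    raise VectorSizeMismatch(
        f"Existing Qdrant collection: {existing}d, "
        f"configured VECTOR_SIZE: {configured}d"
    )
```

With the defaults, an existing 3072d collection keeps working after an upgrade; only `autodetect=False` turns the mismatch into a startup failure.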

Migrating to 1024d (voyage-4 default)

When to migrate: If you're switching from OpenAI embeddings to Voyage AI (the new recommended default).

Steps

  1. Back up your data: python scripts/backup_automem.py
  2. Set environment variables:
    EMBEDDING_PROVIDER=voyage    # or auto (will prefer Voyage if VOYAGE_API_KEY is set)
    VOYAGE_API_KEY=pa-...
    VECTOR_SIZE=1024
  3. Delete the Qdrant collection (the re-embed script in the next step recreates it):
    curl -X DELETE http://localhost:6333/collections/memories
  4. Re-embed all memories:
    python scripts/reembed_embeddings.py
  5. Verify: Check that /health shows vector_size: 1024 and recall returns results.
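The verification step can also be scripted. This is a sketch under assumptions: it presumes /health returns a JSON object with a top-level `vector_size` field and that AutoMem listens on http://localhost:8001 (the address used in the recall example); adjust both to your deployment. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def health_reports_dimension(payload: dict, expected: int) -> bool:
    """True if the health payload reports the expected vector size.

    Assumes a top-level "vector_size" key, which may differ in practice.
    """
    return payload.get("vector_size") == expected


def fetch_health(base_url: str = "http://localhost:8001") -> dict:
    """GET <base_url>/health and decode the JSON body."""
    with urlopen(f"{base_url}/health", timeout=10) as response:
        return json.load(response)
```

Usage: `health_reports_dimension(fetch_health(), 1024)` should be true after a successful migration to voyage-4.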

Alternatively, set VECTOR_SIZE_AUTODETECT=true (the default) and AutoMem will adopt your existing collection dimension without migration. Only migrate when you want to switch embedding providers.


Upgrading to 3072d Embeddings

When to upgrade: If you need better semantic precision and have the storage budget for 4x larger embeddings.

Pros ✅

  • Better semantic precision: +2.3 percentage points on benchmark accuracy (88.2% → 90.5%)
  • Improved multi-hop reasoning: Better at connecting related concepts
  • Recommended for production: If accuracy is critical and storage is not a constraint

Cons ❌

  • 4x storage cost: 768 → 3072 dimensions (4x more disk space)
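  • ~6.5x embedding cost: OpenAI bills per input token, not per dimension; text-embedding-3-large costs $0.13 per 1M tokens versus $0.02 for text-embedding-3-small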
  • ~20% slower search: More dimensions = more computation
  • Migration required: Cannot reuse existing embeddings

Cost Comparison

| Metric | 768d (small) | 3072d (large) | Multiplier |
|---|---|---|---|
| Storage per 1M memories | ~3GB | ~12GB | 4x |
| OpenAI cost per 1M tokens | $0.02 | $0.13 | 6.5x |
| Search latency | ~50ms | ~60ms | 1.2x |
| Benchmark accuracy | 88.2% | 90.5% | +2.3pp |
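The storage figures follow from simple arithmetic if each vector component is stored as a 4-byte float32. That storage format is an assumption here, and the calculation covers raw vectors only, ignoring index and payload overhead. A short sketch that reproduces the table's multipliers:

```python
BYTES_PER_FLOAT32 = 4  # assumed storage width per vector component


def raw_vector_gb(dimensions: int, memory_count: int) -> float:
    """Raw vector payload in decimal gigabytes (no index overhead)."""
    return dimensions * BYTES_PER_FLOAT32 * memory_count / 1e9


small = raw_vector_gb(768, 1_000_000)    # 3.072
large = raw_vector_gb(3072, 1_000_000)   # 12.288
storage_multiplier = 3072 / 768          # 4.0
price_multiplier = 0.13 / 0.02           # about 6.5
```

Storage scales with dimension count, while the embedding price is set per model, which is why the two multipliers differ.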

Migration Steps

1. Back Up Your Data

python scripts/backup_automem.py

This creates timestamped backups in backups/:

  • backups/falkordb/memories_YYYYMMDD_HHMMSS.rdb
  • backups/qdrant/qdrant_snapshot_YYYYMMDD_HHMMSS.tar.gz
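Because the backup filenames embed a YYYYMMDD_HHMMSS stamp, the most recent backup can be picked programmatically, which is handy when a rollback is needed. A minimal sketch, assuming only the naming pattern shown above; the helper names are hypothetical and not part of AutoMem's scripts.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

# Matches the YYYYMMDD_HHMMSS stamp used in the backup filenames.
STAMP = re.compile(r"(\d{8}_\d{6})")


def backup_timestamp(name: str) -> Optional[datetime]:
    """Parse the timestamp embedded in a backup filename, if any."""
    match = STAMP.search(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:
        # Digits matched the pattern but are not a real date/time.
        return None


def latest_backup(directory: Path) -> Optional[Path]:
    """Return the newest timestamped file in a backup directory."""
    stamped = [
        (backup_timestamp(p.name), p)
        for p in directory.iterdir()
        if p.is_file()
    ]
    stamped = [(ts, p) for ts, p in stamped if ts is not None]
    if not stamped:
        return None
    return max(stamped, key=lambda pair: pair[0])[1]
```

For example, `latest_backup(Path("backups/qdrant"))` returns the snapshot with the latest stamp, or `None` if the directory holds no timestamped files.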

2. Update Configuration

# Add to your .env file
echo "VECTOR_SIZE=3072" >> .env
echo "EMBEDDING_MODEL=text-embedding-3-large" >> .env

Or export temporarily:

export VECTOR_SIZE=3072
export EMBEDDING_MODEL=text-embedding-3-large

3. Re-embed All Memories

python scripts/reembed_embeddings.py

This will:

  • Fetch all memories from FalkorDB (source of truth)
  • Generate new 3072d embeddings using OpenAI API
  • Recreate Qdrant collection with new dimensions
  • Upsert all embeddings in batches

Expected time: ~5-10 minutes per 10k memories
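The fetch, embed, and upsert flow listed above can be sketched as a batched loop. This is an illustration only, not the contents of reembed_embeddings.py: `embed` and `upsert` are hypothetical callables standing in for the embedding provider and the Qdrant writer, and the default batch size of 100 is taken from the troubleshooting notes in this guide.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Vector = List[float]


def batched(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def reembed(
    memories: Sequence[Tuple[str, str]],          # (memory_id, content)
    embed: Callable[[List[str]], List[Vector]],   # hypothetical provider call
    upsert: Callable[[List[Tuple[str, Vector]]], None],  # hypothetical writer
    batch_size: int = 100,
) -> int:
    """Embed and upsert every memory; return the number of batches sent."""
    batches = 0
    for batch in batched(memories, batch_size):
        vectors = embed([content for _, content in batch])
        if len(vectors) != len(batch):
            raise RuntimeError("provider returned wrong number of vectors")
        upsert([(mid, vec) for (mid, _), vec in zip(batch, vectors)])
        batches += 1
    return batches
```

Batching keeps the number of provider requests proportional to `len(memories) / batch_size` rather than to the number of memories.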

4. Verify Migration

Check Qdrant collection info:

curl http://localhost:6333/collections/memories | jq '.result.config.params.vectors'

Should show:

{
  "size": 3072,
  "distance": "Cosine"
}
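If you prefer to check the dimension from a script rather than with jq, the same path can be read from the decoded JSON response. A minimal sketch, assuming the single unnamed-vector layout shown above; `collection_vector_size` is a hypothetical helper name.

```python
from typing import Any, Dict


def collection_vector_size(info: Dict[str, Any]) -> int:
    """Extract the vector size from a GET /collections/<name> response.

    Handles the single unnamed-vector layout shown above. Collections that
    use named vectors return a mapping of name -> params instead, and raise
    here so the difference is not silently ignored.
    """
    vectors = info["result"]["config"]["params"]["vectors"]
    if "size" not in vectors:
        raise ValueError("named-vector layout; inspect each entry instead")
    return int(vectors["size"])
```

After the migration, `collection_vector_size(response_json) == 3072` should hold.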

5. Test Recall

curl -X POST http://localhost:8001/recall \
  -H "Authorization: Bearer $AUTOMEM_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "test recall", "limit": 5}'

Verify results are returned and scores look reasonable.
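The same recall test can be issued from Python using only the standard library. This sketch mirrors the curl command above (endpoint, headers, and body); the shape of the response is not documented here, so the code returns the decoded JSON without interpreting it. The function names are hypothetical.

```python
import json
import os
from urllib.request import Request, urlopen


def build_recall_request(
    query: str,
    limit: int = 5,
    base_url: str = "http://localhost:8001",
    token: str = "",
) -> Request:
    """Build the POST /recall request without sending it."""
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return Request(
        f"{base_url}/recall",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def recall(query: str, limit: int = 5) -> dict:
    """Send the request using AUTOMEM_API_TOKEN from the environment."""
    request = build_recall_request(
        query, limit, token=os.environ["AUTOMEM_API_TOKEN"]
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)
```

Separating request construction from sending makes the request easy to inspect before it touches a live deployment.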

6. Restart Application

# If using Docker
docker compose up -d

# Or rerun the foreground dev stack
make dev

# If using systemd
sudo systemctl restart automem

# If using Railway
railway up

Rollback Procedure

If migration fails or results are poor:

# 1. Stop application
docker compose down  # or however you run AutoMem

# 2. Restore from backup
python scripts/restore_from_backup.py backups/qdrant/qdrant_snapshot_YYYYMMDD_HHMMSS.tar.gz
python scripts/restore_from_backup.py backups/falkordb/memories_YYYYMMDD_HHMMSS.rdb

# 3. Revert configuration
export VECTOR_SIZE=768
export EMBEDDING_MODEL=text-embedding-3-small

# 4. Restart
docker compose up -d

Downgrading to 768d Embeddings

When to downgrade: If storage costs are too high or 3072d isn't providing enough value.

Steps

Follow the same migration steps above, but use:

export VECTOR_SIZE=768
export EMBEDDING_MODEL=text-embedding-3-small

Then run reembed_embeddings.py to recreate the collection with 768d vectors.


Troubleshooting

Error: "Vector dimension mismatch"

Symptom (only when VECTOR_SIZE_AUTODETECT=false):

FATAL: Vector dimension mismatch detected!
  Existing Qdrant collection: 3072d
  Configured VECTOR_SIZE:     1024d

Solution (pick one):

  1. Set VECTOR_SIZE_AUTODETECT=true (default) to automatically adopt the existing collection dimension
  2. Set VECTOR_SIZE=<existing-dimension> in your .env to match your data
  3. Migrate to the new dimension: follow the 1024d, 3072d, or 768d migration steps above

Error: "OpenAI API rate limit"

Symptom:

Rate limit exceeded during re-embedding

Solution: The reembed_embeddings.py script uses whatever embedding provider is configured (Voyage, OpenAI, local, etc.). For large datasets:

  1. Run during off-peak hours
  2. Increase your provider's rate limits if applicable
  3. Split migration into batches using --batch-size flag
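If you drive embedding calls from your own code, a common way to cope with rate limits is retrying with exponential backoff. The sketch below illustrates that general technique; it is not a description of what reembed_embeddings.py does. `RateLimited` is a hypothetical stand-in for whatever exception your provider's client raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the delay each attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry schedule can be tested without actually waiting.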

Error: "Qdrant collection already exists"

Symptom:

Collection 'memories' already exists with different dimension

Solution: Delete and recreate:

curl -X DELETE http://localhost:6333/collections/memories
python scripts/reembed_embeddings.py

⚠️ Warning: This deletes all embeddings. Make sure FalkorDB still has the memories (embeddings will be regenerated from there).

Migration is slow

Symptoms:

  • Taking hours for thousands of memories
  • High OpenAI API costs

Solutions:

  1. Check batch size: Script uses batches of 100 by default
  2. Parallel processing: Use --workers flag (if implemented)
  3. Spot check first: Test on a subset before full migration
  4. Use cheaper model for testing:
    export EMBEDDING_MODEL=text-embedding-3-small
    python scripts/reembed_embeddings.py --dry-run
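To judge whether a run is actually slow, it helps to compare it against the guide's own figures: roughly 5-10 minutes per 10k memories and batches of 100. The sketch below assumes the time scales linearly and that each batch maps to one embedding request; both are simplifications, and the function names are hypothetical.

```python
from typing import Tuple


def expected_minutes(memory_count: int) -> Tuple[float, float]:
    """Scale the ~5-10 minutes per 10k memories figure linearly."""
    scale = memory_count / 10_000
    return (5 * scale, 10 * scale)


def api_calls(memory_count: int, batch_size: int = 100) -> int:
    """Number of embedding requests at a given batch size (ceiling division)."""
    return -(-memory_count // batch_size)
```

For 50,000 memories this gives an expected window of 25 to 50 minutes and 500 requests; a run that takes hours at that size points to rate limiting or a batch size that is too small.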

Backup failed

Symptoms:

  • Backup script errors
  • Empty backup files

Solutions:

  1. Check disk space: df -h
  2. Check permissions: ls -la backups/
  3. Manual backup:
    # FalkorDB
    docker exec automem-falkordb-1 redis-cli --rdb /data/dump.rdb
    
    # Qdrant
    curl -X POST http://localhost:6333/collections/memories/snapshots
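Empty backup files are easy to miss by eye. A small sketch that lists every zero-byte file under the backup directory, assuming the backups/ layout shown earlier in this guide; `empty_backups` is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def empty_backups(root: Path) -> List[Path]:
    """Return every zero-byte file under the backup directory."""
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.stat().st_size == 0
    )
```

Running `empty_backups(Path("backups"))` after a backup should return an empty list; any path it reports needs to be regenerated before you migrate.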

Best Practices

Before Any Migration

  1. Always back up first - Don't skip this step
  2. Test in staging - If you have a staging environment
  3. Monitor costs - Check OpenAI usage dashboard during migration
  4. Document current state - Note current VECTOR_SIZE and EMBEDDING_MODEL
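Documenting the current state can be as simple as dumping the relevant settings to JSON before you change them. The sketch below reads from the process environment, so values that exist only in your .env file must be loaded first; the `recorded_at` field and the function name are illustrative additions, not an AutoMem convention.

```python
import json
import os
from datetime import datetime, timezone
from typing import Mapping


def snapshot_config(env: Mapping[str, str] = os.environ) -> str:
    """Serialize the settings that a migration changes as JSON."""
    record = {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "VECTOR_SIZE": env.get("VECTOR_SIZE"),
        "EMBEDDING_MODEL": env.get("EMBEDDING_MODEL"),
        "EMBEDDING_PROVIDER": env.get("EMBEDDING_PROVIDER"),
        "VECTOR_SIZE_AUTODETECT": env.get("VECTOR_SIZE_AUTODETECT"),
    }
    return json.dumps(record, indent=2)
```

Saving the output next to your backups gives you the exact values to restore if you roll back.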

After Migration

  1. Run benchmark tests - Verify accuracy hasn't degraded
  2. Monitor performance - Check search latency and throughput
  3. Update documentation - Note when migration occurred and why
  4. Store migration record:
    curl -X POST http://localhost:8001/memory \
      -H "Authorization: Bearer $AUTOMEM_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "content": "Migrated to 3072d embeddings for better semantic precision",
        "tags": ["migration", "config", "embeddings"],
        "importance": 0.8
      }'

Choosing Between 768d and 3072d

Use 768d (text-embedding-3-small) if:

  • Cost-conscious deployment
  • Storage is limited
  • Speed > slight accuracy gains
  • Personal/development use
  • Small dataset (<100k memories)

Use 3072d (text-embedding-3-large) if:

  • Production deployment
  • Accuracy is critical
  • Complex multi-hop reasoning needed
  • Large dataset benefits from precision
  • Storage/compute costs are acceptable

Related Documentation