This document provides step-by-step instructions for migrating between different AutoMem configurations.
Heads up for existing deployments: The default embedding dimension is now 1024d (voyage-4) for new installs. If your Qdrant collection uses a different dimension (e.g. 3072d from text-embedding-3-large or 768d from text-embedding-3-small), no action is needed — VECTOR_SIZE_AUTODETECT=true (the default) automatically adopts your existing collection dimension on startup. To explicitly pin your dimension, set VECTOR_SIZE=<your-dimension> in your .env. To enforce strict matching (fail on mismatch), set VECTOR_SIZE_AUTODETECT=false.
- Migrating to 1024d (voyage-4 default)
- Upgrading to 3072d Embeddings
- Downgrading to 768d Embeddings
- Troubleshooting
When to migrate: If you're switching from OpenAI embeddings to Voyage AI (the new recommended default).
- Back up your data:
python scripts/backup_automem.py
- Set environment variables:
EMBEDDING_PROVIDER=voyage  # or auto (will prefer Voyage if VOYAGE_API_KEY is set)
VOYAGE_API_KEY=pa-...
VECTOR_SIZE=1024
- Delete and recreate the Qdrant collection:
curl -X DELETE http://localhost:6333/collections/memories
- Re-embed all memories:
python scripts/reembed_embeddings.py
- Verify: Check that /health shows vector_size: 1024 and recall returns results.
Alternatively, set VECTOR_SIZE_AUTODETECT=true (the default) and AutoMem will adopt your existing collection dimension without migration. Only migrate when you want to switch embedding providers.
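The verification step can be scripted. The following is a minimal sketch, assuming the /health response is JSON that contains a vector_size field somewhere in its body. The exact response shape is not specified in this guide, so the helper searches nested objects. The helper names and the default base URL (taken from the recall examples in this guide) are illustrative, not part of AutoMem.

```python
import json
import urllib.request


def find_key(payload, key):
    """Return the first value stored under `key` anywhere in nested JSON, else None."""
    if isinstance(payload, dict):
        if key in payload:
            return payload[key]
        children = payload.values()
    elif isinstance(payload, list):
        children = payload
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def health_reports_dimension(health, expected):
    """True if the health payload reports the expected vector size."""
    return find_key(health, "vector_size") == expected


def fetch_health(base_url="http://localhost:8001"):
    """Fetch and decode /health (base URL is an assumption; adjust to your deployment)."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
        return json.load(resp)
```

After a migration to 1024d, health_reports_dimension(fetch_health(), 1024) should return True.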
When to upgrade: If you need better semantic precision and have the storage budget for 4x larger embeddings.
- Better semantic precision: +2.3 percentage points on the benchmark in the comparison table below (88.2% to 90.5%)
- Improved multi-hop reasoning: Better at connecting related concepts
- Recommended for production: If accuracy is critical and storage is not a constraint
- 4x storage cost: 768 → 3072 dimensions (4x more disk space)
- ~6.5x embedding cost: OpenAI bills per token, and text-embedding-3-large costs $0.13 per 1M tokens versus $0.02 for text-embedding-3-small
- ~20% slower search: More dimensions = more computation
- Migration required: Cannot reuse existing embeddings
| Metric | 768d (small) | 3072d (large) | Multiplier |
|---|---|---|---|
| Storage per 1M memories | ~3GB | ~12GB | 4x |
| OpenAI cost per 1M tokens | $0.02 | $0.13 | 6.5x |
| Search latency | ~50ms | ~60ms | 1.2x |
| Benchmark accuracy | 88.2% | 90.5% | +2.3pp |
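The storage and cost multipliers in the table follow from simple arithmetic. The sketch below reproduces them, assuming vectors are stored as 4-byte float32 values and ignoring index and payload overhead (both are assumptions; real Qdrant disk usage will be somewhat higher). The function name is illustrative.

```python
def vector_storage_gb(num_vectors, dims, bytes_per_value=4):
    """Raw vector storage in decimal GB, assuming float32 values and no index overhead."""
    return num_vectors * dims * bytes_per_value / 1e9


small = vector_storage_gb(1_000_000, 768)    # about 3.07 GB
large = vector_storage_gb(1_000_000, 3072)   # about 12.29 GB
storage_ratio = large / small                # about 4x
price_ratio = 0.13 / 0.02                    # about 6.5x (per-1M-token prices from the table)
```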
python scripts/backup_automem.py
This creates timestamped backups in backups/:
- backups/falkordb/memories_YYYYMMDD_HHMMSS.rdb
- backups/qdrant/qdrant_snapshot_YYYYMMDD_HHMMSS.tar.gz
# Add to your .env file
echo "VECTOR_SIZE=3072" >> .env
echo "EMBEDDING_MODEL=text-embedding-3-large" >> .env
Or export temporarily:
export VECTOR_SIZE=3072
export EMBEDDING_MODEL=text-embedding-3-large
python scripts/reembed_embeddings.py
This will:
- Fetch all memories from FalkorDB (source of truth)
- Generate new 3072d embeddings using OpenAI API
- Recreate Qdrant collection with new dimensions
- Upsert all embeddings in batches
Expected time: ~5-10 minutes per 10k memories
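The batch loop at the heart of these steps can be sketched as follows. This is not the code of reembed_embeddings.py. It is a simplified illustration in which the embedding call and the Qdrant write are injected as placeholder callables (embed and upsert are hypothetical names), and a memory is reduced to an (id, content) pair.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Memory = Tuple[str, str]  # (memory_id, content); a simplified stand-in for a stored memory


def batched(items: Sequence[Memory], size: int) -> Iterator[Sequence[Memory]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def reembed(
    memories: Sequence[Memory],
    embed: Callable[[List[str]], List[List[float]]],
    upsert: Callable[[List[Tuple[str, List[float]]]], None],
    batch_size: int = 100,
) -> int:
    """Embed memories batch by batch and hand each batch to `upsert`.

    `embed` and `upsert` are placeholders for the provider call and the
    Qdrant write; they are not AutoMem functions.
    """
    total = 0
    for batch in batched(memories, batch_size):
        vectors = embed([content for _, content in batch])
        if len(vectors) != len(batch):
            raise RuntimeError("provider returned a different number of vectors")
        upsert([(mem_id, vec) for (mem_id, _), vec in zip(batch, vectors)])
        total += len(batch)
    return total
```

Injecting the two callables keeps the loop testable without network access: a fake embed function that returns fixed-length vectors is enough to check the batching.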
Check Qdrant collection info:
curl http://localhost:6333/collections/memories | jq '.result.config.params.vectors'
Should show:
{
  "size": 3072,
  "distance": "Cosine"
}
curl -X POST http://localhost:8001/recall \
  -H "Authorization: Bearer $AUTOMEM_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "test recall", "limit": 5}'
Verify results are returned and scores look reasonable.
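The Qdrant check can also be automated, for example in a post-migration script. The sketch below follows the same path as the jq filter (.result.config.params.vectors) and assumes a single unnamed vector configuration, as in the expected output shown. The function names are illustrative.

```python
import json
import urllib.request


def vector_config_matches(collection_info, expected_size, expected_distance="Cosine"):
    """Check the vectors block of a Qdrant collection-info response."""
    try:
        vectors = collection_info["result"]["config"]["params"]["vectors"]
        return (vectors["size"], vectors["distance"]) == (expected_size, expected_distance)
    except (KeyError, TypeError):
        # Missing keys or an unexpected shape count as a failed check.
        return False


def fetch_collection_info(url="http://localhost:6333/collections/memories"):
    """Fetch collection info from Qdrant (URL taken from the curl command in this guide)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Calling vector_config_matches(fetch_collection_info(), 3072) returns True once the collection has been recreated at 3072d.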
# If using Docker
docker compose up -d
# Or rerun the foreground dev stack
make dev
# If using systemd
sudo systemctl restart automem
# If using Railway
railway up
If migration fails or results are poor:
# 1. Stop application
docker compose down # or however you run AutoMem
# 2. Restore from backup
python scripts/restore_from_backup.py backups/qdrant/qdrant_snapshot_YYYYMMDD_HHMMSS.tar.gz
python scripts/restore_from_backup.py backups/falkordb/memories_YYYYMMDD_HHMMSS.rdb
# 3. Revert configuration
export VECTOR_SIZE=768
export EMBEDDING_MODEL=text-embedding-3-small
# 4. Restart
docker compose up -d
When to downgrade: If storage costs are too high or 3072d isn't providing enough value.
Follow the same migration steps above, but use:
export VECTOR_SIZE=768
export EMBEDDING_MODEL=text-embedding-3-small
Then run reembed_embeddings.py to recreate the collection with 768d vectors.
Symptom (only when VECTOR_SIZE_AUTODETECT=false):
FATAL: Vector dimension mismatch detected!
Existing Qdrant collection: 3072d
Configured VECTOR_SIZE: 1024d
Solution (pick one):
- Set VECTOR_SIZE_AUTODETECT=true (default) to automatically adopt the existing collection dimension
- Set VECTOR_SIZE=<existing-dimension> in your .env to match your data
- Migrate to the new dimension: follow the 1024d, 3072d, or 768d migration steps above
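The interaction between VECTOR_SIZE and VECTOR_SIZE_AUTODETECT can be summarized as a small decision function. This is a sketch of the behavior described in this guide, not AutoMem's actual startup code. The function name and the exception class are hypothetical.

```python
from typing import Optional


class VectorDimensionMismatch(RuntimeError):
    """Hypothetical error type used only in this sketch."""


def resolve_vector_size(configured: int, existing: Optional[int], autodetect: bool = True) -> int:
    """Pick the effective vector size at startup.

    configured: value of VECTOR_SIZE
    existing:   dimension of the current Qdrant collection, or None if there is none yet
    autodetect: value of VECTOR_SIZE_AUTODETECT
    """
    if existing is None or existing == configured:
        return configured
    if autodetect:
        return existing  # adopt the existing collection's dimension
    raise VectorDimensionMismatch(
        f"Existing Qdrant collection: {existing}d, configured VECTOR_SIZE: {configured}d"
    )
```

With the defaults, resolve_vector_size(1024, 3072) returns 3072. With autodetect=False the same call raises, which corresponds to the fatal error shown in the symptom.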
Symptom:
Rate limit exceeded during re-embedding
Solution:
The reembed_embeddings.py script uses whatever embedding provider is configured (Voyage, OpenAI, local, etc.). For large datasets:
- Run during off-peak hours
- Increase your provider's rate limits if applicable
- Split migration into batches using the --batch-size flag
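If you call an embedding provider from your own tooling, a common way to handle rate limits is to retry with exponential backoff. The sketch below shows that general technique. It does not describe what reembed_embeddings.py does internally, and RateLimitError is a placeholder for whatever exception your provider's client raises.

```python
import random
import time


class RateLimitError(Exception):
    """Placeholder for the rate-limit exception raised by your provider's client."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries out so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The sleep parameter exists so the retry schedule can be tested without actually waiting.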
Symptom:
Collection 'memories' already exists with different dimension
Solution: Delete and recreate:
curl -X DELETE http://localhost:6333/collections/memories
python scripts/reembed_embeddings.py
Symptoms:
- Taking hours for thousands of memories
- High OpenAI API costs
Solutions:
- Check batch size: Script uses batches of 100 by default
- Parallel processing: Use the --workers flag (if implemented)
- Spot check first: Test on a subset before full migration
- Use cheaper model for testing:
export EMBEDDING_MODEL=text-embedding-3-small
python scripts/reembed_embeddings.py --dry-run
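Before a full run, it can help to estimate how long and how expensive the migration should be, so that a run taking far longer stands out. The sketch below uses this guide's rough rate of 5-10 minutes per 10k memories and the per-1M-token prices from the comparison table. The average token count per memory is your own estimate and is an assumption of this example.

```python
def estimate_migration(num_memories, avg_tokens_per_memory, price_per_million_tokens):
    """Return (min_minutes, max_minutes, cost_usd) for a full re-embed.

    Uses the guide's rough rate of 5-10 minutes per 10k memories.
    avg_tokens_per_memory is a user-supplied estimate, not something AutoMem reports.
    """
    units = num_memories / 10_000
    cost = num_memories * avg_tokens_per_memory / 1_000_000 * price_per_million_tokens
    return 5 * units, 10 * units, cost


# Example: 50k memories, roughly 200 tokens each, text-embedding-3-large pricing.
low, high, cost = estimate_migration(50_000, 200, 0.13)
```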
Symptoms:
- Backup script errors
- Empty backup files
Solutions:
- Check disk space: df -h
- Check permissions: ls -la backups/
- Manual backup:
# FalkorDB
docker exec automem-falkordb-1 redis-cli --rdb /data/dump.rdb
# Qdrant
curl -X POST http://localhost:6333/collections/memories/snapshots
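The disk-space and empty-file checks can be combined into a quick preflight script. This is a minimal sketch using only the Python standard library. The backups/ directory name comes from this guide, and the function names are illustrative.

```python
import shutil
from pathlib import Path


def free_gb(path="."):
    """Free disk space at `path`, in decimal GB."""
    return shutil.disk_usage(path).free / 1e9


def empty_backup_files(backup_dir="backups"):
    """Return files under `backup_dir` (searched recursively) that are zero bytes long."""
    root = Path(backup_dir)
    return sorted(p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0)
```

A non-empty result from empty_backup_files() means at least one backup was written without data and should be regenerated before migrating.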
- ✅ Always back up first - Don't skip this step
- ✅ Test in staging - If you have a staging environment
- ✅ Monitor costs - Check OpenAI usage dashboard during migration
- ✅ Document current state - Note current VECTOR_SIZE and EMBEDDING_MODEL
- ✅ Run benchmark tests - Verify accuracy hasn't degraded
- ✅ Monitor performance - Check search latency and throughput
- ✅ Update documentation - Note when migration occurred and why
- ✅ Store migration record:
curl -X POST http://localhost:8001/memory \
  -H "Authorization: Bearer $AUTOMEM_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Migrated to 3072d embeddings for better semantic precision",
    "tags": ["migration", "config", "embeddings"],
    "importance": 0.8
  }'
Use 768d (text-embedding-3-small) if:
- Cost-conscious deployment
- Storage is limited
- Speed > slight accuracy gains
- Personal/development use
- Small dataset (<100k memories)
Use 3072d (text-embedding-3-large) if:
- Production deployment
- Accuracy is critical
- Complex multi-hop reasoning needed
- Large dataset benefits from precision
- Storage/compute costs are acceptable
- Environment Variables - Configuration reference
- Testing Guide - Benchmark testing
- Monitoring & Backups - Backup strategies
- Railway Deployment - Cloud deployment guide