Multi-agent clinical variant interpretation system using LangGraph and MCP.
Takes genetic variants (VCF or manual entry), queries public databases (ClinVar, gnomAD, Ensembl VEP, PubMed), applies ACMG/AMP classification criteria with chain-of-thought reasoning, and produces structured interpretation reports with full evidence provenance.
graph TD
U[User: CLI / API / Streamlit] --> O
subgraph Orchestrator ["Orchestrator Agent (LangGraph)"]
O[Plan & Route] --> QC
O --> AN
O --> LIT
O --> CL
O --> RV
end
subgraph Agents
QC[QC Agent<br/>MultiQC, flagstat,<br/>coverage thresholds]
AN[Annotation Agent<br/>ClinVar, gnomAD,<br/>Ensembl VEP, UniProt]
LIT[Literature Agent<br/>PubMed + RAG over<br/>ACMG guidelines]
CL[Classification Agent<br/>ACMG rule engine +<br/>chain-of-thought]
RV[Reviewer Agent<br/>Self-evaluation,<br/>contradiction detection]
end
AN --> |MCP| CV[(ClinVar)]
AN --> |MCP| GN[(gnomAD)]
AN --> |MCP| EN[(Ensembl VEP)]
LIT --> PM[(PubMed)]
LIT --> VDB[(ChromaDB<br/>ACMG Guidelines)]
QC --> |QC issues?| O
AN --> |Novel variant?| LIT
CL --> RV
RV --> HITL{Confidence<br/>above threshold?}
HITL --> |Yes| RPT[Structured Report<br/>+ Provenance Trail]
HITL --> |No| HUM[Human Review<br/>Checkpoint]
HUM --> RPT
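The conditional edges in the diagram can be sketched as plain routing functions. This is a minimal illustration, not the project's orchestrator code: `PipelineState`, the function names, the node labels, and the `0.8` threshold are all hypothetical, and the direct hand-off from annotation to classification for known variants is an assumption.

```python
from dataclasses import dataclass

# Hypothetical threshold; the real value is a project configuration choice.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class PipelineState:
    """Hypothetical subset of the state shared between agents."""
    qc_passed: bool
    is_novel: bool = False
    confidence: float = 0.0


def route_after_qc(state: PipelineState) -> str:
    # QC issues go back to the orchestrator, so annotation is skipped.
    return "annotation" if state.qc_passed else "orchestrator"


def route_after_annotation(state: PipelineState) -> str:
    # Novel variants trigger a literature search first.
    # Assumption: known variants go straight to classification.
    return "literature" if state.is_novel else "classification"


def route_after_review(state: PipelineState) -> str:
    # Confidence gate: anything not above the threshold goes to a human.
    if state.confidence > CONFIDENCE_THRESHOLD:
        return "report"
    return "human_review"
```

Functions of this shape (state in, node name out) are the kind of callable a graph framework can use to choose the next node.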
- 6 specialized agents with distinct system prompts, tools, and responsibilities
- Deterministic ACMG rule engine — LLM reasons about criteria; rules enforce correct classification
- Dynamic routing — QC failures skip annotation; novel variants trigger literature search
- Human-in-the-loop — confidence-gated checkpoints for low-confidence classifications
- Self-evaluation — Reviewer Agent cross-checks all conclusions and flags contradictions
- Full provenance — every conclusion traceable to the specific data source and reasoning step
- Reusable MCP servers — ClinVar, gnomAD, Ensembl VEP as standalone tools
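To illustrate the deterministic rule engine idea from the list above, here is a minimal sketch of the ACMG/AMP 2015 combining rules. It is not the contents of `acmg_engine.py`: the `classify` function and its helpers are hypothetical names, and the sketch assumes the LLM has already decided which criteria are met and passes them in as codes such as `"PVS1"` or `"PM2"`.

```python
from collections import Counter

# Criterion-code prefixes mapped to evidence-strength buckets.
_PREFIXES = [("PVS", "pvs"), ("PS", "ps"), ("PM", "pm"), ("PP", "pp"),
             ("BA", "ba"), ("BS", "bs"), ("BP", "bp")]


def _count_by_strength(criteria):
    """Count met criteria per strength bucket; reject unknown codes."""
    counts = Counter()
    for code in set(criteria):
        for prefix, bucket in _PREFIXES:
            if code.upper().startswith(prefix):
                counts[bucket] += 1
                break
        else:
            raise ValueError(f"Unknown criterion code: {code}")
    return counts


def classify(criteria):
    """Combine met criteria into one of the five ACMG/AMP classes."""
    c = _count_by_strength(criteria)
    pvs, ps, pm, pp = c["pvs"], c["ps"], c["pm"], c["pp"]
    ba, bs, bp = c["ba"], c["bs"], c["bp"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Contradictory pathogenic and benign evidence is reported as uncertain.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "Uncertain significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "Uncertain significance"
```

Because the function is pure, the same set of criteria always yields the same class, whatever the LLM's reasoning text looked like on a given run.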
# Clone and install
git clone https://github.com/deepmind11/variantagent.git
cd variantagent
cp .env.example .env # Add your API keys
make dev
# Analyze a variant
variantagent analyze "chr17:7674220 G>A"
# Or use Docker
docker compose up

| Component | Technology |
|---|---|
| Agent framework | LangGraph |
| Tool protocol | MCP (Model Context Protocol) |
| LLM | Claude / GPT-4o (configurable) |
| API backend | FastAPI |
| RAG | ChromaDB + sentence-transformers |
| Data contracts | Pydantic v2 |
| Testing | pytest + VCR.py + deepeval |
| CI/CD | GitHub Actions |
| Observability | LangSmith |
variantagent/
├── src/variantagent/
│ ├── agents/ # 6 LangGraph agents
│ │ ├── orchestrator.py
│ │ ├── qc_agent.py
│ │ ├── annotation_agent.py
│ │ ├── literature_agent.py
│ │ ├── classification_agent.py
│ │ └── reviewer_agent.py
│ ├── tools/ # Parsers and rule engines
│ │ ├── vcf_parser.py
│ │ ├── flagstat_parser.py
│ │ ├── multiqc_parser.py
│ │ └── acmg_engine.py
│ ├── mcp_servers/ # Standalone MCP servers
│ │ ├── clinvar_server.py
│ │ ├── gnomad_server.py
│ │ └── ensembl_vep_server.py
│ ├── models/ # Pydantic data contracts
│ ├── api/ # FastAPI REST API
│ ├── config.py
│ └── cli.py
├── tests/
│ ├── unit/
│ ├── integration/
│ └── e2e/
├── data/
│ ├── test_samples/ # Synthetic test data
│ ├── knowledge_base/ # ACMG guidelines for RAG
│ └── benchmarks/ # ClinGen evaluation data
├── docs/architecture/decisions/
├── streamlit/
├── pyproject.toml
├── Dockerfile
└── docker-compose.yml
make dev # Install with dev dependencies
make test # Run all tests
make lint # Lint with ruff
make typecheck # Type check with mypy
make ci # Run full CI pipeline locally
make serve # Start FastAPI dev server
make streamlit # Start Streamlit UI

All data sources are public and free:
- ClinVar — Variant-disease assertions
- gnomAD — Population allele frequencies
- Ensembl VEP — Variant consequence prediction
- UniProt — Protein functional annotation
- PubMed — Scientific literature
- ACMG/AMP 2015 — Classification guidelines
- Not for clinical use. This is a research/educational tool. Clinical variant interpretation requires validated, accredited systems.
- Subset of ACMG criteria. Implements ~17 of 28 ACMG evidence criteria. Functional studies (PS3) and segregation (PP1/BS4) require data not available from public APIs.
- LLM reasoning is non-deterministic. The same variant may receive slightly different criterion assessments across runs. The deterministic rule engine ensures classification consistency given the same criteria.
- Rate-limited by public APIs. NCBI E-utilities allow 3 requests/second without an API key (10/sec with key). Batch analysis of large VCFs will be slow.
- No somatic variant interpretation. This system follows germline ACMG/AMP guidelines. Somatic interpretation (AMP/ASCO/CAP 2017) is a different framework.
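
The NCBI rate limits mentioned above can be respected with a small client-side throttle. The sketch below is illustrative only: `RateLimiter` is a hypothetical helper, not part of this project, and the injectable `clock` and `sleep` parameters exist only to make it testable.

```python
import time


class RateLimiter:
    """Space calls at least 1/max_per_second seconds apart (single-threaded)."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # first call never waits

    def wait(self):
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot relative to when this call proceeds.
        now = max(now, self._next_allowed)
        self._next_allowed = now + self.min_interval
        return max(delay, 0.0)


# 3 requests/second without an NCBI API key, 10 with one.
limiter = RateLimiter(3)
```

Calling `limiter.wait()` before each E-utilities request keeps a sequential batch under the limit. At 3 requests/second, a VCF with 3,000 variants needs at least about 1,000 seconds for one request per variant.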
MIT