VariantAgent

Multi-agent clinical variant interpretation system using LangGraph and MCP.

Takes genetic variants (VCF or manual entry), queries public databases (ClinVar, gnomAD, Ensembl VEP, PubMed), applies ACMG/AMP classification criteria with chain-of-thought reasoning, and produces structured interpretation reports with full evidence provenance.

Architecture

graph TD
    U[User: CLI / API / Streamlit] --> O

    subgraph Orchestrator ["Orchestrator Agent (LangGraph)"]
        O[Plan & Route] --> QC
        O --> AN
        O --> LIT
        O --> CL
        O --> RV
    end

    subgraph Agents
        QC[QC Agent<br/>MultiQC, flagstat,<br/>coverage thresholds]
        AN[Annotation Agent<br/>ClinVar, gnomAD,<br/>Ensembl VEP, UniProt]
        LIT[Literature Agent<br/>PubMed + RAG over<br/>ACMG guidelines]
        CL[Classification Agent<br/>ACMG rule engine +<br/>chain-of-thought]
        RV[Reviewer Agent<br/>Self-evaluation,<br/>contradiction detection]
    end

    AN --> |MCP| CV[(ClinVar)]
    AN --> |MCP| GN[(gnomAD)]
    AN --> |MCP| EN[(Ensembl VEP)]
    LIT --> PM[(PubMed)]
    LIT --> VDB[(ChromaDB<br/>ACMG Guidelines)]

    QC --> |QC issues?| O
    AN --> |Novel variant?| LIT
    CL --> RV
    RV --> HITL{Confidence<br/>above threshold?}
    HITL --> |Yes| RPT[Structured Report<br/>+ Provenance Trail]
    HITL --> |No| HUM[Human Review<br/>Checkpoint]
    HUM --> RPT
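
The labeled edges in the diagram can be read as a small routing function. The sketch below is a plain-Python illustration, not the project's LangGraph code: the `RunState` fields, the node names, and the `CONFIDENCE_THRESHOLD` value are all assumptions made for the example, and it simplifies the orchestrator's dispatch into a linear hand-off between agents.

```python
from dataclasses import dataclass

# Hypothetical value; the README does not state the actual threshold.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class RunState:
    """Minimal stand-in for the shared graph state (field names are assumptions)."""
    qc_issues: bool = False
    novel_variant: bool = False
    confidence: float = 0.0


def next_node(current: str, state: RunState) -> str:
    """Return the node that follows `current`, mirroring the diagram's labeled edges."""
    if current == "qc":
        # QC issues go back to the orchestrator instead of on to annotation.
        return "orchestrator" if state.qc_issues else "annotation"
    if current == "annotation":
        # Novel variants trigger a literature search before classification.
        return "literature" if state.novel_variant else "classification"
    if current == "literature":
        return "classification"
    if current == "classification":
        return "reviewer"
    if current == "reviewer":
        # Confidence gate: low-confidence results stop at a human checkpoint.
        return "report" if state.confidence > CONFIDENCE_THRESHOLD else "human_review"
    if current == "human_review":
        return "report"
    raise ValueError(f"unknown node: {current!r}")


def trace(state: RunState, start: str = "qc") -> list[str]:
    """Follow the edges from `start` until the run reports or returns to the orchestrator."""
    path = [start]
    while path[-1] not in ("report", "orchestrator"):
        path.append(next_node(path[-1], state))
    return path
```

For example, `trace(RunState(novel_variant=True, confidence=0.5))` passes through the literature step and ends at the human review checkpoint before the report.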

Features

6 specialized agents with distinct system prompts, tools, and responsibilities
Deterministic ACMG rule engine — LLM reasons about criteria; rules enforce correct classification
Dynamic routing — QC failures skip annotation; novel variants trigger literature search
Human-in-the-loop — confidence-gated checkpoints for low-confidence classifications
Self-evaluation — Reviewer Agent cross-checks all conclusions and flags contradictions
Full provenance — every conclusion traceable to the specific data source and reasoning step
Reusable MCP servers — ClinVar, gnomAD, Ensembl VEP as standalone tools
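
To illustrate what a deterministic rule engine does once the criteria have been assessed, here is a minimal sketch of the ACMG/AMP 2015 combining rules. It is not the contents of `acmg_engine.py`: the function names are hypothetical, criterion codes are assumed to be plain strings such as `"PVS1"` or `"PM2"`, and strength modifiers (for example a criterion applied at a downgraded strength) are not handled.

```python
from collections import Counter


def strength(code: str) -> str:
    """Strip the trailing number: 'PVS1' -> 'PVS', 'PM2' -> 'PM', 'BA1' -> 'BA'."""
    return code.rstrip("0123456789")


def classify(criteria: list[str]) -> str:
    """Combine met criteria into a five-tier class (ACMG/AMP 2015 combining rules)."""
    n = Counter(strength(c) for c in set(criteria))
    pvs, ps, pm, pp = n["PVS"], n["PS"], n["PM"], n["PP"]
    ba, bs, bp = n["BA"], n["BS"], n["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Conflicting pathogenic and benign evidence falls back to uncertain significance.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "Uncertain significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "Uncertain significance"
```

Because the function depends only on the set of criteria, two runs that agree on the criteria always agree on the classification, even if the LLM's wording of the reasoning differs.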

Quick Start

# Clone and install
git clone https://github.com/deepmind11/variantagent.git
cd variantagent
cp .env.example .env  # Add your API keys
make dev

# Analyze a variant
variantagent analyze "chr17:7674220 G>A"

# Or use Docker
docker compose up
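
The manual-entry format used above (`chr17:7674220 G>A`) is simple enough to parse with a regular expression. The following is only a sketch of how such a string could be split into its parts; the `Variant` type and `parse_variant` function are hypothetical names, and the actual CLI may accept other notations or validate differently.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


# Optional "chr" prefix, a chromosome name, a 1-based position, then REF>ALT.
_VARIANT_RE = re.compile(
    r"^(?:chr)?([0-9]{1,2}|X|Y|MT?)\s*:\s*([0-9]+)\s+([ACGT]+)\s*>\s*([ACGT]+)$",
    re.IGNORECASE,
)


def parse_variant(text: str) -> Variant:
    """Parse 'chr17:7674220 G>A' into a Variant, normalizing case and the chr prefix."""
    match = _VARIANT_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized variant string: {text!r}")
    chrom, pos, ref, alt = match.groups()
    return Variant(f"chr{chrom.upper()}", int(pos), ref.upper(), alt.upper())
```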

Tech Stack

Component	Technology
Agent framework	LangGraph
Tool protocol	MCP (Model Context Protocol)
LLM	Claude / GPT-4o (configurable)
API backend	FastAPI
RAG	ChromaDB + sentence-transformers
Data contracts	Pydantic v2
Testing	pytest + VCR.py + deepeval
CI/CD	GitHub Actions
Observability	LangSmith

Project Structure

variantagent/
├── src/variantagent/
│   ├── agents/           # 6 LangGraph agents
│   │   ├── orchestrator.py
│   │   ├── qc_agent.py
│   │   ├── annotation_agent.py
│   │   ├── literature_agent.py
│   │   ├── classification_agent.py
│   │   └── reviewer_agent.py
│   ├── tools/            # Parsers and rule engines
│   │   ├── vcf_parser.py
│   │   ├── flagstat_parser.py
│   │   ├── multiqc_parser.py
│   │   └── acmg_engine.py
│   ├── mcp_servers/      # Standalone MCP servers
│   │   ├── clinvar_server.py
│   │   ├── gnomad_server.py
│   │   └── ensembl_vep_server.py
│   ├── models/           # Pydantic data contracts
│   ├── api/              # FastAPI REST API
│   ├── config.py
│   └── cli.py
├── tests/
│   ├── unit/
│   ├── integration/
│   └── e2e/
├── data/
│   ├── test_samples/     # Synthetic test data
│   ├── knowledge_base/   # ACMG guidelines for RAG
│   └── benchmarks/       # ClinGen evaluation data
├── docs/architecture/decisions/
├── streamlit/
├── pyproject.toml
├── Dockerfile
└── docker-compose.yml

Development

make dev          # Install with dev dependencies
make test         # Run all tests
make lint         # Lint with ruff
make typecheck    # Type check with mypy
make ci           # Run full CI pipeline locally
make serve        # Start FastAPI dev server
make streamlit    # Start Streamlit UI

Data Sources

All data sources are public and free:

ClinVar — Variant-disease assertions
gnomAD — Population allele frequencies
Ensembl VEP — Variant consequence prediction
UniProt — Protein functional annotation
PubMed — Scientific literature
ACMG/AMP 2015 — Classification guidelines

Limitations

Not for clinical use. This is a research/educational tool. Clinical variant interpretation requires validated, accredited systems.
Subset of ACMG criteria. Implements ~17 of 28 ACMG evidence criteria. Functional studies (PS3) and segregation (PP1/BS4) require data not available from public APIs.
LLM reasoning is non-deterministic. The same variant may receive slightly different criterion assessments across runs. The deterministic rule engine guarantees only that an identical set of criteria always yields the same classification.
Rate-limited by public APIs. NCBI E-utilities allow 3 requests/second without an API key (10 requests/second with a key). Batch analysis of large VCFs will be slow.
No somatic variant interpretation. This system follows germline ACMG/AMP guidelines. Somatic interpretation (AMP/ASCO/CAP 2017) is a different framework.

License

MIT
