PolyDDI is an open-source PyTorch framework for predicting emergent drug-drug-drug interactions — adverse effects that arise only when three or more drugs are taken together. It uses Graph Neural Networks (GraphSAGE) trained on DrugBank pairwise DDI data and mines higher-order interaction signals from FDA FAERS adverse-event reports.
Under an honest evaluation protocol (drug-level split, hard negatives, triplet-only negative pool, 3-seed mean) the best v2 configuration reaches AUROC 0.660 ± 0.007, within the error bar of a supervised Morgan-fingerprint RandomForest baseline. Earlier quoted numbers above 0.76 were measured under an easier negative pool; see docs/PAPER_CLAIMS.md and docs/KNOWN_LIMITATIONS.md for the full investigation.
| Method | Test AUROC |
|---|---|
| Tanimoto similarity (best agg) | 0.5255 ± 0.0032 |
| Pretrained pairwise GNN aggregation (M2 SAGE) | 0.5704 ± 0.0094 |
| Finetuned-encoder pairwise aggregation | 0.5858 ± 0.0017 |
| Learned MLP over sorted pair logits | 0.6175 ± 0.0163 |
| Morgan FP + RandomForest | 0.6480 ± 0.0151 |
| PolyDDI v2 (finetuned_symmetric + hard-neg) | 0.6601 ± 0.0069 |
All numbers come from `results/summary_table.csv` (reproduce with `python scripts/aggregate_results.py`). The full per-configuration breakdown and the original-vs-honest pool comparison are in `results/v2_triplet_only_reeval/comparison.csv`.
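The `mean ± std` entries in the table are 3-seed aggregates. A minimal sketch of that aggregation (the function name and formatting are illustrative, not the actual `scripts/aggregate_results.py` API):

```python
from statistics import mean, stdev

def aggregate_seeds(aurocs: list[float]) -> str:
    """Summarize per-seed test AUROCs as 'mean ± std', as in the results table.

    Illustrative helper, not the repo's aggregation code.
    """
    return f"{mean(aurocs):.4f} ± {stdev(aurocs):.4f}"

# e.g. three hypothetical seeds for one configuration
print(aggregate_seeds([0.6531, 0.6601, 0.6671]))  # -> 0.6601 ± 0.0070
```

Note that with only 3 seeds the sample standard deviation is itself noisy, which is why the v2-vs-Morgan+RF gap below is treated as within the error bar.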
- Under the honest protocol, six qualitatively different approaches — from structural similarity to a trained triplet-specific GNN — converge within a 0.14pt AUROC band. The v2-vs-Morgan+RF gap (+0.012pt) is within Morgan+RF's 3-seed error bar.
- The negative candidate pool matters a lot. Switching from the all-graph pool (1,706 drugs) used during v2 training to the triplet-only pool (260 drugs) used by every baseline lowers reported v2 AUROC by 0.103pt for the headline config, and by up to 0.346pt for random-negative-trained configs. Previously quoted "ceiling" numbers (0.988, 0.941, 0.763) were all inflated by the easier pool. See `docs/KNOWN_LIMITATIONS.md` §"Evaluation pool asymmetry".
- Encoder fine-tuning contributes +0.065-0.098pt over a frozen pretrained encoder under the honest protocol (Claim 03 in PAPER_CLAIMS.md).
- The attention-head variant (`finetuned_attention_triplet_hard`) overfits training-time all-graph hard negatives: training val AUROC reaches 0.955 while honest test AUROC falls to 0.493, a concrete counter-example to using training-time validation for model selection (Claim 04).
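The pool effect can be illustrated in miniature: negatives are built by corrupting a positive triplet with a drug drawn from a candidate pool, and a large all-graph pool yields easier, more obviously implausible corruptions than the triplet-only pool. A sketch of pool-based corruption (helper names and the rejection rules are assumptions, not the repo's sampler):

```python
import random

def corrupt_triplet(triplet, pool, positives, rng):
    """Swap one drug of a positive triplet for a pool drug, rejecting
    duplicate drugs and accidental positives.

    Hypothetical helper for illustration, not the repo's code.
    """
    while True:
        cand = list(triplet)
        cand[rng.randrange(3)] = rng.choice(pool)
        key = frozenset(cand)
        if len(key) == 3 and key not in positives:
            return tuple(cand)

rng = random.Random(42)
positives = {frozenset("abc"), frozenset("abd")}
triplet_pool = list("abcde")                                  # drugs seen in triplets (260 in the paper)
all_graph_pool = triplet_pool + [f"x{i}" for i in range(20)]  # full DDI graph (1,706 in the paper)

neg = corrupt_triplet(("a", "b", "c"), triplet_pool, positives, rng)
```

Scoring one model against negatives drawn from one pool while every baseline uses the other makes the AUROCs incomparable, which is exactly the asymmetry the honest protocol removes.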
- Statistically significant emergent 3-drug interaction signals can be mined automatically from FAERS.
- GNN embeddings over the pairwise DDI graph partially predict 3-way interactions (honest AUROC ~0.66).
- 3-way interaction prediction is intrinsically hard: GNN-based, fingerprint-based, and aggregation-based approaches all converge to a narrow performance band (≤0.14pt) under the honest protocol.
- The choice of evaluation negative pool can inflate reported AUROC by up to 0.35pt, an effect that must be accounted for when comparing against prior work.
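Signal mining in FAERS is typically based on disproportionality: how much more often an adverse event appears among reports mentioning all three drugs than expected from its background rate. The sketch below shows a generic observed/expected reporting ratio; the actual statistic and thresholds used by `03_mine_faers.py` may differ:

```python
def reporting_ratio(n_triple_event: int, n_triple: int,
                    n_event: int, n_total: int) -> float:
    """Observed vs expected rate of an adverse event among reports
    mentioning all three drugs together.

    A generic disproportionality score for illustration; not
    necessarily the statistic the repo's mining step uses.
    """
    observed = n_triple_event / n_triple   # event rate under triple exposure
    expected = n_event / n_total           # background event rate
    return observed / expected

# 25 of 100 triple-exposure reports show the event vs a 12.5% background rate
print(reporting_ratio(25, 100, 12_500, 100_000))  # -> 2.0
```

A ratio well above 1 (with a significance test on the underlying counts) marks a candidate emergent signal.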
- Transductive: novel molecules absent from the graph cannot be scored.
- FAERS ground truth is noisy: observation bias, reporting bias, confounding.
- The drug-level split yields zero S1-scenario cases, a limitation of the split construction.
- Evaluation pool asymmetry: v2 training-time numbers (all-graph pool) differ from the numbers in this paper's table (triplet-only pool). See `docs/KNOWN_LIMITATIONS.md` §"Evaluation pool asymmetry".
- Hard-negative structural bias: a 1-drug-swap hard negative keeps one of the positive's three pairs intact (and two of its three drugs), an intrinsic bias for pairwise aggregation (`docs/KNOWN_LIMITATIONS.md` §"Hard-negative structural bias").
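The structural bias of 1-drug-swap hard negatives can be checked directly: the corrupted triplet keeps two of the positive's three drugs, so exactly one of its three unordered pairs is identical to a pair from the positive. A minimal check (drug names illustrative):

```python
from itertools import combinations

def shared_pairs(pos: tuple, neg: tuple) -> int:
    """Count unordered drug pairs a swapped negative shares with its positive."""
    as_pairs = lambda t: {frozenset(p) for p in combinations(t, 2)}
    return len(as_pairs(pos) & as_pairs(neg))

pos = ("warfarin", "amiodarone", "aspirin")
neg = ("warfarin", "amiodarone", "ibuprofen")  # 1-drug swap on the third slot
print(shared_pairs(pos, neg))  # -> 1
```

Any model that aggregates pairwise scores therefore sees one pair input unchanged between positive and hard negative, and must separate the two triplets from the remaining pairs alone.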
Extension to an inductive molecular encoder (SMILES/atom-level GNN), re-benchmarked under the honest evaluation protocol established here.
```bash
git clone https://github.com/brianyu43/polyddi-v2.git
cd polyddi-v2

# conda environment
conda create -n polyddi python=3.11
conda activate polyddi
pip install torch==2.6.0+cpu -f https://download.pytorch.org/whl/cpu
pip install torch-geometric==2.7.0 PyTDC==1.1.15 "setuptools<81"
pip install pandas pyarrow pyyaml matplotlib
```

```bash
# Download FAERS raw data (8 quarters) into data/faers/raw/
# Then run the full pipeline:
python scripts/reproduce.py --seed 42 --faers-raw-dir data/faers/raw

# With skip for already-completed steps:
python scripts/reproduce.py --seed 42 --faers-raw-dir data/faers/raw --skip-existing
```

```bash
# Prepare DrugBank pairwise graph
python scripts/01_prepare_data.py --seed 42

# Build drug name vocabulary
python scripts/00_build_drugbank_vocabulary.py --drugbank-dir artifacts/data/drugbank/seed-42

# Mine FAERS signals (per quarter)
python scripts/03_mine_faers.py --config configs/data/faers.yaml --quarters 2023Q1 --seed 42

# Map to DrugBank space
python scripts/04_map_drugs.py --config configs/data/faers.yaml \
    --signal-dir artifacts/faers/signals/2023Q1/seed-42 \
    --drugbank-dir artifacts/data/drugbank/seed-42 --seed 42

# Merge all quarters
python scripts/05_merge_mapped_triplets.py --seed 42

# Prepare triplet dataset
python scripts/06_prepare_triplets.py \
    --merged-dir artifacts/faers/merged/seed-42 \
    --drugbank-dir artifacts/data/drugbank/seed-42 \
    --output-dir artifacts/faers/triplet_dataset/seed-42 --seed 42

# Train pairwise encoder
python scripts/02_train_pairwise.py --config configs/models/sage_pairwise.yaml --seed 42

# Train triplet model
python scripts/07_train_triplet.py \
    --config configs/models/finetuned_symmetric_triplet_hard.yaml \
    --triplet-dir artifacts/faers/triplet_dataset/seed-42 \
    --drugbank-dir artifacts/data/drugbank/seed-42 --seed 42
```

```
DrugBank (TDC)            FAERS (FDA)
      |                        |
01_prepare_data        03_mine_faers (x8 quarters)
      |                        |
00_build_vocabulary    04_map_drugs (x8 quarters)
      |                        |
      |                05_merge_mapped_triplets
      |                        |
      |                06_prepare_triplets
      |                        |
02_train_pairwise              |
      |                        |
      +-----------+------------+
                  |
       07_train_triplet (x6 configs)
                  |
       08_permutation_test
       09_analyze_results
       10_summary_table
       11_case_study
```
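Judging from the config names, the `*_symmetric` triplet heads in step 07 score a candidate by combining the three pairwise scores in an order-invariant way. A minimal sketch of that idea (the mean aggregation and all names here are assumptions, not the repo's exact architecture):

```python
from itertools import combinations

def score_triplet(pair_logit: dict, triplet: tuple) -> float:
    """Permutation-invariant triplet score: mean of the three pairwise logits.

    A simple symmetric aggregation for illustration; the repo's learned
    head may combine pair representations differently.
    """
    pairs = [frozenset(p) for p in combinations(triplet, 2)]
    return sum(pair_logit[p] for p in pairs) / len(pairs)

# toy pairwise logits from a hypothetical pairwise encoder
logits = {
    frozenset({"a", "b"}): 2.0,
    frozenset({"a", "c"}): 1.0,
    frozenset({"b", "c"}): 0.0,
}
print(score_triplet(logits, ("a", "b", "c")))  # -> 1.0
```

Because the pairs are treated as a set, reordering the drugs in the triplet cannot change the score, which is the property "symmetric" refers to.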
```
artifacts/
  data/drugbank/seed-42/              # Pairwise graph, features, splits, vocabulary
  faers/
    signals/{quarter}/seed-42/        # Per-quarter FAERS signal mining results
    mapped/{quarter}/seed-42/         # Drug-mapped triplets per quarter
    merged/seed-42/                   # Merged triplets across quarters
    triplet_dataset/seed-42/          # Final triplet dataset (drug_level split)
    triplet_dataset_random/seed-42/   # Random split variant
  runs/
    pairwise/sage/seed-42/            # Pairwise SAGE encoder checkpoint
    triplet/
      finetuned_symmetric/seed-42/        # Best config (random neg)
      finetuned_symmetric_hard/seed-42/   # Best config (hard neg)
      frozen_symmetric/seed-42/
      frozen_symmetric_hard/seed-42/
      finetuned_attention/seed-42/
      frozen_attention/seed-42/
      finetuned_symmetric_random_split/seed-42/
  analysis/
    permutation/            # Permutation test results
    gap_analysis.json       # Val-test gap analysis
    learning_curves.png     # Training curves
    summary_table.csv       # All results in one table
    case_studies.json       # Top TP/FP/FN case studies
```
- DrugBank (via TDC) -- pairwise DDI graph with 1,706 drugs and 191,808 edges
- DrugBank Open Vocabulary -- drug name/synonym mapping for FAERS linkage
- FDA FAERS -- 8 quarters (2023Q1-2024Q4) of adverse event reports
```bibtex
@misc{polyddi2026,
  title = {PolyDDI: Higher-Order Drug-Drug Interaction Prediction
           with Graph Neural Networks},
  year  = {2026},
  url   = {https://github.com/brianyu43/polyddi-v2}
}
```

MIT License