End-to-end repository for extracting features from smile videos, finger-tapping videos, and speech recordings (the "quick brown fox" task); training unimodal and uncertainty-aware fusion models; and running paper-ready analyses and figures.
- Project Layout
- Quick Start
- Data Layout
- Feature Extraction
- Unimodal Models
- Fusion Models
- Analysis & Figures
- Pretrained Models
- Reproducibility Tips
- Citations
- License
## Project Layout

```text
.
├── code
│   ├── analyses
│   ├── feature_extraction_pipeline
│   │   ├── facial_expression_smile
│   │   ├── finger_tapping
│   │   └── quick_brown_fox
│   ├── fusion_model
│   └── unimodal_models
├── data
│   ├── facial_expression_smile
│   ├── finger_tapping
│   └── quick_brown_fox
├── environment.yml
└── models
```
Key highlights:
- `code/feature_extraction_pipeline/*` — task-specific feature extraction.
- `code/unimodal_models/*` — per-modality model training/inference.
- `code/fusion_model/` — uncertainty-aware fusion (all three tasks/modalities).
- `code/analyses/` — metrics, stats, demographics, figures (paper-ready).
- `data/` — canonical CSVs for each modality + fusion splits.
- `models/` — saved scalers and model checkpoints per modality + fusion.
## Quick Start

Use the YAML file `environment.yml` to create a conda environment called `park`.
```bash
# From the repository root where environment.yml lives

# Create a new environment (override the name if you want)
conda env create -f environment.yml -n park

# Activate it
conda activate park

# If the environment already exists and you want to update it
# (this prunes removed deps and adds new ones)
conda env update -f environment.yml --prune
```

## Data Layout

Canonical CSVs and metadata live in `data/`:
- Fusion: `full_fusion_dataset.csv`, `train_val_data_to_save.csv`, `test_data_to_save.csv`, `test_df_*_to_save.csv`, `predictions.csv`
- Demographics/labels: `demographic_table.csv`, `metadata_*.csv`, `labeling_neurologists_PARK.csv`, `df_stage_data.csv`
- Per-modality:
  - Smile: `data/facial_expression_smile/data_smile.csv`
  - Finger tapping: `data/finger_tapping/data_finger_tapping.csv`
  - Speech: `data/quick_brown_fox/data_speech.csv`
- Splits:
  - Dev/Test IDs: `dev_set_participants.txt`, `test_set_participants*.txt`
Tip: Keep file paths consistent with these CSVs to enable plug‑and‑play training across scripts.
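As a quick orientation sketch (not part of the repo), the canonical CSVs can be combined with the split files roughly as follows; the participant ID column name and the exact location of the split files are assumptions, so check the actual headers:

```python
# Minimal sketch: load a per-modality CSV and split it by the dev participant list.
# "participant_id" and the split-file location are assumptions.
import pandas as pd

df = pd.read_csv("data/facial_expression_smile/data_smile.csv")

with open("data/dev_set_participants.txt") as f:
    dev_ids = {line.strip() for line in f if line.strip()}

dev_df = df[df["participant_id"].isin(dev_ids)]
test_df = df[~df["participant_id"].isin(dev_ids)]
print(f"dev rows: {len(dev_df)}, held-out rows: {len(test_df)}")
```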
## Feature Extraction

All task pipelines live under `code/feature_extraction_pipeline/`.

### Smile
- Script: `code/feature_extraction_pipeline/facial_expression_smile/smile_feature_extraction.py`
- Sample Input: `code/feature_extraction_pipeline/facial_expression_smile/raw_videos/SMILE.mp4`
- Output Folder: `code/feature_extraction_pipeline/facial_expression_smile/extracted_features/`
- Notes:
  - Extracts framewise facial landmarks/AUs and aggregates per-video features.
  - See the README in the same folder for task-specific usage details.
Run:

```bash
python smile_feature_extraction.py --video_files_directory=raw_videos --openface_files_directory=openface_extracts --output_dir=extracted_features 2>/dev/null
```
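For intuition only, the per-video aggregation step might look roughly like the sketch below; the actual feature set is defined in `smile_feature_extraction.py`, and the OpenFace CSV path and column naming here are assumptions:

```python
# Illustrative sketch: aggregate framewise OpenFace AU intensities into per-video statistics.
import pandas as pd

frames = pd.read_csv("openface_extracts/SMILE.csv")   # assumed OpenFace output for SMILE.mp4
frames.columns = frames.columns.str.strip()           # OpenFace headers are often space-padded

au_cols = [c for c in frames.columns if c.endswith("_r")]  # AU intensity columns (e.g., AU12_r)
video_features = pd.concat([
    frames[au_cols].mean().add_suffix("_mean"),
    frames[au_cols].std().add_suffix("_std"),
])
print(video_features)
```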
### Finger Tapping

- Script: `code/feature_extraction_pipeline/finger_tapping/feature_extraction.py`
- Samples: `code/feature_extraction_pipeline/finger_tapping/sample_videos/*.mp4`
- Outputs: `code/feature_extraction_pipeline/finger_tapping/sample_outputs/`
  - Contains intermediate features (`*.pkl`) and annotated videos (`output.mp4`), plus placeholders for MediaPipe annotations.
- Notes:
  - Produces a consolidated `features.csv` suitable for model training.
  - See the README in the same folder for task-specific usage details.
Run:

```bash
python feature_extraction.py --folder sample_videos/ --output outputs/
```
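As a conceptual sketch of what the MediaPipe-based pipeline measures, the per-frame thumb-index distance (a common basis for tapping speed/amplitude features) can be obtained like this; the video filename is hypothetical and the repo's `feature_extraction.py` implements the real feature set:

```python
# Illustrative sketch: per-frame thumb-index distance from MediaPipe Hands.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
cap = cv2.VideoCapture("sample_videos/example.mp4")   # hypothetical sample video

distances = []
with mp_hands.Hands(static_image_mode=False, max_num_hands=1) as hands:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.multi_hand_landmarks:
            lm = result.multi_hand_landmarks[0].landmark
            thumb = lm[mp_hands.HandLandmark.THUMB_TIP]
            index = lm[mp_hands.HandLandmark.INDEX_FINGER_TIP]
            distances.append(((thumb.x - index.x) ** 2 + (thumb.y - index.y) ** 2) ** 0.5)
cap.release()
print(f"{len(distances)} frames with a detected hand")
```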
### Speech (Quick Brown Fox)

- Core Scripts:
  - `code/feature_extraction_pipeline/quick_brown_fox/video_preprocess.py` — standardizes audio from input videos.
  - `code/feature_extraction_pipeline/quick_brown_fox/extract_wavlm_features.py` — extracts WavLM embeddings and exports `wavlm_features.csv`.
- Helpers: `helpers.py`, `speech_utils.py`
- Sample Data: `code/feature_extraction_pipeline/quick_brown_fox/sample_data/`
  - Includes `QUICK_BROWN_FOX.mp4`, standardized outputs, and an example `wavlm_features.csv`.
- Notes: See the README in the same folder for task-specific usage details.
Run:

```bash
# (1) Optional preprocessing (normalize audio/video)
python video_preprocess.py --file_path path/to/your.mp4

# (2) Extract WavLM features
# This script uses `sample_data/` as both its input and output directory
python extract_wavlm_features.py
```
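For reference, WavLM embeddings of the kind exported to `wavlm_features.csv` are typically computed along these lines with Hugging Face `transformers`; the checkpoint name, audio path, and mean-pooling here are assumptions, and the repo's script may pool or post-process differently:

```python
# Illustrative sketch: mean-pooled WavLM embedding for one audio clip.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, WavLMModel

extractor = Wav2Vec2FeatureExtractor.from_pretrained("microsoft/wavlm-base-plus")
model = WavLMModel.from_pretrained("microsoft/wavlm-base-plus").eval()

waveform, sr = torchaudio.load("sample_data/QUICK_BROWN_FOX.wav")   # hypothetical standardized audio
waveform = torchaudio.functional.resample(waveform.mean(dim=0), sr, 16000)

inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state    # (1, frames, hidden_dim)
embedding = hidden.mean(dim=1).squeeze(0)         # one fixed-size vector per clip
print(embedding.shape)
```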
## Unimodal Models

Each modality has a BAAL-style unimodal trainer in `code/unimodal_models/`:

- Smile:
  - Code: `code/facial_expression_smile/unimodal_smile_baal.py`
  - Config: `code/facial_expression_smile/constants_baal.py`
  - Data: `data/facial_expression_smile/data_smile.csv`
- Finger:
  - Code: `code/finger_tapping/unimodal_finger_baal.py`
  - Config: `code/finger_tapping/constants_baal.py`
  - Data: `data/finger_tapping/data_finger_tapping.csv`
- Speech:
  - Code: `code/quick_brown_fox/unimodal_fox_wavlm_baal.py`
  - Config: `code/quick_brown_fox/constants_baal.py`
  - Data: `data/quick_brown_fox/data_speech.csv`
Usage (the best-performing hyperparameters are already set inside each script):

```bash
# Smile
python unimodal_smile_baal.py

# Finger
python unimodal_finger_baal.py

# Speech
python unimodal_fox_wavlm_baal.py
```

Artifacts are saved under `models/<modality>_{scaler,predictive_model,residual_model}/`.
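The trainers are BAAL-style, meaning predictive uncertainty is estimated via stochastic forward passes (e.g., Monte Carlo dropout). A generic plain-PyTorch sketch of that idea, not the repo's exact mechanism:

```python
# Illustrative sketch: MC-dropout predictive mean and an uncertainty proxy
# for a trained binary classifier that outputs a single logit.
import torch

def mc_dropout_predict(model: torch.nn.Module, x: torch.Tensor, n_samples: int = 20):
    """Keep dropout active at inference and average stochastic forward passes."""
    model.train()                      # enables dropout layers
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(model(x)) for _ in range(n_samples)])
    return probs.mean(dim=0), probs.std(dim=0)   # prediction, uncertainty proxy
```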
## Fusion Models

Uncertainty-aware late fusion combines unimodal predictions and, optionally, their uncertainty estimates.

- Core module: `code/fusion_model/uncertainty_aware_fusion_wavlm.py`
- Configs/consts: `code/fusion_model/constants.py`
- Artifacts:
  - Trained fusion checkpoint: `models/uncertainty_aware_fusion/model.pth`
  - Config: `models/uncertainty_aware_fusion/model_config.json`
  - Uncertain indices: `code/fusion_model/uncertain_indices.csv`
Typical usage:

```bash
python uncertainty_aware_fusion_wavlm.py
```

Notes:

- The fusion expects consistent participant/session keys across CSVs.
- At runtime, the fusion script saves several files that are later required for different analyses.
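Conceptually, uncertainty-aware late fusion gives less weight to modalities that are uncertain for a given participant. The repo's fusion model is learned from data; the toy inverse-variance weighting below only illustrates the intuition:

```python
# Illustrative sketch: inverse-variance weighting of per-modality predictions
# for a single participant (toy example, not the repo's learned fusion).
import numpy as np

def fuse(predictions: np.ndarray, uncertainties: np.ndarray) -> float:
    """predictions/uncertainties: one value per modality for one participant."""
    weights = 1.0 / (uncertainties ** 2 + 1e-8)
    return float(np.sum(weights * predictions) / np.sum(weights))

print(fuse(np.array([0.8, 0.6, 0.9]), np.array([0.05, 0.30, 0.10])))
```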
## Analysis & Figures

Located in `code/analyses/`:

- `big_csv_generation.ipynb`: Generate a consolidated "big" CSV that merges per-modality features, labels, splits, and metadata into a single analysis-ready table (used by stats/figures).
- Notebooks:
  - `demographic_table_generation.ipynb`: Build cohort demographics tables from metadata.
  - `figure_1.ipynb` … `figure_4.ipynb`: Recreate all paper figures.
  - `statistical_analysis.ipynb`: Run hypothesis tests, effect sizes, CIs, and produce result tables.
  - Exported plots live under `code/analyses/plots/figure_{1..4}/`.
- Metrics: `calculate_performance_metrics.py` — compute AUROC/MAE/QWK/etc. from predictions.
- Data from User Feedback Study: CSVs under `code/analyses/validation_data/`.
- SHAP:
  - Notebook: `code/fusion_model/shap_analysis.ipynb`
  - Raw dump: `code/fusion_model/shap_raw.pkl` and `shap_outputs/shap_raw_test_trainbg.pkl`
  - Panels: `shap_outputs/fig_shap_panels.{png,pdf}`
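For the metrics script, a rough sketch of the kind of numbers it reports, assuming a predictions CSV with ground-truth and score columns (the real column names may differ):

```python
# Illustrative sketch: common screening metrics from a predictions file.
import pandas as pd
from sklearn.metrics import cohen_kappa_score, mean_absolute_error, roc_auc_score

preds = pd.read_csv("data/predictions.csv")        # path and column names are assumptions
y_true, y_score = preds["label"], preds["pred_prob"]

print("AUROC:", roc_auc_score(y_true, y_score))
print("MAE  :", mean_absolute_error(y_true, y_score))
# QWK is usually computed on ordinal labels (e.g., severity stages); binarized here for brevity.
print("QWK  :", cohen_kappa_score(y_true, (y_score >= 0.5).astype(int), weights="quadratic"))
```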
## Pretrained Models

The repo ships with ready-to-use checkpoints:

```text
models/
  facial_expression_smile_best_auroc_baal/
  finger_model_both_hand_fusion_baal/
  fox_model_best_auroc_baal/
  uncertainty_aware_fusion/
```

Each contains a `scaler/`, `predictive_model/`, optional `residual_model/`, and a `model_config.json`. Use these for quick evaluation or as initialization for fine-tuning.
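For example, the shipped fusion checkpoint and its config can be inspected as below; the model class itself has to be instantiated from `code/fusion_model/`, so this sketch only loads the raw objects:

```python
# Illustrative sketch: inspect the shipped fusion checkpoint and its config.
import json
import torch

with open("models/uncertainty_aware_fusion/model_config.json") as f:
    config = json.load(f)
print(config)

# Depending on how it was saved, this is either a state_dict or a pickled model object.
state = torch.load("models/uncertainty_aware_fusion/model.pth", map_location="cpu")
print(type(state))
```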
## Reproducibility Tips

- Pin Python and library versions in your conda env (`environment.yml`) to match training.
- Seed all RNGs in unimodal and fusion scripts where applicable (a helper is sketched below).
- Freeze CSVs in `data/` (train/val/test) before final runs.
- Track Git commit hashes when generating artifacts in `code/analyses`.
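For the seeding tip above, a small helper along these lines can be dropped into the training scripts (the function name is illustrative):

```python
# Illustrative helper: seed Python, NumPy, and PyTorch RNGs for repeatable runs.
import os
import random

import numpy as np
import torch

def seed_everything(seed: int = 42) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

seed_everything(0)
```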
## Citations

```bibtex
@inproceedings{islam2025accessible,
  title={Accessible, at-home detection of Parkinson's disease via multi-task video analysis},
  author={Islam, Md Saiful and Adnan, Tariq and Freyberg, Jan and Lee, Sangwu and Abdelkader, Abdelrahman and Pawlik, Meghan and Schwartz, Cathe and Jaffe, Karen and Schneider, Ruth B and Dorsey, Ray and others},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={27},
  pages={28125--28133},
  year={2025}
}

@article{islam2025remote,
  title={Remote AI Screening for Parkinson's Disease: A Multimodal, Cross-Setting Validation Study},
  author={Islam, Md Saiful and Adnan, Tariq and Abdelkader, Abdelrahman and Liu, Zipei and Ma, Evelyn and Park, Sooyong and Azad, Asif and Liu, Pai and Pawlik, Meghan and Hartman, Emily and others},
  year={2025}
}
```
## License

Proprietary/Research use only unless otherwise specified by the project owner.
For external use, please contact the maintainers to clarify licensing and data-sharing constraints.