While the Chemeleon GitHub repository focuses on text-guided crystal structure generation, this repository provides a framework for the De Novo Generation (DNG) and Crystal Structure Prediction (CSP) tasks.
- CSP (Crystal Structure Prediction): Predicts stable crystal structures from given atom types
- DNG (De Novo Generation): Generates new crystal structures from scratch
- Python 3.11+
- PyTorch >= 2.1.0
- CUDA (optional, for GPU acceleration)
pip install chemeleon-dng
If you don't have uv installed:
curl -LsSf https://astral.sh/uv/install.sh | sh
Then install the package:
git clone https://github.com/hspark1212/chemeleon-dng.git
cd chemeleon-dng
uv sync
Generate crystal structures for given chemical formulas:
from chemeleon_dng.sample import sample
sample(
    task="csp",
    formulas=["NaCl", "LiMnO2"],
    num_samples=10,
    output_dir="results",
    device="cpu"
)
Tip: Invoke help(sample) to explore all available parameters and usage examples.
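The formulas passed to the CSP task are plain composition strings such as "NaCl" and "LiMnO2". As a rough sketch of what such a string encodes, the helper below counts atoms in a flat formula. It is purely illustrative: `count_atoms` is a hypothetical name, it is not part of chemeleon-dng, and it does not reflect how the package parses formulas internally. It can be handy for catching typos before a long sampling run.

```python
import re

# Hypothetical helper for illustration only; not part of chemeleon-dng.
# Handles flat formulas like "LiMnO2"; parentheses and hydrates are not supported.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def count_atoms(formula: str) -> dict[str, int]:
    """Return element counts for a flat formula, e.g. "LiMnO2"."""
    counts: dict[str, int] = {}
    pos = 0
    for match in _TOKEN.finditer(formula):
        # Reject anything the pattern had to skip over (spaces, lowercase, brackets).
        if match.start() != pos:
            raise ValueError(f"Cannot parse formula: {formula!r}")
        element, digits = match.groups()
        counts[element] = counts.get(element, 0) + (int(digits) if digits else 1)
        pos = match.end()
    if pos != len(formula) or not counts:
        raise ValueError(f"Cannot parse formula: {formula!r}")
    return counts


print(count_atoms("LiMnO2"))  # {'Li': 1, 'Mn': 1, 'O': 2}
```

Note that the pattern only checks the shape of each symbol, not whether it is a real element.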
After installing via pip, you can use the chemeleon-dng command directly:
chemeleon-dng --task=csp --formulas="NaCl,LiMnO2" --num_samples=10 --output_dir="results" --device=cpu
This command runs the CSP task on the CPU, generating 10 crystal structures for the given formulas and saving them as CIF files in the results/ directory.
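When the formulas come from a Python list, the comma-separated --formulas value can be assembled programmatically. The sketch below builds the argument list for the command shown above, using only the flags that appear in it. `build_csp_command` is a hypothetical helper, not something the package provides.

```python
import shlex


def build_csp_command(formulas, num_samples, output_dir="results", device="cpu"):
    """Build the chemeleon-dng CSP command as an argument list (hypothetical helper)."""
    if not formulas:
        raise ValueError("At least one formula is required for the CSP task")
    return [
        "chemeleon-dng",
        "--task=csp",
        # The CLI example above passes formulas as one comma-separated value.
        f"--formulas={','.join(formulas)}",
        f"--num_samples={num_samples}",
        f"--output_dir={output_dir}",
        f"--device={device}",
    ]


args = build_csp_command(["NaCl", "LiMnO2"], num_samples=10)
# shlex.join renders the list as a shell-safe string for logging or copy-pasting.
print(shlex.join(args))
```

The list can be passed to subprocess.run(args, check=True) once the package is installed. Passing a list rather than a shell string avoids quoting problems.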
Generate novel crystal structures without predefined compositions:
from chemeleon_dng.sample import sample
sample(
    task="dng",
    num_samples=200,
    batch_size=100,
    output_dir="results",
    device="cuda"
)
For the command line interface:
chemeleon-dng --task=dng --num_samples=200 --batch_size=100 --output_dir="results" --device=cuda
This command runs the DNG task on the GPU, generating 200 random crystal structures in two batches of 100 each and saving them in the results/ directory.
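The two batches of 100 follow directly from num_samples=200 and batch_size=100. The sketch below shows that arithmetic for arbitrary values, which can help when choosing a batch size that fits in GPU memory. `batch_sizes` is a hypothetical helper. How chemeleon-dng treats a remainder when num_samples is not a multiple of batch_size is not documented here, so the final smaller batch is an assumption.

```python
def batch_sizes(num_samples: int, batch_size: int) -> list[int]:
    """Split num_samples into batches of at most batch_size (hypothetical helper)."""
    if num_samples < 0 or batch_size <= 0:
        raise ValueError("num_samples must be >= 0 and batch_size must be > 0")
    full, rest = divmod(num_samples, batch_size)
    # Assumption: any leftover samples go into one final, smaller batch.
    return [batch_size] * full + ([rest] if rest else [])


print(batch_sizes(200, 100))  # [100, 100]
print(batch_sizes(250, 100))  # [100, 100, 50]
```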
For a comprehensive step-by-step guide on using Chemeleon-DNG for crystal structure discovery, check out our interactive tutorial:
- Composition screening with SMACT
- Crystal structure generation with chemeleon-DNG
- Geometry optimization with MACE force fields and TorchSim
- Stability analysis using Materials Project phase diagrams via mp-api
When you run the sample script, the pretrained models are automatically downloaded from Figshare and saved in the ckpts/ directory if they are not already present. The pretrained models were trained on the mp-20 and alex_mp_20 datasets.
The framework includes pretrained checkpoints located in the ckpts/ directory:
- chemeleon_csp_alex_mp_20_v0.0.2.ckpt
- chemeleon_dng_alex_mp_20_v0.0.2.ckpt
- chemeleon_csp_mp_20_v0.0.2.ckpt
- chemeleon_dng_mp_20_v0.0.2.ckpt
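To see whether all four checkpoints listed above are already in place, for example on a machine without network access, a small check such as the following may help. It is a sketch using only the standard library. `missing_checkpoints` is a hypothetical helper, and the default ckpts/ location is taken from the text above.

```python
from pathlib import Path

# File names as listed in this README.
EXPECTED_CHECKPOINTS = (
    "chemeleon_csp_alex_mp_20_v0.0.2.ckpt",
    "chemeleon_dng_alex_mp_20_v0.0.2.ckpt",
    "chemeleon_csp_mp_20_v0.0.2.ckpt",
    "chemeleon_dng_mp_20_v0.0.2.ckpt",
)


def missing_checkpoints(ckpt_dir="ckpts"):
    """Return the expected checkpoint file names not found in ckpt_dir (hypothetical helper)."""
    root = Path(ckpt_dir)
    return [name for name in EXPECTED_CHECKPOINTS if not (root / name).is_file()]


missing = missing_checkpoints()
if missing:
    print("Missing checkpoints:", ", ".join(missing))
else:
    print("All checkpoints found in ckpts/")
```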
If automatic download fails (timeout, connection error, or firewall restrictions), manually download the checkpoints:
# Download from Figshare
wget https://ndownloader.figshare.com/files/54966305 -O checkpoints.tar.gz
# Extract to project root
tar -xzf checkpoints.tar.gz
# Verify
ls ckpts/
For checkpoints in custom locations, use the model_path parameter:
sample(task="csp", formulas=["NaCl"], model_path="/path/to/checkpoint.ckpt")
For benchmarking purposes, we provide 10,000 structures sampled for the DNG task from models trained on the mp-20 and alex_mp_20 datasets in the benchmarks/ directory. The sampled structures are saved in CIF format and compressed JSON format.
If you find our work helpful, please cite the following publication:
"Exploration of crystal chemical space using text-guided generative artificial intelligence" Nature Communications (2025)
DOI: 10.1038/s41467-025-59636-y
@article{park2025exploration,
  title={Exploration of crystal chemical space using text-guided generative artificial intelligence},
  author={Park, Hyunsoo and Onwuli, Anthony and Walsh, Aron},
  journal={Nature Communications},
  volume={16},
  number={1},
  pages={1--14},
  year={2025},
  publisher={Nature Publishing Group}
}
This project is licensed under the MIT License. It was developed by Hyunsoo Park as part of the Materials Design Group at Imperial College London.
See the LICENSE file for more details.