Latest commit

1b9f1ed · Mar 26, 2025

ORI

English | 简体中文


About

This is the ORI protein design repository from the Tencent AI for Life Sciences Lab. It includes projects for protein generation, protein property prediction, and reinforcement learning of protein foundation models.

De Novo Design of Functional Proteins with ORI
Bin He, Chenchen Qin...Jianhuayao
Paper: https://arxiv.org/abs/xxxxx

Figures: TX-L2, TX-L6, Hen Lysozyme

Main Models

| Project | Model | Dataset | Description |
|---|---|---|---|
| Protein Generation | ORI-PGM-1B | UniRef50, PDB | ORI protein generation 1B model |
| | ORI-PGM-3B | UniRef50, PDB | ORI protein generation 3B model |
| Protein Discriminator | USM-100M | Uniclust30, UniRef50 | USM 100M single-sequence and MSA foundation model |
| | USM-100M-Solubility | Solubility dataset | USM 100M solubility prediction model |
| | USM-100M-Thermostablility | Thermostability dataset | USM 100M thermostability prediction model |
| | USM-100M-SignalP | Signal peptide dataset | USM 100M signal peptide prediction model |
| xfold | USMFold-100M (will be released soon) | UniRef50, PDB | Superfast protein folding prediction model based on USM |
| | ESMFold | UniRef50, PDB, AFDB | Optimized ESMFold in the ORI program |
| RLWF | | | Reinforcement Learning from Wet-lab Feedback |

You can download the pre-trained weights from the following link:

Getting Started

Install

Requirements

  • System: Linux or macOS
  • Python 3.8 or later
  • PyTorch 2.0.0 or later, up to and including 2.4.0
  • If you use an NVIDIA GPU, make sure it has more than 8 GB of memory
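
The version bounds above can be checked before going further. The sketch below is one way to do that in plain shell; `version_ge` and `check_requirements` are hypothetical helper names that are not part of this repository, and only the first three numeric components of each version are compared.

```shell
# Hypothetical helpers (not part of the ORI repository).

# version_ge A B: succeed when dotted version A >= B.
# Compares the first three numeric components; a suffix such as
# "+cu121" is ignored because awk reads only the leading digits.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# check_requirements PYTHON_VERSION TORCH_VERSION
# Applies the bounds from the list above: Python >= 3.8, 2.0.0 <= PyTorch <= 2.4.0.
check_requirements() {
    version_ge "$1" 3.8   || { echo "Python $1 is older than 3.8"; return 1; }
    version_ge "$2" 2.0.0 || { echo "PyTorch $2 is older than 2.0.0"; return 1; }
    version_ge 2.4.0 "$2" || { echo "PyTorch $2 is newer than 2.4.0"; return 1; }
    echo "ok"
}

check_requirements 3.9.19 2.3.1+cu121   # prints "ok"
```

To check the active environment, pass the real versions, for example the output of `python -c 'import platform; print(platform.python_version())'` and `python -c 'import torch; print(torch.__version__)'`.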

To install

You can install the package with the following commands. For other installation methods and options, see INSTALL.md.

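# install Miniconda (this installer is for Linux x86_64; pick the matching one on other systems)
wget -O miniconda3.sh https://repo.anaconda.com/miniconda/Miniconda3-py39_24.5.0-0-Linux-x86_64.sh
# specify the Miniconda install path
CONDA_PATH=/miniconda
bash miniconda3.sh -b -p "${CONDA_PATH}"
rm miniconda3.sh
# init environment (conda is not on PATH yet after a batch install)
"${CONDA_PATH}/bin/conda" init
# make conda available in the current shell
source "${CONDA_PATH}/etc/profile.d/conda.sh"
# download code
git clone https://github.com/TencentAI4S/ori.git
cd ori
conda env create -n ori -f environment.yml
conda activate ori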

Download Model Weights (Optional)

If you want to test the models offline, first download the model weights to "~/.cache/torch/hub/checkpoints".
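
As a sketch of that step, the helper below copies already downloaded weight files into the checkpoint directory named above. `install_weights` and the `CKPT_DIR` variable are hypothetical names introduced for illustration, and no weight file names from the project are assumed.

```shell
# Hypothetical helper (not part of the ORI repository).
# Target directory from the text above; set CKPT_DIR to use another path.
CKPT_DIR=${CKPT_DIR:-"$HOME/.cache/torch/hub/checkpoints"}

# install_weights FILE [FILE...]: copy weight files into CKPT_DIR.
install_weights() {
    mkdir -p "$CKPT_DIR" || return 1
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "not a file: $f" >&2
            return 1
        fi
        cp "$f" "$CKPT_DIR/" || return 1
        echo "installed $(basename "$f")"
    done
}
```

Note that PyTorch places its hub cache under `$TORCH_HOME/hub` when the `TORCH_HOME` environment variable is set; whether the ORI loaders follow that setting is an assumption to verify against the code.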

1. Protein Generation

Lysozyme Generation

prompt='<Glucosaminidase><temperature90><:>'
python projects/progen/generate_protein.py -p "${prompt}" -n 5

Enzyme Generation

prompt='<EC:3.1.1.101><temperature90><:>'
python projects/progen/generate_protein.py -p "${prompt}" -n 5

Multifunctional Enzyme Generation

prompt='<EC:3.2.1.14><EC:3.2.1.17><temperature60><:>'
python projects/progen/generate_protein.py -p "${prompt}" -n 5
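
The three prompts above share one visible pattern: one or more `<tag>` entries, a `<temperatureN>` tag, and a closing `<:>`. The helper below is a small sketch that assembles such a string. `build_prompt` is a hypothetical name, and the pattern is inferred only from these examples, so other tag types or orderings may exist.

```shell
# Hypothetical helper (not part of the ORI repository).
# build_prompt TEMPERATURE TAG [TAG...]
# Wraps each TAG in angle brackets, then appends the temperature tag
# and the closing "<:>" seen in the examples above.
build_prompt() {
    temperature=$1
    shift
    prompt=""
    for tag in "$@"; do
        prompt="${prompt}<${tag}>"
    done
    printf '%s<temperature%s><:>\n' "$prompt" "$temperature"
}

build_prompt 90 Glucosaminidase            # prints <Glucosaminidase><temperature90><:>
build_prompt 60 EC:3.2.1.14 EC:3.2.1.17    # prints <EC:3.2.1.14><EC:3.2.1.17><temperature60><:>
```

The output can be stored in `prompt` and passed to `generate_protein.py` with `-p`, quoted so the shell treats it as a single argument.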

2. Protein Solubility Prediction

python projects/prodiscriminator/predict_solubility.py -i projects/prodiscriminator/data/solubility_demo.fasta

3. Protein Thermostability Prediction

python projects/prodiscriminator/predict_thermostability.py -i projects/prodiscriminator/data/thermostability_demo.fasta

4. Signal Peptide Prediction

python projects/prodiscriminator/predict_signal_peptide.py -i projects/prodiscriminator/data/signalp_demo.fasta
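
The three predictors in sections 2 to 4 take the same `-i` FASTA argument, so they can be run in one loop. This is a sketch under the assumption that each script accepts any protein FASTA file; `run_predictors` and the `DRY_RUN` switch are hypothetical names, not part of the repository.

```shell
# Hypothetical helper (not part of the ORI repository).
# run_predictors FASTA: run the solubility, thermostability and signal
# peptide predictors on one FASTA file. With DRY_RUN set to a non-empty
# value, the commands are printed instead of executed.
run_predictors() {
    fasta=$1
    for task in solubility thermostability signal_peptide; do
        script="projects/prodiscriminator/predict_${task}.py"
        if [ -n "${DRY_RUN:-}" ]; then
            echo "python $script -i $fasta"
        else
            python "$script" -i "$fasta" || return 1
        fi
    done
}

# Preview the commands without running any model.
DRY_RUN=1 run_predictors my_proteins.fasta
```

Run it from the repository root without `DRY_RUN` to execute the predictors; the loop stops at the first script that fails.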

5. Protein Fold Prediction with USMFold

python projects/xfold/usmfold_predict.py -i projects/xfold/data/test.fasta
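
Since every prediction script here reads a FASTA file, it can help to sanity-check the input first. The sketch below counts records and residues with `awk`; `fasta_summary` is a hypothetical helper, and it assumes the standard FASTA layout in which header lines start with `>`.

```shell
# Hypothetical helper (not part of the ORI repository).
# fasta_summary FILE: print "<records> <residues>" for a FASTA file.
fasta_summary() {
    awk '
        /^>/ { records++; next }            # a header line starts a new record
        {
            gsub(/[ \t\r]/, "")             # drop whitespace and carriage returns
            residues += length($0)          # count sequence characters
        }
        END { printf "%d %d\n", records, residues }
    ' "$1"
}

# Small demonstration on a temporary file with two records.
demo=$(mktemp)
printf '>seq1\nMKV\nLA\n>seq2\nGG\n' > "$demo"
fasta_summary "$demo"    # prints "2 7"
rm -f "$demo"
```

For the demo input used above, the call would be `fasta_summary projects/xfold/data/test.fasta`.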

Citation

If you use this codebase, or otherwise find our work valuable, please cite ori:

@article{ori,
  title={De Novo Design of Functional Proteins with ORI},
  author={Bin He,Chenchen Qin...Jianhuayao},
  journal={arXiv preprint arXiv:xxx},
  year={2025}
}