-
Notifications
You must be signed in to change notification settings - Fork 3
Training Conditional Random Fields (CRF) by Maximizing Area Under the ROC Curve (AUC)
License
realbigws/DeepCNF_AUC
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
----------------------- DeepCNF_Package (v1.02) date: 2015.05.30 ----------------------- Title: Training Conditional Random Fields by Maximizing AUC Author: Sheng Wang Contact email: [email protected] References: 1. AUC-maximized Deep Convolutional Neural Fields for Protein Sequence Labeling Sheng Wang#, Siqi Sun, Jinbo Xu ECML/PKDD, 2016 https://link.springer.com/chapter/10.1007/978-3-319-46227-1_1 2. AUCpreD: Proteome-level Protein Disorder Prediction by AUC-maximized Deep Convolutional Neural Fields Sheng Wang#, Jianzhu Ma, Jinbo Xu ECCB, 2016 Bioinformatics, 2016 https://academic.oup.com/bioinformatics/article-abstract/32/17/i672/2450776 -------------------------------------------------------- ========= Abstract: ========= 1. The training program of Deep Convolutional Neural Filed ( DeepCNF ) for maximizing (1) log probability, (2) posterior probability , or (3) AUC value. 2. The predicting program of DeepCNF for a given feature file by using a trained model. ======== Install: ======== 1. download the package git clone https://github.com/realbigws/DeepCNF_AUC/ cd DeepCNF_AUC/ +++++++++++++++++++++ 2. compile 2.1 compile under single CPU cd source_code/ make cd ../ ------------- 2.2 compile under MPI/OpenMP cd source_code/ make mpi cd ../ ====== Usage: ====== 1. DeepCNF_Train Usage : mpirun -np NP ./DeepCNF_Train -r train_file -t test_file -w window_str -d node_str -s state_num -l feat_num [-S feat_select] [-n finetune_num] [-f finetune_reg] [-m init_model] [-o out_root] [-G gate_function] [-W label_weight] [-M method] [-D AUC_degree] Options: -np NP : number of processors. -r train_file : training file. -t test_file : testing file. -w window_str : window string for DeepCNF. e.g., '5,5' -d node_str : node string for DeepCNF. e.g., '40,20' -s state_num : state number. -l feat_num : feature number at each position. -S feat_select : feature selection. e.g., '1-7,9' [default uses all features] -n finetune_num : fine-tune iteration number. [default is 200] -f finetune_reg : fine-tune regularizer. [default is 0.5] -m init_model : file for initial model parameters. [optional, and default is NULL] -o out_root : output directory for trained models. [optional, and default is 'MODELS/'] -G gate_function : gate function type: sigmoid (1), tanh (2), and relu (0). [default is 1] -M method : maximize log_prob (0), posterior_prob (1), or AUC (2). [default is 0] -W label_weight : label weight. e.g., '0.1,0.9'. [default is 1 for each label] -D AUC_degree : degree (1-30) of polynomials for AUC [default is 3]. ------------------------------------- 2. DeepCNF_Pred Usage : ./DeepCNF_Pred -i input_file -w window_str -d node_str -s state_num -l feat_num -m init_model [-S feat_select] Options: -i input_file : input feature file. -w window_str : window string for DeepCNF. e.g., '5,5' -d node_str : node string for DeepCNF. e.g., '40,20' -s state_num : state number. -l feat_num : feature number at each position. -m init_model : file for trained model. -S feat_select : feature selection. e.g., '1-7,9' [default uses all features] --------------------------------------- ======== Example: ======== 1. Maximize log probability: cd examples/ ./DeepCNF_Train -r 2fzlA.reso -t 2fzlA.reso -w 5 -d 5 -s 3 -l 40 -n 50 cd ../ --------------------- 2. Maximize AUC value: cd examples/ ./DeepCNF_Train -r 1kq1S.feat -t 1kq1S.feat -w 5 -d 5 -s 2 -l 129 -n 50 -M 2 cd ../ --------------------- 3. Predict the labels of a given feature file by using a certain model: cd examples/ ./DeepCNF_Pred -i 1kq1S.feat -w 5 -d 5 -s 2 -l 129 -m model.10 cd ../ --------------------- 4. make feature file for DeepCNF cd examples/ ./DeepCNF_FeatMake 5fgnA.profile 5fgnA.label > 5fgnA.feat cd ../ ================== Input file format: ================== 1. Training data format //-> data format ( length, features, labels [should be authentic] ) //-> example below: 78 feat1 feat2 feat3, ..., featN .... (with 78 feature lines) 0 1 0 2 ... ( with 78 label lines) 54 feat1 feat2 feat3, ..., featN .... (with 54 feature lines) 0 1 0 2 ... ( with 54 label lines) ---------------------------------- 2. Testing data format //-> data format ( length, features, labels [can be anything] ) //-> example below: 78 feat1 feat2 feat3, ..., featN .... (with 78 feature lines) -1 -1 -1 -1 ... ( with 78 label lines) 54 feat1 feat2 feat3, ..., featN .... (with 54 feature lines) -1 -1 -1 -1 ... ( with 54 label lines)
About
Training Conditional Random Fields (CRF) by Maximizing Area Under the ROC Curve (AUC)
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published