This study presents a novel machine learning framework for odor prediction. We compiled a large dataset of odorants with expert-annotated olfactory descriptors. By applying multi-label classification methods, we aimed to elucidate the relationships between molecular structure and odor perception.
## When using the code, please change the folder paths and keep the same file names.
a. Dataset combination: run 'Combine_data_P&G_ML.ipynb'. The input datasets are available at: https://drive.google.com/drive/folders/1juK9db6W9xIIaVs3UbHEIvL9yW_NwpmN?usp=sharing
b. Label cleaning: run 'Label_clean_P&G_ML.ipynb' with the data in this folder: https://drive.google.com/drive/folders/1a8T83HLk395B5DYoJZ9Y7nOHA-mesczZ?usp=sharing
c. Feature combination: run 'Features_Combine_data_P&G.ipynb'. The input files are available at: https://drive.google.com/drive/folders/1ecWukhdTfbhtKPYYU5rhjxCprEtNDOVp?usp=sharing
d. Training and evaluation: run 'ML_training and Evaluation.ipynb'. The feature and label data are available at: https://drive.google.com/drive/folders/1eLv3WS6rkA-qPIp1nJfXCeIvmHIQmLt2?usp=sharing
Feature data: 'feature_1901.csv'; label data: 'label_502.csv'.
For evaluation, use the dataset at: https://drive.google.com/drive/folders/1Ytweu1yGHAxfvP5Q8jARMqL198OuTlCz?usp=sharing
e. Performance plots: the input file is saved in the Figure folder at https://drive.google.com/drive/folders/1eLv3WS6rkA-qPIp1nJfXCeIvmHIQmLt2?usp=sharing, as 'figure/perfomace3a.xlsx'.
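
Since every notebook needs its folder path adjusted, it can help to define the data folder once and check that the expected files are present before running anything. The following is only a minimal sketch, not code from the notebooks: the helper name `resolve_inputs` and the dictionary `EXPECTED_FILES` are hypothetical, and only the two file names come from this README.

```python
from pathlib import Path

# File names as given in this README; keep them unchanged.
EXPECTED_FILES = {
    "features": "feature_1901.csv",
    "labels": "label_502.csv",
}


def resolve_inputs(data_dir):
    """Return {"features": Path, "labels": Path} for the files in data_dir.

    Hypothetical helper: raises FileNotFoundError naming every expected
    file that is missing, so a wrong folder path fails early.
    """
    data_dir = Path(data_dir).expanduser()
    paths = {key: data_dir / name for key, name in EXPECTED_FILES.items()}
    missing = [str(p) for p in paths.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing input files: " + ", ".join(missing))
    return paths
```

In a notebook, the returned paths could then be passed to `pandas.read_csv`, for example `pd.read_csv(paths["features"])`, after calling `resolve_inputs` with the local folder that holds the downloaded files.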
## Requirements
Python==3.7
scikit-learn==1.2.2
numpy==1.25.2
matplotlib==3.7.1
pandas==2.0.3
iterative-stratification==0.1.7
feature-engine
nltk==3.8.1
fuzzywuzzy==0.18.0
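
To confirm that an environment matches the versions listed here, a short check such as the one below can be run before opening the notebooks. This is a sketch under assumptions: the names `PINS` and `check_pins` are hypothetical, the package names are written as PyPI distribution names, and feature-engine is left out because no version is given for it.

```python
try:
    from importlib.metadata import PackageNotFoundError, version
except ImportError:  # Python < 3.8 needs the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version

# Version pins from this README, keyed by PyPI distribution name (assumed).
PINS = {
    "scikit-learn": "1.2.2",
    "numpy": "1.25.2",
    "matplotlib": "3.7.1",
    "pandas": "2.0.3",
    "iterative-stratification": "0.1.7",
    "nltk": "3.8.1",
    "fuzzywuzzy": "0.18.0",
}


def check_pins(pins, get_version=version):
    """Return (package, wanted, found) for each mismatch.

    `found` is None when the package is not installed.
    """
    problems = []
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        if found != wanted:
            problems.append((name, wanted, found))
    return problems


if __name__ == "__main__":
    for name, wanted, found in check_pins(PINS):
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result from `check_pins(PINS)` means every pinned package is installed at the listed version.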