This work, titled “Deep Spectral Component Filtering as a Foundation Model for Spectral Analysis Demonstrated in Metabolic Profiling,” is published in Nature Machine Intelligence. This repository contains code for utilizing a pretrained foundation model tailored for spectral analysis. To enhance accessibility for spectroscopy researchers, the code has been designed with user-friendliness in mind, allowing for a seamless start without requiring complex training frameworks or extensive environment configuration. Additionally, we have provided scripts for fine-tuning, accompanied by clear instructions within them, to assist users in loading their own data and adapting the model to their specific tasks.

Directory notation

pretrain: This folder contains the pretrained model weights and general-purpose tools for utilizing the foundation model.

customized_task: This folder includes scripts for applying the pretrained model and fine-tuning it to suit your specific tasks.

preprocessing: This folder provides scripts for preprocessing, along with source files and results for tasks such as infrared paraffin removal and SERS nanoparticle (NPs) removal.

quantify: This folder contains scripts for quantification, accompanied by spectral data ready for quantitative analysis.

ComFilE: This folder includes scripts and results for the Component Filtering Explanation (ComFilE) method. ComFilE can be used to rank the importance of specific spectral components (e.g., metabolites in serum) and interpret their contributions to distinguishing results (e.g., disease vs. control samples).

ComFilE_Extended: This folder contains scripts and results for the k-order Component Filtering Explanation (where k > 1). The k-order ComFilE extends the methodology to analyze the cooperative effects of k spectral components in explaining result distinctions.
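The ranking idea behind ComFilE and its k-order extension can be illustrated with a toy ablation: remove one component (or a group of k components) from a spectrum and measure how much a downstream score changes. The sketch below is not the repository's implementation. It assumes that filtering can be idealized as exact subtraction of known component spectra and that a simple function stands in for the classifier. The names `mix`, `rank_components` and `toy_score`, the metabolite labels, and all numbers are hypothetical.

```python
from itertools import combinations


def mix(components, skip=()):
    """Sum component spectra channel by channel, leaving out the names in `skip`."""
    kept = [spec for name, spec in components.items() if name not in skip]
    return [sum(channel) for channel in zip(*kept)]


def rank_components(components, score, k=1):
    """Rank groups of k components by how much removing them changes the score."""
    if not 1 <= k < len(components):
        raise ValueError("k must be at least 1 and smaller than the number of components")
    baseline = score(mix(components))
    changes = {}
    for group in combinations(sorted(components), k):
        filtered = mix(components, skip=group)  # idealized filtering (assumption)
        changes[group] = abs(baseline - score(filtered))
    # Largest change first: these groups matter most to the score.
    return sorted(changes.items(), key=lambda item: item[1], reverse=True)


# Hypothetical three-channel spectra for three illustrative components.
components = {
    "glucose": [0.0, 2.0, 0.0],
    "lactate": [1.0, 0.0, 0.0],
    "urea": [0.0, 0.0, 1.0],
}


def toy_score(spectrum):
    """Stand-in for a classifier output (assumption, not the DSCF model)."""
    return spectrum[1] - 0.5 * spectrum[0]


for group, change in rank_components(components, toy_score, k=1):
    print(group, change)
for group, change in rank_components(components, toy_score, k=2):
    print(group, change)
```

With `k=1` the loop scores single components; with `k=2` it scores pairs, which is one simple way to expose cooperative effects that a single-component ranking would miss.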

Quick start

The scripts for building DSCF for your own work are in the directory "customized_task".

To get started, follow the instructions in 'dataset.py' to load your spectra into the corresponding folders.

Folder 'Component-spec' is for the spectral dictionary of pure substances.

Folder 'Impurity-spec' is for unwanted spectral components to be filtered out of the spectra.

Folder 'Pure-spec' is for spectral components to be preserved.

{'dir':'Pure-spec/', 'tensor_dim':2, 'spec_tensor_dim':-1,}

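An attribute dictionary should be initialized for each data folder, as in the example above. 'tensor_dim' gives the total number of dimensions of one data file. 'spec_tensor_dim' gives the index of the spectral dimension in the data file, in the range 0 to tensor_dim-1; the value -1 in the example presumably selects the last dimension, as in Python negative indexing.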
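As a minimal sketch of how such dictionaries might be set up and checked, the snippet below defines one dictionary per folder and converts 'spec_tensor_dim' to a non-negative axis index. Only the 'Pure-spec' entry comes from the example above. The entries for the other two folders, the name `FOLDER_ATTRS`, and the helper `resolve_spec_axis` are assumptions for illustration and are not part of 'dataset.py'.

```python
# One attribute dictionary per data folder. The keys mirror the example
# above; the values for the other two folders are assumptions.
FOLDER_ATTRS = {
    "pure": {"dir": "Pure-spec/", "tensor_dim": 2, "spec_tensor_dim": -1},
    "impurity": {"dir": "Impurity-spec/", "tensor_dim": 2, "spec_tensor_dim": -1},
    "component": {"dir": "Component-spec/", "tensor_dim": 2, "spec_tensor_dim": -1},
}


def resolve_spec_axis(attrs):
    """Return the spectral axis as a non-negative index (hypothetical helper)."""
    ndim = attrs["tensor_dim"]
    axis = attrs["spec_tensor_dim"]
    # Accept Python-style negative indices, reject anything out of range.
    if not -ndim <= axis < ndim:
        raise ValueError(f"spec_tensor_dim {axis} is out of range for tensor_dim {ndim}")
    return axis % ndim


for name, attrs in FOLDER_ATTRS.items():
    print(name, attrs["dir"], "spectral axis:", resolve_spec_axis(attrs))
```

For a two-dimensional file, a 'spec_tensor_dim' of -1 resolves to axis 1 under this reading, meaning the spectral channels run along the last dimension.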

The output mode can be customized by revising the return value of the __getitem__ function.
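To show what customizing the return value can look like, here is a small stand-in dataset written in plain Python. It is a sketch under stated assumptions, not the class in 'dataset.py': the name `ToySpectraDataset`, the `return_mode` argument, and the choice to build the model input as the sum of a pure and an impurity spectrum are all hypothetical.

```python
class ToySpectraDataset:
    """Hypothetical stand-in for a dataset class; not the one in dataset.py."""

    def __init__(self, pure, impurity, return_mode="pair"):
        if len(pure) != len(impurity):
            raise ValueError("pure and impurity must contain the same number of spectra")
        self.pure = pure
        self.impurity = impurity
        self.return_mode = return_mode

    def __len__(self):
        return len(self.pure)

    def __getitem__(self, idx):
        pure = self.pure[idx]
        impurity = self.impurity[idx]
        # Assumption: the input spectrum is the channel-wise sum of both parts.
        mixed = [p + q for p, q in zip(pure, impurity)]
        # The return statement is the place to change the output mode.
        if self.return_mode == "pair":
            return mixed, pure
        if self.return_mode == "triple":
            return mixed, pure, impurity
        raise ValueError(f"unknown return_mode: {self.return_mode!r}")


dataset = ToySpectraDataset(
    pure=[[1.0, 2.0], [0.0, 4.0]],
    impurity=[[0.5, 0.5], [1.0, 1.0]],
    return_mode="triple",
)
mixed, pure, impurity = dataset[0]
print(mixed, pure, impurity)
```

Switching `return_mode` to "pair" drops the impurity spectrum from the output, which may be enough when only the preserved components are needed as the target.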

Gallery of implicit results behind this work

Model architecture

The DSCF model is a hierarchical local-attention encoder-decoder transformer. Its components are defined in DSCF_model_pe.py. The following image outlines the general pre-trained model. The pre-trained weights of the tiny version of the model can be downloaded at https://figshare.com/s/2b31ca642313086dcfe6. The weights of the larger models can be downloaded under DOI 10.6084/m9.figshare.28582499.

Preprocessing

Paraffin removal is a routine step in the IR analysis of formalin-fixed, paraffin-embedded (FFPE) samples. The DSCF model can be tailored to this task. The following images show the raw data, the paraffin component, and the paraffin-removed data.

Explaining spectral markers

Some of the in-silico explanation results are shown below; the highlighted components are the ground truth.

The code for the detailed downstream tasks will be released once the manuscript is formally published.

2nd-order ComFilE.
