Transcriptomics EDAM

Transcriptomics EDAM is an ontology developed to model experimental processes and related entities in computational transcriptomics experiments.

Features

This ontology was developed to improve the coverage of objects, processes, and intermediate states commonly encountered in transcriptomics workflows. It combines the original EDAM ontology (v1.25) and STATO. It also contains terms extracted from transcriptomics publications.

We introduce an upper-level class, data status, to represent transformations where the data type and format remain unchanged but the data content changes. For example, a dataset may become filtered without changing its file format or structural type.
We add a Database branch, reflecting the fact that database references often implicitly indicate both data types and associated operations. Due to the large number of named databases, only a representative subset is included in the core ontology (see the withoutDatabases folder). The withDatabases folder contains a more comprehensive collection of database names curated manually and sourced from the Nucleic Acids Research (NAR) database collection. However, this version is less actively maintained, as it introduces significant overhead during normalisation.
To better represent statistical objects and analytical operations, we import relevant branches from STATO, an existing statistics ontology.
We include object properties that capture biological and analytical semantics, such as has input and has means. While expressive, their practical use is currently limited due to the high curation cost required for consistent application.
To further improve coverage, we collected frequently used transcriptomics-related entities from the literature and added them to the ontology. Definitions and synonyms for these classes were generated using large language models and are currently presented as a flat list.
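
As a rough illustration of the data status idea and the object properties described above, the sketch below treats status as a third dimension next to data type and format. The `DataState` class, the `change_status` helper, and all labels ("gene list", "TSV", "unfiltered", "filtered") are hypothetical and chosen only for illustration; they are not taken from the ontology files. The reading of has means as "the method by which an operation is carried out" is likewise an assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataState:
    """Hypothetical record: one dataset described along three independent axes."""
    data_type: str    # would map to a class in the data branch
    data_format: str  # would map to a class in the format branch
    status: str       # would map to a class under "data status"


def change_status(state: DataState, new_status: str) -> DataState:
    """Return a copy whose status changes while type and format stay the same."""
    return replace(state, status=new_status)


# Illustrative labels only, not actual ontology terms.
raw = DataState(data_type="gene list", data_format="TSV", status="unfiltered")
filtered = change_status(raw, "filtered")

# The step itself expressed as (subject, property, object) triples.
# The interpretation of "has means" here is an assumption.
triples = [
    ("filtering", "has input", raw.data_type),
    ("filtering", "has means", "differential expression analysis"),
]

print(filtered)
for subject, prop, obj in triples:
    print(subject, prop, obj)
```

Keeping status separate from type and format means a filtering step can be recorded without inventing a new data type or format class for every processing stage.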

The ontology is updated continuously as transcriptomics methodology develops, based on manual inspection and practical modelling needs.

Example Modelling Use Cases

How should the statement “The gene lists were filtered based on differential expression analysis” be normalised and represented semantically?
If a study uses the Gene Expression Omnibus, what types of analyses are implied, and which data formats should be expected and modelled?
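
One possible way to approach the first question is a dictionary-based normaliser that maps surface forms in a sentence to ontology branches and preferred labels. This is a minimal sketch, not the project's method: the `SYNONYMS` table, the `normalise` function, and the branch assignments are hypothetical stand-ins for lookups against the real ontology.

```python
from typing import List, Tuple

# Hypothetical synonym table: surface form -> (branch, preferred label).
# A real table would be built from the labels and synonyms in the ontology.
SYNONYMS = {
    "gene lists": ("data", "gene list"),
    "gene list": ("data", "gene list"),
    "filtered": ("data status", "filtered"),
    "differential expression analysis": (
        "operation",
        "differential expression analysis",
    ),
}


def normalise(sentence: str) -> List[Tuple[str, str]]:
    """Return (branch, label) pairs for each known surface form in the sentence."""
    text = sentence.lower()
    found: List[Tuple[str, str]] = []
    # Match longer surface forms first so "gene lists" wins over "gene list".
    for surface in sorted(SYNONYMS, key=len, reverse=True):
        if surface in text:
            term = SYNONYMS[surface]
            if term not in found:
                found.append(term)
            # Blank out the matched span so shorter forms do not match it again.
            text = text.replace(surface, " ")
    return found


terms = normalise(
    "The gene lists were filtered based on differential expression analysis"
)
for branch, label in terms:
    print(branch, "->", label)
```

Under these assumptions the sentence yields one operation, one data class, and one data status; linking them with properties such as has input would be a separate curation step.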

Stats

withoutDatabases

Version	Total classes	Branch operation	Branch topic	Branch data	Branch format
0.0.2
