This repository stores the metadata templates in use at SciLifeLab, organized by data type. The diagram below sketches how information flows between this repository, the data-producing platforms and the data submitter, with the end goal of submitting data to a public end repository.
| Title | Description | Link |
|---|---|---|
| SciLifeLab Genomics Technical Metadata Template | This template aims to capture technical metadata for genomics data produced at the Genomics platform, compatible with submission requirements from ENA and ArrayExpress. | genomics/README.md |
A template has a title, a description and a semantic version number, as well as a list of associated attribute fields.
Within a template, each attribute field needs to have:
- Field name
- Level of requirement/cardinality (mandatory vs optional)
  - `mandatory_for_data_producer`: to be filled in by the data producing facility as far as possible
  - `mandatory_for_data_submitter`: to be filled in by the data submitter, not expected to be known by the data producing facility
- Description
- List of controlled vocabulary terms, if applicable
- Target end repository
- Target end repository (field) name
- Target end repository (field) description
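
As a rough illustration, the template structure described above could be modelled as follows. This is a minimal sketch, not the repository's actual schema: the class names, attribute names, the `"optional"` level label and the example field values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The two mandatory levels come from the list above; "optional" is an
# assumed label for fields that are not mandatory.
REQUIREMENT_LEVELS = (
    "mandatory_for_data_producer",
    "mandatory_for_data_submitter",
    "optional",
)


@dataclass
class TemplateField:
    """One attribute field of a template (hypothetical model)."""
    name: str
    requirement: str
    description: str
    controlled_vocabulary: Optional[List[str]] = None  # only if applicable
    target_repository: Optional[str] = None
    target_field_name: Optional[str] = None
    target_field_description: Optional[str] = None

    def __post_init__(self):
        # Reject requirement levels that are not in the known set.
        if self.requirement not in REQUIREMENT_LEVELS:
            raise ValueError(f"unknown requirement level: {self.requirement}")


@dataclass
class Template:
    """A template: title, description, semantic version and its fields."""
    title: str
    description: str
    version: str
    fields: List[TemplateField] = field(default_factory=list)

    def field_names_for(self, requirement: str) -> List[str]:
        """Return the names of all fields with the given requirement level."""
        return [f.name for f in self.fields if f.requirement == requirement]


# Hypothetical example values, for illustration only.
template = Template(
    title="Example template",
    description="Illustrative template with two fields.",
    version="1.0.0",
    fields=[
        TemplateField(
            name="library_strategy",
            requirement="mandatory_for_data_producer",
            description="Sequencing technique used for the library.",
            controlled_vocabulary=["WGS", "RNA-Seq"],
            target_repository="ENA",
        ),
        TemplateField(
            name="study_title",
            requirement="mandatory_for_data_submitter",
            description="Title of the study the sample belongs to.",
        ),
    ],
)
```

Calling `template.field_names_for("mandatory_for_data_producer")` then lists the fields a data producing facility would be expected to fill in.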
In addition to the data-type-specific fields capturing the technical metadata itself, all templates include additional organizational metadata such as:
- SciLifeLab infrastructure platform and unit
- Unit internal project ID(s)
- Associated order ID
- Experimental Sample IDs (as assigned by the unit, 1 exp sample = 1 data file (pair))
- Associated Sample IDs (as shared by the researcher with the unit)
- Delivery date
- Template name
- Template version
Templates are provided as .tsv, .json and JSON Schema files. A row for an individual sample in a filled-out .tsv then corresponds to the following columns:
| `<data_type_specific_field1>` | ... | `<data_type_specific_fieldM>` | `<data_file_name_R1>` | ... | `<data_file_name_RP>` | `<orga_meta_field1>` | ... | `<orga_meta_fieldN>` |
|---|---|---|---|---|---|---|---|---|
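
The column layout above can be sketched in a few lines of Python using the standard `csv` module. The helper names (`build_header`, `write_tsv`), the concrete column names and the sample values are assumptions for illustration; the actual templates in this repository define the real column names.

```python
import csv
import io


def build_header(data_fields, n_files, orga_fields):
    """Assemble columns in the order: data-type-specific fields,
    data file names R1..RP, organizational metadata fields."""
    file_cols = [f"data_file_name_R{i}" for i in range(1, n_files + 1)]
    return list(data_fields) + file_cols + list(orga_fields)


def write_tsv(header, rows):
    """Serialize rows (dicts keyed by column name) as tab-separated text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=header, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Missing keys are written as empty cells; unknown keys raise ValueError.
        writer.writerow(row)
    return buf.getvalue()


# Hypothetical column names and values, for illustration only.
header = build_header(
    data_fields=["library_strategy"],
    n_files=2,
    orga_fields=["template_name", "template_version"],
)
row = {
    "library_strategy": "WGS",
    "data_file_name_R1": "sample1_R1.fastq.gz",
    "data_file_name_R2": "sample1_R2.fastq.gz",
    "template_name": "genomics",
    "template_version": "1.0.0",
}
tsv_text = write_tsv(header, [row])
```

Each sample contributes one row, so a paired-end sample with two data files fills both `data_file_name_R1` and `data_file_name_R2` in the same row.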