Add support for Cell Ranger #101

arteymix · 2025-09-13T18:32:43Z

TODO

Integrate Cell Ranger in bioluigi
Detect the layout of runs in SRA
Prevent Cell Ranger from creating MRO files in the current directory; we should probably ditch the --output-dir option and change the working directory for the execution instead
Run bamtofastq for BAM-file SRA submissions
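
For the MRO item, a minimal sketch of the "change the directory for the execution" idea, assuming the command is launched through Python's subprocess module. The helper name run_in_directory is hypothetical and is not part of bioluigi; the child command used below is a stand-in, not Cell Ranger itself.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_directory(args: list[str], working_dir: Path) -> subprocess.CompletedProcess:
    """Run a command with working_dir as its current directory.

    Any file the command writes through a relative path (as Cell Ranger is
    reported to do with its MRO files) lands in working_dir rather than in
    the directory the pipeline was started from.
    """
    working_dir.mkdir(parents=True, exist_ok=True)
    return subprocess.run(args, cwd=working_dir, check=True,
                          capture_output=True, text=True)


# Stand-in child process that drops a file in its current directory.
CHILD = [sys.executable, "-c", "open('stray.mro', 'w').close()"]

with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "run1"
    run_in_directory(CHILD, run_dir)
    print(sorted(p.name for p in run_dir.iterdir()))
```

Passing cwd to the child keeps the parent's working directory untouched, which matters when several tasks run in the same scheduler process.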

arteymix · 2025-09-16T18:17:28Z

.github/workflows/build.yml

-      run: |
-        conda env update --file environment.yml --name base
+        activate-environment: rnaseq-pipeline
+        environment-file: environment.yml


This should be included immediately in the trunk.

setup.py

Parse SRA metadata from its XML format so that we can infer the role that each file plays in fastq-dump output.
Add typing and fix many bugs.
Retrieve the SRA public dir from a configuration.
Improve layout detection from SRA metadata.
Detect bcl2fastq standard filenames and also commonly used names.
Add a fallback that checks for the presence of I1/I2/R1/R2, but warns since this is very unreliable.
Track issues encountered in runs using an enumerated flag.
Make resolution of test resources relative.
Allow some of the parameters for filtering cells to be overwritten if needed.
Use CellRangerCount task from bioluigi.
Fix unpacking of singleton for single-run experiments.
Remove cell_ranger_bin from config, it's declared in bioluigi.
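
The filename-based detection and the issue flag could look roughly like the sketch below. It assumes the usual bcl2fastq naming scheme (`<sample>_S<n>_L<lane>_<read>_001.fastq.gz`). The names RunIssue and detect_read_type and both regular expressions are illustrative and are not taken from this branch.

```python
import enum
import re
import warnings
from typing import Optional


class RunIssue(enum.Flag):
    """Issues met while inspecting a run; values can be combined with |."""
    NONE = 0
    AMBIGUOUS_READ_TYPE = enum.auto()
    UNKNOWN_FILENAME = enum.auto()


# Assumed bcl2fastq pattern: <sample>_S<n>_L<lane>_<read>_001.fastq[.gz]
BCL2FASTQ = re.compile(r"^.+_S\d+_L\d{3}_(?P<read>[IR][12])_001\.fastq(\.gz)?$")
# Fallback: a bare I1/I2/R1/R2 token that is not part of a longer word.
FALLBACK = re.compile(r"(?<![A-Za-z0-9])(?P<read>[IR][12])(?![A-Za-z0-9])")


def detect_read_type(filename: str) -> tuple[Optional[str], RunIssue]:
    """Return the read type (I1, I2, R1 or R2) and any issue encountered."""
    match = BCL2FASTQ.match(filename)
    if match:
        return match.group("read"), RunIssue.NONE
    match = FALLBACK.search(filename)
    if match:
        # The token may be a coincidence, so report it instead of trusting it.
        warnings.warn(f"Guessing the read type of {filename} from its name; "
                      "this is unreliable.")
        return match.group("read"), RunIssue.AMBIGUOUS_READ_TYPE
    return None, RunIssue.UNKNOWN_FILENAME
```

Because RunIssue is an enum.Flag, the issues of all the files in a run can be accumulated into a single value with `|=` and tested with `in`.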

Add more metadata.

Rename fastq_file_types to read_types and add an enumerated type for possible values.
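
One possible shape for the enumerated type, as a sketch only. It assumes that a read-types string carries one letter per read, with B for biological and T for technical, which is my understanding of the convention used by the SRA loaders. The names ReadType and parse_read_types are hypothetical and may not match what the branch defines.

```python
import enum


class ReadType(enum.Enum):
    """Kind of read in an SRA spot (assumed one-letter encoding)."""
    BIOLOGICAL = "B"
    TECHNICAL = "T"


def parse_read_types(value: str) -> list[ReadType]:
    """Turn a string such as 'TTB' into one ReadType per read.

    Raises ValueError on any letter that is not a known read type.
    """
    return [ReadType(letter) for letter in value.upper()]


print(parse_read_types("TTB"))
```

Using an Enum rather than raw strings means an unexpected letter fails loudly at parse time instead of propagating into layout detection.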

Detect which pipeline branch to take by looking up the assay type of a dataset.
Add a special case for FAC-sorted single-cell datasets that should be treated as bulk.
Add support for 10x BAM SRA submissions. This is done by looking up the header of the BAM files to infer the sequencing layout and calling bamtofastq downstream on the original submission.
Temporarily use the branch of bioluigi with improved sratools support and Cell Ranger.
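
A rough sketch of the BAM header lookup. It assumes the header is already available as text (for instance from `samtools view -H`) and that 10x BAMs carry `@CO` comment lines of the form `10x_bam_to_fastq:R1(CR:CY,UR:UY)`, which is how I understand bamtofastq recovers the original reads. The function name and the sample header are illustrative only.

```python
import re

# Assumed comment format: @CO<TAB>10x_bam_to_fastq:<read>(<tag:qual>,...)
BAM_TO_FASTQ_COMMENT = re.compile(
    r"^@CO\t10x_bam_to_fastq:(?P<read>[IR][12])\((?P<tags>[^)]*)\)")


def reads_from_bam_header(header_text: str) -> dict[str, list[str]]:
    """Map each read (I1, I2, R1, R2) declared in the header to its tag pairs."""
    layout: dict[str, list[str]] = {}
    for line in header_text.splitlines():
        match = BAM_TO_FASTQ_COMMENT.match(line)
        if match:
            layout[match.group("read")] = match.group("tags").split(",")
    return layout


# Made-up header for illustration; not taken from a real submission.
EXAMPLE_HEADER = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@CO\t10x_bam_to_fastq:R1(CR:CY,UR:UY)\n"
    "@CO\t10x_bam_to_fastq:R2(SEQ:QUAL)\n"
    "@CO\t10x_bam_to_fastq:I1(BC:QT)\n"
)

print(sorted(reads_from_bam_header(EXAMPLE_HEADER)))
```

An empty result would suggest the BAM did not come from a 10x pipeline, in which case calling bamtofastq is unlikely to help.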

arteymix · 2025-10-09T16:43:01Z

example.luigi.cfg

+[rnaseq_pipeline.sources.sra]
+# location where tools like prefetch and fastq-dump will store downloaded SRA files
+# you can get this value with vdb-config -p
+ncbi_public_dir=/cosmos/scratch/ncbi/public


I've encountered issues with parsing the output of vdb-config, so this is a more robust solution overall.
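
To illustrate how the setting is read, here is a sketch that uses only the standard library. The pipeline itself presumably goes through luigi's configuration machinery, so configparser is a stand-in here and read_ncbi_public_dir is a hypothetical helper.

```python
import configparser
from pathlib import Path

SECTION = "rnaseq_pipeline.sources.sra"

EXAMPLE_CFG = """
[rnaseq_pipeline.sources.sra]
# location where tools like prefetch and fastq-dump will store downloaded SRA files
ncbi_public_dir=/cosmos/scratch/ncbi/public
"""


def read_ncbi_public_dir(cfg_text: str) -> Path:
    """Return ncbi_public_dir from the SRA section.

    Raises KeyError if the section or the option is missing, so a
    misconfigured deployment fails early rather than at download time.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return Path(parser[SECTION]["ncbi_public_dir"])


print(read_ncbi_public_dir(EXAMPLE_CFG))
```

Reading an explicit value avoids depending on the exact output format of vdb-config, at the cost of one more option to keep in sync with the sratools setup.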

arteymix · 2025-10-09T16:44:35Z

rnaseq_pipeline/rnaseq_utils.py

+                  is_single_end: bool = False, is_paired: bool = False):
+    """Detects the layout of the sequencing run files based on their names and various additional information.
+
+    :param run_id: Identifier for the run


Mention here that a run is akin to a lane.


arteymix force-pushed the feature-cell-ranger branch 3 times, most recently from a9daf79 to 89ec252 on September 15, 2025 22:31


This was linked to issues Sep 18, 2025

Handle SRA experiments with multiple lanes mapped on distinct runs #94

Open

Remove --clip and --skip-technical from fastq-dump #87

Closed

arteymix force-pushed the feature-cell-ranger branch 2 times, most recently from 9e01957 to f250d94 on September 21, 2025 23:50

arteymix added 6 commits September 23, 2025 16:15

Use efetch directly with -id instead of esearch

8a285bc

Use conda-incubator/setup-miniconda

4a35d2b

Fix missing GemmaTaskMixin import

6b21de9

Skip checking GemmaDatasetHasBatch since it requires credentials

8ad0e47

Ignore SRA runs that do not contain transcriptomic RNA-Seq data

749eea6

arteymix force-pushed the feature-cell-ranger branch from 43c04ed to a987102 on September 23, 2025 23:21

Parse the --readTypes option

1d0932d

Add more metadata.

arteymix force-pushed the feature-cell-ranger branch from a987102 to 1d0932d on September 23, 2025 23:41

arteymix and others added 4 commits September 24, 2025 11:09

Improve and fix logging for extracting SRA metadata

bb11982

Rename fastq_file_types to read_types and add an enumerated type for possible values.

Validate SRA metadata by reading it prior to writing it to disk

7861dcc

Do not open the browser in Google OAuth flow

0035f85


arteymix added 6 commits October 9, 2025 10:14

Update Python to 3.12

819368a

Improvements for local source

9310e18

fixup! Add support for single-cell RNA-Seq datasets

f4d0588

Mark test data as generated

46cd60e

Add missing test data file

77f595b

Fix Makefile

1396137

arteymix added 2 commits October 9, 2025 10:58

Replace luigi-wrapper with a simple CLI tool

29a3cfb

Skip fac-sorted dataset test since it's not public

cb32904

This was linked to issues Oct 9, 2025

Integrate Cell Ranger for reprocessing single-cell datasets based on 10x Chromium #97

Open

Handle 10x Cell Ranger submissions to SRA from BAM #102

Open

Add a task for retrieving GEO platform metadata #99

Open

arteymix mentioned this pull request Oct 9, 2025

Add a task for retrieving GEO platform metadata #99

Open

arteymix added 2 commits October 9, 2025 11:46

sra: Cache BAM headers

f8df3d2

Delete organized single-cell data implement remove() to DownloadRunTa…

0757134

…rget

arteymix self-assigned this Oct 9, 2025

arteymix added this to the 2.2.0 milestone Oct 9, 2025
