New reader for G4X datasets (Singular Genomics) #281

Open: wants to merge 12 commits into base: main

@ckmah ckmah commented Feb 19, 2025

👋 Hello @scverse/spatialdata and community,

I would like to contribute the initial version of a spatialdata-io reader for Singular Genomics G4X datasets that I recently developed for internal use (I work at Singular), and now for the spatial community. It is still experimental and not fully battle-tested, but I tried to keep the API as consistent as possible with the other readers. However, there are a few key additions I made to streamline use with our datasets:

Notable features

  • Incremental I/O of elements (images, tables, etc.). G4X datasets can get pretty large since they are multimodal. Therefore, the reader saves each element as soon as it is converted, to reduce memory consumption (see "Reduce readers' memory consumption", #229) and mitigate data loss. This is handled via the g4x(..., mode="append") parameter; the user can choose mode="overwrite" to turn this off. The constructed SpatialData object is also re-read from disk automatically to take full advantage of lazy loading.
  • Read one or more samples at once. This corresponds to our assay design and enables converting an entire experiment with a single function call. The reader will then return a single SpatialData object or a list of them accordingly.
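
To make the two features above concrete, here is a minimal, self-contained sketch of the general pattern. It is not the code in this PR: `write_sample`, `read_samples`, the converter dictionaries, and the JSON files standing in for an on-disk store are all hypothetical names and simplifications chosen for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Union


def write_sample(converters: Dict[str, Callable[[], dict]], out_dir: Path) -> List[str]:
    """Convert elements one at a time, writing each to disk before the next."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, convert in converters.items():
        element = convert()  # only one converted element is held here at a time
        (out_dir / f"{name}.json").write_text(json.dumps(element))
        written.append(name)  # elements written so far survive a later failure
    return written


def read_samples(sample_dirs: Union[Path, List[Path]], out_root: Path):
    """Return one result for a single sample, or a list for several samples."""
    single = isinstance(sample_dirs, Path)
    dirs = [sample_dirs] if single else list(sample_dirs)
    results = []
    for d in dirs:
        # Hypothetical converters; a real reader would parse images, tables, etc.
        converters = {
            "images": lambda d=d: {"sample": d.name, "kind": "image"},
            "table": lambda d=d: {"sample": d.name, "kind": "table"},
        }
        results.append(write_sample(converters, out_root / d.name))
    return results[0] if single else results


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    one = read_samples(Path("sample_A"), root)
    many = read_samples([Path("sample_A"), Path("sample_B")], root)
    files_on_disk = sorted(p.name for p in (root / "sample_B").iterdir())
```

In this sketch, each element reaches disk before the next one is converted, which is the idea behind incremental writing; the single-versus-list return mirrors the behaviour described in the second bullet.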

Additional Dependencies

  • Some of our images are encoded in the JPEG 2000 (.jp2, .j2k) format and require the glymur package to read.

Misc.

Are there any other pieces I should have in this PR? Devs please let me know, I'm happy to add them. Here are relevant ones I can think of:

  • Documentation/tutorial notebook? (I'm also not sure if I used the @inject_docs decorator properly.)
  • Parse experimental metadata: sample names, positions, acquisition info etc.
  • spatialdata-io CLI compatibility
