module 'spatialdata' has no attribute 'match_sdata_to_table' #912

bellenger-l · 2025-03-25T10:26:11Z

Hello,

I wanted to filter my Xenium data, but I get an error when calling the function match_sdata_to_table.

Here is a reproducible example:

```python
import spatialdata as spd
from spatialdata.datasets import blobs

sdata = blobs()
sub_adata = sdata.tables["table"][:10]
sub_sdata = spd.match_sdata_to_table(
sdata=sdata, table_name="table", table=sub_adata, how="right"
)
```

It returns the following error:

AttributeError: module 'spatialdata' has no attribute 'match_sdata_to_table'. Did you mean: 'match_element_to_table'?

Desktop (optional):

RedHat (8.7)
spatialdata 0.3.0

How can I fix this?
Thanks for your time
Best
Lea


Pancreas-Pratik · 2025-03-27T20:53:55Z

Hi Lea,
I had this same issue. I learned that spatialdata 0.3.0 is the release version, and
match_sdata_to_table is currently only in the dev version.
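
A quick way to confirm this on your own machine is to check which version is installed and whether the module exposes the function. The sketch below uses only the standard library; the helper name `describe_api` is made up for illustration.

```python
import importlib
from importlib import metadata


def describe_api(package: str, attribute: str) -> str:
    """Report the installed version of `package` and whether it exposes `attribute`."""
    try:
        module = importlib.import_module(package)
    except ImportError:
        return f"{package} is not installed"
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Modules without distribution metadata (e.g. stdlib) end up here.
        version = "unknown"
    status = "available" if hasattr(module, attribute) else "missing"
    return f"{package} {version}: {attribute} is {status}"


print(describe_api("spatialdata", "match_sdata_to_table"))
```

If the function is reported as missing, the installed build predates the pull request that added it.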

To fix this, I had to install a dev version that includes pull request #627. I chose a commit that also includes pull request #883, just to avoid any other potential issues.

How to install a specific pull request:

Go to the main page https://github.com/scverse/spatialdata and click "Commits". Next to the commit you want, click the double-square icon to "Copy SHA". Then construct the pip install command for spatialdata, using the SHA to pin the specific dev version:

pip install git+https://github.com/scverse/spatialdata.git@6e259f0afbf67379ade21e62560910e89f752c68

You may need git as well in order to run that command. If you are on an institutional HPC, you can check whether git is already installed via module avail git, and then load it with module load <git output from module avail git>.
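
To make the pattern explicit, here is a minimal sketch that builds the argument passed to pip from a repository URL and a full commit SHA. The helper name `pip_spec_for_commit` is hypothetical, and the 40-character check assumes you copied the full SHA rather than an abbreviated one.

```python
import re


def pip_spec_for_commit(repo_url: str, sha: str) -> str:
    """Build the `git+<url>@<sha>` requirement string that pip understands."""
    # "Copy SHA" on GitHub gives the full 40-character lowercase hex hash.
    if not re.fullmatch(r"[0-9a-f]{40}", sha):
        raise ValueError(f"expected a full 40-character commit SHA, got {sha!r}")
    return f"git+{repo_url}@{sha}"


spec = pip_spec_for_commit(
    "https://github.com/scverse/spatialdata.git",
    "6e259f0afbf67379ade21e62560910e89f752c68",
)
print("pip install", spec)
```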

Respectfully,
Pratik

bellenger-l · 2025-03-31T11:45:43Z

Hello @Pancreas-Pratik,

Thanks a lot for your help!! Your explanation was completely clear. The function works now, but I lost a lot of information (all Images, Points, Labels and some Shapes) in my subsetted spatialdata object. I think that is a separate problem, so I'll take a closer look.

Best regards,
Lea

Pancreas-Pratik · 2025-03-31T13:44:50Z

You are welcome @bellenger-l
I am unsure about the loss of information. I am just learning how to use spatialdata myself, so I cannot pinpoint where your issue is or help you troubleshoot it.

What I do is load a fresh spatialdata object via xenium() every time I am analyzing, and then run through all of the code I have written and saved in my Jupyter .ipynb notebook, from the start to where I left off. So in your case, if you did it this way, the code you used for subsetting would have to be re-run every time you restart your Jupyter notebook kernel. Maybe this is bad practice, since I have to read and write the entire Xenium /outs/ folder every time I work on this project, but it feels cleaner to me in terms of reproducibility (whoever runs my code should find that it works for them every time).

I have not completely figured out the .zarr file saving option yet. I think that is how to save progress on changes made to a spatialdata object at an intermediate stage?
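
For what it's worth, a checkpoint pattern along these lines might look like the sketch below. It assumes the `SpatialData.write()` method and the `spatialdata.read_zarr()` function; the helper name `load_or_build` and the paths in the comments are hypothetical.

```python
from pathlib import Path


def load_or_build(zarr_path, build):
    """Reuse an existing Zarr checkpoint if present; otherwise build the object and save it."""
    # Imported lazily so the helper can be defined before spatialdata is loaded.
    import spatialdata as spd

    path = Path(zarr_path)
    if path.exists():
        # A checkpoint already exists on disk: read it instead of recomputing.
        return spd.read_zarr(path)
    sdata = build()    # e.g. lambda: xenium("/path/to/outs") from spatialdata_io
    sdata.write(path)  # save the whole object as a Zarr store
    return sdata


# Hypothetical usage (paths are placeholders):
# from spatialdata_io import xenium
# sdata = load_or_build("object.zarr", lambda: xenium("/path/to/outs"))
```

The first run pays the full cost of building the object; later runs only read the store. Deleting the .zarr folder forces a rebuild from the raw data.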

bellenger-l · 2025-03-31T14:31:13Z

There are two separate things, I'm afraid...

1. the Zarr store, which lets you keep your spatialdata object in a separate space where some elements can be saved throughout your analysis.
2. the loss of information due to the filtering.

For instance, I use the Zarr store because I am testing spatialdata and different spatial transcriptomics packages. Some steps are very time-consuming and I don't necessarily want to recompute everything from the beginning. It works fine in my opinion, except that when we want to save the table (the anndata within the spatialdata object), we need to save the entire Zarr store again.

Regarding the match_sdata_to_table function, this is what I see:

My spatialdata object before filtering (sdata):

SpatialData object, with associated Zarr store: /home/blabla/object.zarr
├── Images
│     ├── 'he_image': DataArray[cyx] 
│     ├── 'morphology_focus': DataTree[cyx] 
│     └── 'morphology_mip': DataTree[cyx] 
├── Labels
│     ├── 'cell_labels': DataTree[yx] 
│     └── 'nucleus_labels': DataTree[yx] 
├── Points
│     └── 'transcripts': DataFrame with shape: (<Delayed>, 10) (3D points)
├── Shapes
│     ├── 'cell_boundaries': GeoDataFrame shape: (413404, 1) (2D shapes)
│     ├── 'cell_circles': GeoDataFrame shape: (413404, 2) (2D shapes)
│     ├── 'nucleus_boundaries': GeoDataFrame shape: (413404, 1) (2D shapes)
│     └── 'tissue_outline': GeoDataFrame shape: (27, 1) (2D shapes)
└── Tables
      └── 'table': AnnData (413404, 426)
with coordinate systems:
    ▸ 'global', with elements:
        he_image (Images), morphology_focus (Images), morphology_mip (Images), cell_labels (Labels), nucleus_labels (Labels), transcripts (Points), cell_boundaries (Shapes), cell_circles (Shapes), nucleus_boundaries (Shapes), tissue_outline (Shapes)

My subsetted spatialdata object, obtained with the following command: sub_sdata = spd.match_sdata_to_table(sdata=sdata, table_name="table", table=sub_adata, how="right")

SpatialData object
├── Shapes
│     └── 'cell_circles': GeoDataFrame shape: (411085, 2) (2D shapes)
└── Tables
      └── 'table': AnnData (411085, 426)
with coordinate systems:
    ▸ 'global', with elements:
        cell_circles (Shapes)

So even when I perform the filtering and assign the result to another variable, I correctly retrieve the cells of interest, but at the expense of the other spatialdata slots.
What I don't know is why. My hypothesis is that the annotation table is not linked to the other elements of the spatialdata object (only to cell_circles?).

Did you see the same behavior when using the match_sdata_to_table function?
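
One way to test this hypothesis is to look at the annotation metadata stored on the table. The sketch below assumes that this metadata lives under `table.uns["spatialdata_attrs"]` with a `region` entry; the helper name `annotated_regions` is made up for illustration.

```python
def annotated_regions(table):
    """Return the names of the spatial elements that `table` declares it annotates."""
    # Assumption: the annotation metadata is stored under this key in `.uns`.
    attrs = table.uns.get("spatialdata_attrs", {})
    region = attrs.get("region")
    if region is None:
        return []
    # `region` may be a single name or a list of names.
    if isinstance(region, str):
        return [region]
    return list(region)


# Hypothetical usage with a real object:
# print(annotated_regions(sdata.tables["table"]))
```

If this only lists cell_circles, that would be consistent with cell_circles being the only element kept after matching.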

Pancreas-Pratik · 2025-04-02T00:45:01Z

Oh. I can help with this.

I was having trouble with the same thing, which is, essentially, putting the anndata back into the sdata and subsetting cell_boundaries and cell_circles (I imagine the same can be done for nucleus_boundaries). @LucaMarconato actually helped me with this exact issue. His solution is here: #898 (comment). Below is how I implemented his solution for myself, with the object names changed to yours.

How to subset the `cell_circles` and `cell_boundaries` and re-input subsetted anndata back into the spatialdata (See ***Note below):

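# keep a handle on the original sdata first
# (note: plain assignment does not copy; both names refer to the same object)
sdata_backup = sdata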

# subset for cell_circles
sub_sdata = spd.match_sdata_to_table(sdata=sdata, table_name="table", table=sub_adata, how="right")
sdata.shapes['cell_circles']=sub_sdata.shapes['cell_circles']

# repeat for cell_boundaries
sdata["table"].obs["region"] = "cell_boundaries"
sdata.set_table_annotates_spatialelement(table_name="table", region="cell_boundaries")

sub_sdata = spd.match_sdata_to_table(sdata=sdata, table_name="table", table=sub_adata, how="right")
sdata.shapes['cell_boundaries']=sub_sdata.shapes['cell_boundaries']

# and then re-input the anndata
sdata.tables["table"] = sub_adata

# sdata should have cell_circles and cell_boundaries filtered by sub_adata and sub_adata should be re-inputted back into the sdata 
sdata
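
If you would rather start from the matched object and put the dropped elements back, the idea can be sketched as below. This assumes the element containers (`images`, `labels`, `points`) behave like dictionaries, as the `sdata.shapes[...] = ...` assignments above suggest; the helper name `carry_over_elements` is hypothetical. The elements are shared between the two objects, not copied, and they are not filtered in any way.

```python
def carry_over_elements(source, target, kinds=("images", "labels", "points")):
    """Add elements that exist in `source` but are missing from `target`; return their names."""
    copied = []
    for kind in kinds:
        src = getattr(source, kind)
        dst = getattr(target, kind)
        for name, element in src.items():
            if name not in dst:
                dst[name] = element  # shared reference, not a copy
                copied.append(f"{kind}/{name}")
    return copied


# Hypothetical usage:
# carry_over_elements(sdata, sub_sdata)
```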

***Note: In @LucaMarconato's solution to my issue above, he said the current solution is only a temporary fix.
From #898 (comment):

Two comments:

we are currently improving the ergonomics around these type of operations with a new API called match_sdata_to_table(), merged here #627, and with a work-in-progress PR for a new API called filter_table_by_query(), discussed here #894. Also, squidpy will offer APIs similar to the scanpy ones. The implementation for the moment will be separate because first we need to enable the join APIs (used in the functions above), to return a view and not always a copy. This is being worked out in this spatialdata PR here #701. The squidpy PR is this one here: scverse/squidpy#967
