Simulate errors out when celltype names are only numbers, requires text prefix to run correctly. #100

@nagendraKU

I ran scaden simulate with the Leiden cluster numbers as cell type names. It failed with the following error, and the data.h5ad file was not created.

INFO Datasets: ['testdata_all_bat'] bulk_simulator.py:84
INFO Simulating data from testdata_all_bat bulk_simulator.py:89
INFO Loading testdata_all_bat dataset ... bulk_simulator.py:141
INFO Merging unknown cell types: ['unknown'] bulk_simulator.py:107
INFO Subsampling testdata_all_bat ... bulk_simulator.py:110
/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_core/anndata.py:120: ImplicitModificationWarning: Transforming to str index.
warnings.warn("Transforming to str index.", ImplicitModificationWarning)
... storing 'ds' as categorical
Traceback (most recent call last):
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/utils.py", line 209, in func_wrapper
return func(elem, key, val, *args, **kwargs)
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/h5ad.py", line 247, in write_dataframe
col_names = [check_key(c) for c in df.columns]
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/h5ad.py", line 247, in <listcomp>
col_names = [check_key(c) for c in df.columns]
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/utils.py", line 109, in check_key
raise TypeError(f"{key} of type {typ} is an invalid key. Should be str.")
TypeError: 0 of type <class 'int'> is an invalid key. Should be str.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/ku_user/scadendl/bin/scaden", line 8, in <module>
sys.exit(main())
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/scaden/main.py", line 48, in main
cli()
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/click/core.py", line 1137, in __call__
return self.main(*args, **kwargs)
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/click/core.py", line 1062, in main
rv = self.invoke(ctx)
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/click/core.py", line 1668, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/click/core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/scaden/main.py", line 215, in simulate
fmt=data_format,
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/scaden/simulate.py", line 22, in simulation
bulk_simulator.simulate()
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/scaden/simulation/bulk_simulator.py", line 90, in simulate
self.simulate_dataset(dataset)
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/scaden/simulation/bulk_simulator.py", line 130, in simulate_dataset
ann_data.write(os.path.join(self.out_dir, dataset + ".h5ad"))
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_core/anndata.py", line 1911, in write_h5ad
as_dense=as_dense,
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/h5ad.py", line 111, in write_h5ad
write_attribute(f, "obs", adata.obs, dataset_kwargs=dataset_kwargs)
File "/usr/lib64/python3.6/functools.py", line 807, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/h5ad.py", line 130, in write_attribute_h5ad
_write_method(type(value))(f, key, value, *args, **kwargs)
File "/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_io/utils.py", line 216, in func_wrapper
) from e
TypeError: 0 of type <class 'int'> is an invalid key. Should be str.

Above error raised while writing key 'obs' of <class 'h5py._hl.files.File'> from /.
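
To illustrate what the traceback seems to be saying: the write step rejects any column label in obs that is not a str, and a purely numeric cell type name ends up as an int. The snippet below is only a sketch of that check, not anndata's or scaden's actual code; `find_non_string_keys` and the `labels` list are hypothetical names made up for the example.

```python
def find_non_string_keys(keys):
    """Return the keys that are not str, similar in spirit to the failing check."""
    return [k for k in keys if not isinstance(k, str)]


# Hypothetical column labels, as they might look when cluster numbers
# are parsed as integers instead of text.
labels = [0, 1, 2, "unknown"]
bad = find_non_string_keys(labels)
print(bad)

# Casting every label to str leaves nothing for the check to reject.
fixed = [str(k) for k in labels]
print(find_non_string_keys(fixed))
```

If this reading is right, casting the labels to str before the write would probably avoid the error without needing a text prefix.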

I then prepended "celltype_" to the Leiden cluster numbers (e.g. celltype_13) in the celltype file, and simulate ran correctly and generated the data.h5ad file. I still get the following warning, though.

/home/ku_user/scadendl/lib64/python3.6/site-packages/anndata/_core/anndata.py:120: ImplicitModificationWarning: Transforming to str index.
warnings.warn("Transforming to str index.", ImplicitModificationWarning)
... storing 'ds' as categorical
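
For anyone hitting the same thing, the workaround above can be scripted. This is a minimal sketch that assumes the celltype file is tab-separated with a header row and a column named "Celltype"; the function name, the column name default, and the prefix are my assumptions for illustration, so adjust them to your file.

```python
import csv


def prefix_numeric_celltypes(in_path, out_path, column="Celltype", prefix="celltype_"):
    """Copy a tab-separated file, prefixing purely numeric labels in `column`."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.DictWriter(
            dst, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        for row in reader:
            label = row[column].strip()
            # Only rewrite labels made up entirely of digits, e.g. "13".
            if label.isdigit():
                row[column] = prefix + label
            writer.writerow(row)
```

Calling `prefix_numeric_celltypes("in.txt", "out.txt")` (hypothetical paths) turns a label such as 13 into celltype_13 and leaves labels like unknown untouched.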
