All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Fixed `op.translate` (and every `op.resource` call for non-human organisms) failing with HTTP 404 after HGNC dropped the `genenames/` subtree from the EBI FTP mirror; the HCOP fifteen-column files are now fetched from the new HGNC Google Cloud Storage bucket (#303)
- Removed a redundant `pd.read_csv` call in `op.translate` that re-downloaded the HCOP file a second time and bypassed the retry logic in `_download`
- Fixed `bm.benchmark` ground-truth matrix row order not matching score matrix after `pivot`, which silently misaligned perturbation labels when obs names sort differently from their original order (#302)
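
The row-order problem fixed in #302 can be reproduced outside of decoupler. The sketch below uses plain pandas with made-up observation names; the column names `obs`, `source` and `score` are assumptions for illustration, not the names used inside `bm.benchmark`. `DataFrame.pivot` returns its index sorted, so a matrix built from it has to be reindexed before it is compared row by row with a matrix kept in the original order.

```python
import pandas as pd

# Hypothetical long-format scores; obs names are deliberately not in sorted order.
obs_order = ["s10", "s2", "s1"]
long = pd.DataFrame(
    {
        "obs": ["s10", "s10", "s2", "s2", "s1", "s1"],
        "source": ["A", "B", "A", "B", "A", "B"],
        "score": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    }
)

# pivot returns rows sorted by obs name, not in order of appearance.
wide = long.pivot(index="obs", columns="source", values="score")
sorted_order = list(wide.index)

# Reindexing restores the original order so rows line up with the ground truth.
aligned = wide.reindex(obs_order)
aligned_order = list(aligned.index)
first_row = aligned.loc["s10"].tolist()
```

Without the `reindex` step, the first row of `wide` belongs to `s1` while the first label in `obs_order` is `s10`, which is the kind of silent mismatch described above.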
- `pl.volcano` now accepts a gene name (`str`) or list of gene names (`list[str]`) for the `top` parameter to annotate specific features on volcano plots
- Refactored `ds.ensmbl_to_symbol` to reuse `_download` and fixed mirror fallback to actually switch between Ensembl mirrors
- Made `tqdm` progress bar compact in `_download`
- Fixed `pl.dotplot` ignoring `vcenter=0` due to falsy check (#293)
- Fixed `_log` setting the root logging level to INFO for all packages (#296)
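
The `vcenter=0` bug is an instance of a common Python pitfall: testing an optional numeric argument by truthiness. The functions below are hypothetical stand-ins, not the code in `pl.dotplot`; they only show why `0` gets lost and how an explicit `None` check avoids it.

```python
def pick_center_buggy(vcenter=None):
    # Truthiness test: 0 is falsy, so vcenter=0 is treated as "not given".
    return "centered" if vcenter else "default"


def pick_center_fixed(vcenter=None):
    # Identity test against None keeps 0 as a valid value.
    return "centered" if vcenter is not None else "default"


buggy_zero = pick_center_buggy(0)
fixed_zero = pick_center_fixed(0)
fixed_none = pick_center_fixed()
```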
- Added `alternative` argument to `decoupler.mt.query_set`. By default `'greater'`, before it was `'two-sided'`
- Unpinned `scipy` version limit
- Fixed missing progressbar for `decoupler._download._download`
- Added missing `decoupler.mt.query_set` documentation
- `pp.adjmat` now returns the same features as used as input instead of the subset of `net`
- `pp.pseudobulk` now returns features in the same order as the input instead of shuffling them
- Added a dedicated header and 5 attempts to `_download` to mitigate 429 Client Error from Zenodo downloads
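
A bounded retry loop of the kind mentioned for `_download` can be sketched as follows. This is not the library's code: `TooManyRequests`, `fetch_with_retries` and `flaky_fetch` are hypothetical names, and the exponential backoff between attempts is an assumption, since the entry only states that there are 5 attempts. A fake fetcher is used so the example runs without network access.

```python
import time


class TooManyRequests(Exception):
    """Stand-in for an HTTP 429 response."""


def fetch_with_retries(fetch, attempts=5, base_delay=0.0, sleep=time.sleep):
    """Call fetch() up to `attempts` times, waiting longer after each 429."""
    for i in range(attempts):
        try:
            return fetch()
        except TooManyRequests:
            if i == attempts - 1:
                raise  # out of attempts: surface the error
            sleep(base_delay * 2**i)  # exponential backoff (an assumption)


# Fake server that rejects the first two calls, then answers.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TooManyRequests()
    return b"payload"


result = fetch_with_retries(flaky_fetch)
```

Capping the number of attempts keeps a persistent outage from blocking forever, while the growing delay gives a rate-limited server time to recover.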
- `pp.query_set` to test the overlap between a given feature set and a database of sets
- `tl.rankby_obsm` now accepts `AnnData.obs` column names specified in the `obs_keys` argument
- Most plotting functions now accept extra arguments through `kw_arguments`.
- p-values now are corrected using a custom numba-optimized version of `scipy.stats.false_discovery_control` called `_fdr_bh_axis1_numba`
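
For readers unfamiliar with the correction, the following dependency-free sketch shows the Benjamini-Hochberg procedure applied to each row of a matrix independently, which is what an "axis 1" correction means. It is a plain-Python illustration under that assumption, not the numba implementation in decoupler; `bh_adjust` and `bh_adjust_rows` are hypothetical names.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values for one row."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def bh_adjust_rows(matrix):
    """Correct each row independently (the axis=1 case)."""
    return [bh_adjust(row) for row in matrix]


pvals = [[0.01, 0.04, 0.03, 0.20], [0.50, 0.001, 0.02, 0.02]]
padj = bh_adjust_rows(pvals)
```

Each adjusted value is the raw p-value scaled by `m / rank`, then capped by the adjusted values of all larger p-values in the same row, so adjusted values never decrease as raw p-values increase.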
- mypy checks in CI
- notebook checks in CI
- Fixed error in `decoupler.pp.pseudobulk` when `adata.obs_names` were not unique, now throws verbose error
- Fixed corner case in `decoupler.mt.gsea` when p-values were infinite and could not be corrected
- Updated logo
- `decoupler._download._download` now returns bytes instead of a dataframe. To transform to `pandas.DataFrame` use `decoupler._download._bytes_to_pandas`
- Enrichment methods and pseudobulking now work with backed AnnData objects, useful when working with big datasets and memory is limited
- Fixed error in `pl.obsm` where default value of `cmap_obs` was not properly set.
- Added `pre-commit` functionality to the repository
- Modified links and paths to follow scverse's repository
- Fixed error message when extra dependencies were not installed
- Fixed `dcor` import bug as an external dependency
- Fixed error in `pp.pseudobulk` when obs columns were not categorical
- Allowed ordering functions (`pp.bin_order`, `pl.order` and `pl.order_targets`) not to be bound between 0 and 1
- Added ipywidgets as dependency
- Silenced xgboost warnings
- Handled corner case in `bm.metric.auc` when scores are all 0
- Fixed error in `bm.metric.hmean` when metrics were str instead of list
- Fixed error when `obs` column is a list in `pp.pseudobulk`
- Fixed an error in `pp.pseudobulk` when handling empty samples or features
Major update to accommodate the scverse template {cite}`scverse`.
All functions have been rewritten to follow the new API; code written for previous versions (1.X.X) is expected to raise errors if `decoupler >= 2.0.0` is installed.
- Methods are now in the `mt` module and are built from shared class `Method`
    - Use `decoupler.mt.<method_name>` to call a method
    - `min_n` argument has been renamed `tmin`
    - New argument `bsize` allows running a method in batches in case excessive memory usage is an issue
    - $p_{values}$ of the enrichment scores are now corrected by Benjamini-Hochberg
    - `mdt` and `udt` are now based on `xgboost` instead of `sklearn` for better scalability. `udt` statistic is now the coefficient of determination $R^2$ instead of the importance of a single decision tree.
    - `mlm` and `ulm` now include a `tval` parameter, which allows returning either the t-value of the slope or the slope itself as the enrichment statistic
    - `ora` now returns the odds ratio of the contingency table as a statistic, and computes a two-sided Fisher exact test instead of a one-sided one
    - `viper` now correctly estimates shadow regulons when network weights are values other than -1 or +1
    - `wsum` and `wmean` are deprecated; the new method `waggr` runs both of them as well as any custom function. This makes it easier to quickly test new enrichment methods without having to deal with `decoupler`'s implementation
- Databases from Omnipath can now be accessed through the new `op` module
    - Use `decoupler.op.<resource_name>` to access a database
    - Removed the `omnipath` package as a dependency
    - Fixed `collectri` to the publication version instead of the OmniPath one
    - Made `progeny` only return significant genes by default instead of the top N genes per pathway
- Plots are now in a new `pl` module
    - Use `decoupler.pl.<plot_name>` to call a plot
    - They use a common class `Plotter` to make it easier to share arguments between them
    - `plot_violins` has been deprecated
    - Names that have changed
        - `plot_psbulk_samples` to `filter_samples`
        - `plot_running_score` to `leading_edge`
        - `plot_associations` to `obsm`
        - `plot_targets` to `source_targets`
- Preprocessing functions are now in the new `pp` module
    - Renamed `check_corr` to `net_corr`, now also returns adjusted $p_{values}$
    - Renamed `get_acts` to `get_obsm`
    - Renamed `get_pseudobulk` to `pseudobulk`. Now it does not automatically remove low quality samples, this is now done with the function `filter_samples`
    - Deprecated `get_contrast`, `get_top_targets` and `format_contrast_results`. `PyDESeq2` should be used instead
    - Moved `rank_sources_groups` to the new `tl` module as `rankby_group`
    - Moved `get_metadata_associations` to the new `tl` module as `rankby_obsm`
- Moved the benchmarking pipeline inside a new `bm` module, with its metrics and plotting functions in further submodules (`bm.metric` and `bm.pl`)
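
To make the `ora` change listed above concrete, the sketch below computes the odds ratio of a 2x2 contingency table together with one-sided and two-sided Fisher exact p-values, using only the standard library. It is an illustration of the statistics involved, not decoupler's implementation; the function names and the example table are made up, and tables with a zero in `b` or `c` (where the sample odds ratio is undefined) are not handled.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact(a, b, c, d):
    """Return (one-sided 'greater', two-sided) Fisher exact p-values."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    # One-sided: tables with a top-left count at least as large as observed.
    greater = sum(prob(x) for x in range(a, hi + 1))
    # Two-sided: every table at least as unlikely as the observed one.
    probs = [prob(x) for x in range(lo, hi + 1)]
    two_sided = sum(p for p in probs if p <= p_obs * (1 + 1e-9))
    return greater, two_sided


# Hypothetical table: rows = in set / not in set, columns = hit / no hit.
stat = odds_ratio(8, 2, 1, 5)
p_greater, p_two_sided = fisher_exact(8, 2, 1, 5)
```

The two-sided p-value is never smaller than the one-sided one, because it also counts unlikely tables in the opposite tail, so results from the two variants are not directly comparable.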
- `ds` module with functions to download several datasets at different resolutions
    - Bulk: `hsctgfb` and `knocktf`
    - Single-cell: `pbmc3k`, `covid5k` and `erygast1k`
    - Spatial: `msvisium`
    - Toy data: `toy` and `toy_bench`
    - Utils: `ensmbl_to_symbol`
- New database functions in the `op` module
    - Added `hallmark` as a custom resource
- New plotting functions in the `pl` module
    - Added `obsbar` to plot size of metadata columns in `anndata.AnnData.obs`
    - Added `order` to plot sources or features along a continuous process such as a trajectory
    - Added `order_targets` to plot the targets of a given source along a continuous process
- New preprocessing functions in the `pp` module
    - Added two functions to format networks, `adjmat` to return an adjacency matrix, and `idxmax` to return a list of sets
    - Added `filter_samples` to filter pseudobulk profiles after running `pseudobulk`
    - Added `knn` to calculate K-Nearest Neighbors similarities based on spatial distances
    - Added `bin_order` to bin features across a continuous process
- `tl` module with functions to perform statistical tests
    - Added `rankby_order` to test for non-linear associations of features with a continuous process
- New benchmarking metrics and plotting related functions in the `bm` module
    - Added two more metrics, `F-score` and `qrank`
    - Added shared plots for metrics, `bm.pl.auc`, `bm.pl.fscore` and `bm.pl.qrank`
    - Added a summary plot across metrics `bm.pl.summary`