
Commit f762651 (1 parent: 037533d)

clean code; fix show_figs showing figures when set to False in caps2surf; add variance ratio and davies_bouldin; add NaN in the DataFrame for CAPs that don't exist in a group, instead of zero, to show a better distinction; and raise an error in timeseries if no tr is specified but a condition is requested

File tree

12 files changed: +426 -303 lines

CHANGELOG.md

+20
@@ -41,6 +41,26 @@ noted in the changelog (i.e new functions or parameters, changes in parameter de
 - *.patch* : Contains no new features, simply fixes any identified bugs.
 - *.postN* : Consists of only metadata-related changes, such as updates to type hints or doc strings/documentation.

+## [0.12.0] - 2024-06-26
+- Entails some code cleaning and verification to ensure that the code cleaned for clarity purposes produces the same
+  results.
+
+### 🚀 New/Added
+- Davies-Bouldin and Variance Ratio (Calinski-Harabasz) cluster selection methods added.
+
+### ♻ Changed
+- For `CAPs.calculate_metrics()`, if performing an analysis on groups where each group has a different number of CAPs,
+  then for "temporal_fraction", "persistence", and "counts", "nan" values will be seen for CAP numbers that exceed the
+  group's number of CAPs.
+  - For instance, if group "A" has 2 CAPs but group "B" has 4 CAPs, the DataFrame will contain columns for CAP-1,
+    CAP-2, CAP-3, and CAP-4. However, for all members in group "A", CAP-3 and CAP-4 will contain "nan" values to
+    indicate that these CAPs are not applicable to the group. This differentiation helps distinguish between CAPs
+    that are not applicable to the group and CAPs that are applicable but had zero instances for a specific member.
+
+### 🐛 Fixes
+- Raises an error earlier when `tr` is not specified or cannot be retrieved from the BOLD metadata while a condition
+  is specified, instead of allowing the pipeline to produce this error later.
+- Fixed an issue with `show_figs` in `CAP.caps2surf()` showing the figure when set to False.
+
 ## [0.11.3] - 2024-06-24
 ### ♻ Changed
 - With parallel processing, joblib outputs are now returned as a generator as opposed to the default, which is a list,
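The NaN-versus-zero distinction described in the Changed entry can be illustrated with a small, hypothetical DataFrame (the column layout mirrors the description above, but the values and subject IDs are invented for illustration, not actual neurocaps output):

```python
import numpy as np
import pandas as pd

# Hypothetical temporal fraction table: group "A" has 2 CAPs, group "B" has 4.
# NaN marks CAPs that do not exist for a subject's group; 0.0 means the CAP
# exists for the group but had zero instances for that subject.
df = pd.DataFrame(
    {
        "Subject_ID": ["01", "02"],
        "Group": ["A", "B"],
        "CAP-1": [0.6, 0.25],
        "CAP-2": [0.4, 0.25],
        "CAP-3": [np.nan, 0.0],
        "CAP-4": [np.nan, 0.5],
    }
)

# Pandas aggregations skip NaN by default, so group summaries are not
# diluted by columns that never applied to the group.
print(df.loc[df["Group"] == "A", "CAP-3"].isna().all())  # True
print(df["CAP-3"].mean())  # 0.0 (only group "B" contributes)
```

With zeros instead of NaN, group "A" would appear to have visited CAP-3 and CAP-4 zero times, which is a different claim than those CAPs not existing for the group.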

README.md

+5 -4
@@ -1,5 +1,6 @@
 # neurocaps
-[![DOI](https://img.shields.io/badge/DOI-10.5281%2Fzenodo.11642615-blue)](https://doi.org/10.5281/zenodo.12523896)
+[![Latest Version](https://img.shields.io/pypi/v/neurocaps.svg)](https://pypi.python.org/pypi/neurocaps/)
+[![DOI](https://img.shields.io/badge/DOI-10.5281%2Fzenodo.11642615-blue)](https://doi.org/10.5281/zenodo.12555589)
 [![Test Status](https://github.com/donishadsmith/neurocaps/actions/workflows/testing.yaml/badge.svg)](https://github.com/donishadsmith/neurocaps/actions/workflows/testing.yaml)
 [![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
@@ -100,9 +101,9 @@ The provided example demonstrates setting up a custom parcellation containing no
 - **Parallel Processing:** Use parallel processing by specifying the number of CPU cores in the `n_cores` parameter in the `get_bold()` method. Testing on an HPC using a loop with `TimeseriesExtractor.get_bold()` to extract session 1 and 2 BOLD timeseries from 105 subjects from resting-state data (single run containing 360 volumes) and two task datasets (three runs containing 200 volumes each and two runs containing 200 volumes) reduced processing time from 5 hours 48 minutes to 1 hour 26 minutes (using 10 cores). *Note:* If you are using an HPC, remember to allocate the appropriate amount of CPU cores with your workload manager. For instance, in Slurm use `#SBATCH --cpus-per-task=10` if you intend to use 10 cores.

 **Main features for `CAP` include:**
-- **Optimal Cluster Size Identification:** Perform the silhouette or elbow method to identify the optimal cluster size, saving the optimal model as an attribute.
-- **Parallel Processing:** Use parallel processing, when using the silhouette or elbow method, by specifying the number of CPU cores in the `n_cores` parameter in the `get_caps()` method. *Note:* If you are using an HPC, remember to allocate the appropriate amount of CPU cores with your workload manager. For instance, in Slurm use `#SBATCH --cpus-per-task=10` if you intend to use 10 cores.
-- **Grouping:** Perform CAPs analysis for entire sample or groups of subject IDs (using the `groups` parameter when initializing the `CAP` class). K-means clustering, silhouette and elbow methods, and plotting are done for each group when specified.
+- **Optimal Cluster Size Identification:** Use the Davies-Bouldin, Silhouette, Elbow, or Variance Ratio (Calinski-Harabasz) criteria to identify the optimal cluster size, saving the optimal model as an attribute.
+- **Parallel Processing:** Use parallel processing with any of these cluster selection criteria by specifying the number of CPU cores in the `n_cores` parameter in the `get_caps()` method. *Note:* If you are using an HPC, remember to allocate the appropriate amount of CPU cores with your workload manager. For instance, in Slurm use `#SBATCH --cpus-per-task=10` if you intend to use 10 cores.
+- **Grouping:** Perform a CAPs analysis for the entire sample or for groups of subject IDs (using the `groups` parameter when initializing the `CAP` class). K-means clustering, all cluster selection methods (Davies-Bouldin, Silhouette, Elbow, or Variance Ratio), and plotting are done for each group when specified.
 - **CAP Visualization:** Visualize the CAPs as outer products or heatmaps, with options to use subplots to reduce the number of individual plots, as well as save. Refer to the [documentation](https://neurocaps.readthedocs.io/en/latest/generated/neurocaps.analysis.CAP.html#neurocaps.analysis.CAP.caps2plot) for the `caps2plot()` method in the `CAP` class for available `**kwargs` arguments and parameters to modify plots.
 - **Save CAPs as NifTIs:** Convert the atlas used for parcellation to a stat map and save it (`caps2niftis`).
 - **Surface Plot Visualization:** Convert the atlas used for parcellation to a stat map projected onto a surface plot, with options to customize and save plots. Refer to the [documentation](https://neurocaps.readthedocs.io/en/latest/generated/neurocaps.analysis.CAP.html#neurocaps.analysis.CAP.caps2surf) for the `caps2surf()` method in the `CAP` class for available `**kwargs` arguments and parameters to modify plots. Also includes the option to save the NifTIs. There is another parameter in `caps2surf`, `fslr_giftis_dict`, which can be used if the CAPs NifTI files were converted to GifTI files using a tool such as Connectome Workbench, which may work better for converting your atlas to fsLR space. This parameter allows plotting without re-running the analysis; only initializing the `CAP` class and using the `caps2surf` method is needed.
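The four cluster selection criteria named above can be sketched with plain scikit-learn (this is standalone illustration against toy data, not the `get_caps()` API; the blob data are invented):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (
    calinski_harabasz_score,  # Variance Ratio: higher is better
    davies_bouldin_score,     # lower is better
    silhouette_score,         # higher is better
)

rng = np.random.default_rng(0)
# Toy "concatenated timeseries": two well-separated clusters of frames
X = np.vstack([rng.normal(0, 0.5, (100, 4)), rng.normal(5, 0.5, (100, 4))])

scores = {}
for k in range(2, 5):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    scores[k] = {
        "davies_bouldin": davies_bouldin_score(X, model.labels_),
        "variance_ratio": calinski_harabasz_score(X, model.labels_),
        "silhouette": silhouette_score(X, model.labels_),
        "elbow": model.inertia_,  # inspect for the bend, not the minimum
    }

best_k = max(scores, key=lambda k: scores[k]["silhouette"])
print(best_k)  # 2 for this two-blob toy data
```

Note the criteria disagree in direction: Davies-Bouldin is minimized, while Silhouette and Variance Ratio are maximized, and inertia always decreases with k, which is why the elbow (bend) is inspected instead.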

docs/introduction.rst

+9 -5
@@ -1,7 +1,11 @@
 **neurocaps**
 =============
+.. image:: https://img.shields.io/pypi/v/neurocaps.svg
+   :target: https://pypi.python.org/pypi/neurocaps/
+   :alt: Latest Version
+
 .. image:: https://img.shields.io/badge/DOI-10.5281%2Fzenodo.11642615-blue
-   :target: https://doi.org/10.5281/zenodo.12523896
+   :target: https://doi.org/10.5281/zenodo.12555589
    :alt: DOI

 .. image:: https://github.com/donishadsmith/neurocaps/actions/workflows/testing.yaml/badge.svg
@@ -19,7 +23,7 @@ Citing
 ======
 ::

-   Smith, D. (2024). neurocaps. Zenodo. https://doi.org/10.5281/zenodo.12523896
+   Smith, D. (2024). neurocaps. Zenodo. https://doi.org/10.5281/zenodo.12555589

 Usage
 =====
@@ -91,10 +95,10 @@ Main features for ``TimeseriesExtractor`` includes:
 Main features for ``CAP`` include:
 -----------------------------------

-- **Optimal Cluster Size Identification:** Perform the silhouette or elbow method to identify the optimal cluster size, saving the optimal model as an attribute.
-- **Parallel Processing:** Use parallel processing, when using the silhouette or elbow method, by specifying the number of CPU cores in the ``n_cores`` parameter in the ``get_caps()`` method.
+- **Optimal Cluster Size Identification:** Use the Davies-Bouldin, Silhouette, Elbow, or Variance Ratio (Calinski-Harabasz) criteria to identify the optimal cluster size, saving the optimal model as an attribute.
+- **Parallel Processing:** Use parallel processing with any of these cluster selection criteria by specifying the number of CPU cores in the ``n_cores`` parameter in the ``get_caps()`` method.
   *Note:* If you are using an HPC, remember to allocate the appropriate amount of CPU cores with your workload manager. For instance, in Slurm use ``#SBATCH --cpus-per-task=10`` if you intend to use 10 cores.
-- **Grouping:** Perform CAPs analysis for entire sample or groups of subject IDs (using the ``groups`` parameter when initializing the ``CAP`` class). K-means clustering, silhouette and elbow methods, and plotting are done for each group when specified.
+- **Grouping:** Perform a CAPs analysis for the entire sample or for groups of subject IDs (using the ``groups`` parameter when initializing the ``CAP`` class). K-means clustering, all cluster selection methods (Davies-Bouldin, Silhouette, Elbow, or Variance Ratio), and plotting are done for each group when specified.
 - **CAP Visualization:** Visualize the CAPs as outer products or heatmaps, with options to use subplots to reduce the number of individual plots, as well as save.
   Refer to the `documentation <https://neurocaps.readthedocs.io/en/latest/generated/neurocaps.analysis.CAP.html#neurocaps.analysis.CAP.caps2plot>`_ for the ``caps2plot()`` method in the ``CAP`` class for available ``**kwargs`` arguments and parameters to modify plots.
 - **Save CAPs as NifTIs:** Convert the atlas used for parcellation to a stat map and save it (``caps2niftis``).

neurocaps/__init__.py

+1 -1
@@ -2,4 +2,4 @@

 __all__=["analysis", "extraction"]
 # Version in a single place
-__version__ = "0.11.3"
+__version__ = "0.12.0"

neurocaps/_utils/__init__.py

+1
@@ -2,6 +2,7 @@
 from ._check_parcel_approach import _check_parcel_approach
 from ._pickle_to_dict import _convert_pickle_to_dict
 from ._cap_internals import _cap2statmap
+from ._cap_internals import _create_node_labels
 from ._cap_internals import _CAPGetter
 from ._cap_internals import _run_kmeans
 from ._timeseriesextractor_internals import _check_confound_names
@@ -1,3 +1,4 @@
 from ._capgetter import _CAPGetter
 from ._cap2statmap import _cap2statmap
+from ._create_labels import _create_node_labels
 from ._run_kmeans import _run_kmeans

neurocaps/_utils/_cap_internals/_capgetter.py

+16 -8
@@ -8,14 +8,6 @@ def __init__(self):
         pass

     ### Attributes exist when CAP initialized
-    @property
-    def n_clusters(self):
-        return self._n_clusters
-
-    @property
-    def cluster_selection_method(self):
-        return self._cluster_selection_method
-
     @property
     def groups(self):
         return self._groups
@@ -31,6 +23,14 @@ def parcel_approach(self, parcel_dict):
         self._parcel_approach = _check_parcel_approach(parcel_approach=parcel_dict, call="setter")

     ### Attributes exist when CAP.get_caps() used
+    @property
+    def n_clusters(self):
+        return self._n_clusters if hasattr(self, "_n_clusters") else None
+
+    @property
+    def cluster_selection_method(self):
+        return self._cluster_selection_method if hasattr(self, "_cluster_selection_method") else None
+
     @property
     def n_cores(self):
         return self._n_cores if hasattr(self, "_n_cores") else None
@@ -47,6 +47,10 @@ def caps(self):
     def kmeans(self):
         return self._kmeans if hasattr(self, "_kmeans") else None

+    @property
+    def davies_bouldin(self):
+        return self._davies_bouldin if hasattr(self, "_davies_bouldin") else None
+
     @property
     def silhouette_scores(self):
         return self._silhouette_scores if hasattr(self, "_silhouette_scores") else None
@@ -55,6 +59,10 @@ def silhouette_scores(self):
     def inertia(self):
         return self._inertia if hasattr(self, "_inertia") else None

+    @property
+    def variance_ratio(self):
+        return self._variance_ratio if hasattr(self, "_variance_ratio") else None
+
     @property
     def optimal_n_clusters(self):
         return self._optimal_n_clusters if hasattr(self, "_optimal_n_clusters") else None
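The guarded-property pattern above (attributes populated only by `get_caps()` return `None` until the analysis has run) can be sketched in isolation; the class name and dict shape here are hypothetical:

```python
class Getter:
    """Minimal standalone sketch of the guarded-property pattern in _CAPGetter."""

    @property
    def davies_bouldin(self):
        # None until the analysis step sets the private attribute
        return self._davies_bouldin if hasattr(self, "_davies_bouldin") else None

    @property
    def variance_ratio(self):
        return self._variance_ratio if hasattr(self, "_variance_ratio") else None


g = Getter()
print(g.davies_bouldin)  # None: nothing has populated the attribute yet

# Simulate what the analysis step would do internally (dict shape invented)
g._davies_bouldin = {"All Subjects": {2: 0.8, 3: 1.1}}
print(g.davies_bouldin)
```

This avoids `AttributeError` when a user inspects scores before fitting, at the cost of `None` being ambiguous between "not yet computed" and "computed but empty".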
@@ -0,0 +1,40 @@
+"""Internal function to create labels at the node level for caps2plot"""
+import collections, re
+
+def _create_node_labels(parcellation_name, parcel_approach, columns):
+    # Get frequency of each major hemisphere and region in Schaefer, AAL, or Custom atlas
+    if parcellation_name == "Schaefer":
+        nodes = parcel_approach[parcellation_name]["nodes"]
+        # Retain only the hemisphere and primary Schaefer network
+        nodes = [node.split("_")[:2] for node in nodes]
+        frequency_dict = collections.Counter([" ".join(node) for node in nodes])
+    elif parcellation_name == "AAL":
+        nodes = parcel_approach[parcellation_name]["nodes"]
+        frequency_dict = collections.Counter([node.split("_")[0] for node in nodes])
+    else:
+        frequency_dict = {}
+        for names_id in columns:
+            # For custom, columns come in the form of "Hemisphere Region"
+            hemisphere_id = "LH" if names_id.startswith("LH ") else "RH"
+            region_id = re.split("LH |RH ", names_id)[-1]
+            node_indices = parcel_approach["Custom"]["regions"][region_id][hemisphere_id.lower()]
+            frequency_dict.update({names_id: len(node_indices)})
+
+    # Get the names, which indicate the hemisphere and region
+    # Converting Counter objects to a list retains the original ordering of nodes as of Python 3.7
+    names_list = list(frequency_dict)
+    labels = ["" for _ in range(0, len(parcel_approach[parcellation_name]["nodes"]))]
+
+    starting_value = 0
+
+    # Iterate through names_list and assign the starting indices corresponding to each unique region and hemisphere key
+    for num, name in enumerate(names_list):
+        if num == 0:
+            labels[0] = name
+        else:
+            # Shift by the frequency of the preceding network to obtain the new starting value of
+            # the subsequent region and hemisphere pair
+            starting_value += frequency_dict[names_list[num - 1]]
+            labels[starting_value] = name
+
+    return labels, names_list
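The starting-index logic in the new `_create_node_labels` can be traced with toy Schaefer-style node names (the names below are invented for illustration; real atlases have many more nodes):

```python
import collections

# Hypothetical Schaefer-style nodes: "<Hemisphere>_<Network>_<index>"
nodes = ["LH_Vis_1", "LH_Vis_2", "LH_SomMot_1", "RH_Vis_1", "RH_SomMot_1", "RH_SomMot_2"]

# Keep only hemisphere + primary network, then count each contiguous pair
pairs = [" ".join(node.split("_")[:2]) for node in nodes]
frequency_dict = collections.Counter(pairs)  # insertion-ordered (Python 3.7+)
names_list = list(frequency_dict)

# Place each pair's label only at the index where its run of nodes begins,
# leaving the remaining tick positions blank
labels = ["" for _ in nodes]
starting_value = 0
for num, name in enumerate(names_list):
    if num:
        starting_value += frequency_dict[names_list[num - 1]]
    labels[starting_value] = name

print(labels)  # ['LH Vis', '', 'LH SomMot', 'RH Vis', 'RH SomMot', '']
```

Each label lands on the first node of its hemisphere/network run, which is what lets `caps2plot` draw one tick label per region instead of one per node.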
@@ -1,14 +1,21 @@
 """Internal function for performing silhouette or elbow method with or without multiprocessing"""
 from sklearn.cluster import KMeans
-from sklearn.metrics import silhouette_score
+from sklearn.metrics import davies_bouldin_score, calinski_harabasz_score, silhouette_score

 def _run_kmeans(n_cluster, random_state, init, n_init, max_iter, tol, algorithm, concatenated_timeseries, method):
     model = KMeans(n_clusters=n_cluster, random_state=random_state, init=init, n_init=n_init, max_iter=max_iter,
                    tol=tol, algorithm=algorithm).fit(concatenated_timeseries)
-    if method == "silhouette":
-        cluster_labels = model.labels_
+
+    cluster_labels = model.labels_
+
+    if method == "davies_bouldin":
+        performance = {n_cluster: davies_bouldin_score(concatenated_timeseries, cluster_labels)}
+    elif method == "elbow":
+        performance = {n_cluster: model.inertia_}
+    elif method == "silhouette":
         performance = {n_cluster: silhouette_score(concatenated_timeseries, cluster_labels, metric="euclidean")}
     else:
-        performance = {n_cluster: model.inertia_}
+        # Variance Ratio
+        performance = {n_cluster: calinski_harabasz_score(concatenated_timeseries, cluster_labels)}

     return performance
