Summary
The Nelson lab bins recordings into fixed-duration segments (e.g., 2-minute bins) and correlates per-bin photometry statistics — mean fluorescence or transient count — with a continuous behavioral variable measured at the same temporal resolution (e.g., akinesia severity score, locomotion velocity). This analysis pattern is distinct from PSTH (event-triggered) and from tonic epoch comparison; it is a time-resolved, whole-session correlation that is not currently supported by GuPPy.
Motivation
- Rodrigo Paz (Nelson lab) described this as a primary analysis for their Parkinson's disease model experiments: "we take two minutes of signal, and we average either the number of transients or the raw fluorescence, and we correlate that with a manually scored thing [akinesia severity], for that bin"
- The correlation spans the whole session — comparing the fluorescence bin at one time point to the behavioral score at the same time point — to ask whether signal and behavior co-vary across the session
- This is specifically useful for long experiments where a manipulation happens and the lab wants to see how fluorescence tracks behavioral recovery or deterioration over time
- The existing cross-correlation module (`src/guppy/analysis/cross_correlation.py`) computes sample-by-sample correlation between two continuous time series; it is not designed for computing per-bin statistics and correlating them with an independent behavioral score vector
- Alexandra noted that photobleaching correction (Issue 01) is a prerequisite for this analysis to be valid: "if nothing was happening, for the signal to be dead flat for three hours. So that we can say, how does the velocity correlate during this two-minute bin here, and this two-minute bin here"
- The lab currently performs this analysis in Excel or custom scripts after exporting GuPPy z-scores, fragmenting the workflow
Proposed Solution
- Add a new `src/guppy/analysis/binned_correlation.py` module implementing:
  - Time-binning of the processed signal into fixed-duration, non-overlapping bins with a configurable width (e.g., `bin_width_sec = 120`)
  - Per-bin statistic computation: mean z-score or dF/F; optionally transient count per bin if Step 5 has already been run
  - Pearson and Spearman correlation of the per-bin signal statistic against a user-supplied continuous behavioral variable vector of matching length
  - Output: a CSV per session with bin start times, the per-bin signal statistic, the per-bin behavioral variable value, and the overall correlation coefficient and p-value
  - Optional scatter plot (signal statistic vs. behavioral variable per bin) saved alongside the CSV
- The behavioral variable is provided as a CSV with one column per measure, sampled at a frequency compatible with the chosen bin width (the module resamples/averages to bin resolution if needed)
- Add a `step6_binned_correlation` keyword-only function to `src/guppy/testing/api.py` accepting `behavioral_data_path` and `bin_width_sec` as parameters
- Add a GUI panel or sub-step for configuring and running this analysis, ideally accessible from the visualization dashboard after Step 5
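The core of the proposed module could be sketched as follows. This is a minimal illustration, not existing GuPPy code: the function name, signature, and return shape are assumptions, and it expects the behavioral variable already at bin resolution.

```python
import numpy as np
from scipy import stats

def binned_correlation(signal, fs, behavior, bin_width_sec=120):
    """Correlate the per-bin mean of a photometry signal with a behavioral score.

    signal:   1-D array of processed photometry (e.g., z-score), sampled at fs Hz.
    behavior: 1-D array with one behavioral value per bin (already at bin resolution).
    Returns bin start times, per-bin means, and (r, p) for Pearson and Spearman.
    """
    samples_per_bin = int(round(bin_width_sec * fs))
    n_bins = len(signal) // samples_per_bin  # drop the trailing partial bin
    if len(behavior) != n_bins:
        raise ValueError(f"expected {n_bins} behavioral values, got {len(behavior)}")
    # Reshape into (n_bins, samples_per_bin) and take the mean of each row
    binned = np.asarray(signal)[: n_bins * samples_per_bin].reshape(n_bins, samples_per_bin)
    bin_means = binned.mean(axis=1)
    r, r_p = stats.pearsonr(bin_means, behavior)
    rho, rho_p = stats.spearmanr(bin_means, behavior)
    bin_starts = np.arange(n_bins) * bin_width_sec
    return bin_starts, bin_means, (r, r_p), (rho, rho_p)
```

Computing both Pearson and Spearman up front keeps the "which method" open question below a reporting choice rather than an API choice.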
Open Questions
- Should the bin width be fixed across all sessions in a batch, or per-session? Fixed is simpler and more reproducible for group comparisons; per-session would require the user to set it individually
- How should the behavioral variable be aligned to photometry bins when the behavioral data has a different sampling rate? Simple bin-averaging of the behavioral variable is the safest default; interpolation is an alternative
- Should computing transient count per bin require that Step 5 (transient detection) has been run first, or should the binned analysis be independently runnable using only z-score output from Step 4?
- Is there a multi-session group-level requirement — e.g., pooling bins across sessions before computing correlation, or computing per-session correlations and then averaging the coefficients? Rodrigo's description suggests per-session, but group-level analysis may be needed for publication
- Which correlation method should be the default: Pearson (assumes normality, sensitive to outliers) or Spearman (rank-based, more robust)? Both should likely be computed and reported
- Should bins with high artifact content (flagged during artifact removal) be excluded from the correlation, or is that outside scope for a first version?
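For the alignment question above, the safest-default bin-averaging could look like this. The function name and NaN-for-empty-bins behavior are illustrative assumptions, not existing GuPPy API; it averages every behavioral sample whose timestamp falls inside each fixed-width bin.

```python
import numpy as np

def average_behavior_to_bins(beh_times, beh_values, bin_width_sec, n_bins):
    """Bin-average a behavioral time series onto fixed photometry bins.

    beh_times:  timestamps in seconds from session start, one per behavioral sample.
    beh_values: behavioral measurements at those timestamps.
    Returns one value per bin; bins containing no behavioral samples get NaN.
    """
    beh_times = np.asarray(beh_times, dtype=float)
    beh_values = np.asarray(beh_values, dtype=float)
    # Assign each behavioral sample to the bin its timestamp falls in
    bin_idx = np.floor(beh_times / bin_width_sec).astype(int)
    out = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = bin_idx == b
        if in_bin.any():
            out[b] = beh_values[in_bin].mean()
    return out
```

NaN bins would then be dropped pairwise before correlation, which also gives a natural hook for excluding artifact-flagged bins if that lands in scope.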