Clonalyzer: toolkit for analyzing growth, yields, and rates in mammalian cell cultures. Designed for CHO fed-batch but adaptable to any time-series data.

ebalderasr/clonalyzer

Clonalyzer

Clonalyzer is a modular toolkit for kinetic and stoichiometric analysis of CHO fed-batch cultures. It supports per-interval, grouped, and exponential-phase analyses with high-quality visualizations.

Designed for bioprocess engineers and data scientists working with clonal cell line characterization.

🧬 What is Clonalyzer?

Clonalyzer is a modular Python toolkit for the kinetic and stoichiometric analysis of mammalian cell cultures, particularly designed for fed-batch bioprocesses using CHO cells. It helps quantify growth rates, nutrient consumption, metabolite production, yields, and specific rates, per replicate or per clone.

🔎 Use cases include:

  • Comparing clone performance in early-stage screening
  • Monitoring nutrient and metabolite profiles over time
  • Estimating growth and productivity during the exponential phase
  • Generating clean, publication-ready plots

Although Clonalyzer was designed with fed-batch CHO processes in mind, it is not limited to them. For instance:

  • Block 3 (exponential-phase analysis) can be used for any batch process.
  • Blocks 1 and 2 support general interval or time-based profiling, including perfusion or hybrid strategies.

πŸ–οΈ Kinetic and Stoichiometric Calculations

Clonalyzer computes the following parameters for each Clone × Replicate:

| Parameter | Symbol | Units | Description |
| --- | --- | --- | --- |
| Growth rate | μ | h⁻¹ | Calculated as slope of ln(VCD) over time |
| Integrated viable cell density | IVCD | cells·h·mL⁻¹ or cells·h | Area under the VCD curve over time (trapezoidal rule) |
| Cell balance | ∆X | cells | Difference in viable cells in total volume |
| Substrate balance | ∆S (Glc, Lac) | mol | Difference in total moles in volume |
| Yield on substrate | Yₓ/ₛ | cells·mol⁻¹ | ∆X / ∆S |
| Specific rate | qₛ | pmol·cell⁻¹·h⁻¹ | ∆S normalized to IVCD and converted to pmol |

For example, specific consumption of glucose (q_Glc):

q_Glc = (∆Glc in mol × 1e12) / IVCD_tot in cells·h  →  pmol/(cell·h)

All rates are computed from total amounts (concentration × culture volume), so that changes in culture volume do not break the mass balance.


📃 Full Calculation Details

All calculations are performed per biological replicate (Clone × Rep), using total amounts (concentration × culture volume) to maintain mass balance integrity.

Specific Growth Rate (μ)

$$\mu = \frac{\ln X_2 - \ln X_1}{t_2 - t_1}$$

Where:

  • $$X_1$$, $$X_2$$ are viable cell densities at times $$t_1$$ and $$t_2$$
  • Units: cells/mL and hours
  • Result: $$\mu$$ in $$h^{-1}$$
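
As a quick check of the formula above, here is a minimal Python sketch for a single interval. The function name `specific_growth_rate` and the sample values are illustrative assumptions, not part of Clonalyzer's scripts.

```python
import math


def specific_growth_rate(x1, x2, t1, t2):
    """Specific growth rate mu (1/h) between two viable cell density samples."""
    if x1 <= 0 or x2 <= 0:
        raise ValueError("viable cell densities must be positive")
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    return (math.log(x2) - math.log(x1)) / (t2 - t1)


# Illustrative values: VCD in cells/mL, time in hours.
mu = specific_growth_rate(3.10e5, 5.20e5, 0.0, 24.0)
print(f"mu = {mu:.4f} 1/h")
```

The guards reflect the fact that ln(VCD) is undefined for zero counts and that the interval must have a positive length.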

Integral of Viable Cell Density (IVCD)

The integral of viable cell density over time is estimated using the trapezoidal rule:

$$ IVCD_{mL} = \frac{X_1 + X_2}{2} \cdot (t_2 - t_1) $$

To convert to a total IVCD accounting for culture volume:

$$ IVCD_{tot} = IVCD_{mL} \cdot \frac{V_1 + V_2}{2} $$

Where:

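  • $$X_1$$, $$X_2$$: viable cell densities (cells/mL) at times $$t_1$$ and $$t_2$$ (h)
  • $$V_1$$, $$V_2$$: culture volumes (mL) at $$t_1$$ and $$t_2$$
  • Units: cells·h·mL⁻¹ for $$IVCD_{mL}$$; cells·h for $$IVCD_{tot}$$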
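
The two IVCD formulas can be sketched in Python as follows. The helper name `ivcd_interval` and the volumes (30.0 and 29.5 mL) are hypothetical values chosen only for illustration.

```python
def ivcd_interval(x1, x2, t1, t2, v1, v2):
    """Return (IVCD_mL, IVCD_tot) for one sampling interval.

    x1, x2: viable cell density in cells/mL
    t1, t2: time in hours
    v1, v2: culture volume in mL
    """
    ivcd_ml = (x1 + x2) / 2.0 * (t2 - t1)   # trapezoid area, cells*h/mL
    ivcd_tot = ivcd_ml * (v1 + v2) / 2.0    # scaled by mean volume, cells*h
    return ivcd_ml, ivcd_tot


ivcd_ml, ivcd_tot = ivcd_interval(3.10e5, 5.20e5, 0.0, 24.0, 30.0, 29.5)
print(f"IVCD_mL  = {ivcd_ml:.3e} cells*h/mL")
print(f"IVCD_tot = {ivcd_tot:.3e} cells*h")
```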

Metabolite or Biomass Balance (∆S, ∆X)

$$\Delta X = X_2 V_2 - X_1 V_1 \quad\text{and}\quad \Delta S = S_1 V_1 - S_2 V_2$$

  • $$X$$: cells/mL
  • $$S$$: mol/mL
  • $$V$$: mL

$$\Delta S$$ is positive if the substrate was consumed, and negative if it was produced.
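
The input file reports glucose and lactate in g/L, while the balance above uses mol/mL, so a unit conversion is needed first. The sketch below shows one way to do this. The helper names, the molar masses used (glucose 180.156 g/mol, lactic acid 90.08 g/mol), and the volumes are assumptions for illustration, not values taken from Clonalyzer's code.

```python
GLUCOSE_MW = 180.156  # g/mol (assumed value for the conversion)
LACTATE_MW = 90.08    # g/mol, lactic acid (assumed value)


def g_per_l_to_mol_per_ml(conc_g_l, molar_mass):
    """Convert g/L to mol/mL: divide by molar mass (mol/L), then by 1000."""
    return conc_g_l / molar_mass / 1000.0


def delta_cells(x1, v1, x2, v2):
    """Delta X in cells: total viable cells at t2 minus total at t1."""
    return x2 * v2 - x1 * v1


def delta_substrate(s1, v1, s2, v2):
    """Delta S in mol: positive if consumed, negative if produced."""
    return s1 * v1 - s2 * v2


v1, v2 = 30.0, 29.5  # mL, hypothetical culture volumes
dx = delta_cells(3.10e5, v1, 5.20e5, v2)
ds_glc = delta_substrate(
    g_per_l_to_mol_per_ml(6.59, GLUCOSE_MW), v1,
    g_per_l_to_mol_per_ml(5.88, GLUCOSE_MW), v2,
)
ds_lac = delta_substrate(
    g_per_l_to_mol_per_ml(0.00, LACTATE_MW), v1,
    g_per_l_to_mol_per_ml(0.44, LACTATE_MW), v2,
)
print(f"dX = {dx:.3e} cells")
print(f"dS(Glc) = {ds_glc:.3e} mol, dS(Lac) = {ds_lac:.3e} mol")
```

With these numbers glucose gives a positive balance (consumed) and lactate a negative one (produced), matching the sign convention stated above.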

Yield on Substrate ($$Y_{X/S}$$)

$$Y_{X/S} = \frac{\Delta X}{\Delta S}$$

  • Units: cells/mol

Specific Rate ($$q_S$$)

$$q_S = \frac{\Delta S \cdot 10^{12}}{\text{IVCD}_{\text{tot}}}$$

  • $$\Delta S$$: mol
  • $$IVCD_{tot}$$: cell·h
  • $$q_S$$: pmol/(cell·h)
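
To make the yield and specific-rate formulas concrete, here is a small sketch that applies them to one interval. The function names and the interval totals are illustrative assumptions. Returning NaN for a zero denominator is a design choice of this sketch, not documented Clonalyzer behavior.

```python
import math


def yield_on_substrate(delta_x, delta_s):
    """Y_X/S in cells/mol; NaN when no substrate change was measured."""
    if delta_s == 0:
        return math.nan
    return delta_x / delta_s


def specific_rate(delta_s_mol, ivcd_tot):
    """q_S in pmol/(cell*h): mol -> pmol via 1e12, divided by total IVCD."""
    if ivcd_tot <= 0:
        return math.nan
    return delta_s_mol * 1e12 / ivcd_tot


# Illustrative interval totals (hypothetical values).
delta_x = 6.04e6       # cells
delta_s = 1.3455e-4    # mol of glucose consumed
ivcd_tot = 2.9631e8    # cells*h

y_xs = yield_on_substrate(delta_x, delta_s)
q_glc = specific_rate(delta_s, ivcd_tot)
print(f"Y_X/S = {y_xs:.3e} cells/mol")
print(f"q_Glc = {q_glc:.3f} pmol/(cell*h)")
```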

📄 For a detailed explanation of how kinetic and stoichiometric parameters are calculated, see the How does Clonalyzer do the calculations.pdf document included in this repository.

📄 Input Data Format

Clonalyzer expects a single CSV file in data/data.csv with the first row reserved for metadata (it will be skipped automatically).
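
If you want to read such a file outside the notebooks, a minimal standard-library sketch might look like this. It assumes the column headers sit on the second row, directly after the metadata row. The function name `read_rows` and the in-memory sample are hypothetical.

```python
import csv
import io


def read_rows(handle):
    """Skip the metadata row, then parse the remainder as a CSV with headers."""
    next(handle, None)  # first row: free-form metadata (assumed layout)
    return list(csv.DictReader(handle))


# Tiny in-memory stand-in for data/data.csv (values are illustrative).
sample = io.StringIO(
    "Experiment metadata: hypothetical run,,,,\n"
    "t_hr,Clone,Rep,VCD,Vol_mL\n"
    "0,A,1,3.10E+05,30\n"
    "24,A,1,5.20E+05,29.5\n"
)
rows = read_rows(sample)
vcd = [float(r["VCD"]) for r in rows]
print(rows[0]["Clone"], vcd)
```

For a real file, the same function can be called inside `with open("data/data.csv", newline="") as f:`.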

Required Columns (used in calculations)

| Column name | Description | Units |
| --- | --- | --- |
| t_hr | Time since inoculation | hours |
| Clone | Clone identifier (e.g., A, B, C) | string |
| Rep | Biological replicate number | integer (1, 2, 3) |
| VCD | Viable cell density | cells/mL |
| Viab_pct | Cell viability | % |
| Vol_mL | Culture volume at sampling time | mL |
| Glc_g_L | Glucose concentration | g/L |
| Lac_g_L | Lactate concentration | g/L |
| Gln_mM | Glutamine concentration | mmol/L (mM) |
| Glu_mM | Glutamate concentration | mmol/L (mM) |
| is_post_feed | Whether the sample is post-feeding | TRUE or FALSE |
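
Before running an analysis it can help to confirm that the required columns are present. The sketch below is one possible pre-flight check. `REQUIRED_COLUMNS`, `missing_columns`, and `parse_flag` are hypothetical names, and the strict TRUE/FALSE parsing is an assumption about how the flag is written.

```python
REQUIRED_COLUMNS = (
    "t_hr", "Clone", "Rep", "VCD", "Viab_pct", "Vol_mL",
    "Glc_g_L", "Lac_g_L", "Gln_mM", "Glu_mM", "is_post_feed",
)


def missing_columns(header):
    """Return the required columns absent from a CSV header, in table order."""
    present = set(header)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def parse_flag(text):
    """Convert the TRUE/FALSE text of is_post_feed to a bool."""
    value = text.strip().upper()
    if value not in ("TRUE", "FALSE"):
        raise ValueError(f"expected TRUE or FALSE, got {text!r}")
    return value == "TRUE"


# Extra columns such as Notes are simply not checked.
header = ["t_hr", "Clone", "Rep", "VCD", "Viab_pct", "Glc_g_L", "Lac_g_L", "Notes"]
missing = missing_columns(header)
print("Missing:", missing)
```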

Optional Columns (used in some plots if present)

| Column name | Example usage | Units |
| --- | --- | --- |
| GFP_mean, TMRM_mean | Cytometry signal (GFP, mitochondrial potential) | arbitrary units |

Columns such as Notes, Glucose_Added_mL, or Quadrants are ignored, but can coexist in your file.

Example

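| t_hr | t_day | Clone | Rep | Timestamp | Date | is_post_feed | VCD | DCD | Viab_pct | Glc_g_L | Lac_g_L |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | A | 1 | 10:00 | 03/07/2025 | FALSE | 3.10E+05 | 4.00E+03 | 98.73 | 6.59 | 0.00 |
| 24 | 1 | B | 1 | 10:00 | 04/07/2025 | FALSE | 5.20E+05 | 4.00E+03 | 99.22 | 5.88 | 0.44 |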

Flexibility for Other Measurements

Clonalyzer is designed to gracefully handle extra columns. This allows you to include additional data such as:

  • Cell size (e.g., from a Coulter counter)
  • pH, osmolarity, conductivity
  • Any signal from cytometry or online sensors

You may include as many additional columns as needed; the system will ignore them unless explicitly used in plotting.

πŸ“ Project Structure

Clonalyzer/
β”œβ”€β”€ Block_1.ipynb                  # Notebook for interval-based kinetics (Clone Γ— Rep Γ— Time)
β”œβ”€β”€ Block_2.ipynb                  # Notebook for grouped kinetics (Clone Γ— Time)
β”œβ”€β”€ Block_3.ipynb                  # Notebook for exponential-phase analysis (Clone Γ— Rep)
β”œβ”€β”€ scripts/                       # Standalone Python scripts (modular components)
β”‚   β”œβ”€β”€ interval_kinetics.py       # Interval-based kinetic calculations
β”‚   β”œβ”€β”€ grouped_kinetics.py        # Aggregated (mean Β± SD) calculations
β”‚   β”œβ”€β”€ exp_phase_kinetics.py      # Kinetics during exponential phase
β”‚   β”œβ”€β”€ plot_raw.py                # Scatter plots for raw data
β”‚   β”œβ”€β”€ plot_grouped.py            # Line plots with error bars (grouped data)
β”‚   └── plot_exp.py                # Bar plots (clone-level metrics)
β”œβ”€β”€ data/
β”‚   └── data.csv                   # Input dataset (with metadata in first row)
β”œβ”€β”€ outputs/                       # All generated CSVs and figures
β”‚   β”œβ”€β”€ interval_kinetics.csv
β”‚   β”œβ”€β”€ results_agg_by_clone_time.csv
β”‚   β”œβ”€β”€ kinetics_by_clone.csv
β”‚   β”œβ”€β”€ kinetics_by_clone_rep.csv
β”‚   β”œβ”€β”€ figures_raw/
β”‚   β”œβ”€β”€ figures_agg/
β”‚   └── figures_exp/
β”œβ”€β”€ requirements.txt              # Python dependencies
β”œβ”€β”€ README.md                     # Project documentation
└── How does Clonalyzer do the calculations.pdf

🚀 Quickstart

  1. Clone the repository
     git clone https://github.com/ebalderasr/Clonalyzer.git
     cd Clonalyzer
  2. Install dependencies
     pip install -r requirements.txt
  3. Prepare your data

Place your CSV file inside the data/ folder and rename it to:

data/data.csv

📈 Usage

Clonalyzer is organized into three independent analysis blocks. Each block includes a Jupyter Notebook that executes both data processing and plotting steps in sequence. This setup ensures a streamlined, beginner-friendly experience.


🔹 Block 1: Interval-based kinetics (Clone × Rep × Time)

Use this block to compute kinetics between each pair of consecutive time points per replicate (interval-by-interval). Ideal for detailed kinetic trajectories.

➡️ To run Block 1, open and execute the notebook:

Block_1.ipynb

This notebook performs:

  • Interval-based kinetic calculations (interval_kinetics)
  • Per-sample scatter plots (plot_raw)

Output:

  • CSV file: outputs/interval_kinetics.csv
  • Figures: outputs/figures_raw/ (time trends, kinetics, correlations)
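
The interval-by-interval idea can be sketched as follows: group samples by (Clone, Rep), sort by time, and pair each sample with the next one. This is only an illustration of the approach. `interval_growth` is a hypothetical helper, it computes μ only, and it does not reproduce what `interval_kinetics.py` does with `is_post_feed` rows.

```python
import math
from collections import defaultdict


def interval_growth(rows):
    """Compute mu for each pair of consecutive samples per (Clone, Rep)."""
    groups = defaultdict(list)
    for r in rows:
        key = (r["Clone"], r["Rep"])
        groups[key].append((float(r["t_hr"]), float(r["VCD"])))

    results = []
    for (clone, rep), samples in sorted(groups.items()):
        samples.sort()  # order by time within the replicate
        for (t1, x1), (t2, x2) in zip(samples, samples[1:]):
            results.append({
                "Clone": clone, "Rep": rep, "t_start": t1, "t_end": t2,
                "mu": (math.log(x2) - math.log(x1)) / (t2 - t1),
            })
    return results


# Illustrative data, not real measurements.
rows = [
    {"Clone": "A", "Rep": 1, "t_hr": 0, "VCD": 3.0e5},
    {"Clone": "A", "Rep": 1, "t_hr": 24, "VCD": 6.0e5},
    {"Clone": "A", "Rep": 1, "t_hr": 48, "VCD": 1.2e6},
    {"Clone": "B", "Rep": 1, "t_hr": 0, "VCD": 3.0e5},
    {"Clone": "B", "Rep": 1, "t_hr": 24, "VCD": 4.5e5},
]
intervals = interval_growth(rows)
for item in intervals:
    print(item["Clone"], item["t_start"], item["t_end"], round(item["mu"], 4))
```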

🔹 Block 2: Aggregated kinetics (Clone × Time)

Use this block to compute and visualize the average ± SD of all measurements and parameters per clone at each time point.

➡️ To run Block 2, open and execute the notebook:

Block_2.ipynb

This notebook performs:

  • Aggregation of results (grouped_kinetics)
  • Time-course plots with error bars (plot_grouped)

Output:

  • CSV file: outputs/results_agg_by_clone_time.csv
  • Figures: outputs/figures_agg/ (time trends, kinetics, correlations)
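
As a rough illustration of the mean ± SD aggregation, the sketch below groups replicate values by (Clone, time). The helper name is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption. The README does not state which SD definition `grouped_kinetics.py` uses.

```python
from collections import defaultdict
from statistics import mean, stdev


def mean_sd_by_clone_time(rows, column):
    """Return {(Clone, t_hr): (mean, sd, n)} for one numeric column."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["Clone"], float(r["t_hr"]))].append(float(r[column]))

    summary = {}
    for key, values in sorted(groups.items()):
        # Sample SD needs at least two replicates; otherwise report NaN.
        sd = stdev(values) if len(values) > 1 else float("nan")
        summary[key] = (mean(values), sd, len(values))
    return summary


# Three illustrative replicates of clone A at 24 h.
rows = [
    {"Clone": "A", "Rep": 1, "t_hr": 24, "VCD": 5.0e5},
    {"Clone": "A", "Rep": 2, "t_hr": 24, "VCD": 5.2e5},
    {"Clone": "A", "Rep": 3, "t_hr": 24, "VCD": 5.4e5},
]
summary = mean_sd_by_clone_time(rows, "VCD")
avg, sd, n = summary[("A", 24.0)]
print(f"VCD = {avg:.3e} +/- {sd:.1e} (n={n})")
```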

🔹 Block 3: Exponential-phase kinetics (Clone × Rep)

Use this block to extract clone-level metrics only during exponential growth. You'll be prompted to specify the start and end time of the exponential phase.

➡️ To run Block 3, open and execute the notebook:

Block_3.ipynb

This notebook performs:

  • Kinetic calculations restricted to the exponential phase (exp_phase_kinetics)
  • Bar plots of clone-level performance (plot_exp)

Output:

  • CSV files:

    • outputs/kinetics_by_clone_rep.csv
    • outputs/kinetics_by_clone.csv
  • Figures: outputs/figures_exp/
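
The exponential-phase growth rate is described earlier as the slope of ln(VCD) over time. A minimal least-squares sketch of that idea is shown below. The function `exp_phase_mu` is hypothetical and takes the window as arguments instead of prompting, and the data are synthetic (generated with a known rate of 0.03 h⁻¹ followed by a plateau).

```python
import math


def exp_phase_mu(times_h, vcd, t_start, t_end):
    """Least-squares slope of ln(VCD) vs. time inside [t_start, t_end]."""
    pts = [(t, math.log(x)) for t, x in zip(times_h, vcd)
           if t_start <= t <= t_end]
    if len(pts) < 2:
        raise ValueError("need at least two samples inside the window")
    n = len(pts)
    t_mean = sum(t for t, _ in pts) / n
    y_mean = sum(y for _, y in pts) / n
    sxx = sum((t - t_mean) ** 2 for t, _ in pts)
    if sxx == 0:
        raise ValueError("samples inside the window share the same time")
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in pts)
    return sxy / sxx


# Synthetic culture: exponential growth up to 72 h, then a plateau at 96 h.
times = [0.0, 24.0, 48.0, 72.0, 96.0]
vcd = [3.0e5 * math.exp(0.03 * t) for t in times[:4]]
vcd.append(vcd[-1])

mu_exp = exp_phase_mu(times, vcd, t_start=0.0, t_end=72.0)
mu_all = exp_phase_mu(times, vcd, t_start=0.0, t_end=96.0)
print(f"mu (0-72 h) = {mu_exp:.4f} 1/h, mu (0-96 h) = {mu_all:.4f} 1/h")
```

Including the plateau point lowers the fitted slope, which is why the choice of window matters for this block.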


🧪 Optional: Use the standalone scripts directly

Each script used in the notebooks is also available in the scripts/ folder for advanced users or integration into custom pipelines.

To use these scripts manually:

  1. Copy the desired script from scripts/ to the root folder.
  2. Ensure data/data.csv exists in the root-level data/ folder.
  3. Run the script from the root of the repository using:
python script_name.py

⚠️ These scripts expect relative paths like data/data.csv and outputs/, so they must be executed from the root folder, not from within scripts/.

📂 Outputs

All processed files and figures are saved in the outputs/ folder.

👀 Author

Emiliano Balderas R.
GitHub: @ebalderasr

📄 License

MIT License.
