This repository contains an R script for generating Circos plots to visualize accessory gene presence-absence patterns and expression levels in *Streptococcus pneumoniae* serotype 3. It compares clade I (PT8465) and clade II (ND6401) strains, highlighting differences in gene expression and adaptation.

J22160/Circular_Ideogram_Plot

README: Circos Plot for Pan-Transcriptomics Analysis

Pangenomics

Pangenomics provides a comprehensive framework for analyzing the full genetic repertoire of a given species, encompassing core genes (shared across all strains) and accessory genes (present in some but not all strains). This approach allows for the identification of genetic variations that contribute to strain-specific adaptations, pathogenicity, and other phenotypic traits.
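The core/accessory split described above can be illustrated with a toy presence-absence matrix. This is a minimal sketch: the gene and strain names are placeholders invented for the example, not data from this repository.

```r
# Toy presence-absence matrix: rows are genes, columns are strains.
# All gene and strain names here are hypothetical.
pa <- matrix(
  c(1, 1, 1,
    1, 0, 1,
    0, 1, 0,
    1, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("geneA", "geneB", "geneC", "geneD"),
    c("strain1", "strain2", "strain3")
  )
)

# Count how many strains carry each gene.
n_strains_with_gene <- rowSums(pa)

# Core genes are present in every strain; accessory genes in some but not all.
gene_class <- ifelse(n_strains_with_gene == ncol(pa), "core", "accessory")

print(gene_class)
```

Real pangenome tools often relax the core threshold (for example, presence in most rather than all strains); the strict definition is used here because it matches the one given in the text.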

Importance of Accessory Genes

Accessory genes play a crucial role in defining the distinct characteristics of individual strains. These genes often encode functions that provide selective advantages, such as antibiotic resistance, immune evasion, or enhanced virulence. Studying their presence and absence across multiple strains provides valuable insights into the genetic factors that drive evolutionary success and niche specialization.

Pan-Transcriptomics Approach

Pan-transcriptomics extends pangenome analysis by integrating RNA-Seq data to study gene expression dynamics. By overlaying transcriptional activity onto the pangenome, this approach enables the identification of differentially expressed genes across strains, shedding light on regulatory mechanisms that contribute to strain fitness and adaptability. Combining pangenomics with RNA-Seq data allows for the functional characterization of accessory genes, revealing which genes are actively expressed under specific conditions. This integration helps in identifying key genes responsible for the success of particular strains, distinguishing between genes that are merely present and those that play a pivotal role in pathogenesis, host interactions, or environmental adaptation.
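To make the distinction between genes that are merely present and genes that are actively expressed concrete, the sketch below labels each gene in one strain as absent, present but silent, or present and expressed. The table, the column names, and the expression cutoff are all assumptions made for illustration; the source does not state how expression was quantified or thresholded.

```r
# Hypothetical per-gene table for a single strain; all values are invented.
genes <- data.frame(
  gene    = c("geneA", "geneB", "geneC", "geneD"),
  present = c(TRUE, TRUE, FALSE, TRUE),
  tpm     = c(120.5, 0.2, NA, 15.0)  # normalized expression; NA where the gene is absent
)

# Assumed cutoff for calling a gene "expressed" (not taken from the source).
tpm_cutoff <- 1

# Absent genes are labeled first; present genes are split by the cutoff.
genes$status <- ifelse(
  !genes$present,
  "absent",
  ifelse(genes$tpm >= tpm_cutoff, "present_expressed", "present_silent")
)

print(genes)
```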

Clade I vs. Clade II in Streptococcus pneumoniae Serotype 3

This analysis specifically focuses on two strains of Streptococcus pneumoniae serotype 3:

  • Clade I: PT8465
  • Clade II: ND6401

These two strains exhibit significant differences in their accessory gene content and transcriptional activity, which may contribute to variations in virulence, host interactions, and immune evasion strategies. The Circos plot provides a comparative visualization of gene presence-absence patterns and expression levels between clade I and clade II, highlighting key differences that may underlie their pathogenic potential. This comparative approach helps in identifying unique gene expression signatures associated with each clade, providing deeper insights into their adaptive mechanisms.

About This Repository

This repository contains an R script for generating a circular ideogram using the circlize package. The script visualizes gene presence-absence patterns for accessory genes along with their expression levels, offering a clear representation of strain-specific gene expression trends.

Circos Plot Visualization

The generated Circos plot includes:

  • Gene Presence-Absence Data: Outer rings illustrate whether an accessory gene is present or absent across multiple strains, including clade I (PT8465) and clade II (ND6401).
  • Expression Levels: Overlaying RNA-Seq data on the gene presence-absence matrix highlights transcriptionally active genes, distinguishing between expressed and silent accessory genes.
  • Functional Annotation (Outermost Circle): The outermost ring shows eggNOG functional categories, so each gene's broad function can be read alongside its presence and expression.

This visualization provides an intuitive way to explore the relationship between gene content and transcriptional activity, facilitating the identification of functionally important accessory genes that differentiate clade I from clade II.
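The ring layout described above could be drawn with `circos.heatmap()` from circlize, which adds one heatmap track per call, working from the outside in. The following is a rough sketch on random data, not the repository's script: the gene identifiers, category codes, colors, and track heights are all placeholders.

```r
library(circlize)

set.seed(1)
n_genes  <- 40
gene_ids <- paste0("gene", seq_len(n_genes))   # hypothetical gene identifiers
strains  <- c("PT8465", "ND6401")

# Functional category per gene (placeholder single-letter codes).
category <- matrix(
  sample(c("K", "L", "M", "S"), n_genes, replace = TRUE),
  ncol = 1, dimnames = list(gene_ids, "category")
)

# Random presence-absence calls and expression values for the two strains.
presence_num <- matrix(
  rbinom(n_genes * 2, 1, 0.7),
  ncol = 2, dimnames = list(gene_ids, strains)
)
expression <- matrix(
  runif(n_genes * 2, 0, 10),
  ncol = 2, dimnames = list(gene_ids, strains)
)
expression[presence_num == 0] <- NA            # no expression for absent genes
presence <- ifelse(presence_num == 1, "present", "absent")

# Discrete tracks take a named color vector; continuous tracks take a colorRamp2 mapping.
category_col <- c(K = "#1b9e77", L = "#d95f02", M = "#7570b3", S = "#e7298a")
presence_col <- c(present = "grey20", absent = "grey90")
expr_col     <- colorRamp2(c(0, 10), c("white", "firebrick"))

circos.clear()
# The first call initializes the layout; cluster = FALSE keeps the input gene order.
circos.heatmap(category, col = category_col, cluster = FALSE, track.height = 0.05)
circos.heatmap(presence, col = presence_col, track.height = 0.10)
circos.heatmap(expression, col = expr_col, na.col = "grey80", track.height = 0.10)
circos.clear()
```

Drawing the functional-category track first places it outermost, matching the description above. The relative order of the presence-absence and expression tracks is a guess.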

Dependencies

Ensure that the following R packages are installed before running the script:

  • circlize
  • tidyverse
  • ComplexHeatmap

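circlize and tidyverse are available from CRAN. ComplexHeatmap is distributed through Bioconductor, so install it with BiocManager:

install.packages(c("circlize", "tidyverse", "BiocManager"))
BiocManager::install("ComplexHeatmap")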
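Before running the script, it can help to confirm that all three packages are actually available in the current R library. This is a small sketch using base R only; it reports what is missing without installing anything.

```r
# Packages listed under Dependencies.
required <- c("circlize", "tidyverse", "ComplexHeatmap")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```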

Usage

  1. Prepare the input dataset containing gene presence-absence data, expression levels, and functional annotations.
  2. Run the R script to generate the Circos plot.
  3. Interpret the visualized relationships between accessory gene presence, expression, and functional categories, focusing on differences between clade I (PT8465) and clade II (ND6401).
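The README does not specify the input file layout, so the sketch below shows one plausible arrangement for step 1: a single table with one row per accessory gene. Every column name and value is an assumption made for illustration. A few basic checks are run before plotting.

```r
# Hypothetical input layout: one row per accessory gene.
# Column names and values are assumptions, not the repository's actual format.
input <- data.frame(
  gene            = c("geneA", "geneB", "geneC"),
  present_PT8465  = c(1, 0, 1),
  present_ND6401  = c(1, 1, 0),
  expr_PT8465     = c(8.2, NA, 0.4),
  expr_ND6401     = c(7.9, 3.1, NA),
  eggnog_category = c("K", "M", "S")
)

strains <- c("PT8465", "ND6401")

# Expression should be missing wherever a gene is absent from that strain.
no_expr_when_absent <- vapply(strains, function(s) {
  absent <- input[[paste0("present_", s)]] == 0
  all(is.na(input[[paste0("expr_", s)]][absent]))
}, logical(1))

checks <- c(
  unique_gene_ids     = !anyDuplicated(input$gene),
  binary_presence     = all(unlist(input[paste0("present_", strains)]) %in% c(0, 1)),
  no_expr_when_absent = all(no_expr_when_absent)
)

print(checks)
```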

This repository provides a way to visualize pan-transcriptomic data in a single Circos plot, helping researchers relate strain-specific adaptations to accessory gene content, expression, and functional category.
