title: TOG RNA-seq Workshop To-Dos
author: Nikita Telkar
date: June 2021
output:
  html_document:
    keep_md: true
    toc: true
    toc_depth: 4
    theme: flatly
    highlight: pygments

Pre-Workshop To-Dos

To help the workshop run as smoothly as possible (we have a number of steps to cover), please complete the following to-dos before the workshop.

1.0 Update R and RStudio

Download the latest version of R here
Download the latest version of RStudio here

1.1 Update all packages

Once you have updated R and RStudio to their latest versions, open RStudio and select Tools > Check for Package Updates... from the menu bar, then install all available updates.
Note: Do not open any R script when performing this step - choose the option as soon as you open RStudio.
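
If the menu route is inconvenient, the same update can be run from the console. This is a minimal sketch, assuming a CRAN mirror is reachable from your machine; the function name `update_all_packages` and the mirror URL are illustrative choices, not part of the workshop material.

```r
# Hypothetical helper: report outdated packages, then update them all.
update_all_packages <- function(repos = "https://cloud.r-project.org") {
  # old.packages() returns NULL when everything is already up to date.
  outdated <- old.packages(repos = repos)
  if (is.null(outdated)) {
    message("All packages are up to date.")
    return(invisible(character(0)))
  }
  message("Updating: ", paste(rownames(outdated), collapse = ", "))
  # ask = FALSE updates everything without prompting for each package.
  update.packages(ask = FALSE, repos = repos)
  invisible(rownames(outdated))
}
```

Calling `update_all_packages()` in a fresh RStudio session (with no scripts open) should have the same effect as the menu option.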

1.2 Install these packages

Here's the list of all the packages you will need for this workshop. You can simply copy and paste the following into your RStudio console.

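# CRAN packages ("remotes" is needed for the GitHub install at the end)
install.packages(c("tidyverse", "here", "rmarkdown", "knitr", "kableExtra", "janitor", "scales", "ggpubr",
    "pheatmap", "reshape2", "remotes"))

# BiocManager is required to install packages from Bioconductor
install.packages("BiocManager")

BiocManager::install(c("clusterProfiler", "biomaRt", "edgeR", "limma", "Rsubread"))

# plomics is installed from GitHub
remotes::install_github("wvictor14/plomics")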
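
After the installs finish, it can be worth confirming that nothing failed silently. The sketch below compares the package list above against your installed library; `workshop_pkgs` and `missing_packages` are illustrative names, and it assumes the GitHub package installs under the name `plomics`.

```r
# Packages requested by the install commands above.
# Assumption: the GitHub repo wvictor14/plomics installs a package named "plomics".
workshop_pkgs <- c(
  "tidyverse", "here", "rmarkdown", "knitr", "kableExtra", "janitor",
  "scales", "ggpubr", "pheatmap", "reshape2",
  "clusterProfiler", "biomaRt", "edgeR", "limma", "Rsubread", "plomics"
)

# Return the names of packages that are not found in the installed library.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# An empty result (character(0)) means everything is installed.
missing_packages(workshop_pkgs)
```

Any names that are printed can be re-installed individually with the matching command from the list above.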

1.3 Create a New Project

Create a new project in RStudio with the following folders:

data
scripts

project_folder.Rproj (I've called mine TOG_RNAseq_Workshop_2021)
|
|__ data
|
|__ scripts
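
The two folders can also be created from the console once the project is open. This is a small sketch, assuming your working directory is the project root; `create_workshop_folders` is an illustrative helper name.

```r
# Create the data and scripts folders inside a project directory.
# project_dir defaults to the current working directory, which is the
# project root when the .Rproj file is open in RStudio.
create_workshop_folders <- function(project_dir = ".") {
  folders <- file.path(project_dir, c("data", "scripts"))
  for (folder in folders) {
    # showWarnings = FALSE keeps this quiet if the folder already exists.
    dir.create(folder, recursive = TRUE, showWarnings = FALSE)
  }
  # Return TRUE/FALSE for each folder so the result can be checked.
  dir.exists(folders)
}
```

Running `create_workshop_folders()` should return `TRUE TRUE` when both folders are in place.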

1.4 Download Expression and Phenotype files

Navigate to the data folder in our workshop repo on GitHub, and download the following files to your data folder:

GSE157103_formatted_eDat.txt
GSE157103_formatted_pDat.txt
BAM_R_obj.RDS

To download, click on the file name. GitHub will display a message saying Sorry about that, but we can’t show files that are this big right now, along with a View Raw link. Click on the link, wait for the browser to stop loading, and then right click + Save As.... For any file that does not end with the .txt extension (such as BAM_R_obj.RDS), make sure to remove the .txt suffix from the end of the file name while saving.

For ease of usage, I've edited the raw files released on GEO to a tidier format, specifically for this workshop. To find out how I formatted these two files, you can download the .Rmd file titled 0_GSE157103_Data_Formatting.Rmd.
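
Once the downloads finish, a quick check like the one below can confirm that the files landed in the right place with the right names. It is a sketch that assumes the files sit in a data folder under the project root; `expected_files` and `check_workshop_files` are illustrative names.

```r
# File names listed in this section.
expected_files <- c(
  "GSE157103_formatted_eDat.txt",
  "GSE157103_formatted_pDat.txt",
  "BAM_R_obj.RDS"
)

# Report which expected files are present in data_dir, with their sizes.
check_workshop_files <- function(data_dir = "data") {
  paths <- file.path(data_dir, expected_files)
  data.frame(
    file = expected_files,
    found = file.exists(paths),
    size_bytes = file.size(paths)  # NA when the file is missing
  )
}
```

If `check_workshop_files()` shows `found = FALSE` for BAM_R_obj.RDS, check whether it was saved as BAM_R_obj.RDS.txt by mistake.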

2.0 Brush up on your Stats

As we'll be going through a number of steps required for data processing and analysis, we won't have enough time to explain the statistical tests used, or when and why they're used for particular types of data. A few examples of the methods that we're going to apply are:

Linear Modelling
PCA
Normalization
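
To give a feel for what these three methods look like in R, here is a toy sketch on simulated counts using only base R. It is not the workshop's own pipeline (which relies on packages such as edgeR and limma); the gene and sample names, the group labels, and the counts-per-million normalization are all assumptions made for illustration.

```r
set.seed(1)

# Simulated count matrix: 50 hypothetical genes (rows) x 6 samples (columns).
counts <- matrix(
  rpois(50 * 6, lambda = 20), nrow = 50,
  dimnames = list(paste0("gene", 1:50), paste0("sample", 1:6))
)
group <- factor(rep(c("control", "case"), each = 3))

# Normalization: scale each sample to counts per million, then log2-transform.
cpm <- t(t(counts) / colSums(counts)) * 1e6
log_cpm <- log2(cpm + 1)

# PCA: samples must be rows, so transpose before calling prcomp().
pca <- prcomp(t(log_cpm))
summary(pca)$importance[, 1:2]

# Linear modelling: does the expression of one gene differ between groups?
fit <- lm(log_cpm["gene1", ] ~ group)
coef(summary(fit))
```

Because the counts are random, no real group difference is expected here; the point is only the shape of the calls.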

Here are a few resources that you might want to go through before joining the workshop:

Types of statistical tests
Statistics definitions
The BCCHR Summer Statistics Video Series

3.0 Download BAM file or the corresponding R object

One of the steps we'll look at is converting BAM files to an expression count matrix.

However, because converting FASTQ (raw sequencing files) to BAM (aligned sequence files) takes time, as well as substantial computer memory and storage, I'll directly demonstrate how to extract sequence/gene counts from an already generated BAM file. You can find the general steps for converting a FASTQ file to a BAM file in the FASTQ-BAM_BAM-FASTQ.pdf file in the data folder.

We'll be using the BAM file of a participant from Phase 3 of the 1000 Genomes Project.

If you want to conduct this step in real time during the workshop (and your computer has enough storage space), download the file (557 MB) here, and save it to your data folder. If you are low on space, download BAM_R_obj.RDS instead, which is essentially the output we get after loading in the BAM file (don't worry, we'll go through the command required to load in BAM files regardless!).
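
If you go with the R object, you can confirm ahead of time that it reads in cleanly. This is a minimal sketch, assuming the file was saved to the data folder under the project root; `load_bam_object` is an illustrative helper, and the structure of the object it returns is not described here, so inspect it rather than assuming particular fields.

```r
# Read the pre-computed R object, or return NULL with a message if it is missing.
load_bam_object <- function(path = file.path("data", "BAM_R_obj.RDS")) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(invisible(NULL))
  }
  readRDS(path)
}
```

For example, `bam_obj <- load_bam_object()` followed by `str(bam_obj, max.level = 1)` gives a compact overview of what the object contains.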

4.0 Skim through Journal Article

Download the associated journal article for this data titled Overmyer_2021.pdf from the GitHub repo.
