forked from netbiolab/scHumanNet
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathREADME.Rmd
More file actions
217 lines (170 loc) · 10.2 KB
/
README.Rmd
File metadata and controls
217 lines (170 loc) · 10.2 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# scHumanNet
Construction of cell-type-specific functional gene network, with SCINET and HumanNetv3
### Framework Introduction
scHumanNet enables cell-type specific networks with scRNA-seq data. The [SCINET framework (Mohammade et al. Cell Syst. 2019)](https://www-sciencedirect-com.ezp-prod1.hul.harvard.edu/science/article/pii/S2405471219303837?via%3Dihub) takes a single cell gene expression profile and the “reference interactome” HumanNet v3, to construct a list of cell-type specific network. With the modified version of SCINET source code and the detailed tutorial described below, researchers could take any single-cell RNA sequencing (scRNA-seq) data of any biological context (e.g., disease) and construct their own cell-type specific network for downstream analysis.

For a given scRNA-seq data set, the SCINET framework utilize imputation, transformation, and normalization from the [ACTIONet package(Mohammadi et al. Nat. Commun. 2018)](https://www-nature-com.ezp-prod1.hul.harvard.edu/articles/s41467-020-18416-6) to robustly capture cell-level gene interactions. HumanNet v3 with 1.1 million weighted edges are used as a scaffold information to infer the likelihood of each gene interactions. A subsampling scheme for each cell-type clusters (cell groups) ensures retaining of accurate gene interaction strength despite the incompleteness of single cell dataset at high resolution. Overall, we show that HumanNet v3 not only allow gene prioritization in broad spectrum of diseases, but through construction of context specific cell-type networks, also allow an in-depth study of disease heterogeneity and its underlying mechanism at a cellular level.
### Setting up the Environment
For running scHumanNet, we recommend a `conda` envrionment to install packages in the `packages` folder
```bash
$ conda create -n scHumanNet R==4.0
$ git clone https://github.com/JunhaCha/scHumanNet.git
$ conda activate scHumanNet
(scHumanNet) $ conda install --file ./scHumanNet/packages/requirements_scHumanNet.txt
(scHumanNet) $ unzip ./ACTIONet_2.0.18_HNv3.zip -d ./ACTIONet_2.0.18_HNv3/
```
install the modified version of ACTIONet
```bash
(scHumanNet) $ R CMD INSTALL ./ACTIONet_2.0.18_HNv3
```
start R and install SCINET and scHumanNet
```r
devtools::install_github("shmohammadi86/SCINET")
devtools::install_github("JunhaCha/scHumanNet")
```
### Load required libraries
(add Seurat if necessary)
```r
library(ACTIONet)
library(SCINET)
library(Seurat)
library(igraph)
library(SingleCellExperiment)
library(purrr)
library(dplyr)
```
### Construction of scHumanNet (Example 1)
For the first example case, we showcase the construction of scHumanNet using publically accessivble pan-cancer dataset from [Qian et al. Cell Research 2020](https://www-nature-com.ezp-prod1.hul.harvard.edu/articles/s41422-020-0355-0). The 10X count folder and the metadata can be downloaded from [http://blueprint.lambrechtslab.org]( http://blueprint.lambrechtslab.org)
```r
counts <- Read10X('/your/path/to/CRC_counts/')
meta <- read.table('/your/path/to/CRC_metadata.csv', header = T, sep = ',')
```
Convert to sce
This tutorial converts count data and metadata to sce obeject from `SingleCellExperiment`, to be used as intput for network construction
```r
data <- SingleCellExperiment(assays = list(logcounts = counts), colData = meta)
```
For seurat objects, manually insert count data and metadata within the `SingleCellExperiment()`, or use the `as.SingleCellExperiment()` provided in the `Seurat` package.
```r
data <- SingleCellExperiment(assays = list(logcounts = seurat_object@assays$RNA@counts), colData = [email protected])
data <- Seurat::as.SingleCellExperiment(seurat.object)
```
prior to scHumanNet construction, reduce data and use the ace class from the ACTIONet package
```r
ace <- reduce.ace(data)
```
The column `CellType` of the metadata here indicates the column where each barcode is annotated from the user's preferred choice of methods
```r
ace[['Labels']] <- meta$CellType
```
```r
ace <- compute.cluster.feature.specificity(ace, ace$Labels, "celltype_specificity_scores")
Celltype.specific.networks = run.SCINET.clusters(ace, specificity.slot.name = "celltype_specificity_scores_feature_specificity")
```
Load HumanNetv3 interactome
```r
data('HNv3_XC_LLS')
```
Sort each genepair alphabetically and add LLS weight from HumanNetv3. Elements of `sorted.net.list` are stored as edgelist
```r
sorted.net.list <- SortAddLLS(Celltype.specific.networks, reference.network = graph.hn3)
```
Check each element of list
```r
lapply(sorted.net.list, head)
```
Save cell-type-specific networks, with both SCINET and LLS weights included in the edgelist for downstream network analysis
```r
saveRDS(sorted.net.list, './sorted_el_list.rds')
```
### Differential Network analysis with scHumanNet (Example 2)
In this example we provide a framework for a common downstream network analysis, identification of differential hub in a control vs disease scRNA-seq study. Here we present an example cell-type specific gene prioritization assocated with ASD. Differential hub gene is identified that significantly differs in centrality for each neuronal celltypes of healthy vs ASD scHumanNet(data derived from [Velmeshev et al. Science 2019](https://pubmed.ncbi.nlm.nih.gov/31097668/)).

Download the publically accessible data `meta.txt` and `10X folder` from [https://autism.cells.ucsc.edu](https://autism.cells.ucsc.edu)
```r
counts <- Seurat::Read10X('/your/path/to/10X_out/')
meta <- read.table('/your/path/to/meta.txt', header = T, sep = '\t')
#check if barcodes match
rownames(meta) <- meta$cell
meta$cell <- NULL
identical(colnames(counts), rownames(meta))
#merge annotated celltypes to larger granularity
#neu_mat, NRGN neurons are seperated and will be excluded because it is either similar to Excitatory neurons based on UMAP analysis and is thus considered ambiguous
meta$celltypes_merged <- ifelse(meta$cluster %in% c('AST-FB','AST-PP'), 'Astrocyte',
ifelse(meta$cluster %in% c('IN-PV', 'IN-SST','IN-SV2C', 'IN-VIP'), 'Inhibitory',
ifelse(meta$cluster %in% c('L2/3', 'L4', 'L5/6','L5/6-CC'), 'Excitatory',
ifelse(meta$cluster %in% c('Neu-mat','Neu-NRGN-I', 'Neu-NRGN-II'), 'Others',
as.character(meta$cluster)))))
```
To make a control vs disease network for each celltype we make a new column that combines celltype and disease annotation
For the Velmeshev 2019 data, column name `diagnosis` and `celltypes_merged` includes disease and celltype annotation respectively.
```r
meta$celltype_condition <- paste(meta$diagnosis, meta$celltypes_merged, sep = '_')
```
Construct celltype specific networks for control and disease similarly as above
```r
data <- SingleCellExperiment(assays = list(logcounts = counts), colData = meta)
ace = reduce.ace(data)
ace[['Labels']] <- meta$celltype_condition
ace = compute.cluster.feature.specificity(ace, ace$Labels, "celltype_specificity_scores")
Celltype.specific.networks = run.SCINET.clusters(ace, specificity.slot.name = "celltype_specificity_scores_feature_specificity")
```
Add LLS weight from HumanNetv3 for downstream analysis
```r
data('HNv3_XC_LLS')
sorted.net.list <- SortAddLLS(Celltype.specific.networks, reference.network = graph.hn3)
```
In this tutorial we will select degree, sum of all weights connecting the node, as a centrality measure to prioritize genes. The function `GetCentrality` also supports betweenness, closeness, and eigenvector centrality as well.
```r
strength.list <- GetCentrality(method='degree', net.list = sorted.net.list)
```
Percentile rank is used to account for netork size differences. For every gene in the reference interactome, if a node is not existent in the scHumanNet, 0 value is assigned.
```r
rank.df.final <- CombinePercRank(strength.list)
```
Get top 50 central genes for each celltype
```r
top.df <- TopHub(rank.df.final, top.n = 50)
head(top.df)
```
| Control_Excitatory | ASD_Excitatory | ... | ASD_Others |
| ------------- | ------------- | ------------- | ------------- |
| ULK1 | ULK1 |... | UQCRC1 |
| MTFMT | GRIN2B | ...| NDUFA8 |
| ... | ... | ... | ... |
| COX4I1 | RPE | ...| NDUFA3 |
Get the differential percentile rank value for each genes with function `DiffPR()`, where the output is a dataframe with genes and the corresponding diffPR value for each scHumanNets. The input of DiffPR includes the output of CombinePercRank(), metadata, column name of the annotated celltypes and condition(disease & control), and of the two annotation which will be used as a control. This example dataset `diagnosis` contains `Control` and `ASD`, and the column `celltypes_merged` stores the annotated celltypes.
```r
diffPR.df <- DiffPR(rank.df.final, celltypes = 'celltypes_merged', condition = 'diagnosis', control = 'Control', meta = meta)
head(diffPR.df)
```
| Astrocyte | Astrocyte_ASD-Control | ... | Others | Others_ASD-Control |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| AR | -1.0000000 |... | UQCRC2 | -0.9987382 |
| FASN | -0.9996068 | ...| NDUFB8 | -0.9981073 |
| ... | ... | ... | ... | ... |
| ACAA1 | 0.9967987 | ...| NDUFB3 | -0.9946372 |
Finally, we provide a nonparametric method to filter differential hubs with the function `FindDiffHub()`. Input requires the output of DiffPR, and the user-defined pvalue threshold. The output consists of a gene column, diffPR value sorted from negative to positive value, pvalue, and the celltype. To extract genes, use the `gene` column instead of `rownames()`.
```r
diffPR.sig <- FindDiffHub(diffPR.df, p.value = 0.05)
diffPR.sig
```
| | gene | diffPR | pvalue | celltype |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| TRIM21 |TRIM21 |-0.8537161 | 0.04973245 | Astrocyte |
| FOXH1 | FOXH1 | -0.8568620 | 0.04910293 | Astrocyte |
| ... | ... | ... | ... | ... |
| COX16.1 | COX16 | 0.9793814 | 0.007617547 | Others |
| MAP2K1.1 | MAP2K1 | 0.9888977 | 0.003414762 | Others |