---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# FLASHMM
<!-- badges: start -->
<!-- badges: end -->
FLASHMM is a package for single-cell differential expression (DE) analysis using linear mixed-effects models (LMMs). Mixed-effects models have become a powerful tool in single-cell studies because they can model intra-subject correlation and inter-subject variability.
The FLASHMM package provides two functions for fitting LMMs: lmm and lmmfit. The lmm function takes summary-level statistics as arguments. The lmmfit function is a wrapper around lmm that takes cell-level data directly and computes the summary statistics internally. The lmmfit function is simpler to use, but its memory use limits it on large data. For large-scale data, it is recommended to precompute the summary statistics and then fit the LMM with the lmm function.
In summary, the FLASHMM package provides the following functions:
* lmm: fit LMM using summary-level data.
* lmmfit: fit LMM using cell-level data.
* lmmtest: perform statistical tests on fixed effects and the contrasts of the fixed effects.
* sslmm: compute the summary-level data using cell-level data.
* simuRNAseq: simulate a multi-sample, multi-cell-type scRNA-seq dataset based on a negative binomial distribution.
## Installation
You can install the development version of FLASHMM from GitHub:
```{r echo = TRUE, results = "hide", message = FALSE}
devtools::install_github("https://github.com/Baderlab/FLASHMM", build_vignettes = TRUE)
```
## Example
This is a basic example which shows you how to use FLASHMM to perform single-cell differential expression analysis.
```{r}
library(FLASHMM)
```
### Simulating a scRNA-seq dataset by simuRNAseq
Simulate a multi-sample multi-cell-cluster scRNA-seq dataset that contains 25 samples and 4 clusters (cell-types) with 2 treatments.
```{r dataset}
set.seed(2412)
dat <- simuRNAseq(nGenes = 50, nCells = 1000,
                  nsam = 25, ncls = 4, ntrt = 2, nDEgenes = 6)
str(dat)

# Counts and metadata
counts <- dat$counts
metadata <- dat$metadata
rm(dat)
```
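To give a feel for what negative binomial counts look like, here is a minimal base R sketch that draws a small genes-by-cells count matrix with `rnbinom`. It is only an illustration and is not the algorithm used by simuRNAseq; the sizes, the mean range, and the dispersion value are all hypothetical.

```r
# Toy negative binomial count simulation in base R (illustrative only;
# this is NOT the algorithm used by simuRNAseq).
set.seed(1)
n_genes <- 5   # hypothetical sizes, kept small for display
n_cells <- 8

# Gene-specific mean expression and a common dispersion (size) parameter.
gene_mean <- runif(n_genes, min = 0.5, max = 5)
nb_size <- 2

# matrix() fills column by column, so repeating gene_mean once per cell
# gives every column (cell) the same vector of gene-specific means.
toy_counts <- matrix(
  rnbinom(n_genes * n_cells,
          mu = rep(gene_mean, times = n_cells), size = nb_size),
  nrow = n_genes, ncol = n_cells,
  dimnames = list(paste0("Gene", seq_len(n_genes)),
                  paste0("Cell", seq_len(n_cells)))
)
dim(toy_counts)
toy_counts
```

Smaller values of `size` give more overdispersed counts relative to a Poisson distribution with the same mean.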
### DE analysis using LMM
**Model design**
* Y: gene expression profile (log-transformed counts)
* X: design matrix for fixed effects
* Z: design matrix for random effects
```{r}
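Y <- log(counts + 1)  # genes-by-cells matrix of log-transformed counts
X <- model.matrix(~ 0 + log(libsize) + cls + cls:trt, data = metadata)
Z <- model.matrix(~ 0 + sam, data = metadata)
d <- ncol(Z)  # number of random effects (columns of Z, one per sample)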
```
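The two formulas above can be easier to read on a tiny example. The sketch below builds the same kind of design matrices from a hypothetical 8-cell metadata table (the values and the names `toy_meta`, `X_toy`, and `Z_toy` are made up for illustration) and prints the resulting columns.

```r
# Toy metadata: 8 cells, 2 clusters, 2 treatments, 4 samples (all hypothetical).
toy_meta <- data.frame(
  libsize = c(1200, 950, 1800, 1400, 1100, 2000, 1600, 1300),
  cls = factor(c("1", "2", "1", "2", "1", "2", "1", "2")),
  trt = factor(c("A", "A", "A", "A", "B", "B", "B", "B")),
  sam = factor(c("s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"))
)

# Same formulas as in the model design above.
X_toy <- model.matrix(~ 0 + log(libsize) + cls + cls:trt, data = toy_meta)
Z_toy <- model.matrix(~ 0 + sam, data = toy_meta)

colnames(X_toy)  # log library size, one column per cluster, one cls:trt column per cluster
colnames(Z_toy)  # one indicator column per sample
dim(X_toy)
dim(Z_toy)
```

With this coding, each cluster gets its own baseline column, and each `cls:trt` column carries the treatment B versus A difference within that cluster. Every row of `Z_toy` has a single 1 marking the sample the cell belongs to.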
**LMM fitting**
a) Fit LMM by lmmfit using cell-level data.
```{r}
fit <- lmmfit(Y, X, Z, d = d)
```
b) Fit LMM by lmm using summary-level data computed as follows.
```{r}
# Computing summary statistics
n <- nrow(X)
XX <- t(X) %*% X
XY <- t(Y %*% X)
ZX <- t(Z) %*% X
ZY <- t(Y %*% Z)
ZZ <- t(Z) %*% Z
Ynorm <- rowSums(Y * Y)

# Fitting LMM
fitss <- lmm(XX, XY, ZX, ZY, ZZ, Ynorm = Ynorm, n = n, d = d)
identical(fit, fitss)
```
c) Fit LMM by lmm using summary-level data computed by sslmm.
```{r}
# Computing summary statistics
ss <- sslmm(X, Y, Z)

# Fitting LMM
fitss <- lmm(summary.stats = ss, d = d)
identical(fit, fitss)
```
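The summary statistics in (b) are all sums over cells, so for data too large to hold in memory they can in principle be accumulated one block of cells at a time. The following base R sketch shows the idea on toy data; the data, the block layout, and the `_toy` names are hypothetical, and this is not a description of how sslmm is implemented.

```r
set.seed(1)
# Toy data (hypothetical sizes): 3 genes, 12 cells, 2 fixed effects, 3 samples.
n_cells <- 12
Y_toy <- matrix(rnorm(3 * n_cells), nrow = 3)             # genes x cells
X_toy <- cbind(1, rnorm(n_cells))                         # cells x fixed effects
Z_toy <- model.matrix(~ 0 + factor(rep(1:3, each = 4)))   # cells x samples

# Split the cells into blocks and add up each block's contribution.
blocks <- split(seq_len(n_cells), rep(1:3, length.out = n_cells))
XX <- 0; XY <- 0; ZX <- 0; ZY <- 0; ZZ <- 0; Ynorm <- 0
for (idx in blocks) {
  Xb <- X_toy[idx, , drop = FALSE]
  Zb <- Z_toy[idx, , drop = FALSE]
  Yb <- Y_toy[, idx, drop = FALSE]
  XX <- XX + crossprod(Xb)          # t(Xb) %*% Xb
  XY <- XY + t(Yb %*% Xb)
  ZX <- ZX + crossprod(Zb, Xb)      # t(Zb) %*% Xb
  ZY <- ZY + t(Yb %*% Zb)
  ZZ <- ZZ + crossprod(Zb)
  Ynorm <- Ynorm + rowSums(Yb * Yb)
}

# The accumulated statistics agree with those computed on the full data.
all.equal(XX, t(X_toy) %*% X_toy)
all.equal(XY, t(Y_toy %*% X_toy))
all.equal(ZZ, t(Z_toy) %*% Z_toy)
all.equal(Ynorm, rowSums(Y_toy * Y_toy))
```

In practice each block would be read from disk inside the loop, and `n` would be the total number of cells across all blocks.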
**Hypothesis tests**
```{r}
test <- lmmtest(fit)
# head(test)

# t-values: lmmtest() output matches the transposed fit$t
all(t(fit$t) == test[, grep("_t", colnames(test))])
fit$t[, 1:5]

# p-values: lmmtest() output matches the transposed fit$p
all(t(fit$p) == test[, grep("_p", colnames(test))])
fit$p[, 1:5]
```
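As a reminder of how t-values and p-values relate in general, the sketch below converts a few made-up t-values into two-sided p-values with base R and applies a Benjamini-Hochberg adjustment. It assumes a two-sided t-test with a hypothetical number of degrees of freedom; it does not claim to reproduce the degrees of freedom that FLASHMM uses.

```r
# Generic two-sided t-test p-values (illustrative; `df` is a hypothetical
# value, not necessarily the degrees of freedom FLASHMM uses).
t_values <- c(-2.5, 0, 1.2, 4.0)
df <- 100

p_values <- 2 * pt(-abs(t_values), df = df)

# Optional multiple-testing adjustment across genes (Benjamini-Hochberg).
p_adj <- p.adjust(p_values, method = "BH")
data.frame(t = t_values, p = p_values, p_adj = p_adj)
```

When many genes are tested at once, an adjustment of this kind is a common way to control the false discovery rate.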