
The expression level of each gene in a given cell was quantified by counting the total number of reads mapping to it.
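As a minimal sketch of this quantification step, the snippet below tallies reads per gene and per cell and arranges the tallies as a genes-by-cells count matrix. The input format (one `(cell, gene)` pair per mapped read), the function names `count_reads` and `to_matrix`, and the toy identifiers are all assumptions made for illustration. They are not the pipeline used to produce the case-study data.

```python
from collections import Counter


def count_reads(assignments):
    """Tally mapped reads per (cell, gene) pair.

    `assignments` is an iterable of (cell_id, gene_id) tuples,
    one tuple per mapped read (hypothetical input format).
    """
    return Counter(assignments)


def to_matrix(counts, genes, cells):
    """Arrange tallies as a genes-by-cells matrix (list of rows).

    Gene/cell combinations with no mapped reads are filled with 0.
    """
    return [[counts.get((cell, gene), 0) for cell in cells] for gene in genes]


# Toy example with made-up identifiers.
reads = [
    ("cell1", "geneA"),
    ("cell1", "geneA"),
    ("cell1", "geneB"),
    ("cell2", "geneB"),
]
counts = count_reads(reads)
matrix = to_matrix(counts, genes=["geneA", "geneB", "geneC"], cells=["cell1", "cell2"])
```

Here each row of `matrix` corresponds to a gene and each column to a cell. Genes with no mapped reads in a cell receive an explicit zero, which is the source of the many zeros that scRNA-seq methods must account for.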

To map the developmental trajectories of the multiple cell lineages arising from HBCs, scRNA-seq was performed on FACS-purified cells using the Fluidigm C1 microfluidics cell capture platform followed by Illumina sequencing. The scRNA-seq dataset we use as a case study was generated to study the differentiation of HBC stem cells into the different cell types present in the olfactory epithelium. Following severe injury to the entire tissue, the olfactory epithelium can regenerate from normally quiescent stem cells called horizontal basal cells (HBCs), which become activated to differentiate and reconstitute all major cell types in the epithelium. The olfactory epithelium contains mature olfactory sensory neurons (mOSNs) that are continuously renewed via neurogenesis through the differentiation of globose basal cells (GBCs), which are the actively proliferating cells in the epithelium. This workflow is illustrated using data from an scRNA-seq study of stem cell differentiation in the mouse olfactory epithelium (OE) (Fletcher et al., 2017). Throughout the workflow, we use a single SummarizedExperiment object to store the scRNA-seq data along with any gene- or cell-level metadata available from the experiment. See Figure 1. Here, we propose an integrated workflow for downstream analysis, with the following four main steps: (1) dimensionality reduction accounting for zero inflation and over-dispersion, and adjusting for gene- and cell-level covariates, using the zinbwave Bioconductor package; (2) robust and stable cell clustering using resampling-based sequential ensemble clustering, as implemented in the clusterExperiment Bioconductor package; (3) inference of cell lineages and ordering of the cells by developmental progression along lineages, using the slingshot R package; and (4) DE analysis along lineages.
However, these workflows are mostly used to prepare the data for further downstream analysis and do not focus on steps such as cell clustering and lineage inference. In these workflows, single-cell expression data are organized in objects of the SCESet class, allowing integrated analysis.


(2016) and the package scater (McCarthy et al., 2017) are such examples based on open-source R software packages from the Bioconductor Project (Huber et al., 2015). ScRNA-seq low-level analysis workflows have already been developed, with useful methods for quality control (QC), exploratory data analysis (EDA), pre-processing, normalization, and visualization. For example, the Chromium Single Cell 3’ Solution was recently used to sequence and profile about 1.3 million cells from embryonic mouse brains. This is all the more true with novel sequencing technologies that allow an increasing number of cells to be sequenced in each run. While each individual method is useful on its own for addressing a specific question, there is an increasing need for workflows that integrate these tools to yield a seamless scRNA-seq data analysis pipeline. To properly account for features specific to scRNA-seq, such as zero inflation and high levels of technical noise, several novel statistical methods have been developed to tackle questions that include normalization, dimensionality reduction, clustering, the inference of cell lineages and pseudotimes, and the identification of differentially expressed (DE) genes. Single-cell RNA sequencing (scRNA-seq) is a powerful and promising class of high-throughput assays that enable researchers to measure genome-wide transcription levels at the resolution of single cells. Using stem cell differentiation in the mouse olfactory epithelium as a case study, this integrated workflow provides a step-by-step tutorial on the methodology and associated software for the following four main tasks: (1) dimensionality reduction accounting for zero inflation and over-dispersion and adjusting for gene- and cell-level covariates; (2) cell clustering using resampling-based sequential ensemble clustering; (3) inference of cell lineages and pseudotimes; and (4) differential expression analysis along lineages.
However, such assays raise challenging statistical and computational questions and require the development of novel methodology and software. Novel single-cell transcriptome sequencing assays allow researchers to measure gene expression levels at the resolution of single cells and offer the unprecedented opportunity to investigate fundamental biological questions at the molecular level, such as stem cell differentiation or the discovery and characterization of rare cell types.
