Envisioning the Future of Multiomics
Innovative Tools Driving Research and Discovery in Cancer and Immunology
Article Collection
Sponsored by:
Fundamentally alter your understanding of cancer and accelerate translational research with flexible and innovative solutions for single cell sequencing and spatially-resolved transcriptional profiling from 10x Genomics.
• Unravel the complexities of heterogeneous cancer samples to detect tumor clones and unique cellular states that drive malignancy
• Resolve the tumor microenvironment and explore the influence of cancer on its resident tissue
• Advance immunotherapies by characterizing the tumor immune response and the molecular mechanisms underlying therapeutic response and resistance
Resolve cancer with single cell and spatial multiomics
Chromium Single Cell Solutions
• Single Cell Gene Expression
• Single Cell Immune Profiling
• Single Cell Epigenomic Profiling
• Single Cell Protein Expression
• Targeted Gene Expression
Visium Spatial Solutions
• Spatial Gene Expression
• Spatial Protein Expression
• Targeted Gene Expression
Learn more at 10xgenomics.com/cancer
Contents
Introduction
Single-Cell Sequencing in Translational Cancer Research and Challenges to Meet Clinical Diagnostic Needs
BY ULRICH PFISTERER, JULIA BRÄUNIG, PER BRATTÅS, MARKUS HEIDENBLAD, GÖRAN KARLSSON, THOAS FIORETOS
Identification of a Tumor–Specific Gene Regulatory Network in Human B-cell Lymphoma
BY 10x GENOMICS
Recent advances in single-cell multimodal analysis to study immune cells
BY RAYMOND HY LOUIE & FABIO LUCIANI
Genomic Cytometry and New Modalities for Deep Single-Cell Interrogation
BY ROBERT SALOMON, LUCIANO MARTELOTTO, FATIMA VALDES-MORA, DAVID GALLEGO-ORTEGA
Computational Approaches for High-Throughput Single-Cell Data Analysis
BY HELENA TODOROV AND YVAN SAEYS
COVER IMAGE © 10x Genomics
Introduction
From cancer to immunology, single cell RNA-sequencing (RNA-seq) has dramatically changed how researchers approach biology. Single cell resolution has advanced the concept of inherent heterogeneity of biological systems and led to novel advances in how we understand developmental processes, treat disease, and develop therapeutics. Now, biologists can further increase the breadth of their understanding with multiomic single cell analysis. In addition to a readout of mRNA abundance from single cell RNA-seq, single cell techniques can now be applied to profile DNA, chromatin state, and the proteome. To take it one step further, some methods enable next generation multiomics: the ability to capture multiple measurements simultaneously from the same single cell, rather than examining one readout at a time. This abundant data can provide novel insights, but it also presents new challenges, including how to collect, store, and manage data; integrate different modalities; and properly interpret findings.
This collection of articles provides an overview of the exciting innovations occurring at the forefront of multiomics. The first two articles focus on oncology. Cancer research is dedicated to improving cancer diagnostics, patient stratification, treatment monitoring, and therapeutic development. Single cell multiomics has provided increasingly detailed cell atlases that let researchers gain a better picture of tumor heterogeneity and investigate how that heterogeneity impacts disease progression and treatment response. Pfisterer et al. (2020) describe how the latest single cell multiomic techniques can be applied to cancer research, review the methods available for single cell isolation, and highlight recent multiomic single cell oncology studies. In our Data Spotlight from 10x Genomics, the simultaneous readout of epigenomic and transcriptomic data from the same cells enables the direct reconstruction of cell type–specific gene regulatory networks for B-cell lymphoma. This study highlights the power of using Chromium Single Cell Multiome ATAC + Gene Expression, the first commercial solution for paired ATAC-seq and RNA-seq analysis of single cells. The data from this study is available for download so you can continue to explore the possibilities yourself.
In Louie and Luciani (2021) our attention shifts to the immune system, another heterogeneous system that benefits from single cell investigation. Analysis of multiple modalities, including chromatin state, transcription status, and protein expression, can provide greater stratification of immune cell states, including a cell’s ability to bind antigens, attack invading cells, and follow a path of differentiation. Cell states can change over time and across space, and multiomic technologies have been developed to evaluate each of these variables. This article discusses recent single cell multiomic applications to immunology, focusing on next generation multiomic techniques that enable simultaneous measurement of at least two distinct modalities from the same single cell. Of particular relevance for immunologists is the ability to track clonal differentiation of T or B cells using receptor sequencing in the context of CAR-T therapy, autoimmune disease, and lineage tracing of hematopoietic progenitor cells.
The proliferation of multiomic single cell approaches has been made possible by the confluence of several disparate technologies, including genomics, microfluidics, cytometry, and informatics. Genomic Cytometry, described by Salomon et al. (2020), is any technique that provides cell-by-cell measurement of multiple modalities, including protein, mRNA, DNA, and epigenetic states, through a sequencing-based readout, therefore overcoming the limitations of fluorescence and mass cytometry by opening up unlimited analytic space to quantify hundreds of thousands of different molecular species at once. Multiple methods exist to perform Genomic Cytometry, including plate-based, droplet-based microfluidics, solid microfluidics, in situ combinatorial indexing, image-based approaches, and spatial transcriptomics.
Gathering data is only the beginning, however. In Todorov and Saeys (2019), we examine the process of analyzing a single cell experiment, including power calculations performed during experimental design, inclusion of controls during data generation, pre-processing, checking for batch effects during data visualization, cell type identification, differential analysis, and more. This article reviews methods for dimensionality reduction and cell clustering, compares approaches for trajectory analysis, and provides an introduction to single cell multiomic data integration.
These articles are designed to provide a comprehensive understanding of the technical innovations happening right now in single cell multiomics, and highlight how these advances are fueling the future of biological research and medicine.
REVIEW ARTICLE
Single-cell sequencing in translational cancer research and challenges to meet clinical diagnostic needs
Ulrich Pfisterer1,2 | Julia Bräunig1,2 | Per Brattås1,2 | Markus Heidenblad1,2 |
Göran Karlsson3 | Thoas Fioretos1,2,4
1Center for Translational Genomics, Lund
University, Lund, Sweden
2Clinical Genomics Lund, Science for Life
Laboratory, Lund University, Lund, Sweden
3Division of Molecular Hematology, Lund
Stem Cell Center, Lund University, Lund,
Sweden
4Division of Clinical Genetics, Department of
Laboratory Medicine, Lund University, Lund,
Sweden
Correspondence
Ulrich Pfisterer, Department of Laboratory
Medicine, Center for Translational Genomics,
Lund University, Lund, Sweden.
Email: [email protected]
Thoas Fioretos, Division of Clinical Genetics,
Department of Laboratory Medicine, Lund
University, Lund, Sweden.
Email: [email protected]
Funding information
Governmental ALF grants; Lund University
Cancer Center (LUCC); Medical Faculty Lund
University; SciLifeLab Stockholm;
StemTherapy Lund University
Abstract
The ability to capture alterations in the genome or transcriptome by next-generation
sequencing has provided critical insight into molecular changes and programs under-
lying cancer biology. With the rapid technological development in single-cell
sequencing, it has become possible to study individual cells at the transcriptional,
genetic, epigenetic, and protein level. Using single-cell analysis, an increased resolu-
tion of fundamental processes underlying cancer development is obtained, providing
comprehensive insights otherwise lost by sequencing of entire (bulk) samples, in
which molecular signatures of individual cells are averaged across the entire cell population.
Here, we provide a concise overview of the application of single-cell analysis
of different modalities within cancer research by highlighting key articles of their
respective fields. We furthermore examine the potential of existing technologies to
meet clinical diagnostic needs and discuss current challenges associated with this
translation.
KEYWORDS
cancer research, clinical diagnostics, clinical utility, single-cell sequencing
1 | INTRODUCTION
Cancer represents highly complex and diverse pathological conditions,
characterized by aberrant genomic, epigenomic and transcriptomic
features, such as structural alterations, single nucleotide and copy
number variations (SNVs, CNVs),1 and altered epigenetic and tran-
scriptional signatures.2,3 Both intra- and intertumoral heterogeneity
contribute to the complexity of cancer with mutations in driver genes
adding on to clonal evolution4 and consequently to dynamic clonal
architecture throughout disease progression.5 Recent years have
witnessed dramatic progress in studying the genetic and molecular
basis of human cancer, enabled, in part, by the rapid technological
developments in next-generation sequencing.6,7 Clinical applications
for these platforms span the areas of diagnostics, prognostics and
therapeutics using massively parallel sequencing for whole-genome
(WGS) or targeted DNA-sequencing (eg, whole-exome sequencing,
WES), RNA-sequencing, chromatin immunoprecipitation (ChIP)-sequencing,
and DNA methylation assays for epigenetic mapping.
In order to further improve clinical application of sequencing-
based technology and to ultimately provide better cancer diagnosis,
patient stratification, treatment monitoring, and personalized therapy,
the recent initiative of the Human Tumor Atlas Network aims at the
generation of longitudinal cell atlases of various tumor types
employing single-cell and spatially resolved technologies.8
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any
medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
© 2021 The Authors. Genes, Chromosomes and Cancer published by Wiley Periodicals LLC.
The considerable cellular heterogeneity present in most tumors is
likely to contribute to the currently ineffective and highly individual
responses of patients to therapeutic approaches. While bulk analyses
of tumor tissues have provided important insight into, for example, the
transcriptional signature or overall genetic variability of a given
tissue,9-12 they do not resolve the cellular composition of malignant
and normal cells. Hence, resolving tumor composition at single-cell
resolution offers great potential not only to provide critical insights
into tumor biology per se, but also to shed light on other therapeuti-
cally relevant issues related to heterogeneity such as tumor microen-
vironment, cell-of-origin, and cancer stem cells. Thus, the advent of
single-cell analyses promises to improve diagnosis, facilitate monitor-
ing of both disease progression and treatment response and will,
hopefully, pave the way to more personalized therapeutic approaches
to realize the promises of precision medicine (Figure 1A-D).
The importance of elucidating cancer at single-cell resolution has
been demonstrated in a plethora of studies which have allowed inves-
tigators to assess tumor heterogeneity, to define cell types and states
in healthy specimens and tumors, as well as to examine heterogeneous
treatment response and drug resistance, among other clinically rele-
vant applications.13,14
Rapid technological development makes it feasible today to access
many different modalities in single-cells,15 in some cases to profile more
than one measure from a single-cell simultaneously16-23 and to perform
advanced computational analyses.24-26 Several different modalities have
been applied to cancer research using either dissociated single-cells or
intact tissue with spatial resolution (Figure 2). While single-cell sequenc-
ing is progressively applied to study clinical cancer samples, its broader
translation into clinical diagnostics has yet to come.
In order to translate single-cell analyses into reliable clinical appli-
cations, thorough assessment will be essential to define a technology's
overall clinical applicability, which relies on its demonstrated analytical
and clinical validity as well as clinical utility.27-29 Here, we define ana-
lytical validity as the confidence of a given test to measure the pres-
ence or absence of a disease-related alteration. In contrast, clinical
validity is determined as the accuracy and confidence with which a
detected variation can be related to a distinct disease phenotype.
Finally, clinical utility determines whether a test result will yield medi-
cal intervention to ultimately improve the patients' health or, where
treatment is unavailable, support clinical diagnosis of patients.28
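Analytical validity as defined above is often quantified as sensitivity and specificity against a reference method. The following Python sketch illustrates that calculation under stated assumptions: the function name `sensitivity_specificity`, the boolean encoding of calls, and the toy data are hypothetical and are not taken from the cited studies.

```python
def sensitivity_specificity(truth, calls):
    """Return (sensitivity, specificity) of test calls against a reference.

    truth: list of bool, True if the alteration is present per the reference.
    calls: list of bool, True if the test under evaluation reports it.
    Both encodings are illustrative assumptions.
    """
    if len(truth) != len(calls):
        raise ValueError("truth and calls must have the same length")
    tp = sum(1 for t, c in zip(truth, calls) if t and c)          # true positives
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)      # false negatives
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)  # true negatives
    fp = sum(1 for t, c in zip(truth, calls) if not t and c)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: four samples with the alteration, two without.
reference = [True, True, True, True, False, False]
test_calls = [True, True, True, False, False, True]
sens, spec = sensitivity_specificity(reference, test_calls)
```

Clinical validity and clinical utility cannot be derived from such counts alone, since they depend on the link between the detected alteration and phenotype or outcome.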
This review provides an overview of currently available single-cell
sequencing technologies and how such technologies have been used
recently to provide important insights into the molecular basis of can-
cer. Furthermore, it discusses selected studies of different tumor
types, the results of which suggest that single-cell sequencing will
have great clinical utility in the near future, and highlights challenges
and hurdles that exist in order for single-cell sequencing to meet clini-
cal diagnostic needs.
FIGURE 1 Schematic representation of different scenarios where single-cell resolution is beneficial. A, Monitoring cellular tumor composition from diagnosis through treatment to monitor treatment response and to potentially refine therapy. B, Immune profiling of tumors to decompose the different immune cells and cell states infiltrating the tumor tissue. C, Large-scale analysis of tumor composition to decipher intratumor heterogeneity as well as heterogeneity of tumors of the same origin among patients. D, Monitoring clonal composition, for example during treatment, to determine whether a specific therapeutic approach is efficient
2 | APPLICATION OF SINGLE-CELL mRNA-SEQUENCING IN CANCER
Isolation of single cells may follow different principles: individual cells
may be handpicked or sorted into PCR plates by flow cytometry. It is
further possible to directly dispense cells into chips harboring several
thousand nanowells, to trap single cells in channels and capture sites
of microfluidic devices, as well as to encapsulate single cells in nano
oil-droplets using yet other microfluidic devices. With a growing
demand to study large cell numbers, microfluidic devices for droplet-
based cell capturing are at present among the most common
platforms used. Importantly, the principle of cell isolation does not
necessarily restrict the modalities which can be analyzed. An overview
of the most widely adopted single-cell isolation principles, platforms
and modalities can be found in Table 1.
Single-cell transcriptomics has greatly increased our understanding
of the composition of complex tissues30-33 and has facilitated the study
of a wide variety of human diseases at unprecedented depth.34-39 At
present, a large number of single-cell transcriptomics methodologies
and platforms are at hand (Smart-seq2,40 Smart-seq3,41 STRTseq,42
Cyto-seq,43 inDrop,44 Drop-seq,45 10X Genomics46 as well as CEL-
seq2,47 Quartz-seq,48 MARS-seq,23 Seq-Well49).
Combinatorial indexing,50 where individual cells undergo several
rounds of molecular barcoding, in combination with droplet-based
methods as exemplified in a preprint51 have furthermore greatly
increased the number of cells which can be profiled in a single experi-
ment. This elevated throughput leveraged large-scale studies as exem-
plified by profiling 690 000 single-cells of the adult mouse brain
giving rise to a comprehensive cell atlas of the rodent brain.52 While
the most relevant single-cell genomics applications have been used in
a plethora of different studies53 and even have been subjected to sys-
tematic comparison regarding cost and information content,54,55
single-cell transcriptomics is still in the early stage of clinical transla-
tion and application.
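To illustrate why several rounds of molecular barcoding raise the number of cells that can be profiled, the Python sketch below counts the barcode combinations available after a given number of rounds and estimates how often two cells would share a combination. It assumes that each round assigns one of a fixed number of barcodes uniformly at random; the values of 96 barcodes, three rounds, and 10 000 cells are illustrative and are not taken from the cited protocols.

```python
def barcode_combinations(barcodes_per_round: int, rounds: int) -> int:
    """Number of distinct barcode combinations after several indexing rounds."""
    return barcodes_per_round ** rounds


def expected_collision_fraction(n_cells: int, n_combinations: int) -> float:
    """Expected fraction of cells sharing their combination with another cell.

    Assumes every cell draws its combination uniformly and independently,
    which is a simplification of real experiments.
    """
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_cells - 1)


# Hypothetical design: 96 barcodes per round, three rounds, 10 000 cells.
combos = barcode_combinations(96, 3)
collisions = expected_collision_fraction(10_000, combos)
```

Under these assumptions, adding a round multiplies the barcode space by the number of barcodes per round, which is why the collision fraction drops quickly as rounds are added.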
One of the very first attempts to assess the transcriptome of single
cancer cells was described with the development of the original Smart-
seq chemistry.56 The potential of single-cell RNA-sequencing for diag-
nostic purposes was initially demonstrated on a metastatic breast can-
cer cell line (MDA-MB-231) by monitoring clonal evolution manifested
in transcriptional alterations and mutation analyses inferred from
mRNA reads along treatment with Paclitaxel, which provided novel
insight into drug resistance dynamics.57 Furthermore, single-cell
sequencing of lung adenocarcinoma (ADC) cells identified a distinct
transcriptional signature of cells associated with resistance to anti-cancer drugs.58
FIGURE 2 Overview of different modalities for single-cell analysis of cancer tissues. Tumor tissues may be analyzed using either tissue-destructive methods following tissue dissociation or by maintaining the spatial location of the cells in a given tissue. To date, a plethora of platforms and chemistries exist to access different modalities in single-cells in tumor tissue. They enable metabolome analysis, genome sequencing, cell surface and immune cell receptor profiling, epigenetic modifications as well as sequencing of the transcriptome. To date, only certain modalities can be studied at spatial resolution (dotted line), whereas tissue-destructive methodologies are available to study all of the single-cell modalities depicted
Similarly, single-cell resolution of the transcriptome in
renal cell carcinoma has shed new light on intratumor heterogeneity
and led to the derivation of a new, combinatorial therapeutic strategy.59
Moreover, two single-cell studies assessed the transcriptome of circu-
lating tumor cells (CTCs) in prostate cancer (PC)60 and breast cancer.61
This led to the identification of two distinct phenotypes of breast can-
cer CTCs with the capacity to interconvert and potentially contribute
to treatment resistance.61 Single-cell analysis further revealed great
diversity of PC CTCs among treated patients and identified splice vari-
ants and mutations in the androgen receptor (AR) gene, associating
failed AR inhibitor treatment to noncanonical Wnt signaling.60 Overall,
analysis of single CTCs may open up for exciting possibilities for future
non-invasive, single-cell diagnostics.
Tumor heterogeneity has been elucidated in glioblastomas,38
breast cancer,62 and large-scale tumor cell atlases have been gener-
ated in lung,63,64 renal,65 and pediatric brain tumors66 at single-cell
resolution. Interestingly, validating single-cell transcriptomics data
with bulk RNA-sequencing, proteomics and functional studies con-
firmed novel phenotypes of endothelial cells which in turn potentially
opens up for new therapeutic target points blocking tumor angiogene-
sis in lung cancer.64 Further technological advancement leveraging
increased cellular throughput, led to the identification of discrete tran-
scriptional programs and cellular compositions in relation to increasing
clinical grade tumors of glioma,37 distinct transcriptional programs of
tumor-associated macrophages in glioma67 and to the determination
of varying gene signatures in malignant cells in head and neck squa-
mous cell carcinoma.68 Unbiased clustering of single-cells obtained
from colorectal cancer not only discovered novel cancer-associated
fibroblast types and unmasked tumor heterogeneity, but importantly
also demonstrated that single-cell transcriptomics provides prognostic
insight previously hidden in bulk sequencing data.69 Furthermore, dec-
iphering of cellular composition of breast cancer patient-derived
xenografts (PDX) identified a stem-like cell type with high epidermal
growth factor receptor (EGFR) gene expression levels and further
linked high EGFR expression to an elevated mesenchymal gene
signature,70 similar to another study identifying elevated expression
of epithelial-to-mesenchymal (EMT) - associated genes in breast can-
cer cells.71 Following breast cancer samples along the treatment
course of several years, integrated single-cell genome and trans-
criptome analyses identified discrete phenotypes associated with
chemoresistance, with the most prominent upregulation being an
EMT gene signature.72
TABLE 1 Overview of different single-cell isolation principles and corresponding platforms most commonly used in the cited literature of this review
Single-cell isolation principle | Examples of platforms or chemistries | Modalities studied in selected references applying various platforms
Manual isolation and dispensation into tubes or plates | Serial dilution | SNV
Manual isolation and dispensation into tubes or plates | Hand picking | mRNA, inferred SNV
Manual isolation and dispensation into tubes or plates | Mouth pipetting | mRNA, TCR expression, SNV, CNV, methylome
Fluorescence-activated cell sorting into tubes or plates | Various chemistries (eg, Smart-seq2) | mRNA, TCR expression, SNV, CNV, methylome, ATAC
Fluorescence-activated cell sorting into tubes or plates | MARS-seq | mRNA
Fluorescence-activated cell sorting into tubes or plates | TCR-seq | TCR expression
Fluorescence-activated cell sorting into tubes or plates | QRP DOP-PCR | CNV
Immunomagnetic cell separation | MagSweeper | SNV, CNV
Cell dispensation into nanowells | iCell8cx | mRNA, DNA
Cell dispensation into nanowells | cellenONE, sciFLEXARRAYER S3 | CNV
Cell dispensation into nanowells | Seq-well | mRNA
Microfluidics with capture sites | Fluidigm C1 | mRNA, SNV, CNV, ATAC
Microfluidics with capture sites | DEPArray (Menarini Silicon Biosystems) | CNA
Microfluidics with nanodroplets | 10X Genomics | mRNA, TCR expression, inferred SNV/CNV, ATAC, cell surface proteins
Microfluidics with nanodroplets | inDrop | mRNA, TCR expression
Microfluidics with nanodroplets | MissionBio | SNV/CNV, cell surface proteins
Microfluidics with nanodroplets | Custom-built | Chromatin immunoprecipitation
Abbreviations: ATAC, assay for transposase-accessible chromatin; CNV, copy number variation; MARS-seq, massively parallel RNA single-cell sequencing; QRP DOP-PCR, quasi-random priming degenerate oligonucleotide primed polymerase chain reaction; SNV, single-nucleotide variant; TCR, T cell receptor; TCR-seq, T cell receptor sequencing.
While combined analysis of DNA and RNA in individual cells is
feasible,22 current protocols are not amenable to high cellular
throughput and have therefore not frequently been used. Complementation
of gene expression with inferred CNV from full-length
single-cell mRNA to distinguish malignant cells37,38,71,73,74 or targeted
genotyping35 present attractive alternatives to comprehensively study
human malignancies. Accordingly, chromosomal aberrations characteristic
of glioblastoma were inferred in tumor cells38 and the classification
of malignant cells in glioma was corroborated.37 In line with
this, cancer-specific genomic aberrations could be inferred from
single-cell transcriptomics data and were restricted to malignant glioma
cells. Haplotype inference additionally revealed heterozygous
loss of chromosome 14 alleles in glioma tumors.74 Interestingly, RNA-inferred
CNV information clearly distinguished immune from carcinoma
cells in breast cancer,71 opening up the possibility of unrestrained
profiling of both cell types without the use of cell surface
markers. Deducing genomic alterations such as CNV from mRNA-sequencing
also aided the delineation of cellular hierarchies in
oligodendroglioma.75
More recently, high throughput single-cell mRNA approaches
such as Seq-well49 combined with targeted genotyping were used to
elucidate molecular hierarchies in acute myeloid leukemia (AML) and
confidently identified six malignant AML cell types with mutations
being absent in healthy donor samples.76 This study furthermore
combined both short- and long-read sequencing technologies to
determine genetic aberrations such as insertions, deletions and gene-
fusions in individual cells. It additionally employed a large cohort of
longitudinally collected AML samples (diagnosis, treatment,
remission),76 thereby suggesting that single-cell transcriptome analy-
sis may be applied to monitor treatment response and putatively aid
clinical decision making. However, in order for this approach to pro-
vide analytical and ultimately clinical validity, it needs to possess
greater detection sensitivity of mutation signatures. Hence, while this
work leveraged large-scale analysis, about 40% of the targeted sites
were not detected and mutations located in proximity to either the
3′ end of the mRNA or to an internal polyadenylation site were captured
more efficiently. This is directly linked to the design of the
sequencing library preparation in Seq-well49 which preferentially
yields sequences toward the 3′ end of mRNA transcripts via polyT-capture
sequences. In line with this, Petti and co-workers utilized a
droplet-based platform to infer genomic information from single-cell
transcriptomes and were able to deduce SNV information in 23% of
the cells analyzed.77 In addition, the authors confidently distinguished
normal from tumor cells and successfully identified a cell-surface
marker (CD99) from the single-cell transcriptomics data, enabling
the precise isolation of distinct clonal cells.77 While this study fell
short in identifying novel cell-surface markers, it nicely demonstrated
the possibility for precise isolation of malignant cells for refined
downstream analyses. Recently, uveal melanoma (UM) was studied
by integrated mRNA and B and T cell receptor (BCR and TCR)
expression.78 Inferring genomic aberrations present in the single-cell
transcriptomics data using the software inferCNV, both canonical
and non-canonical CNVs were identified across all samples, delineat-
ing clonal structures in the tumor tissue.78 This furthermore demon-
strates the applicability of single-cell transcriptome analysis to
deduce genomic variation in cancer.
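The general idea behind inferring CNVs from expression data can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the algorithm implemented in inferCNV or in the cited studies: it assumes per-gene average log-expression values for tumor and reference (normal) cells, with genes ordered by genomic position along one chromosome, and it flags regions whose smoothed expression deviates from the reference. The function names, window size, threshold, and toy values are all hypothetical.

```python
from statistics import mean


def smoothed_relative_expression(tumor, reference, window=3):
    """Moving average of tumor-minus-reference expression along the genome.

    tumor, reference: per-gene average log-expression values, with genes
    ordered by genomic position (an assumption of this sketch).
    """
    if len(tumor) != len(reference):
        raise ValueError("tumor and reference must cover the same genes")
    relative = [t - r for t, r in zip(tumor, reference)]
    half = window // 2
    smoothed = []
    for i in range(len(relative)):
        lo = max(0, i - half)                   # window is truncated at the ends
        hi = min(len(relative), i + half + 1)
        smoothed.append(mean(relative[lo:hi]))
    return smoothed


def call_regions(smoothed, threshold=0.5):
    """Label each gene position as a putative gain, loss, or neutral."""
    return ["gain" if v > threshold else "loss" if v < -threshold else "neutral"
            for v in smoothed]


# Hypothetical toy data: genes 4 to 7 show elevated expression in tumor cells.
tumor_expr = [1.0, 1.1, 0.9, 2.0, 2.1, 1.9, 2.0, 1.0]
normal_expr = [1.0] * 8
calls = call_regions(smoothed_relative_expression(tumor_expr, normal_expr))
```

Averaging over neighboring genes is what lets a broad chromosomal gain or loss stand out from gene-specific expression changes; real tools use much larger windows and additional normalization steps.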
Single-cell transcriptomics is a rapidly evolving technology which
already has yielded critical insight into cellular diversity of complex tis-
sues.31 The highlighted research in this section provided first insights
into pathological transcriptional changes underlying cancer develop-
ment and progression as well as response to therapy. These studies
clearly demonstrate a great potential for single-cell mRNA-sequencing
to become a clinical diagnostic tool in the near future. Most likely, the
first clinical applicability will be as a prognostic tool in the diagnostic
setting, for example in hematologic malignancies, to decipher the cel-
lular composition of normal and malignant tissue based on their tran-
scriptional signatures. However, this will require several large-scale
studies to demonstrate that cellular composition correlates with
important clinical parameters. Along with increased sensitivity and
reproducibility, single-cell mRNA-sequencing, in combination with
other modalities (see below) is likely to become increasingly important
in monitoring treatment response.
3 | SINGLE-CELL IMMUNE PROFILING IN CANCER
In the thymus, lymphoid progenitors are molded into committed T
cells which in turn play an important role in shaping the adaptive
immune system. Besides the acquisition of somatic mutations
throughout life in normal cells of different tissues, contributing to
cancerogenesis, progressive decline in T cell production in the thymus
has been associated with an increased incidence of age-related diseases,
including cancer.79 Moreover, the type of immune cells and
their location and density in a given tumor were postulated to possess
prognostic value, and it was suggested that a high frequency of cytotoxic
memory T cells in a tumor tissue was indicative of disease relapse post
treatment.80 These results exemplified the potential clinical benefit of
comprehensive immune cell profiling of tumors and strengthened the
ultimate necessity to retain spatial information within the tumor
tissue.
Since the development of T cells involves both the differentiation
of T lymphocytes and the generation and maintenance of a diverse
TCR repertoire, a precise understanding of the developmental processes
underlying T cell specification is of significant importance for understanding
disease progression in cancer. While targeted TCR analysis via
nested PCR has allowed analysis of several hundreds of single-cells,81
recent developments have made it possible to probe even larger num-
bers of T cells in an unbiased fashion,73,82,83 as well as in combination
with targeted TCR analysis.84 In addition, simultaneous profiling of
both transcriptomic and TCR signatures from thousands of individual
cells has been reported.85-87
A recent study revealed bias in VDJ gene usage during recombi-
nation of TCRβ throughout differentiation toward mature T cells by
integrating transcriptional signatures of cell states with the expression
data on TCR chains α and β.88 This observed bias in TCR recombina-
tion might impact the adaptive immune response and consequently an
individual's response to antigenic stimuli.
In the attempt to elucidate the tumor microenvironment, single-
cell analysis enabled the identification of molecular signatures of
exhaustion programs in T cells, their associated markers, and linked
dysfunctional signatures to tumor reactivity in human mela-
noma.73,84,85 It further led to the determination of a transcriptional sig-
nature of specific immune cells which could in turn be linked to patient
survival and improved existing prognostication of breast cancer
patients.89 Moreover, integrated mRNA- and targeted TCR-sequencing
revealed that dysfunctional T cells exhibited prominent clonal expan-
sion with continuous proliferation in metastatic melanoma.84
Single-cell analysis further defined clonotypes of T cells while
suggesting their activation status in the human hepatocarcinoma
(HCC) microenvironment,90 and identified a distinct dendritic cell type
capable of migrating from the tumor tissue to the hepatic lymph
node.91 Furthermore, valuable insight into transcriptional signatures
of tumor-infiltrating myeloid cells in lung ADC has been obtained
using high throughput single-cell mRNA-sequencing.92
A recent study utilized single-cell technology to elucidate the
composition of immune cells in the tumor microenvironment of breast
cancer.86 High-throughput integrated mRNA- and TCR-sequencing rev-
ealed an increased phenotypic diversity of both lymphoid and myeloid
cells in tumorous tissue, as opposed to normal breast tissue, and
exhibited inter-patient variation in metabolic signatures.86 Corroborating
their findings using two different platforms (inDrop and 10X Genomics),
the authors identified continuous T cell activation, which in part could
be explained by broad stimuli activating TCR repertoire, and showed
that tumor residing T cells were comprised of different clonotype clus-
ters with varying activation states.86 Overall, distinct phenotypes are
shaped by the TCR repertoire in response to antigenic stimuli but diver-
sity is also mediated by environmental stimuli such as hypoxia.86
More recently, integrated analysis of mRNA and TCR repertoires
in 141 623 T cells was performed in four different types of cancers
(non-small-cell lung ADC, endometrial ADC, colorectal ADC and renal
clear cell carcinoma) as well as in histologically normal adjacent tissue
(NAT) and peripheral blood.87 This study led to the discovery that
diverse clonal expansion patterns across patients with clonotypes being
either expanded similarly in the tumor and NAT or following differing
patterns.87 Moreover, this work shed light on the existence of a strong
correlation between peripheral and intratumoral clone size, a finding
which was substantiated by re-analyzing data from related studies investigating
T cells in non-small-cell lung cancer93 and colorectal cancer.94
In addition, non-exhausted T cell clones were more likely to be blood-
associated as opposed to exhausted clones and different clonal expan-
sion patterns were correlated with the clinical response of patients.87
These findings suggest that the detection of clones in blood may
serve as a useful proxy to determine the presence of clinically relevant,
expanded clones in the tumor, opening up the possibility of “liquid
biopsies” for monitoring treatment response following therapy with
Atezolizumab, Sunitinib, or IMmotion150 using single-cell technology.
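As an illustration of how a relationship between peripheral and intratumoral clone size could be quantified, the following minimal Python sketch counts cells per clonotype in each compartment and correlates the log-transformed sizes of the clonotypes found in both. The function names, the use of a Pearson correlation on log10 sizes, and the toy clonotype labels are assumptions made for illustration only; they are not taken from the cited study.

```python
from collections import Counter
from math import log10, sqrt


def clone_sizes(clonotype_per_cell):
    """Count cells per clonotype; input holds one clonotype label per T cell."""
    return Counter(clonotype_per_cell)


def pearson(xs, ys):
    """Plain Pearson correlation; needs >= 2 points with non-zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def shared_clone_correlation(blood_cells, tumor_cells):
    """Correlate log10 clone sizes of clonotypes detected in both compartments."""
    blood, tumor = clone_sizes(blood_cells), clone_sizes(tumor_cells)
    shared = sorted(set(blood) & set(tumor))
    xs = [log10(blood[c]) for c in shared]
    ys = [log10(tumor[c]) for c in shared]
    return shared, pearson(xs, ys)


# Toy data (hypothetical labels): clonotype D is blood-only, E is tumor-only.
blood = ["A"] * 8 + ["B"] * 4 + ["C"] * 2 + ["D"]
tumor = ["A"] * 16 + ["B"] * 8 + ["C"] * 4 + ["E"] * 3
shared, r = shared_clone_correlation(blood, tumor)
print(shared, round(r, 3))
```

In this toy example every shared clone is exactly twice as large in the tumor as in blood, so the correlation is perfect; real data would be noisier, and a rank-based statistic may be a more robust choice.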
Utilizing the same combined mRNA- and TCR-sequencing
approach on basal and squamous cell carcinoma samples before and
after immune checkpoint blockade (ICB) treatment suggested that
novel T cells exert treatment response rather than T cell clones
pre-existing in the tumor.95 Analysis of patients with metastatic mela-
noma responsive to ICB treatment displayed a greater fraction of
large T cell clones as opposed to non-responsive patients.85 Interest-
ingly, transcriptional alterations and gene modules induced by ICB
treatment did not correlate with the clinical outcome observed in
patients,85 rendering simultaneous profiling of TCR clonality a necessity
to deduce clinically relevant information.
Despite the correlation of therapy response with clone size, T cell
clonal specificity to distinct tumor antigens still needs to be determined
and integrated to define lasting predictive markers for the outcome
of different ICB therapies. Interestingly, TCR repertoire
analysis of CD8+ T cells in UM revealed that these cells strongly
expressed the checkpoint marker gene LAG3, whereas, unexpectedly,
expression of PD1 was minimal.78 This may in part explain the lack
of responsiveness of UM to checkpoint immunotherapy targeting
PD1. Moreover, single-cell analysis of immune cells from glioblas-
toma combined with murine models identified a distinct macrophage
type which in turn appeared to be a potential target for combinato-
rial immune therapy.96
Very recently, the development of single-cell metabolic regulome
profiling (scMEP) made it possible to study the highly dynamic func-
tions exerted by immune cells manifested in metabolomic alterations
at spatial resolution, deciphering metabolic profiles of CD8+ T cells in
the tumor microenvironment.97 The ability to analyze immune cell
migration into diseased tissue, which is tightly regulated by the cells'
metabolism, holds great promise to understand immune cell-mediated
processes in the tumor following treatment. In addition to studying
tumor immune cells based on mRNA and TCR expression or metabo-
lites, recent technological advances made it possible to generate cell
atlases of human tumors based on the expression of cell surface
markers complemented with single-cell transcriptome sequencing,
exemplified by the analysis of lung ADC.98
Taken together, single-cell immune profiling holds great potential
to refine existing therapies (Figure 1A) and has greatly increased our
understanding of how clonal composition of immune cells, both within
the tumor and adjacent tissue, is encoded in the transcriptome and
receptor repertoire (Figure 1B). Single-cell resolution has further
offered insight into intra-tumoral heterogeneity of immune cells and
potential bias in responsiveness to treatment, how regulation of metabolic
pathways underlies immune cell function, and how these pathways
may be exploited to devise novel therapeutic strategies
enhancing the overall immune cell response. Given the dramatic
impact of ICB in cancer treatment during recent years and the realiza-
tion that the immune system plays a critical role in cancer develop-
ment and progression, single-cell immune profiling is most likely to
become one of the first strategies reaching clinical diagnostics.
4 | EPIGENETIC ANALYSES OF CANCER AT SINGLE-CELL RESOLUTION
Besides immune infiltration, transcriptomic and genomic alterations,
epigenetic changes underlie cancer development and evolution, but
also disease prognosis and treatment outcome.99,100 Epigenetics constitutes
heritable cellular regulation of gene expression, which occurs
independently of the genetic information. Chromatin status, accessibility
and conformation are highly regulated by histone and genome
modifications, and by interactions between DNA and protein structures.
DNA methylation and histone acetylation have been the subject
of intensive research. Hypermethylation of promoter regions, a general
reduction in genomic 5-methylcytosine levels as well as the loss
of histone acetylation are commonly observed in cancer cells101,102
and ultimately contribute to altered gene expression regulation. In
contrast to genomic mutations and aberrations, epigenetic marks and
their deregulation are often reversible.
To date, several DNA methyltransferase inhibitors (DMTIs)
and histone deacetylase inhibitors have been investigated as anti-
cancer drugs and are approved by the FDA for several cancers.103
First trials with DMTIs yielded promising treatment results, but they
also evoked severe side effects.104,105 Lower treatment dosages
of DMTIs were similarly successful, but no major demethylation
effect was observed in bulk sequencing experiments in contrast
to higher treatment concentrations.106,107 Analysis of monoclonal
populations of the human colon carcinoma cell line HCT116 showed
that every clone has a distinct partial demethylation pattern and
that the resulting changes in epigenetic regulation are sufficient to
slow cancer cell proliferation.107 This monoclonal analysis exempli-
fied the necessity for single-cell resolution in cancer epigenetics
in order to unravel cellular heterogeneity, to devise novel therapies
and to monitor treatment. Today, several single-cell methods for
DNA methylation and chromatin accessibility are available to
study cancer (scATAC-seq, sciATAC-seq, scRRBS-seq, scChIP-seq,
scTrio-seq).108-112
Single-cell reduced-representation bisulfite sequencing (scRRBS-
seq) was used to trace cancer evolution by measuring alterations in
the methylome in both healthy individuals and patients with chronic
lymphocytic leukemia (CLL) before and after treatment.113 Overall,
this study revealed impaired B cell development in diseased individ-
uals and increased cell-to-cell heterogeneity of B cells in CLL as
opposed to healthy controls and normal B cells.113,114
The application of single-cell assay for transposase-accessible
chromatin sequencing (scATAC-seq) showed that breast cancer cell
lines clustered separately before and after JQ1-treatment based on
their epigenetic state.115 Furthermore, scATAC-seq identified a T cell
subpopulation responsive to PD-1 immunotherapy
and its underlying regulatory mechanism in basal cell carcinoma,116
and has pinpointed distinct transcription factor motifs which drive
cancer heterogeneity in leukemic cells.117
Unlike scATAC-seq, single-cell chromatin immunoprecipitation sequencing
(scChIP-seq) also captures repressed regions of the chromatin in addition
to accessible sites.110 Using this approach, a recent study concluded
that tumor cells resistant to the cytostatic drug Capecitabine
can be discriminated from non-resistant tumor cells based on their
chromatin status in a triple-negative breast cancer model, and that
distinct repressed H3K27me3 regions were associated with genes
responsible for therapy resistance.118
Besides monomodal single-cell approaches, multimodal methods
offer the possibility to assign an epigenome to a transcriptome,
genome, or proteome revealing the regulatory correlations between
them. Several scATAC-seq and single-cell bisulfite sequencing proto-
cols provided enough genome coverage to analyze CNVs.112,116 One
of the earliest single-cell studies in cancer epigenetics utilized bisulfite
sequencing, CNV and transcriptome analysis (scTrio-seq) to investi-
gate hepatocellular carcinoma (HCC).112 The authors found that the
event of a CNV did not alter the methylation pattern of the affected
DNA region and that aberrantly methylated regions did not overlap
with the presence of CNVs, but that both influenced transcriptional
levels. These results additionally confirmed that DNA methylation in
promoter regions correlates negatively with gene expression, whereas
DNA methylation in the gene body correlates positively with tran-
scription as demonstrated in HepG2 and HCC cells. However, only
26 single HCC cells from one patient were investigated, limiting
the general conclusions that can be drawn for HCC from this
study.112
ScRRBS-seq and Smart-seq2 data were obtained from the same
cell by separating mRNA and DNA, revealing an Ibrutinib sensitive B
cell subpopulation in CLL patients, which is expelled from the lymph
node upon treatment.113
Combining CITE-seq, Smart-seq2, and scATAC-seq to investi-
gate mixed-phenotype acute leukemia (MPAL) revealed that ana-
lyses based on either surface-protein expression, chromatin
accessibility or mRNA expression yielded reproducible and compa-
rable cell clusters.108 While MPAL is a rare disease displaying char-
acteristics of both AML and acute lymphoblastic leukemia (ALL),
MPAL patients are more responsive to ALL treatment compared to
AML therapies.119 Single-cell ATAC-seq and Smart-seq2 data pro-
vided the necessary resolution to show that distinct genes are uni-
versally upregulated in either MPAL or AML cancer cells,108
possibly explaining why AML treatments often fail in MPAL
patients. In addition, RUNX1 was associated with transcription fac-
tor binding motifs in MPAL cancer cells.108 Using single-cell combi-
natorial indexing ATAC-seq (sciATAC-seq), the potential regulatory
role of RUNX transcription factor motifs was investigated in a
murine lung ADC model, revealing that accessible RUNX transcrip-
tion factor motifs were mainly present during the metastatic stage.
Additionally, transcription factor scores were matched with differ-
ent tumor stages, as well as RUNX and NKX2.1 transcription factors,
which correlated with patient prognosis.109 Interestingly, while the
transcription factor NKX2.1 is used as a diagnostic marker in clinical
lung ADC,120 the metastatic sciATAC-seq cluster correlated better
with overall patient survival than the NKX2.1 cluster,109 suggesting
that the accessible chromatin status could be used as an improved
diagnostic marker.
Besides genomic DNA, mitochondrial DNA (mtDNA) is subject
to epigenetic alterations, SNVs and CNVs, which play a role in tumori-
genesis, cancer progression and drug resistance.121 Modification of a
droplet-based scATAC-seq protocol facilitated capturing of mtDNA
(scmtATAC-seq) and demonstrated that a 50x coverage of the mitochondrial
genome can yield robust CNV and even SNV data in addition
to accessible chromatin information.122 This revealed mutations and
CNVs related to disease progression and drug resistance in CLL
patients, with individual subpopulations showing impaired methyla-
tion patterns in genes related to drug resistance such as TIAM1 and
ZNF257. Interestingly, the small size of the mitochondrial genome
(only 16 kb) strongly reduces sequencing costs, potentially
facilitating broader application areas.
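To make the cost argument concrete, the sequencing effort implied by a given coverage can be estimated as genome size multiplied by coverage. The Python sketch below applies this to the 16 kb mitochondrial genome at the 50x coverage mentioned above. The 100 bp read length is an assumption chosen for illustration, and the calculation is idealized: it treats every read as mapping uniformly to mtDNA, which will not hold in a real scATAC-seq library where most reads derive from nuclear chromatin.

```python
from math import ceil

MT_GENOME_BP = 16_000  # "16 kb" as stated in the text (rounded value)
TARGET_COVERAGE = 50   # 50x coverage, as stated in the text
READ_LENGTH_BP = 100   # assumption for illustration, not from the source


def bases_for_coverage(genome_size_bp, coverage):
    """Total sequenced bases needed to reach a mean coverage (idealized)."""
    return genome_size_bp * coverage


def reads_needed(total_bases, read_length_bp):
    """Number of reads of a fixed length that supply the requested bases."""
    return ceil(total_bases / read_length_bp)


bases_per_cell = bases_for_coverage(MT_GENOME_BP, TARGET_COVERAGE)
reads_per_cell = reads_needed(bases_per_cell, READ_LENGTH_BP)
print(bases_per_cell, reads_per_cell)
```

Under these assumptions, fewer than one million mtDNA-derived bases per cell would suffice, which illustrates why the small target size keeps sequencing requirements low.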
At present, published single-cell studies in the field of cancer epi-
genetics have demonstrated that available methods and protocols are
sufficient to distinguish between healthy and diseased cell types, and
to elucidate cancer heterogeneity, progression, and treatment effects.
Identified subpopulations, transcription factor motifs, and regulatory
mechanisms could potentially predict patient outcome and drug resistance,
suggesting sufficient analytical and potentially even clinical
validity. However, as this is a relatively young field, modalities need
further refinement to accomplish analytical validity. ScRRBS-seq
covers fewer CpG islands than bulk bisulfite sequencing110 and in comparison
with single-cell bisulfite sequencing methods, scChIP-seq has
an overall lower genome coverage and a higher ratio of background
noise.123
Some methods, like scTrio-seq, offer a lower throughput,
impeding the ability to assess cancer heterogeneity in its
entirety. In the recently developed method Cleavage Under Targets
and Tagmentation (Cut&Tag), antibodies target defined histone
modifications and conjugated Tn5 cuts accessible DNA, which
reduces unspecific signals. Overall, analytical validity of novel
methods such as Cut&Tag124 remains to be demonstrated in cancer
research.
Finally, integration of epigenetic modifications with other modali-
ties such as mRNA or cell surface protein expression will be of impor-
tance to gain more complete insight on how the disease is manifested
and regulated, as well as to explain the effect of cancer-induced epi-
genetic changes.
5 | ASSESSMENT OF CLONAL HETEROGENEITY IN CANCER BY SINGLE-CELL DNA-SEQUENCING
Continuous gain of genetic variation in individual cells underlies tumor
initiation, maintenance and evolution. In particular, ongoing cell divi-
sion within tumor tissue fosters genetic mosaicism manifested in
CNVs, SNVs and gene breakpoints.1 While bulk DNA-sequencing has
demonstrated substantial genetic heterogeneity in cancers, such as
AML125 or primary renal carcinomas,126 determination of clonal struc-
ture of cancer types necessitates single-cell resolution.
Among the first methods to be used to interrogate clonal diversity
at single-cell resolution were PCR-based methods such as degenerate
oligonucleotide primed PCR (DOP-PCR),127 isothermal multiple dis-
placement amplification (MDA)128-130 as well as PicoPlex131 and mul-
tiple annealing and looping-based amplification cycles.132 Increased
cellular throughput was achieved by employing microfluidic devices130
and single-cell combinatorial indexed sequencing (sci-seq),133 with
microfluidics enabling stringent quality control via cell imaging, while
simultaneously reducing contaminating ambient DNA interfering with
genomic analyses.
Further optimization in part addressed the shortcomings of exis-
ting approaches with regard to low genomic coverage and allelic drop-
out rates, lack of uniformity, and polymerase-induced errors.134 As
such, recent methods have utilized DNA transposition in combination
with linear amplification135 or direct construction of sequencing-ready
libraries.136,137
Existing technologies facilitate the investigation of CNVs and
SNVs; however, other structural variations such as translocations and
inversions - relevant measures of disease prognosis - are more chal-
lenging to identify at single-cell resolution. Strand-seq enables the
generation of directional sequencing libraries and strand-specific
sequencing reads, yielding homolog resolution in single cells.138 This
was recently utilized to investigate evolutionary differences between
human and macaque based on genetic inversions139 and to develop
the analytical tool single-cell tri-channel processing (scTRIP)140
extracting and utilizing additional information from Strand-seq data.
While this approach enables more comprehensive analysis of
genomic complexity, it relies on the possibility to label nascent DNA
during replication, which excludes its application to clinical samples
containing non-dividing cells or nuclei.
Single-cell DNA-sequencing has been used intensively to deci-
pher clonal structures in different cancer types and to augment our
knowledge on tumor clonal evolution. An early study applied DOP-
PCR on 100 single nuclei isolated from two human breast cancer
cases and demonstrated that clonal evolution patterns can be inferred
from shallow single-cell WGS.127 While this study did not provide suf-
ficient coverage to resolve SNVs in a genome-wide manner, subse-
quent utilization of G2/M nuclei yielded comparably higher genome
coverage and improved both allelic dropout and false positive rate in
breast cancer samples.141 In this study, the authors indicated that
structural genomic alterations occur early during breast cancer evolu-
tion, while SNVs are acquired progressively and gradually contribute
to clonal diversity.141 The finding that the majority of single-cell
CNVs were clonal and stable during tumor growth of breast
cancer,142 additionally strengthened the notion that copy number
aneuploidy is acquired early during tumor evolution. Single-cell analy-
sis of breast cancer xenografts moreover corroborated that clonal
expansion dynamics represent reproducible trajectories, indicating
that clonal selection follows a non-random process with distinct muta-
tion genotypes defining clonal fitness and therefore clonal expansion
processes.143
In longitudinal breast cancer samples, bulk exome sequencing
integrated with single-cell DNA and RNA analyses provided insight
into clonal extinction in response to treatment and identified resistant
clones selectively expanded as a result of chemotherapy.144 Single-
cell analysis furthermore enabled the identification of patient-
individual clonal seeding patterns in colorectal cancer leading to the
metastatic tumor.145
Highly relevant with regard to clinical application was the dis-
covery that a large fraction of both trunk and metastatic mutations
could be recapitulated in CTCs from PC146 and that the CNV pattern
on a whole-genome scale of CTCs was not altered during the treat-
ment course of lung cancer.147 Furthermore, CNVs detected in
CTCs of ADC and small-cell lung cancer (SCLC) were reproducible
between cells and individuals.147 In line with this, copy number
aberrations in CTCs of SCLC were used to determine classifiers
supporting categorization of chemosensitive or chemorefractory
SCLCs.148 Single-cell WGS of 88 CTCs generated classifiers with
sufficient power to assign the vast majority (>80%) of CTC test
samples to either a chemosensitive or chemorefractory treatment
response.148 This suggests an exciting possibility for single-cell
analysis to provide analytical validity for future diagnostic purposes
similar to single-cell transcriptome studies targeting CTCs,60,61
especially in the absence of primary tumor tissue. However, in
order to reach closer to clinical validity and utility, the persistence
of CNV patterns in lung cancer CTCs needs to be corroborated in
larger patient cohorts. In addition, molecular classifiers capable of
predicting treatment response of SCLCs will require a larger
starting number of cells covering a more complete space of geno-
mic alterations.
Single-cell sequencing of hematologic malignancies such as AML
provided insight into the clonal architecture underlying this heterogeneous
disease entity, although in limited sample numbers.131 In contrast,
droplet-based cell capturing and barcoding opened up the
possibility to profile known genomic loci in AML at unprecedented
throughput.149,150 More recently, a similar approach leveraged analy-
sis of 735 483 single-cells obtained from 123 AML patients, unveiling
clonal evolution patterns and correlation of AML driver mutations.151
In total, a selection of 530 validated mutations were included in the
analysis, which in the case of a subset of longitudinal AML samples,
provided additional insight into the clonal evolution processes during
treatment.151 Furthermore, a very recent study utilized droplet-based,
targeted single-cell DNA-sequencing in AML on a large cohort of sam-
ples, providing insight into clonal complexity and co-occurring muta-
tions in epigenetic modifiers in AML along with changes in cell surface
protein expression underlying the pathogenesis of clonal
hematopoiesis.152
Integration of bulk exome and whole-genome sequencing on
51 cases of childhood ALL identified aberrant RAG recombinase activ-
ity as critical driving force for genomic aberrations underlying leuke-
mic transformation.153 Targeted genotyping of mutations and
structural variants derived from bulk exome sequencing allowed the
construction of phylogenetic trees. This confirmed that the fusion
gene ETV6-RUNX1, which is considered one of the initiating genomic
lesions in this form of ALL, was found in the root of both trees.153
Indication for RAG-mediated deletions in cells spanning the entire
phylogenetic tree further suggested that the genomic aberrations
observed were formed through a continuous process in these two
cases.153 Shortcomings of this study comprised the limited number of
single-cells processed; a relatively small number of genomic lesions
were analyzed and the dropout rates of mutant alleles were not thor-
oughly assessed. In contrast, microfluidic MDA targeted genome
sequencing of six patients of ALL provided higher cellular throughput
(1479 single-cells) and combined different computational approaches
for the identification of clonal structures and the removal of low qual-
ity cells due to WGA-induced noise.154 This allowed the authors to
identify clones co-occurring in most patients and to suggest a more
precise hierarchical clonal structure for ALL where the majority of
structural aberrations preceded point mutation acquisition and VDJ
recombination.154
The literature summarized above clearly demonstrates that
single-cell DNA-sequencing is capable of providing analytical validity,
for example, in elucidating tumor heterogeneity, and of monitoring
clonal evolution in response to treatment (Figure 1A,C,D), features
of importance in personalized medicine. Particularly, in cases with a
high prevalence of genomic lesions specific for a given cancer type,
targeted genomic approaches hold great potential to become a routine
diagnostic application in the near future. Finally, it will be of
essence to understand how different clones and their respective
expansion patterns influence tumor evolution and treatment
response.
6 | SPATIAL RESOLUTION TO AID CANCER TISSUE ANALYSES AND DIAGNOSTICS
In order to truly understand tumor behavior, particularly in solid can-
cers, both disease-related transcriptional and genomic alterations
need to be related to the cells' phenotypes in the spatial context of
the tumor microenvironment. Retaining information of type, density
and location of immune cells in colorectal cancer tissue demonstrated
an association of spatial immune cell composition with clinical out-
come.80 It was suggested that such immunological criteria could be
relevant for a clinical application in cancers where the density of
tumor-infiltrating T cells is linked to favorable prognosis. Current tech-
nologies for massively parallel processing of mRNA or genomic alter-
ations lack spatial resolution as a consequence of tissue dissociation,
which in addition has been shown to potentially induce misleading
transcriptional signatures.155
Different approaches have evolved to profile mRNA or protein
expression with spatial resolution while either preserving the tissue
structure156-159 or destroying it through the use of molecular tags providing
spatial information,160 imaging mass cytometry (IMC),161 or laser cata-
pulting.162,163 Co-detection by indexing (CODEX) allows for highly
multiplexed profiling of protein markers and has been used to deci-
pher differences in tissue composition in murine normal and diseased
spleen at single-cell resolution.159 Its applicability to clinical human
samples, however, still needs to be shown in large-scale studies.
Moreover, multiplex immunohistochemistry has enabled parallel visu-
alization of distinct immune checkpoint molecules at single-cell
resolution.164
The GeoMx/DSP platform has been applied to identify protein
markers associated with treatment outcome in melanoma,156 to evalu-
ate the PC micro-environment,158 and to assess B and T cell pheno-
types in melanoma tumors.165 Furthermore, this platform has been
used to study B cell localization in tertiary lymphoid structures using
TABLE 2 Overview of technical details of translational research articles highlighted in this review which describe transcriptome or immune profiling of single-cells
multiplex protein analysis,166 and to profile mRNA and protein simul-
taneously in colorectal tumor tissue.167 While this approach allows for
combined multiplex mRNA and protein analysis, the GeoMx/DSP plat-
form lacks single-cell resolution and requires a priori knowledge of
target protein markers or mRNAs together with reliable markers to
visualize tissue structure. Interestingly, GeoMx-based analysis of B
cells was complemented by technologies assessing mRNA and surface
proteins at single-cell resolution.166
In contrast, high-definition spatial transcriptomics (HDST) enables
unbiased mRNA profiling with 49% of the spatial barcodes being
assigned to a single-cell type and was successfully used to distinguish
cell types in breast cancer,160 while offering greater spatial resolution
compared to similar approaches.168,169 Combining spatial trans-
criptomics with conventional high throughput single-cell sequencing
enabled more refined spatial cell type annotations in pancreatic ductal
ADC.170 Despite these promises in spatial transcriptomics, future
developments need to improve the current sparsity of HDST and to
demonstrate compatibility of this method with Formalin-Fixed
Paraffin-Embedded (FFPE) sections, which represent the dominant
form in which solid tumor specimens are preserved to date. Interestingly,
the commercially available Visium chemistry (10X Genomics)
has recently been applied successfully to FFPE sections of the mouse
brain and ovarian carcinosarcoma as exemplified by a study currently
available as a preprint,171 opening up the possibility to perform
spatial transcriptome analysis on clinical FFPE samples.
Alternatively, highly specific in-situ hybridization (RNAscope)
enables detection of mRNA molecules in FFPE tissue157 and was used
successfully for automated, quantitative profiling of HER2 status in
breast carcinoma.172 While providing cellular and subcellular resolu-
tion, a priori knowledge of targets is necessary and highly multiplexed
tissue analysis is currently not possible. However, a higher degree of
multiplexed targeted gene mRNA detection in breast cancer has been
achieved using padlock sequencing.173 Expansion of RNAscope by the
usage of oligonucleotides conjugated to metal-chelated reporters to
bind RNA-probes during the final hybridization step facilitates simul-
taneous labeling of protein structures using metal-conjugated anti-
bodies. This in turn enables simultaneous profiling of mRNA and
proteins from the same section using IMC and was shown to success-
fully correlate with mRNA and protein expression levels in a large
cohort of samples providing architectural maps of breast cancer tissue
at spatial single-cell resolution.161
In addition to spatially resolved mRNA and protein expression,
recent technological advances linked the genomic profile of a single-
cell to its position in a given tissue.162,163 Common to both approaches
is that the tumor tissue is subjected to hematoxylin & eosin (H&E)
staining to visualize structural elements, followed by subsequent isola-
tion of single-cells using UV laser162 or isolation of groups of cells
down to single-cells using single infra-red (IR) pulses.163 This made it
possible to spatially resolve genomic aberrations occurring in
early-stage tumors such as ductal breast carcinoma,162 and holds great
potential to increase our understanding of how tumor infiltration and
invasion processes occur at the single-cell level in the context of the
tumor microenvironment.
Taken together, a broad variety of technologies allowing single-
cell readouts in a spatial context are available, which have offered
highly relevant insights by integrating different modalities, such as
mRNA and protein expression or genomic aberrations within the
tissue context. These approaches differ in their capacity of cellular
throughput, compatibility with clinical samples, single-cell resolu-
tion, preservation of tissue integrity for downstream analyses, the
degree of multiplexed detection, and the necessity of a priori
knowledge on tissue-specific targets. Therefore, selection of a
methodology for spatial tissue analysis often necessitates a
compromise with regard to several aspects. For example, FFPE-
compatible mRNA analysis at single-cell resolution requires a trade-off
in the number of transcripts that can be processed at the
same time.
Overall, spatial tissue analysis at single-cell resolution still needs
to overcome several limitations in order to become a widely used clin-
ical diagnostic technology but given the rapid development and dem-
onstrated high promise of this technology it is likely that we will
witness significant advancement toward clinical applicability in the
near future.
7 | CHALLENGES FOR CLINICAL TRANSLATION OF SINGLE-CELL SEQUENCING
As described in previous sections, rapid technological progress
has made it feasible to analyze tumor tissue routinely at
single-cell resolution in a research setting.
Single-cell analysis in cancer has opened up the possibility to
potentially aid diagnostics,148,172 to monitor treatment
response72,76,85,87,149 or to refine treatment processes59-61,72 toward
personalized therapies and has thus spurred great interest to translate
such technologies into routine clinical applications.
Single-cell analysis, regardless of modality or spatial resolution,
currently requires cost-intensive, large-scale sequencing reactions in
order to process a clinically informative cohort of patient samples and
Abbreviations: ADC, lung adenocarcinoma; AML, acute myeloid leukemia; ATRT, atypical teratoid/rhabdoid tumors; BCC, basal cell carcinoma; BCR, B cell
receptor; ccRCC, clear cell renal carcinoma; CML, chronic myeloid leukemia; CNV, copy number variation; CRC, colorectal cancer; CTC, circulating tumor
cells; CyTOF, cytometry by time of flight; ETMR, embryonal tumors with multilayered rosettes; FACS, fluorescence-activated cell sorting; GBM,
glioblastoma multiforme; HCC, hepatocellular carcinoma; HNSCC, head and neck squamous carcinoma; MACS, magnetic-activated cell sorting; MARS-seq,
massively parallel RNA single-cell sequencing; MM, metastatic melanoma; NSCLC, non-small cell lung cancer; NSCL ADC, non-small-cell lung
adenocarcinoma; PC, prostate cancer; PDX, patient-derived xenograft; RCC, renal cell carcinoma; SCC, squamous cell carcinoma; SNV, single-nucleotide
variant; TCR, T cell receptor; TCR-seq, T cell receptor sequencing; UM, uveal melanoma; WGA, Whole-genome amplification; WNT MB, WNT-subtype
medulloblastoma.
to extract sufficient numbers of single-cells to guarantee statistically
valid analyses to infer reliable diagnostic information. Novel strategies
for multiplexing single-cell transcriptomes50,51 or single-cell
genomes133 have greatly increased the possible cellular throughput
per sample. However, in particular WES and WGS of single-cells for
de novo SNV calling necessitate high sequencing depths, rendering
these approaches currently economically challenging for large-scale
studies. Additionally, increasing single-cell throughput together with
multiomic readouts and higher-dimensional data require sophisticated
expertise for data analysis in addition to large computational infra-
structure. Approaches of combined low and high coverage
analyses,130,141,147 targeted qPCR-based analyses153,174 and more
recently microfluidic droplet-based analysis149,175 present more cost
effective alternatives to assess genomic aberration in cancer.
Alternatively, inference of genomic alterations from less cost
intensive mRNA-sequencing strategies may provide an attractive
strategy to facilitate introduction of single-cell sequencing in a clinical
diagnostic setting.35,37,38,71,73,74,76 However, such approaches provide
only indirect genomic information, which is further limited because
a genomic lesion must be manifested at the
mRNA level and, in addition, be captured by the chosen
single-cell chemistry. Also, most solid tumor samples are stored as
FFPE tissue blocks which often yield low quality mRNA thus render-
ing single-cell transcriptomics challenging.
While Smart-seq2-based full-length mRNA-sequencing at single-cell
resolution generates more transcriptional information that can be used
to infer genomic aberrations, its analysis cost per cell on sorted
plates amounts to approximately 30 USD, as opposed to 4 USD on
commercial platforms such as the iCell8cx. In contrast, droplet-based
mRNA-sequencing on the 10X Chromium provides a more cost-effective
alternative, with a processing cost of 0.5 USD per cell. For further
comparison of platforms, methodologies and cohort sizes in selected
reference literature, see Tables 2 and 3.
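To make the scale of these differences concrete, the per-cell figures quoted above can be multiplied out for a whole sample. The following is a minimal sketch only: the function name `library_cost` and the sample size of 5,000 cells are illustrative assumptions, and the text does not state whether the quoted figures include sequencing, so the results should be read as rough orders of magnitude.

```python
# Per-cell costs in USD as quoted in the paragraph above.
# What exactly each figure covers (e.g., sequencing) is not specified there.
COST_PER_CELL_USD = {
    "Smart-seq2 (sorted plates)": 30.0,
    "iCell8cx": 4.0,
    "10X Chromium (droplets)": 0.5,
}


def library_cost(platform: str, n_cells: int) -> float:
    """Return the quoted per-cell cost multiplied by the number of cells."""
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    return COST_PER_CELL_USD[platform] * n_cells


if __name__ == "__main__":
    n_cells = 5_000  # hypothetical number of cells profiled per sample
    for platform in COST_PER_CELL_USD:
        cost = library_cost(platform, n_cells)
        print(f"{platform}: {cost:,.0f} USD for {n_cells:,} cells")
```

Under these assumptions, the 60-fold difference in per-cell cost between plate-based Smart-seq2 and droplet-based processing carries over directly to the per-sample cost, which is why throughput requirements weigh so heavily on platform choice.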
Initially, single-cell studies focused on manual isolation of
individual cells,128,129 an approach that does not meet the cellular
throughput required for clinical applications. Technological
advancements such as the use of microwells43,49 and nanodroplets44-46
have greatly increased the cellular throughput; however, such methods
require rather large numbers of starting cells and provide comparably
low capture rates,45,46 rendering these technologies less favorable in
a clinical setting, where sample size may be small and limiting. The
need to enrich for distinct cell populations via antibody staining and
flow cytometry is another source of sample loss, which becomes
particularly unfavorable when analyzing low-input samples and rare cell
types such as CTCs.60,61,146-148 To generate clinically valuable
results, capture of extremely rare clones in a tumor sample must be
guaranteed, which in turn requires processing patient-derived samples
in their entirety, irrespective of sample size. Platforms to isolate
cells from scarce clinical samples for high-throughput analysis, for
example the sciFLEXARRAYER S3 or cellenONE systems, are becoming
available and have recently been used successfully for low-coverage CNV
profiling of human breast cancer samples.137
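The interplay between clone frequency, capture rate and the number of cells processed can be illustrated with a simple sampling model. This sketch is not taken from the cited studies: it assumes that every loaded cell independently belongs to the rare clone with probability `clone_freq` and is successfully captured with probability `capture_rate`, and all function names and numbers are hypothetical.

```python
import math


def p_capture(clone_freq: float, cells_loaded: int, capture_rate: float = 1.0) -> float:
    """Probability that at least one cell of a rare clone is captured.

    Assumes independent cells: each loaded cell is from the clone with
    probability clone_freq and is captured with probability capture_rate.
    """
    per_cell = clone_freq * capture_rate
    return 1.0 - (1.0 - per_cell) ** cells_loaded


def cells_needed(clone_freq: float, confidence: float = 0.95,
                 capture_rate: float = 1.0) -> int:
    """Smallest number of loaded cells giving p_capture >= confidence."""
    per_cell = clone_freq * capture_rate
    if not 0.0 < per_cell < 1.0:
        raise ValueError("clone_freq * capture_rate must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - per_cell))


if __name__ == "__main__":
    freq = 0.001  # hypothetical clone present in 1 of 1,000 cells
    for rate in (1.0, 0.5, 0.1):  # hypothetical capture rates
        n = cells_needed(freq, confidence=0.95, capture_rate=rate)
        print(f"capture rate {rate:.0%}: load at least {n:,} cells")
```

In this model, halving the capture rate roughly doubles the number of cells that must be loaded, which illustrates why low capture rates are particularly costly when the available sample is small.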
Another important obstacle to clinical translation of single-cell
technologies is sample comparability with regard to sample isolation,
molecular characterization and downstream computational analyses. It is
known that cell dissociation strategies can alter transcriptional
signatures155 and that cell isolation needs to be evaluated carefully
prior to starting an experiment.176 Furthermore, it has been shown that
distinct gene expression patterns are induced in biopsy- and
autopsy-derived brain samples,177 which biases downstream analyses.
Thus, the effect of sample isolation, storage and sample type on the
modality analyzed (mRNA, DNA, protein) needs to be systematically
assessed in order to define common classifiers that facilitate
comparability of results between different research centers and across
a large space of clinical samples. Integration of heterogeneous data
sets obtained at different research centers using different
methodologies will require advanced batch correction24 and
computational tools that combine these data sets in a meaningful way
while minimizing overcorrection and maintaining relevant biological
differences.178
Extensive benchmarking of single-cell technologies179 together with
interrogation of sampling artefacts180 is pivotal to achieving overall
comparability of test results. A recent study carried out a systematic
comparison of different cell and nuclei isolation strategies on a
diverse range of clinical cancer samples. Testing several isolation
protocols per sample type, the authors based their evaluation, among
other metrics, on the cellular diversity that a given protocol would
reproduce. Taken together, this study provides an extensive resource
suggesting highly specific isolation protocols for various cancer
tissues, with the overarching goal of providing standardizable, robust
and comparable single-cell workflows for use in a clinical setting.181
The results of an interrogation of clinical samples by single-cell
analysis may vary depending on the platform used. It is therefore
crucial to systematically assess existing platforms for single-cell
genome and spatial tissue analysis, as has been done for single-cell
transcriptome chemistries,54,55 to determine the most suitable
applications and experimental conditions for defined clinical questions.
With regard to mRNA analysis, defining a statistically critical number
of cells for computational analyses, the optimal sequencing depth and
the minimum number of cells necessary to define a cell type or state,
among others, will be crucial for improved comparability. To provide
sequencing data that allow for comparable mutation analysis and variant
calling, a unified way to define clonal structures will need to be
established.134 Novel computational tools are required to manage data
analysis in increasingly large data sets, as has been demonstrated
previously.52,178 Computational challenges associated with the analysis
of cancer samples by single-cell transcriptomics are reviewed
elsewhere.26 In addition, analysis of longitudinal clinical samples is
likely to provide insight into disease progression and treatment
response.175 To relate longitudinal samples, existing trajectory
inference methods182,183 need to be developed further to integrate
different modalities of the same sample along a pseudo-time axis, to
resolve multiple clonal subtypes and to perform high-dimensional
comparative analyses against large patient cohorts while maintaining
the longitudinal order of cell states and clones.
Moreover, to reach clinical validity, existing single-cell methodolo-
gies need to possess optimal detection sensitivity and specificity. The
TABLE 3 Overview of technical details of translational research articles highlighted in this review, which describe genomic, epigenetic, or spatial analyses
Abbreviations: ADC, adenocarcinoma; AML, acute myeloid leukemia; ALL, acute lymphoblastic leukaemia; ATAC, assay for transposase-accessible
chromatin; BCC, basal cell carcinoma; CITEseq, cellular indexing of transcriptomes and epitopes by sequencing; CLL, chronic lymphocytic leukemia; CNA,
copy number alteration; CNV, copy number variation; DSP, digital spatial profiler; FACS, fluorescence-activated cell sorting; FL, follicular lymphoma; HDST,
high density spatial transcriptomics; HCC, hepatocellular carcinoma; MPAL, mixed-phenotype acute leukemia; MRD AML, minimal residual disease acute
myeloid leukemia; PC, prostate cancer; PDAC, pancreatic ductal adenocarcinoma; PHLI-seq, phenotype-based high-throughput laser-aided isolation and
sequencing; QRP DOP-PCR, quasi-random priming degenerate oligonucleotide primed polymerase chain reaction; RCC, renal cell carcinoma; SNV,
single-nucleotide variant; SCLC, small cell lung cancer; SS, synovial sarcoma; ST, spatial transcriptomics; TNBC, triple-negative breast cancer; WES, whole-exome
sequencing; WGS, whole-genome sequencing.
FIGURE 3 Timeline illustrating key references utilizing single-cell technology in translational cancer research for transcriptome analysis and immune profiling. Chronological appearance of selected references highlighted in this review in which RNA analysis and immune profiling in the context of various different cancer types were performed. At the bottom, schematic illustration of key platforms used over time to isolate and process single cells such as FACS isolation into PCR plates, microfluidic devices such as the Fluidigm C1, droplet-based technologies such as 10X Genomics, laser-catapulting, imaging mass cytometry and spatially resolved analysis of the transcriptome. AML = acute myeloid leukemia; ATRT = atypical teratoid/rhabdoid tumors; CML = chronic myeloid leukemia; CTCs = circulating tumor cells; CyTOF = cytometry by time of flight; ETMR = embryonal tumors with multilayered rosettes; FACS = fluorescence-activated cell sorting; GBM = glioblastoma multiforme; HNSCC = head and neck squamous carcinoma; MARS-seq = massively parallel RNA single-cell sequencing; PDX = patient-derived xenograft; TCR-seq = T cell receptor sequencing; WGA = whole-genome amplification; WNT MB = WNT-subtype medulloblastoma (n.a.: not available/disclosed in article)
minute starting amounts of mRNA or DNA in single cells often require
extensive amplification prior to sequencing, which introduces technical
noise such as allelic dropouts or polymerase-induced errors,134 as well
as dropout events resulting from transcripts that were not captured
during reverse transcription.184 Such noise impairs overall detection
confidence. Both continuous improvement of sample preparation and
FIGURE 4 Timeline illustrating key references utilizing single-cell technology to study epigenetic alterations and genomic aberrations in single cancer cells as well as selected references studying cancer tissues with spatial resolution. Chronological appearance of selected research articles highlighted in this review assessing epigenetic and DNA changes in cancer samples and the progressive emergence of studies spatially resolving different modalities such as mRNA and protein expression as well as genomic alterations in cancer tissue. Schematic representations of key platforms frequently used in various different single-cell studies such as FACS isolation into PCR plates, microfluidic devices such as the Fluidigm C1, droplet-based technologies such as 10X Genomics, laser-catapulting, imaging mass cytometry and spatially resolved analysis of the transcriptome. ALL = acute lymphoblastic leukaemia; AML = acute myeloid leukemia; ATAC-seq = assay for transposase-accessible chromatin; CITE-seq = cellular indexing of transcriptomes and epitopes by sequencing; CLL = chronic lymphocytic leukemia; CML = chronic myeloid leukemia; DOP PCR = degenerate oligonucleotide primed polymerase chain reaction; FACS = fluorescence-activated cell sorting; GeoMx DSP = GeoMx Digital Spatial Profiler; GS = genome sequencing; HDST = high density spatial transcriptomics; IP = immunoprecipitation; MALBAC = multiple annealing and looping based amplification cycles; MDA = multiple displacement amplification; MRD AML = minimal residual disease AML; PDAC = pancreatic ductal adenocarcinoma; QRP DOP-PCR = quasi-random priming DOP-PCR; PHLI-seq = phenotype-based high-throughput laser-aided isolation and sequencing; RCA = rolling circle amplification; SCLC = small cell lung cancer; ST = spatial transcriptomics; TNBC = triple-negative breast cancer; Trio-seq = triple omics sequencing; WES = whole-exome sequencing; WGA = whole-genome amplification (n.a.: not available/disclosed in article)
computational models are required to correct for such errors for any
platform and chemistry that is intended for use in a clinical setting,
as exemplified here for single-cell mRNA-sequencing.184,185 Advances in
this field will facilitate not only the cellular deconvolution of
cancer tissues but also the construction of cancer-specific classifiers
based on several modalities, thereby refining current cancer
classification and treatment. Ultimately, single-cell data of distinct
modalities will need to be put into the context of the tumor tissue,
where transcriptomic and genomic signatures are translated into altered
functionality of cells in the diseased state.
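The effect of amplification noise on detection sensitivity, discussed above, can be illustrated with a simple binomial model. This is a sketch under stated assumptions rather than a method from the cited literature: it assumes that each cell carrying a heterozygous variant independently loses the variant allele with probability `ado_rate` (allelic dropout), that a variant is called only when it is observed in at least `min_cells` cells, and that false positives from polymerase errors are ignored. The function name and all numbers are hypothetical.

```python
from math import comb


def p_detect(n_mutant_cells: int, min_cells: int, ado_rate: float) -> float:
    """Probability that a variant is observed in at least min_cells cells.

    Each of the n_mutant_cells cells is assumed to show the variant
    independently with probability (1 - ado_rate); the result is the
    upper tail of the corresponding binomial distribution.
    """
    if not 0.0 <= ado_rate <= 1.0:
        raise ValueError("ado_rate must lie between 0 and 1")
    seen = 1.0 - ado_rate
    return float(sum(
        comb(n_mutant_cells, i) * seen**i * (1.0 - seen) ** (n_mutant_cells - i)
        for i in range(min_cells, n_mutant_cells + 1)
    ))


if __name__ == "__main__":
    n, k = 5, 3  # hypothetical: 5 mutant cells sequenced, call requires 3
    for ado in (0.1, 0.2, 0.4):  # hypothetical allelic dropout rates
        print(f"ADO {ado:.0%}: detection probability {p_detect(n, k, ado):.3f}")
```

In this model, requiring support from several cells guards against isolated amplification errors but makes detection of small clones increasingly sensitive to the dropout rate, which is one way to see why both sample preparation and error-aware computational models matter for clinical validity.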
8 | CONCLUSIONS
The tremendous technological development in single-cell sequencing over
the past decade has yielded a broad toolbox to study many modalities in
cancer, such as mRNA,37,38,57,59 DNA alterations,141,144,146-149,154,175
immune cell composition of tumors,85-87,90,92,95 chromatin
changes116-118,122 and metabolic effectors97 in dissociated single cells or
nuclei as well as within the context of diseased tissue156-164,167,172
(Figures 3 and 4). Mono-, bi-, or even multimodal approaches have
within a short time facilitated cancer research at unprecedented depth and
yielded invaluable information on tumor composition and classification,
clonal evolution in cancer, disease progression and treatment response.
Nevertheless, many methods fall short in providing information on
tissue context, which can be of additional prognostic value.80,186
Clinical translation of spatially resolved methodologies depends, among
other things, on the ability to comprehensively analyse
high-dimensional data comprising information on cell type and state,
cell boundaries to adjacent cells and the cell's location within the
tissue. This in turn requires the development of computational tools
that allow for robust identification of given patterns of cells in a
tissue, or tissue motifs,187 which may then enable in silico
construction of tissue network structures and, ultimately, inference of
pathological processes, as is necessary to classify patients and to aid
diagnosis.
While current technical limitations prevent broad clinical application
of the aforementioned methodologies, it seems clear that single-cell
analyses will become an integral part of clinical diagnostics,
prognostication, disease follow-up and treatment selection in the
coming years. This is strongly emphasized by the large number and
diverse scope of studies employing transcriptome-sequencing of single
cells (Figure 3). In addition, existing single-cell DNA applications
often present sufficient analytical validity, and additional refinement
of their detection sensitivity and specificity may ultimately render
obsolete bulk WGS/WES, which is currently often used either to
substantiate findings obtained via single-cell sequencing144 or to
nominate genomic lesions for targeted single-cell analysis.149,154,175
Further, each existing technology possesses specific opportunities but
also technical shortcomings, which will affect its analytical validity
and will therefore lead to varying time frames for clinical
translation. Nevertheless, the literature highlighted in this review
clearly demonstrates the applicability and usefulness of single-cell
analysis in cancer research and diagnostics.
CONFLICT OF INTEREST
The authors declare no potential conflict of interest.
DATA AVAILABILITY STATEMENT
Data sharing not applicable to this article as no datasets were gener-
ated or analysed during the current study.
ORCID
Ulrich Pfisterer https://orcid.org/0000-0002-4613-6427
REFERENCES
1. Burrell RA, Mcgranahan N, Bartek J, Swanton C. The causes and
consequences of genetic heterogeneity in cancer evolution. Nature.
2013;501:338-345. https://doi.org/10.1038/nature12625.
2. Jones PA, Issa JPJ, Baylin S. Targeting the cancer epigenome for
therapy. Nat Rev Genet. 2016;17:630-641. https://doi.org/10.1038/
nrg.2016.93.
3. Wouters BJ, Delwel R. Epigenetics and approaches to targeted epi-
genetic therapy in acute myeloid leukemia. Blood. 2016;127:42-52.
https://doi.org/10.1182/blood-2015-07-604512.
4. Landau DA, Carter SL, Stojanov P, et al. Evolution and impact of sub-
clonal mutations in chronic lymphocytic leukemia. Cell. 2013;152(4):
714-726. https://doi.org/10.1016/j.cell.2013.01.019.
5. Anderson K, Lutz C, Van Delft FW, et al. Genetic variegation of
clonal architecture and propagating cells in leukaemia. Nature. 2011;
469(7330):356-361. https://doi.org/10.1038/nature09650.
6. Metzker ML. Sequencing technologies - the next generation. Nat Rev
Genet. 2010;11(1):31-46. https://doi.org/10.1038/nrg2626.
7. Shendure J, Ji H. Next-generation DNA sequencing. Nat Biotechnol.
2008;26(10):1135-1145. https://doi.org/10.1038/nbt1486.
8. Rozenblatt-Rosen O, Regev A, Oberdoerffer P, et al. The human
tumor atlas network: charting tumor transitions across space and
time at single-cell resolution. Cell. 2020;181(2):236-249. https://doi.
org/10.1016/j.cell.2020.03.053.
9. Ley TJ, Miller C, Ding L, et al. Genomic and epigenomic landscapes
of adult de novo acute myeloid leukemia. N Engl J Med. 2013;368
(22):2059-2074. https://doi.org/10.1056/NEJMoa1301689.
10. Abeshouse A, Adebamowo C, Adebamowo SN, et al. Comprehensive
and integrated genomic characterization of adult soft tissue sarco-
mas. Cell. 2017;171(4):950-965.e28. https://doi.org/10.1016/j.cell.
2017.10.014.
11. Koboldt DC, Fulton RS, McLellan MD, et al. Comprehensive molecu-
lar portraits of human breast tumours. Nature. 2012;490(7418):61-
70. https://doi.org/10.1038/nature11412.
12. Ciriello G, Gatza ML, Beck AH, et al. Comprehensive molecular por-
traits of invasive lobular breast Cancer. Cell. 2015;163(2):506-519.
https://doi.org/10.1016/j.cell.2015.09.033.
13. Tirosh I, Suvà ML. Deciphering human tumor biology by single-cell
expression profiling. Annu Rev Cancer Biol. 2019;3(1):151-166.
https://doi.org/10.1146/annurev-cancerbio-030518-055609.
14. Lawson DA, Kessenbrock K, Davis RT, Pervolarakis N, Werb Z.
Tumour heterogeneity and metastasis at single-cell resolution. Nat
Cell Biol. 2018;20(12):1349-1360. https://doi.org/10.1038/s41556-
018-0236-7.
15. Stuart T, Satija R. Integrative single-cell analysis. Nat Rev Genet.
2019;20:257-272. https://doi.org/10.1038/s41576-019-0093-7.
16. Fuzik J, Zeisel A, Mate Z, et al. Integration of electrophysiological
recordings with single-cell RNA-seq data identifies neuronal subtypes.
Nat Biotechnol. 2016;34:175-183. https://doi.org/10.1038/nbt.3443.
17. Cadwell CR, Palasantza A, Jiang X, et al. Electrophysiological, trans-
criptomic and morphologic profiling of single neurons using patch-seq.
Nat Biotechnol. 2015;34:199-203. https://doi.org/10.1038/nbt.3445.
18. Chen S, Lake BB, Zhang K. High-throughput sequencing of the
transcriptome and chromatin accessibility in the same cell. Nat Bio-
technol. 2019;37:1452-1457. https://doi.org/10.1038/s41587-019-
0290-0.
19. Stoeckius M, Hafemeister C, Stephenson W, et al. Simultaneous epi-
tope and transcriptome measurement in single cells. Nat Methods.
2017;14:865-868. https://doi.org/10.1038/nmeth.4380.
20. Peterson VM, Zhang KX, Kumar N, et al. Multiplexed quantification
of proteins and transcripts in single cells. Nat Biotechnol. 2017;35
(10):936-939. https://doi.org/10.1038/nbt.3973.
21. Macaulay IC, Ponting CP, Voet T. Single-cell Multiomics: multiple
measurements from single cells. Trends Genet. 2017;33:155-168.
https://doi.org/10.1016/j.tig.2016.12.003.
22. Macaulay IC, Haerty W, Kumar P, et al. G&T-seq: parallel sequencing
of single-cell genomes and transcriptomes. Nat Methods. 2015;12:
519-522. https://doi.org/10.1038/nmeth.3370.
23. Jaitin DA, Kenigsberg E, Keren-Shaul H, et al. Massively parallel
single-cell RNA-seq for marker-free decomposition of tissues into
cell types. Science. 2014;343:776-779. https://doi.org/10.1126/
science.1247651.
24. Luecken MD, Theis FJ. Current best practices in single-cell RNA-seq
analysis: a tutorial. Mol Syst Biol. 2019;15(6):e8746. https://doi.org/
10.15252/msb.20188746.
25. La Manno G, Soldatov R, Zeisel A, et al. RNA velocity of single cells.
Nature. 2018;560(7719):494-498. https://doi.org/10.1038/s41586-
018-0414-6.
26. Fan J, Slowikowski K, Zhang F. Single-cell transcriptomics in
cancer: computational challenges and opportunities. Exp Mol
Med. 2020;52(9):1452-1465. https://doi.org/10.1038/s12276-
020-0422-0.
27. Burke W. Clinical validity and clinical utility of genetic tests. Curr
Protoc Hum Genet. 2004;42:15.1-15.6. https://doi.org/10.1002/
0471142905.hg0915s42 Chap. 9.
28. Burke W. Genetic tests: clinical validity and clinical utility. Curr Pro-
toc Hum Genet. 2014;81:1-14. https://doi.org/10.1002/
0471142905.hg0915s81.
29. Katsanis SH, Katsanis N. Molecular genetic testing and the future of
clinical genomics. Nat Rev Genet. 2013;14(6):415-426. https://doi.
org/10.1038/nrg3493.
30. Han X, Wang R, Zhou Y, et al. Mapping the mouse cell atlas by
microwell-Seq. Cell. 2018;172(5):1091-1097.e17. https://doi.org/
10.1016/j.cell.2018.02.001.
31. Regev A, Teichmann SA, Lander ES, et al. The human cell atlas. Elife.
2017;6:e27041. https://doi.org/10.7554/eLife.27041.
32. Schaum N, Karkanias J, Neff NF, et al. Single-cell transcriptomics of
20 mouse organs creates a tabula Muris. Nature. 2018;562:367-372.
https://doi.org/10.1038/s41586-018-0590-4.
33. Rozenblatt-Rosen O, Stubbington MJT, Regev A, Teichmann SA.
The human cell atlas: from vision to reality. Nature. 2017;550:451-
453. https://doi.org/10.1038/550451a.
34. Velmeshev D, Schirmer L, Jung D, et al. Single-cell genomics iden-
tifies cell type–specific molecular changes in autism. Science. 2019;
364:685-689. https://doi.org/10.1126/science.aav8130.
35. Giustacchini A, Thongjuea S, Barkas N, et al. Single-cell trans-
criptomics uncovers distinct molecular signatures of stem cells in
chronic myeloid leukemia. Nat Med. 2017;23(6):692-702. https://
doi.org/10.1038/nm.4336.
36. Zhang F, Wei K, Slowikowski K, et al. Defining inflammatory cell
states in rheumatoid arthritis joint synovial tissues by integrating
single-cell transcriptomics and mass cytometry. Nat Immunol. 2019;
20:928-942. https://doi.org/10.1038/s41590-019-0378-1.
37. Venteicher AS, Tirosh I, Hebert C, et al. Decoupling genetics, line-
ages, and microenvironment in IDH-mutant gliomas by single-cell
RNA-seq. Science. 2017;355:eaai8478. https://doi.org/10.1126/
science.aai8478.
38. Patel AP, Tirosh I, Trombetta JJ, et al. Single-cell RNA-seq highlights
intratumoral heterogeneity in primary glioblastoma. Science. 2014;
344:1396-1401. https://doi.org/10.1126/science.1254257.
39. Skene NG, Bryois J, Bakken TE, et al. Genetic identification of brain
cell types underlying schizophrenia. Nat Genet. 2018;50:825-833.
https://doi.org/10.1038/s41588-018-0129-5.
40. Picelli S, Faridani OR, Bjorklund AK, Winberg G, Sagasser S,
Sandberg R. Full-length RNA-seq from single cells using Smart-seq2.
Nat Protoc. 2014;9(1):171-181. https://doi.org/10.1038/nprot.
2014.006.
41. Hagemann-Jensen M, Ziegenhain C, Chen P, et al. Single-cell RNA
counting at allele- and isoform-resolution using Smart-seq3. Nat
Biotechnol. 2020;38:708-714. https://doi.org/10.1038/s41587-020-
0497-0.
42. Islam S, Kjällquist U, Moliner A, et al. Highly multiplexed and
strand-specific single-cell RNA 5′ end sequencing. Nat Protoc. 2012;7:813-
828. https://doi.org/10.1038/nprot.2012.022.
43. Fan HC, Fu GK, Fodor SP. Expression profiling. Combinatorial labeling
of single cells for gene expression cytometry. Science. 2015;347
(6222):1258367. https://doi.org/10.1126/science.1258367.
44. Klein AM, Mazutis L, Akartuna I, et al. Droplet barcoding for single-
cell transcriptomics applied to embryonic stem cells. Cell. 2015;161
(5):1187-1201. https://doi.org/10.1016/j.cell.2015.04.044.
45. Macosko EZ, Basu A, Satija R, et al. Highly parallel genome-wide
expression profiling of individual cells using nanoliter droplets. Cell.
2015;161:1202-1214. https://doi.org/10.1016/j.cell.2015.05.002.
46. Zheng GXY, Terry JM, Belgrader P, et al. Massively parallel digital
transcriptional profiling of single cells. Nat Commun. 2017;8:14049.
https://doi.org/10.1038/ncomms14049.
47. Hashimshony T, Senderovich N, Avital G, et al. CEL-Seq2: sensitive
highly-multiplexed single-cell RNA-Seq. Genome Biol. 2016;17(1):1-
7. https://doi.org/10.1186/s13059-016-0938-8.
48. Sasagawa Y, Danno H, Takada H, et al. Quartz-Seq2: a high-
throughput single-cell RNA-sequencing method that effectively uses
limited sequence reads. Genome Biol. 2018;19(1):29. https://doi.org/
10.1186/s13059-018-1407-3.
49. Gierahn TM, Wadsworth MH, Hughes TK, et al. Seq-Well: portable,
low-cost RNA sequencing of single cells at high throughput. Nat
Methods. 2017;14(4):395-398. https://doi.org/10.1038/nmeth.4179.
50. Rosenberg AB, Roco CM, Muscat RA, et al. Single-cell profiling of
the developing mouse brain and spinal cord with split-pool
barcoding. Science. 2018;360:176-182. https://doi.org/10.1126/
science.aam8999.
51. Datlinger P, Rendeiro AF, Boenke T, Krausgruber T, Barreca D,
Bock C. Ultra-high throughput single-cell RNA sequencing by combi-
natorial fluidic indexing. bioRxiv. 2019;1-27. https://doi.org/10.
1101/2019.12.17.879304.
52. Saunders A, Macosko EZ, Wysoker A, et al. Molecular diversity and
specializations among the cells of the adult mouse brain. Cell. 2018;
174:1015-1030.e16. https://doi.org/10.1016/j.cell.2018.07.028.
53. Svensson V, Vento-Tormo R, Teichmann SA. Exponential scaling of
single-cell RNA-seq in the past decade. Nat Protoc. 2018;13(4):599-
604. https://doi.org/10.1038/nprot.2017.149.
54. Ziegenhain C, Vieth B, Parekh S, et al. Comparative analysis of
single-cell RNA sequencing methods. Mol Cell. 2017;65(4):631-643.
e4. https://doi.org/10.1016/j.molcel.2017.01.023.
55. Zhang X, Li T, Liu F, et al. Comparative analysis of droplet-based
ultra-high-throughput single-cell RNA-Seq systems. Mol Cell.
2019;73(1):130-142.e5. https://doi.org/10.1016/j.molcel.2018.
10.020.
56. Ramskold D, Luo S, Wang YC, et al. Full-length mRNA-Seq from
single-cell levels of RNA and individual circulating tumor cells. Nat
Biotechnol. 2012;30(8):777-782. https://doi.org/10.1038/nbt.2282.
57. Lee MCW, Lopez-Diaz FJ, Khan SY, et al. Single-cell analyses of
transcriptional heterogeneity during drug tolerance transition in
cancer cells by RNA sequencing. Proc Natl Acad Sci U S A. 2014;111
(44):E4726-E4735. https://doi.org/10.1073/pnas.1404656111.
58. Kim KT, Lee HW, Lee HO, et al. Single-cell mRNA sequencing iden-
tifies subclonal heterogeneity in anti-cancer drug responses of lung
adenocarcinoma cells. Genome Biol. 2015;16(1):1-15. https://doi.
org/10.1186/s13059-015-0692-3.
59. Kim KT, Lee HW, Lee HO, et al. Application of single-cell RNA
sequencing in optimizing a combinatorial therapeutic strategy in
metastatic renal cell carcinoma. Genome Biol. 2016;17:80. https://
doi.org/10.1186/s13059-016-0945-9.
60. Miyamoto DT, Zheng Y, Wittner BS, et al. RNA-Seq of single pros-
tate CTCs implicates noncanonical Wnt signaling in antiandrogen
resistance. Science. 2015;349(6254):1351-1356. https://doi.org/10.
1126/science.aab0917.
61. Jordan NV, Bardia A, Wittner BS, et al. HER2 expression identifies
dynamic functional states within circulating breast cancer cells. Nature.
2016;537(7618):102-106. https://doi.org/10.1038/nature19328.
62. Gao R, Kim C, Sei E, et al. Nanogrid single-nucleus RNA sequencing
reveals phenotypic diversity in breast cancer. Nat Commun. 2017;8
(1):228. https://doi.org/10.1038/s41467-017-00244-w.
63. Lambrechts D, Wauters E, Boeckx B, et al. Phenotype molding of
stromal cells in the lung tumor microenvironment. Nat Med. 2018;24
(8):1277-1289. https://doi.org/10.1038/s41591-018-0096-5.
64. Goveia J, Rohlenova K, Taverna F, et al. An integrated gene expres-
sion landscape profiling approach to identify lung tumor endothelial
cell heterogeneity and Angiogenic candidates. Cancer Cell. 2020;37
(1):21-36.e13. https://doi.org/10.1016/j.ccell.2019.12.001.
65. Young MD, Mitchell TJ, Vieira Braga FA, et al. Single-cell trans-
criptomes from human kidneys reveal the cellular identity of renal
tumors. Science. 2018;361(6402):594-599. https://doi.org/10.1126/
science.aat1699.
66. Jessa S, Blanchet-Cohen A, Krug B, et al. Stalled developmental programs
at the root of pediatric brain tumors. Nat Genet. 2019;51:1702-1713.
https://doi.org/10.1038/s41588-019-0531-7.
67. Müller S, Kohanbash G, Liu SJ, et al. Single-cell profiling of human gli-
omas reveals macrophage ontogeny as a basis for regional differences
in macrophage activation in the tumor microenvironment. Genome
Biol. 2017;18(1):1-14. https://doi.org/10.1186/s13059-017-1362-4.
68. Puram SV, Tirosh I, Parikh AS, et al. Single-cell Transcriptomic analy-
sis of primary and metastatic tumor ecosystems in head and neck
Cancer. Cell. 2017;171(7):1611-1624.e24. https://doi.org/10.1016/
j.cell.2017.10.044.
69. Li H, Courtois ET, Sengupta D, et al. Reference component analysis
of single-cell transcriptomes elucidates cellular heterogeneity in
human colorectal tumors. Nat Genet. 2017;49(5):708-718. https://
doi.org/10.1038/ng.3818.
70. Savage P, Blanchet-Cohen A, Revil T, et al. A targetable EGFR-
dependent tumor-initiating program in breast Cancer. Cell Rep. 2017;
21(5):1140-1149. https://doi.org/10.1016/j.celrep.2017.10.015.
71. Chung W, Eum HH, Lee HO, et al. Single-cell RNA-seq enables com-
prehensive tumour and immune cell profiling in primary breast can-
cer. Nat Commun. 2017;8(May):1-12. https://doi.org/10.1038/
ncomms15081.
72. Brady SW, McQuerry JA, Qiao Y, et al. Combating subclonal evolu-
tion of resistant cancer phenotypes. Nat Commun. 2017;8(1):1231.
https://doi.org/10.1038/s41467-017-01174-3.
73. Tirosh I, Izar B, Prakadan SM, et al. Dissecting the multicellular eco-
system of metastatic melanoma by single-cell RNA-seq. Science.
2016;352(6282):189-196. https://doi.org/10.1126/science.aad0501.
74. Filbin MG, Tirosh I, Hovestadt V, et al. Developmental and oncogenic
programs in H3K27M gliomas dissected by single-cell RNA-seq. Science.
2018;360(6386):331-335. https://doi.org/10.1126/science.aao4750.
75. Tirosh I, Venteicher AS, Hebert C, et al. Single-cell RNA-seq supports
a developmental hierarchy in human oligodendroglioma. Nature.
2016;539(7628):309-313. https://doi.org/10.1038/nature20123.
76. Van Galen P, Hovestadt V, Wadsworth MH II, et al. Single-cell RNA-Seq
reveals AML hierarchies relevant to disease progression and immunity.
Cell. 2019;176:1-17. https://doi.org/10.1016/j.cell.2019.01.031.
77. Petti AA, Williams SR, Miller CA, et al. A general approach for
detecting expressed mutations in AML cells using single cell RNA-
sequencing. Nat Commun. 2019;10(1):3660. https://doi.org/10.
1038/s41467-019-11591-1.
78. Durante MA, Rodriguez DA, Kurtenbach S, et al. Single-cell analysis
reveals new evolutionary complexity in uveal melanoma. Nat
Commun. 2020;11(1):496. https://doi.org/10.1038/s41467-019-
14256-1.
79. Palmer S, Albergante L, Blackburn CC, Newman TJ. Thymic involution
and rising disease incidence with age. Proc Natl Acad Sci U S A. 2018;
115(8):1883-1888. https://doi.org/10.1073/pnas.1714478115.
80. Pagès F, Galon J, Dieu-Nosjean MC, Tartour E, Sautès-Fridman C,
Fridman WH. Immune infiltration in human tumors: a prognostic fac-
tor that should not be ignored. Oncogene. 2010;29(8):1093-1102.
https://doi.org/10.1038/onc.2009.416.
81. Han A, Glanville J, Hansmann L, Davis MM. Linking T-cell receptor
sequence to functional phenotype at the single-cell level. Nat Bio-
technol. 2014;32(7):684-692. https://doi.org/10.1038/nbt.2938.
82. Zemmour D, Zilionis R, Kiner E, Klein AM, Mathis D, Benoist C.
Single-cell gene expression reveals a landscape of regulatory T cell
phenotypes shaped by the TCR. Nat Immunol. 2018;19(3):
291-301. https://doi.org/10.1038/s41590-018-0051-0.
83. Moral JA, Leung J, Rojas LA, et al. ILC2s amplify PD-1 blockade by
activating tissue-specific cancer immunity. Nature. 2020;579(7797):
130-135. https://doi.org/10.1038/s41586-020-2015-4.
84. Li H, van der Leun AM, Yofe I, et al. Dysfunctional CD8 T cells form
a proliferative, dynamically regulated compartment within human
melanoma. Cell. 2019;176(4):775-789.e18. https://doi.org/10.1016/
j.cell.2018.11.043.
85. Fairfax BP, Taylor CA, Watson RA, et al. Peripheral CD8+ T cell char-
acteristics associated with durable responses to immune checkpoint
blockade in patients with metastatic melanoma. Nat Med. 2020;26
(2):193-199. https://doi.org/10.1038/s41591-019-0734-6.
86. Azizi E, Carr AJ, Plitas G, et al. Single-cell map of diverse immune
phenotypes in the breast tumor microenvironment. Cell. 2018;174
(5):1293-1308.e36. https://doi.org/10.1016/j.cell.2018.05.060.
87. Wu TD, Madireddi S, de Almeida PE, et al. Peripheral T cell
expansion predicts tumour infiltration and clinical response. Nature.
2020;579(7798):274-278. https://doi.org/10.1038/s41586-020-
2056-8.
88. Park JE, Botting RA, Conde CD, et al. A cell atlas of human thymic
development defines T cell repertoire formation. Science. 2020;367
(6480):eaay3224. https://doi.org/10.1126/science.aay3224.
89. Savas P, Virassamy B, Ye C, et al. Single-cell profiling of breast can-
cer T cells reveals a tissue-resident memory subset associated with
improved prognosis. Nat Med. 2018;24(7):986-993. https://doi.org/
10.1038/s41591-018-0078-7.
90. Zheng C, Zheng L, Yoo JK, et al. Landscape of infiltrating T cells in
liver cancer revealed by single-cell sequencing. Cell. 2017;169(7):
1342-1356.e16. https://doi.org/10.1016/j.cell.2017.05.035.
91. Zhang Q, He Y, Luo N, et al. Landscape and dynamics of single
immune cells in hepatocellular carcinoma. Cell. 2019;179(4):
829-845.e20. https://doi.org/10.1016/j.cell.2019.10.003.
92. Zilionis R, Engblom C, Pfirschke C, et al. Single-cell Transcriptomics
of human and mouse lung cancers reveals conserved myeloid
populations across individuals and species. Immunity. 2019;50(5):
1317-1334.e10. https://doi.org/10.1016/j.immuni.2019.03.009.
93. Guo X, Zhang Y, Zheng L, et al. Global characterization of T cells in
non-small-cell lung cancer by single-cell sequencing. Nat Med. 2018;
24(7):978-985. https://doi.org/10.1038/s41591-018-0045-3.
94. Zhang L, Yu X, Zheng L, et al. Lineage tracking reveals dynamic rela-
tionships of T cells in colorectal cancer. Nature. 2018;564(7735):
268-272. https://doi.org/10.1038/s41586-018-0694-x.
95. Yost KE, Satpathy AT, Wells DK, et al. Clonal replacement of tumor-
specific T cells following PD-1 blockade. Nat Med. 2019;25(8):1251-
1259. https://doi.org/10.1038/s41591-019-0522-3.
96. Goswami S, Walle T, Cornish AE, et al. Immune profiling of human
tumors identifies CD73 as a combinatorial target in glioblastoma. Nat
Med. 2020;26(1):39-46. https://doi.org/10.1038/s41591-019-0694-x.
97. Hartmann FJ, Mrdjen D, McCaffrey E, et al. Single-cell metabolic
profiling of human cytotoxic T cells. Nat Biotechnol. 2020;39:
186–197. https://doi.org/10.1101/2020.01.17.909796.
98. Lavin Y, Kobayashi S, Leader A, et al. Innate immune landscape in
early lung adenocarcinoma by paired single-cell analyses. Cell. 2017;
169(4):750-765.e17. https://doi.org/10.1016/j.cell.2017.04.014.
99. Wilting RH, Dannenberg JH. Epigenetic mechanisms in tumorigene-
sis, tumor cell heterogeneity and drug resistance. Drug Resist Updat.
2012;15(1-2):21-38. https://doi.org/10.1016/j.drup.2012.01.008.
100. Darwiche N. Epigenetic mechanisms and the hallmarks of cancer: an
intimate affair. Am J Cancer Res. 2020;10(7):1954-1978.
101. Ehrlich M. DNA methylation in cancer: too much, but also too little.
Oncogene. 2002;21(35):5400-5413. https://doi.org/10.1038/sj.onc.
1205651.
102. Audia JE, Campbell RM. Histone modifications and cancer. Cold
Spring Harb Perspect Biol. 2016;8(4):a019521. https://doi.org/10.
1101/cshperspect.a019521.
103. Cheng Y, He C, Wang M, et al. Targeting epigenetic regulators for
cancer therapy: mechanisms and advances in clinical trials. Signal
Transduct Target Ther. 2019;4(1):62. https://doi.org/10.1038/
s41392-019-0095-0.
104. Issa JP, Garcia-Manero G, Giles FJ, et al. Phase 1 study of low-dose
prolonged exposure schedules of the hypomethylating agent
5-aza-2′-deoxycytidine (decitabine) in hematopoietic malignancies. Blood.
2004;103(5):1635-1640. https://doi.org/10.1182/blood-2003-03-
0687.
105. Kantarjian H, Oki Y, Garcia-Manero G, et al. Results of a randomized
study of 3 schedules of low-dose decitabine in higher-risk
myelodysplastic syndrome and chronic myelomonocytic leukemia.
Blood. 2007;109(1):52-57. https://doi.org/10.1182/blood-2006-05-
021162.
106. Issa JP, Kantarjian HM. Targeting DNA methylation. Clin Cancer Res.
2009;15(12):3938-3946. https://doi.org/10.1158/1078-0432.CCR-
08-2783.
107. Takeshima H, Yoda Y, Wakabayashi M, Hattori N, Yamashita S,
Ushijima T. Low-dose DNA demethylating therapy induces repro-
gramming of diverse cancer-related pathways at the single-cell level.
Clin Epigenetics. 2020;12(1):142. https://doi.org/10.1186/s13148-
020-00937-y.
108. Granja JM, Klemm S, McGinnis LM, et al. Single-cell multiomic analy-
sis identifies regulatory programs in mixed-phenotype acute leuke-
mia. Nat Biotechnol. 2019;37(12):1458-1465. https://doi.org/10.
1038/s41587-019-0332-7.
109. LaFave LM, Kartha VK, Ma S, et al. Epigenomic state transitions
characterize tumor progression in mouse lung adenocarcinoma.
Cancer Cell. 2020;38(2):212-228.e13. https://doi.org/10.1016/j.ccell.
2020.06.006.
110. Guo H, Zhu P, Guo F, et al. Profiling DNA methylome landscapes of
mammalian cells with single-cell reduced-representation bisulfite
sequencing. Nat Protoc. 2015;10(5):645-659. https://doi.org/10.
1038/nprot.2015.039.
111. Rotem A, Ram O, Shoresh N, et al. Single-cell ChIP-seq reveals cell
subpopulations defined by chromatin state. Nat Biotechnol. 2015;33
(11):1165-1172. https://doi.org/10.1038/nbt.3383.
112. Hou Y, Guo H, Cao C, et al. Single-cell triple omics sequencing
reveals genetic, epigenetic, and transcriptomic heterogeneity in
hepatocellular carcinomas. Cell Res. 2016;26(3):304-319. https://
doi.org/10.1038/cr.2016.23.
113. Gaiti F, Chaligne R, Gu H, et al. Epigenetic evolution and lineage his-
tories of chronic lymphocytic leukaemia. Nature. 2019;569(7757):
576-580. https://doi.org/10.1038/s41586-019-1198-z.
114. Pastore A, Gaiti F, Lu SX, et al. Corrupted coordination of epigenetic
modifications leads to diverging chromatin states and transcriptional
heterogeneity in CLL. Nat Commun. 2019;10(1):1874. https://doi.
org/10.1038/s41467-019-09645-5.
115. Shu S, Wu HJ, Ge JY, et al. Synthetic lethal and resistance
interactions with BET bromodomain inhibitors in triple-negative breast
cancer. Mol Cell. 2020;78(6):1096-1113.e8.
https://doi.org/10.1016/j.molcel.2020.04.027.
116. Satpathy AT, Granja JM, Yost KE, et al. Massively parallel single-cell
chromatin landscapes of human immune cell development and
intratumoral T cell exhaustion. Nat Biotechnol. 2019;37(8):925-936.
https://doi.org/10.1038/s41587-019-0206-z.
117. Litzenburger UM, Buenrostro JD, Wu B, et al. Single-cell epigenomic
variability reveals functional cancer heterogeneity. Genome Biol.
2017;18(1):1-12. https://doi.org/10.1186/s13059-016-1133-7.
118. Grosselin K, Durand A, Marsolier J, et al. High-throughput single-cell
ChIP-seq identifies heterogeneity of chromatin states in breast can-
cer. Nat Genet. 2019;51(6):1060-1066. https://doi.org/10.1038/
s41588-019-0424-9.
119. Maruffi M, Sposto R, Oberley MJ, Kysh L, Orgel E. Therapy for chil-
dren and adults with mixed phenotype acute leukemia: a systematic
review and meta-analysis. Leukemia. 2018;32(7):1515-1528.
https://doi.org/10.1038/s41375-018-0058-4.
120. Stenhouse G, Fyfe N, King G, Chapman A, Kerr KM. Thyroid tran-
scription factor 1 in pulmonary adenocarcinoma. J Clin Pathol. 2004;
57(4):383-387. https://doi.org/10.1136/jcp.2003.007138.
121. Kim HK, Noh YH, Nilius B, et al. Current and upcoming mitochon-
drial targets for cancer therapy. Semin Cancer Biol. 2017;47:154-
167. https://doi.org/10.1016/j.semcancer.2017.06.006.
122. Lareau CA, Ludwig LS, Muus C, et al. Massively parallel single-cell
mitochondrial DNA genotyping and chromatin profiling. Nat Bio-
technol. 2020. https://doi.org/10.1038/s41587-020-0645-6.
123. Lo PK, Zhou Q. Emerging techniques in single-cell epigenomics and
their applications to cancer research. J Clin Genomics. 2018;1(1).
https://doi.org/10.4172/JCG.1000103.
124. Kaya-Okur HS, Wu SJ, Codomo CA, et al. CUT&tag for efficient epi-
genomic profiling of small samples and single cells. Nat Commun.
2019;10(1):1930. https://doi.org/10.1038/s41467-019-09982-5.
125. Ding L, Ley TJ, Larson DE, et al. Clonal evolution in relapsed acute
myeloid leukaemia revealed by whole-genome sequencing. Nature. 2012;481
(7382):506-510. https://doi.org/10.1038/nature10738.
126. Gerlinger M, Rowan AJ, Horswell S, et al. Intratumor heterogeneity and branched
evolution revealed by multiregion sequencing. N Engl J Med. 2012;366(10):
883-892. https://doi.org/10.1056/NEJMoa1113205.
127. Navin N, Kendall J, Troge J, et al. Tumour evolution inferred by
single-cell sequencing. Nature. 2011;472:90-94. https://doi.org/10.
1038/nature09807.
128. Hou Y, Song L, Zhu P, et al. Single-cell exome sequencing and mono-
clonal evolution of a JAK2-negative myeloproliferative neoplasm.
Cell. 2012;148:873-885. https://doi.org/10.1016/j.cell.2012.02.028.
129. Xu X, Hou Y, Yin X, et al. Single-cell exome sequencing reveals
single-nucleotide mutation characteristics of a kidney tumor. Cell.
2012;148(5):886-895. https://doi.org/10.1016/j.cell.2012.02.025.
130. Wang J, Fan HC, Behr B, Quake SR. Genome-wide single-cell analy-
sis of recombination activity and de novo mutation rates in human
sperm. Cell. 2012;150(2):402-412. https://doi.org/10.1016/j.cell.
2012.06.030.
131. Hughes AEO, Magrini V, Demeter R, et al. Clonal architecture of
secondary acute myeloid leukemia defined by single-cell sequencing.
PLoS Genet. 2014;10(7):e1004462. https://doi.org/10.1371/journal.
pgen.1004462.
132. Zong C, Lu S, Chapman AR, Xie XS. Genome-wide detection of
single-nucleotide and copy-number variations of a single human cell.
Science. 2012;338(6114):1622-1626. https://doi.org/10.1126/
science.1229164.
133. Vitak SA, Torkenczy KA, Rosenkrantz JL, et al. Sequencing thousands
of single-cell genomes with combinatorial indexing. Nat Methods.
2017;14:302-308. https://doi.org/10.1038/nmeth.4154.
134. Gawad C, Koh W, Quake SR. Single-cell genome sequencing: current
state of the science. Nat Rev Genet. 2016;17(3):175-188. https://
doi.org/10.1038/nrg.2015.16.
135. Chen C, Xing D, Tan L, et al. Single-cell whole-genome analyses
by linear amplification via transposon insertion (LIANTI). Science.
2017;356(6334):189-194. https://doi.org/10.1126/science.aak9787.
136. Zahn H, Steif A, Laks E, et al. Scalable whole-genome single-cell
library preparation without preamplification. Nat Methods. 2017;14
(2):167-173. https://doi.org/10.1038/nmeth.4140.
137. Laks E, McPherson A, Zahn H, et al. Clonal decomposition and DNA repli-
cation states defined by scaled single-cell genome sequencing. Cell. 2019;
179(5):1207-1221.e22. https://doi.org/10.1016/j.cell.2019.10.026.
138. Falconer E, Hills M, Naumann U, et al. DNA template strand
sequencing of single-cells maps genomic rearrangements at high res-
olution. Nat Methods. 2012;9(11):1107-1112. https://doi.org/10.
1038/nmeth.2206.
139. Maria Maggiolini FA, Sanders AD, Shew CJ, et al. Single-cell strand
sequencing of a macaque genome reveals multiple nested inversions
and breakpoint reuse during primate evolution. Genome Res. 2020;
30(11):1680-1693. https://doi.org/10.1101/gr.265322.120.
140. Sanders AD, Meiers S, Ghareghani M, et al. Single-cell analysis of
structural variations and complex rearrangements with tri-channel
processing. Nat Biotechnol. 2020;38(3):343-354. https://doi.org/10.
1038/s41587-019-0366-x.
141. Wang Y, Waters J, Leung ML, et al. Clonal evolution in breast cancer
revealed by single nucleus genome sequencing. Nature. 2014;512
(7513):155-160. https://doi.org/10.1038/nature13600.
142. Gao R, Davis A, McDonald TO, et al. Punctuated copy number evo-
lution and clonal stasis in triple-negative breast cancer. Nat Genet.
2016;48(10):1119-1130. https://doi.org/10.1038/ng.3641.
143. Eirew P, Steif A, Khattra J, et al. Dynamics of genomic clones in
breast cancer patient xenografts at single-cell resolution. Nature.
2015;518(7539):422-426. https://doi.org/10.1038/nature13952.
144. Kim C, Gao R, Sei E, et al. Chemoresistance evolution in triple-negative
breast cancer delineated by single-cell sequencing. Cell. 2018;173(4):
879-893.e13. https://doi.org/10.1016/j.cell.2018.03.041.
145. Leung ML, Davis A, Gao R, et al. Single-cell DNA sequencing reveals
a late-dissemination model in metastatic colorectal cancer. Genome
Res. 2017;27(8):1287-1299. https://doi.org/10.1101/gr.209973.116.
146. Lohr JG, Adalsteinsson VA, Cibulskis K, et al. Whole-exome
sequencing of circulating tumor cells provides a window into meta-
static prostate cancer. Nat Biotechnol. 2014;32(5):479-484. https://
doi.org/10.1038/nbt.2892.
147. Ni X, Zhuo M, Su Z, et al. Reproducible copy number variation pat-
terns among single circulating tumor cells of lung cancer patients.
Proc Natl Acad Sci U S A. 2013;110:21083-21088. https://doi.org/
10.1073/pnas.1320659110.
148. Carter L, Rothwell DG, Mesquita B, et al. Molecular analysis of
circulating tumor cells identifies distinct copy-number profiles in
patients with chemosensitive and chemorefractory small-cell lung
cancer. Nat Med. 2017;23(1):114-119. https://doi.org/10.1038/nm.
4239.
149. Pellegrino M, Sciambi A, Treusch S, et al. High-throughput single-cell
DNA sequencing of acute myeloid leukemia tumors with droplet
microfluidics. Genome Res. 2018;28(9):1345-1352. https://doi.org/
10.1101/gr.232272.117.
150. DiNardo CD, Tiong IS, Quaglieri A, et al. Molecular patterns of
response and treatment failure after frontline venetoclax combinations
in older patients with AML. Blood. 2020;135(11):791-803. https://doi.
org/10.1182/blood.2019003988.
151. Morita K, Wang F, Jahn K, et al. Clonal evolution of acute myeloid
leukemia revealed by high-throughput single-cell genomics. Nat
Commun. 2020;11(1):5327. https://doi.org/10.1038/s41467-020-
19119-8.
152. Miles LA, Bowman RL, Merlinsky TR, et al. Single-cell mutation anal-
ysis of clonal evolution in myeloid malignancies. Nature. 2020;587:
477-482. https://doi.org/10.1038/s41586-020-2864-x.
153. Papaemmanuil E, Rapado I, Li Y, et al. RAG-mediated recombination
is the predominant driver of oncogenic rearrangement in
ETV6-RUNX1 acute lymphoblastic leukemia. Nat Genet. 2014;46(2):
116-125. https://doi.org/10.1038/ng.2874.
154. Gawad C, Koh W, Quake SR. Dissecting the clonal origins of child-
hood acute lymphoblastic leukemia by single-cell genomics. Proc
Natl Acad Sci U S A. 2014;111(50):17947-17952. https://doi.org/10.
1073/pnas.1420822111.
155. Van Den Brink SC, Sage F, Vértesy Á, et al. Single-cell sequencing
reveals dissociation-induced gene expression in tissue subpopula-
tions. Nat Methods. 2017;14(10):935-936. https://doi.org/10.1038/
nmeth.4437.
156. Toki MI, Merritt CR, Wong PF, et al. High-Plex predictive marker
discovery for melanoma immunotherapy–treated patients using digi-
tal spatial profiling. Clin Cancer Res. 2019;25(18):5503-5512.
https://doi.org/10.1158/1078-0432.ccr-19-0104.
157. Wang F, Flanagan J, Su N, et al. RNAscope: a novel in situ RNA analy-
sis platform for formalin-fixed, paraffin-embedded tissues. J Mol Diagn.
2012;14(1):22-29. https://doi.org/10.1016/j.jmoldx.2011.08.002.
158. Ihle CL, Provera MD, Straign DM, et al. Distinct tumor microenviron-
ments of lytic and blastic bone metastases in prostate cancer
patients. J Immunother Cancer. 2019;7(1):1-9. https://doi.org/10.
1186/s40425-019-0753-3.
159. Goltsev Y, Samusik N, Kennedy-Darling J, et al. Deep profiling of
mouse splenic architecture with CODEX multiplexed imaging. Cell.
2018;174(4):968-981.e15. https://doi.org/10.1016/j.cell.2018.07.010.
160. Vickovic S, Eraslan G, Salmén F, et al. High-definition spatial trans-
criptomics for in situ tissue profiling. Nat Methods. 2019;16(10):987-
990. https://doi.org/10.1038/s41592-019-0548-y.
161. Schulz D, Zanotelli VRT, Fischer JR, et al. Simultaneous multiplexed
imaging of mRNA and proteins with subcellular resolution in breast
cancer tissue samples by mass cytometry. Cell Syst. 2018;6(1):25-36.e5.
https://doi.org/10.1016/j.cels.2017.12.001.
162. Casasent AK, Schalck A, Gao R, et al. Multiclonal invasion in breast
tumors identified by topographic single cell sequencing. Cell. 2018;
172(1-2):205-217.e12. https://doi.org/10.1016/j.cell.2017.12.007.
163. Kim S, Lee AC, Lee HB, et al. PHLI-seq: constructing and visualizing
cancer genomic maps in 3D by phenotype-based high-throughput
laser-aided isolation and sequencing. Genome Biol. 2018;19:158.
https://doi.org/10.1186/s13059-018-1543-9.
164. Gorris MAJ, Halilovic A, Rabold K, et al. Eight-color multiplex immu-
nohistochemistry for simultaneous detection of multiple immune
checkpoint molecules within the tumor microenvironment. J Immunol.
2018;200(1):347-354. https://doi.org/10.4049/jimmunol.1701262.
165. Cabrita R, Lauss M, Sanna A, et al. Tertiary lymphoid structures
improve immunotherapy and survival in melanoma. Nature.
2020;577(7791):561-565. https://doi.org/10.1038/s41586-019-
1914-8.
166. Helmink BA, Reddy SM, Gao J, et al. B cells and tertiary lymphoid
structures promote immunotherapy response. Nature. 2020;577
(7791):549-555. https://doi.org/10.1038/s41586-019-1922-8.
167. Merritt CR, Ong GT, Church SE, et al. Multiplex digital spatial profil-
ing of proteins and RNA in fixed tissue. Nat Biotechnol. 2020;38(5):
586-599. https://doi.org/10.1038/s41587-020-0472-9.
168. Rodriques SG, Stickels RR, Goeva A, et al. Slide-seq: a scalable tech-
nology for measuring genome-wide expression at high spatial
resolution. Science. 2019;363(6434):1463-1467. https://doi.org/10.
1126/science.aaw1219.
169. Stahl PL, Salmen F, Vickovic S, et al. Visualization and analysis of
gene expression in tissue sections by spatial transcriptomics. Science.
2016;353(6294):78-82. https://doi.org/10.1126/science.aaf2403.
170. Moncada R, Barkley D, Wagner F, et al. Integrating microarray-
based spatial transcriptomics and single-cell RNA-seq reveals tissue
architecture in pancreatic ductal adenocarcinomas. Nat Biotechnol.
2020;38(3):333-342. https://doi.org/10.1038/s41587-019-0392-8.
171. Villacampa EG, Larsson L, Kvastad L, Andersson A, Carlson J,
Lundeberg J. Genome-wide spatial expression profiling in FFPE
tissues. bioRxiv. 2020. https://doi.org/10.1101/2020.07.24.219758.
172. Wang Z, Portier BP, Gruver AM, et al. Automated quantitative RNA
in situ hybridization for resolution of equivocal and heterogeneous
ERBB2 (HER2) status in invasive breast carcinoma. J Mol Diagn. 2013;
15(2):210-219. https://doi.org/10.1016/j.jmoldx.2012.10.003.
173. Ke R, Mignardi M, Pacureanu A, et al. In situ sequencing for RNA
analysis in preserved tissue and cells. Nat Methods. 2013;10(9):857-
860. https://doi.org/10.1038/nmeth.2563.
174. Potter N, Ermini L, Papaemmanuil E, et al. Single-cell mutational
profiling and clonal phylogeny in cancer. Genome Res. 2013;23:
2115-2125. https://doi.org/10.1101/gr.159913.113.
175. Ediriwickrema A, Aleshin A, Reiter JG, et al. Single-cell mutational profil-
ing enhances the clinical evaluation of AML MRD. Blood Adv. 2020;
4(5):943-952. https://doi.org/10.1182/bloodadvances.2019001181.
176. Nguyen QH, Pervolarakis N, Nee K, Kessenbrock K. Experimental
considerations for single-cell RNA sequencing approaches. Front Cell
Dev Biol. 2018;6:108. https://doi.org/10.3389/fcell.2018.00108.
177. Hodge RD, Bakken TE, Miller JA, et al. Conserved cell types with
divergent features in human versus mouse cortex. Nature. 2019;
573:61-68. https://doi.org/10.1038/s41586-019-1506-7.
178. Barkas N, Petukhov V, Nikolaeva D, et al. Joint analysis of heteroge-
neous single-cell RNA-seq dataset collections. Nat Methods. 2019;
16:695-698. https://doi.org/10.1038/s41592-019-0466-z.
179. Mereu E, Lafzi A, Moutinho C, et al. Benchmarking single-cell RNA-
sequencing protocols for cell atlas projects. Nat Biotechnol. 2020;38
(6):747-755. https://doi.org/10.1038/s41587-020-0469-4.
180. Massoni-Badosa R, Iacono G, Moutinho C, et al. Sampling
time-dependent artifacts in single-cell genomics studies.
Genome Biol. 2020;21(1):1-16. https://doi.org/10.1186/s13059-
020-02032-0.
181. Slyper M, Porter CBM, Ashenberg O, et al. A single-cell and single-
nucleus RNA-Seq toolbox for fresh and frozen human tumors. Nat
Med. 2020;26(5):792-802. https://doi.org/10.1038/s41591-020-
0844-1.
182. Bendall SC, Davis KL, Amir EAD, et al. Single-cell trajectory
detection uncovers progression and regulatory coordination in human B
cell development. Cell. 2014;157:714-725. https://doi.org/10.
1016/j.cell.2014.04.005.
183. Trapnell C, Cacchiarelli D, Grimsby J, et al. The dynamics and regula-
tors of cell fate decisions are revealed by pseudotemporal ordering
of single cells. Nat Biotechnol. 2014;32:381-386. https://doi.org/10.
1038/nbt.2859.
184. Kharchenko PV, Silberstein L, Scadden DT. Bayesian approach to
single-cell differential expression analysis. Nat Methods. 2014;11(7):
740-742. https://doi.org/10.1038/nmeth.2967.
185. Grün D, Kester L, Van Oudenaarden A. Validation of noise models
for single-cell transcriptomics. Nat Methods. 2014;11(6):637-640.
https://doi.org/10.1038/nmeth.2930.
186. Pagès F, Mlecnik B, Marliot F, et al. International validation of the
consensus Immunoscore for the classification of colon cancer: a
prognostic and accuracy study. Lancet. 2018;391(10135):2128-
2139. https://doi.org/10.1016/S0140-6736(18)30789-X.
187. Bodenmiller B. Multiplexed epitope-based tissue imaging for discov-
ery and healthcare applications. Cell Syst. 2016;2(4):225-238.
https://doi.org/10.1016/j.cels.2016.03.008.
Identification of a tumor–specific gene regulatory network in human B-cell lymphoma
Introduction
Simultaneous readout of transcriptomic and epigenomic data from the same cell at single cell resolution allows for direct reconstruction of cell type–specific gene regulatory networks that does not rely on inference or assumptions to tie the two data types together. Here, we show how multiomic analysis of paired RNA-seq and ATAC-seq data from the same single cells using Chromium Single Cell Multiome ATAC + Gene Expression enables direct linkage of differentially accessible DNA regions to proximal differentially expressed genes to identify putative regulatory targets. As a result, you can answer questions not only about what genes are expressed in a single cell, but how expression is regulated through associated open chromatin regions. In a diffuse small B-cell lymphoma sample, we confirmed Paired Box 5 (PAX5) as an important regulator in tumor B cells and identified a network of potential PAX5 target genes.
[Figure 1 workflow panels: Sample prep (prepare nuclei suspension); GEM generation; Library construction; Sequencing; Data processing; Data visualization]
Figure 1. Experimental methods for nuclei isolation and multiomic data generation. Flash-frozen intra-abdominal lymph node tumor, with pathologist annotation of diffuse small B-cell lymphoma tissue, was acquired from BioIVT Asterand®. Nuclei were isolated following the Nuclei Isolation from Complex Tissues for Single Cell Multiome ATAC + Gene Expression Sequencing Demonstrated Protocol (CG000375). Isolated nuclei were flow sorted before permeabilization. Nuclei were transposed in bulk before single nuclei encapsulation in GEMs (Gel Bead-in-emulsion), where DNA fragments and the 3’ ends of mRNA were barcoded. Paired ATAC and gene expression libraries were generated from 14,000 total nuclei as described in the Chromium Next GEM Single Cell Multiome ATAC + Gene Expression User Guide (CG000338 Rev A) and sequenced on an Illumina NovaSeq™ 6000 v1.5.
Highlights
• Distinguish tumor versus normal cells in a heterogeneous sample
• Reconstruct cell type–specific gene regulatory network
• Confirm PAX5 as a critical regulator specific to tumor B cells
• Identify putative target genes downstream of PAX5
[Figure 2 panels. A: UMAP projections (axes UMAP 1 and UMAP 2) titled "Gene expression", "ATAC", and "Gene expression, T cells". Cell type legend: B, Fibroblasts, Mono, pDC, Stromal cells, T, T cycling, Tumor B, Tumor B cycling, NA. T cell subtype legend: CD4 cytotoxic, CD4 memory, CD4 Naive, CD4 Tfh, CD8 Exhausted, CD8 memory, CD8 Naive, NK, NKT, Proliferating T, Treg. B: UMAP projections showing expression of MS4A1, BANK1, and PAX5, with tumor B cells labeled.]
Figure 2. Simultaneous measurement of gene expression and open chromatin profiles from the same single nuclei enables clustering based on either modality. A. Shown are clustering and manual annotation based on gene expression for all 14,000 nuclei (left); gene expression-derived annotations layered on ATAC projections (middle); and the gene expression plot on the left restricted to the T-cell populations (right). B. Highlighted are expression levels of select genes, including MS4A1, a canonical B-cell marker (left); BANK1, an attenuator of the BCR activation pathway that is repressed in tumor cells relative to normal B cells (middle); and PAX5, required for B-cell differentiation (right).
[Figure 3 panel labels: Feature linkage; Annotate peaks linked to DEGs]
Figure 3. Computational strategy for identification of cell type–specific gene regulatory networks. A. In 10x Genomics Cell Ranger ARC software, feature linkages are defined as pairs of genomic features, such as peaks and genes, that exhibit significant correlation in their chromatin accessibility and transcript level, respectively, across cells. Feature linkages can be positively or negatively correlated. For example, an open enhancer region may have a positive correlation with gene expression of its associated transcript (blue), while the binding of a repressor would result in a negatively correlated feature linkage (red). The greater the correlation between open chromatin signal and gene expression, the taller the arc. B. To identify a gene regulatory network in tumor B cells, genes were first filtered based on significant transcriptional upregulation in tumor B cells relative to normal B cells (p < 10^-20), resulting in 198 differentially expressed genes (DEGs, green). Peaks associated with DEGs (green) were identified using feature linkages. Tumor B cell–specific enriched motifs were then identified using DEG-linked peaks. Enriched motifs and linked upregulated genes were used to define a B cell lymphoma–specific gene regulatory network (Figure 4).
What to look for
Since mRNA and ATAC data are generated from the same cells, cell-type annotations can be transferred from one modality to the other (Figure 2A, middle). In addition to the identification of B cells, monocytes, and T-cell subtypes using canonical cell markers like the B-cell marker MS4A1, tumor B cells were distinguishable from normal B cells based on upregulated CD40 expression (data not shown) and reduced BANK1 (Figure 2B). PAX5 was significantly upregulated in tumor B cells relative to normal B cells (Figure 2B), and has previously been identified as a core regulator of chronic lymphocytic leukemia (CLL) (Ott et al., 2018).
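The annotation transfer described above can be pictured as a lookup on shared cell barcodes. The following is a minimal sketch under stated assumptions, not the workflow used for Figure 2: the function name `transfer_labels`, the barcodes, and the cluster assignments are all hypothetical, and nuclei without a gene expression annotation are given the label "NA".

```python
from collections import Counter

def transfer_labels(rna_labels, atac_clusters):
    """Copy RNA-derived cell-type labels onto ATAC data via shared barcodes.

    rna_labels:    {barcode: cell type annotated from gene expression}
    atac_clusters: {barcode: ATAC cluster id}
    Returns (per-barcode labels for the ATAC data, majority label per ATAC cluster).
    """
    # A barcode identifies the same nucleus in both modalities, so a lookup suffices.
    atac_labels = {bc: rna_labels.get(bc, "NA") for bc in atac_clusters}

    # Count which cell type dominates each ATAC cluster.
    votes = {}
    for bc, cluster in atac_clusters.items():
        votes.setdefault(cluster, Counter())[atac_labels[bc]] += 1
    majority = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    return atac_labels, majority

# Hypothetical barcodes and labels, for illustration only.
rna_labels = {
    "AAAC-1": "Tumor B", "AAAG-1": "Tumor B",
    "AACT-1": "T", "ACGT-1": "T", "AAGG-1": "B",
}
atac_clusters = {
    "AAAC-1": 0, "AAAG-1": 0,
    "AACT-1": 1, "ACGT-1": 1, "AATT-1": 1,   # AATT-1 has no RNA-based label
    "AAGG-1": 2,
}

atac_labels, majority = transfer_labels(rna_labels, atac_clusters)
print(majority)
```

Because the join is on barcodes alone, no cross-modality integration step is needed in this sketch; that is the practical benefit of measuring both modalities in the same nucleus.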
Paired gene expression and open chromatin signals pave the way for high-confidence gene regulatory network predictions using feature linkages, which are calculated automatically in Cell Ranger ARC (Figure 3A). Feature linkages help build putative gene regulatory networks by pairing genes with open chromatin regions whose accessibility correlates with their expression across cells. To identify tumor B cell–specific gene regulatory networks, we first annotated feature linkages by genes upregulated in tumor B cells to identify peaks that were potential drivers of differential expression. We then identified motifs enriched in these peaks relative to a set of matched background motifs within tumor B cells (Figure 3B). Using this method, we found that the PAX1 motif was the most enriched (Figure 4).
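As a rough illustration of the idea behind feature linkages, the sketch below correlates peak accessibility with gene expression across cells on synthetic data. It is not the Cell Ranger ARC implementation, whose details are not given here. The function name `feature_linkages`, the use of plain Pearson correlation, and the `min_abs_corr` threshold are assumptions made for illustration, and the sketch assumes every peak and gene varies across cells.

```python
import numpy as np

def feature_linkages(peak_acc, gene_expr, min_abs_corr=0.5):
    """Correlate every peak with every gene across cells.

    peak_acc:  array (n_cells, n_peaks) of accessibility values
    gene_expr: array (n_cells, n_genes) of expression values
    Returns (peak_index, gene_index, correlation) for each pair whose
    absolute Pearson correlation is at least min_abs_corr.
    """
    # Standardize each column; Pearson r is then the mean of the products.
    zp = (peak_acc - peak_acc.mean(axis=0)) / peak_acc.std(axis=0)
    zg = (gene_expr - gene_expr.mean(axis=0)) / gene_expr.std(axis=0)
    corr = zp.T @ zg / peak_acc.shape[0]           # shape (n_peaks, n_genes)

    links = []
    for p, g in zip(*np.nonzero(np.abs(corr) >= min_abs_corr)):
        links.append((int(p), int(g), float(corr[p, g])))
    return links

# Synthetic example: one hidden regulatory activity drives peak 0,
# activates gene 0 and represses gene 1; peak 1 is unrelated noise.
rng = np.random.default_rng(0)
n_cells = 200
activity = rng.normal(size=n_cells)
peaks = np.column_stack([
    activity + 0.3 * rng.normal(size=n_cells),
    rng.normal(size=n_cells),
])
genes = np.column_stack([
    activity + 0.3 * rng.normal(size=n_cells),
    -activity + 0.3 * rng.normal(size=n_cells),
])

links = feature_linkages(peaks, genes)
for peak, gene, r in links:
    print(f"peak {peak} - gene {gene}: r = {r:+.2f}")
```

Peak 0 is linked positively to gene 0 and negatively to gene 1, mirroring the positively and negatively correlated linkages described for Figure 3A. A real analysis would also need a significance estimate and a limit on peak-to-gene distance, both omitted here.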
PAX1 and PAX5 motifs are highly similar; however, PAX1 is not expressed in tumor B cells, while PAX5 is highly expressed (Figure 4). Therefore, it is likely that the PAX5 transcription factor binds the identified PAX1 motif. This inference is only possible with paired gene expression and open chromatin information from the same cells.
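The reasoning in this paragraph amounts to filtering the transcription factors that share a motif by whether they are expressed in the cell type of interest. A minimal sketch follows, with a hypothetical function name `expressed_candidates`, an arbitrary `min_expr` cutoff, and invented expression values rather than values from the dataset.

```python
def expressed_candidates(motif_family, mean_expr, min_expr=1.0):
    """Return TFs that could bind a motif and are expressed, highest first.

    motif_family: TFs whose motifs are too similar to tell apart
    mean_expr:    {TF: mean expression in the cell type of interest}
    min_expr:     assumed cutoff below which a TF is treated as not expressed
    """
    expressed = [tf for tf in motif_family if mean_expr.get(tf, 0.0) >= min_expr]
    return sorted(expressed, key=lambda tf: mean_expr[tf], reverse=True)

# Invented expression values for tumor B cells, for illustration only.
tumor_b_expr = {"PAX1": 0.0, "PAX5": 3.2}

candidates = expressed_candidates(["PAX1", "PAX5"], tumor_b_expr)
print(candidates)
```

With accessibility data alone, both factors would remain candidates for the enriched motif; the expression values are what remove PAX1 from the list.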
To understand the role of PAX5 in tumor B cells, we zoomed in on the PAX5 locus, which is differentially expressed between B cells and tumor B cells (Figure 5). Expression of PAX5 is highly correlated with open PAX5 motif sites in a previously identified super-enhancer, suggesting autoregulation (Figure 5, dashed box). Additional feature linkages contribute further to the reconstruction of a putative tumor B cell–specific gene regulatory network, and suggest PAX5 may also regulate the immune transcription factor genes NFATC1, TCF4, IKZF1, and IRF8 (Figure 4). The importance of PAX5 and its position as a key genetic regulator in tumor B cells is consistent with previously published results showing that, of 147 transcription factors tested, loss of PAX5 had the greatest effect on cell proliferation in a CLL cell line (Ott et al., 2018). While confirmation of individual links in our predicted gene regulatory network requires functional tests, the confidence in regulatory connections is greatly increased by joint measurement of mRNA and ATAC data.
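One way to picture how linkages, motif hits, and differential expression combine into a putative network is sketched below. This is an illustrative assumption about the bookkeeping, not the analysis code behind Figure 4: the function `build_network`, the peak names, and the significance values are hypothetical, and the motif-to-TF assignment is taken as given.

```python
from collections import defaultdict

def build_network(linkages, peak_motifs, motif_to_tf, upregulated):
    """Assemble putative TF -> target edges.

    linkages:    iterable of (peak, gene, significance) feature linkages
    peak_motifs: {peak: set of motif names found in that peak}
    motif_to_tf: {motif name: TF assumed to bind it}
    upregulated: set of genes upregulated in the cell type of interest
    Returns {tf: {target gene: best significance over supporting linkages}}.
    """
    network = defaultdict(dict)
    for peak, gene, significance in linkages:
        if gene not in upregulated:
            continue                      # keep only differentially expressed targets
        for motif in peak_motifs.get(peak, ()):
            tf = motif_to_tf.get(motif)
            if tf is None:
                continue                  # motif with no expressed candidate TF
            best = network[tf].get(gene, float("-inf"))
            network[tf][gene] = max(best, significance)
    return {tf: dict(targets) for tf, targets in network.items()}

# Hypothetical peaks and significance values, for illustration only.
linkages = [
    ("peak_a", "PAX5", 120.0),   # linkage to PAX5 itself (autoregulation)
    ("peak_b", "IRF8", 40.0),
    ("peak_c", "IRF8", 75.0),
    ("peak_d", "GAPDH", 90.0),   # linked, but not upregulated in tumor B cells
]
peak_motifs = {
    "peak_a": {"PAX1"}, "peak_b": {"PAX1"},
    "peak_c": {"PAX1", "CUX1"}, "peak_d": {"PAX1"},
}
motif_to_tf = {"PAX1": "PAX5"}    # PAX1 motif attributed to the expressed PAX5
upregulated = {"PAX5", "IRF8"}

network = build_network(linkages, peak_motifs, motif_to_tf, upregulated)
print(network)
```

Keeping only the strongest linkage per TF and target pair matches the way Figure 4 reports a single significance value for each pair. Every edge produced this way remains a hypothesis until it is tested functionally.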
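[Figure 4 panels. Motifs: PAX5, ONECUT1, PAX1, CUX1, PAX9, CUX2, shown with a motif enrichment line plot and a bar graph of transcription factor expression (log10 UMI, 0 to 4). Target genes (immune TFs and other immune genes): TCF4, FOXP1, NFATC1, PAX5, IKZF1, AHR, TOX, IRF8, POU2F2, TP63, LEF1, CARD11, TFRC, ST6GAL1, BCL2, DTX1, SKAP2, CD83, SYK, IL4R, RASGRP3, CDKN2A, DLG1, KLHL6, BLNK, SEMA4A, PAK2, IGLC1, ADTRP, NFKBIZ, FCRL3, FCRL2. Color legend: linkage significance (10, 25, 50, 100, 200).]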
Figure 4. Feature linkages help build a tumor-specific gene regulatory network. The table summarizes significant feature linkages between motifs in the PAX/CUX/ONECUT family and a selection of immune-related transcription factors (TFs) and other immune genes that are differentially expressed in tumor B cells. At far left, the blue line plot shows motif enrichment scores, calculated using the analysis outlined in Figure 3. Gene expression levels of the transcription factors expected to bind each motif are indicated in the adjacent bar graph. For every differentially expressed gene–PAX/CUX/ONECUT motif pair, the significance of the most significant feature linkage is indicated by a colored square.
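The exact scoring behind the motif enrichment values is not specified here. As a generic stand-in, the sketch below tests whether a motif occurs more often in DEG-linked peaks than expected from a background peak set, using a one-sided hypergeometric test. The function name `hypergeom_enrichment` and all counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: total number of peaks considered (foreground plus background)
    K: peaks among the N that contain the motif
    n: foreground peaks (for example, peaks linked to upregulated genes)
    k: foreground peaks that contain the motif
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Hypothetical counts: 1,000 peaks, 100 with the motif; 50 foreground
# peaks, 20 of which contain the motif (5 would be expected by chance).
p_value = hypergeom_enrichment(k=20, n=50, K=100, N=1000)
fold = (20 / 50) / (100 / 1000)
print(f"fold enrichment = {fold:.1f}, p = {p_value:.2e}")
```

Ranking motifs by such a p-value or by fold enrichment would give an ordering analogous to the enrichment scores plotted in Figure 4, although the published scores may have been computed differently.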
Figure 5. Loupe Browser enables visualization of feature linkages. Positively correlated feature linkages are denoted by arcs at top. Highlighted by the dotted box is a highly significant feature linkage between PAX5 and a previously annotated CLL super-enhancer that is depicted in black (Ott et al., 2018). Below the illustrated feature linkages are open chromatin peaks identified for each cell cluster across a 0.3 Mb region. Annotated cell types are color coded. On the right are plots showing the expression level of PAX5 (top) and accessibility of the linked super-enhancer (bottom) for each annotated cell type. Tumor B cells (blue), in contrast to normal B cells (red), have elevated PAX5 expression and open chromatin at this super-enhancer.
Contact us 10xgenomics.com | [email protected] © 2021 10x Genomics, Inc. FOR RESEARCH USE ONLY. NOT FOR USE IN DIAGNOSTIC PROCEDURES.LIT000110 - Rev A - Data Spotlight - Tumor–specific gene regulatory network in human B-cell lymphoma
Explore what you can do
Chromium Single Cell Multiome ATAC + Gene Expression helps you identify the critical regulators and pathways behind cell state. Putative gene regulatory networks can be built based on correlated gene expression and open chromatin sites with greater accuracy and confidence than would be possible with a single modality. At the same time, the identity of likely transcriptional regulators can be constrained by both expression level and motif availability. Multiomic readout at the transcriptional and epigenetic levels, particularly from the same single cell, takes much of the guesswork out of network reconstruction based on gene expression alone, enabling a deeper understanding of the molecular mechanisms underpinning disease progression, developmental differentiation, and therapeutic response.
Resources
To explore the dataset further, download the data here: https://support.10xgenomics.com/single-cell-multiome-atac-gex/datasets/1.0.0/lymph_node_lymphoma_14k
References
Ott CJ, et al. Enhancer Architecture and Essential Core Regulatory Circuitry of Chronic Lymphocytic Leukemia. Cancer Cell. 34: 982–995, 2018.
SPECIAL FEATURE REVIEW
Recent advances in single-cell multimodal analysis to study immune cells
Raymond HY Louie & Fabio Luciani
School of Medical Sciences, The Kirby Institute, University of New South Wales (UNSW), Sydney, NSW, Australia
Keywords
cell state, cell–cell interaction, clonal analysis,
immune cells, lineage, multimodal analysis,
pseudotime, single-cell technology, temporal
analysis
Correspondence
Fabio Luciani, School of Medical Sciences and
the Kirby Institute, University of New South
Wales (UNSW), Sydney, NSW 2052,
Australia.
E-mail: [email protected]
Received 5 September 2020; Revised 30
October, 24 November and 9 December
2020; Accepted 9 December 2020
doi: 10.1111/imcb.12432
Immunology & Cell Biology 2021; 99:
157–167
Abstract
Recent advances in single-cell technologies have enabled the profiling of the
genome, epigenome, transcriptome and proteome, along with temporal and
spatial information of individual cells. These technologies have provided
unique opportunities to understand mechanisms underpinning the immune
system, such as characterizations of the molecular cell state, how the cell state
evolves along its lineage and the impact of spatial location on cell state. In this
review, we discuss how these mechanisms have been studied through recent
advances in single-cell multimodal technologies.
INTRODUCTION
Recent advances in single-cell technology have made it
possible to simultaneously extract different types of
information, or “modalities,” from the same single cell.
These modalities can arise from the genome, epigenome,
transcriptome and proteome (Figure 1a). In each of these
ome-layers, different modalities exist, such as mutations
and copy number variations at the genome layer, DNA
methylation and chromatin accessibility at the epigenome
layer and unspliced and spliced messenger RNA (mRNA)
at the transcriptome layer. Single-cell multimodal analysis
has been used for different immunological applications.
For example, a common application is to characterize the
molecular cell state, which can be described by a single
modality, for example, the expression of certain genes, or
by a combination of modalities spanning across the
genome, epigenome, transcriptome and proteome
(Figure 1a).
Single-cell multimodal analysis can also be used to
describe how the molecular cell state evolves along
different stages of cellular differentiation. This can be
achieved through combining “ome-layers” and temporal
modalities, thus capturing the information related to the
ordering of cells at different stages of differentiation.
Combining ome-layer and temporal modalities can
characterize how the molecular state of an immune cell
evolves along its lineage, from hematopoietic stem cells
(HSCs) in the bone marrow to its cell fate. A cell’s
lineage is created by the developmental history of the cell,
with each cell belonging to the same or sister clones.
Single-cell multimodal analysis can also inform on how
molecular cell state is location dependent, which requires
spatial modalities. These modalities include the (1) spatial
location of the cell in the body or (2) which cells are
interacting with each other. Cell-to-cell interactions can
be determined by examining receptor–ligand pair
interactions and are important, as neighboring cells can
modulate cell function through these interactions.
Single-cell multiple modalities can be obtained either
experimentally or bioinformatically. Experimental
modalities can be obtained through gathering cytometric
information before a destructive assay, separation of
cellular components or a conversion of cellular
information into a common molecular format.1 Detailed
descriptions of these methods are given in an excellent
review,1 and will not be discussed further. Bioinformatic
tools can also be used to extract multiple modalities. The
key difference from the experimental approach is that
these modes are extracted from data generated from the
same assay, where the data are typically sequenced reads.
For example, at the transcriptomics layer, reads aligned
to a transcriptome can be processed bioinformatically to
yield unspliced and spliced RNA.2 Data obtained at one
layer can also be used to computationally predict
modalities at another layer. For example, transcriptomic
data have been used to predict cell-to-cell interactions at
the spatial layer,3,4 the temporal ordering of cells5,6 and
the future states of cells2 at the temporal layer.
In this review, we will discuss recent single-cell
multimodal applications to human or mouse immune
cells, in contrast to broad reviews which focus on general
applications of single-cell multimodal analysis.7 We
define single-cell multimodal analysis as the analysis of
data sets arising from at least two modalities obtained
from the same cell, as opposed to bioinformatically
[Figure 1 graphic. (a) Genome: DNA mutations; Epigenome: chromatin accessibility, methylation; Transcriptome: RNA; Surface proteome; Intracellular proteome. (b) Single-cell multimodal applications: Molecular cell state (heterogeneous population; marker expression, e.g., gene, protein, in Mode 1 versus Mode 2); Temporal (time); Spatial (location in body, cell-cell interaction).]
Figure 1. Single-cell multimodal analysis: components and applications. (a) Cellular components of single-cell multimodal analysis. (b)
Applications of single-cell multimodal analysis to molecular cell state, temporal evolution and spatial analysis.
or similar protocols, as previously reviewed.9 However,
these methods are laborious and only applicable to small
cell numbers. New methods are available to isolate single
cells at high throughput, for instance, utilizing cellular
bar codes and unique molecular identifiers [e.g.
microfluidics technology (10x Chromium) or nano plates
(Rhapsody)].9 These methods require demultiplexing of
individual cells which is performed bioinformatically.
While these approaches were first developed to perform
single-cell RNA sequencing (scRNA-seq), more recently
these have also been developed to perform multimodal
analyses. For example, Cellular Indexing of
Transcriptomes and Epitopes by Sequencing (CITE-Seq)10
and AbSeq11 are two technologies which can
simultaneously extract extracellular (surface) protein and
gene expression in the same cell. These technologies have
been used to explore heterogeneous populations in both
healthy and disease samples.10–12 For example, CITE-
Seq10 was applied in combination with 10x Chromium to
identify cord blood mononuclear cells and successfully
identify natural killer cells based on the CD16 and CD56
surface markers, after which gene expression analysis
revealed differentially expressed signatures of natural
killer subtypes between healthy and disease samples,
including cytotoxic markers such as GZMB, GZMK and
PRF1.
Although technologies such as CITE-Seq and AbSeq
allow simultaneous measurements of the surface protein
and gene expression, extracting both the intracellular
protein and gene expression within the same single cell
remains largely unexplored. This is because these
measures require permeabilization of cell membrane
which may result in cell death, thus impairing the
possibility to utilize current approaches for combining
intracellular protein expression quantification with other
modalities, such as scRNA-seq. This roadblock has been
recently addressed by intracellular staining and
sequencing (INs-seq),13 which permits the measurement
of both intracellular protein and mRNA. INs-seq was
applied to several immune subsets, including dendritic
cells, myeloid cells and T cells. For the latter, intracellular
quantification of the transcription factors FOXP3, TCF7
and ID2 in combination with scRNA-seq data revealed
gene modules associated with these transcription factors,
for example, TCF7+ cells had gene modules associated
with a naïve phenotype (CCR7, SELL and LEF1), whereas
ID2+ cells revealed genes related to cytotoxicity (GNLY,
GZMA/B, PRF1).
Identification of cell state in diseases
Identifying cell states using multimodal analysis can lead
to the discovery of novel correlates of disease, clinical
integrated data sets arising from different samples (addressed in previous reviews1). As summarized in Table 1, we will review recent advances demonstrating the impact of single-cell multimodal analysis in understanding molecular cell state, temporal and spatial location of immune cells (Figure 1b). Finally, we will discuss future research opportunities.
MOLECULAR CELL STATE
The molecular state of an immune cell can be characterized by a combination of modalities from the genome, epigenome, transcriptome and proteome (Figure 1a). A common application of multimodal information is to isolate cells with a certain state using one modality, and then examine the cell state of these isolated cells in another modality. This process is sometimes repeated multiple times at different modalities. For example, surface protein markers have been traditionally used to first isolate or sort cells by fluorescence-activated cell sorting, followed by analysis using gene expression, immune receptors, chromatin accessibility regions or combinations of these modalities. One of the key advantages of single-cell analysis is to dissect the cellular and molecular heterogeneity in a tissue or sample, and even to identify subsets within the same cell type. The identification of cell states using multimodal analysis has been applied to analyze immune cells in healthy samples and in samples from pathogen infection, autoimmune disease and cancer.
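The isolate-then-examine pattern described above can be sketched in a few lines: cells are first gated on one modality (surface protein signal) and the gated cells are then summarized in a second modality (gene expression). This is a minimal illustration on toy data. The cut-offs, the record layout and the function names `gate` and `mean_expression` are assumptions made for the example, not part of any published pipeline.

```python
from statistics import mean

# Toy cells (hypothetical values): surface protein signal and gene expression.
cells = [
    {"protein": {"CD56": 8.1, "CD16": 7.4}, "rna": {"GZMB": 12, "PRF1": 9, "CCR7": 0}},
    {"protein": {"CD56": 7.6, "CD16": 6.9}, "rna": {"GZMB": 10, "PRF1": 7, "CCR7": 1}},
    {"protein": {"CD56": 0.4, "CD16": 0.2}, "rna": {"GZMB": 1, "PRF1": 0, "CCR7": 6}},
    {"protein": {"CD56": 0.9, "CD16": 0.5}, "rna": {"GZMB": 0, "PRF1": 1, "CCR7": 8}},
]

def gate(cells, thresholds):
    """Modality 1: keep cells whose protein signal exceeds every threshold."""
    return [c for c in cells
            if all(c["protein"].get(marker, 0.0) > cutoff
                   for marker, cutoff in thresholds.items())]

def mean_expression(cells, genes):
    """Modality 2: average expression of each gene over a set of cells."""
    return {g: mean(c["rna"].get(g, 0) for c in cells) for g in genes}

# Arbitrary illustrative cut-offs; real gates are chosen from the data.
nk_like = gate(cells, {"CD56": 5.0, "CD16": 5.0})
gated_ids = {id(c) for c in nk_like}
others = [c for c in cells if id(c) not in gated_ids]

genes = ["GZMB", "PRF1", "CCR7"]
print("gated :", mean_expression(nk_like, genes))
print("others:", mean_expression(others, genes))
```

With sequencing-based protein measurements, the gate is applied computationally in the same way that a sorter applies it physically, which is why the two steps can be repeated across modalities.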
Identification of cell state in healthy samples
Single-cell multimodal analysis can be used to isolate cell subsets and characterize their molecular signatures from healthy samples, which can then be used as a baseline reference when comparing with immune cells from disease samples. For example, a recent study explored T-cell composition in lymphoid and nonlymphoid tissues from both healthy humans and mice.8 By combining single-cell gene expression with T-cell receptor (TCR) sequences, this study showed distinctive signatures between regulatory and memory subsets across lymphoid and nonlymphoid tissues, and also similar subsets of regulatory T cells across humans and mice. Unexpectedly, this integrated analysis also revealed that the same T-cell clones (i.e. with identical TCR) could be identified in lymphoid and nonlymphoid samples, thus suggesting migration of regulatory cells between organs.
Single cells can be separated using high-purity fluorescence-activated cell sorting into wells, after which mRNA or DNA is extracted for single-cell analyses. This is the case for plate-based approaches such as Smart-seq2
Table 1. Overview of the current applications of single-cell multimodal analysis to study immune cells

| Applications to immunology | Layers marked (Genome / Epigenome / Transcriptome / Targeted proteins) | Component details | Reference |
| Molecular cell state | ✓✓ | Surface protein (sorting) + mRNA | 8 |
| | ✓✓ | Surface protein (sorting) + TF binding + chromatin accessibility + clone (TCR) | 29 |
| | ✓✓ | Surface protein (barcode) + mRNA | 10,51 |
| | ✓✓ | Surface protein (barcode) + mRNA + clone (BCR and TCR) | 26 |
| | ✓✓ | Intracellular protein (sorting) + mRNA | 13 |
| | ✓ | mRNA + clone (TCR) | 15–19 |
| | ✓✓✓ | Surface protein (sorting) + mRNA + somatic mutations + clone (BCR) | 27 |
| | ✓✓✓ | Surface protein (sorting) + mRNA + somatic mutations | 30 |
| | ✓✓ | Surface protein (sorting) + TF binding + chromatin accessibility + clone (TCR) | 29 |
| Temporal | ✓✓ | Surface protein (sorting) + mRNA + pseudotime (mRNA) | 5,6 |
| | ✓✓ | Surface protein (sorting) + chromatin accessibility + pseudotime (chromatin accessibility) | 36 |
| | ✓ | mRNA + pseudotime (mRNA) | 35,62 |
| | ✓ | mRNA + pseudotime (mRNA) + clone (TCR) | 44 |
| | ✓✓ | Surface protein (sorting) + mRNA + clone (barcode) | 38 |
| | ✓✓ | Surface protein (sorting) + mRNA + clone (genetics) | 63 |
| | ✓✓✓ | Surface protein (sorting) + mRNA + chromatin accessibility + clone (mitochondrial DNA) | 34 |
| | ✓✓ | Surface protein (sorting) + chromatin accessibility + clone (mitochondrial DNA) | 39 |
| | ✓✓ | Surface protein (sorting) + mRNA + clone (TCR) | 33 |
| | ✓✓ | Surface protein (sorting) + mRNA + clone (BCR) | 37 |
| | ✓✓ | Surface protein (sorting) + mRNA + clone (Ag-specific TCR) | 64,65 |
| Spatial | ✓✓ | Surface protein (sorting) + mRNA + spatial (cell location) | 41 |
| | ✓ | mRNA + spatial (cell–cell) | 3,4,43,44 |

BCR, B-cell receptor; mRNA, messenger RNA; TCR, T-cell receptor; TF, transcription factor.
molecular signatures of influenza-specific CD8+ T cells
across different stages of infection.
The importance of single-cell multimodal analysis has
led to several recent studies of coronavirus disease 2019
(COVID-19). Single-cell analysis of both gene expression
profile and immune receptor sequencing has also been
performed on bronchoalveolar lavage fluids from patients
with mild or severe disease.25 This analysis revealed that
patients with mild COVID-19 disease were characterized
by highly clonally expanded CD8+ T cells, and that
proinflammatory monocyte-derived macrophages were
abundant in the bronchoalveolar lavage fluid from severe
COVID-19 cases. The use of proteomics, gene expression
and clonal information has also been investigated.26 Here,
surface protein using CITE-Seq, in addition to scRNA-seq
and B-cell receptor and TCR information, was used to
investigate the peripheral blood mononuclear cells of
COVID-19 patients. These authors showed that a pre-
exhaustion phenotype in HLA-DR+CD38+-activated T
cells and an anti-inflammatory signature in monocytes
are associated with progressive disease, whereas a TCR
and B-cell receptor analysis revealed a skewed clonal
distribution of CD8+ T- and primary B-cell response.
Single-cell multimodal analyses have been recently
applied for the first time in rare pathogenic B cells secreting
autoantibodies in the context of Sjögren syndrome.27 In this
study, B cells were first sorted as CD19+CD27+IgD−
memory cells from patients with Sjögren syndrome, to
isolate clonally related cells responsible for autoantibodies
associated with cryoglobulinemic vasculitis. By utilizing
single-cell genome and transcriptome sequencing,28 full-
length gene expression data from each cell were analyzed
with VDJPuzzle22 to reconstruct the full-length heavy and
light chains of immunoglobulin B cell secreting
autoantibodies, thus demonstrating the expansion of a
single “rogue” clone dominating the observed phenotype.
Single-cell DNA was then utilized to identify lymphoma
driver somatic mutations present only within the rogue
clone of autoantibody-forming B cells. This study provided
the first direct evidence that somatic mutations drive loss of
tolerance and disease pathogenesis.
Single-cell multimodal analysis has also been useful to
investigate the epigenetic profile of T-cell subsets and
their clonal expansion in the context of leukemia.29 By
combining assay for transposase-accessible chromatin
using sequencing with TCR sequencing, this study first
identified regulatory elements and transcription factors
associated with each canonical T-cell subset in healthy
donors. Surprisingly, this study found that the epigenetic
profiles of canonical T cell subsets form a continuum of
states, suggesting significant regulatory variability within
cell surface marker-defined subpopulations. By applying
this approach to T cells derived from leukemia patients
parameters and outcome. For example, a study performed proteomic and transcriptomic analysis using CITE-Seq and scRNA-seq from the peripheral blood mononuclear cells of healthy individuals vaccinated with influenza or yellow fever vaccine.14 This analysis revealed a distinctive baseline signature across low and high responders following vaccination. Within each cell type identified by CITE-Seq protein data, gene expression was used to identify significant differences between low and high responders within the plasmacytoid dendritic cell and lymphocyte clusters, suggesting that people who respond well to vaccines have a distinct activation status of cells at baseline (i.e. before vaccination).
Single-cell multimodal analysis has also been utilized to simultaneously study gene expression and clonal expansion of T cells and B cells. For instance, gene expression and immune receptor sequencing from both of these subsets were simultaneously measured from peripheral blood mononuclear cells of patients with metastatic melanoma treated with anti-CTLA-4 and anti-PD-1 immunocheckpoint blockade.15 By employing machine learning techniques, the authors of this study showed that a clonally expanded subset of peripheral CD8+ T cells was associated with a long-term treatment response. Single-cell gene expression and immune receptor sequencing have also been applied to discover new cell states in cancer such as hepatocellular carcinoma, colorectal cancer and lung cancer,16–18 as well as in tumor infiltrating T cells in the context of novel immunocheckpoint blockade therapies (e.g. in melanoma).19
In the case of viral infections, single-cell multimodal analysis has proven extremely useful in the identification of viral-specific T cells and B cells. These cells are generally found in low numbers within the pool of circulating and resident cells, which poses challenges for their identification and separation for molecular and phenotypic analyses. Single-cell analysis has provided a means to accurately characterize rare cell populations.20 Several teams, including ours, have applied single-cell multimodal analyses to separate viral-specific CD8+ T cells using tetramers and then utilized index sorting and scRNA-seq (Smart-seq2) to simultaneously identify their gene expression and full-length TCR in individuals infected with hepatitis C virus.21,22 These analyses were then used to identify the active and resting subsets within these viral-specific responses, along with their clonal expansion. Similar applications have also been used to study chronic HIV infection, for instance, to demonstrate the existence of HIV-specific CD8+ T cells that recognize epitopes presented by HLA class II instead of class I,23 and influenza-specific CD8+ T cells,24 to reveal evolving
HSCs are stem cells derived from the bone marrow
which give rise to myeloid and lymphoid lineages and are
thus a natural starting point to study pseudotime using
single-cell multimodal analysis. Several works have
utilized the natural inherent relationship between HSCs
and differentiated immune cells, to infer the
differentiation trajectories using single-cell genomics. For
instance, differentiation trajectories were obtained from
scRNA-seq data of HSCs from the bone marrow of
mouse,5,6 which revealed three differentiation
trajectories,5 originating from sorted CD48−CD150+
CD45+EPCR+ HSCs, and ending with erythroid,
granulocytes–macrophage and lymphoid progenitors.
Another natural avenue for pseudotime analysis is the
study of T-cell selection in the thymus. In a recent
study,35 transcriptomic data were obtained from
developing and postnatal thymus and postnatal samples
covering the entire period of active thymic function.
Pseudotime values obtained from scRNA-seq revealed
developmental marker genes for different cell types
during T-cell development, such as ST18 for early double
negative, and AQP3 for double positive. The TCR was
also obtained, which revealed that the dependence of
nonproductive on productive recombination events was
associated with different cell types. For example, there
was a higher amount of fully recombined TCRβ
compared with nonproductive chains in double-negative
stages that dropped to basal levels as cells entered double-
positive stages, thus demonstrating the impact of thymic
selection on the TCR repertoire.
Although pseudotime trajectories have been mostly
derived from scRNA-seq data, this metric can also be
obtained from other modalities, such as single-cell
chromatin accessibility data. For example, in a recent
study, cellular populations were sorted from CD34+
human bone marrow cells, including myeloid, erythroid
and lymphoid lineages.36 Pseudotime was then generated
from chromatin accessibility data, which showed motif
accessibility dynamics along myeloid cell differentiation.
For example, they showed that accessibility at
transcription factor motifs associated with HOXB8 and
GATA1 was high in HSC and decreased through
differentiation to common myeloid progenitors.
Clonal differentiation
Although trajectory analyses using pseudotime have
provided important insights into cell state differentiation,
these approaches have limitations in revealing the true
cell lineage endpoint. To achieve this goal, novel methods
have been developed which can identify clonal markers in
individual cells, in addition to also measuring “omes”
the authors identified the state of abnormal clones, hence determining the mechanisms driving disease. In a separate study,30 mutations from scRNA-seq data were used to identify and isolate three clones in a bone marrow sample from a patient with acute myeloid leukemia. Gene expression was then used to identify the cell-type compositions of these clones, determining that these clones belonged to progenitor-like, monocyte-like and dendritic cell-like cells.
TEMPORAL ANALYSES
As discussed, multimodal measurements of immune cells can lead to a deeper understanding of the heterogeneity inherent in these immune cells, and the changes that a disease can cause to cell state. However, the molecular state of an immune cell is dynamic, changing from HSC generation in the bone marrow to its differentiated fate. Single-cell measurements are crucial in obtaining an accurate estimation of this temporal differentiation. This is because bulk samples contain a mixture of cells at various differentiation stages, thus tracking the average bulk expression across time may not reflect the terminal differentiation trajectory.31 In order to study the evolving state along a cell lineage, the molecular-state modalities will need to be coupled with temporal modalities. We define temporal modalities as information related to the time ordering of cells during their differentiation process. The ideal scenario would be to obtain measurements of cell states belonging to the same clone at different time points. However, this information is not always available, for instance, from cross-sectional studies. Recent single-cell technologies have attempted to address these issues, which has allowed for the study of cell state32 and clonal differentiation.33,34
Molecular cell state differentiation
Numerous algorithms have been proposed to estimate time information for each cell with a metric known as “pseudotime” using either gene expression or chromatin accessibility data.32 This metric describes how a modality changes in a continuous differentiation process along a trajectory. To obtain this trajectory, a dimension reduction step is first performed so that each cell is embedded in a lower dimensional space. A trajectory is then formed in this space, with cells positioned along this trajectory depending on their transcriptional or accessibility profiles.32 Although pseudotime values are only an estimate of how cell state evolves over time, and clonal information is not known for each cell, numerous immunological discoveries have already been made using this metric, which we will now review.
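The two steps described above, dimension reduction followed by positioning cells along a trajectory, can be illustrated with a deliberately simple sketch. Here the "trajectory" is assumed to be a straight line, namely the first principal axis of the data, and pseudotime is each cell's position along that axis rescaled to the range 0 to 1. Published pseudotime tools fit curved or branching trajectories and are considerably more elaborate. The function names, the choice of a root cell and the toy matrix are all assumptions made for illustration.

```python
import math

def first_principal_component(matrix, n_iter=200):
    """Leading principal axis of a cells x genes matrix, via power iteration."""
    n_cells, n_genes = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_cells for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in matrix]
    # Generic, deterministic starting direction (an arbitrary choice).
    axis = [float(j + 1) for j in range(n_genes)]
    norm = math.sqrt(sum(a * a for a in axis))
    axis = [a / norm for a in axis]
    for _ in range(n_iter):
        # Multiply by the covariance matrix without forming it: X^T (X v).
        scores = [sum(r[j] * axis[j] for j in range(n_genes)) for r in centered]
        new_axis = [sum(centered[i][j] * scores[i] for i in range(n_cells))
                    for j in range(n_genes)]
        norm = math.sqrt(sum(a * a for a in new_axis))
        if norm == 0:
            break
        axis = [a / norm for a in new_axis]
    return centered, axis

def pseudotime(matrix, root_cell=0):
    """Scale each cell's position along the first principal axis to [0, 1]."""
    centered, axis = first_principal_component(matrix)
    scores = [sum(r[j] * axis[j] for j in range(len(axis))) for r in centered]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    scaled = [(s - lo) / (hi - lo) for s in scores]
    # Orient the axis so the chosen root cell sits nearer pseudotime 0.
    if scaled[root_cell] > 0.5:
        scaled = [1.0 - s for s in scaled]
    return scaled

# Toy expression matrix (hypothetical): rows are cells, columns are genes.
# A progenitor gene falls and a lineage gene rises; the third gene is flat.
cells = [
    [9.0, 1.0, 5.0],
    [7.0, 3.0, 5.1],
    [5.0, 5.0, 4.9],
    [3.0, 7.0, 5.0],
    [1.0, 9.0, 5.0],
]
print([round(t, 2) for t in pseudotime(cells, root_cell=0)])
```

The root cell is needed because a principal axis has no inherent direction; in practice the starting population (for example, sorted HSCs) plays this role.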
transposase-accessible chromatin using sequencing) to
cultured CD34+ HSCs, collected over the course of
20 days. Mitochondrial DNA was extracted, which
incrementally accumulates genetic mutations passed onto
daughter cells, and subsequently used for lineage tracing.
Combining lineage tracing with chromatin profiles
revealed possible fates of HSPCs, in particular
distinguishing bipotent progenitors from those biased in
favor of an erythroid versus monocytic fate. Lineage
tracing using mitochondrial DNA has also been applied
to study acute myeloid leukemia.39 Clones were first
isolated based on mutations in the mitochondrial DNA
from assay for transposase-accessible chromatin using
sequencing data, taken from primary blood samples of a
patient with acute myeloid leukemia. This allowed new
insights into “preleukemic” HSCs, adding to the evidence
that this cell population is heterogeneous with multiple
clones, and that the lineage giving rise to acute myeloid
leukemia is not the lineage with the optimal potential
among pluripotent HSCs.
SPATIAL ANALYSES AND CELL–CELL COMMUNICATION
Multimodal applications at the single-cell level have also
been applied to study how molecular-state modalities are
affected by spatial modalities, in particular a cell’s spatial
location within a tissue, and its location relative to other
cells. Technologies which measure spatial location are
becoming increasingly available,40,41 and some of these
have already been applied in immunology. For example,
single-cell spatially resolved transcriptomics was applied
to mouse bone marrow niches using an improved version
of laser-capture microdissection coupled with
sequencing.41 This allowed the transcriptional profile of
major bone marrow cell types to be determined, and
their spatial location in distinct bone marrow niches.
This analysis also showed that Cxcl12-abundant-reticular
cell subsets differentially localize to sinusoidal and
arteriolar surfaces and act locally as “professional
cytokine-secreting-cells.”
Some studies have utilized single-cell multiomics to
investigate cell–cell interactions, and these approaches have recently been applied to
COVID-19. For example, cell–cell interaction was
estimated using CellPhoneDB,42 which was applied to
scRNA-seq data obtained from nasopharyngeal and
bronchial samples in patients with moderate or critical
disease.43 This analysis revealed a higher number of
epithelium–immune cell interactions in patients with
critical COVID-19, in particular for CD8+ T cells,
nonresident macrophages and monocyte-derived
macrophages, thus likely contributing to clinical
observations of heightened inflammatory tissue damage.
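Receptor–ligand approaches of the kind described above generally combine the expression of a ligand in one cell type with the expression of its receptor in another, and then ask whether that combined score is higher than expected when cell-type labels are shuffled. The sketch below follows this general idea on toy data. It is a simplified illustration and not the CellPhoneDB API; the gene names `LIG` and `REC`, the function names and all values are hypothetical.

```python
import random
from statistics import mean

def pair_score(cells, labels, ligand, receptor, sender, receiver):
    """Average of mean ligand expression in sender and mean receptor expression in receiver."""
    lig = [c[ligand] for c, lab in zip(cells, labels) if lab == sender]
    rec = [c[receptor] for c, lab in zip(cells, labels) if lab == receiver]
    if not lig or not rec:
        return 0.0
    lig_mean, rec_mean = mean(lig), mean(rec)
    # Treat the pair as absent if either partner is not expressed at all.
    if lig_mean == 0 or rec_mean == 0:
        return 0.0
    return (lig_mean + rec_mean) / 2

def permutation_p_value(cells, labels, ligand, receptor, sender, receiver,
                        n_perm=1000, seed=0):
    """Share of label shuffles that score at least as high as the observed labels."""
    rng = random.Random(seed)
    observed = pair_score(cells, labels, ligand, receptor, sender, receiver)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pair_score(cells, shuffled, ligand, receptor, sender, receiver) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): epithelial cells express the ligand, T cells the receptor.
cells = ([{"LIG": v, "REC": 0} for v in (6, 7, 5, 8, 6, 7)]
         + [{"LIG": 0, "REC": v} for v in (5, 4, 6, 7, 5, 6)])
labels = ["epithelium"] * 6 + ["T cell"] * 6

score, p_value = permutation_p_value(cells, labels, "LIG", "REC", "epithelium", "T cell")
print(score, p_value)
```

Counting significant pairs per pair of cell types, as in the COVID-19 studies discussed here, would then amount to repeating this test over a curated list of receptor–ligand pairs.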
and temporal information in the same single cell. We will discuss several of these approaches.
The most natural way to track clones in T cells and B cells is by their unique cell receptor. In the context of cellular immunotherapies, such as chimeric antigen receptor (CAR) T cells, single-cell multimodal analysis has recently been applied to study clonality, gene signatures and kinetics. TCRs were used to track CAR-T cells in patients undergoing anti-CD19 CAR-T immunotherapy in leukemia, in order to understand characteristics of clonally expanded CAR-T cells.33 In this study, CAR-T cells were sorted from blood samples from patients with B-cell acute lymphoblastic leukemia or chronic lymphocytic leukemia to isolate CD8+ CAR-T cells using a truncated version of the epidermal growth factor receptor, which is coexpressed with the CAR on the T-cell surface. A decrease in TCR diversity was observed after CAR-T infusion, suggesting that CAR-T cells underwent clonal expansion. Gene expression analysis showed that clones which increased in frequency after infusion displayed higher expression of cytotoxic genes. Gene expression analysis of the infusion product showed distinct clusters distinguished by expression of activation, cytotoxicity, mitochondrial and cell cycle-associated genes. Tracking clones via their immune receptor has also been applied to autoimmune diseases.37 Transitional IgDlow B cells were first sorted from peripheral blood mononuclear cells collected longitudinally from patients with myasthenia gravis who relapsed after treatment with rituximab, a B-cell-depleting drug. B-cell receptor clones were then isolated using the gene expression data, which were shown to be related to clones identified previously from untreated patients. This then allowed identification of persistent B cells. Clustering using gene expression revealed 820 persistent clones in both memory B-cell and antibody-secreting cell clusters.
A recent approach has been to identify clones with “barcodes,” which can be identified at the single-cell level. This approach has been applied to HSCs using a lentiviral delivery system.38 Cells cultured in vitro and cells transplanted in vivo were collected over several days, and then sorted to isolate oligopotent and multipotent progenitor cells using flow cytometric markers. In this study, the early transcriptional signature of HSCs was linked to the clonal fates via barcoding. This high-throughput system allowed mapping of more than 300 000 cells and 10 968 distinct clones, and identified genes correlating with fate, revealing two routes of monocyte differentiation that give rise to distinct subsets in immune compartments.
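Whether clones are defined by an immune receptor sequence or by a synthetic barcode, the analysis usually begins by grouping cells that share a clone identifier and summarizing the resulting clone sizes. The sketch below computes Shannon diversity and a derived clonality index, under which lower diversity is read as a sign of clonal expansion. The clone identifiers and the pre- and post-infusion samples are hypothetical, and the clonality definition (one minus normalized Shannon entropy) is one common convention, assumed here for illustration.

```python
import math
from collections import Counter

def shannon_diversity(clone_ids):
    """Shannon entropy (natural log) of the clone frequency distribution."""
    counts = Counter(clone_ids)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one cell is required")
    return -sum((n / total) * math.log(n / total) for n in counts.values())

def clonality(clone_ids):
    """1 - normalized entropy: 0 for an even repertoire, near 1 when one clone dominates."""
    n_clones = len(set(clone_ids))
    if n_clones == 1:
        return 1.0  # a single clone is treated as fully clonal
    return 1.0 - shannon_diversity(clone_ids) / math.log(n_clones)

# Toy samples (hypothetical): one clone identifier per cell.
pre_infusion = ["A", "B", "C", "D", "E", "F"]
post_infusion = ["A", "A", "A", "A", "B", "C"]

for name, sample in (("pre", pre_infusion), ("post", post_infusion)):
    sizes = Counter(sample).most_common()
    print(name, sizes, round(shannon_diversity(sample), 3), round(clonality(sample), 3))
```

Because every cell also carries a gene expression profile, the same clone identifiers can be used to compare expression between expanded and non-expanded clones.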
Another promising approach to track clones is the use of mitochondrial DNA.34 This was performed by applying single-cell chromatin accessibility assay (assay for
to identify the target genes which are linked to a
transcription factor, as this can lead to a better
understanding of the molecular network modules that
drive immune-cell lineages and their differentiation. Two
multimodal technologies can potentially address this
issue. The first is thiol(SH)-linked alkylation for the
metabolic sequencing of RNA, which integrates scRNA-
seq with metabolic RNA labeling to provide two
modalities in the transcriptome: total RNA levels and
recently transcribed RNA.47 When combined with
perturbation methods, thiol(SH)-linked alkylation for the
metabolic sequencing of RNA can identify target genes of
transcriptional regulators, as has been shown in cancer
cells.48 Single-nucleus chromatin accessibility and mRNA
expression sequencing can also be used to understand
gene regulation, as this technology provides high-
throughput sequencing of the transcriptome and
chromatin accessibility in the same cell.49 Other advances
can be applied for a different problem of characterizing
molecular cell state, and include those utilizing
transcriptomics (mRNA) as one of its modalities, in
addition to either DNA,50 protein expression,10,51
chromatin accessibility52,53 or DNA methylation.54–57
Lineage tracing of immune cells that do not carry a
natural bar code, such as TCR or B-cell receptor, can also
benefit from recent technologies using synthetic barcodes,
such as CellTagging which uses a lentiviral approach.58
This approach offers advantages over alternate
approaches where gene editing is challenging.58
Despite the rapid increase of single-cell multimodal
approaches, several computational and technical caveats
still need to be addressed for optimal analysis of these
data. For example, the sequencing output may be too
shallow to identify the immune receptor, and optimal
gene expression requires a deeper coverage to control for
technical noise and drop out of low-expressing genes.59
Similarly, better tools with deeper sequencing are
required for identification of more complex gene
expression quantities, such as isoforms. Multimodal
technologies also carry a substantial level of technical
noise, which can blur the true biological variation that
exists,1 for instance, between gene and protein expression,
and this is an important area for future work. Finally, a
significant challenge is the development of bioinformatics
tools to permit integration of these data, as recently
reviewed.1
The rapid growth of single-cell multimodal
technologies has also generated debate about precisely
how a “mode” or an “omic” is defined. We
have opted for a broad definition, considering a mode as
any type of information from the same single cell, which
is consistent with a previous definition in a highly cited
review.1 We have thus included temporal and spatial
Both spatial and temporal single-cell multimodal analyses have also been performed on bronchoalveolar lavage fluid from patients with mild and critical COVID-19.44 TCR clonal information, gene expression and pseudotime analysis revealed that patients with mild COVID-19 were characterized by fully differentiated resident memory T cells undergoing active clonal expansion, whereas in patients with critical COVID-19, these resident memory T cells failed to differentiate or expand. In the same study, the authors also applied CellPhoneDB to show differences in immune cell-type interactions between mild and severe COVID-19. For example, they showed that interactions between monocytes/macrophages and neutrophils almost always involve promigratory interactions in critical COVID-19, but interleukin signaling in mild COVID-19. Other non-COVID applications include studies that examined cellular interactions between melanoma and head and neck cancer cells and various immune cells, including T cells and macrophages,3 and that isolated T-cell subsets from transcriptomics data.4
CONCLUSIONS AND FUTURE DIRECTIONS
Single-cell multimodal technologies have led to exciting discoveries on the mechanisms underpinning the immune system. We have highlighted some of these discoveries, which have provided insight into the molecular state of immune cells, how these states evolve over time and the impact of spatial location. These technologies are becoming increasingly available and easy to apply, as exemplified by the recent publications in the field of COVID-19 research in the last few months.25,26,43,44
These technologies can also be used to answer more general questions, such as quantifying the relationship between transcripts and proteins, or dissecting the landscape of post-translational modifications. In this area, there remains more work to be done, as shown by recent studies investigating promoter accessibility–gene expression45 and gene expression–protein46 correlation. In immunology, there have already been significant advances in the last decade using single-cell multimodal analysis, as we have reviewed. However, despite these achievements, there remain promising avenues to be explored. For example, it is conceivable that, with the development of new multimodal technologies, further cellular states that comprise a combination of modalities across different “omes” will be discovered. Combined with spatial modalities, these cellular states may also prove to be spatially dependent. Other promising avenues will now be discussed.
Novel multimodal technologies are being proposed every year, with some yet to be applied to immune cells. For example, an important and yet unresolved problem is
1. Stuart T, Satija R. Integrative single-cell analysis. Nat Rev Genet 2019; 20: 257–272.
2. Manno GL, Soldatov R, Zeisel A, et al. RNA velocity of single cells. Nature 2018; 560: 494–498.
3. Ren X, Zhong G, Zhang Q, Zhang L, Sun Y, Zhang Z. Reconstruction of cell spatial organization from single-cell RNA sequencing data based on ligand-receptor mediated. Cell Res 2020; 30: 763–778.
4. Braga FV, Kar G, Berg M, Carpaij O, Polanski K. A cellular census of healthy lung and asthmatic airway wall identifies novel cell states in health and disease. Nat Med 2019; 25: 1153–1163.
5. Nestorowa S, Hamey FK, Pijuan Sala B, et al. A single-cell resolution map of mouse hematopoietic stem and progenitor cell differentiation. Blood 2016; 128: 20–32.
6. Tikhonova AN, Dolgalev I, Hu H, et al. The bone marrow microenvironment at single-cell resolution. Nature 2019; 569: 222–228.
7. Macaulay IC, Ponting CP, Voet T. Single-cell multiomics: multiple measurements from single cells. Trends Genet 2017; 33: 155–168.
8. Miragaia RJ, Gomes T, Chomka A, et al. Single-cell transcriptomics of regulatory T cells reveals trajectories of tissue adaptation. Immunity 2019; 50: 493–504.
9. Hwang B, Lee JH, Bang D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp Mol Med 2018; 50: 1–14.
10. Stoeckius M, Hafemeister C, Stephenson W, et al. Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 2017; 14: 865–868.
11. Mair F, Erickson JR, Voillet V, et al. A targeted multi-omic analysis approach measures protein expression and low-abundance transcripts on the single-cell level. Cell Rep 2020; 31: 1–13.
12. Granja JM, Klemm S, McGinnis LM, et al. Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia. Nat Biotechnol 2019; 37: 1458–1465.
13. Katzenelenbogen Y, Sheban F, Katzenelenbogen Y, et al. Coupled scRNA-Seq and intracellular protein activity reveal an immunosuppressive role of TREM2 in cancer. Cell 2020; 182: 1–14.
14. Kotliarov Y, Sparks R, Martins AJ, et al. Broad immune activation underlies shared set point signatures for vaccine responsiveness in healthy individuals and disease activity in patients with lupus. Nat Med 2020; 26: 618–629.
15. Fairfax BP, Taylor CA, Watson RA, et al. Peripheral CD8+ T cell characteristics associated with durable responses to immune checkpoint blockade in patients with metastatic melanoma. Nat Med 2020; 26: 193–199.
16. Zhang Q, He Y, Luo N, et al. Landscape and dynamics of single immune cells in hepatocellular carcinoma. Cell 2019; 179: 829–845.
17. Wu TD, Madireddi S, de Almeida PE, et al. Peripheral T cell expansion predicts tumour infiltration and clinical response. Nature 2020; 579: 274–278.
18. Guo X, Zhang Y, Zheng L, et al. Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat Med 2018; 24: 978–985.
19. Sade-Feldman M, Yizhak K, Bjorgaard SL, et al. Defining T cell states associated with response to checkpoint immunotherapy in melanoma. Cell 2018; 175: 998–1013.
20. Nguyen A, Phan TG. Single cell RNA sequencing of rare immune cell populations. Front Immunol 2018; 9: 1–11.
information as separate modalities, in addition to different information obtained from the same data set. Temporal and spatial information has also been considered by other authors as its own separate omic or modality.60,61 We envisage that future research in this field will lead to more advanced approaches and methods to better quantify the temporal and spatial modalities of immune cells, and will converge on less confusing terminology, thus increasing the involvement of immunologists in the single-cell field.
Single-cell data sets are growing remarkably fast. For instance, the Human Cell Atlas (https://data.humancellatlas.org) already comprises approximately 2.7 million cells and 19.68 TB of data across multiple tissues in both health and disease. This initiative has already significantly contributed to immunology by providing novel data sets across thymus, spleen and other organs, as well as characterizing novel subsets of lymphocytes and monocytes through development and into adulthood. A major aim of data-gathering initiatives such as the Human Cell Atlas is to increase sample size and permit interrogation of single-cell multimodal data for more complex questions, such as identifying the entire cell composition of the human body, or predicting the clinical outcome in disease with machine learning algorithms. We envisage that single-cell multimodal technologies will pervade basic and translational immunology research and will become a tool to discover mechanisms and new cell states, as well as to mold novel immune therapies to effectively target specific molecular pathways in disease and allow the identification of target cells.
ACKNOWLEDGMENTS
This research was supported by an NHMRC Project grant (APP1121643 to FL). FL is funded by an NHMRC CDA fellowship (APP1128416).
AUTHOR CONTRIBUTIONS
Raymond HY Louie: Conceptualization; Investigation; Writing-original draft; Writing-review & editing. Fabio Luciani: Conceptualization; Investigation; Supervision; Writing-review & editing.
CONFLICT OF INTEREST
The authors declare no conflicts of interest.
REFERENCES
21. Eltahla AA, Rizzetto S, Pirozyan MR, et al. Linking the T cell receptor to the single cell transcriptome in antigen-specific human T cells. Immunol Cell Biol 2016; 94: 604–611.
22. Rizzetto S, Koppstein DNP, Samir J, et al. B-cell receptor reconstruction from single-cell RNA-seq with VDJPuzzle. Bioinformatics 2018; 16: 2846–2847.
23. Ranasinghe S, Lamothe PA, Soghoian DZ, et al. Antiviral CD8+ T Cells restricted by human leukocyte antigen class II exist during natural HIV infection and exhibit clonal expansion. Immunity 2016; 45: 917–930.
24. Wang Z, Zhu L, Nguyen THO, et al. Clonally diverse CD38+HLA-DR+CD8+ T cells persist during fatal H7N9 disease. Nat Commun 2018; 9: 1–12.
25. Liao M, Liu Y, Yuan J, et al. Single-cell landscape of bronchoalveolar immune cells in patients with COVID-19. Nat Med 2020; 26: 842–844.
26. Unterman A, Sumida TS, Nouri N, et al. Single-cell omics reveals dyssynchrony of the innate and adaptive immune system in progressive COVID-19. medRxiv 2020. https://doi.org/10.1101/2020.07.16.20153437. [Epub ahead of print].
27. Singh M, Jackson KJL, Wang JJ, et al. Lymphoma driver mutations in the pathogenic evolution of an iconic human autoantibody. Cell 2020; 180: 878–894.
28. Macaulay IC, Haerty W, Kumar P, et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat Methods 2015; 12: 519–522.
29. Satpathy AT, Saligrama N, Buenrostro JD, et al. Transcript-indexed ATAC-seq for precision immune profiling. Nat Med 2018; 24: 580–590.
30. van Galen P, Hovestadt V, Wadsworth MH, et al. Single-cell RNA-Seq reveals AML hierarchies relevant to disease progression and immunity. Cell 2019; 176: 1265–1281.
31. Trapnell C. Defining cell types and states with single-cell genomics. Genome Res 2015; 25: 1491–1498.
32. Saelens W, Cannoodt R, Todorov H, Saeys Y. A comparison of single-cell trajectory inference methods. Nat Biotechnol 2019; 37: 547–554.
33. Sheih A, Voillet V, Hana L, et al. Clonal kinetics and single-cell transcriptional profiling of CAR-T cells in patients undergoing CD19 CAR-T immunotherapy. Nat Commun 2020; 11: 219.
34. Lareau CA, Ludwig LS, Muus C, et al. Massively parallel single-cell mitochondrial DNA genotyping and chromatin profiling. Nat Biotechnol 2020; https://doi.org/10.1038/s41587-020-0645-6
35. Park JE, Botting RA, Conde CD, et al. A cell atlas of human thymic development defines T cell repertoire formation. Science 2020; 367: eaay3224.
36. Buenrostro JD, Corces MR, Lareau CA, et al. Integrated single-cell analysis maps the continuous regulatory landscape of human hematopoietic differentiation. Cell 2018; 173: 1535–1548.
37. Jiang R, Fichtner ML, Hoehn KB, et al. Single-cell repertoire tracing identifies rituximab-resistant B cells during myasthenia gravis relapses. JCI Insight 2020; 5: 1–18.
38. Weinreb C, Rodriguez-Fraticelli A, Camargo FD, Klein AM. Lineage tracing on transcriptional landscapes links state to fate during differentiation. Science 2020; 367: eaaw3381.
39. Xu J, Nuno K, Litzenburger UM, et al. Single-cell lineage tracing by endogenous mutations enriched in transposase accessible mitochondrial DNA. Elife 2019; 8: 1–14.
40. Codeluppi S, Borm LE, Zeisel A, et al. Spatial organization of the somatosensory cortex revealed by osmFISH. Nat Methods 2018; 15: 932–935.
41. Baccin C, Al-Sabah J, Velten L, et al. Combined single-cell and spatial transcriptomics reveal the molecular, cellular and spatial bone marrow niche organization. Nat Cell Biol 2020; 22: 38–48.
42. Efremova M, Vento-Tormo M, Teichmann SA, Vento-Tormo R. CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes. Nat Protoc 2020; 15: 1484–1506.
43. Chua RL, Lukassen S, Trump S, et al. COVID-19 severity correlates with airway epithelium–immune cell interactions identified by single-cell analysis. Nat Biotechnol 2020; 38: 970–979.
44. Wauters E, Van Mol P, Garg A, et al. Discriminating mild from critical COVID-19 by innate and adaptive immune single-cell profiling of bronchoalveolar lavages. bioRxiv 2020. https://doi.org/10.1101/2020.07.09.196519. [Epub ahead of print].
45. Starks RR, Biswas A, Jain A, Tuteja G. Combined analysis of dissimilar promoter accessibility and gene expression profiles identifies tissue-specific genes and actively repressed networks. Epigenetics and Chromatin 2019; 12: 1–16.
46. Liu Y, Beyer A, Aebersold R. On the dependency of cellular protein levels on mRNA abundance. Cell 2016; 165: 535–550.
47. Herzog VA, Reichholf B, Neumann T, et al. Thiol-linked alkylation of RNA to assess expression dynamics. Nat Methods 2017; 14: 1198–1204.
48. Muhar M, Ebert A, Neumann T, et al. SLAM-seq defines direct gene-regulatory functions of the BRD4-MYC axis. Science 2018; 360: 800–805.
49. Chen S, Lake BB, Zhang K. High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat Biotechnol 2019; 37: 1452–1457.
50. Dey SS, Kester L, Spanjaard B, Bienko M, Van Oudenaarden A. Integrated genome and transcriptome sequencing of the same cell. Nat Biotechnol 2015; 33: 285–289.
51. Shahi P, Kim SC, Haliburton JR, Gartner ZJ, Abate AR. Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding. Sci Rep 2017; 7: 1–12.
52. Liu L, Liu C, Quintero A, et al. Deconvolution of single-cell multi-omics layers reveals regulatory heterogeneity. Nat Commun 2019; 10: 1–10.
53. Cao J, Cusanovich DA, Ramani V, et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 2018; 361: 1380–1385.
54. Hu Y, Huang K, An Q, et al. Simultaneous profiling of transcriptome and DNA methylome from a single cell. Genome Biol 2016; 17: 1–11.
55. Angermueller C, Clark SJ, Lee HJ, et al. Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 2016; 13: 229–232.
56. Clark SJ, Argelaguet R, Kapourani CA, et al. ScNMT-seq enables joint profiling of chromatin accessibility DNA methylation and transcription in single cells. Nat Commun 2018; 9: 1–9.
57. Wang Y, Yuan P, Yan Z, et al. Single-cell multiomics sequencing reveals the functional regulatory landscape of early embryos. bioRxiv 2019. https://doi.org/10.1101/803890. [Epub ahead of print].
58. Kong W, Biddy BA, Kamimoto K, Amrute JM, Butka EG, Morris SA. CellTagging: combinatorial indexing to simultaneously map lineage and identity at single-cell resolution. Nat Protoc 2020; 15: 750–772.
59. Rizzetto S, Eltahla AA, Lin P, et al. Impact of sequencing depth and read length on single cell RNA sequencing data of T cells. Sci Rep 2017; 7: 1–11.
60. Lederer AR, La Manno G. The emergence and promise of single-cell temporal-omics approaches. Curr Opin Biotechnol 2020; 63: 70–78.
61. Bingham GC, Lee F, Naba A, Barker TH. Spatial-omics: Novel approaches to probe cell heterogeneity and extracellular matrix biology. Matrix Biol 2020; 91–92: 152–166.
62. Schulte-Schrepping J, Reusch N, Paclik D, et al. Severe COVID-19 is marked by a dysregulated myeloid cell compartment. Cell 2020; 182: 1–22.
63. Upadhaya S, Sawai CM, Papalexi E, et al. Kinetics of adult hematopoietic stem cell differentiation in vivo. J Exp Med 2018; 215: 2815–2832.
64. Yao C, Sun HW, Lacey NE, et al. Single-cell RNA-seq reveals TOX as a key regulator of CD8+ T cell persistence in chronic infection. Nat Immunol 2019; 20: 890–901.
65. Koutsakos M, Illing PT, Nguyen THO, et al. Human CD8+ T cell cross-reactivity across influenza A, B and C viruses. Nat Immunol 2019; 20: 613–625.
© 2021 Australian and New Zealand Society for Immunology, Inc.
Genomic Cytometry and New Modalities for Deep Single-Cell Interrogation
Robert Salomon,1,2* Luciano Martelotto,3 Fatima Valdes-Mora,4,5 David Gallego-Ortega1,4,6
Abstract
In the past few years, the rapid development of single-cell analysis techniques has allowed for increasingly in-depth analysis of DNA, RNA, protein, and epigenetic states, at the level of the individual cell. This unprecedented characterization ability has been enabled through the combination of cytometry, microfluidics, genomics, and informatics. Although traditionally discrete, when properly integrated, these fields create the synergistic field of Genomic Cytometry. In this review, we look at the individual methods that together gave rise to the broad field of Genomic Cytometry. We further outline the basic concepts that drive the field and provide a framework to understand this increasingly complex, technology-intensive space. Thus, we introduce Genomic Cytometry as an emerging field and propose that synergistic rationalization of disparate modalities of cytometry, microfluidics, genomics, and informatics under one banner will enable massive leaps forward in the understanding of complex biology. © 2020 International Society for Advancement of Cytometry
Key terms
genomic cytometry; technology; cytometry; genomics; microfluidics; single-cell
THE cell is the basic unit of life and is capable of a vast array of biological complexity. In order to understand how different populations of cells can functionally coexist to form organs, organisms, and indeed disease, it is critical to profile all aspects of the individual cells. The capability to perform in-depth single-cell analysis has provided us with a more complete understanding of disease, development, and normal function. Moreover, the application of single-cell genomic technologies has already identified many of the molecular features of cell populations within tissues, organs, and diseases.
Techniques that together comprise the field of Genomic Cytometry have already been used to reveal a fundamental aspect of biology: most notably, that cell populations are more heterogeneous than previously imagined. Each individual cell is unique in terms of space (e.g., physical position in tissues and/or organs), time (e.g., phases of cell cycle, activation or developmental state), and molecular profile. This uniqueness makes understanding the underlying biology a significant challenge.
While in the past, scientists could interrogate, enumerate, and classify cell types according to their appearance under the microscope, this analysis is limited in the number of characteristics that can be simultaneously probed and in the rate at which observations can be made, and it has relied heavily on the individual interpreting the data. Modern flow cytometry emerged to give an additional level of detail to the classification process. By making use of multi-parameter, multi-laser instruments, flow cytometry has redefined cell classification at the molecular level, aided the discovery and definition of major and minor cell subsets, and has quickly become an essential tool for dissecting the functional complexity of cell populations. In general, however, it is primarily used to identify cellular protein expression profiles and despite being able
1 Institute for Biomedical Materials and Devices, The University of Technology Sydney, Ultimo, New South Wales, 2006, Australia
2 ACRF Child Cancer Liquid Biopsy Program, Children’s Cancer Institute. Lowy Cancer Research Centre, University of New South Wales (UNSW) Sydney, Randwick, New South Wales, 2031, Australia
3 Centre for Cancer Research, University of Melbourne, Parkville, Victoria, Australia
4 St Vincent’s Clinical School, Faculty of Medicine, University of New South Wales (UNSW) Sydney, Darlinghurst, New South Wales, 2010, Australia
5 Cancer Epigenetic Biology and Therapeutics. Personalised Medicine Theme. Children’s Cancer Institute. Lowy Cancer Research Centre, University of New South Wales (UNSW) Sydney, Randwick, New South Wales, 2031, Australia
6 Tumour Development Lab, The Kinghorn Cancer Centre, Garvan Institute of Medical Research, Darlinghurst, New South Wales, 2010, Australia
Received 6 November 2019; Revised 28 June 2020; Accepted 7 August 2020
*Correspondence to: Robert Salomon, Institute for Biomedical Materials and Devices, The University of Technology Sydney, Ultimo New South Wales 2006, Australia Email: [email protected] [email protected]
Published online 5 September 2020 in Wiley Online Library (wileyonlinelibrary.com)
DOI: 10.1002/cyto.a.24209
REVIEW ARTICLE
to process many millions of cells in a rapid manner, it is also hampered by limited dimensionality and the inherent loss of anatomical context.
Limits around fluorochrome uniqueness and detector numbers result in characterization ability topping out around the 30-parameter mark. In line with advances in fluorescent cytometry and the development of spectral cytometers (1), instruments such as the CYTOF (2) have matured and are now capable of 40+ parameters (3, 4). While there is some debate around the benefits and trade-offs associated with the use of mass cytometry (5), the advent of scanning ablation and ion beam systems (6–8) has helped to bridge the gap between imaging and flow cytometry. In doing so, they have provided tools that allow 2D reconstruction of tissue sections such that the anatomical location of protein expression can be resolved down to the micrometer range. A recent study by Keren et al. has improved this resolution down to 260 nm (9). These imaging systems, however, tend to be much slower than traditional cytometry and have their own unique challenges.
Given that even the most advanced methods in fluorescence and mass cytometry are still limited, it is clear that new methods must emerge to allow deep single-cell characterization. In order to be widely applicable in biological studies, these systems should provide throughputs similar to current fluorescent flow cytometric techniques while also providing improved dimensionality (hundreds to thousands of parameters simultaneously). By combining advances in cytometry with the tools emerging from the field of single-cell genomics, we are entering a new era of Genomic Cytometry. With the tools and workflows being created by today’s emerging genomic cytometrists, we now can understand, in a concerted manner, many aspects of individual cells.
Genomic Cytometry techniques, while focused on the single cell, allow us to identify and characterize a group of single cells that share a similar function. The characteristics able to be probed are no longer limited to protein expression profiles, but now include aspects such as DNA, RNA, proteins, metabolites, and even epigenetic modifications. This unprecedented ability to sensitively interrogate large numbers of individual cells at a reduced cost is accelerating discovery and challenging existing paradigms in cytometry. Perhaps more importantly, this technological leap is transforming how we understand basic and translational biology.
The single-cell multi-omics revolution has fostered a parallel development of computational approaches, necessary to integrate and understand the data generated from single-cell genomic techniques. These methodologies and approaches have been described elsewhere (10–14). In this review, we analyze the factors and motivations that have given rise to the field of Genomic Cytometry. We also provide an overview of the tools currently available in this space.
UNRAVELING CELLULAR COMPLEXITY
Cells are complex assemblies of macromolecules and chemicals that function as a single unit during homeostasis, development, and disease. Currently, there is a multitude of different tools and methodologies that can be used to characterize a cell. These tools can measure the physical characteristics of a cell (such as size, deformability, electrical impedance, and density) as well as biochemical aspects such as DNA, RNA, and protein (concentration, monomer composition, and chemical status including mutation, acetylation, phosphorylation, methylation, etc.). Importantly, emerging tools are increasingly allowing the simultaneous characterization of these parameters. This is known as multi-omics.
Although many aspects of a cell are able to be assessed by traditional cytometry, it has primarily been leveraged to characterize protein expression profiles at the level of the individual cell. From the early advent of fluorescence cytometry in the late 1960s (15) and cell sorting in 1965 (16), the underlying technology has remained relatively static. Instrument manufacturers have added additional laser lines and increased detector numbers in order to improve multiplexed single-cell characterization; however, flow cytometry is still hampered by a lack of spectrally unique fluorochromes. Recent developments in dye technology, particularly around tunable polymer-based dyes (17), have allowed flow cytometry assays to reach into the 28 color range (18–22). However, if we look at the total cellular complexity, it is clear that even high dimensional fluorescent flow cytometry is incapable of completely characterizing the full range of cellular identities and cellular states.
To understand the challenge of fully characterizing a single cell, we must look at the complexity within cells (Table 1). The human genome is composed of 3 billion nitrogenous bases. These are structurally organized into regions that can be transcribed to RNA and subsequently translated to protein. These regions are known as genes. Although there is still conjecture around the number of genes (24, 25), studies suggest that the number of human genes sits somewhere in excess of 19,000 (26–29). From the estimated 19,000 protein-coding genes, it is possible to make many different proteins; some authors suggest as many as 100 different proteins can be made from
Table 1. Potential complexity of the individual human cell

MEASURABLE CHARACTERISTIC                              ESTIMATED OBSERVABLE NUMBERS
DNA                                                    3,000,000,000 (bases)
Epigenetic states
  Open chromatin regions (enhancers and promoters)     100,000–150,000 (peaks)
  DNA methylation                                      25,000 (CpG islands)
  Three-dimensional genome architecture                ~7,000,000 (long-range contacts) (23)
RNA                                                    19,000 (coding genes)–100,000 (noncoding RNAs)
Proteins                                               >19,000
CD markers                                             >400
each gene (30). To date, the Human Cell Differentiation Molecule (HCDM) group has defined over 400 cluster of differentiation markers (31).
In addition to regions that code for proteins, noncoding regions of the DNA also exist. These regions include enhancers, insulators, and promoters, which are key for gene expression regulation and thus important markers of cell type. Epigenetic mechanisms such as DNA methylation, histone post-translational modifications, expression of noncoding RNAs, three-dimensional structure, and nucleosome positioning all shape the conformation of the chromatin to regulate gene transcription, adding an additional layer of complexity to the characteristics of the cell (32).
COMMON GENOMIC CYTOMETRY APPROACHES
Broadly speaking, it is possible to arrange Genomic Cytometry techniques into five main methodology categories. These categories, shown in Figure 1, are:
1. Plate-based approaches (making use of traditional Fluorescence-Activated Cell Sorting [FACS]).
2. Microfluidics: (1) droplet-based microfluidics (aqueous reaction chambers created as water-in-oil droplets); and (2) solid microfluidics (miniaturized single-cell handling tools with associated molecular workflows for downstream characterization).
3. In situ combinatorial indexing (using the cell as the reaction chamber itself).
4. Image-based approaches (making use of direct imaging or spatially traceable barcodes to create high dimensional, anatomically relevant images).
5. Spatial transcriptomics (combining basic imaging with novel positionally traceable cellular barcodes).
Plate-Based Approaches
Plate-based assays are the most familiar to the traditional cytometrist and are one of the few high throughput Genomic Cytometry methods that currently allow active single-cell deposition. Active cell deposition is usually achieved using FACS, which allows selective deposition of cells based on characteristics measurable by traditional flow cytometry techniques.
Mechanically, plate sorting is most commonly achieved through the use of electrostatic droplet-based cell sorting. In these systems, single cells are passed sequentially through an interrogation point, characterized and deflected into the well of a microtiter plate. By incorporating a system capable of moving the microtiter plate with repeated micron-level accuracy, it is possible to target individual wells sequentially. Cells are deposited into 96- or 384-well microtiter plates; however, in some cases higher density plates can be used. In addition to ensuring the target cell is deposited into the correct well, most instruments will allow the operator to control the likelihood that (1) a cell is in the deflected drop, (2) more than one target cell is not deposited, and (3) the nontarget cell contamination is minimized. For traditional FACS, these are controlled through the application of a sort mask and allow the operator to balance cellular throughput and deflection accuracy with the requirements of high-speed cell sorting.
As current cytometers have not yet overcome the randomness of cell arrival times, many cells that meet the selection criteria are not deposited into the sort well. Single-cell masks look at the predicted position of the cell in the individual drop and will abort the sort if the cell is located in either the leading or trailing edge of the drop. This means that single cells on the periphery of the drop are not deflected and adds to the cell losses associated with the requirement to abort sort packets that contain coincident events. The ability to deterministically control cell location with relation to time and space will remove inefficiencies associated with the Poisson distribution of cells in drops and will result in higher throughput, lower loss single-cell approaches, while still retaining the characterization complexity afforded by traditional FACS.
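The Poisson loading described above can be made concrete with a short calculation. The sketch below is illustrative only: it assumes cells arrive independently, so that the number of cells per drop follows a Poisson distribution with mean `mean_cells_per_drop`, and it ignores the leading- and trailing-edge aborts applied by single-cell masks. The function and parameter names are hypothetical and do not correspond to any instrument software.

```python
import math


def drop_occupancy(mean_cells_per_drop: float) -> dict:
    """Poisson probabilities that a drop holds 0, exactly 1, or 2+ cells.

    Assumption: independent cell arrivals (Poisson loading).
    """
    if mean_cells_per_drop < 0:
        raise ValueError("mean_cells_per_drop must be non-negative")
    p_empty = math.exp(-mean_cells_per_drop)
    p_single = mean_cells_per_drop * p_empty
    p_multiple = 1.0 - p_empty - p_single
    return {"empty": p_empty, "single": p_single, "multiple": p_multiple}


def coincidence_fraction(mean_cells_per_drop: float) -> float:
    """Fraction of occupied drops that contain more than one cell."""
    occupancy = drop_occupancy(mean_cells_per_drop)
    occupied = 1.0 - occupancy["empty"]
    return occupancy["multiple"] / occupied if occupied > 0 else 0.0


if __name__ == "__main__":
    # Illustrative mean occupancies; not taken from any specific instrument.
    for lam in (0.05, 0.1, 0.5, 1.0):
        occ = drop_occupancy(lam)
        print(
            f"mean={lam:<4} empty={occ['empty']:.3f} "
            f"single={occ['single']:.3f} multiple={occ['multiple']:.3f} "
            f"coincident={coincidence_fraction(lam):.3f}"
        )
```

Lowering the mean occupancy reduces coincident events but increases the share of empty drops, which is the throughput-versus-purity trade-off that random cell arrival imposes.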
While electrostatic droplet-based FACS is by far the most common method for depositing cells into microtiter plates, emerging technologies such as the CellenONE and the WOLF cell sorters are providing alternatives. Both of these systems use a low-pressure microfluidics-based approach and can thus be used on highly friable cell types that may be sensitive to the stresses of traditional FACS. The CellenONE system is a unique ultra-low volume liquid handler that utilizes an active image-based cell sorting approach to improve cell deposition accuracy while simultaneously minimizing cell loss (sort aborts are simply collected without dilution for subsequent reanalysis and deposition). Both the WOLF and the CellenONE systems are slow when compared to FACS, and can only handle limited cell numbers; for this reason, they
[Figure 1. The main methodology categories that comprise the field of Genomic Cytometry. Panel labels: plate based; droplet microfluidic; solid microfluidic; in situ; imaging; spatial. Color figure can be viewed at wileyonlinelibrary.com]
tend to have specific applications and often require pre-enrichment steps when dealing with rare cell populations.
Modern FACS instruments also include a software module that tracks the characteristics of the cell sorted and links this to the well coordinates. This process, known as index sorting, is critical to multi-omic studies as it allows protein expression profiles (captured as part of the sort decision) to be cross-correlated to the genomic data generated in downstream assays.
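The linking step at the heart of index sorting amounts to a join on the well coordinate. The following is a minimal sketch under stated assumptions: the record layout, the function name `link_index_to_genomics`, and all marker, gene, and intensity values are hypothetical and chosen for illustration; real instruments export index data in vendor-specific formats.

```python
from dataclasses import dataclass


@dataclass
class IndexRecord:
    """Cytometer measurements stored for the single cell sorted into one well."""
    well: str                 # plate coordinate, e.g. "A1"
    marker_intensity: dict    # hypothetical marker name -> fluorescence value


def link_index_to_genomics(index_records, counts_by_well):
    """Join index-sort records to per-well sequencing counts by well coordinate.

    Records whose well has no sequencing data are returned in `unmatched`.
    """
    linked, unmatched = {}, []
    for record in index_records:
        counts = counts_by_well.get(record.well)
        if counts is None:
            unmatched.append(record.well)
            continue
        linked[record.well] = {
            "protein": record.marker_intensity,
            "rna": counts,
        }
    return linked, unmatched


if __name__ == "__main__":
    # Made-up values for illustration only.
    records = [
        IndexRecord("A1", {"CD3": 5200.0, "CD19": 40.0}),
        IndexRecord("A2", {"CD3": 35.0, "CD19": 4100.0}),
        IndexRecord("A3", {"CD3": 4800.0, "CD19": 55.0}),
    ]
    counts = {
        "A1": {"CD3E": 12, "MS4A1": 0},
        "A2": {"CD3E": 0, "MS4A1": 9},
    }
    linked, unmatched = link_index_to_genomics(records, counts)
    print("linked wells:", sorted(linked))
    print("wells without sequencing data:", unmatched)
```

Keeping unmatched wells visible, rather than silently dropping them, makes it easier to spot wells that failed library preparation or sequencing.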
Assays that take advantage of a plate-based approach include: Smart-Seq (33), Smart-Seq2 (34), Smart-Seq3 (35), STRT-seq (36), STRT-seq-2i (37), CEL-Seq, CEL-Seq2 (38), MARS-Seq (39), mcSCRB-seq (40), Quartz-Seq (41), Quartz-Seq2 (42), scBS-seq (43), and single-cell Hi-C (44).
Microfluidics
Microfluidics has expanded massively in popularity in the past two decades (45). In recent years, the field has also made a significant contribution to both our understanding of biology and to many areas of health care (45–48). Using microfluidics, entirely new assays can be created and traditional assays miniaturized. With reactions performed in the nanoliter to picoliter range (49), microfluidic-driven miniaturization can result in log-fold differences in reaction volume. Because miniaturization can improve reaction efficiencies while simultaneously reducing reagent and sample input, microfluidics is becoming increasingly critical to our ability to perform high-throughput, high-resolution, high-sensitivity assays in a cost-effective manner.
Microfluidics is used in a range of technologies, but with reference to genomics, its application in massively parallel sequencing technologies was a significant contributor to the precipitous drop in sequencing cost. It is also being used in most of today’s commercially available, high-throughput Genomic Cytometry platforms, such as the 10x Genomics Chromium, BD Rhapsody, Dolomite Bio Nadia, Mission Bio Tapestri, ICell8, Bio-Rad ddSEQ, InDrops, and Fluidigm C1 systems. To assist with the categorization of the many microfluidic approaches available, we have split the techniques into two subcategories: those that involve droplets and those that utilize miniaturized solid reaction chambers.
Droplet Microfluidics
The realization that droplet microfluidics is useful in the study of biology came of age with the simultaneous publication of two seminal papers out of Harvard and the Broad Institutes in 2015 (50, 51). These papers showed, for the first time, the application of high-throughput droplet-based generators in single-cell RNA-seq (scRNA-seq). Since then, commercial systems such as the 10x Genomics Chromium, Bio-Rad ddSEQ, Dolomite Bio Nadia, and the Mission Bio Tapestri systems have been released. Among these, the 10x Genomics Chromium system has the broadest acceptance. This is likely due to the fact that it was the first to include a highly defined kit-based approach combined with an accessible data interface. At the time, this created a uniquely user-friendly ecosystem. With this, a biologist without deep expertise in
microfluidics and genomics could generate single-cell data with relative ease. As the field becomes more mature and competitors increasingly enter the market, we expect the dominance of a single platform to be significantly challenged.
Mechanistically, droplet-based microfluidic systems work by mixing two immiscible liquids to create a water-in-oil emulsion. The oil forms a self-contained reaction vessel around an aqueous phase. The aqueous phase contains both cells and a bead carrying uniquely barcoded mRNA capture probes in lysis buffer. For 3′ scRNA-seq assays, the capture probe contains a poly-dT region of around 22–25 nucleotides, which binds to polyadenylated transcripts released upon cell lysis. Thus, as the mRNA is released, the polyadenylated region of the transcript is immediately bound to an oligo containing (1) a cell barcode, (2) a Unique Molecular Identifier (UMI), and (3) a nucleotide region that assists with subsequent transcript amplification. As the aim of these systems is to co-locate a single bead with a single cell in a single droplet, the RNA profile for each captured cell can be obtained by informatically pooling reads that share a cell barcode. The UMI tracks individual transcripts, allowing for correction of amplification bias. The utilization of this dual barcode approach allows digital transcript counting at the level of the single cell.
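The dual barcode logic described above can be sketched in a few lines. This is a simplified illustration, assuming reads have already been aligned and reduced to (cell barcode, UMI, gene) tuples; the function name and toy sequences are hypothetical, and the error correction of barcodes and UMIs that production pipelines perform is deliberately omitted.

```python
from collections import defaultdict


def count_transcripts(reads):
    """Collapse reads into digital transcript counts per cell and gene.

    reads: iterable of (cell_barcode, umi, gene) tuples.
    Reads sharing a cell barcode are pooled into one cell, and reads
    sharing cell barcode, gene, and UMI are treated as amplification
    copies of the same original molecule and counted once.
    """
    molecules = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        molecules[(cell_barcode, gene)].add(umi)

    counts = defaultdict(dict)
    for (cell_barcode, gene), umis in molecules.items():
        counts[cell_barcode][gene] = len(umis)
    return dict(counts)


# Toy reads: the first two are PCR duplicates of one GAPDH molecule.
reads = [
    ("AAAC", "TTG", "GAPDH"),
    ("AAAC", "TTG", "GAPDH"),
    ("AAAC", "CCA", "GAPDH"),
    ("AAAC", "GGT", "CD3E"),
    ("GGGT", "TTG", "GAPDH"),
]
counts = count_transcripts(reads)
print(counts)
```

In this toy input, the cell with barcode AAAC is assigned two GAPDH molecules rather than three reads, which is the correction of amplification bias that the UMI provides.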
Droplet volume is dependent on the flow rates and the chip geometry, and while this can be used to create a wide range of droplet sizes, the droplets used in scRNA-seq applications generally range from a few hundred picoliters to a few nanoliters (52). Droplet volume has been shown to be inversely related to the number of transcripts detected in the final library; for this reason, applications such as DroNc-seq (designed for polyadenylated RNA transcripts from the cell nucleus) are better suited to systems that produce smaller droplets (75 vs 120 μm diameter droplets) (53).
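To put the two quoted diameters in terms of volume, the sketch below converts droplet diameter to picoliters. It assumes perfectly spherical droplets, which is an idealization; the function name is illustrative.

```python
import math


def droplet_volume_pl(diameter_um: float) -> float:
    """Volume in picoliters of a spherical droplet of the given diameter (µm)."""
    radius_um = diameter_um / 2.0
    volume_um3 = (4.0 / 3.0) * math.pi * radius_um ** 3
    return volume_um3 / 1000.0  # 1 pL = 1000 cubic micrometers


small = droplet_volume_pl(75.0)
large = droplet_volume_pl(120.0)
print(f"75 um droplet:  {small:.0f} pL")
print(f"120 um droplet: {large:.0f} pL")
print(f"volume ratio:   {large / small:.2f}")
```

Under the spherical assumption, a 75 μm droplet holds roughly 221 pL and a 120 μm droplet roughly 905 pL, so the modest change in diameter corresponds to about a fourfold change in reaction volume.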
Droplet microfluidics has been extensively utilized in the context of Genomic Cytometry. In addition to the work mentioned above, it has been used to profile transcriptomes at single-cell resolution (54) and in other non-RNA-based applications. These include (1) single-cell epigenetic approaches such as single-cell ChIP-seq (55, 56), dscATAC-seq (57, 58), ChIA-Drop (59), and single-cell ATAC-seq (23); (2) single-cell DNA approaches like single-cell gDNA-seq (60–62); and (3) a variety of multi-omic workflows including CITE-seq (63), REAP-seq (64) (protein and transcriptome), ECCITE-seq (65) (transcriptome, protein, clonotypes, and CRISPR perturbations), and SNARE-seq (66) (chromatin and transcriptome).
Solid Microfluidics
Solid microfluidic platforms use physical barriers to create individual reaction chambers, often at high physical densities but always with ultra-low volumes. These chambers can be made from a variety of materials but commonly include plastic, metal, or polydimethylsiloxane (PDMS). Because solid microfluidics uses physical confinement on solid substrates, it is possible to perform imaging on the cells in the well. If each well location can be associated with the unique cell barcode, then it is also possible to associate this data with the
indexing ATAC-seq (74), sci-RNA-seq (single-cell combinatorial indexing RNA sequencing) (75), split-pool ligation-based transcriptome sequencing (SPLiT-seq) (76), single-cell combinatorial indexed sequencing (SCI-seq) (77), sci-CAR (78), single-cell transposome hypersensitive sites sequencing (THS-seq) (79), single-cell DNA methylation (sci-MET) (80), droplet-based sci-ATAC (57), and single-cell Hi-C (sci-Hi-C) (81). Recently, SplitBio announced the commercial release of a single-cell RNA sequencing kit utilizing in situ combinatorial indexing.
Image-Based Approaches
In contrast to spatial transcriptomic systems that rely primarily on spatially attributable cell barcodes, image-based Genomic Cytometry techniques rely on in situ imaging of cells. These systems have been used to directly image the location of both RNA and protein in tissue sections. Because sample handling is reduced and solid tissue does not require digestion, such systems may provide the most representative method to study cellular composition in solid tissues. Example systems include the Codex and a number of highly multiplexed fluorescent in situ hybridization (FISH)-based approaches.
Highly multiplexed FISH approaches take advantage of specially designed probes combined with multiple rounds of hybridization and imaging to build anatomically localized transcript maps on tissue sections. Examples of such approaches include MERFISH (82), STARmap (83), seqFISH+ (84), and DNA microscopy (84).
The Codex system (85) can perform high-dimensional image-based protein detection with the use of oligo-conjugated antibodies. The system has been adapted for slide imaging and super-resolution imaging, and has also been shown to work with volumetric imaging. By using a series of fluorescently labeled bases and relying on the specificity of complementary binding of the fluorescently labeled base pair sequence to the oligo attached to the antibody, it is possible to perform highly multiplexed protein detection in tissue. Codex has been validated in both FFPE and frozen samples and can detect more than 40 proteins from the same individual cell.
Spatial Transcriptomics
Spatial transcriptomic workflows are complicated and require complex bioinformatics pathways. However, they can be simplified to a number of key steps: (1) a tissue section is cut, (2) the section is laid on a solid imageable surface containing immobilized region-specific capture probes (these are akin to the cellular barcode used in other methods), (3) the section is imaged, (4) the sample is then permeabilized, and finally (5) the polyadenylated mRNA is captured by spatial probes. Following this, cDNA is synthesized, and libraries are created and then sequenced. As the location of the unique oligo sequence for the capture probe can be traced back to a discrete physical location, it is possible to create a single-cell transcriptomic library that retains anatomical information. The resolution of the system is governed by both the spot size of the deposited
downstream genomic characterization. Systems that allow this tend to have lower throughput and include the Fluidigm C1™ and ICell8™ systems.
The Fluidigm C1™ system is perhaps the best known solid microfluidics platform. The C1 utilizes an intricate microfluidics architecture to provide high-level control of the complex molecular reactions required for single-cell analysis. The C1 system has been used in a number of studies characterizing single cells at the level of RNA, DNA, and epigenetic changes (67–69). Despite the system's advanced approach, problems have been identified and care should be taken with its use (70). The ICell8™ system is a commercially miniaturized plate-based system that allows high-density fluid handling to achieve microfluidic scale single-cell genomics.
Recently, Becton Dickinson has released a high-throughput scRNA-seq system, the BD Rhapsody. It uses a similar approach to the CytoSeq (71) and Seq-Well protocols (72). By using a microwell approach, the Rhapsody system can place a single bead in virtually every well and does not expose the cells to the same pressures associated with droplet generation. This may be an important consideration when working with cells highly sensitive to pressure-related stress.
In addition to high recovery rates, Rhapsody workflows also allow both whole transcriptome and targeted transcriptomics approaches. While commercial modifications have occurred, the molecular workflow is similar to that used in many droplet-based scRNA-seq techniques. The targeted scRNA-seq approach, however, is currently unique to the Rhapsody, and while it requires a priori knowledge of the system being interrogated, it allows transcripts of interest to be deeply probed without incurring the high sequencing cost associated with reading common housekeeping and lowly informative transcripts. Depending on the panel, it is possible to obtain the same sequencing saturation with up to 10 times fewer sequencing reads than are needed when using a whole transcriptome analysis (WTA) approach (73).
In Situ Combinatorial Indexing
In situ single-cell methods provide an ingenious way to use the inherent structure of the cell or nuclei as the reaction chamber itself. This is achieved by first fixing the cell using methanol, or beginning with an intact nucleus, and subjecting these to multiple sequential barcoding steps using a split-pool approach. Through successive integration of molecular barcodes into the cell/nucleus itself, in situ combinatorial methods are capable of building up a library of uniquely barcoded single cells. For these methods to work effectively, it is critical to ensure that the number of barcodes that can be created is well in excess of the number of cells/nuclei being labeled. As the total number of barcodes possible is a combination of (1) the number of unique starting oligos and (2) the number of successive split-barcode-pool-split steps, these methods require careful balancing of cell inputs to available barcodes. Failure to do this will result in cells/nuclei sharing the same barcode.
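The balance between cell input and barcode space can be estimated with a short calculation. The sketch below assumes that every round uses the same number of distinct barcodes and that cells are distributed uniformly at random at each split; both are simplifying assumptions, and the plate format and cell numbers are hypothetical examples rather than values from any published protocol.

```python
def barcode_space(barcodes_per_round: int, rounds: int) -> int:
    """Number of distinct barcode combinations after the given number of rounds."""
    return barcodes_per_round ** rounds


def collision_fraction(n_cells: int, n_barcodes: int) -> float:
    """Probability that a given cell shares its combination with another cell.

    Assumes each cell receives one of n_barcodes combinations uniformly
    at random and independently of all other cells.
    """
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


# Hypothetical design: three rounds of barcoding in 96-well plates.
space = barcode_space(96, 3)
print(f"barcode combinations: {space}")
for n_cells in (1_000, 10_000, 100_000):
    frac = collision_fraction(n_cells, space)
    print(f"{n_cells} cells -> {frac:.2%} expected to share a barcode")
```

With these assumptions, three rounds of 96 barcodes give 884,736 combinations, and the expected fraction of cells sharing a barcode rises steadily as the cell input approaches that number.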
Notable examples of in situ combinatorial approaches include a 2015 method to perform single-cell combinatorial
capture probes and the distance between the centers of adjacent capture probe spots. The very first spatial transcriptomic system (86) had a spot size of 100 μm, with a distance between spot centers of 200 μm, and an estimated 200 million capture oligos per spot. Academic systems with spot sizes approaching that of the single cell include Slide-seq (87) and HDST (88). These systems have a resolution of 10 and 2 μm, respectively. Recently, alterations to the molecular component, including “the bead barcode synthesis, array sequencing pipeline and the enzymatic processing of cDNA” of the Slide-seq method, were used to improve sensitivity by an order of magnitude (Slide-seqV2) and allow better transcript representation (89).
Commercial methods such as the Visium from 10x Genomics are currently available but not yet in widespread use. These methods are also not yet at the level of the single cell. Instead, they have spot sizes that contain many cells and have large gaps between the spots. The Visium platform uses spot sizes of 55 μm, with the separation between spot centers being 100 μm. One caveat of systems like this is the need to optimize permeabilization time, which will vary from sample to sample. We expect that as spatial transcriptomics is further developed, it will become a valuable method for deeply characterizing patient disease. However, until standardized protocols across a number of tissue types can be determined, the widespread clinical adoption of such systems will likely be hindered.
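The principle that a capture probe sequence can be traced back to a physical location can be sketched as a lookup from spatial barcode to spot coordinates. This is a minimal illustration, assuming reads have already been reduced to (spatial barcode, gene) pairs and that the array manufacturer supplies a barcode-to-coordinate table; the function name, barcodes, and coordinates are hypothetical, and reads rather than UMIs are counted for brevity.

```python
from collections import Counter, defaultdict


def build_spatial_counts(reads, barcode_to_xy):
    """Aggregate gene counts per spot using the spatial barcode of each read.

    reads: iterable of (spatial_barcode, gene) pairs.
    barcode_to_xy: mapping from spatial barcode to the (x, y) position
    of the spot center on the array, in micrometers.
    """
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        xy = barcode_to_xy.get(barcode)
        if xy is None:
            continue  # barcode is not present on this array
        counts[xy][gene] += 1
    return {xy: dict(genes) for xy, genes in counts.items()}


# Hypothetical array with two spots whose centers are 100 µm apart.
barcode_to_xy = {"ACGT": (0, 0), "TGCA": (100, 0)}
reads = [
    ("ACGT", "KRT5"),
    ("ACGT", "KRT5"),
    ("TGCA", "PTPRC"),
    ("NNNN", "KRT5"),  # unknown barcode, discarded
]
spatial_counts = build_spatial_counts(reads, barcode_to_xy)
print(spatial_counts)
```

Because each spot may cover several cells, every entry in the resulting table describes the pooled expression of whatever cells overlap that spot rather than a single cell.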
MULTI-OMICS
Multi-omics is the science of combining measurements afforded by the different omics modalities on the same sample. In Genomic Cytometry, multi-omics involves the measurement of more than one class of cellular characteristics at the level of the single cell simultaneously. Generally, this includes the measurement of (1) RNA with protein, (2) RNA with DNA, (3) DNA with protein, or (4) epigenetics analysis with protein. However, approaches allowing three modalities to be probed simultaneously are emerging.
Low throughput multi-omics has been possible since the advent of FACS-based index sorting for downstream scRNA-seq applications. By simply varying the downstream genomic analysis method, it is possible to use index sorting for a multitude of single-cell multi-omic studies. This approach is often used in mid throughput scRNA-seq plate-based assays such as Smart-Seq (33, 34), CEL-Seq2 (38), and MARS-seq (39). Even inherently multi-omics methods such as G&T-seq (90) can be combined with index sorting to add a protein dimension to the multi-omic analysis. The idea of using index sorting to boost multi-omics identification of single cells at the level of RNA, DNA, and protein has recently been leveraged in the TARGET-Seq (91) protocol.
In order to facilitate high-throughput multi-omic approaches involving protein detection, a number of oligonucleotide-conjugated antibody techniques have been developed. These include CITE-seq (63), REAP-seq (64), and Ab-seq (92). The use of oligonucleotide labeled antibodies has allowed a substantial step forward in the ability to perform high dimensional single-cell protein detection. By incorporating a unique oligo onto the antibody, it is possible to detect the extracellular protein expression on the cell using common 3′ scRNA-seq. The oligos attached to the antibodies contain (1) an antibody specific base pair sequence (to identify antigen specificity), (2) a PCR handle (to allow amplification during library preparation), and (3) a poly-A sequence (to allow the antibody conjugated oligo to be captured by the poly-dT region of the capture probe). This approach has been commercialized by both Biolegend and Becton Dickinson.
The use of oligonucleotide-conjugated antibodies has been shown to be effective at detecting many antigens. However, the technology is relatively new, and care should still be taken when designing panels. While we do not yet have guidelines for panel design, factors such as (1) epitope expression density, (2) cell numbers stained, (3) sequencing depth, (4) relative expression ratios, and (5) library complexity are likely to affect the outcome of oligo antibody characterization studies.
One of the criticisms of oligonucleotide-conjugated antibodies is that the sequence allocation required to detect all bound antibodies is dependent on the relative expression across all proteins in the panel. In panels that contain a few very high-expressing antigens, most of the sequence reads can be taken up by a small number of antigens. In this case, the dynamic range of the remaining antibodies is significantly reduced. While there are a number of ways to approach this (including antibody titration and spiking in cold, unlabeled antibody), one approach that is likely to become popular is to first sort populations of cells defined by high-expressing antigens using fluorescently labeled antibodies, and then label the sorted fractions with oligo-antibodies to identify the remaining antigen profiles.
This FACS-assisted sequencing approach ensures an efficient use of sequencing reads, and when combined with hashtag antibodies (93) or lipid modified or cholesterol modified oligos (94) (to molecularly barcode each sorted population), it becomes a powerful multi-omics strategy with high throughput. A comparison of this approach, including its impact on sequencing read allocation, is modeled in Figure 2.
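The effect on read allocation can be illustrated with a toy model. The sketch below assumes that each oligo tag receives reads in proportion to its relative abundance; the panel composition (5 high, 14 medium, and 11 low expressors) follows the mock panel of Figure 2, but the abundance values of 100, 10, and 1 are hypothetical and are not the values used to generate that figure.

```python
def mock_panel(n_high=5, n_medium=14, n_low=11,
               high=100.0, medium=10.0, low=1.0):
    """Build a mock panel mapping marker name to relative abundance."""
    panel = {}
    for i in range(1, n_high + 1):
        panel[f"high_{i}"] = high
    for i in range(1, n_medium + 1):
        panel[f"medium_{i}"] = medium
    for i in range(1, n_low + 1):
        panel[f"low_{i}"] = low
    return panel


def read_fractions(panel):
    """Fraction of sequencing reads consumed by each marker in the panel."""
    total = sum(panel.values())
    return {marker: abundance / total for marker, abundance in panel.items()}


full = read_fractions(mock_panel())
# FACS-assisted design: high expressors are resolved by fluorescence
# during sorting and left out of the oligo antibody panel.
targeted = read_fractions(mock_panel(n_high=0))

print(f"low expressor, full panel:     {full['low_1']:.2%} of reads")
print(f"low expressor, targeted panel: {targeted['low_1']:.2%} of reads")
```

In this toy setting, the five high expressors consume most of the reads in the full panel, so removing them raises the share available to every remaining marker without any change in total sequencing depth.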
APPLICATION OF GENOMIC CYTOMETRY
Since around 2015, there has been an explosion of methods aimed at single-cell genomic characterization. Alongside this, there has been an increasing number of studies making use of scRNA-seq approaches; see review (95). Indeed, following the completion of the human genome project (96), scientists have become increasingly aware that bulk genomic approaches lack the precision to unravel subtle changes at the level of the individual cell. This is critically important in diseases such as cancer and immune disorders, where a single rogue cell can be the base of disease. It is also important for the understanding of many developmental processes where single cells give rise to many cells.
identifying unique cell types. These studies have formed the basis of the Human Cell Atlas (HCA) project (100). The HCA is a multicenter, international effort aiming to create a database of all cell types in the human body using single-cell approaches. This is an important effort and is the next logical step following on from the human genome project. Just as the human genome project provides the reference data that has allowed deep interrogation of the biology associated with genomic changes, the completion of the HCA should provide the reference data to allow classification of individual cells from their unique omic signatures. This is particularly important, as many of the databases that we are currently using to interpret single-cell genomic studies are based on bulk genomics.
While there is clear virtue in these types of studies, this shotgun approach is only designed to provide a fundamental
Figure 2. FACS-assisted sequencing provides an efficient and targeted multi-omics approach. (A) A comparison of a standard full oligo antibody panel (unbiased) (top) with FACS-assisted sequencing (targeted) (bottom), which uses a combination of fluorescently labeled antibodies (for preselection of populations) followed by oligo antibody labelling. (B) Sequencing read utilization in a mock panel. An imaginary 30-plex panel was created. The panel consisted of 5 high-expression epitopes, 14 medium-density epitopes, and 11 low expressors. To assess the effect of removing the high-expressing antigens from the sequencing run, we compared the relative proportion of sequencing reads used by each oligo tag under both conditions. Each of the concentric circles in the radar plots indicates a single percentage of sequencing reads used up by the marker. The model predicts that when highly expressed antigens are removed from the oligo antibody panel, as in FACS-assisted sequencing, low-expressing antigens receive a higher proportion of the sequencing reads. [Color figure can be viewed at wileyonlinelibrary.com]
Whole transcriptome analysis, the method used by the majority of scRNA-seq studies to date, allows global profiling of many of the RNA species found in cells in an unbiased manner and without the need for a priori knowledge of the cells or the cell system to be studied. Although single-cell transcriptomic methods are not capable of amplifying every single mRNA, even relatively poorly performing methods are proving capable of accurately identifying many existing and novel cell populations (97). Furthermore, many of the original methodologies are being improved with molecular techniques aimed at increasing transcript detection sensitivity. Notable examples include Chromium V3, Smart-Seq3, Quartz-Seq2, and Seq-Well S^3 (35, 42, 98, 99).
Early studies have tended to be descriptive efforts, primarily aimed at uncovering the cellular heterogeneity and
base for more nuanced approaches. The approach required for biologically directed studies will depend on (1) the biology of the system, (2) the questions being asked, (3) the technical expertise of the scientists running the experiment, and (4) the funds available. While the decision of which technology is best suited to the biological question being asked is not always straightforward, we have outlined some of the more common questions involved in the decision-making process in Figure 3.
CONCLUSION
As we move into the age of Genomic Cytometry, we are now looking to synergistically leverage the modalities of genomics, informatics, microfluidics, and cytometry toward a single aim. To do this, we must develop ways to work in a cross-disciplinary fashion such that microfluidics and FACS-based techniques can be seamlessly integrated into molecular workflows and high-dimensional data analysis frameworks. The combination of these four traditionally distinct areas of expertise is what provides the foundation for the new field of Genomic Cytometry.
With Genomic Cytometry, it is possible to study cellular characteristics more deeply than ever before. The new tools emerging to allow RNA-seq, DNA-seq, epigenetic analysis, and protein detection at the level of the single cell will fundamentally change what we know about biological processes and how quickly we can deeply interrogate complex biological systems. We are beginning to see a systems-based approach that will allow us to perform accurate single-cell multi-omics studies with the sensitivity, efficiency, and cost needed to uncover true biology. This deep characterization is allowing us to unravel cellular complexity in highly heterogeneous samples, to find the root cause of disease, and to understand the cellular complexity of development. Eventually, we believe it will give us the power to analyze the DNA, RNA, protein, and epigenetic states of individual cells at throughputs that will rival that of current flow cytometers.
As the field of single-cell genomics matures, and we begin to embrace the broader field of Genomic Cytometry, it will become increasingly evident that results from single-cell omics studies will need to be supported and validated by alternate systems. These systems will include traditional imaging, lineage tracing, and fluorescence cytometry methods. This will create a circle of discovery and validation that unites the fields of genomics and cytometry. For this reason, although we envision a dramatic shift in the tools
Figure 3. Flowchart for determining the most suitable Genomic Cytometry method for the biological question. [Color figure can be viewed at wileyonlinelibrary.com]
available to the traditional cytometrist, cytometry will still hold a critical place in the emerging application of single-cell genomics. It is for this reason that Genomic Cytometry will become the modality of choice for single-cell analysis.
CONFLICT OF INTEREST
The authors declared no potential conflict of interest.
AUTHOR CONTRIBUTIONS
Luciano Martelotto: Conceptualization; writing-review and editing. Fatima Valdes-Mora: Conceptualization; writing-review and editing. David Gallego-Ortega: Conceptualization; supervision; writing-original draft; writing-review and editing.
LITERATURE CITED
1. Futamura K, Sekino M, Hata A, Ikebuchi R, Nakanishi Y, Egawa G, Kabashima K, Watanabe T, Furuki M, Tomura M. Novel full-spectral flow cytometry with multiple spectrally-adjacent fluorescent proteins and fluorochromes and visualization of in vivo cellular movement. Cytometry A 2015;87(9):830–842.
2. Bendall SC, Simonds EF, Qiu P, Amir el AD, Krutzik PO, Finck R, Bruggner RV, Melamed R, Trejo A, Ornatsky OI, et al. Single-cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science 2011;332(6030):687–696.
3. Simoni Y, Chng MHY, Li S, Fehlings M, Newell EW. Mass cytometry: A powerful tool for dissecting the immune landscape. Curr Opin Immunol 2018;51:187–196.
4. Bandura DR, Baranov VI, Ornatsky OI, Antonov A, Kinach R, Lou X, Pavlov S, Vorobiev S, Dick JE, Tanner SD. Mass cytometry: Technique for real time single cell multitarget immunoassay based on inductively coupled plasma time-of-flight mass spectrometry. Anal Chem 2009;81(16):6813–6822.
5. Bendall SC, Nolan GP, Roederer M, Chattopadhyay PK. A deep profiler’s guide to cytometry. Trends Immunol 2012;33(7):323–332.
6. Angelo M, Bendall SC, Finck R, Hale MB, Hitzman C, Borowsky AD, Levenson RM, Lowe JB, Liu SD, Zhao S, et al. Multiplexed ion beam imaging of human breast tumors. Nat Med 2014;20(4):436–442.
7. Cornett DS, Reyzer ML, Chaurand P, Caprioli RM. MALDI imaging mass spectrometry: Molecular snapshots of biochemical systems. Nat Methods 2007;4(10):828–833.
8. Schober Y, Guenther S, Spengler B, Rompp A. Single cell matrix-assisted laser desorption/ionization mass spectrometry imaging. Anal Chem 2012;84(15):6293–6297.
9. Keren L, Bosse M, Thompson S, Risom T, Vijayaragavan K, McCaffrey E, Marquez D, Angoshtari R, Greenwald NF, Fienberg H, et al. MIBI-TOF: A multiplexed imaging platform relates cellular phenotypes and tissue structure. Sci Adv 2019;5(10):eaax5851.
10. Luecken MD, Theis FJ. Current best practices in single-cell RNA-seq analysis: A tutorial. Mol Syst Biol 2019;15(6):e8746.
11. Chen G, Ning B, Shi T. Single-cell RNA-seq technologies and related computational data analysis. Front Genet 2019;10:317.
12. Hwang B, Lee JH, Bang D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp Mol Med 2018;50(8):96.
13. Huang X, Liu S, Wu L, Jiang M, Hou Y. High throughput single cell RNA sequencing, bioinformatics analysis and applications. Adv Exp Med Biol 2018;1068:33–43.
14. Ji F, Sadreyev RI. Single-cell RNA-seq: Introduction to bioinformatics analysis. Curr Protoc Mol Biol 2019;127(1):e92.
15. Van Dilla MA, Trujillo TT, Mullaney PF, Coulter JR. Cell microfluorometry: A method for rapid fluorescence measurement. Science 1969;163(3872):1213–1214.
16. Fulwyler MJ. Electronic separation of biological cells by volume. Science 1965;150(3698):910–911.
17. Chattopadhyay PK, Gaylord B, Palmer A, Jiang N, Raven MA, Lewis G, Reuter MA, Nur-ur Rahman AK, Price DA, Betts MR, et al. Brilliant violet fluorophores: A new class of ultrabright fluorescent compounds for immunofluorescence experiments. Cytometry A 2012;81(6):456–466.
18. Nettey L, Giles AJ, Chattopadhyay PK. OMIP-050: A 28-color/30-parameter fluorescence flow cytometry panel to enumerate and characterize cells expressing a wide array of immune checkpoint molecules. Cytometry A 2018;93(11):1094–1096.
19. Liechti T, Roederer M. OMIP-060: 30-parameter flow cytometry panel to assess T cell effector functions and regulatory T cells. Cytometry A 2019;95:1129–1134.
20. Liechti T, Roederer M. OMIP-051: 28-color flow cytometry panel to characterize B cells and myeloid cells. Cytometry A 2019;95(2):150–155.
21. Liechti T, Roederer M. OMIP-058: 30-parameter flow cytometry panel to characterize iNKT, NK, unconventional and conventional T cells. Cytometry A 2019;95(9):946–951.
22. Mair F, Prlic M. OMIP-044: 28-color immunophenotyping of the human dendritic cell compartment. Cytometry A 2018;93(4):402–405.
23. Buenrostro JD, Wu B, Litzenburger UM, Ruff D, Gonzales ML, Snyder MP, Chang HY, Greenleaf WJ. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 2015;523(7561):486–490.
24. Salzberg SL. Open questions: How many genes do we have? BMC Biol 2018;16(1):94.
25. Willyard C. New human gene tally reignites debate. Nature 2018;558(7710):354–355.
26. Ezkurdia I, Juan D, Rodriguez JM, Frankish A, Diekhans M, Harrow J, Vazquez J, Valencia A, Tress ML. Multiple evidence strands suggest that there may be as few as 19,000 human protein-coding genes. Hum Mol Genet 2014;23(22):5866–5878.
27. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. The sequence of the human genome. Science 2001;291(5507):1304–1351.
28. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature 2004;431(7011):931–945.
29. Clamp M, Fry B, Kamal M, Xie X, Cuff J, Lin MF, Kellis M, Lindblad-Toh K, Lander ES. Distinguishing protein-coding and noncoding genes in the human genome. Proc Natl Acad Sci USA 2007;104(49):19428–19433.
30. Ponomarenko EA, Poverennaya EV, Ilgisonis EV, Pyatnitskiy MA, Kopylov AT, Zgoda VG, Lisitsa AV, Archakov AI. The size of the human proteome: The width and depth. Int J Anal Chem 2016;2016:7436849.
31. Engel P, Boumsell L, Balderas R, Bensussan A, Gattei V, Horejsi V, Jin BQ, Malavasi F, Mortari F, Schwartz-Albiez R, et al. CD nomenclature 2015: Human leukocyte differentiation antigen workshops as a driving force in immunology. J Immunol 2015;195(10):4555–4563.
32. Allis CD, Jenuwein T. The molecular hallmarks of epigenetic control. Nat Rev Genet 2016;17(8):487–500.
33. Ramskold D, Luo S, Wang YC, Li R, Deng Q, Faridani OR, Daniels GA, Khrebtukova I, Loring JF, Laurent LC, et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat Biotechnol 2012;30(8):777–782.
34. Picelli S, Faridani OR, Björklund ÅK, Winberg G, Sagasser S, Sandberg R. Full-length RNA-seq from single cells using Smart-Seq2. Nat Protoc 2014;9(1):171–181.
35. Hagemann-Jensen M, Ziegenhain C, Chen P, Ramsköld D, Hendriks G-J, Larsson AJM, Faridani OR, Sandberg R. Single-cell RNA counting at allele- and isoform-resolution using Smart-Seq3. bioRxiv 2019:817924.
36. Islam S, Kjallquist U, Moliner A, Zajac P, Fan JB, Lonnerberg P, Linnarsson S. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res 2011;21(7):1160–1167.
37. Hochgerner H, Lönnerberg P, Hodge R, Mikes J, Heskol A, Hubschle H, Lin P, Picelli S, la Manno G, Ratz M, et al. STRT-seq-2i: Dual-index 5′ single cell and nucleus RNA-seq on an addressable microwell array. Sci Rep 2017;7(1):16327.
38. Hashimshony T, Senderovich N, Avital G, Klochendler A, de Leeuw Y, Anavy L, Gennert D, Li S, Livak KJ, Rozenblatt-Rosen O, et al. CEL-Seq2: Sensitive highly-multiplexed single-cell RNA-Seq. Genome Biol 2016;17:77.
39. Jaitin DA, Kenigsberg E, Keren-Shaul H, Elefant N, Paul F, Zaretsky I, Mildner A, Cohen N, Jung S, Tanay A, et al. Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science 2014;343(6172):776–779.
40. Bagnoli JW, Ziegenhain C, Janjic A, Wange LE, Vieth B, Parekh S, Geuder J, Hellmann I, Enard W. Sensitive and powerful single-cell RNA sequencing using mcSCRB-seq. Nat Commun 2018;9(1):2937.
41. Sasagawa Y, Nikaido I, Hayashi T, Danno H, Uno KD, Imai T, Ueda HR. Quartz-Seq: A highly reproducible and sensitive single-cell RNA sequencing method, reveals non-genetic gene-expression heterogeneity. Genome Biol 2013;14(4):3097.
42. Sasagawa Y, Danno H, Takada H, Ebisawa M, Tanaka K, Hayashi T, Kurisaki A, Nikaido I. Quartz-Seq2: A high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads. Genome Biol 2018;19(1):29.
43. Smallwood SA, Lee HJ, Angermueller C, Krueger F, Saadeh H, Peat J, Andrews SR, Stegle O, Reik W, Kelsey G. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat Methods 2014;11:817–820.
44. Stevens TJ, Lando D, Basu S, Atkinson LP, Cao Y, Lee SF, Leeb M, Wohlfahrt KJ, Boucher W, O’Shaughnessy-Kirwan A, et al. 3D structures of individual mammalian genomes studied by single-cell Hi-C. Nature 2017;544:59–64.
45. Sackmann EK, Fulton AL, Beebe DJ. The present and future role of microfluidics in biomedical research. Nature 2014;507(7491):181–189.
46. Kulasinghe A, Wu H, Punyadeera C, Warkiani ME. The use of microfluidic technology for cancer applications and liquid biopsy. Micromachines (Basel) 2018;9(8):19.
47. Guo MT, Rotem A, Heyman JA, Weitz DA. Droplet microfluidics for high-throughput biological assays. Lab Chip 2012;12(12):2146–2155.
48. Velve-Casquillas G, le Berre M, Piel M, Tran PT. Microfluidic tools for cell biological research. Nano Today 2010;5(1):28–47.
49. Collins DJ, Neild A, deMello A, Liu AQ, Ai Y. The Poisson distribution and beyond: Methods for microfluidic droplet production and single cell encapsulation. Lab Chip 2015;15(17):3439–3459.
50. Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 2015;161(5):1202–1214.
51. Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, Peshkin L, Weitz DA, Kirschner MW. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 2015;161(5):1187–1201.
52. Salomon R, Kaczorowski D, Valdes-Mora F, Nordon RE, Neild A, Farbehi N, Bartonicek N, Gallego-Ortega D. Droplet-based single cell RNAseq tools: A practical guide. Lab Chip 2019;19:1706–1727.
REVIEW ARTICLE
49
53. Habib N, Avraham-Davidi I, Basu A, Burks T, Shekhar K, Hofree M,Choudhury SR, Aguet F, Gelfand E, Ardlie K, et al. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat Methods 2017;14(10):955–958.
54. Zheng GX, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, Ziraldo SB,Wheeler TD, McDermott GP, Zhu J, et al. Massively parallel digital transcriptionalprofiling of single cells. Nat Commun 2017;8:14049.
55. Rotem A, Ram O, Shoresh N, Sperling RA, Goren A, Weitz DA, Bernstein BE. Sin-gle-cell ChIP-seq reveals cell subpopulations defined by chromatin state. Nat Bio-technol 2015;33(11):1165–1172.
56. Grosselin K, Durand A, Marsolier J, Poitou A, Marangoni E, Nemati F,Dahmani A, Lameiras S, Reyal F, Frenoy O, et al. High-throughput single-cellChIP-seq identifies heterogeneity of chromatin states in breast cancer. Nat Genet2019;51(6):1060–1066.
57. Lareau CA, Duarte FM, Chew JG, Kartha VK, Burkett ZD, Kohlway AS,Pokholok D, Aryee MJ, Steemers FJ, Lebofsky R, et al. Droplet-based combinatorialindexing for massive-scale single-cell chromatin accessibility. Nat Biotechnol 2019;37(8):916–924.
58. Satpathy AT, Granja JM, Yost KE, Qi Y, Meschi F, McDermott GP, Olsen BN,Mumbach MR, Pierce SE, Corces MR, et al. Massively parallel single-cell chroma-tin landscapes of human immune cell development and intratumoral T cell exhaus-tion. Nat Biotechnol 2019;37(8):925–936.
59. Zheng M, Tian SZ, Capurso D, Kim M, Maurya R, Lee B, Piecuch E, Gong L,Zhu JJ, Li Z, et al. Multiplex chromatin interactions with single-molecule precision.Nature 2019;566(7745):558–562.
60. Pellegrino M, Sciambi A, Treusch S, Durruthy-Durruthy R, Gokhale K, Jacob J,Chen TX, Geis JA, Oldham W, Matthews J, et al. High-throughput single-cellDNA sequencing of acute myeloid leukemia tumors with droplet microfluidics.Genome Res 2018;28(9):1345–1352.
61. Velazquez-Villarreal EI, Maheshwari S, Sorenson J, Fiddes IT, Kumar V, Yin Y,Webb M, Catalanotti C, Grigorova M, Edwards PA. Resolving sub-clonal heteroge-neity within cell-line growths by single cell sequencing genomic DNA. bioRxiv2019:757211.
62. Hosokawa M, Nishikawa Y, Kogawa M, Takeyama H. Massively parallel wholegenome amplification for single-cell sequencing using droplet microfluidics. SciRep 2017;7:5199.
63. Stoeckius M, Hafemeister C, Stephenson W, Houck-Loomis B, Chattopadhyay PK,Swerdlow H, Satija R, Smibert P. Simultaneous epitope and transcriptome mea-surement in single cells. Nat Methods 2017;14(9):865.
64. Peterson VM, Zhang KX, Kumar N, Wong J, Li L, Wilson DC, Moore R,McClanahan TK, Sadekova S, Klappenbach JA. Multiplexed quantification of pro-teins and transcripts in single cells. Nat Biotechnol 2017;35(10):936–939.
65. Mimitou EP, Cheng A, Montalbano A, Hao S, Stoeckius M, Legut M, Roush T,Herrera A, Papalexi E, Ouyang Z, et al. Multiplexed detection of proteins, trans-criptomes, clonotypes and CRISPR perturbations in single cells. Nat Methods2019;16(5):409–412.
66. Chen S, Lake BB, Zhang K. Linking transcriptome and chromatin accessibility innanoliter droplets for single-cell sequencing. bioRxiv 2019:692608.
67. Li H, Courtois ET, Sengupta D, Tan Y, Chen KH, Goh JJL, Kong SL, Chua C, HonLK, Tan WS, et al. Reference component analysis of single-cell transcriptomes elu-cidates cellular heterogeneity in human colorectal tumors. Nat Genet 2017;49(5):708–718.
68. Proserpio V, Piccolo A, Haim-Vilmovsky L, Kar G, Lonnberg T, Svensson V,Pramanik J, Natarajan KN, Zhai W, Zhang X, et al. Single-cell analysis of CD4+ T-cell differentiation reveals three major cell states and progressive acceleration ofproliferation. Genome Biol 2016;17:103.
69. Szulwach KE, Chen P, Wang X, Wang J, Weaver LS, Gonzales ML, Sun G, UngerMA, Ramakrishnan R. Single-cell genetic analysis using automated microfluidics toresolve somatic mosaicism. PLoS One 2015;10(8):e0135007.
70. Xin Y, Kim J, Ni M, Wei Y, Okamoto H, Lee J, Adler C, Cavino K, Murphy AJ,Yancopoulos GD, et al. Use of the Fluidigm C1 platform for RNA sequencing ofsingle mouse pancreatic islet cells. Proc Natl Acad Sci USA 2016;113(12):3293–3298.
71. Fan HC, Fu GK, Fodor SP. Expression profiling. Combinatorial labeling of singlecells for gene expression cytometry. Science 2015;347(6222):1258367.
72. Gierahn TM, Wadsworth MH II, Hughes TK, Bryson BD, Butler A, Satija R,Fortune S, Love JC, Shalek AK. Seq-well: Portable, low-cost RNA sequencing ofsingle cells at high throughput. Nat Methods 2017;14(4):395–398.
73. Mair F, Erickson JR, Voillet V, Simoni Y, Bi T, Tyznik AJ, Martin J, Gottardo R, NewellEW, Prlic M. A targeted multi-omic analysis approach measures protein expressionand low abundance transcripts on the single cell level. bioRxiv 2019:700534.
74. Cusanovich DA, Daza R, Adey A, Pliner HA, Christiansen L, Gunderson KL,Steemers FJ, Trapnell C, Shendure J. Multiplex single cell profiling of chromatinaccessibility by combinatorial cellular indexing. Science 2015;348(6237):910–914.
75. Cao J, Packer JS, Ramani V, Cusanovich DA, Huynh C, Daza R, Qiu X, Lee C,Furlan SN, Steemers FJ, et al. Comprehensive single-cell transcriptional profiling ofa multicellular organism. Science 2017;357(6352):661–667.
76. Rosenberg AB, Roco CM, Muscat RA, Kuchina A, Sample P, Yao Z, Graybuck LT,Peeler DJ, Mukherjee S, Chen W, et al. Single-cell profiling of the developing
mouse brain and spinal cord with split-pool barcoding. Science 2018;360(6385):176–182.
77. Vitak SA, Torkenczy KA, Rosenkrantz JL, Fields AJ, Christiansen L, Wong MH,Carbone L, Steemers FJ, Adey A. Sequencing thousands of single-cell genomes withcombinatorial indexing. Nat Methods 2017;14(3):302–308.
78. Cao J, Cusanovich DA, Ramani V, Aghamirzaie D, Pliner HA, Hill AJ, Daza RM,McFaline-Figueroa JL, Packer JS, Christiansen L, et al. Joint profiling of chromatinaccessibility and gene expression in thousands of single cells. Science 2018;361(6409):1380–1385.
79. Lake BB, Chen S, Sos BC, Fan J, Kaeser GE, Yung YC, Duong TE, Gao D, Chun J,Kharchenko PV, et al. Integrative single-cell analysis of transcriptional and epige-netic states in the human adult brain. Nat Biotechnol 2018;36(1):70–80.
80. Mulqueen RM, Pokholok D, Norberg SJ, Torkenczy KA, Fields AJ, Sun D,Sinnamon JR, Shendure J, Trapnell C, O’Roak BJ, et al. Highly scalable generationof DNA methylation profiles in single cells. Nat Biotechnol 2018;36(5):428–431.
81. Ramani V, Deng X, Qiu R, Lee C, Disteche CM, Noble WS, Shendure J, Duan Z.Sci-Hi-C: A single-cell Hi-C method for mapping 3D genome organization in largenumber of single cells. Methods 2019;170:61–68.
82. Chen KH, Boettiger AN, Moffitt JR, Wang S, Zhuang X. RNA imaging. Spatiallyresolved, highly multiplexed RNA profiling in single cells. Science 2015;348(6233):aaa6090.
83. Wang X, Allen WE, Wright MA, Sylwestrak EL, Samusik N, Vesuna S, Evans K,Liu C, Ramakrishnan C, Liu J, et al. Three-dimensional intact-tissue sequencing ofsingle-cell transcriptional states. Science 2018;361(6400):eaat5691.
84. Eng CL, Lawson M, Zhu Q, Dries R, Koulena N, Takei Y, Yun J, Cronin C,Karp C, Yuan GC, et al. Transcriptome-scale super-resolved imaging in tissues byRNA seqFISH. Nature 2019;568(7751):235–239.
85. Goltsev Y, Samusik N, Kennedy-Darling J, Bhate S, Hale M, Vazquez G,Black S, Nolan GP. Deep profiling of mouse splenic architecture with CODEXmultiplexed imaging. Cell 2018;174(4):968–981.e15.
86. Stahl PL, Salmen F, Vickovic S, Lundmark A, Navarro JF, Magnusson J,Giacomello S, Asp M, Westholm JO, Huss M, et al. Visualization and analysis ofgene expression in tissue sections by spatial transcriptomics. Science 2016;353(6294):78–82.
87. Rodriques SG, Stickels RR, Goeva A, Martin CA, Murray E, Vanderburg CR,Welch J, Chen LM, Chen F, Macosko EZ. Slide-seq: A scalable technology for mea-suring genome-wide expression at high spatial resolution. Science 2019;363(6434):1463–1467.
88. Vickovic S, Eraslan G, Salmen F, Klughammer J, Stenbeck L, Schapiro D, Ajio T,Bonneau R, Bergenstrahle L, Navarro JF, et al. High-definition spatial trans-criptomics for in situ tissue profiling. Nat Methods 2019;16(10):987–990.
89. Stickels RR, Murray E, Kumar P, Li J, Marshall JL, Di Bella D, Arlotta P, MacoskoEZ, Chen F. Sensitive spatial genome wide expression profiling at cellular resolu-tion. bioRxiv 2020:2020.03.12.989806.
90. Macaulay IC, Haerty W, Kumar P, Li YI, Hu TX, Teng MJ, Goolam M, Saurat N,Coupland P, Shirley LM, et al. G&T-seq: Parallel sequencing of single-cell genomesand transcriptomes. Nat Methods 2015;12(6):519–522.
91. Rodriguez-Meira A, Buck G, Clark SA, Povinelli BJ, Alcolea V, Louka E,McGowan S, Hamblin A, Sousos N, Barkas N, et al. Unravelling Intratumoral het-erogeneity through high-sensitivity single-cell mutational analysis and parallelRNA sequencing. Mol Cell 2019;73(6):1292.
92. Shahi P, Kim SC, Haliburton JR, Gartner ZJ, Abate AR. Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding. SciRep 2017;7:44447.
93. Stoeckius M, Zheng S, Houck-Loomis B, Hao S, Yeung BZ, Mauck WM,Smibert P, Satija R. Cell hashing with barcoded antibodies enables multiplexingand doublet detection for single cell genomics. Genome Biol 2018;19(1):224.
94. McGinnis CS, Patterson DM, Winkler J, Conrad DN, Hein MY, Srivastava V,Hu JL, Murrow LM, Weissman JS, Werb Z, et al. MULTI-seq: Sample multiplexingfor single-cell RNA sequencing using lipid-tagged indices. Nat Methods 2019;16(7):619–626.
95. Svensson V, da Veiga Beltrame E, Pachter L. A curated database reveals trends insingle-cell transcriptomics. bioRxiv 2019:742304.
96. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K,Dewar K, Doyle M, FitzHugh W, et al. Initial sequencing and analysis of thehuman genome. Nature 2001;409(6822):860–921.
97. Mereu E, Lafzi A, Moutinho C, Ziegenhain C, MacCarthy DJ, Alvarez A, Batlle E,Sagar, Grün D, Lau JK, et al. Benchmarking single-cell RNA sequencing protocolsfor cell atlas projects. bioRxiv 2019:630087.
98. Hughes TK, Wadsworth MH, Gierahn TM, Do T, Weiss D, Andrade PR, Ma F, deAndrade Silva BJ, Shao S, Tsoi LC, et al. Highly efficient, massively-parallel single-cell RNA-seq reveals cellular states and molecular features of human skin pathol-ogy. bioRxiv 2019:689273.
99. Ding J, Adiconis X, Simmons SK, Kowalczyk MS, Hession CC, Marjanovic ND,Hughes TK, Wadsworth MH, Burks T, Nguyen LT, et al. Systematic comparativeanalysis of single cell RNA-sequencing methods. bioRxiv 2019:632216.
100. Rozenblatt-Rosen O, Stubbington MJT, Regev A, Teichmann SA. The human cellatlas: From vision to reality. Nature 2017;550(7677):451–453.
REVIEW ARTICLE
50
REVIEW ARTICLE
Computational approaches for high-throughput single-cell data analysis
Helena Todorov1,2,3 and Yvan Saeys1,2
1 Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
2 Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium
3 Centre International de Recherche en Infectiologie, Inserm, U1111, Université Claude Bernard Lyon 1, CNRS, UMR5308, École Normale Supérieure de Lyon, Univ Lyon, France
Keywords
bioinformatics; computational tools;
proteome; single cell; transcriptome
Correspondence
Y. Saeys, Department of Applied
Mathematics, Computer Science and
Statistics, Ghent University,
Technologiepark 927, 9052 Gent, Belgium
Fax: +32 9 221 76 73
Tel: +32 9 331 37 40
E-mail: [email protected]
(Received 22 February 2018, revised 4 June
2018, accepted 25 July 2018)
doi:10.1111/febs.14613
During the past decade, the number of novel technologies to interrogate biological systems at the single-cell level has skyrocketed. Numerous approaches for measuring the proteome, genome, transcriptome and epigenome at the single-cell level have been pioneered, using a variety of technologies. All these methods have one thing in common: they generate large and high-dimensional datasets that require advanced computational modelling tools to highlight and interpret interesting patterns in these data, potentially leading to novel biological insights and hypotheses. In this work, we provide an overview of the computational approaches used to interpret various types of single-cell data in an automated and unbiased way.
Introduction
Single-cell technologies are currently revolutionising the way life scientists are studying biological systems from different perspectives. Three major classes of technologies can be distinguished: imaging-based techniques, techniques based on flow or mass cytometry and techniques based on next-generation sequencing. However, this is only a rough classification, as some recent innovations combine elements of different classes of techniques. While many of the early data preprocessing steps are specific to each class of techniques, several downstream computational analyses are generally applicable to any form of single-cell data, and one of the goals of this work is to provide a unifying overview of these generally applicable approaches.
Historically, microscopy-based techniques were the first methodology used to study organisms at single-cell resolution [1]. While these techniques initially relied largely on manual labour and were therefore very low-throughput, automated image acquisition and segmentation have enabled high-throughput image-based screening, analysing up to hundreds of thousands of cells in single-well plates [2]. Similarly, many other microscopy-based techniques allow the extraction of information at the single-cell level, although at a lower throughput. These include most types of light and electron microscopy, with a broad variety of applications. Common to all these image-based approaches is the fact that advanced image-analysis pipelines are needed to arrive at single-cell resolution [3]. A typical image processing pipeline first performs segmentation of the single cells from the image, followed by a feature extraction step, typically extracting several hundreds of features for each individual cell [4]. In comparison to other single-cell approaches, in which cells are dissociated in suspension, a major advantage of image-based single-cell profiling is that it inherently provides the user with two- or three-dimensional spatial information; knowing a cell’s spatial context is often key to novel biological findings.
Abbreviations
DE, differential expression; HVGs, highly variable genes; scRNA-Seq, single-cell RNA sequencing; TI, trajectory inference.
Flow cytometry allows profiling and analysing cells in a high-throughput fashion and is based on passing cells through a laser beam in a rapidly flowing fluid stream. This core technology is in essence very similar to the original design from the late 1960s [5], illustrating the robustness of the technology [4,6]. The field of flow cytometry has emerged as a powerful methodology for single-cell analysis due to continuous innovations such as (a) multicolour assays enabling the measurement of a large number of proteins simultaneously [7], (b) spectral flow cytometry [8], in which classical mirrors, optics and detectors are replaced by dispersive optics and a linear array of detectors, allowing highly complex fluorochrome combinations, (c) imaging flow cytometry [9], combining flow cytometry and microscopy for high-throughput imaging of single cells, and (d) acoustic-based focusing and sorting [10]. In addition, other technological advances such as mass cytometry have replaced fluorescent labelling and optical readout with heavy-isotope labelling and subsequent readout by mass spectrometry [11]. This eliminates the problem of spectral overlap in classical flow cytometry, theoretically allowing the measurement of up to 100 proteins simultaneously. Mass cytometry can also be performed on tissue slices, scanning the tissue spot by spot and performing a single experiment per spot. This approach, named imaging mass cytometry, allows spatial proteomics to be performed in a high-throughput fashion [12]. The ability to measure increasing numbers of proteins simultaneously [7] complicates the analysis of this type of data, which can no longer be analysed manually, as was done with datasets containing a few markers per cell, but requires new computational approaches to correctly identify cell populations [13].
Recent developments in microvolume sequencing have led to a new wave of single-cell ‘-omics’ profiling technologies [14–18], permitting the quantification of whole genomes, epigenomes and transcriptomes at the single-cell level. Novel computational tools are being developed in order to deal with the continuously increasing dimensionality of these datasets, since a single experiment can quantify molecular characteristics of up to tens of thousands of cells, measuring tens of thousands of parameters (e.g. transcripts in the case of single-cell transcriptomics). A high level of resolution is provided by single-cell omics tools, as they aim to sequence all of the cell’s content, instead of focusing on a set of user-defined targets as is done in cytometry. This allows performing novel types of analyses, such as studying the heterogeneity of cell populations in much greater detail, identifying rare cell types, and studying the dynamics of cellular systems. Furthermore, the field continues to evolve by combining single-cell RNA sequencing with other technologies such as spatial transcriptomics [19] and CRISPR-mediated knockout screens (Perturb-Seq [20]/CRISP-seq [21]). Recent approaches combine transcriptomics with other types of omics data at a single-cell resolution such as single-cell proteomics (CITE-seq [22]/REAP-seq [23]), single-cell genomics (G&T-seq [24]) and single-cell methylomics (scM&T-seq [25]). These emerging ‘single-cell multi-omics’ technologies [26] integrate several types of measurements on the same single cell and are likely to be part of the everyday methodology of molecular biologists in the future.
While all techniques described above provide the user with information at single-cell level, the throughput, resolution, cost and type of information acquired differ drastically between technologies. We will take a computational perspective here, and compare the main dataset characteristics for the three major classes of single-cell data introduced above. Classical imaging-based techniques typically offer a low throughput, measuring a few hundred cells, while more advanced high-content screening methods allow high-throughput measurements of hundreds of thousands to millions of cells. When applying segmentation and feature extraction, for example using popular pipelines such as CELLPROFILER [27], almost a thousand image-derived features can be extracted per cell. However, many of those capture redundant information and are thus highly correlated. Flow and mass cytometry allow measuring cells at high throughput, up to millions of cells for classical flow cytometry. Only a few tens of parameters can be quantified simultaneously per single cell, but these parameters often represent very complementary information, as they are manually chosen by an expert. Single-cell omics technologies offer medium throughput, measuring thousands to tens of thousands of cells in a single run. However, these data are very rich in information, measuring thousands of transcripts in the case of single-cell transcriptomics.
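To make the remark about redundant image-derived features concrete, the following minimal Python sketch greedily discards features that are highly correlated with a feature already kept. The feature names, the values and the 0.95 threshold are hypothetical and chosen only for illustration; this is not a procedure prescribed by any of the tools cited here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def prune_correlated(features, threshold=0.95):
    """Keep a feature only if |r| with every already kept feature is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical per-cell measurements for four cells.
FEATURES = {
    "area":      [100.0, 150.0, 200.0, 250.0],
    "perimeter": [40.0, 49.0, 57.0, 63.0],   # grows with area, so largely redundant
    "intensity": [0.9, 0.2, 0.7, 0.4],
}
selected = prune_correlated(FEATURES)
```

In this toy example the perimeter is dropped because it tracks the area almost perfectly, while the intensity is retained. The result of a greedy scheme like this depends on the order in which features are visited.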
While the profiling methodology and dataset characteristics in each of these technologies are very different, many of the applications and computational workflows are quite similar. In the remainder of the paper, we will discuss the differences and commonalities in computational workflows for the different applications.
Computational workflow for single-cell experiments
Regardless of the specific technology used to generate a single-cell dataset, a common pipeline can be devised, starting with the experimental design, data generation, technology-specific preprocessing, quality control and subsequent data analysis (Fig. 1). A detailed design of the experiment is a crucial step towards minimising technical variation and improving scientific reproducibility. This not only includes standardisation of experimental protocols and equipment, but also careful planning and consultation with statisticians and/or bioinformaticians regarding sample size, specific setup related to the biological questions that should be answered or specific types of computational analyses that should be carried out. Subsequently the experiment should be performed, ensuring that standardised procedures are followed for sample preparation, handling equipment and data acquisition while appropriate controls are added at multiple steps of the experiments.
The next step in the pipeline is the preprocessing and quality control. This step will likely take a considerable amount of time, as it is crucial to start from good quality data if good quality results are desired. Therefore, it is important to perform technology-specific preprocessing steps, a topic that will be covered in the section ‘Data preprocessing and quality control’. After data preprocessing, an initial exploration of the data can be performed using visualisation techniques, in order to perform early detection of any possible batch effects or unexpected subpopulations. Applying visualisation techniques may also help to visualise the population structure within samples, and to compare this structure between different samples. In this step, interesting populations or trends may be observed that require further investigation.
Fig. 1. The computational workflow for single-cell experiments detailed in steps.
Next, several types of in-depth analyses can be performed, in most cases starting with an automated clustering of the cells into cell types. This clustering allows quantifying and comparing different cell types in the samples and identifying new cell types or transition states. Novel computational approaches to model gradual transitions between cell states (trajectory inference) can also be applied at this stage. Other alternatives include specific predictive modelling approaches such as classification, regression and survival analysis modelling. All these approaches have the potential to extract novel biomarkers from single-cell data, with important diagnostic and therapeutic potential. Finally, more advanced computational approaches can be applied to single-cell omics data. The correlations in gene expression within cells can be studied to assess gene regulatory networks (network inference). In the case of multi-omics datasets, data integration
approaches can be used to combine the information on single-cell mechanisms.
Data preprocessing and quality control
Single-cell imaging
The preprocessing of single-cell imaging data usually starts by accounting for batch effects through illumination correction, and image-wise processing such as noise removal, aligning or cropping [28,29]. This procedure is commonly followed by the segmentation of the individual cells within the images, and finally by a feature extraction process that yields a vector of numeric features for each individual cell, usually in a tabular format.
CELLPROFILER [27] is widely used to extract numerical features from two-dimensional microscopy images (such as in high-content screening assays). The main difficulty faced by CELLPROFILER is the segmentation of the cells or objects of interest present in the image. CELLPROFILER contains several fast algorithms that can extract well-separated objects; however, in many cases, these objects appear clumped, hindering their segmentation and making it prone to both false negatives (when the borders between objects cannot be found) and false positives (when the sensitivity of the detection is too high). In order to deal with this difficulty, CELLPROFILER also provides a more complex segmentation algorithm that follows a hierarchical process: first, it finds primary level objects that are typically well-separated (such as cell nuclei, visible on DNA-stain channels); then, the boundaries of secondary level objects (such as cell edges) are searched around the primary level objects.
However, it is also possible that the primary level objects appear clumped, which is why CELLPROFILER divides their detection into several steps following the guidelines of previously published algorithms [30–34]. Clumped objects are first detected, segmented and separated by dividing lines, thus avoiding false negatives. Finally, some of the objects are either removed or merged to reduce the false positive rate. Once the primary level objects are properly detected, it becomes simpler to find secondary level objects around them. CELLPROFILER provides an improved algorithm to properly detect the borders even when the objects are clumped against each other. Once the objects have been segmented, multiple features can be extracted from each of them on a per-channel basis (area, shape, intensity, texture, etc.) or at the whole-image level (number of cells, background intensity, etc.). CELLPROFILER has a modular structure that allows the user to select and configure the individual algorithms that will be applied, which in turn defines the specific preprocessing applied and the features that are obtained at the end of the pipeline. The resulting features can later be used, for instance, for visualisation, clustering or downstream differential analyses.
Flow/mass cytometry
In conventional flow cytometry, the first preprocessing step is typically compensation of the spectral overlap, to correct for spillover of the fluorescent signal into neighbouring channels. This is typically accounted for in the experimental procedure, by measuring the fluorescence of single stains in the different channels, allowing for the calculation of a compensation matrix. In mass cytometry, this issue is largely avoided by using rare isotopes instead of light measurements, although the measurement of certain isotopes can still be polluted due to metal impurity levels, oxidation and abundance sensitivity [35]. Mass cytometry panels should therefore be designed with caution by pairing high-intensity markers with less sensitive channels in order to avoid interference between channels [36]. The data are then transformed through a biexponential or hyperbolic arcsine transformation, which improves the separation between negative and positive cells for the different markers. Fluctuations in measurements can also be caused by an unsteady flow rate. Typically, up to 10 000 cells are measured per second at a steady rate in flow cytometry. Mass cytometry has a slightly lower throughput, measuring a few thousand cells per second. However, obstructions in the fluid stream and manual interventions can disturb the flow, which also affects the protein levels measured. To remove these technical artefacts, the data need to be either manually gated against time or screened by tools such as FLOWCLEAN [37], FLOWQ [38] and FLOWAI [39], which can automatically identify and remove sections in which the flow was perturbed.
The acquisition level of cytometers can slightly change from one day to another, or even within hours. The use of control tubes to calibrate the machine before running an experiment can help to make different samples more comparable, but batch effects are often observed between two experiments. The resulting slight shift in protein expression can be accounted for manually, by shifting the gates of every sample that differs, or in an automated way using the FLOWSTATS [40] package. In mass cytometry, beads are commonly used in the experiments, allowing normalisation of the data based on the signal of these beads to have more
data transformation is then applied to align similar cell
populations, resulting in more consistent datasets that
can be further analysed together.
Several quality control metrics, such as the library size and the percentage of counts from mitochondrial genes, are used to filter out abnormal cells, in order to reduce the technical variance of the data [50]. Additionally, a large part of the intercellular variability can be caused by the cell cycle, and it is up to the user to decide whether this variability should be removed from the data or not. Cyclone [51] is a method that can be used to predict the cell cycle stage, which can subsequently be used to either remove cycling cells, or tag them so that they can be easily identified later in the analysis. F-scLVM [52] is another algorithm that identifies the amount of variability across the expression of each gene that is due to cell cycle differences. It can be used to infer ‘corrected’ gene expression values, removing the effect of the cell cycle.
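As a rough illustration of the filtering step based on library size and mitochondrial content, the minimal Python sketch below computes both metrics for each cell and keeps only the cells that pass two cut-offs. The gene list, the count values, the `MT-` prefix used to recognise mitochondrial genes and the thresholds (at least 500 counts, at most 10% mitochondrial) are assumptions made for this example; appropriate values depend on the protocol and the tissue.

```python
def qc_metrics(counts, gene_names, mito_prefix="MT-"):
    """Return (library_size, percent_mito) for one cell's raw count vector."""
    library_size = sum(counts)
    mito = sum(c for c, g in zip(counts, gene_names) if g.startswith(mito_prefix))
    percent_mito = 100.0 * mito / library_size if library_size else 0.0
    return library_size, percent_mito

def passes_qc(counts, gene_names, min_library_size=500, max_percent_mito=10.0):
    """Apply the two (hypothetical) cut-offs to a single cell."""
    size, pct = qc_metrics(counts, gene_names)
    return size >= min_library_size and pct <= max_percent_mito

GENES = ["ACTB", "CD3E", "MT-CO1", "MT-ND1"]
CELLS = {
    "healthy":  [600, 350, 30, 20],    # 1000 counts, 5% mitochondrial
    "stressed": [300, 100, 400, 200],  # 1000 counts, 60% mitochondrial
    "empty":    [5, 2, 1, 0],          # only 8 counts in total
}
kept = [name for name, counts in CELLS.items() if passes_qc(counts, GENES)]
```

Only the first cell survives: the second is rejected for its high mitochondrial fraction and the third for its small library size.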
The next step in the process concerns the normalisation of the count data, since a large part of the observed variability can be due to differences in size, viability, capturing efficiency and amplification biases between cells. Some methods aim to standardise the total number of reads per cell (RPKM [53], TPM [54], downsampling) or proportions of the total number of reads per cell (UQ, full quantile [55]). However, these methods can be seriously impacted by false negative counts [56]. Indeed, because the number of transcripts in a cell is very low for certain genes, there is a high probability that these transcripts will be missed, resulting in a zero count in the final expression data. These missed transcripts are called dropouts, and lead to a high technical variance that can affect the final results. High-throughput scRNA-Seq protocols typically show higher dropout rates [43], but large numbers of sequenced cells can help to infer dropout probabilities. ZIFA [57] is a method that identifies zero counts that most likely result from dropout events, and gives less weight to these counts. ZINB-WAVE [58] is another method that not only assesses the probability for a zero to be a dropout based on the sequencing depth, but also accounts for batch effects between samples, and computes global-scaling normalisation factors, which allow it to be used directly on non-normalised data.
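The idea of standardising the total number of reads per cell can be sketched in a few lines of Python. The example below rescales each cell to a common total and applies a log transform; the target of 10 000 counts per cell and the use of `log1p` are assumed conventions for this illustration, not the definition of any of the methods cited above.

```python
import math

def normalise_cell(counts, target_total=10_000):
    """Rescale one cell's counts to a common total, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cell has no counts")
    return [math.log1p(c * target_total / total) for c in counts]

# Two hypothetical cells with identical composition but a tenfold difference in depth.
deep_cell = [200, 100, 0, 700]   # 1000 reads
shallow_cell = [20, 10, 0, 70]   # 100 reads
norm_deep = normalise_cell(deep_cell)
norm_shallow = normalise_cell(shallow_cell)
```

After scaling, the two cells have identical profiles, so the difference in sequencing depth no longer dominates. A zero count, however, remains zero whatever the scaling factor, which is why global scaling alone cannot correct dropouts and why dedicated models such as those mentioned above were developed.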
Some methods rely on spike-ins to distinguish techni-
cal variability from biologically relevant changes in
gene expression [59] (BASICS [60], GRM [61], SAMSTRT
[62]). Spike-ins are control RNA transcripts which are
added in the same quantity to all the samples to be
sequenced. They can be used to normalise the data,
as all cells should have exactly the same amount of
comparable samples. Some markers can also be used to barcode cells, and then pool several samples together, to avoid technical bias between different experimental conditions. When performing experiments on different days, it may be advisable to include additional control samples, such as an aliquot from the same sample that is taken along all different experiment days, in order to allow normalisation between experiment days later on. Once batch effects have been accounted for, debris, doublets and other low quality cells can be removed either by manual gating or using OPENCYTO [41], or FLOWDENSITY [42].
As flow cytometry allows the measurement of proteins at the single-cell level while preserving the integrity of the cells, it is sometimes used to sort specific cells into wells before sequencing their transcriptome. The cells can either be sorted by cell population, based on a set of common markers, or index-sorted, in which case single cells are sorted into wells and barcoded, so that their protein expression profile is kept. In this case, doublets and empty wells might occur, which should be carefully removed from the analysis before any further processing step.
Single-cell omics
Preprocessing single-cell omics data based on NGS technologies builds on the wide range of NGS preprocessing tools that are already available from experiments on bulk RNA or DNA. However, single-cell omics technologies lead to a number of additional challenges when going through the process from the individual reads to the mapped genomes or transcriptomes. We will focus here more specifically on methods for single-cell transcriptomics, as this is the most widely used type of single-cell omics data at present. Several scRNA-Seq protocols were developed, usually focusing either on sequencing a large number of cells, or a large number of genes at an increased sequencing depth [43]. Due to the low amount of transcripts in the cells, scRNA-Seq data usually contain a lot of technical variance, requiring specific computational tools to perform quality control, normalisation and downstream analyses [44–47].
When performing a computational analysis on scRNA-Seq data coming from multiple experiments, batch effects can arise, leading to an increased interexperimental variability. Two recently published algorithms can be used in order to reduce batch effects. These algorithms either identify a gene correlation structure [48], or a subset of cells coming from the same population [49], that are shared between the datasets coming from different experiments. Proper
Dimensionality reduction tools aim to capture the
structure of the high-dimensional data by projecting it
to a lower dimensional space that keeps the most
important structural properties of the original, high-
dimensional space. The lower dimensional projection
allows the human expert to visualise and explore the
data. Dimensionality reduction can be performed
either in a linear way (the lower dimensional projec-
tions are a linear combination of the original dimen-
sions), or in a nonlinear way. PCA is a linear
dimensionality reduction technique, in which the fea-
tures with the largest variability are preserved in prin-
cipal components. The main sources of variability in
the data can then be optimally laid out. A PCA can
Table 1. Dimensionality reduction-based and clustering-based tools for visualisation of single-cell high-dimensional data.

| Class of method | Name | Description |
| --- | --- | --- |
| Dimensionality reduction | PCA | Linear reduction in the dimensions holding the highest variance into orthogonal principal components |
| Dimensionality reduction | MDS | Nonlinear reduction in the dimensions by preserving the intercellular distances of high dimensions in the lower dimensions |
| Dimensionality reduction | tSNE | Nonlinear dimensionality reduction, preserves the local similarities between cells |
| Dimensionality reduction | Diffusion maps | Nonlinear dimensionality reduction, computes transition probabilities between cells |
| Dimensionality reduction | SPRING | k-Nearest Neighbour force directed graph, preserves the high-dimensional relationships between cells |
| Clustering | SPADE | Hierarchical clustering of the cells followed by the representation of these clusters in a minimal spanning tree |
| Clustering | FLOWSOM | SOM clustering followed by the representation of these clusters in a minimal spanning tree |
| Clustering | Scaffold Maps | Semisupervised method: new cells are grouped with the user-provided cell populations to which they are most similar |
| Clustering | FLOWMAP | Hierarchical clustering of the cells, followed by the representation of these clusters in a strongly connected graph structure |
| Clustering | Phenograph | Groups cells which share the same neighbours together and identifies communities which maximise the Louvain modularity |
spike-ins after sequencing, and the differences in spike-in amounts should only be the consequence of technical artefacts. However, the most commonly used spike-in set (ERCC [63]) cannot always faithfully account for the intrinsic gene variability, as these transcripts have been shown to have a length and GC content that differ from mammalian transcripts [58]. Moreover, choosing the quantity of spike-ins that should be added to the cells can be challenging, as a significant amount of spike-ins has to be used in order to reflect faithfully the intercellular variability, but may eclipse the intracellular transcripts of interest. Nevertheless, ERCC spike-ins are still commonly used to filter out low quality cells [50]. Overall, the views on the use of spike-ins for single-cell RNA-Seq normalisation are still conflicting [64–66].
The methods cited above apply global scaling factors to all cells equally, assuming that the relation between the number of genes measured per cell and the sequencing depth is the same for all genes. However, this assumption of a constant gene-count/sequencing depth ratio has been shown to hold for bulk RNA data, but not for single-cell datasets [67]. Applying global scaling factors to scRNA-Seq data might therefore lead to biased correction of lowly and highly expressed genes. Two algorithms can be used to perform single-cell specific normalisation of scRNA-Seq datasets. The SCnorm method [67] relies on the fact that the normalisation should not be applied in the same way to all the genes, as they differ in various properties such as transcript length and GC content. SCnorm first groups genes with similar dependencies on sequencing depth and subsequently estimates different scale factors for each group of genes. Alternatively, SCRAN [50] first groups cells with similar expression profiles together, and applies intragroup normalisation before performing intergroup normalisation.
Visualising high-dimensional single-cell data
Once the data has been preprocessed, visualisation tools can help to get a first insight into the structure of the data. A quick principal component analysis (PCA) plot of the data can, for instance, allow identifying any remaining source of technical variability between samples, which should be removed by normalisation. Structures in the data or biological differences between the samples may then be investigated using different approaches: dimensionality reduction techniques, clustering techniques, or the novel class of techniques to model cell trajectories and state transitions. A list of visualisation tools and their principal characteristics is provided in Table 1.
[50,78], which considerably reduces the number of fea-
tures and the noise they contain, while preserving the
main biologically relevant sources of variability.
Another algorithm was implemented in the SEURAT R
package [79] to filter HVGs. Visualisation, clustering
or any downstream analysis algorithms can then be
applied either to the HVGs, or, if the dimensions of
the data are still too high, on the principal compo-
nents of a PCA run on these HVGs.
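The filtering of highly variable genes described above can be illustrated with a deliberately simple sketch. It ranks genes by their dispersion (variance divided by mean) and keeps the top few. This criterion, the function name `top_variable_genes` and the toy data are assumptions chosen for illustration; the published approaches [50,78,79] fit more elaborate mean-variance models and are not reproduced here.

```python
from statistics import mean, pvariance
from typing import List, Sequence


def top_variable_genes(
    expr: Sequence[Sequence[float]],
    gene_names: Sequence[str],
    n_top: int,
) -> List[str]:
    """Return the n_top genes with the highest dispersion (variance / mean).

    expr is a cells x genes matrix: one row per cell, one column per gene.
    """
    scores = []
    for g, name in enumerate(gene_names):
        values = [cell[g] for cell in expr]
        m = mean(values)
        # Genes that are never detected get a dispersion of zero.
        dispersion = pvariance(values) / m if m > 0 else 0.0
        scores.append((dispersion, name))
    # Highest dispersion first; ties keep their original gene order.
    scores.sort(key=lambda s: s[0], reverse=True)
    return [name for _, name in scores[:n_top]]


# Hypothetical toy data: 4 cells x 3 genes.
expr = [
    [5, 0, 1],
    [5, 10, 1],
    [5, 0, 2],
    [5, 10, 2],
]
hvgs = top_variable_genes(expr, ["A", "B", "C"], n_top=2)
```

Gene A is constant across cells and is dropped, while B and C are retained. Downstream visualisation or clustering would then be run on the retained columns only.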
In order to highlight the differences between the dif-
ferent methods cited above, we applied two dimension-
ality reduction tools (PCA and tSNE) and two
clustering-based tools (FLOWSOM, Phenograph) on a
publicly available scRNA-Seq dataset [16] of 3000
peripheral blood mononuclear cells (PBMCs) from the
10X Genomics platform (Fig. 2). We first preprocessed
the dataset as described in the data preprocessing sec-
tion by filtering out low quality cells and genes. We
then selected the most highly variable genes, to which
we applied the different visualisation methods. This fil-
tering on highly variable genes has two advantages. It
significantly reduces the size of the dataset, therefore
reducing the analysis time, and it helps to focus on the
genes that are driving heterogeneity across cells [50].
The PBMC dataset had previously been expert-labelled
in the Seurat R pipeline [79], which allowed us to use
the cell identities to simplify the comparison of the
outputs from the different methods. The different
methods provided complementary information on the
structure of the data. For instance, all methods except
PCA identified the rare megakaryocyte cell population,
and all methods except FlowSOM represented these
megakaryocytes close to the monocyte cell population.
As a general guideline, it is often advisable to apply
several techniques in parallel to acquire a deeper
understanding of the data structure.
Cell type identification
While the clustering approach to single-cell analysis
assumes that cells are forming well separated groups,
other types of techniques focus on better detecting
cells that are in transition between cell states. In the
first case, the expression of certain markers is expected
to differ drastically, providing hard separations
between cell populations. In the second case, the mark-
ers are seen as continuous variables which smoothly
change from one cell to another, leading to structural
patterns in the data which can be seen as developmen-
tal trajectories (Fig. 3). The choice between the two
sets of methods depends on the biological question,
but a good practice can be to first apply a clustering
algorithm to identify the main populations in the data,
therefore be applied to check for batch effects in the data, or to identify any main source of variability. The use of nonlinear dimensionality reduction methods (e.g. tSNE [t-distributed stochastic neighbour embedding, 68], MDS [multidimensional scaling, 69], diffusion maps [70], SPRING [71]) allows optimal plotting of the data in two dimensions while preserving the local similarities between cells.
Clustering-based visualisation methods group similar cells together and may be combined with a subsequent visualisation step, for example by laying out the resulting clusters in two dimensions. This reduces computation time and can simplify the understanding of the resulting plot. Several methods have been proposed for the visualisation of clusters in single-cell data (SPADE [Spanning-tree Progression Analysis of Density-normalized Events] [72], FLOWSOM [73], FLOWMAP [74]). These methods represent the clusters in the form of a graph in which the most similar clusters are linked by an edge. FLOWSOM also allows meta-clustering, grouping clusters into larger populations, which has been shown to return results very similar to manual labelling of cytometry data [75]. Single-Cell Analysis by Fixed Force- and Landmark-Directed (Scaffold) maps [76] were specifically designed to simplify the identification of user-defined cell populations in cytometry data. Finally, Phenograph [77] identifies closely linked communities of cells in a graph structure. This algorithm therefore identifies populations without any previous knowledge of the number of expected populations, which can be very useful in discovery studies. While most of these methods were initially developed for flow cytometry data, FlowSOM and Phenograph are scalable to high-dimensional datasets. These methods can therefore be applied to mass cytometry and scRNA-Seq datasets, or to features extracted from images, allowing the visualisation of structure in the data.
However, scRNA-Seq and image-derived data typically contain many more dimensions than the usual 10–30 colour panels used in cytometry. When dealing with features extracted from images, a first step can consist in performing principal component analysis, which will help to reduce the redundancy of these highly correlated features. One can then choose to work with the principal components containing 95% of the data variability. These principal components can be analysed as new features, using visualisation or clustering techniques. scRNA-Seq datasets tend to contain noise which might bias clustering studies, especially due to the large number of lowly expressed genes and dropouts. Therefore, the highly variable genes (HVGs) can first be filtered on this type of data
and then perform trajectory inference on a specific
group of similar cells. Indeed, trajectory inference tools
will tend to identify trajectories in any dataset, so they
should be applied to specifically delineated sets of cells.
The identification of trajectories in highly variable
datasets is a current challenge, which has only
recently been described in the literature [80].
Clustering-based approaches
Several tools have been implemented in order to iden-
tify similar groups of cells in cytometry data, compar-
ing either the similarities between cells (SPADE [81],
FLOWSOM [73]), the distances between cells in a lower
dimensional space (Accense [82]) or the shared neigh-
bours in a graph (Phenograph [77]). A benchmark
study of clustering tools, the FLOWCAP I [83] challenge,
provided several mammalian datasets to assess the
ability of different clustering methods to identify cell
populations accurately. Most tools provided a good
delineation of cell populations compared to manual
gating, and ensemble methods which merged the out-
puts of several clustering methods showed the best
results. However, due to the increasing number of
markers used in cytometry data, there is a need to per-
form benchmark studies regularly, as tools which were
very efficient with low-dimensional datasets might not
necessarily perform equally well in higher dimensions
[84]. Another study [75] compared 18 clustering meth-
ods for conventional flow and mass cytometry data,
taking into account the clustering accuracy as well as
the computational time, which becomes more impor-
tant when dealing with large datasets. The FLOWSOM
[73] algorithm showed the best clustering accuracy and
was one of the fastest methods when applied to large
datasets, with a linear complexity with respect to the
number of cells. CytoCompare [85] is a tool which was
created to perform the comparison of the clustering
results of three methods: SPADE, ViSNE/Accense [82]
and Citrus [86].
The clustering algorithms described above can also
be applied to image derived features, although, as was
the case for visualisation techniques, the high correla-
tion between features might bias clustering results. The
redundancy of the features can be reduced by first
applying a PCA to this type of data, and performing
Fig. 2. Comparison of (A) tSNE, (B) PCA, (C) FLOWSOM and (D) Phenograph on the PBMC dataset. (A) The cell colours correspond to the
labels provided by experts in the Seurat R pipeline. (B) The main differences between cell types can be seen on the horizontal (1st principal
component) and vertical (2nd principal component) axis. (C) The colours inside the pies correspond to the cell colours on the tSNE plot. The
background colours correspond to the meta-clusters identified by FlowSOM. Discrepancies between the pie colour and the background
colour highlight the cells for which FlowSOM’s results diverged from the manual annotation. (D) The similarities between the different cell
types are nicely laid out on a Phenograph plot.
clustering on the principal components of the PCA. In
scRNA-Seq data, clustering is trickier because the
gene expression contains noise and the data is very
sparse. Cells may mistakenly be grouped together
based on technical noise attributed to sequencing
depth or library size, rather than actual biological
effects. This raises the need for new tools, which are
able to overcome this issue. Several tools no longer
compare the expression patterns of cells directly,
but apply additional strategies to cluster more accurately:
SC3 [87] computes a consensus clustering over several
k-means runs, at a high computational cost,
BackSPIN [88] uses a biclustering method and
DIMM-SC [89] was designed specifically for
droplet-based single-cell RNA-Seq data.
Another characteristic of scRNA-Seq data is the
high amount of dropout events. Some clustering
methods were specifically designed to deal with this
artefact, either by imputing the expected value of
dropout candidates (CIDR [90]), or by computing the
similarities between cells with techniques that are
robust to dropouts (SIMLR [91], SNN-Cliq [92], SCE-
NIC [93]). The PAGODA [94] algorithm also accounts
for technical biases such as the expression magnitude
and the cell cycle.
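To illustrate the idea of comparing cells through their shared neighbours rather than through their raw expression values, the sketch below builds a k-nearest-neighbour list per cell and scores each pair of cells by the Jaccard index of their neighbourhoods. This is a simplified illustration under stated assumptions (Euclidean distance, each cell counted as a member of its own neighbourhood, hypothetical function names); it is not the implementation of Phenograph or SNN-Cliq, and the community detection step that would follow is omitted.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def k_nearest(points: Sequence[Sequence[float]], k: int) -> List[Set[int]]:
    """For each cell, return the indices of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append({j for _, j in dists[:k]})
    return neighbours


def shared_neighbour_similarity(
    points: Sequence[Sequence[float]], k: int
) -> Dict[Tuple[int, int], float]:
    """Jaccard index of the neighbourhoods of every pair of cells."""
    nn = k_nearest(points, k)
    sim = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # Assumption: a cell belongs to its own neighbourhood.
            a = nn[i] | {i}
            b = nn[j] | {j}
            sim[(i, j)] = len(a & b) / len(a | b)
    return sim


# Hypothetical toy data: two well separated groups of three cells each.
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sim = shared_neighbour_similarity(cells, k=2)
```

Cells within a group share their whole neighbourhood and obtain a similarity of 1, whereas cells from different groups share no neighbours and obtain 0. Because the score depends only on neighbour rankings, it is less sensitive to the absolute expression differences that dropouts introduce.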
Approaches for modelling gradual transitions
Another set of approaches, called trajectory inference
(TI) methods, aim to reconstruct the developmental pro-
cess that cells are undergoing. The resulting trajectory
consists of states and transitions, with each cell mapped
to a pseudotemporal location in the trajectory (Fig. 4A).
Various visualisation techniques can aid in interpreting
[Figure 3. Panel labels: Expression data; Clustering (similarities within clusters are preserved); Trajectory inference (similarities between cells are preserved and displayed in lower dimensions); Pseudotime.]
Fig. 3. In order to identify structures in an expression data matrix, two types of methods can be used. Clustering-based methods will tend
to maximise the similarities between cells within clusters while maximising the differences between clusters. These methods thus help to
identify homogeneous groups of cells in the data. On the other hand, trajectory inference methods will tend to preserve the local similarities
between cells, ordering them along trajectories which represent gradual changes between similar cells.
the cell state- and branching point delineation, by visual-
ising the expression value of a marker over time
(Fig. 4B), comparing the gene expression values in cells
within the reduced dimensions (Fig. 4C), or grouping
genes together in pseudotemporally coregulated modules
(Fig. 4D). Cannoodt et al. [95] provide an overview of
several commonly used TI methods, organising them by
the different components they are based on.
Trajectory inference was first explored on mass
cytometry in order to reconstruct the differentiation of
hematopoietic stem cells into naive B cells [96]. Since
then, TI methods have been used increasingly to
reconstruct cell developmental trajectories. There are
several strategies TI methods use to tackle this com-
plexity, and the choice of which method is most
appropriate will thereby depend on the characteristics
of the given dataset [97]. Pioneering TI methods were
often specialised in producing a fixed trajectory type
(e.g. linear [96,98], bifurcating [70,99] or cyclical [100]).
Some methods require specific input [101], while others
[Figure 4. Panels A–D. Recoverable labels: States 1–5; Markers 1–4; axes: Pseudotime, Expression of Marker 3.]
Fig. 4. There are several approaches to visualising trajectory models inferred by TI methods. (A) The most common visualisation is a
dimensionality reduction where similar cells are placed close together. The cells are typically coloured based on prior knowledge (e.g. cell
type) or computationally inferred clustering, and are overlaid by the trajectory inferred by the TI method. (B) A scatter plot can be used to
demonstrate a response in gene expression over pseudotime. (C) Colouring of the cells in the dimensionality reduction plot can also be
used to compare the gene expression profiles. (D) In order to obtain an overview of the dynamics of a large number of genes, these genes
can be grouped together into modules, and one path along the trajectory can be visualised in the form of a heatmap.
expression (DE) of genes in scRNA-Seq data (SCDE
[106], MAST [107], scDD [108]). These methods use
mixture models or Bayesian modelling frameworks to
identify both the technical effects between samples
(mainly caused by the gene detection rate) and the vari-
ance which is related to the condition being tested.
Another method, CENSUS [72], normalises the single-
cell gene expression into relative transcript counts
(accounting for technical variability between cells) in
time series studies specifically, allowing for the identifi-
cation of genes whose expression varies along time.
These single-cell specific DE methods aim to free them-
selves from the idea that gene expression is unimodal
across cells. Indeed, as many cells often show unmea-
sured genes, either due to biological or technical effects,
these methods model gene expression through more
elaborate distributions.
However, a recent study [109], which compared 36
differential gene expression approaches, concluded that
methods that were largely used for the DE analysis of
bulk RNA datasets (such as DESEQ2 [110], edgeR [111],
VOOM [112]), were in fact not performing worse than
single-cell specific DE methods on scRNA-Seq data-
sets. Single-cell specific DE approaches also required
more computational time, although they scaled well
with increasing cell numbers. This comparative study
highlighted that accurate gene filtering generally
improved the results of a DE analysis: it reduces noise
from lowly expressed genes, leading to fewer false
positive genes being identified as differentially
expressed.
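A minimal sketch of such a gene filtering step is given below: genes detected in too few cells are removed before any DE test is run. The thresholds (`min_cells`, `min_count`), the function name and the toy matrix are hypothetical choices for illustration; the benchmark study [109] should be consulted for the filtering settings it actually evaluated.

```python
from typing import List, Sequence, Tuple


def filter_genes(
    counts: Sequence[Sequence[int]],
    gene_names: Sequence[str],
    min_cells: int = 2,
    min_count: int = 1,
) -> Tuple[List[List[int]], List[str]]:
    """Keep genes with at least min_count reads in at least min_cells cells.

    counts is a cells x genes matrix: one row per cell, one column per gene.
    """
    kept = [
        g
        for g in range(len(gene_names))
        if sum(1 for cell in counts if cell[g] >= min_count) >= min_cells
    ]
    filtered = [[cell[g] for g in kept] for cell in counts]
    return filtered, [gene_names[g] for g in kept]


# Hypothetical toy matrix: 3 cells x 3 genes.
counts = [
    [3, 0, 1],
    [2, 0, 0],
    [4, 1, 0],
]
filtered, kept_names = filter_genes(counts, ["g1", "g2", "g3"], min_cells=2)
```

Only g1 is detected in at least two cells, so g2 and g3 are removed and would not be tested for differential expression.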
Advanced computational approaches
Network inference
Single-cell transcriptomics provide a rich source of
data, by quantifying the expression profiles of thou-
sands of cells. The intercellular heterogeneity which
naturally results from biological stochasticity [113]
allows inferring mechanisms of gene regulation involv-
ing transcription factors and their target genes. More
complex, nonlinear interactions between genes can be
studied at the single-cell level, as was shown with the
PIDC [114] algorithm, which was able to infer regula-
tory networks involved in developmental processes
from sc-qPCR datasets. However, inferring one global
regulatory network from thousands of cells might not
always prove accurate. Different subpopulations of
cells in the data might be undergoing different regula-
tory processes, which is why some methods were
implemented specifically to compute differential regu-
latory networks. These methods derive one regulatory
are capable of inferring the trajectory structure in an unbiased way [72,102]. A recent comparative review [97] assessed the performance of more than thirty TI methods on both synthetic and real scRNA-Seq datasets, providing useful practical guidelines to choose the most appropriate methods. Notably, no method consistently outperformed the others on all datasets. Rather, various sets of methods were better suited to specific trajectories in the datasets, with some methods better identifying linear trajectories, and others efficiently identifying cycles. A good practice would therefore be to identify a set of TI methods to apply to the data based on the expected structure, and to compare the results of at least 2–3 methods to confirm the biological findings.
Differential analysis
Cytometry-based approaches
In order to identify cell populations which differ between different experimental conditions (e.g. between samples of patients with different clinical outcomes), cytometry data can first be clustered, and these clusters can be compared between the conditions. In FLOWSOM [73], the user can provide a fold-change threshold, to colour clusters which differ between the conditions. The Citrus [86] and COMPASS [103] algorithms both perform model selection to identify the clusters which are best associated with a certain condition. A similar method was implemented, which groups cells into hyperspheres instead of clusters (Cydar [59]). Convolutional neural networks have also been used to identify subpopulations of cells which differ the most between two conditions (CellCNN [104]). However, none of these methods directly cope with complex experiments and may therefore be sensitive to batch effects, which might be misinterpreted as the main difference between the conditions. One solution is to first remove possible batch effects in a preprocessing step before performing differential analysis. A CYTOF workflow [105] has been proposed, which first applies clustering and then uses Gaussian linear mixture models to perform differential analysis while accounting for possible batch effects, paired experiments and other sources of technical variance in the data.
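The fold-change comparison of clusters between conditions can be sketched as follows. The snippet computes the proportion of cells assigned to each cluster in two conditions and flags clusters whose log2 fold change exceeds a user-provided threshold. The function names, the pseudocount and the default threshold are assumptions made for illustration; this is not the FLOWSOM interface, and it ignores the batch effects and replicate structure discussed above.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def cluster_proportions(labels: Sequence[str]) -> Dict[str, float]:
    """Fraction of cells assigned to each cluster."""
    counts = Counter(labels)
    total = len(labels)
    return {cluster: n / total for cluster, n in counts.items()}


def differing_clusters(
    labels_a: Sequence[str],
    labels_b: Sequence[str],
    log2_threshold: float = 1.0,
    pseudocount: float = 1e-3,
) -> Tuple[Dict[str, float], List[str]]:
    """Log2 fold change of cluster proportions (condition B versus A).

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a cluster is absent from one condition.
    """
    prop_a = cluster_proportions(labels_a)
    prop_b = cluster_proportions(labels_b)
    log2_fc = {}
    for cluster in sorted(set(prop_a) | set(prop_b)):
        ratio = (prop_b.get(cluster, 0.0) + pseudocount) / (
            prop_a.get(cluster, 0.0) + pseudocount
        )
        log2_fc[cluster] = math.log2(ratio)
    flagged = [c for c, lfc in log2_fc.items() if abs(lfc) >= log2_threshold]
    return log2_fc, flagged


# Hypothetical cluster assignments for two conditions.
condition_a = ["T"] * 50 + ["B"] * 50
condition_b = ["T"] * 80 + ["B"] * 20
log2_fc, flagged = differing_clusters(condition_a, condition_b)
```

With a threshold of one log2 unit (a twofold change), only the B cluster is flagged here: its proportion drops from 0.5 to 0.2, whereas the increase of the T cluster stays below twofold.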
Sequencing-based approaches
The technical biases which have to be dealt with are even larger in single-cell and bulk RNA-Seq data, as many genes are lowly expressed and noisy. Several methods were proposed to specifically tackle differential
and its transcription. More surprisingly, the measure-
ment of both transcripts and proteins [122,123] in single
cells has highlighted the fact that the amount of these
two entities was poorly correlated. This could be due to
the fact that transcription occurs in bursts, resulting in
high discrepancies between the numbers of transcripts,
whereas protein levels have been shown to be more
stable for particular genes [124].
The experimental procedures cited above led to low-
throughput datasets, typically containing 100 cells at
most, and could therefore be analysed by regular corre-
lation studies to assess the links between different omics
entities. The recently published CITE-seq [22] and
REAP-seq [23] methods have allowed the simultaneous
measurement of the transcriptome as well as 100 pro-
teins in thousands of cells, and have the potential to
measure thousands of proteins in single cells, as these
proteins are tagged with synthetic oligonucleotides.
Some studies have also achieved a broader characterisa-
tion of single cells by combining proteomics- and imag-
ing-based approaches [125,126]. As new experimental
procedures keep providing larger and larger datasets,
and new tools provide more insight into the mechanisms
of regulation at the single-cell level [127,128],
there is a great need for multi-omics integrative compu-
tational tools. These tools should have the ability to
combine the information coming from complementary
sources to infer complex global models.
Conclusions and future perspectives
Various high-throughput approaches currently allow
cell populations to be studied in unprecedented depth.
The rapid development of novel technologies or
hybridisations between them is generating large and
complex datasets that require designing novel computa-
tional approaches for preprocessing, visualising and
extracting novel patterns from them. As novel technologies
arise, the development of computational tools and
the adequate benchmarking between them is lagging
behind. Indeed, many computational approaches to study
single-cell data are continuously being published, but
benchmark studies that objectively compare these
methods remain scarce.
Nevertheless, such benchmarks are essential to extract
useful guidelines for biologists who want to use these
tools, pinpoint limitations of current approaches and
highlight novel directions for future tool development.
While current methods mainly focus on cells in sus-
pension, novel advances that include the spatial con-
text will stimulate novel classes of computational tools
that will enable modelling cellular interactions and cell
dynamics in much greater depth. Such techniques
network for each cell subtype (CSRF [115], Pólya tree models [116]).
In order to improve the inference of gene regulatory networks, external sources of information can be provided. As was discussed in the section ‘Approaches for modelling gradual transitions’, cells can be ordered along developmental trajectories. Some network inference methods can include the information from these inferred trajectories to reconstruct dynamic regulatory networks (AR1MA1 [117], SCODE [118]). Another source of external information could come from perturbational studies, in which genes are knocked out and the consequences on the transcriptome can be observed [21]. New tools will be needed to optimally use this type of data in order to infer regulatory networks.
Single-cell transcriptomics data represent a rich source of information to infer interactions which occur between genes and transcription factors. However, new studies are highlighting the need to focus not only on a single cell’s transcripts, but also on the methylation state of the DNA, the chromatin state and other epigenomic data that might enrich our knowledge of the gene regulation dynamics [119,120].
Single-cell multi-omics data integration
Single-cell transcriptomics, proteomics, genomics and epigenomics have provided a level of understanding of the cellular heterogeneity that could not be reached with bulk studies. However, the models which are inferred from single technologies are by definition incomplete. Indeed, the relationships between the genome, the amount of transcripts and proteins in a single cell are not always straightforward. Transcriptional regulatory mechanisms such as methylation may for instance alter the correlation between the gene copy number and the associated number of transcripts. Moreover, post-transcriptional mechanisms regulating protein translation and stability may also influence the relation between the number of transcripts and proteins in a cell. In order to fully understand and to start modelling the mechanisms involved in single cells, it will therefore be essential to integrate complementary types of data from the same single cells [26].
New experimental approaches have already been able to achieve a simultaneous and multiparameter measurement by combining methods. The study of the genome together with the transcriptome [24,121] for instance has confirmed the existence of a strong correlation between genes with high copy numbers and the number of mRNA transcripts. The joint analysis of the methylome together with the transcriptome [25] also corroborated the negative relation between the methylation of a gene
1 Liu Z, Lavis LD & Betzig E (2015) Imaging live-cell dynamics and structure at the single-molecule level. Mol Cell 58, 644.
2 Abraham V, Taylor D & Haskins J (2004) High content screening applied to large-scale cell biology. Trends Biotechnol 22, 15–22.
3 Goodman A & Carpenter AE (2016) High-throughput, automated image processing for large-scale fluorescence microscopy experiments. Microsc Microanal 22, 538–539.
4 Kamentsky L, Jones TR, Fraser A, Bray MA, Logan DJ, Madden KL, Ljosa V, Rueden C, Eliceiri KW & Carpenter AE (2011) Improved structure, function and compatibility for CellProfiler: modular high-throughput image analysis software. Bioinformatics 27, 1179–1180.
5 Fulwyler MJ (1965) Electronic separation of biological cells by volume. Science (New York, NY) 150, 910–911.
6 Robinson JP & Roederer M (2015) Flow cytometry strikes gold. Science 350, 739–740.
7 Perfetto SP, Chattopadhyay PK & Roederer M (2004) Innovation: Seventeen-colour flow cytometry: unravelling the immune system. Nat Rev Immunol 4, 648–655.
8 Nolan JP & Condello D (2013) Spectral flow cytometry. In Current Protocols in Cytometry, p. 1.27.1–1.27.13. John Wiley & Sons, Inc., Hoboken, NJ.
9 McGrath KE, Bushnell TP & Palis J (2008) Multispectral imaging of hematopoietic cells: where flow meets morphology. J Immunol Methods 336, 91–97.
10 Goddard G, Martin JC, Graves SW & Kaduchak G (2006) Ultrasonic particle-concentration for sheathless focusing of particles for analysis in a flow cytometer. Cytometry Part A 69A, 66–74.
11 Bandura DR, Baranov VI, Ornatsky OI, Antonov A, Kinach R, Lou X, Pavlov S, Vorobiev S, Dick JE & Tanner SD (2009) Mass cytometry: technique for real time single cell multitarget immunoassay based on inductively coupled plasma time-of-flight mass spectrometry. Anal Chem 81, 6813–6822.
12 Giesen C, Wang HA, Schapiro D, Zivanovic N, Jacobs A, Hattendorf B, Schüffler PJ, Grolimund D, Buhmann JM, Brandt S et al. (2014) Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nat Methods 11, 417–422.
13 Saeys Y, Van Gassen S & Lambrecht BN (2016) Computational flow cytometry: helping to make sense of high-dimensional immunology data. Nat Rev Immunol 16, 449–462.
14 Picelli S, Björklund ÅK, Faridani OR, Sagasser S, Winberg G & Sandberg R (2013) Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods 10, 1096–1098.
15 Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM et al. (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214.
16 Zheng GXY, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, Ziraldo SB, Wheeler TD, McDermott GP, Zhu J et al. (2017) Massively parallel digital transcriptional profiling of single cells. Nat Commun 8, 14049.
17 Gierahn TM, Wadsworth MH, Hughes TK, Bryson BD, Butler A, Satija R, Fortune S, Love JC & Shalek AK (2017) Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput. Nat Methods 14, 395–398.
18 Rosenberg AB, Roco C, Muscat RA, Kuchina A, Mukherjee S, Chen W, Peeler DJ, Yao Z, Tasic B, Sellers DL et al. (2017) Scaling single cell transcriptomics through split pool barcoding. bioRxiv [preprint].
19 St�ahl PL, Salm�en F, Vickovic S, Lundmark A,
Navarro JF, Magnusson J, Giacomello S, Asp M,
Westholm JO, Huss M et al. (2016) Visualization and
analysis of gene expression in tissue sections by spatial
transcriptomics. Science (New York, NY) 353, 78–82.20 Dixit A, Parnas O, Li B, Chen J, Fulco CP, Jerby-
Arnon L, Marjanovic ND, Dionne D, Burks T,
will allow going from cells in isolation to tissues and organs, offering new perspectives for multiscale mod-
elling. On the other hand, single-cell multi-omics approaches are providing complementary information that can relate epigenetic, transcriptional and transla-tional information, paving the way for single-cell mul-
ti-omics and multi-source data integration.All of these advances strengthen the idea that the
life sciences are becoming even more data-driven sciences. To be able to analyse and correctly interpret the results of computational pipelines, young research-ers thus should be trained adequately in properly using and understanding the principles of these novel com-
putational approaches.
Acknowledgements
We thank Sofie Van Gassen, Robrecht Cannoodt, Niels Vandamme and Daniel Peralta for critical com-
ments and valuable input. HT is funded by a BOF-
IOP grant from Ghent University; YS is an ISAC Marylou Ingram scholar.
Conflict of interest
The authors declare no competing interests.
References
63
Raychowdhury R et al. (2016) Perturb-Seq: dissecting
molecular circuits with scalable single-cell RNA
profiling of pooled genetic screens. Cell 167, 1853–1866e17.
21 Jaitin DA, Weiner A, Yofe I, Lara-Astiaso D, Keren-Shaul H, David E, Meir Salame T, Tanay A, van Oudenaarden A & Amit I (2016) Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-Seq. Cell 167, 1883–1896.e15.
22 Stoeckius M, Hafemeister C, Stephenson W, Houck-Loomis B, Chattopadhyay PK, Swerdlow H, Satija R & Smibert P (2017) Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14, 865–868.
23 Peterson VM, Zhang KX, Kumar N, Wong J, Li L, Wilson DC, Moore R, McClanahan TK, Sadekova S & Klappenbach JA (2017) Multiplexed quantification of proteins and transcripts in single cells. Nat Biotechnol 35, 936–939.
24 Macaulay IC, Haerty W, Kumar P, Li YI, Hu TX, Teng MJ, Goolam M, Saurat N, Coupland P, Shirley LM et al. (2015) G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat Methods 12, 519–522.
25 Angermueller C, Clark SJ, Lee HJ, Macaulay IC, Teng MJ, Hu TX, Krueger F, Smallwood SA, Ponting CP, Voet T et al. (2016) Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 13, 229–232.
26 Macaulay IC, Ponting CP & Voet T (2017) Single-cell multiomics: multiple measurements from single cells. TIG 33, 155–168.
27 Carpenter AE, Jones TR, Lamprecht MR, Clarke C, Kang IH, Friman O, Guertin DA, Chang J, Lindquist RA, Moffat J et al. (2006) Cell Profiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol 7, R100.
28 Peng T, Thorn K, Schroeder T, Wang L, Theis FJ, Marr C & Navab N (2017) A BaSiC tool for background and shading correction of optical microscopy images. Nat Commun 8, 14836.
29 Smith K, Li Y, Piccinini F, Csucs G, Balazs C, Bevilacqua A & Horvath P (2015) CIDRE: an illumination-correction method for optical microscopy. Nat Methods 12, 404–406.
30 Wählby C (2003) Algorithms for applied digital image cytometry. Acta Universitatis Upsaliensis. Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 896, 75 pp., Uppsala. ISBN 91-554-5759-2.
31 Malpica N, de Solórzano CO, Vaquero JJ, Santos A, Vallcorba I, García-Sagredo JM & del Pozo F (1998) Applying watershed algorithms to the segmentation of clustered nuclei. Cytometry 28, 289–297.
32 Wahlby C, Sintorn IM, Erlandsson F, Borgefors G & Bengtsson E (2004) Combining intensity, edge and shape information for 2D and 3D segmentation of cell nuclei in tissue sections. J Microsc 215, 67–76.
33 Ortiz de Solórzano C, García Rodriguez E, Jones A, Pinkel D, Gray JW, Sudar D & Lockett SJ (1999) Segmentation of confocal microscope images of cell nuclei in thick tissue sections. J Microsc 193, 212–226.
34 Meyer F & Beucher S (1990) Morphological segmentation. J Vis Commun Image Represent 1, 21–46.
35 Leipold MD (2015) Another step on the path to mass cytometry standardization. Cytometry Part A 87, 380–382.
36 Takahashi C, Au-Yeung A, Fuh F, Ramirez-Montagut T, Bolen C, Mathews W & O’Gorman WE (2017) Mass cytometry panel optimization through the designed distribution of signal interference. Cytometry Part A 91, 39–47.
37 Fletez-Brant K, Špidlen J, Brinkman RR, Roederer M & Chattopadhyay PK (2016) flowClean: automated identification and removal of fluorescence anomalies in flow cytometry data. Cytometry Part A 89, 461–471.
38 Bashashati A & Brinkman RR (2009) A survey of flow cytometry data analysis methods. Adv Bioinform 2009, 584603.
39 Monaco G, Chen H, Poidinger M, Chen J, de Magalhães JP & Larbi A (2016) flowAI: automatic and interactive anomaly discerning tools for flow cytometry data. Bioinformatics 32, 2473–2480.
40 Hahne F, Khodabakhshi AH, Bashashati A, Wong CJ, Gascoyne RD, Weng AP, Seyfert-Margolis V, Bourcier K, Asare A, Lumley T et al. (2010) Per-channel basis normalization methods for flow cytometry data. Cytometry Part A 77, 121–131.
41 Finak G, Frelinger J, Jiang W, Newell EW, Ramey J, Davis MM, Kalams SA, De Rosa SC & Gottardo R (2014) OpenCyto: an open source infrastructure for scalable, robust, reproducible, and automated, end-to-end flow cytometry data analysis. PLoS Comput Biol 10, e1003806.
42 Malek M, Taghiyar MJ, Chong L, Finak G, Gottardo R & Brinkman RR (2015) flowDensity: reproducing manual gating of flow cytometry data by automated density-based cell population identification. Bioinformatics 31, 606–607.
43 Ziegenhain C, Vieth B, Parekh S, Reinius B, Guillaumet-Adkins A, Smets M, Leonhardt H, Heyn H, Hellmann I & Enard W (2017) Comparative analysis of single-cell RNA sequencing methods. Mol Cell 65, 631–643.
44 Stegle O, Teichmann SA & Marioni JC (2015) Computational and analytical challenges in single-cell transcriptomics. Nat Rev Genet 16, 133–145.
45 Poirion OB, Zhu X, Ching T & Garmire L (2016) Single-cell transcriptomics bioinformatics and computational challenges. Front Genet 7, 163.
46 Bacher R & Kendziorski C (2016) Design and computational analysis of single-cell RNA-sequencing experiments. Genome Biol 17, 63.
47 McCarthy DJ, Campbell KR, Lun ATL & Wills QF (2016) scater: pre-processing, quality control, normalisation and visualisation of single-cell RNA-seq data in R. bioRxiv [preprint].
48 Butler A, Hoffman P, Smibert P, Papalexi E & Satija R (2018) Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol 36, 411–420.
49 Haghverdi L, Lun ATL, Morgan MD & Marioni JC (2018) Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat Biotechnol 36, 421–427.
50 Lun ATL, McCarthy DJ & Marioni JC (2016) A step-by-step workflow for low-level analysis of single-cell RNA-seq data with bioconductor. F1000Research 5, 2122.
51 Scialdone A, Natarajan KN, Saraiva LR, Proserpio V, Teichmann SA, Stegle O, Marioni JC & Buettner F (2015) Computational assignment of cell-cycle stage from single-cell transcriptome data. Methods 85, 54–61.
52 Buettner F, Pratanwanich N, McCarthy DJ, Marioni JC & Stegle O (2017) f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq. Genome Biol 18, 212.
53 Mortazavi A, Williams BA, McCue K, Schaeffer L & Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5, 621–628.
54 Wagner GP, Kin K & Lynch VJ (2012) Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples. Theory Biosci 131, 281–285.
55 Bullard JH, Purdom E, Hansen KD & Dudoit S (2010) Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11, 94.
56 Vallejos CA, Risso D, Scialdone A, Dudoit S & Marioni JC (2017) Normalizing single-cell RNA sequencing data: challenges and opportunities. Nat Methods 14, 565–571.
57 Pierson E & Yau C (2015) ZIFA: dimensionality reduction for zero-inflated single-cell gene expression analysis. Genome Biol 16, 241.
58 Risso D, Perraudeau F, Gribkova S, Dudoit S & Vert JP (2017) ZINB-WaVE: a general and flexible method for signal extraction from single-cell RNA-seq data. bioRxiv [preprint].
59 Lun ATL, Richard AC & Marioni JC (2017) Testing for differential abundance in mass cytometry data. Nat Methods 14, 707–709.
60 Vallejos CA, Marioni JC & Richardson S (2015) BASiCS: Bayesian analysis of single-cell sequencing data. PLoS Comput Biol 11, e1004333.
61 Ding B, Zheng L, Zhu Y, Li N, Jia H, Ai R, Wildberg A & Wang W (2015) Normalization and noise reduction for single cell RNA-seq experiments. Bioinformatics 31, 2225–2227.
62 Katayama S, Töhönen V, Linnarsson S & Kere J (2013) SAMstrt: statistical test for differential expression in single-cell transcriptome with spike-in normalization. Bioinformatics 29, 2943–2945.
63 Reid LH (2005) Proposed methods for testing and selecting the ERCC external RNA controls. BMC Genom 6, 150.
64 Baran-Gale J, Chandra T & Kirschner K (2017) Experimental design for single-cell RNA sequencing. Brief Funct Genomics 17, 233–239.
65 Tung PY, Blischak JD, Hsiao CJ, Knowles DA, Burnett JE, Pritchard JK & Gilad Y (2017) Batch effects and the effective design of single-cell gene expression studies. Sci Rep 7, 39921. https://doi.org/10.1038/srep39921
66 Lun AT, Calero-Nieto FJ, Haim-Vilmovsky L, Gottgens B & Marioni JC (2017) Assessing the reliability of spike-in normalization for analyses of single-cell RNA sequencing data. bioRxiv [preprint].
67 Bacher R, Chu LF, Leng N, Gasch AP, Thomson JA, Stewart RM, Newton M & Kendziorski C (2017) SCnorm: robust normalization of single-cell RNA-seq data. Nat Methods 14, 584–586.
68 van der Maaten L & Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9, 2579–2605.
69 Kruskal JB (1964) Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29, 1–27.
70 Haghverdi L, Buettner F & Theis FJ (2015) Diffusion maps for high-dimensional single-cell analysis of differentiation data. Bioinformatics 31, 2989–2998.
71 Weinreb C, Wolock S & Klein A (2017) SPRING: a kinetic interface for visualizing high dimensional single-cell expression data. bioRxiv [preprint].
72 Qiu X, Mao Q, Tang Y, Wang L, Chawla R, Pliner HA & Trapnell C (2017) Reversed graph embedding resolves complex single-cell trajectories. Nat Methods 14, 979–982.
73 Van Gassen S, Callebaut B, Van Helden MJ, Lambrecht BN, Demeester P, Dhaene T & Saeys Y (2015) FlowSOM: using self-organizing maps for visualization and interpretation of cytometry data. Cytometry Part A 87, 636–645.
74 Zunder ER, Lujan E, Goltsev Y, Wernig M & Nolan GP (2015) A continuous molecular roadmap to iPSC reprogramming through progression analysis of single-cell mass cytometry. Cell Stem Cell 16, 323–337.
75 Weber LM & Robinson MD (2016) Comparison of clustering methods for high-dimensional single-cell flow and mass cytometry data. Cytometry Part A 89, 1084–1096.
76 Spitzer MH, Gherardini PF, Fragiadakis GK, Bhattacharya N, Yuan RT, Hotson AN, Finck R, Carmi Y, Zunder ER, Fantl WJ et al. (2015) An interactive reference framework for modeling a dynamic immune system. Science 349, 1259425.
77 Levine JH, Simonds EF, Bendall SC, Davis KL, Amir EAD, Tadmor MD, Litvin O, Fienberg HG, Jager A, Zunder ER et al. (2015) Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell 162, 184–197.
78 Klein A, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, Peshkin L, Weitz DA & Kirschner MW (2015) Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201.
79 Satija R, Farrell JA, Gennert D, Schier AF & Regev A (2015) Spatial reconstruction of single-cell gene expression data. Nat Biotechnol 33, 495–502.
80 Campbell KR & Yau C (2016) Order under uncertainty: robust differential expression analysis using probabilistic models for pseudotime inference. PLoS Comput Biol 12, e1005212.
81 Anchang B, Hart TDP, Bendall SC, Qiu P, Bjornson Z, Linderman M, Nolan GP & Plevritis SK (2016) Visualization and cellular hierarchy inference of single-cell data using SPADE. Nat Protoc 11, 1264–1279.
82 Shekhar K, Brodin P, Davis MM & Chakraborty AK (2014) Automatic classification of cellular expression by nonlinear stochastic embedding (ACCENSE). Proc Natl Acad Sci USA 111, 202–207.
83 Aghaeepour N, Finak G, Hoos H, Mosmann TR, Brinkman R, Gottardo R & Scheuermann RH (2013) Critical assessment of automated flow cytometry data analysis techniques. Nat Methods 10, 228–238.
84 Newell EW & Cheng Y (2016) Mass cytometry: blessed with the curse of dimensionality. Nat Immunol 17, 890–895.
85 Platon L, Pejoski D, Gautreau G, Targat B, Le Grand R & Beignon AS (2018) A computational approach for phenotypic comparisons of cell populations in high-dimensional cytometry data. Methods 132, 66–75.
86 Bruggner RV, Bodenmiller B, Dill DL, Tibshirani RJ & Nolan GP (2014) Automated identification of stratifying signatures in cellular subpopulations. Proc Natl Acad Sci USA 111, E2770–E2777.
87 Kiselev VY, Kirschner K, Schaub MT, Andrews T, Yiu A, Chandra T, Natarajan KN, Reik W, Barahona M, Green AR et al. (2017) SC3: consensus clustering of single-cell RNA-seq data. Nat Methods 14, 483–486.
88 Zeisel A, Muñoz-Manchado AB, Codeluppi S, Lönnerberg P, La Manno G, Juréus A, Marques S, Munguba H, He L, Betsholtz C et al. (2015) Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science (New York, NY) 347, 1138–1142.
89 Sun Z, Wang T, Deng K, Wang XF, Lafyatis R, Ding Y, Hu M & Chen W (2018) DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data. Bioinformatics 34, 139–146.
90 Lin P, Troup M & Ho JWK (2017) CIDR: ultrafast and accurate clustering through imputation for single-cell RNA-seq data. Genome Biol 18, 59.
91 Wang B, Zhu J, Pierson E, Ramazzotti D & Batzoglou S (2017) Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. Nat Methods 14, 414–416.
92 Xu C & Su Z (2015) Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31, 1974–1980.
93 Aibar S, González-Blas CB, Moerman T, Huynh-Thu VA, Imrichova H, Hulselmans G, Rambow F, Marine JC, Geur P & Aerts J (2017) SCENIC: single-cell regulatory network inference and clustering. Nat Methods 14, 1083–1086.
94 Fan J, Salathia N, Liu R, Kaeser GE, Yung YC, Herman JL, Kaper F, Fan J-B, Zhang K, Chun J et al. (2016) Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis. Nat Methods 13, 241–244.
95 Cannoodt R, Saelens W & Saeys Y (2016) Computational methods for trajectory inference from single-cell transcriptomics. Eur J Immunol 46, 2496–2506.
96 Bendall SC, Davis KL, Amir EAD, Tadmor MD, Simonds EF, Chen TJ, Shenfeld DK, Nolan GP & Pe’er D (2014) Single-cell trajectory detection uncovers progression and regulatory coordination in human B cell development. Cell 157, 714–725.
97 Saelens W, Cannoodt R, Todorov H & Saeys Y (2018) A comparison of single-cell trajectory inference methods: towards more accurate and robust tools. bioRxiv [preprint].
98 Cannoodt R, Saelens W, Sichien D, Tavernier S, Janssens S, Guilliams M, Lambrecht BN, De PK & Saeys Y (2016) SCORPIUS improves trajectory inference and identifies novel modules in dendritic cell development. bioRxiv [preprint].
99 Setty M, Tadmor MD, Reich-Zeliger S, Angel O, Salame TM, Kathail P, Choi K, Bendall S, Friedman N & Pe’er D (2016) Wishbone identifies bifurcating developmental trajectories from single-cell data. Nat Biotechnol 34, 1–14.
100 Liu Z, Lou H, Xie K, Wang H, Chen N, Aparicio OM, Zhang MQ, Jiang R & Chen T (2017) Reconstructing cell cycle pseudo time-series via single-cell transcriptome data. Nat Commun 8, 22.
101 Trapnell C, Cacchiarelli D, Grimsby J, Pokharel P, Li S, Morse M, Lennon NJ, Livak KJ, Mikkelsen TS & Rinn JL (2014) The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat Biotechnol 32, 381–386.
102 Street K, Risso D, Fletcher RB, Das D, Ngai J, Yosef N, Purdom E & Dudoit S (2017) Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. bioRxiv [preprint].
103 Lin L, Finak G, Ushey K, Seshadri C, Hawn TR, Frahm N, Scriba TJ, Mahomed H, Hanekom W, Bart P-A et al. (2015) COMPASS identifies T-cell subsets correlated with clinical outcomes. Nat Biotechnol 33, 610–616.
104 Arvaniti E & Claassen M (2017) Sensitive detection of rare disease-associated cell subsets via representation learning. Nat Commun 8, 1–10.
105 Nowicka M, Krieg C, Weber LM, Hartmann FJ, Guglietta S, Becher B, Levesque MP & Robinson MD (2017) CyTOF workflow: differential discovery in high-throughput high-dimensional cytometry datasets. F1000Research 6, 748.
106 Kharchenko PV, Silberstein L & Scadden DT (2014) Bayesian approach to single-cell differential expression analysis. Nat Methods 11, 740–742.
107 Finak G, McDavid A, Yajima M, Deng J, Gersuk V, Shalek AK, Slichter CK, Miller HW, McElrath MJ, Prlic M et al. (2015) MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol 16, 278.
108 Korthauer KD, Chu LF, Newton MA, Li Y, Thomson J, Stewart R & Kendziorski C (2016) A statistical approach for identifying differential distributions in single-cell RNA-seq experiments. Genome Biol 17, 222.
109 Soneson C & Robinson MD (2018) Bias, robustness and scalability in single-cell differential expression analysis. Nat Methods 15, 255–261.
110 Love MI, Huber W & Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550.
111 Robinson MD, McCarthy DJ & Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140.
112 Law CW, Chen Y, Shi W & Smyth GK (2014) voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol 15, R29.
113 Padovan-Merhar O & Raj A (2013) Using variability in gene expression as a tool for studying gene regulation. WIREs Syst Biol Med 5, 751–759.
114 Chan TE, Stumpf MPH & Babtie AC (2017) Gene regulatory network inference from single-cell data using multivariate information measures. Cell Systems 5, 251–267.e3.
115 Xu R, Nettleton D & Nordman DJ (2016) Case-specific random forests. J Comput Graph Stat 25, 49–65.
116 Filippi S & Holmes CC (2017) A Bayesian nonparametric approach to testing for dependence between random variables. Bayesian Anal 12, 919–938.
117 Castillo MS, Blanco D, Luna IMT, Carrion MC & Huang Y (2018) A Bayesian framework for the inference of gene regulatory networks from time and pseudo-time series data. Bioinformatics 34, 964–970.
118 Matsumoto H, Kiryu H, Furusawa C, Ko MSH, Ko SBH, Gouda N, Hayashi T & Nikaido I (2017) SCODE: an efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation. Bioinformatics 33, 2314–2321.
119 Fiers MWEJ, Minnoye L, Aibar S, Bravo González-Blas C, Kalender Atak Z & Aerts S (2018) Mapping gene regulatory networks from single-cell omics data. Brief Funct Genomics 17, 246–254.
120 Äijö T & Bonneau R (2017) Biophysically motivated regulatory network inference: progress and prospects. Hum Hered 81, 62–77.
121 Dey SS, Kester L, Spanjaard B, Bienko M & Van Oudenaarden A (2015) Integrated genome and transcriptome sequencing of the same cell. Nat Biotechnol 33, 285–289.
122 Darmanis S, Gallant CJ, Marinescu VD, Niklasson M, Segerman A, Flamourakis G, Fredriksson S, Assarsson E, Lundberg M, Nelander S et al. (2016) Simultaneous multiplexed measurement of RNA and proteins in single cells. Cell Rep 14, 380–389.
123 Albayrak C, Jordi CA, Zechner C, Lin J, Bichsel CA, Khammash M & Tay S (2016) Digital quantification of proteins and mRNA in single mammalian cells. Mol Cell 61, 914–924.
124 Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W & Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473, 337–342.
125 Soh KT, Tario JD, Colligan S, Maguire O, Pan D, Minderman H & Wallace PK (2016) Simultaneous, single-cell measurement of messenger RNA, cell surface proteins, and intracellular proteins. Curr Protoc Cytom 75, 7.45.1–7.45.33.
126 Kochan J, Wawro M & Kasza A (2015) Simultaneous detection of mRNA and protein in single cells using immunofluorescence combined single-molecule RNA FISH. Biotechniques 59, 209–212, 214, 216.
127 Buenrostro JD, Wu B, Litzenburger UM, Ruff D, Gonzales ML, Snyder MP, Chang HY & Greenleaf WJ (2015) Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490.
128 Jin W, Tang Q, Wan M, Cui K, Zhang Y, Ren G, Ni B, Sklar J, Przytycka TM, Childs R et al. (2015) Genome-wide detection of DNase I hypersensitive sites in single cells and FFPE tissue samples. Nature 528, 142–146.