Single-cell RNA Sequencing (scRNA-seq) Protocols: A Comparative Guide

scRNA-seq technologies have revolutionized transcriptomics, but the variety of available protocols and their distinct features can be confusing. This guide presents essential information on 29 protocols to help you select the right method for your needs and process the data correctly.

The protocols differ in several key aspects, including cell isolation techniques, transcript coverage, throughput, strand specificity, multiplexing capability, cost and technical complexity.

Last modification: 2023-06-02

	STRT-seq	Smart-seq/C1	CEL-seq	Quartz-seq	Smart-seq2	SCRB-seq	MARS-seq	CEL-seq2	CEL-seq2/C1	MATQ-seq	Quartz-seq2	Smart-seq3	Smart-seq3xpress	FLASH-seq	VASA-plate	Drop-Seq	InDrop V1	InDrop V2	InDrop V3	10X Chromium V1	10X Chromium V2	10X Chromium V3	VASA-drop	CytoSeq	Seq-well	Microwell-seq	sci-RNA-seq	sci-RNA-seq3	Split-seq
Release year	2011	2012	2012	2013	2014	2014	2014	2016	2016	2017	2018	2020	2022	2022	2022	2015	2015	2015	2015	2017	2017	2017	2022	2015	2017	2018	2017	2019-2021	2018
Method type	Plate-based	Plate-based	Plate-based/microfluidics	Plate-based	Plate-based	Plate-based	Plate-based	Plate-based/microfluidics	Fluidigm C1	Plate-based	Plate-based	Plate-based	Plate-based	Plate-based	Plate-based	Droplet-based	Droplet-based	Droplet-based	Droplet-based	Droplet-based	Droplet-based	Droplet-based	Droplet-based	Nanowell array	Nanowell array	Nanowell array	Combinatorial indexing-based (plate-based)	Combinatorial indexing-based (plate-based)	Combinatorial indexing-based
Throughput	low-throughput	low-throughput	low-throughput	low-throughput	low-throughput	low-throughput	Automatic liquid handling high-throughput	medium-throughput	medium-throughput	medium-throughput	medium-throughput	low-throughput	low-throughput	low-throughput	low-throughput	high-throughput	high-throughput	high-throughput	high-throughput	high-throughput	high-throughput	high-throughput	high-throughput	high-throughput	high-throughput	high-throughput	high-throughput	high-throughput	high-throughput
Number of cells processed	1-100 (96 cells)	1-100	10-500	1-100	<1,000	<1,000	384-1,535	100-1,000	100-1,000	100-1,000	up to 1,536	<1,000 (384-well plates)	<1,000	<1,000	<1,000	1,000-10,000	1,000-10,000	1,000-10,000	1,000-10,000	>10,000	>10,000	>10,000	>10,000	10,000-100,000	10,000-100,000	5,000-10,000	1,000-10,000	>10,000	>10,000
Cost per cell for sequencing-ready libraries (USD)	$2	NA	NA	$23	$1.50-$2.50	$1.70	$1.30	$0.30-$0.50*	$0.70-$1.20*	$0.40-$0.60	$0.40-$1.08	$0.57-$1.14	$0.30	$0.99-$4.21 with UMI	$0.98	$0.10-$0.20	$0.10-$0.50	$0.10-$0.50	$0.10-$0.50	$0.50	$0.50	$0.50	$0.11	<$1	$0.15	$0.02	$0.03-$0.20	$0.01	$0.01
Target RNA type	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyA+ and polyA-	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyA+ and polyA-	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyA+ and polyA-	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyadenylated RNA	polyA+ and polyA-
Transcript coverage	5'	Full-length	3'	Full-length	Full-length	3'	3'	3'	3'	Full-length	3'	Full-length	Full-length	Full-length	Full-length	3'	3'	3'	3'	3'	3'	3'	Full-length	3'	3'	3'	3'	3'	3'
UMI	no	no	no	no	no	yes (10bp)	yes (10bp)	yes (6bp)	yes (6bp)	yes	yes (8bp)	yes (8bp)	yes	optional	UFI (6bp)	yes (8bp)	yes (6bp)	yes (6bp)	yes (6bp)	yes (10bp)	yes (10bp)	yes (12bp)	UFI (6bp)	yes (8bp)	yes (8bp)	yes (6bp)	yes (8bp)	yes	yes (10bp)
Barcode	yes (19bp)	no	yes (8bp)	no	no	yes (6bp)	yes (6bp)	yes (6bp)	yes (6bp)	no	yes (15bp for 1,536 wells or 14bp for 384 wells)	no	no	no	yes (8bp)	yes (12bp)	yes (19bp)	yes (19bp)	yes (16bp)	yes (14bp)	yes (16bp)	yes (16bp)	yes (2 x 8bp)	yes (8bp)	yes (12bp)	yes (18bp)	yes (10bp)	yes	yes (18bp)
Strand specific	yes	no	yes	no	no	yes	yes	yes	yes	yes	yes	5'UMI fragments stranded, internal fragments not stranded	5'UMI fragments stranded, internal fragments not stranded	5'UMI fragments stranded, internal fragments not stranded	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes
Library preparation time	2 days	NA	2-3 days (~30h)	NA	10h	NA	NA	NA	NA	10h	NA	10.5h	5-6h	~4.5h (low amplification) - 7.2h	NA	12h	>24h	>24h	>24h	<24h	<24h	<24h	NA	NA	NA	NA	2 days	3 days (16h)	2-3 days*
Avg. number of genes detected per cell (at sequencing saturation)	1,000-8,000	6,000-8,000	4,000-6,000	3,000-7,000	6,500-10,000	5,000-9,000	500-5,000	5,000-7,000	6,000-9,000	8,000-14,000	5,500-8,000	9,000-12,000	9,000-14,000	9,000-12,000	9,000-15,000	2,000-6,000	2,000-5,000	2,000-5,000	2,000-5,000	4,000-7,000 (before 500-1,500)	4,000-7,000 (before 500-1,500)	4,000-7,000 (before 500-1,500)	9,000-15,000	-	6,000-10,000	6,500	3,000-7,000*		3,000-7,000
Conventional cell isolation/capture	Mouth pipette or FACS	Fluidigm C1 / FACS	Mouth pipette, FACS, microfluidics	Mouth pipette or FACS	FACS	FACS	FACS with automatic liquid handling	Mouth pipette, FACS, microfluidics	Fluidigm C1	FACS	flow cytometry	FACS	FACS	FACS	FACS	Droplet	Droplet	Droplet	Droplet	Droplet	Droplet	Droplet	Microdroplets	not needed (dilution)	not needed (dilution)	not needed (dilution)	not needed (dilution)	not needed (dilution)	not needed (dilution)
mRNA priming (1st strand synthesis)	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)	random primers (GATdT/MALBAC primers)	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)*	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)*	poly(T)	poly(T)	poly(T)	poly(T)	poly(T)	poly(T) + random hexamer primers
2nd strand synthesis	TSO	TSO	RNase H and DNA pol I (IVT)	5' poly(A) tagging method	TSO	TSO	RNase H and DNA pol I	RNase H and DNA pol I (IVT)	RNase H and DNA pol I (IVT)	ten cycles of annealing	Poly(A) tailing and primer ligation	TSO	TSO	TSO	RNase H and DNA pol I (IVT)	TSO	RNase H and DNA pol I	RNase H and DNA pol I	RNase H and DNA pol I	TSO	TSO	TSO	RNase H and DNA pol I (IVT)	NA	TSO	TSO	RNase H and DNA pol I		TSO
Full-length cDNA synthesis	no	yes	no	yes	yes	yes	no	no	no	yes (in pieces, due to random priming)	yes in principle	yes	yes	yes	yes in pieces	yes	no	no	no	yes	yes	yes	yes in pieces	NA	yes	yes	no	no	yes
Amplification method	PCR	PCR	IVT	PCR	PCR	PCR	IVT	IVT	IVT	PCR, Multiple annealing	PCR	PCR	PCR	semi-suppressive PCR	IVT	PCR	IVT	IVT	IVT	PCR	PCR	PCR	IVT	PCR (Pre-defined genes only)	PCR	PCR	PCR	PCR	PCR
Pooling before library prep	yes	no	yes	no	no	yes	yes	yes	yes	no	yes	no	no	no	no (pooling just before IVT)	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes	yes
Fragmentation/tagmentation	fragmentation	tagmentation	fragmentation	fragmentation by Covaris (ultrasound)	tagmentation	tagmentation + 3' enrichment	RNA fragmentation	RNA fragmentation	RNA fragmentation	fragmentation by sonication	fragmentation by ultrasound	Tn5 tagmentation	tagmentation	tagmentation	RNA fragmentation	cDNA fragmentation	RNA fragmentation	RNA fragmentation	RNA fragmentation	cDNA fragmentation	fragmentation + 3' enrichment	fragmentation + 3' enrichment	RNA fragmentation	NA	tagmentation + 3' enrichment	fragmentation + 3' enrichment	tagmentation + 3' enrichment	tagmentation	tagmentation
In Kallisto	no	no	yes	no	yes	yes	no	yes	no	no	no	yes				yes	yes	yes	yes	yes	yes	yes	no	no	no	no	no	no	yes
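
When shortlisting protocols, it can help to query the table programmatically. The sketch below is only an illustration: it embeds a three-protocol excerpt of the table above (values copied from it), and the helper names `load_protocols` and `select` are hypothetical, not part of any existing tool.

```python
import csv
import io

# Excerpt of the comparison table: protocols in columns, features in rows,
# tab-separated as in the full table.
TABLE = (
    "\tSmart-seq2\tDrop-Seq\tVASA-drop\n"
    "Target RNA type\tpolyadenylated RNA\tpolyadenylated RNA\tpolyA+ and polyA-\n"
    "Transcript coverage\tFull-length\t3'\tFull-length\n"
    "Amplification method\tPCR\tPCR\tIVT\n"
)


def load_protocols(text):
    """Transpose the table into {protocol: {feature: value}}."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    names = rows[0][1:]  # first cell of the header row is empty
    protocols = {name: {} for name in names}
    for row in rows[1:]:
        feature, values = row[0], row[1:]
        for name, value in zip(names, values):
            protocols[name][feature] = value
    return protocols


def select(protocols, criteria):
    """Return the protocols whose features match every (feature, value) pair."""
    return [
        name
        for name, features in protocols.items()
        if all(features.get(key) == value for key, value in criteria.items())
    ]


protocols = load_protocols(TABLE)
full_length = select(protocols, {"Transcript coverage": "Full-length"})
full_length_ivt = select(
    protocols,
    {"Transcript coverage": "Full-length", "Amplification method": "IVT"},
)
```

Exact string matching is used here for simplicity; cells with free-text values (for example cost ranges) would need dedicated parsing.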

Protocols to add: SUPeR-seq, SORT-seq, STORM-seq, STRT-seq-C1, STRT-seq-2i, DNBelab C4, DroNC-seq

Cell Isolation Techniques

Cell isolation techniques form the basis of all scRNA-seq protocols and largely dictate the procedure's scalability and applicability. Plate-based methods, which place individual cells into wells by manual picking or fluorescence-activated cell sorting (FACS), offer precise control over cell selection (which makes them suitable for targeting rare cell types) but are generally lower throughput. Microfluidic approaches, such as droplet-based and microwell-based methods, can process a larger number of cells simultaneously and often incorporate barcoding strategies for sample multiplexing.

Transcript Coverage

Transcript coverage refers to the portion of each RNA molecule that is sequenced. Some protocols, like Smart-seq2 and Smart-seq3, capture full-length transcripts, providing comprehensive information about alternative splicing and isoform usage. In contrast, high-throughput methods such as 10X Genomics Chromium, Drop-seq, and inDrop sequence only the 3' or 5' ends of transcripts, trading transcript-level detail for increased cell throughput and lower cost.

Strand Specificity

Strand specificity refers to whether the protocol retains information about which DNA strand the RNA transcript was derived from. Strand specificity is essential for distinguishing between overlapping genes on opposite strands, identifying splicing events, detecting non-coding RNA transcripts, and investigating antisense transcription. Protocols that sequence only the 3' ends of RNA molecules tend to be stranded, while full-length protocols often do not preserve strand information.

Amplification Methods

Amplification is an essential step in scRNA-seq protocols, increasing the limited cDNA from each cell to levels appropriate for sequencing. Two primary methods are used: PCR (Polymerase Chain Reaction), an exponential amplification method, and IVT (In Vitro Transcription), a linear amplification method. PCR is faster, but uneven amplification efficiency between molecules compounds over cycles and can introduce biases, although these can be mitigated with unique molecular identifiers (UMIs). Because IVT amplifies linearly, it typically introduces fewer biases and gives a more accurate representation of the original transcript abundance. However, IVT is more time-consuming than PCR.
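
The difference between exponential and linear amplification can be illustrated with a small calculation. The sketch below is a toy model, not a description of any specific protocol: the per-cycle PCR efficiencies (0.95 and 0.85), the number of cycles (20) and the IVT yields per template (1,000 and 900) are made-up values chosen only to show how a similar efficiency gap behaves under each scheme.

```python
def pcr_yield(start_copies, efficiency, cycles):
    """Exponential model: each cycle multiplies the copies by (1 + efficiency)."""
    return start_copies * (1.0 + efficiency) ** cycles


def ivt_yield(start_copies, copies_per_template):
    """Linear model: each template is transcribed a fixed number of times."""
    return start_copies * copies_per_template


# Two transcripts that start at the same abundance (10 molecules each).
start = 10

# PCR: a 10-point gap in per-cycle efficiency compounds over 20 cycles.
pcr_ratio = pcr_yield(start, 0.95, 20) / pcr_yield(start, 0.85, 20)

# IVT: a 10% gap in yield per template stays a ~10% gap in the product.
ivt_ratio = ivt_yield(start, 1000) / ivt_yield(start, 900)

print(f"PCR product ratio: {pcr_ratio:.2f}")
print(f"IVT product ratio: {ivt_ratio:.2f}")
```

Under these assumed values the PCR products differ by close to threefold although the inputs were identical, whereas the IVT products differ by about 11%. This is the distortion that UMI counting is meant to remove.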

Target RNA Type

The type of RNA targeted for sequencing is another crucial factor that distinguishes scRNA-seq protocols. Most currently available methods focus on polyadenylated RNA (mainly mRNA) due to its ease of isolation and compatibility with multiplexing strategies. This is achieved by using poly(T) primers during reverse transcription, which selectively target the poly(A) tail of mRNA. If a broader investigation of RNA species, including non-polyadenylated non-coding RNAs, is desired, different approaches can be employed. One option is to use random primers. Another method involves RNA fragmentation in the first step, followed by end repair and poly(A) tailing, enabling cDNA synthesis from barcoded oligo-dT probes.

Cost Considerations

The cost of scRNA-seq varies significantly based on the chosen protocol. High-throughput methods, such as droplet-based techniques, can process many cells simultaneously, significantly reducing the cost per cell despite the initial expense of specialized equipment and reagents. Plate-based methods, which rely on manual picking or fluorescence-activated cell sorting (FACS), may need less specialized equipment but are more labor-intensive and often have higher costs per cell due to the time and resources needed to process individual cells. Methods targeting total RNA or aiming for full-length transcript coverage can also be more expensive due to the additional reagents and steps required. Finally, downstream data analysis costs need to be considered, as high-throughput methods typically generate large amounts of data that require substantial computational resources to process.

Multiplexing and Data Processing

Multiplexing, a process that allows many cells or samples to be pooled and prepared together, has become an essential feature of many high-throughput scRNA-seq protocols. It relies on barcodes and is usually combined with unique molecular identifiers (UMIs). Barcodes are sequences unique to each cell, while UMIs are random tags added to each captured transcript molecule, allowing individual mRNA molecules to be distinguished and counted.
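
As an illustration of how barcode and UMI lengths translate into read handling, the sketch below splits the start of a read into a cell barcode and a UMI. The lengths are taken from the table above; the assumption that the barcode comes first and is immediately followed by the UMI is a simplification made for this example, and `ReadLayout`, `LAYOUTS` and `split_read` are hypothetical names.

```python
from collections import namedtuple

ReadLayout = namedtuple("ReadLayout", ["barcode_len", "umi_len"])

# Barcode and UMI lengths (bp) as listed in the comparison table.
# Assumed layout for this sketch: [barcode][UMI][rest of read].
LAYOUTS = {
    "Drop-Seq": ReadLayout(barcode_len=12, umi_len=8),
    "10X Chromium V2": ReadLayout(barcode_len=16, umi_len=10),
    "10X Chromium V3": ReadLayout(barcode_len=16, umi_len=12),
}


def split_read(sequence, layout):
    """Return (barcode, umi) taken from the start of the read."""
    needed = layout.barcode_len + layout.umi_len
    if len(sequence) < needed:
        raise ValueError(f"read is shorter than the {needed} bp required")
    barcode = sequence[: layout.barcode_len]
    umi = sequence[layout.barcode_len : needed]
    return barcode, umi


# Synthetic read: 16 bp barcode, 12 bp UMI, then a few extra bases.
read = "ACGTACGTACGTACGT" + "TTGGCCAATTGG" + "TTTT"
barcode, umi = split_read(read, LAYOUTS["10X Chromium V3"])
```

Protocols with split or combinatorial barcodes (for example inDrop or Split-seq) need a different layout description, which is one reason why demultiplexing requires protocol-specific handling.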

While multiplexing greatly enhances the throughput and efficiency of scRNA-seq, it also introduces additional data processing steps. Correcting barcodes becomes necessary to account for potential sequencing errors, ensuring accurate cell assignment. Demultiplexing, the process of assigning reads back to their respective cells based on their barcodes, requires protocol-specific handling of the barcode structures. Additionally, UMI deduplication is performed to account for PCR amplification bias and also requires protocol-specific handling of the UMI structures.
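
The three steps above (barcode correction, demultiplexing and UMI deduplication) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the algorithm of any particular pipeline: barcodes are corrected only when exactly one whitelist entry lies within Hamming distance 1, UMIs are deduplicated by exact match, and the function names are hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(barcode, whitelist):
    """Return the matching whitelist barcode, or None if absent or ambiguous."""
    if barcode in whitelist:
        return barcode
    hits = [
        known
        for known in whitelist
        if len(known) == len(barcode) and hamming(known, barcode) == 1
    ]
    return hits[0] if len(hits) == 1 else None


def count_molecules(reads, whitelist):
    """Count unique UMIs per (cell, gene) from (barcode, umi, gene) records."""
    umis = defaultdict(set)
    for barcode, umi, gene in reads:
        cell = correct_barcode(barcode, whitelist)
        if cell is None:
            continue  # read cannot be assigned to a known cell
        umis[(cell, gene)].add(umi)  # duplicates of the same UMI collapse here
    return {key: len(values) for key, values in umis.items()}


whitelist = {"AAAA", "CCCC"}
reads = [
    ("AAAA", "GT", "Actb"),
    ("AAAT", "GT", "Actb"),  # one sequencing error in the barcode, same UMI
    ("AAAA", "CA", "Actb"),  # second molecule of the same gene
    ("CCCC", "GT", "Actb"),
    ("GGGG", "GT", "Actb"),  # too far from any known barcode, discarded
]
counts = count_molecules(reads, whitelist)
```

Production tools typically also merge UMIs that differ by a small edit distance and use abundance information when correcting barcodes; both refinements are omitted here for brevity.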

Furthermore, in droplet-based methods, it is crucial to filter out doublets (droplets containing two cells), multiplets (more than two cells), empty droplets and damaged cells to ensure accurate downstream analysis. Each of these steps increases the complexity of the data processing pipeline and should be carefully considered when planning an scRNA-seq experiment.

Normalization

Normalization is a crucial step in scRNA-seq data analysis that aims to remove technical biases and enable meaningful comparisons of gene expression across cells. The most suitable normalization method often depends on the scRNA-seq protocol employed. For protocols capturing the entire transcript, approaches such as library size normalization or transcripts per million (TPM) are commonly used. Both account for differences in sequencing depth, and TPM additionally accounts for transcript length. In 3' end sequencing protocols, unique molecular identifier (UMI) counts are typically used as the starting point, since they already correct for amplification biases. Additionally, normalization methods that consider capture efficiency variations can be applied, such as spike-in normalization using synthetic RNA molecules or statistical models incorporating factors like GC content and transcript length. Choosing an appropriate normalization method ensures accurate quantification and reliable interpretation of gene expression patterns in scRNA-seq data.
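
A minimal sketch of the two depth-based approaches mentioned above is given below, in plain Python for readability. The scale factor of 10,000 and the log(1 + x) transform are common conventions assumed here rather than requirements, and the function names are hypothetical.

```python
import math


def library_size_normalize(counts, scale=10_000):
    """Scale each cell to `scale` total counts, then apply log(1 + x).

    counts: list of cells, each a list of per-gene counts (reads or UMIs).
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            normalized.append([0.0] * len(cell))
            continue
        normalized.append([math.log1p(c / total * scale) for c in cell])
    return normalized


def tpm(counts, lengths_kb):
    """Transcripts per million for one cell: length-correct, then rescale."""
    rates = [c / length for c, length in zip(counts, lengths_kb)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(counts)
    return [rate / total * 1e6 for rate in rates]


# Two cells with the same composition but a tenfold difference in depth.
cells = [[10, 30, 60], [1, 3, 6]]
normalized = library_size_normalize(cells)

# Two genes with equal read counts; the second transcript is twice as long.
tpm_values = tpm([100, 100], lengths_kb=[1.0, 2.0])
```

After library size normalization the two cells have the same profile, which is the intended removal of the depth effect. The length correction in `tpm` is only meaningful for full-length protocols; for 3' end UMI data the count does not scale with transcript length, so it is usually left out.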

Still need to talk about dropout and sensitivity