Single-cell RNA Sequencing (scRNA-seq) Protocols: A Comparative Guide

scRNA-seq technologies have revolutionized transcriptomics, but the variety of available protocols and their distinct features can be confusing. This guide presents essential information on 30 protocols to help you select the right method for your needs and process the data correctly.

The protocols differ in several key aspects, including cell isolation techniques, transcript coverage, throughput, strand specificity, multiplexing capability, cost and technical complexity.

Last modification: 2023-06-02

Protocol | STRT-seq | Smart-seq/C1 | CEL-seq | Quartz-seq | Smart-seq2 | SCRB-seq | MARS-seq | CEL-seq2 | CEL-seq2/C1 | MATQ-seq | Quartz-seq2 | Smart-seq3 | Smart-seq3xpress | FLASH-seq | VASA-plate | Drop-Seq | InDrop V1 | InDrop V2 | InDrop V3 | 10X Chromium V1 | 10X Chromium V2 | 10X Chromium V3 | VASA-drop | CytoSeq | Seq-well | Microwell-seq | sci-RNA-seq | sci-RNA-seq3 | Split-seq
Released year | 2011 | 2012 | 2012 | 2013 | 2014 | 2014 | 2014 | 2016 | 2016 | 2017 | 2018 | 2020 | 2022 | 2022 | 2022 | 2015 | 2015 | 2015 | 2015 | 2017 | 2017 | 2017 | 2022 | 2015 | 2017 | 2018 | 2017 | 2019-2021 | 2018
Method type | Plate-based | Plate-based | Plate-based/microfluidics | Plate-based | Plate-based | Plate-based | Plate-based | Plate-based/microfluidics | Fluidigm C1 | Plate-based | Plate-based | Plate-based | Plate-based | Plate-based | Plate-based | Droplet-based | Droplet-based | Droplet-based | Droplet-based | Droplet-based | Droplet-based | Droplet-based | Droplet-based | Nanowell array | Nanowell array | Nanowell array | Combinatorial indexing-based (plate-based) | Combinatorial indexing-based (plate-based) | Combinatorial indexing-based
Throughput | low-throughput | low-throughput | low-throughput | low-throughput | low-throughput | low-throughput | Automatic liquid handling, high-throughput | medium-throughput | medium-throughput | medium-throughput | medium-throughput | low-throughput | low-throughput | low-throughput | low-throughput | high-throughput | high-throughput | high-throughput | high-throughput | high-throughput | high-throughput | high-throughput | high-throughput | high-throughput | high-throughput | high-throughput | high-throughput | high-throughput | high-throughput
Number of cells processed | 1-100 (96 cells) | 1-100 | 10-500 | 1-100 | < 1,000 | < 1,000 | 384-1,535 | 100-1,000 | 100-1,000 | 100-1,000 | up to 1,536 | < 1,000 (384-well plates) | < 1,000 | < 1,000 | < 1,000 | 1,000-10,000 | 1,000-10,000 | 1,000-10,000 | 1,000-10,000 | > 10,000 | > 10,000 | > 10,000 | > 10,000 | 10,000-100,000 | 10,000-100,000 | 5,000-10,000 | 1,000-10,000 | > 10,000 | > 10,000
Cost per cell for sequencing-ready libraries | $2 | NA | NA | $23 | $1.50-$2.50 | $1.70 | $1.3 | $0.30-$0.50* | $0.70-$1.20* | $0.40-$0.60 | $0.40-$1.08 | $0.57-$1.14 | $0.30 | $0.99-$4.21 with UMI | $0.98 | $0.10-$0.20 | $0.10-$0.50 | $0.10-$0.50 | $0.10-$0.50 | $0.5 | $0.5 | $0.5 | $0.11 | < $1 | $0.15 | $0.02 | $0.03-$0.20 | $0.01 | $0.01
Target RNA type | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyA+ and polyA- | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyA+ and polyA- | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyA+ and polyA- | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyadenylated RNA | polyA+ and polyA-
Transcript coverage | 5' | Full-length | 3' | Full-length | Full-length | 3' | 3' | 3' | 3' | Full-length | 3' | Full-length | Full-length | Full-length | Full-length | 3' | 3' | 3' | 3' | 3' | 3' | 3' | Full-length | 3' | 3' | 3' | 3' | 3' | 3'
UMI | no | no | no | no | no | yes (10bp) | yes (10bp) | yes (6bp) | yes (6bp) | yes | yes (8bp) | yes (8bp) | yes | yes if wanted | UFI (6bp) | yes (8bp) | yes (6bp) | yes (6bp) | yes (6bp) | yes (10bp) | yes (10bp) | yes (12bp) | UFI (6bp) | yes (8bp) | yes (8bp) | yes (6bp) | yes (8bp) | yes | yes (10bp)
Barcode | yes (19bp) | no | yes (8bp) | no | no | yes (6bp) | yes (6bp) | yes (6bp) | yes (6bp) | no | yes (15bp for 1536 wells or 14bp for 384 wells) | no | no | no | yes (8bp) | yes (12bp) | yes (19bp) | yes (19bp) | yes (16bp) | yes (14bp) | yes (16bp) | yes (16bp) | yes (2 x 8bp) | yes (8bp) | yes (12bp) | yes (18bp) | yes (10bp) | yes | yes (18bp)
Strand specific | yes | no | yes | no | no | yes | yes | yes | yes | yes | yes | 5' UMI fragments stranded, internal fragments not stranded | 5' UMI fragments stranded, internal fragments not stranded | 5' UMI fragments stranded, internal fragments not stranded | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes
Library preparation time | 2 days | NA | 2-3 days (~30 h) | NA | 10 h | NA | NA | NA | NA | 10 h | NA | 10.5 h | 5-6 h | ~4.5 h (low amplification) - 7.2 h | NA | 12 h | > 24 h | > 24 h | > 24 h | < 24 h | < 24 h | < 24 h | NA | NA | NA | NA | 2 days | 3 days (16 h) | 2-3 days*
Avg. number of genes detected per cell (at sequencing saturation) | 1,000-8,000 | 6,000-8,000 | 4,000-6,000 | 3,000-7,000 | 6,500-10,000 | 5,000-9,000 | 500-5,000 | 5,000-7,000 | 6,000-9,000 | 8,000-14,000 | 5,500-8,000 | 9,000-12,000 | 9,000-14,000 | 9,000-12,000 | 9,000-15,000 | 2,000-6,000 | 2,000 and 5,000 | 2,000-5,000 | 2,000-5,000 | 4,000-7,000 (before 500-1,500) | 4,000-7,000 (before 500-1,500) | 4,000-7,000 (before 500-1,500) | 9,000-15,000 | - | 6,000-10,000 | 6,500 | 3,000-7,000* | | 3,000-7,000
Conventional cell isolation/capture | Mouth pipette or FACS | Fluidigm C1 / FACS | Mouth pipette, FACS, microfluidics | Mouth pipette or FACS | FACS | FACS | FACS with automatic liquid handling | Mouth pipette, FACS, microfluidics | Fluidigm C1 | FACS | flow cytometry | FACS | FACS | FACS | FACS | Droplet | Droplet | Droplet | Droplet | Droplet | Droplet | Droplet | Microdroplets | not needed (dilution) | not needed (dilution) | not needed (dilution) | not needed (dilution) | not needed (dilution) | not needed (dilution)
mRNA priming (1st strand synthesis) | poly(T) | poly(T) | poly(T) | poly(T) | poly(T) | poly(T) | poly(T) | poly(T) | poly(T) | random primers (GATdT/MALBAC primers) | poly(T) | poly(T) | poly(T) | poly(T) | poly(T)* | poly(T) | poly(T) | poly(T) | poly(T) | poly(T) | poly(T) | poly(T) | poly(T)* | poly(T) | poly(T) | poly(T) | poly(T) | poly(T) | poly(T) + random hexamer primers
2nd strand synthesis | TSO | TSO | RNase H and DNA pol 1 (IVT) | 5' poly(A) tagging method | TSO | TSO | RNase H and DNA pol 1 | RNase H and DNA pol 1 (IVT) | RNase H and DNA pol 1 (IVT) | ten cycles of annealing | PolyA tailing and primer ligation | TSO | TSO | TSO | RNase H and DNA pol 1 (IVT) | TSO | RNase H and DNA pol 1 | RNase H and DNA pol 1 | RNase H and DNA pol 1 | TSO | TSO | TSO | RNase H and DNA pol 1 (IVT) | NA | TSO | TSO | RNase H and DNA pol 1 | | TSO
Full-length cDNA synthesis | no | yes | no | yes | yes | yes | no | no | no | yes (but in pieces, due to random priming) | yes in principle | yes | yes | yes | yes in pieces | yes | no | no | no | yes | yes | yes | yes in pieces | NA | yes | yes | no | no | yes
Amplification method | PCR | PCR | IVT | PCR | PCR | PCR | IVT | IVT | IVT | PCR, multiple annealing | PCR | PCR | PCR | semi-suppressive PCR | IVT | PCR | IVT | IVT | IVT | PCR | PCR | PCR | IVT | PCR (pre-defined genes only) | PCR | PCR | PCR | PCR | PCR
Pooling before library prep | yes | no | yes | no | no | yes | yes | yes | yes | no | yes | no | no | no | no (pooling just before IVT) | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes
Fragmentation/tagmentation | fragmentation | tagmentation | fragmentation | fragmentation by Covaris (ultrasound) | tagmentation | tagmentation + 3' enrichment | RNA fragmentation | RNA fragmentation | RNA fragmentation | fragmentation by sonication | fragmentation by ultrasound | Tn5 tagmentation | tagmentation | tagmentation | RNA fragmentation | cDNA fragmentation | RNA fragmentation | RNA fragmentation | RNA fragmentation | cDNA fragmentation | fragmentation + 3' enrichment | fragmentation + 3' enrichment | RNA fragmentation | NA | tagmentation + 3' enrichment | fragmentation + 3' enrichment | tagmentation + 3' enrichment | tagmentation | tagmentation
In Kallisto | no | no | yes | no | yes | yes | no | yes | no | no | no | yes | | | | yes | yes | yes | yes | yes | yes | yes | no | no | no | no | no | no | yes

Protocols to add: SUPeR-seq, SORT-seq, STORM-seq, STRT-seq-C1, STRT-seq-2i, DNBelab C4, DroNC-seq

Cell Isolation Techniques

Cell isolation techniques form the basis of all scRNA-seq protocols and largely dictate the procedure's scalability and applicability. Plate-based methods, which isolate cells by manual picking or fluorescence-activated cell sorting (FACS), offer precise control over cell selection (e.g. for targeting rare cell types) but generally have lower throughput. Microfluidic approaches, including droplet-based and microwell-based methods, can process many more cells simultaneously and often incorporate barcoding strategies for sample multiplexing.

Transcript Coverage

Transcript coverage refers to the portion of each RNA molecule that is sequenced. Some protocols, like Smart-seq2 and Smart-seq3, capture full-length transcripts, providing comprehensive information about alternative splicing and isoform usage. In contrast, high-throughput methods such as 10X Genomics Chromium, Drop-seq, and inDrop sequence only the 3' or 5' ends of transcripts, trading transcript-level detail for increased cell throughput and lower cost.

Strand Specificity

Strand specificity refers to whether the protocol retains information about which DNA strand the RNA transcript was derived from. Strand specificity is essential for distinguishing between overlapping genes on opposite strands, identifying splicing events, detecting non-coding RNA transcripts, and investigating antisense transcription. scRNA-seq protocols that specifically sequence the 3' ends of RNA molecules tend to be stranded, while full-length protocols often do not preserve strand information.

Amplification Methods

Amplification is an essential step in scRNA-seq protocols, increasing the limited cDNA from each cell to levels appropriate for sequencing. Two primary methods are utilized: PCR (Polymerase Chain Reaction), an exponential amplification method, and IVT (In Vitro Transcription), a linear amplification method. While PCR is faster, it can introduce biases due to uneven amplification efficiency, although these can be mitigated with unique molecular identifiers (UMIs). IVT typically introduces fewer biases due to its linear amplification nature, providing a more accurate representation of the original transcript abundance. However, IVT is more time-consuming than PCR.
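The difference between exponential and linear amplification can be illustrated with a toy model. The sketch below is not taken from any protocol: the per-cycle PCR efficiencies, the IVT yields per hour, and the function names are all hypothetical values chosen for illustration. It compares two transcripts that start at the same abundance but amplify with slightly different efficiencies.

```python
def pcr_copies(start_copies: float, efficiency: float, cycles: int) -> float:
    """Exponential model: each cycle multiplies the copies by (1 + efficiency)."""
    return start_copies * (1.0 + efficiency) ** cycles


def ivt_copies(start_copies: float, rate: float, hours: float) -> float:
    """Linear model: each template yields `rate` RNA copies per hour."""
    return start_copies * rate * hours


# Two transcripts with identical starting abundance (10 molecules each)
# and hypothetical, slightly different amplification efficiencies.
pcr_ratio = pcr_copies(10, 0.95, 20) / pcr_copies(10, 0.85, 20)
ivt_ratio = ivt_copies(10, 95, 12) / ivt_copies(10, 85, 12)

print(f"PCR abundance ratio after 20 cycles: {pcr_ratio:.2f}")
print(f"IVT abundance ratio after 12 hours:  {ivt_ratio:.2f}")
```

In this simplified model the PCR distortion compounds with every cycle, whereas the IVT distortion stays equal to the ratio of the two yields no matter how long the reaction runs. Real reactions are noisier than this, which is one reason UMIs are used to count molecules rather than reads.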

Target RNA Type

The type of RNA targeted for sequencing is another crucial factor that distinguishes scRNA-seq protocols. Most currently available methods focus on mRNA due to its ease of isolation and compatibility with multiplexing strategies. This is achieved by using poly(T) primers during reverse transcription, which selectively target the poly(A) tail of mRNA. If a broader investigation of RNA species, including non-coding RNAs, is desired, different approaches can be employed. One option is to use random primers. Another method involves RNA fragmentation in the first step, followed by end repair and poly(A) tailing, enabling cDNA synthesis from barcoded oligo-dT probes.

Cost Considerations

The cost of scRNA-seq varies significantly based on the chosen protocol. High-throughput methods, such as droplet-based techniques, can process many cells simultaneously, significantly reducing the cost per cell despite the initial expense of specialized equipment and reagents. Plate-based methods, which rely on manual picking or fluorescence-activated cell sorting (FACS), have lower equipment costs but are more labor-intensive and often have higher costs per cell due to the time and resources needed to process individual cells. Methods targeting total RNA or aiming for full-length transcript coverage can also be more expensive due to the additional reagents and steps required. Furthermore, downstream data analysis costs need to be considered, as high-throughput methods typically generate large amounts of data that require substantial computational resources to process.

Multiplexing and Data Processing

Multiplexing, the simultaneous preparation of multiple samples, has become an essential feature of many high-throughput scRNA-seq protocols. It relies on cell barcodes, usually combined with unique molecular identifiers (UMIs). Barcodes are sequences unique to each cell, while UMIs are random tags added to each captured transcript before amplification, allowing individual mRNA molecules to be distinguished and counted.

While multiplexing greatly enhances the throughput and efficiency of scRNA-seq, it also introduces additional data processing steps. Correcting barcodes becomes necessary to account for potential sequencing errors, ensuring accurate cell assignment. Demultiplexing, the process of assigning reads back to their respective cells based on their barcodes, requires protocol-specific handling of the barcode structures. Additionally, UMI deduplication is performed to account for PCR amplification bias and also requires protocol-specific handling of the UMI structures.
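The three steps described above (barcode correction, demultiplexing, and UMI deduplication) can be sketched in a few lines. This is a minimal illustration, not the method of any specific tool: the read layout (a 16 bp barcode followed by a 10 bp UMI), the whitelist, and all function names are assumptions made for the example. Barcodes are corrected only when exactly one whitelist entry lies within one mismatch, and UMIs are collapsed by exact match, whereas production pipelines typically also merge UMIs that differ by a sequencing error.

```python
from collections import defaultdict

# Hypothetical read layout: 16 bp cell barcode followed by a 10 bp UMI.
BARCODE_LEN = 16
UMI_LEN = 10


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(barcode, whitelist):
    """Return the barcode if whitelisted, else the single whitelist entry
    one mismatch away, else None (ambiguous or unknown barcode)."""
    if barcode in whitelist:
        return barcode
    candidates = [w for w in whitelist if hamming(barcode, w) == 1]
    return candidates[0] if len(candidates) == 1 else None


def count_molecules(reads, whitelist):
    """reads: iterable of (barcode+UMI sequence, gene) pairs.
    Returns {cell_barcode: {gene: number of distinct UMIs}}."""
    umis = defaultdict(set)
    for tag, gene in reads:
        barcode = tag[:BARCODE_LEN]
        umi = tag[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
        cell = correct_barcode(barcode, whitelist)  # demultiplexing
        if cell is None or len(umi) < UMI_LEN:
            continue  # discard reads that cannot be assigned to a cell
        umis[(cell, gene)].add(umi)  # identical UMIs collapse (deduplication)
    counts = defaultdict(dict)
    for (cell, gene), seen in umis.items():
        counts[cell][gene] = len(seen)
    return dict(counts)


whitelist = {"AAAACCCCGGGGTTTT", "TTTTGGGGCCCCAAAA"}
reads = [
    ("AAAACCCCGGGGTTTT" + "ACGTACGTAC", "GeneA"),
    ("AAAACCCCGGGGTTTT" + "ACGTACGTAC", "GeneA"),  # amplification duplicate
    ("AAAACCCCGGGGTTTA" + "TTTTTTTTTT", "GeneA"),  # one barcode mismatch
    ("TTTTGGGGCCCCAAAA" + "ACGTACGTAC", "GeneB"),
    ("GGGGGGGGGGGGGGGG" + "ACGTACGTAC", "GeneB"),  # unknown barcode, dropped
]
counts = count_molecules(reads, whitelist)
print(counts)
```

The barcode and UMI lengths are the protocol-specific part: adapting the sketch to another protocol would mean changing the layout constants to match that protocol's read structure.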

Furthermore, in droplet-based methods, it's crucial to filter out doublets (droplets containing two cells), multiplets (more than two cells), empty droplets and damaged cells to ensure accurate downstream analysis. Each of these steps increases the complexity of the data processing pipeline and should be carefully considered when planning an scRNA-seq experiment.
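To get a feel for how many droplets are empty or contain several cells, a common simplification is to assume that cells are distributed over droplets according to a Poisson distribution. The sketch below makes that assumption explicit. The loading density of 0.1 cells per droplet and the function name are hypothetical, and "multiplet" here means any droplet with two or more cells.

```python
import math


def droplet_occupancy(mean_cells_per_droplet: float) -> dict:
    """Expected droplet fractions, ASSUMING Poisson-distributed cell loading."""
    lam = mean_cells_per_droplet
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    p_multi = 1.0 - p_empty - p_single  # two or more cells
    return {
        "empty": p_empty,
        "singlet": p_single,
        "multiplet": p_multi,
        # Share of cell-containing droplets that hold more than one cell.
        "multiplet_fraction_of_occupied": p_multi / (1.0 - p_empty),
    }


occupancy = droplet_occupancy(0.1)  # hypothetical loading density
for name, value in occupancy.items():
    print(f"{name}: {value:.4f}")
```

Under this model, lowering the loading density reduces the multiplet fraction but increases the share of empty droplets, which is why both filters are needed in practice.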

Normalization

Normalization is a crucial step in scRNA-seq data analysis that aims to remove technical biases and enable meaningful comparisons of gene expression across cells. The most suitable normalization method often depends on the specific scRNA-seq protocol employed. For protocols capturing the entire transcript, normalization approaches based on total read counts, such as library size normalization or transcripts per million (TPM), are commonly used. These methods account for differences in sequencing depth and, in the case of TPM, transcript length. In 3' end sequencing protocols, unique molecular identifier (UMI) counts are often used as the basis for normalization, since they correct for amplification biases. Additionally, normalization methods that consider variations in capture efficiency can be applied, such as spike-in normalization using synthetic RNA molecules or statistical models incorporating factors like GC content and transcript length. Choosing an appropriate normalization method ensures accurate quantification and reliable interpretation of gene expression patterns in scRNA-seq data.
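As a rough illustration of the two depth-based approaches mentioned above, the following sketch implements library size normalization and TPM on small count matrices given as nested lists (cells in rows, genes in columns). The function names, the scale factor of 10,000, and the example counts are assumptions made for illustration, not a recommendation for any particular dataset.

```python
import math


def library_size_normalize(counts, scale=10_000):
    """Scale each cell to `scale` total counts, then apply log(1 + x)."""
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            normalized.append([0.0] * len(cell))  # empty cell: nothing to scale
            continue
        normalized.append([math.log1p(c / total * scale) for c in cell])
    return normalized


def tpm(counts, gene_lengths_kb):
    """Transcripts per million: length-normalize first, then depth-normalize."""
    result = []
    for cell in counts:
        rates = [c / length for c, length in zip(cell, gene_lengths_kb)]
        total = sum(rates)
        result.append([r / total * 1e6 if total else 0.0 for r in rates])
    return result


# Two cells with the same expression profile but a 10-fold depth difference.
umi_counts = [[10, 0, 90], [1, 0, 9]]
normalized = library_size_normalize(umi_counts)

# Full-length read counts with hypothetical gene lengths in kilobases.
read_counts = [[100, 200, 0], [50, 50, 100]]
tpm_values = tpm(read_counts, gene_lengths_kb=[1.0, 2.0, 0.5])
print(normalized)
print(tpm_values)
```

After library size normalization the two example cells become identical, which is the intended effect of removing depth differences. TPM divides by gene length before scaling, so it only makes sense for full-length protocols where read counts grow with transcript length.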

Still need to talk about dropout and sensitivity