Abstract:

Objective: Single-cell RNA-seq (scRNA-seq) analysis has been widely applied in oncology research for biomarker discovery. Although droplet-based methods are commonly used for such studies owing to their high throughput, they miss important insights because they lack full-length transcript coverage. Full-length methods are available, but to date they have not met the throughput demands of many researchers. Moreover, neither droplet-based nor full-length scRNA-seq methods currently provide adequate readouts for non-coding genes, which limits investigation of gene regulatory networks to protein-coding genes. To close these gaps, we have developed a new high-throughput full-length scRNA-seq workflow that comprehensively profiles both protein-coding and non-coding genes in up to 60,000 cells within two days.

Methods: Our new high-throughput workflow uses two rounds of combinatorial indexing: a first barcoding step in a 96-well plate format, followed by a second barcoding step in a 5,184-nanowell chip performed with an automated nanodispensing system. Initial testing demonstrated that our method could handle up to 60,000 cells without generating significant levels of doublets due to barcode collisions. To further illustrate the capacity of the new scRNA-seq approach, we profiled a total of approximately 11,000 isogenic A549 cells that either express WT TP53 or are TP53 null. In addition, both isogenic cell lines were treated with epigenetic therapy or mock treatment. Libraries were generated and sequenced using an Illumina® NextSeq® 2000 sequencer. The sequencing data were then analyzed using Cogent™ NGS software to define differential gene expression for both protein-coding and non-coding transcripts as a function of TP53 genotype and treatment condition.
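The barcode arithmetic behind the two-round indexing scheme can be sketched as follows. This is a minimal illustration, not the authors' analysis: it assumes every cell receives one of the 96 × 5,184 = 497,664 barcode combinations uniformly and independently at random, and the function names are hypothetical. Under that simplified model, the expected fraction of cells that share a combination with at least one other cell is 1 − (1 − 1/B)^(N − 1) for N cells and B combinations.

```python
# Idealized barcode-collision estimate for two-round combinatorial indexing.
# Assumption (not from the abstract): cells are assigned to barcode
# combinations uniformly and independently at random.

FIRST_ROUND_BARCODES = 96      # wells in the first-round plate (from the abstract)
SECOND_ROUND_BARCODES = 5184   # nanowells in the second-round chip (from the abstract)


def barcode_space(first=FIRST_ROUND_BARCODES, second=SECOND_ROUND_BARCODES):
    """Number of distinct two-round barcode combinations."""
    return first * second


def expected_collision_fraction(n_cells, n_combinations):
    """Expected fraction of cells whose barcode combination is shared
    with at least one other cell, under uniform random assignment."""
    if n_cells < 1 or n_combinations < 1:
        raise ValueError("n_cells and n_combinations must be positive")
    # Probability that none of the other (n_cells - 1) cells picks the
    # same combination as a given cell.
    p_unique = (1.0 - 1.0 / n_combinations) ** (n_cells - 1)
    return 1.0 - p_unique


if __name__ == "__main__":
    space = barcode_space()
    print(f"barcode combinations: {space}")
    for n in (11_000, 60_000):  # cell counts mentioned in the abstract
        frac = expected_collision_fraction(n, space)
        print(f"{n} cells: expected collision fraction {frac:.4f}")
```

The estimate is only a rough guide: observed doublet rates depend on how cells are loaded into the nanowells and on any downstream filtering, neither of which is detailed in the abstract.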
Results: Preliminary analysis showed that, on average, approximately 11,000 genes and 40,000 transcripts were detected per single cell at a read depth of 100,000 reads per cell. UMAP-based clustering confidently separated the cells according to their genotypes and treatment conditions using either protein-coding genes or non-coding genes. Furthermore, differential expression analysis identified both protein-coding and non-coding transcripts with significant expression differences, underscoring their biological significance.

Conclusion: Our new high-throughput full-length scRNA-seq workflow enables preparation of high-quality full-length RNA-seq libraries for up to 60,000 cells with only two rounds of barcoding and shows high sensitivity and specificity in gene/transcript detection and quantification. The technology significantly improves the ability to identify new biomarkers by enabling comprehensive profiling of both protein-coding and non-coding full-length transcripts.

Citation Format: Peng Xu, Joseph Liu, Yana Ryan, Kazuo Tori, Xuan Li, Hima Anbunathan, Mike Covington, Tomoya Uchiyama, Mohammad Fallahi, Xuan Qu, Xiaoyun Xing, Ting Wang, Bryan Bell, Shuwen Chen, Yue Yun, Andrew Farmer. A novel, high-throughput full-length scRNA-seq workflow for improved biomarker discovery [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 316.
