Abstract 316: A novel, high-throughput full-length scRNA-seq workflow for improved biomarker discovery

Peng Xu, Joseph Liu, Yana Ryan, Kazuo Tori, Xuan Li, Hima Anbunathan, Mike Covington, Tomoya Uchiyama, Mohammad Fallahi, Xuan Qu, Xiaoyun Xing, Ting Wang, Bryan Bell, Shuwen Chen, Yue Yun, Andrew Farmer
DOI: https://doi.org/10.1158/1538-7445.am2024-316
IF: 11.2
2024-03-23
Cancer Research
Abstract: Objective: Single-cell RNA-seq (scRNA-seq) analysis has been widely applied in oncology research for biomarker discovery. Although droplet-based methods are commonly used for such studies owing to their high throughput, they miss important insights because they lack full-length transcript coverage. Full-length methods are available but, to date, have not met the throughput demands of many researchers. Moreover, neither droplet-based nor full-length scRNA-seq methods currently provide adequate readouts for non-coding genes, which limits investigation of gene regulatory networks to protein-coding genes. To close these gaps, we have developed a new high-throughput full-length scRNA-seq workflow that comprehensively profiles both protein-coding and non-coding genes in up to 60,000 cells within two days.
Methods: Our new high-throughput workflow uses two rounds of combinatorial indexing: a first barcoding step in a 96-well plate format, followed by a second barcoding step in a 5,184-nanowell chip using an automated nanodispensing system. Initial testing demonstrated that our method could handle up to 60,000 cells without generating significant levels of doublets due to barcode collisions. To further illustrate the capacity of the new scRNA-seq approach, we profiled a total of approximately 11,000 isogenic A549 cells that either express WT TP53 or are TP53 null. In addition, both isogenic cell lines were treated with epigenetic therapy or mock treatment. Libraries were generated and sequenced using an Illumina® NextSeq® 2000 sequencer. The sequencing data were then analyzed using Cogent™ NGS software to define differential gene expression for both protein-coding and non-coding transcripts as a function of TP53 genotype and treatment condition.
Results: Preliminary analysis showed that, on average, approximately 11,000 genes and 40,000 transcripts were detected per single cell at a read depth of 100,000 reads per cell. UMAP-based clustering confidently separated the cells according to their genotypes and treatment conditions using either protein-coding genes or non-coding genes. Furthermore, differential expression analysis identified both protein-coding and non-coding transcripts with significant expression differences, underscoring their biological significance.
Conclusion: Our new high-throughput full-length scRNA-seq workflow enables preparation of high-quality full-length RNA-seq libraries for up to 60,000 cells with only two rounds of barcoding and shows high sensitivity and specificity in gene/transcript detection and quantification. The technology significantly improves the ability to identify new biomarkers by enabling comprehensive profiling of both protein-coding and non-coding full-length transcripts.
Citation Format: Peng Xu, Joseph Liu, Yana Ryan, Kazuo Tori, Xuan Li, Hima Anbunathan, Mike Covington, Tomoya Uchiyama, Mohammad Fallahi, Xuan Qu, Xiaoyun Xing, Ting Wang, Bryan Bell, Shuwen Chen, Yue Yun, Andrew Farmer. A novel, high-throughput full-length scRNA-seq workflow for improved biomarker discovery [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 316.
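The scale figures quoted in the Methods and Results can be cross-checked with simple arithmetic. The Python sketch below is illustrative only and is not part of the authors' workflow or the Cogent software. It assumes one barcode per well and per nanowell (the abstract gives well counts, not barcode counts) and that every nanowell is used at full load. The function and constant names are hypothetical.

```python
# Figures taken from the abstract; the one-barcode-per-well mapping is an assumption.
FIRST_ROUND_WELLS = 96          # 96-well plate, first barcoding step
SECOND_ROUND_NANOWELLS = 5_184  # nanowell chip, second barcoding step
MAX_CELLS = 60_000              # stated upper limit per run
PROFILED_CELLS = 11_000         # approximate number of A549 cells profiled
READS_PER_CELL = 100_000        # stated read depth per cell


def barcode_combinations(first: int, second: int) -> int:
    """Number of distinct (first-round, second-round) barcode pairs."""
    return first * second


def total_reads(cells: int, reads_per_cell: int) -> int:
    """Sequencing reads needed to reach the given depth for every cell."""
    return cells * reads_per_cell


combos = barcode_combinations(FIRST_ROUND_WELLS, SECOND_ROUND_NANOWELLS)
print(f"barcode combinations: {combos:,}")
print(f"combinations per cell at max load: {combos / MAX_CELLS:.1f}")
# Assumes all cells are dispensed and all nanowells are used.
print(f"mean cells per nanowell at max load: {MAX_CELLS / SECOND_ROUND_NANOWELLS:.1f}")
print(f"reads for profiled cells: {total_reads(PROFILED_CELLS, READS_PER_CELL):,}")
```

Under these assumptions, two rounds give 497,664 barcode pairs, roughly eight per cell at the stated 60,000-cell limit, and the 11,000-cell experiment corresponds to about 1.1 billion reads at the reported depth.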
oncology