Abstract 1932: Pollock: Fishing for Cell States

Erik Storrs, Daniel Cui Zhou, Michael C. Wendl, Matthew A. Wyczalkowski, Alla Karpova, Liang-Bo Wang, Yize Li, Austin Southard-Smith, Reyka G. Jayasinghe, Lijun Yao, Ruiyang Liu, Yige Wu, Nadezhda V. Terekhanova, Houxiang Zhu, John M. Herndon, Feng Chen, William E. Gillanders, Ryan C. Fields, Li Ding
DOI: https://doi.org/10.1158/1538-7445.am2022-1932
IF: 11.2
2022-01-01
Cancer Research
Abstract: The use of single-cell methods is expanding at an ever-increasing rate. While multiple algorithms address the task of cell classification, they are limited in cross-platform compatibility, in their reliance on the availability of a reference dataset, and in classification interpretability. Here, we introduce Pollock, a suite of algorithms for cell type identification that is compatible with popular single-cell methods and analysis platforms, provides a series of pretrained human cancer reference models, and reports interpretability scores that identify the genes that drive cell type classifications. Our model combines two approaches, a variational autoencoder (VAE) from deep learning and a random forest classifier from machine learning, to make cell type predictions. Pollock is highly versatile: it is available as a command line tool, a Python library (with scanpy integration), or an R library (with Seurat integration), and it can be installed as a conda package or in containerized form via Docker. To allow for easier pan-disease and pan-tissue analyses, Pollock also ships with a library of pretrained cancer-type-specific and cancer-type-agnostic modules that are ready to “plug and play” with no additional annotation or training required. These pretrained models were fitted on manually curated and annotated single-cell data from eight different cancer types spanning three single-cell technologies (scRNA-seq, snRNA-seq, and snATAC-seq). Conversely, Pollock also allows for the training of custom classification modules if an annotated reference single-cell dataset is available. Pollock also provides feature importance scores that trace each cell type classification back to the genes that influenced it, further promoting biological interpretability. These scores could allow for new, technology-specific biomarker discovery. We also demonstrate the utility of Pollock by applying it in a pan-cancer single-cell immune analysis.
Citation Format: Erik Storrs, Daniel Cui Zhou, Michael C. Wendl, Matthew A. Wyczalkowski, Alla Karpova, Liang-Bo Wang, Yize Li, Austin Southard-Smith, Reyka G. Jayasinghe, Lijun Yao, Ruiyang Liu, Yige Wu, Nadezhda V. Terekhanova, Houxiang Zhu, John M. Herndon, Feng Chen, William E. Gillanders, Ryan C. Fields, Li Ding. Pollock: Fishing for cell states [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1932.
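The two-stage design named in the abstract (a learned low-dimensional embedding followed by a random forest classifier) can be illustrated with a minimal sketch. This is not Pollock's API or implementation: the data are synthetic, the cell type names `type_A` and `type_B` are hypothetical, and PCA is swapped in for the VAE encoder purely to keep the example short and dependency-light. It assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic "expression matrix": 200 cells x 50 genes, two hypothetical cell types.
n_cells, n_genes = 200, 50
labels = np.array(["type_A"] * 100 + ["type_B"] * 100)
counts = rng.poisson(lam=2.0, size=(n_cells, n_genes)).astype(float)
counts[:100, :5] += rng.poisson(lam=10.0, size=(100, 5))    # marker genes for type_A
counts[100:, 5:10] += rng.poisson(lam=10.0, size=(100, 5))  # marker genes for type_B

# Log-transform the counts, a common preprocessing step for expression data.
log_counts = np.log1p(counts)

# Hold out a quarter of the cells to check the predictions.
order = rng.permutation(n_cells)
train, test = order[:150], order[150:]

# Stage 1: embed cells in a low-dimensional space (PCA here, standing in for a VAE encoder).
# Stage 2: classify the embeddings with a random forest.
model = make_pipeline(
    PCA(n_components=10, random_state=0),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(log_counts[train], labels[train])

predicted = model.predict(log_counts[test])
accuracy = float((predicted == labels[test]).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

Separating the embedding from the classifier means either stage can be replaced independently; in the abstract's design, the first stage is a VAE rather than the linear projection used here.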