Expanding the use of clustering and dimensionality reduction in high parameter flow cytometry data through machine learning for novel samples.

Joseph Cornelius Lownik, Simeon Mahov, Serhan Alkan, Akil Merchant, Sumire Kitahara
DOI: https://doi.org/10.4049/jimmunol.208.supp.172.03
2022-05-01
The Journal of Immunology
Abstract: High-parameter flow cytometry is a widely used tool for accurate immunophenotyping and diagnostics in immunology, oncology, and many other disciplines. Over the past decade, several methods have been developed for automated clustering and dimensionality reduction of high-parameter flow cytometry data, which have sped up and simplified the discovery of cell populations not observed with manual gating strategies. However, the input and output of such tools are stochastic in nature, which makes their results difficult to reuse with novel samples. To address this challenge, we present a method that uses machine learning to predict cluster labels and dimensionality reduction coordinates for novel samples. For proof of principle, we used high-parameter (22-marker) flow cytometry data examining myeloid hematopoiesis in bone marrow aspirate samples. Training data for machine learning consisted of pooled normal bone marrow. Phenograph was used for initial population clustering and UMAP for dimensionality reduction. A random forest model was found to be most accurate in predicting Phenograph clusters (98%) on novel data, while a k-nearest neighbors (kNN) model was found to be most accurate for UMAP coordinate prediction. The utility of this model was observed by examining acute myeloid leukemia samples with aberrant immunophenotypes, which were not included in the training data set. For these samples, predicted clusters and UMAP coordinates correlated with early progenitor populations from the training set. This approach allows a de novo sample to be assigned clusters and dimensionality reduction coordinates from an already established and characterized model, allowing for rapid, automated interpretation of high-dimensional flow cytometry data. Supported by Cedars-Sinai Pathology and Laboratory Medicine Minigrant.
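The idea of projecting a novel sample onto an established model can be illustrated with a small sketch. It is not the authors' implementation: the data are invented toy values with two hypothetical markers instead of 22, the labels and 2-D coordinates stand in for Phenograph clusters and UMAP coordinates, and the helper names (`k_nearest`, `predict_cluster`, `predict_embedding`) are made up for illustration. A nearest-neighbor majority vote is used here in place of the random forest that the abstract reports as most accurate for cluster prediction, purely to keep the example dependency-free. The coordinate prediction is plain kNN regression.

```python
import math
from collections import Counter

def k_nearest(train_X, x, k):
    """Return indices of the k training events closest to x (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return order[:k]

def predict_cluster(train_X, train_labels, x, k=3):
    """Majority vote among the k nearest training events.

    Assumption: this vote is a simple stand-in for the random forest
    classifier described in the abstract.
    """
    idx = k_nearest(train_X, x, k)
    return Counter(train_labels[i] for i in idx).most_common(1)[0][0]

def predict_embedding(train_X, train_coords, x, k=3):
    """kNN regression: average the 2-D coordinates of the k nearest training events."""
    idx = k_nearest(train_X, x, k)
    return tuple(sum(train_coords[i][d] for i in idx) / len(idx) for d in range(2))

# Toy "training set" (all values invented): marker intensities per event,
# a precomputed cluster label, and precomputed 2-D embedding coordinates.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
train_labels = [0, 0, 0, 1, 1, 1]
train_coords = [(-1.0, 0.0), (-1.2, 0.1), (-0.8, -0.1), (2.0, 3.0), (2.2, 3.1), (1.8, 2.9)]

# A novel event is placed into the existing model without re-running
# the stochastic clustering or embedding steps.
new_event = (5.1, 5.0)
cluster = predict_cluster(train_X, train_labels, new_event)
coords = predict_embedding(train_X, train_coords, new_event)
print(cluster, coords)
```

Because the training labels and coordinates are fixed once, every new sample is mapped into the same reference space, which is what makes results comparable across samples.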
immunology