Visualizing and interpreting single-cell gene expression datasets with Similarity Weighted Nonnegative Embedding

Yan Wu, Pablo Tamayo, Kun Zhang
DOI: https://doi.org/10.1101/276261
2018-03-05
Abstract: High-throughput single-cell gene expression profiling has enabled the characterization of novel cell types and developmental trajectories. Visualizing these datasets is crucial to biological interpretation. The most popular method, t-distributed Stochastic Neighbor Embedding (t-SNE), visualizes local patterns better than other methods but often distorts global structure, such as distances between clusters. We developed Similarity Weighted Nonnegative Embedding (SWNE), which enhances interpretation of datasets by embedding the genes and factors that separate cell states alongside the cells on the visualization, captures local structure better than t-SNE and existing methods, and maintains fidelity when visualizing global structure. SWNE uses nonnegative matrix factorization to decompose the gene expression matrix into biologically relevant factors, embeds the cells, genes and factors in a 2D visualization, and uses a similarity matrix to smooth the embeddings. We demonstrate SWNE on single-cell RNA-seq data from hematopoietic progenitors and human brain cells.
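The three steps named in the abstract (factorize, embed, smooth) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and the abstract does not specify the details, so everything below is an assumption made for illustration: the multiplicative-update NMF, the placement of factors evenly on a unit circle, the cosine similarity between cells, the toy data, and all function names. Gene embedding is omitted for brevity.

```python
import numpy as np

def nmf(A, k, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix A (genes x cells) into W (genes x k) and
    H (k x cells) with multiplicative updates for the Frobenius loss."""
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], k)) + eps
    H = rng.random((k, A.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def factor_layout(k):
    """Assumed placement: put the k factors evenly on the unit circle."""
    angles = 2 * np.pi * np.arange(k) / k
    return np.column_stack([np.cos(angles), np.sin(angles)])

def embed_cells(H, factor_xy, eps=1e-12):
    """Place each cell at the loading-weighted average of the factor positions."""
    Hn = H / (H.max(axis=1, keepdims=True) + eps)        # scale each factor to [0, 1]
    weights = Hn / (Hn.sum(axis=0, keepdims=True) + eps)  # per-cell weights sum to 1
    return weights.T @ factor_xy                          # cells x 2

def smooth(cell_xy, S):
    """Replace each cell's position by the similarity-weighted mean of all cells."""
    P = S / S.sum(axis=1, keepdims=True)                  # row-normalize similarities
    return P @ cell_xy

# Toy data (assumed): 20 genes x 12 cells, two groups of cells with distinct programs.
rng = np.random.default_rng(1)
A = np.abs(rng.normal(size=(20, 12)))
A[:10, :6] += 5.0    # genes 0-9 are high in cells 0-5
A[10:, 6:] += 5.0    # genes 10-19 are high in cells 6-11

W, H = nmf(A, k=2)
xy = embed_cells(H, factor_layout(2))

# Cosine similarity between cells stands in for the similarity matrix.
unit = A / np.linalg.norm(A, axis=0, keepdims=True)
S = unit.T @ unit
xy_smooth = smooth(xy, S)
```

Because each smoothed position is a weighted average of the unsmoothed ones, smoothing pulls similar cells together without moving any point outside the original range of coordinates.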