Gene selection for optimal prediction of cell position in tissues from single-cell transcriptomics data
Jovan Tanevski, Thin Nguyen, Buu Truong, Nikos Karaiskos, Mehmet Eren Ahsen, Xinyu Zhang, Chang Shu, Ke Xu, Xiaoyu Liang, Ying Hu, Hoang VV Pham, Li Xiaomei, Thuc D Le, Adi L Tarca, Gaurav Bhatti, Roberto Romero, Nestoras Karathanasis, Phillipe Loher, Yang Chen, Zhengqing Ouyang, Disheng Mao, Yuping Zhang, Maryam Zand, Jianhua Ruan, Christoph Hafemeister, Peng Qiu, Duc Tran, Tin Nguyen, Attila Gabor, Thomas Yu, Justin Guinney, Enrico Glaab, Roland Krause, Peter Banda, DREAM SCTC Consortium, Gustavo Stolovitzky, Nikolaus Rajewsky, Julio Saez-Rodriguez, Pablo Meyer
DOI: https://doi.org/10.26508/lsa.202000867
IF: 5.781
2020-09-24
Life Science Alliance
Abstract: Single-cell RNA-sequencing (scRNAseq) technologies are rapidly evolving. Although standard scRNAseq experiments are very informative, they lose the spatial organization of the cells in the tissue of origin. Conversely, spatial RNA-seq technologies designed to maintain cell localization have limited throughput and gene coverage. Mapping scRNAseq data to genes with spatial information increases coverage while providing spatial location. However, methods to perform such mapping have not yet been benchmarked. To fill this gap, we organized the DREAM Single-Cell Transcriptomics challenge, focused on the spatial reconstruction of cells from the Drosophila embryo from scRNAseq data, using as a silver standard genes with in situ hybridization data from the Berkeley Drosophila Transcription Network Project reference atlas. The 34 participating teams used diverse algorithms for gene selection and location prediction and were able to correctly localize clusters of cells. Selection of predictor genes was essential for this task. Predictor genes showed relatively high expression entropy and high spatial clustering, and included prominent developmental genes such as gap and pair-rule genes and tissue markers. Application of the top 10 methods to a zebrafish embryo dataset yielded performance and statistical properties of the selected genes similar to those in the Drosophila data. This suggests that methods developed in this challenge are able to extract generalizable properties of genes that are useful for accurately reconstructing the spatial arrangement of cells in tissues.
biology