SCENIC: Single-cell regulatory network inference and clustering
Sara Aibar,Carmen Bravo González-Blas,Thomas Moerman,Jasper Wouters,Vân Anh Huynh-Thu,Hana Imrichova,Zeynep Kalender Atak,Gert Hulselmans,Michael Dewaele,Florian Rambow,Pierre Geurts,Jan Aerts,Jean-Christophe Marine,Joost van den Oord,Stein Aerts
DOI: https://doi.org/10.1101/144501
2017-05-31
Abstract: Single-cell RNA-seq allows building cell atlases of any given tissue and inferring the dynamics of cellular state transitions during developmental or disease trajectories. Both the maintenance and the transitions of cell states are encoded by regulatory programs in the genome sequence. However, this regulatory code has not yet been exploited to guide the identification of cellular states from single-cell RNA-seq data. Here we describe a computational resource, called SCENIC (Single Cell rEgulatory Network Inference and Clustering), for the simultaneous reconstruction of gene regulatory networks (GRNs) and the identification of stable cell states, using single-cell RNA-seq data. SCENIC outperforms existing approaches in both cell clustering and transcription factor identification. Importantly, we show that cell state identification based on GRNs is robust to batch effects and technical biases. We applied SCENIC to a compendium of single-cell data from the mouse and human brain and demonstrate that the proper combinations of transcription factors, target genes, enhancers, and cell types can be identified. Moreover, we used SCENIC to map the cell state landscape in melanoma and identified a gene regulatory network underlying a proliferative melanoma state driven by MITF and STAT, and a contrasting network controlling an invasive state governed by NFATC2 and NFIB. We further validated these predictions by showing that two transcription factors are predominantly expressed in early metastatic sentinel lymph nodes. In summary, SCENIC is the first method to analyze scRNA-seq data using a network-centric, rather than cell-centric, approach. SCENIC is generic, easy to use, and flexible, and allows for the simultaneous tracing of genomic regulatory programs and the mapping of cellular identities emerging from these programs. Availability: SCENIC is available as an R workflow based on three new R/Bioconductor packages: GENIE3, RcisTarget and AUCell.
As a scalable alternative to GENIE3, we also provide GRNboost, paving the way towards network analysis across millions of single cells.