scAuto as a comprehensive framework for single-cell chromatin accessibility data analysis
Meiqin Gong, Yun Yu, Zixuan Wang, Junming Zhang, Xiongyi Wang, Cheng Fu, Yongqing Zhang, Xiaodong Wang
DOI: https://doi.org/10.1016/j.compbiomed.2024.108230
IF: 7.7
2024-02-01
Computers in Biology and Medicine
Abstract: Interpreting single-cell chromatin accessibility data is crucial for understanding the regulation of intercellular heterogeneity. Despite progress in computational methods for analyzing these data, a comprehensive analytical framework and a user-friendly online analysis tool are still lacking. To fill this gap, we developed single-cell auto-correlation transformers (scAuto), a pre-trained deep learning-based framework. Following DNABERT's pre-training and fine-tuning methodology, scAuto first learns a general representation of DNA sequence grammar through self-supervised pre-training on the unlabeled human genome; it is then fine-tuned with supervision on the task of analyzing single-cell chromatin accessibility from scATAC-seq data. We extensively validated scAuto on the Buenrostro2018 dataset, where it showed superior performance in chromatin accessibility prediction, single-cell clustering, and data denoising. Based on scAuto, we further developed an interactive web server for single-cell chromatin accessibility data analysis, which provides tutorial-style interfaces for users with limited programming skills. The platform is accessible at http://zhanglab.icaup.cn. We expect this work to help analyze single-cell chromatin accessibility data and facilitate the development of precision medicine.
engineering, biomedical; computer science, interdisciplinary applications; mathematical & computational biology; biology
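
The self-supervised pre-training step mentioned in the abstract can be illustrated with a minimal sketch of DNABERT-style input preparation: a DNA sequence is split into overlapping k-mers, and some tokens are hidden so a model can be trained to recover them. This is an illustration under stated assumptions, not scAuto's implementation. The function names `kmer_tokenize` and `mask_tokens`, the choice k = 6, the 15% mask rate, and the `[MASK]` token are all hypothetical choices that the abstract does not specify.

```python
import random


def kmer_tokenize(seq, k=6):
    """Split a DNA sequence into overlapping k-mers with stride 1.

    Returns an empty list when the sequence is shorter than k.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Hide a random subset of tokens for a masked-token objective.

    Returns (masked, labels): `masked` is the model input, and `labels`
    holds the original token at each masked position and None elsewhere.
    The mask rate and mask token are assumptions for illustration only.
    """
    if not tokens:
        return [], []
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_mask = max(1, round(len(tokens) * mask_rate))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [mask_token if i in positions else t for i, t in enumerate(tokens)]
    labels = [t if i in positions else None for i, t in enumerate(tokens)]
    return masked, labels


tokens = kmer_tokenize("ACGTACGTAC", k=6)
masked, labels = mask_tokens(tokens)
```

This sketch masks tokens independently for simplicity. Because overlapping k-mers share bases with their neighbours, a realistic setup would likely mask contiguous spans so that the hidden token cannot be read off adjacent tokens.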