BDBB: A Novel Beta-distribution-based Biclustering Algorithm for Revealing Local Co-methylation Patterns in Epi-transcriptome Profiling Data

Zhaoyang Liu, Yuteng Xiao, Hongsheng Yin, Xiaodan Li, Shutao Chen, Kaijian Xia, Lin Zhang, Yin Hongsheng
DOI: https://doi.org/10.1109/jbhi.2021.3068783
IF: 7.7
2021-01-01
IEEE Journal of Biomedical and Health Informatics
Abstract: N6-methyladenosine (m<sup>6</sup>A) has been shown to play crucial roles in RNA metabolism, physiology, and pathological processes. However, the specific regulatory mechanisms of most methylation sites remain uncharted because of the complexity of the underlying biological processes. Experimental approaches to this problem are costly, and computational methods are still scarce. Discovering local co-methylation patterns (LCPs) in m<sup>6</sup>A epi-transcriptome data can help address these problems. To this end, we propose a novel beta-distribution-based biclustering algorithm (BDBB) for mining LCPs in m<sup>6</sup>A epi-transcriptome data. BDBB uses Gibbs sampling for parameter estimation. In the model, LCPs are represented as beta distributions that are sharp relative to the background distribution. A simulation study showed that BDBB can recover all three true LCPs implanted in the background data, including the overlaps between them, with accuracy close to 100%. On MeRIP-Seq data covering 69,446 methylation sites under 32 experimental conditions from 10 human cell lines, BDBB unveiled two LCPs; Gene Ontology (GO) enrichment analysis showed that they were enriched in important biological processes such as histone modification and embryo development, respectively. The GOE_Score indicated that the biclustering results of BDBB on the m<sup>6</sup>A epi-transcriptome data are more biologically meaningful than those of other biclustering algorithms.
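The modeling idea in the abstract, that an LCP appears as a sharp beta distribution against a flatter background, can be illustrated with a small sketch. This is not the BDBB algorithm: it replaces Gibbs sampling with a simple method-of-moments fit, and the parameters Beta(40, 10) and Beta(1, 1), the sample sizes, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def clip(x, eps=1e-9):
    """Keep a methylation level strictly inside (0, 1) so the logs are finite."""
    return min(max(x, eps), 1.0 - eps)


def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)


def fit_beta_moments(values):
    """Method-of-moments estimate of (a, b).

    A simple stand-in for parameter estimation; the paper uses Gibbs sampling.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def log_likelihood_ratio(values, pattern, background):
    """Sum of log[p_pattern(v) / p_background(v)] over the given values."""
    return sum(
        beta_logpdf(v, *pattern) - beta_logpdf(v, *background) for v in values
    )


# Hypothetical data: a sharp "pattern" and a flat (uniform) background.
rng = random.Random(0)
pattern_values = [clip(rng.betavariate(40, 10)) for _ in range(200)]
background_values = [clip(rng.betavariate(1, 1)) for _ in range(200)]

pattern_fit = fit_beta_moments(pattern_values)
background_fit = fit_beta_moments(background_values)

# A sharper beta has a larger a + b (higher concentration).
print("pattern a+b:", sum(pattern_fit))
print("background a+b:", sum(background_fit))

# Values from the pattern favor the sharp model; background values do not.
llr_pattern = log_likelihood_ratio(pattern_values, pattern_fit, background_fit)
llr_background = log_likelihood_ratio(background_values, pattern_fit, background_fit)
print("LLR on pattern values:", llr_pattern)
print("LLR on background values:", llr_background)
```

In this sketch the concentration a + b stands in for "sharpness", and the sign of the log-likelihood ratio indicates whether a block of methylation levels is better explained by the sharp pattern or by the background.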
computer science, interdisciplinary applications,mathematical & computational biology,medical informatics, information systems
What problem does this paper attempt to address?