Abstract: Biochemical modifications to mRNA, especially N6-methyladenosine (m6A) and 5-methylcytosine (m5C), have recently been shown to be associated with crucial biological functions. Despite these advances, little is known so far about the dynamic landscape of the RNA methylome across different cell types or about how the epitranscriptome is regulated at the system level by enzymes, i.e., RNA methyltransferases and demethylases. To investigate this issue, a meta-analysis of m6A MeRIP-Seq datasets collected from 10 different experimental conditions (cell type/tissue or treatment) is performed. The combinatorial epitranscriptome, which consists of 42758 m6A sites, is extracted and divided into 3 clusters in which the methylation sites are likely to be hyper- or hypo-methylated simultaneously (co-methylated), indicating that they share a common methylation regulator. Four clustering approaches are used to unveil the co-methylation patterns: K-means, hierarchical clustering (HC), a Bayesian factor regression model (BFRM), and nonnegative matrix factorization (NMF). To validate whether the patterns correspond to enzymatic regulators, i.e., RNA methyltransferases or demethylases, the target sites of a known m6A regulator, fat mass and obesity-associated protein (FTO), are identified from an independent mouse MeRIP-Seq dataset and lifted over to human. Our study shows that 3 of the 4 clustering approaches successfully identify a group of methylation sites overlapping with FTO target sites at a significance level of 0.05 (after multiple hypothesis adjustment); among these, the NMF result is the most significant (p-value 2.81 × 10^-6). We define a new approach for evaluating the consistency between two clustering results; it shows that the clustering results of different methods are highly correlated, strongly indicating the existence of co-methylation patterns.
Consistent with recent studies, a number of cancer- and neuronal disease-related biomolecular functions are enriched in the identified clusters; these are biological functions that can be regulated at the epitranscriptional level, indicating the pharmaceutical prospects of RNA N6-methyladenosine-related studies. This result reveals the linkage between the global RNA co-methylation patterns embedded in epitranscriptomic data under multiple experimental conditions and the latent enzymatic regulators, suggesting a promising direction towards a more comprehensive understanding of the epitranscriptome.
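
The validation step described in the abstract, testing whether a cluster of methylation sites overlaps a regulator's target sites more than expected by chance, can be illustrated with a small sketch. The abstract does not name the statistical test or the multiple hypothesis adjustment, so the one-sided hypergeometric test and the Bonferroni correction below are assumptions, as are the function names `hypergeom_upper_tail` and `cluster_enrichment` and the toy site labels.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric: N sites in total, K of them
    regulator targets, n sites in the cluster, k observed overlaps."""
    total = comb(N, n)
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


def cluster_enrichment(labels, target_sites):
    """Test each cluster for enrichment of target sites.

    labels: dict mapping site id -> cluster id (hypothetical input format).
    target_sites: set of site ids bound by the regulator.
    Returns a dict cluster id -> size, overlap, raw and adjusted p-value.
    """
    N = len(labels)
    K = sum(1 for site in labels if site in target_sites)

    # Group site ids by cluster.
    clusters = {}
    for site, cluster in labels.items():
        clusters.setdefault(cluster, []).append(site)

    n_tests = len(clusters)
    results = {}
    for cluster, members in clusters.items():
        n = len(members)
        k = sum(1 for site in members if site in target_sites)
        p = hypergeom_upper_tail(N, K, n, k)
        results[cluster] = {
            "size": n,
            "overlap": k,
            "p": p,
            # Bonferroni adjustment (an assumption, not the paper's stated method).
            "adjusted_p": min(1.0, p * n_tests),
        }
    return results


# Toy data: 20 sites in 3 clusters, 5 of which are regulator targets.
labels = {site: 0 for site in range(0, 5)}
labels.update({site: 1 for site in range(5, 12)})
labels.update({site: 2 for site in range(12, 20)})
targets = {0, 1, 2, 3, 12}

results = cluster_enrichment(labels, targets)
for cluster, row in sorted(results.items()):
    print(cluster, row)
```

In this toy example, four of the five target sites fall in cluster 0, so that cluster is the only one whose adjusted p-value falls below 0.05. Applied to real data, `labels` would come from one of the clustering methods (for example, the cluster assignments derived from NMF), and `target_sites` from the lifted-over FTO target set.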

Clustering Count-based RNA Methylation Data Using a Nonparametric Generative Model

A Nonparametric Bayesian Approach for Clustering Bisulfate-Based DNA Methylation Profiles

Clustering DNA methylation expressions using nonparametric beta mixture model

A sparse negative binomial mixture model for clustering RNA-seq count data

QNB: differential RNA methylation analysis for count-based small-sample sequencing data with a quad-negative binomial model

Nonparametric clustering of RNA-sequencing data

BDBB: A Novel Beta-distribution-based Biclustering Algorithm for Revealing Local Co-methylation Patterns in Epi-transcriptome Profiling Data

MBMM: Moment Estimating Beta Mixture Model-based Clustering Algorithm for m6A Co-methylation Module Mining

DRME: Count-based Differential RNA Methylation Analysis at Small Sample Size Scenario

BBM: A novel beta-binomial-distribution-based biclustering algorithm for mining m6A co-methylation patterns

Spatially Enhanced Differential RNA Methylation Analysis from Affinity-Based Sequencing Data with Hidden Markov Model

A Hierarchical Model for Clustering m6A Methylation Peaks in MeRIP-seq Data

Differential analysis of RNA methylome with improved spatial resolution

Negative Binomial Additive Model for RNA-Seq Data Analysis

A family of mixture models for beta valued DNA methylation data

Decomposition of RNA Methylome Reveals Co-Methylation Patterns Induced by Latent Enzymatic Regulators of the Epitranscriptome

Higher order methylation features for clustering and prediction in epigenomic studies

DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data

Non-parametric Bayesian modelling of digital gene expression data

BayMeth: improved DNA methylation quantification for affinity capture sequencing data using a flexible Bayesian approach

Differential RNA Methylation Analysis for MeRIP-seq Data under General Experimental Design