Identification of differentially methylated single-nucleotide m6A sites by incorporating site-specific antibody specificity

Yang Guo, Zehong Wu, Weisheng Cheng, Zhijun Ren, Yixian Cun, Jinkai Wang
DOI: https://doi.org/10.1101/2024.02.04.578119
2024-02-05
Abstract: Various genome-wide and transcriptome-wide technologies are based on antibodies; however, the specificity of antibodies for different targets has not been characterized or considered in the analyses. The antibody-based MeRIP-seq is the most widely used method to determine the locations of N6-methyladenosine (m6A) on RNAs, especially for differential m6A analyses. However, the antibody specificities in different RNA regions and their resulting technical biases in differential m6A analyses have not been evaluated. Here, we evaluated the m6A antibody specificities using 100 pairs of spike-in RNAs with known m6A levels at single sites. Based on two replicates with different m6A levels on spike-in RNAs, we found that the m6A antibody specificities of the m6A sites on spike-in RNAs varied greatly and were mainly determined by the sequences surrounding the m6A sites. Moreover, the MeRIP-seq signal fold change is a function of the real difference in m6A levels as well as the m6A antibody specificity. We then trained a machine learning model to predict the m6A antibody specificities of given sequences and predicted the m6A antibody specificities of all RNA sequences surrounding the known m6A motif DRACH throughout the human transcriptome. Finally, we developed a Hierarchical statistic model for Differential Analysis of m6A Sites (HDAMS) by taking advantage of the predicted m6A antibody specificities. We found that HDAMS can accurately determine the differentially methylated single-nucleotide m6A sites and output more functionally relevant results. Our study not only provides a powerful tool for differential m6A analyses but also provides a methodological framework for other antibody-based studies to incorporate antibody specificities.
Bioinformatics
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper primarily focuses on accurately identifying differentially methylated m6A sites at single-nucleotide resolution and proposes a novel method to address this issue. Specifically, the paper attempts to solve the following key problems:

1. **Impact of Antibody Specificity on m6A Detection**:
   - The widely used antibody-based MeRIP-seq technology has a significant limitation in detecting m6A modifications, namely the varying antibody specificity for different m6A sites. This variability can lead to technical biases, affecting the accurate identification of differentially methylated m6A sites.
   - The paper evaluates the specificity of m6A antibodies in different RNA regions by synthesizing 100 pairs of RNA probes with known m6A levels and finds that these specificities are mainly determined by the sequences surrounding the m6A sites.
2. **Developing a Machine Learning Model to Predict m6A Antibody Specificity**:
   - The paper trains an ensemble learning model to predict the antibody specificity of each m6A site using sequence features within 75 bp around the m6A site. This model can distinguish between high-specificity and low-specificity m6A sites, providing crucial information for subsequent analyses.
3. **Developing a New Statistical Model HDAMS**:
   - To more accurately identify differentially methylated single-nucleotide m6A sites, the paper develops a hierarchical statistical model (Hierarchical Differential Analysis of m6A Sites, HDAMS). This model integrates the predicted m6A antibody specificity information into the differential methylation analysis, improving detection accuracy and functional relevance.
4. **Validating the Performance of HDAMS**:
   - The paper validates the performance of HDAMS using multiple published MeRIP-seq datasets, including knockdowns of m6A methyltransferases METTL3 and METTL14, directed differentiation of human embryonic stem cells to mesoderm cells, and comparisons between type 2 diabetic islet cells and normal islet cells. The results show that HDAMS outperforms traditional fold-change methods and other existing methods across all experimental datasets.

### Summary

The paper systematically evaluates the specificity of m6A antibodies, develops a machine learning model to predict m6A antibody specificity, and constructs a new statistical model HDAMS to more accurately identify differentially methylated m6A sites at single-nucleotide resolution. These methods not only improve the accuracy of m6A differential analysis but also provide a methodological framework for other antibody-based studies to consider the impact of antibody specificity.
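
### Illustrative sketch: enumerating DRACH candidate sites

The abstract describes predicting antibody specificity for sequences surrounding the m6A motif DRACH. As a rough illustration of that first step only, the sketch below scans a sequence for DRACH (IUPAC codes: D = A/G/U, R = A/G, H = A/C/U) and returns the flanking window around each central A. The function name `find_drach_sites` and the `flank` parameter are hypothetical, and treating "within 75 bp" as 75 nucleotides on each side is an assumption; this is not the authors' code and does not reproduce their model or features.

```python
import re

# DRACH in IUPAC notation: D = A/G/U, R = A/G, H = A/C/U.
# T is accepted in place of U so DNA-style transcript sequences also work.
# The lookahead lets overlapping motif occurrences all be reported.
DRACH = re.compile(r"(?=([AGUT][AG]AC[ACUT]))")


def find_drach_sites(seq, flank=75):
    """Return a list of (a_pos, window) for every DRACH match in seq.

    a_pos is the 0-based index of the central A (the candidate m6A site);
    window is the sequence from a_pos - flank to a_pos + flank, clipped
    at the sequence ends. The flank size is an assumed reading of the
    75 bp context mentioned in the summary.
    """
    seq = seq.upper()
    sites = []
    for match in DRACH.finditer(seq):
        a_pos = match.start() + 2  # third base of the 5-mer is the A
        start = max(0, a_pos - flank)
        end = min(len(seq), a_pos + flank + 1)
        sites.append((a_pos, seq[start:end]))
    return sites


if __name__ == "__main__":
    print(find_drach_sites("CCGGACUCC", flank=2))  # [(4, 'GGACU')]
```

The windows returned here would be the natural input to whatever sequence-based predictor is used downstream; how the paper encodes those windows as features is not stated in this text.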