Advances in cancer DNA methylation analysis with methPLIER: use of non-negative matrix factorization and knowledge-based constraints to enhance biological interpretability

Ken Takasawa, Ken Asada, Syuzo Kaneko, Kouya Shiraishi, Hidenori Machino, Satoshi Takahashi, Norio Shinkai, Nobuji Kouno, Kazuma Kobayashi, Masaaki Komatsu, Takaaki Mizuno, Yu Okubo, Masami Mukai, Tatsuya Yoshida, Yukihiro Yoshida, Hidehito Horinouchi, Shun-Ichi Watanabe, Yuichiro Ohe, Yasushi Yatabe, Takashi Kohno, Ryuji Hamamoto
DOI: https://doi.org/10.1038/s12276-024-01173-7
2024-03-05
Experimental & Molecular Medicine
Abstract: DNA methylation is an epigenetic modification that results in dynamic changes during ontogenesis and cell differentiation. DNA methylation patterns regulate gene expression and have been widely researched. While tools for DNA methylation analysis have been developed, most of them have focused on intergroup comparative analysis within a dataset; therefore, it is difficult to conduct cross-dataset studies, such as rare disease studies or cross-institutional studies. This study describes a novel method for DNA methylation analysis, namely, methPLIER, which enables interdataset comparative analyses. methPLIER combines Pathway Level Information Extractor (PLIER), which is a non-negative matrix factorization (NMF) method, with regularization by a knowledge matrix and transfer learning. methPLIER can be used to perform intersample and interdataset comparative analysis based on latent feature matrices, which are obtained via matrix factorization of large-scale data, and factor-loading matrices, which are obtained through matrix factorization of the data to be analyzed. We used methPLIER to analyze a lung cancer dataset and confirmed that the data decomposition reflected sample characteristics for recurrence-free survival. Moreover, methPLIER can analyze data obtained via different preprocessing methods, thereby reducing distributional bias among datasets due to preprocessing. Furthermore, methPLIER can be employed for comparative analyses of methylation data obtained from different platforms, thereby reducing bias in data distribution due to platform differences. methPLIER is expected to facilitate cross-sectional DNA methylation data analysis and enhance DNA methylation data resources.
biochemistry & molecular biology; medicine, research & experimental
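
The two-stage idea in the abstract (latent features learned from large-scale data, then factor loadings computed for the dataset under analysis) can be illustrated with a minimal sketch. This is not the methPLIER implementation: it uses plain multiplicative-update NMF without the knowledge-matrix regularization, and the ridge projection used for the transfer step is an assumption made here for illustration. The function names `nmf` and `project`, the matrix sizes, and the synthetic data are all hypothetical.

```python
import numpy as np

def nmf(Y, k, n_iter=500, seed=0, eps=1e-9):
    """Plain NMF by multiplicative updates: Y (features x samples) ~ Z @ B.

    Z holds latent features, B holds per-sample loadings; both stay non-negative.
    No knowledge-matrix constraint is applied in this sketch.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = Y.shape
    Z = rng.random((n_features, k))
    B = rng.random((k, n_samples))
    for _ in range(n_iter):
        B *= (Z.T @ Y) / (Z.T @ Z @ B + eps)
        Z *= (Y @ B.T) / (Z @ B @ B.T + eps)
    return Z, B

def project(Z, Y_new, ridge=1e-3):
    """Transfer step (assumed form): keep Z fixed, solve for loadings of new data.

    Ridge regression is used for simplicity, so loadings may be slightly negative.
    """
    k = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + ridge * np.eye(k), Z.T @ Y_new)

# Synthetic stand-ins for methylation matrices (values are non-negative).
rng = np.random.default_rng(42)
Z_true = rng.random((200, 5))
Y_ref = Z_true @ rng.random((5, 300))    # large reference dataset
Y_small = Z_true @ rng.random((5, 12))   # small dataset to be analyzed

Z, B_ref = nmf(Y_ref, k=5)               # learn latent features once
B_small = project(Z, Y_small)            # reuse them on the small dataset
rel_err = np.linalg.norm(Y_small - Z @ B_small) / np.linalg.norm(Y_small)
print(B_small.shape, round(rel_err, 3))
```

Because both datasets are expressed in the same latent space defined by `Z`, their loading matrices can be compared directly, which is the property the abstract relies on for interdataset analysis.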