BioMM: Biologically-informed Multi-stage Machine learning for identification of epigenetic fingerprints

Junfang Chen, Emanuel Schwarz
DOI: https://doi.org/10.48550/arXiv.1712.00336
2017-12-01
Quantitative Methods
Abstract: The identification of reproducible biological patterns from high-dimensional data is a bottleneck for understanding the biology of complex illnesses such as schizophrenia. To address this, we developed a biologically informed, multi-stage machine learning (BioMM) framework. BioMM incorporates biological pathway information to stratify and aggregate high-dimensional biological data. We demonstrate the utility of this method using genome-wide DNA methylation data and show that it substantially outperforms conventional machine learning approaches. The BioMM framework may therefore be a fruitful machine learning strategy for high-dimensional data and a basis for future integrative analysis approaches.
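The abstract does not spell out the individual stages, so the following is only an illustrative sketch of one way "stratify and aggregate" could be read, not the authors' implementation. It assumes a two-stage design: features are first grouped by pathway and each group is reduced to a single out-of-fold score, then a second model is fitted on those pathway-level scores. The ridge learner, the fold count, the synthetic data, and the names `pathways`, `stage1_scores`, `fit_ridge`, and `predict_ridge` are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_ridge(X, y, lam=1.0):
    """Fit ridge regression with an unpenalised intercept; returns the weights."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    penalty = lam * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalised
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def predict_ridge(w, X):
    """Apply weights from fit_ridge to new rows."""
    return np.hstack([np.ones((X.shape[0], 1)), X]) @ w


def stage1_scores(X, y, pathways, n_folds=5, seed=0):
    """Stratify features by pathway; return one out-of-fold score per pathway."""
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    scores = np.zeros((n, len(pathways)))
    for j, cols in enumerate(pathways.values()):
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(n), test_idx)
            w = fit_ridge(X[np.ix_(train_idx, cols)], y[train_idx])
            # Each sample is scored by a model that never saw it.
            scores[test_idx, j] = predict_ridge(w, X[np.ix_(test_idx, cols)])
    return scores


# Synthetic stand-in for methylation data (assumption: not real data).
rng = np.random.default_rng(42)
n_samples, n_features = 120, 30
X = rng.normal(size=(n_samples, n_features))
# Only the first five features carry signal in this toy example.
signal = X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=n_samples)
y = (signal > 0).astype(float)

# Hypothetical pathway-to-feature map; a real one would come from an annotation database.
pathways = {
    "pathway_A": list(range(0, 5)),
    "pathway_B": list(range(5, 15)),
    "pathway_C": list(range(15, 30)),
}

# Stage 1: aggregate each pathway's features into a single score per sample.
Z = stage1_scores(X, y, pathways)

# Stage 2: fit a model on the pathway-level scores.
w2 = fit_ridge(Z, y)
pred = (predict_ridge(w2, Z) > 0.5).astype(float)
accuracy = float((pred == y).mean())  # in-sample, for illustration only
```

Because the stage-2 accuracy here is computed on the same samples used to fit the stage-2 model, it is optimistic; an honest evaluation would hold out a separate test set or nest the whole procedure inside an outer cross-validation loop.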