Methodology for biomarker discovery with reproducibility in microbiome data using machine learning

David Rojas-Velazquez, Sarah Kidwai, Aletta D. Kraneveld, Alberto Tonda, Daniel Oberski, Johan Garssen, Alejandro Lopez-Rincon
DOI: https://doi.org/10.1186/s12859-024-05639-3
IF: 3.307
2024-01-17
BMC Bioinformatics
Abstract: In recent years, human microbiome studies have received increasing attention because the field is considered a potential source of clinical applications. With advances in omics technologies and AI, research on discovering potential biomarkers in the human microbiome using machine learning tools has produced positive outcomes. Despite these promising results, such studies still face several issues, including datasets with a small number of samples, inconsistent results, a lack of uniform processing and methodologies, and other factors that lead to a lack of reproducibility in biomedical research. In this work, we propose a methodology that combines the DADA2 pipeline for 16S rRNA sequence processing with Recursive Ensemble Feature Selection (REFS) across multiple datasets to increase reproducibility and obtain robust and reliable results in biomedical research.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology