DREAMER: a computational framework to evaluate readiness of datasets for machine learning

Meysam Ahangaran, Hanzhi Zhu, Ruihui Li, Lingkai Yin, Joseph Jang, Arnav P. Chaudhry, Lindsay A. Farrer, Rhoda Au, Vijaya B. Kolachalama
DOI: https://doi.org/10.1186/s12911-024-02544-w
IF: 3.298
2024-06-06
BMC Medical Informatics and Decision Making
Abstract: Machine learning (ML) has emerged as the predominant computational paradigm for analyzing large-scale datasets across diverse domains. The assessment of dataset quality stands as a pivotal precursor to the successful deployment of ML models. In this study, we introduce DREAMER (Data REAdiness for MachinE learning Research), an algorithmic framework leveraging supervised and unsupervised machine learning techniques to autonomously evaluate the suitability of tabular datasets for ML model development. DREAMER is openly accessible as a tool on GitHub and Docker, facilitating its adoption and further refinement within the research community.
medical informatics