Knowledge abstraction and filtering based federated learning over heterogeneous data views in healthcare

Anshul Thakur, Soheila Molaei, Pafue Christy Nganjimi, Andrew Soltan, Patrick Schwab, Kim Branson, David A. Clifton
DOI: https://doi.org/10.1038/s41746-024-01272-9
IF: 15.2
2024-10-17
npj Digital Medicine
Abstract: Robust data privacy regulations hinder the exchange of healthcare data among institutions, crucial for global insights and developing generalised clinical models. Federated learning (FL) is ideal for training global models using datasets from different institutions without compromising privacy. However, disparities in electronic healthcare records (EHRs) lead to inconsistencies in ML-ready data views, making FL challenging without extensive preprocessing and information loss. These differences arise from variations in services, care standards, and record-keeping practices. This paper addresses data view heterogeneity by introducing a knowledge abstraction and filtering-based FL framework that allows FL over heterogeneous data views without manual alignment or information loss. The knowledge abstraction and filtering mechanism maps raw input representations to a unified, semantically rich shared space for effective global model training. Experiments on three healthcare datasets demonstrate the framework's effectiveness in overcoming data view heterogeneity and facilitating information sharing in a federated setup.
health care sciences & services, medical informatics
What problem does this paper attempt to address?