Studying Privacy Aspects of Learned Knowledge Bases in the Context of Synthetic and Medical Data

Xenia Heilmann, Valentin Henkys, Daan Apeldoorn, Konstantin Strauch, Bertil Schmidt, Timm Lilienthal, Torsten Panholzer
DOI: https://doi.org/10.3233/SHTI240866
2024-08-30
Abstract: Introduction: Retrieving comprehensible rule-based knowledge from medical data by machine learning is a beneficial task, e.g., for automating the creation of a decision support system. While this has recently been studied by means of exception-tolerant hierarchical knowledge bases (i.e., knowledge bases in which rule-based knowledge is represented on several levels of abstraction), privacy concerns have not yet been addressed extensively in this context. However, privacy plays an important role, especially for medical applications. Methods: When parts of the original dataset can be restored from a learned knowledge base, there may be a practically and legally relevant risk of re-identification for individuals. In this paper, we study privacy issues of exception-tolerant hierarchical knowledge bases that are learned from data. We propose approaches for determining and eliminating privacy issues in the learned knowledge bases. Results: We present results for synthetic as well as real-world datasets. Conclusion: The results show that our approach effectively prevents privacy breaches while only moderately decreasing the inference quality.