Identification of patients' smoking status using an explainable AI approach: a Danish electronic health records case study

Ali Ebrahimi, Margrethe Bang Høstgaard Henriksen, Claus Lohman Brasen, Ole Hilberg, Torben Frøstrup Hansen, Lars Henrik Jensen, Abdolrahman Peimankar, Uffe Kock Wiil
DOI: https://doi.org/10.1186/s12874-024-02231-4
2024-05-18
BMC Medical Research Methodology
Abstract: Smoking is a critical risk factor responsible for over eight million deaths worldwide each year. Information on smoking habits is essential for advancing research and implementing preventive measures such as screening of high-risk individuals. In most countries, including Denmark, smoking habits are not systematically recorded and are at best documented within unstructured free-text segments of electronic health records (EHRs). Researchers and clinicians must therefore navigate extensive amounts of unstructured data manually, which is one of the main reasons that smoking habits are rarely integrated into larger studies. Our aim is to develop machine learning models to classify patients' smoking status from their EHRs.
health care sciences & services