Structured Information Extraction of Pathology Reports with Attention-based Graph Convolutional Network.

Jialun Wu, Kaiwen Tang, Haichuan Zhang, Chunbao Wang, Chen Li
DOI: https://doi.org/10.1109/BIBM49941.2020.9313347
2020-01-01
Abstract: Electronic medical data contains biochemical, imaging, and pathological information generated during diagnosis and treatment. The pathology report, a highly free-form kind of unstructured text, is the basis and gold standard of cancer diagnosis and is very important for patient prognosis and treatment. Applying information extraction technology to pathology reports yields structured data that computers can understand and analyze, helping pathologists make appropriate decisions. In this work, we propose an attention-based graph convolutional network (GCN) that converts unstructured pathology reports into a structured form suitable for computer analysis, in order to improve pathologists' current workflow, gather medical data from different platforms, and provide more accurate assistance for diagnosis and treatment. We used pathology report data from The Cancer Genome Atlas (TCGA) database, with fine-grained annotations on 3632 pathology reports covering four types of cancer. On our pathology report dataset, our method achieves a higher F1 score than both traditional methods and other deep learning methods. The results indicate that our method is robust and may therefore generalize to pathology reports for other cancer types.