Generate Medical Synthetic Data Based on Generative Adversarial Network

XIANG Xiayu, WANG Jiahui, WANG Zirui, DUAN Shaoming, PAN Hezhong, ZHUANG Rongfei, HAN Peiyi, LIU Chuanyi
DOI: https://doi.org/10.11959/j.issn.1000-436x.2022057
2022-01-01
Abstract: Modeling the probability distribution of rows in structured electronic health records and generating realistic synthetic data is a non-trivial task. Tabular data usually contains discrete columns, and traditional encoding approaches may suffer from the curse of dimensionality. The Poincaré ball model was used to capture the hierarchical structure of nominal variables, and a Gaussian copula-based generative adversarial network was employed to generate synthetic structured electronic health records. Experiments show that the generated training data differ in utility from the original data by only 2% while preserving privacy.
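As a rough illustration of why the Poincaré ball suits hierarchical nominal variables, the sketch below computes the standard geodesic distance in the unit ball, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). It is not the authors' implementation: the function name `poincare_distance` and the 2-D embeddings for a root category and two child categories are hypothetical, chosen only to show that points near the boundary are far from each other while staying comparatively close to a point near the origin.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Standard Poincaré ball distance formula.
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Hypothetical embeddings (assumed for illustration, not taken from the paper):
# a general category at the origin and two specific categories near the boundary.
root = [0.0, 0.0]
child_a = [0.0, 0.9]
child_b = [0.9, 0.0]

d_root_child = poincare_distance(root, child_a)
d_siblings = poincare_distance(child_a, child_b)
print(f"root to child: {d_root_child:.3f}")
print(f"child to child: {d_siblings:.3f}")
```

In this toy setup the two child categories are farther from each other than either is from the root, which mirrors a tree: the shortest path between siblings effectively passes through their parent. A low-dimensional embedding of this kind is one way to avoid a one-hot vector whose width grows with the number of category values.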