Synthetic Generation of Patient Service Utilization Data: A Scalability Study

Joseph Howie, Sowmya Balasubramanian, Jonas Bambi, Kenneth Moselle, Venkatesh Srinivasan, Alex Thomo
DOI: https://doi.org/10.3233/SHTI240511
2024-08-22
Abstract: To address privacy and ethical issues in using health data for machine learning, we evaluate the scalability of advanced synthetic data generation methods, such as GANs, VAEs, copulaGAN, and transformer models, specifically for patient service utilization data. Our study examines five models on data from a Canadian health authority, focusing on training and generation efficiency, data resemblance, and practical utility. Our findings indicate that statistical models excel in efficiency, while most models produce synthetic data that closely mirrors real data and is useful for real-world applications.