A Progressive Growing Generative Adversarial Network Composed of Enhanced Style-Consistent Modulation for Fetal Ultrasound Four-Chamber View Editing Synthesis

Sibo Qiao, Shanchen Pang, Gang Luo, Pengfei Xie, Wenjing Yin, Silin Pan, Zhihan Lyu
DOI: https://doi.org/10.1016/j.engappai.2024.108438
IF: 8
2024-01-01
Engineering Applications of Artificial Intelligence
Abstract: Fetal ultrasound (US) four-chamber (FC) views are essential for diagnosing congenital heart defects (CHD). Gathering enough diverse, high-quality fetal US FC views within a short period remains a significant challenge when training novice physicians or intelligent assisted-diagnosis models. As a data augmentation technique, medical image editing synthesis can generate views with the desired structures in specified regions, which greatly facilitates the training of both novice physicians and intelligent models. However, because the views contain heavy noise and artifacts, existing approaches do not reliably synthesize textures in the editable areas that align with the external guidance, and they struggle to keep the contextual style consistent with the preserved areas. To address these issues, we propose enhanced style-consistent modulation (ESCM), a two-stage modulation technique that combines external semantic and sketch contours with contextual style to produce enhanced context-aware modulation parameters. We also develop an ESCM-based progressive growing generative adversarial network (ESCMPGGAN), which decomposes the synthesis process into three stages to synthesize faithful textures in a coarse-to-fine manner. Moreover, we propose a sketch mask loss that increases ESCMPGGAN's emphasis on the background during training, thereby fully exploiting the sketch information to improve the model's ability to learn background textures. Experimental results demonstrate that the proposed method synthesizes high-quality views with realistic appearance details and a contextual style that is consistent between the editable and preserved regions. In addition, the synthesized views meet clinical standards, confirming the effectiveness of our approach.
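The abstract does not give the formula for the sketch mask loss, so the following is only an illustrative sketch of the general idea of a mask-weighted reconstruction loss that penalizes background errors more heavily. Everything here is an assumption and not the authors' definition: the function name `masked_l1_loss`, the `background_weight` parameter, the use of an L1 error, and the convention that the mask is 1.0 on sketch contours and 0.0 elsewhere.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def masked_l1_loss(pred: Image, target: Image, sketch_mask: Image,
                   background_weight: float = 2.0) -> float:
    """Weighted mean absolute error (hypothetical formulation).

    sketch_mask holds 1.0 on sketch-contour pixels and 0.0 elsewhere.
    Pixels off the contours (treated here as background) are weighted by
    `background_weight`; contour pixels keep weight 1.0.
    """
    total = 0.0
    weight_sum = 0.0
    for p_row, t_row, m_row in zip(pred, target, sketch_mask):
        for p, t, m in zip(p_row, t_row, m_row):
            # Interpolate between contour weight (1.0) and background weight.
            w = m * 1.0 + (1.0 - m) * background_weight
            total += w * abs(p - t)
            weight_sum += w
    # Normalize by the total weight so the loss stays on the pixel scale.
    return total / weight_sum if weight_sum else 0.0


# Same one-pixel error, placed on the background versus on a contour.
pred = [[0.0, 1.0]]
target = [[1.0, 1.0]]
loss_error_on_background = masked_l1_loss(pred, target, [[0.0, 1.0]])
loss_error_on_contour = masked_l1_loss(pred, target, [[1.0, 0.0]])
```

With these assumed weights, an identical error costs more when it falls on a background pixel than on a contour pixel, which is one simple way a loss can push a generator to attend to background texture.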