Near-pair patch generative adversarial network for data augmentation of focal pathology object detection models

Ethan Tu, Jonathan Burkow, Andy Tsai, Joseph Junewick, Francisco A Perez, Jeffrey Otjen, Adam M Alessio
DOI: https://doi.org/10.1117/1.JMI.11.3.034505
Abstract:
Purpose: The limited volume of medical training data remains one of the leading challenges in machine learning for diagnostic applications. Object detectors that identify and localize pathologies require training with a large volume of labeled images, which are often expensive and time-consuming to curate. To mitigate this challenge, we present a method to support distant supervision of object detectors through generation of synthetic pathology-present labeled images.
Approach: Our method employs the previously proposed cyclic generative adversarial network (cycleGAN) with two key innovations: (1) use of "near-pair" pathology-present regions and pathology-absent regions from similar locations in the same subject for training and (2) the addition of a realism metric (Fréchet inception distance) to the generator loss term. We trained and tested this method with 2800 fracture-present and 2800 fracture-absent image patches from 704 unique pediatric chest radiographs. The trained model was then used to generate synthetic pathology-present images with exact knowledge of the location (labels) of the pathology. These synthetic images provided an augmented training set for an object detector.
Results: In an observer study, four pediatric radiologists used a five-point Likert scale indicating the likelihood of a real fracture (1 = definitely not a fracture and 5 = definitely a fracture) to grade a set of real fracture-absent, real fracture-present, and synthetic fracture-present images. The real fracture-absent images scored 1.7±1.0, real fracture-present images 4.1±1.2, and synthetic fracture-present images 2.5±1.2. An object detector model (YOLOv5) trained on a mix of 500 real and 500 synthetic radiographs achieved a recall of 0.57±0.05 and an F2 score of 0.59±0.05. In comparison, when trained on only 500 real radiographs, the recall and F2 score were 0.49±0.06 and 0.53±0.06, respectively.
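For readers unfamiliar with the F2 score reported above, the following minimal Python sketch shows the standard F-beta formula, which with beta = 2 weights recall more heavily than precision. The function name `f_beta` and the example values are illustrative assumptions; the abstract does not report precision, so no attempt is made to reproduce the paper's numbers.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score; beta = 2 gives the F2 score.

    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    """
    denom = beta ** 2 * precision + recall
    if denom == 0.0:
        # Convention: score is 0 when both precision and recall are 0.
        return 0.0
    return (1.0 + beta ** 2) * precision * recall / denom


# Hypothetical values, chosen only to show that F2 is pulled toward recall.
high_recall = f_beta(precision=0.5, recall=1.0)
high_precision = f_beta(precision=1.0, recall=0.5)
print(high_recall > high_precision)
```

Because missed fractures are typically costlier than false alarms in a screening setting, a recall-weighted metric such as F2 is a natural choice for this task.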
Conclusions: Our proposed method generates visually realistic pathology and provides improved object detector performance for the task of rib fracture detection.
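The realism term described in the Approach can be illustrated with a minimal sketch of the Fréchet distance between two sets of feature vectors, which is the quantity underlying the Fréchet inception distance. This is not the authors' implementation: the feature arrays stand in for Inception network activations, the function names and the weights `lambda_cyc` and `lambda_fid` are hypothetical, and a real training loop would need a differentiable version written in a deep learning framework.

```python
import numpy as np


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input has shape (n_samples, n_features) with n_features >= 2.
    d^2 = ||mu_a - mu_b||^2 + Tr(C_a) + Tr(C_b) - 2 Tr((C_a C_b)^(1/2))
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # The trace of the matrix square root equals the sum of the square
    # roots of the eigenvalues of C_a C_b, which are real and non-negative
    # for covariance matrices (clipping guards against round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


def generator_loss(adv: float, cyc: float, fid: float,
                   lambda_cyc: float = 10.0, lambda_fid: float = 1.0) -> float:
    """Hypothetical weighted sum: adversarial + cycle-consistency + realism."""
    return adv + lambda_cyc * cyc + lambda_fid * fid


rng = np.random.default_rng(0)
real_feats = rng.normal(size=(200, 4))   # placeholder for real-patch features
fake_feats = real_feats + 3.0            # placeholder for synthetic-patch features
print(frechet_distance(real_feats, fake_feats))
```

In this toy case the two feature sets differ only by a constant shift, so the distance reduces to the squared norm of the mean difference; lower values indicate synthetic features that are statistically closer to real ones.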