Multimodal image synthesis based on disentanglement representations of anatomical and modality specific features, learned using uncooperative relativistic GAN

Sureerat Reaungamornrat,Hasan Sari,Ciprian Catana,Ali Kamen
DOI: https://doi.org/10.1016/j.media.2022.102514
Abstract:Growing number of methods for attenuation-coefficient map estimation from magnetic resonance (MR) images have recently been proposed because of the increasing interest in MR-guided radiotherapy and the introduction of positron emission tomography (PET) MR hybrid systems. We propose a deep-network ensemble incorporating stochastic-binary-anatomical encoders and imaging-modality variational autoencoders, to disentangle image-latent spaces into a space of modality-invariant anatomical features and spaces of modality attributes. The ensemble integrates modality-modulated decoders to normalize features and image intensities based on imaging modality. Besides promoting disentanglement, the architecture fosters uncooperative learning, offering ability to maintain anatomical structure in a cross-modality reconstruction. Introduction of a modality-invariant structural consistency constraint further enforces faithful embedding of anatomy. To improve training stability and fidelity of synthesized modalities, the ensemble is trained in a relativistic generative adversarial framework incorporating multiscale discriminators. Analyses of priors and network architectures as well as performance validation were performed on computed tomography (CT) and MR pelvis datasets. The proposed method demonstrated robustness against intensity inhomogeneity, improved tissue-class differentiation, and offered synthetic CT in Hounsfield units with intensities consistent and smooth across slices compared to the state-of-the-art approaches, offering median normalized mutual information of 1.28, normalized cross correlation of 0.97, and gradient cross correlation of 0.59 over 324 images.
What problem does this paper attempt to address?