Semi-supervised classification by reaching consensus among modalities

Zining Zhu,Jekaterina Novikova,Frank Rudzicz
DOI: https://doi.org/10.48550/arXiv.1805.09366
2018-11-20
Abstract:Deep learning has demonstrated abilities to learn complex structures, but they can be restricted by available data. Recently, Consensus Networks (CNs) were proposed to alleviate data sparsity by utilizing features from multiple modalities, but they too have been limited by the size of labeled data. In this paper, we extend CN to Transductive Consensus Networks (TCNs), suitable for semi-supervised learning. In TCNs, different modalities of input are compressed into latent representations, which we encourage to become indistinguishable during iterative adversarial training. To understand TCNs two mechanisms, consensus and classification, we put forward its three variants in ablation studies on these mechanisms. To further investigate TCN models, we treat the latent representations as probability distributions and measure their similarities as the negative relative Jensen-Shannon divergences. We show that a consensus state beneficial for classification desires a stable but imperfect similarity between the representations. Overall, TCNs outperform or align with the best benchmark algorithms given 20 to 200 labeled samples on the Bank Marketing and the DementiaBank datasets.
Machine Learning,Multimedia,Sound,Audio and Speech Processing,Signal Processing
What problem does this paper attempt to address?