Semi-supervised Text Regression with Conditional Generative Adversarial Networks

Tao Li, Xudong Liu, Shihan Su
DOI: https://doi.org/10.1109/BigData.2018.8622140
2018-11-11
Abstract: Enormous online textual information provides intriguing opportunities for understandings of social and economic semantics. In this paper, we propose a novel text regression model based on a conditional generative adversarial network (GAN), with an attempt to associate textual data and social outcomes in a semi-supervised manner. Besides promising potential of predicting capabilities, our superiorities are twofold: (i) the model works with unbalanced datasets of limited labelled data, which align with real-world scenarios; and (ii) predictions are obtained by an end-to-end framework, without explicitly selecting high-level representations. Finally we point out related datasets for experiments and future research directions.
Computation and Language, Artificial Intelligence, Computational Finance, Machine Learning
What problem does this paper attempt to address?
The paper addresses the problem of associating text data with social outcomes in a semi-supervised setting. Specifically, it proposes a text regression model based on a conditional generative adversarial network (GAN) that aims to make useful predictions from unbalanced datasets with limited labelled data. The paper highlights two advantages of the model:
1. **Works with unbalanced, sparsely labelled datasets**: The model is designed for datasets that are unbalanced and contain only a small amount of labelled data, a situation that is common in real-world scenarios.
2. **End-to-end framework**: Predictions are obtained end to end from the input text, without explicitly selecting high-level representations.

The paper also points out related datasets for experiments and future research directions, emphasizing the model's potential value in prediction tasks such as automobile sales prediction, public opinion tracking, and epidemiological monitoring on social media.
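To make the semi-supervised conditional GAN idea concrete, the following is a minimal sketch of how the two adversarial losses could be computed for text regression. It is an illustration under stated assumptions, not the authors' architecture: the linear `generator`, the logistic `discriminator`, the helper `gan_losses`, the two-dimensional text features, and all weights are hypothetical. The assumed setup is that the generator predicts an outcome from text features plus noise, and the discriminator scores whether a (text, outcome) pair comes from the labelled data.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def generator(x, z, w_g):
    """Hypothetical generator: predicts an outcome from text features x and noise z."""
    return dot(w_g, x + [z])


def discriminator(x, y, w_d):
    """Hypothetical discriminator: probability that (x, y) is a real labelled pair."""
    return sigmoid(dot(w_d, x + [y]))


def gan_losses(labelled, unlabelled, w_g, w_d, rng):
    """Standard GAN cross-entropy losses, conditioned on the text features.

    Real pairs come only from the labelled data; generated pairs use every
    text, labelled or not, which is where the unlabelled data contributes.
    """
    eps = 1e-12  # guards against log(0)
    d_real = [discriminator(x, y, w_d) for x, y in labelled]
    texts = [x for x, _ in labelled] + unlabelled
    d_fake = [
        discriminator(x, generator(x, rng.gauss(0.0, 1.0), w_g), w_d)
        for x in texts
    ]
    d_loss = -(
        sum(math.log(p + eps) for p in d_real) / len(d_real)
        + sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    )
    g_loss = -sum(math.log(p + eps) for p in d_fake) / len(d_fake)
    return d_loss, g_loss


# Toy data: two labelled texts and three unlabelled ones (made-up features).
rng = random.Random(0)
labelled = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]
unlabelled = [[0.5, 0.5], [1.0, 1.0], [0.2, 0.8]]
w_g = [0.1, -0.2, 0.05]
w_d = [0.0, 0.0, 0.0]  # an all-zero discriminator outputs 0.5 for every pair
d_loss, g_loss = gan_losses(labelled, unlabelled, w_g, w_d, rng)
print(d_loss, g_loss)
```

In a real system the hand-written features would be replaced by a learned text encoder and both networks would be trained by alternating gradient steps on these losses; the sketch only shows how labelled and unlabelled texts enter the objective.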