Squeezing bottlenecks: exploring the limits of autoencoder semantic representation capabilities

Parth Gupta, Rafael E. Banchs, Paolo Rosso
DOI: https://doi.org/10.48550/arXiv.1402.3070
2014-02-13
Abstract: We present a comprehensive study of autoencoders for modelling text data. Unlike previous studies, we focus on three issues: i) we explore the suitability of two different models, bDA and rsDA, for constructing deep autoencoders for text data at the sentence level; ii) we propose and evaluate two novel metrics for better assessing the text-reconstruction capabilities of autoencoders; and iii) we propose an automatic method to find the critical bottleneck dimensionality for text language representations (below which structural information is lost).
Information Retrieval, Machine Learning