Unsupervised multi-modal representation learning for affective computing with multi-corpus wearable data

Kyle Ross, Paul Hungler, Ali Etemad
DOI: https://doi.org/10.1007/s12652-021-03462-9
IF: 3.662
2021-10-09
Journal of Ambient Intelligence and Humanized Computing
Abstract: There has been a growing focus on the use of artificial intelligence and machine learning for affective computing to further enhance user experience through emotion recognition. Typically, machine learning models used for affective computing are trained using manually extracted features from biological signals. Such features may not generalize well to large datasets. One approach to address this issue is to use fully supervised deep learning methods to learn latent representations. However, this method requires human supervision to label the data, which may be unavailable. In this work, we propose an unsupervised framework for representation learning. The proposed framework utilizes two stacked convolutional autoencoders to learn latent representations from wearable electrocardiogram and electrodermal activity signals. The representations learned from this unsupervised framework are subsequently utilized within a random forest model to classify arousal. To validate this framework, an aggregation of the AMIGOS, ASCERTAIN, CLEAS, and MAHNOB-HCI datasets is created. The results of our proposed method are compared with other methods, including convolutional neural networks, as well as methods that employ manual extraction of features. We show that our method outperforms current state-of-the-art results. The results show the widespread applicability of stacked convolutional autoencoders to affective computing.
computer science, information systems, telecommunications, artificial intelligence