Learning Embodied Semantics via Music and Dance Semiotic Correlations

Francisco Afonso Raposo, David Martins de Matos, Ricardo Ribeiro
DOI: https://doi.org/10.48550/arXiv.1903.10534
2019-03-25
Computer Vision and Pattern Recognition
Abstract: Music semantics is embodied, in the sense that meaning is biologically mediated by and grounded in the human body and brain. This embodied cognition perspective also explains why music structures modulate kinetic and somatosensory perception. We leverage this aspect of cognition by considering dance as a proxy for music perception, in a statistical computational model that learns semiotic correlations between music audio and dance video. We evaluate the ability of this model to capture underlying semantics in a cross-modal retrieval task. Quantitative results, validated with statistical significance testing, strengthen the body of evidence for embodied cognition in music and show that the model can recommend music audio for dance video queries and vice versa.
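The abstract does not say which model is used, so the following is only a minimal sketch of the general idea of cross-modal retrieval over correlated modalities. It assumes plain linear canonical correlation analysis (CCA) as a stand-in for the paper's model and uses synthetic feature vectors in place of real audio and video features. All function names (`fit_linear_cca`, `retrieve`, `recall_at_k`, `demo`) and all parameter values are hypothetical and chosen for illustration.

```python
import numpy as np

def fit_linear_cca(X, Y, dim, reg=1e-3):
    """Fit linear CCA on paired rows of X and Y (an assumed stand-in model)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularized covariance and cross-covariance estimates.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, _, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    # Projections onto the top `dim` canonical directions of each view.
    return mx, my, Kx @ U[:, :dim], Ky @ Vt.T[:, :dim]

def retrieve(queries, candidates):
    """Rank candidates for each query by cosine similarity, best first."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return np.argsort(-(q @ c.T), axis=1)

def recall_at_k(ranking, k):
    """Fraction of queries whose paired item (same row index) is in the top k."""
    targets = np.arange(ranking.shape[0])[:, None]
    return float(np.mean(np.any(ranking[:, :k] == targets, axis=1)))

def demo(seed=0, n_train=400, n_test=50):
    """Synthetic 'audio' and 'video' features sharing a hidden factor."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test
    latent = rng.normal(size=(n, 4))  # shared factor (synthetic, not real data)
    audio = latent @ rng.normal(size=(4, 12)) + 0.1 * rng.normal(size=(n, 12))
    video = latent @ rng.normal(size=(4, 10)) + 0.1 * rng.normal(size=(n, 10))
    mx, my, A, B = fit_linear_cca(audio[:n_train], video[:n_train], dim=4)
    za = (audio[n_train:] - mx) @ A  # held-out audio in the shared space
    zv = (video[n_train:] - my) @ B  # held-out video in the shared space
    # Video-to-audio retrieval: each video query should rank its own audio highly.
    return recall_at_k(retrieve(zv, za), k=5)

if __name__ == "__main__":
    print("video->audio recall@5:", demo())
```

Swapping the arguments of `retrieve` gives the audio-to-video direction. Recall at k is only one possible retrieval metric and is an assumption here, not necessarily the measure reported in the paper.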