Modeling Expression with Perceptual Audio Features to Enhance User Interaction

Luca Mion, Gianluca D'Incà, A. Götzen, Ennio Rapanà
DOI: https://doi.org/10.1162/comj.2010.34.1.65
IF: 0.688
2010-03-01
Computer Music Journal
Abstract: “Natural interfaces” represent one of the fast-moving investigation topics in the design of modern electronic appliances for both domestic and professional use. These interfaces stress the idea that the interaction should mimic everyday life in many different respects, from the input device used to the feedback received and to the embodiment of the interaction. Nonverbal communication plays an important role in our everyday life, and the auditory modality is used to comprehend many different kinds of information and shorter time-span states such as moods and emotions. In this work, a model for the expressive control of unstructured sounds is proposed. Starting from the investigation of simple musical gestures played with various instruments (repeated notes, scales, and short excerpts), a set of relevant audio features for expression description is selected by statistical analysis. Selected features are not related to musical scores or structures, thus yielding an ecological approach to the representation of expression communication. In particular, perceptual features like roughness and spectral centroid provide additional descriptors related to texture and brightness, as opposed to the timing/intensity-based parameters, which lead to typical music-oriented characterizations. Afterwards, the control parameters of an expressive synthesis model are tuned according to the results of analysis to add expressive content to simple synthetic sounds. Listening tests were conducted to validate the model, and the results confirm the impact that this model can have on affective communication in human–computer interaction (HCI).