Audio scene semantic similarity computing approach

Wei Wei, Ye Bin, Chen Bang-sheng
DOI: https://doi.org/10.1109/ICIME.2010.5477506
2010-01-01
Abstract: Audio in video carries abundant semantic information. An audio scene is a temporal audio segment that is represented by a few basic audio effects. The semantic similarity of a pair of audio scenes is very useful for high-level audio semantic understanding. This paper proposes an approach for computing audio scene semantic similarity. First, the audio track is pre-segmented into audio scenes. Then, the basic audio effects dominating each audio scene are recognized. Finally, the similarity of two audio scenes is calculated based on a model consistent with information-theoretic similarity principles and Tversky's set-theoretic similarity. Experimental results indicate that the approach can quantify the semantic similarity of two audio scenes.
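The final step can be illustrated with a minimal sketch of Tversky's set-theoretic ratio model, treating each audio scene as the set of basic audio effects recognized in it. This is not the paper's model, which the abstract does not specify and which also draws on information-theoretic principles. The function name `tversky_similarity`, the weights `alpha` and `beta`, and the effect labels are all hypothetical and chosen only for illustration.

```python
def tversky_similarity(effects_a, effects_b, alpha=0.5, beta=0.5):
    """Tversky ratio model: |A & B| / (|A & B| + alpha*|A - B| + beta*|B - A|).

    effects_a, effects_b: iterables of audio effect labels (assumed inputs).
    alpha, beta: non-negative weights for the features unique to each scene.
    """
    a, b = set(effects_a), set(effects_b)
    common = len(a & b)   # effects shared by both scenes
    only_a = len(a - b)   # effects found only in scene A
    only_b = len(b - a)   # effects found only in scene B
    denominator = common + alpha * only_a + beta * only_b
    if denominator == 0:
        # Both scenes are empty; returning 1.0 is a convention assumed here.
        return 1.0
    return common / denominator


# Hypothetical scenes described by their dominant audio effects.
scene_1 = {"speech", "music", "applause"}
scene_2 = {"speech", "music", "laughter"}

print(round(tversky_similarity(scene_1, scene_2), 3))        # → 0.667
print(tversky_similarity(scene_1, scene_2, alpha=1, beta=1))  # → 0.5
```

With `alpha = beta = 1` the measure reduces to the Jaccard index, and with `alpha = beta = 0.5` it reduces to the Dice coefficient. Unequal weights make the similarity asymmetric, which is the property that distinguishes Tversky's model from purely symmetric set measures.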