A Robust Audio Similarity Estimation Method for Audio Alignment in Mobile Karaoke Apps

Jinghui Mo, Yansong Feng, Dongyan Zhao
DOI: https://doi.org/10.1145/2578726.2578802
2014-01-01
Abstract: With smartphones becoming further integrated into our lives, more people sing with mobile karaoke apps instead of going to a KTV club. However, the playback and recording APIs of Android systems do not respond in real time when called. Thus, an Android karaoke app has to align the recorded audio with the original accompaniment when superimposing the two. Dynamic time warping (DTW) based algorithms are usually used to find the optimal alignment between two audio signals and have yielded the best results so far. In this paper, we propose a simple yet robust approach that considers waveform similarities to solve this problem. Experimental results show that our method outperforms the state-of-the-art method in both accuracy and robustness across different genres and devices.
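To make the alignment problem concrete, the following is a minimal sketch of estimating a constant delay between two signals by brute-force cross-correlation. It is not the method proposed in the paper, whose similarity measure the abstract does not specify. The function name `estimate_offset`, the `max_lag` parameter, and the toy signals are assumptions made purely for illustration.

```python
def estimate_offset(reference, recorded, max_lag):
    """Return the lag (in samples) at which `recorded` best matches `reference`.

    Illustrative only: plain cross-correlation over a bounded lag range.
    A positive lag means `recorded` is delayed relative to `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(reference):
            j = i + lag
            if 0 <= j < len(recorded):
                score += x * recorded[j]  # correlation over the overlapping part
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy data (hypothetical): the "recording" is the reference delayed by 3 samples.
reference = [0.0, 0.0, 1.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
delay = 3
recorded = [0.0] * delay + reference[:-delay]

offset = estimate_offset(reference, recorded, max_lag=5)
print(offset)
```

In a karaoke setting, an estimated lag like this would be used to shift the recording before mixing it with the accompaniment. Real recordings also contain the singer's voice and device noise, which is the robustness concern the abstract raises.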