Understanding Dynamic Auditory and Tactile Perception for Water Filling Level Estimation
Xinyuan Qian, Jingkai Xu, Yuxuan Gao, Minshu Li, Wanlin Li, Xu-Cheng Yin
DOI: https://doi.org/10.1007/s12369-024-01176-4
IF: 3.802
2024-10-16
International Journal of Social Robotics
Abstract: Accurate estimation of water filling levels is essential for various Human-Robot Interaction (HRI) applications, with widespread use in hospitals, restaurants, and smart home scenarios. Although previous work has focused mainly on vision-based solutions, their performance degrades under visual challenges such as occlusions and illumination changes. To address this problem, we investigate the potential of dynamic auditory perception for water Filling Level Estimation (FLE) in HRI. Specifically, we examine the acoustic features and develop a novel deep learning-based regression method. To evaluate FLE with heterogeneous modalities, we design, collect, and annotate a new dataset, namely Audio, Tactile and Visual perception for Filling Level Estimation (ATVfle). Extensive experiments demonstrate the viability of dynamic auditory perception as a reliable and complementary modality in FLE. Our proposed method is also more effective than other state-of-the-art methods. Furthermore, multimodal fusion yields better performance than a single modality. The datasets and code will be open-sourced for research (https://github.com/GLDKWY/Water-Filling-Level).
robotics