Text to Blind Motion

Hee Jae Kim, Kathakoli Sengupta, Masaki Kuribayashi, Hernisa Kacorri, Eshed Ohn-Bar
2024-12-07
Abstract: People who are blind perceive the world differently than those who are sighted, which can result in distinct motion characteristics. For instance, when crossing at an intersection, blind individuals may have different patterns of movement, such as veering more from a straight path or using touch-based exploration around curbs and obstacles. These behaviors may appear less predictable to motion models embedded in technologies such as autonomous vehicles. Yet, the ability of 3D motion models to capture such behavior has not been previously studied, as existing datasets for 3D human motion currently lack diversity and are biased toward people who are sighted. In this work, we introduce BlindWays, the first multimodal motion benchmark for pedestrians who are blind. We collect 3D motion data using wearable sensors with 11 blind participants navigating eight different routes in a real-world urban setting. Additionally, we provide rich textual descriptions that capture the distinctive movement characteristics of blind pedestrians and their interactions with both the navigation aid (e.g., a white cane or a guide dog) and the environment. We benchmark state-of-the-art 3D human prediction models, finding poor performance with off-the-shelf and pre-training-based methods for our novel task. To contribute toward safer and more reliable systems that can seamlessly reason over diverse human movements in their environments, our text-and-motion benchmark is available at https://blindways.github.io.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses a gap in existing 3D human motion datasets: they do not capture the natural movement behaviors of blind pedestrians. Most existing datasets focus on sighted people and lack descriptions of how people with disabilities, and blind people in particular, move. As a result, current 3D motion models perform poorly when predicting the future motion of blind pedestrians, especially in complex, safety-critical urban environments such as dynamic intersections, complex layouts, and dense social scenes. This gap not only limits technological progress but may also worsen existing accessibility problems. For example, self-driving vehicles may fail to accurately predict and safely respond to the movements of pedestrians with disabilities, putting their safety at risk. To address this challenge, the paper introduces BlindWays, a new multimodal motion benchmark that is the first to include 3D motion data of blind pedestrians navigating real-world urban environments. The benchmark contains motion data from 11 blind participants on 8 different routes, along with rich text descriptions that capture the distinctive movement characteristics of blind pedestrians and their interactions with navigation aids (such as white canes or guide dogs) and the environment. With it, the authors aim to fill this research gap and support the development of safer, more reliable systems that can reason over diverse human movement.