Improve accessibility for Low Vision and Blind people using Machine Learning and Computer Vision

Jasur Shukurov
2024-03-25
Abstract: As mobile technology spreads worldwide, the need to accommodate people with disabilities grows with it. This project explores how machine learning and computer vision can be used to improve accessibility for people with visual impairments. There have been many attempts to develop software that improves accessibility in the day-to-day lives of blind people. However, existing applications on the market have low accuracy and provide only audio feedback. This project concentrates on building a mobile application that helps blind people orient themselves in space by giving them real-time audio and haptic feedback (e.g., vibrations) about their surroundings. The mobile application will have three main features. The first is scanning text through the camera and reading it aloud to the user; it can be used on printed pages, on text in the environment, and on road signs. The second is detecting objects around the user and providing audio feedback about them, including a description of each object and its location, with haptic feedback when the user is too close to an object. The third is currency detection, which uses the camera to report the total value of the currency in view.
Human-Computer Interaction, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper aims to address the challenges that visually impaired individuals (including the blind) face in their daily lives, especially as the world becomes increasingly digital and technology-dependent and creates more obstacles for them. The author hopes to make daily life more convenient for this group by developing a mobile application that uses machine learning and computer vision. Specifically, the goal is to create an application with the following three main features:

1. **Text Scanning and Reading Aloud**: Users can scan text with their phone camera; the text is then converted into speech and played back to the user.
2. **Real-time Object Detection**: The application identifies objects around the user and provides audio feedback describing these objects and their location; it also provides tactile feedback when the user is too close to an object.
3. **Currency Recognition**: The application recognizes the type and total amount of currency through the camera and informs the user.

The project is based on interviews with the target users and an analysis of their needs. It adopts an agile development methodology and refines the application through multiple iterative cycles. The ultimate goal is an application that is highly accurate, user-friendly, and designed specifically for visually impaired individuals to improve their daily life experience.
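
To make the object-feedback and currency features described above more concrete, here is a minimal Python sketch of the decision logic only. It is not the project's implementation: the `Detection` fields, the one-meter `near_threshold_m` default, the three-way left/ahead/right split, and the function names are all assumptions introduced for illustration, and the sketch assumes that some upstream detector already supplies labels, positions, distance estimates, and note denominations.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical structure, not from the paper)."""
    label: str         # e.g. "chair"
    x_center: float    # horizontal center of the bounding box, in [0, 1]
    distance_m: float  # estimated distance to the object, in meters


def describe_location(x_center: float) -> str:
    """Map a normalized horizontal position to a coarse spoken direction."""
    if x_center < 1 / 3:
        return "on the left"
    if x_center > 2 / 3:
        return "on the right"
    return "ahead"


def feedback(det: Detection, near_threshold_m: float = 1.0) -> Tuple[str, bool]:
    """Return the phrase to speak and whether to trigger a vibration.

    The threshold is an assumed value; the paper does not state one.
    """
    message = f"{det.label} {describe_location(det.x_center)}"
    vibrate = det.distance_m < near_threshold_m
    return message, vibrate


def currency_total(denominations: List[int]) -> int:
    """Sum the face values of the recognized notes or coins."""
    return sum(denominations)
```

In a real application the returned phrase would be passed to a text-to-speech engine and the boolean to the platform's vibration API; both are left out here because the source does not name the platform or libraries used.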