Assisting Blind People Using Object Detection with Vocal Feedback

Heba Najm, Khirallah Elferjani, Alhaam Alariyibi
DOI: https://doi.org/10.1109/MI-STA54861.2022.9837737
2023-12-19
Abstract: For visually impaired people, it is highly difficult to move independently and safely in both indoor and outdoor environments. Furthermore, these physical and visual challenges hinder them in day-to-day activities. Similarly, they have trouble perceiving objects in the surrounding environment that may pose a risk to them. The proposed approach detects objects in real-time video from a web camera. For object identification, the You Only Look Once (YOLO) model is used, a CNN-based real-time object detection technique. Additionally, the OpenCV library for Python is used to implement the software and to run the deep learning process. Image recognition results are conveyed to visually impaired users in audible form by means of the Google text-to-speech library, together with each object's location relative to its position on the screen. The results were evaluated using mean Average Precision (mAP), and the proposed approach was found to achieve excellent results compared with previous approaches.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how computer vision can help visually impaired people move safely and independently, both indoors and outdoors. Specifically, it proposes a method based on real-time video object detection with voice feedback: the system identifies objects in the surrounding environment and tells the user by voice where they are, helping the user perceive obstacles and improving the safety and independence of daily activities. The paper notes that although the traditional white cane is low-cost and easy to use, it is not sufficient for visually impaired people to cope fully and independently with the challenges of daily life. The authors therefore propose a method that combines deep learning and computer vision: the YOLO (You Only Look Once) algorithm performs real-time object detection, and Google's text-to-speech library conveys the results as speech, stating each object's identity and its position on the screen (center, left, right, bottom, or top). This improves visually impaired people's awareness of the surrounding environment and enhances the safety of their movement.
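To make the position step concrete, here is a minimal Python sketch of how a detected bounding box might be mapped to one of the five spoken regions (center, left, right, top, bottom). It is an illustration under stated assumptions, not the authors' implementation: the function names `locate` and `describe`, the `(x, y, w, h)` pixel box format, the one-third margin, and the rule that horizontal position is checked before vertical are all hypothetical choices that the paper summary does not specify.

```python
def locate(box, frame_w, frame_h, margin=1 / 3):
    """Map a bounding box to a coarse screen region.

    box is (x, y, w, h) in pixels with (x, y) the top-left corner.
    The margin threshold is an assumed value, not taken from the paper.
    """
    x, y, w, h = box
    # Normalised centre of the box, each coordinate in [0, 1].
    cx = (x + w / 2) / frame_w
    cy = (y + h / 2) / frame_h

    # Assumed priority: horizontal position first, then vertical.
    if cx < margin:
        return "left"
    if cx > 1 - margin:
        return "right"
    if cy < margin:
        return "top"
    if cy > 1 - margin:
        return "bottom"
    return "center"


def describe(label, box, frame_w, frame_h):
    """Build the sentence that would be handed to a text-to-speech engine."""
    return f"{label}, {locate(box, frame_w, frame_h)}"


if __name__ == "__main__":
    # Hypothetical detections on a 640x480 frame.
    detections = [("chair", (0, 0, 100, 100)), ("person", (270, 190, 100, 100))]
    for label, box in detections:
        print(describe(label, box, 640, 480))
```

In a full pipeline, the string returned by `describe` would be passed to a speech library (the paper names Google's text-to-speech library), and the boxes would come from the YOLO detector rather than from hard-coded values.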