Abstract: Object detection, one of the most significant contributions of computer vision and machine learning, plays a major role in identifying and locating objects in an image or a video. Through object detection we can recognize distinct objects and obtain information about them, such as their size, shape, and location. This paper develops a low-cost assistive system for obstacle detection and depiction of the surrounding environment, using deep learning techniques to help blind people. The TensorFlow object detection API and SSDLite MobileNetV2 are used to create the proposed object detection model. The SSDLite MobileNetV2 model is pre-trained on the COCO dataset, which contains almost 328,000 images of 90 different objects. The gradient particle swarm optimization (PSO) technique is used in this work to optimize the final layers of the MobileNetV2 model and their corresponding hyperparameters. Next, we use the Google text-to-speech module, PyAudio, playsound, and speech recognition to generate audio feedback for the detected objects. A Raspberry Pi camera captures real-time video, and object detection is performed frame by frame on a Raspberry Pi 4B single-board computer. The proposed device is integrated into a head cap, which helps visually impaired people detect obstacles in their path more efficiently than a traditional white cane. Apart from this detection model, we trained a secondary computer vision model, named the "ambiance mode." In this mode, the last three convolutional layers of SSDLite MobileNetV2 are trained through transfer learning on a weather dataset. The dataset comprises around 500 images from four classes: cloudy, rainy, foggy, and sunrise. In this mode, the proposed system narrates the surrounding scene in detail, much as a human would describe a landscape or a beautiful sunset to a visually impaired person.
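As an illustration of the audio-feedback step, the following minimal Python sketch shows one way a frame's detections could be turned into a sentence before being passed to a text-to-speech module. The abstract does not describe the actual message format, so the function name `describe_detections`, the 0.5 confidence threshold, and the sentence wording are all assumptions made for this example.

```python
from collections import Counter


def describe_detections(detections, min_score=0.5):
    """Turn (label, score) pairs into one sentence for text-to-speech.

    Hypothetical helper: the message format and threshold are assumptions,
    not taken from the paper.
    """
    # Keep only detections at or above the confidence threshold.
    counts = Counter(label for label, score in detections if score >= min_score)
    if not counts:
        return "No obstacles detected."

    parts = []
    for label, n in counts.most_common():
        # Naive pluralisation; irregular nouns such as "person" need a lookup.
        parts.append(f"{n} {label}" + ("s" if n > 1 else ""))

    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"Detected {listing} ahead."


if __name__ == "__main__":
    # Made-up detections for a single frame, for illustration only.
    frame = [("chair", 0.91), ("chair", 0.78), ("dog", 0.66), ("cup", 0.20)]
    print(describe_detections(frame))  # Detected 2 chairs and 1 dog ahead.
```

On a device with the relevant packages installed, the returned string could then be spoken with, for example, `gTTS(text=sentence, lang="en").save("feedback.mp3")` followed by `playsound("feedback.mp3")`; the file name here is likewise only a placeholder.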
The performance of the object detection and ambiance description modes is tested and evaluated on a desktop computer and on the Raspberry Pi embedded system. Detection accuracy, mean average precision, frame rate, the confusion matrix, and the ROC curve are used to assess the model on both setups. This low-cost system is expected to help visually impaired people in their day-to-day lives.
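To make the evaluation measures concrete, the sketch below shows how overall accuracy and per-class precision and recall can be read off a confusion matrix for the four ambiance classes. The function name `per_class_metrics` is hypothetical, and the counts in the matrix are placeholders chosen for illustration; they are not results reported by the paper.

```python
def per_class_metrics(matrix, labels):
    """Compute overall accuracy plus per-class (precision, recall).

    matrix[i][j] = number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))

    report = {}
    for k, label in enumerate(labels):
        predicted_k = sum(row[k] for row in matrix)  # column sum
        actual_k = sum(matrix[k])                    # row sum
        precision = matrix[k][k] / predicted_k if predicted_k else 0.0
        recall = matrix[k][k] / actual_k if actual_k else 0.0
        report[label] = (precision, recall)
    return correct / total, report


if __name__ == "__main__":
    labels = ["cloudy", "rainy", "foggy", "sunrise"]
    # Placeholder counts for illustration only; NOT results from the paper.
    toy = [
        [8, 1, 1, 0],
        [1, 9, 0, 0],
        [2, 0, 8, 0],
        [0, 0, 0, 10],
    ]
    accuracy, report = per_class_metrics(toy, labels)
    print(f"accuracy = {accuracy:.3f}")
    for label, (p, r) in report.items():
        print(f"{label:8s} precision = {p:.3f} recall = {r:.3f}")
```

Mean average precision and ROC curves additionally require per-detection confidence scores, so they are not covered by this simplified sketch.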

Assisting Blind People Using Object Detection with Vocal Feedback

Object Detection with Voice Guidance to Assist Visually Impaired Using Yolov7

Scene Text Detection and Recognition System for Visually Impaired People in Real World

Object Detection Using Image Processing for Blind Person

Vision AI: A Deep Learning-Based Object Recognition System for Visually Impaired People Using TensorFlow and OpenCV

An Efficient and Real-Time Emergency Exit Detection Technology for the Visually Impaired People Based on YOLOv5

Object Recognition System for the Visually Impaired: A Deep Learning Approach using Arabic Annotation

Efficient Multi-Object Detection and Smart Navigation Using Artificial Intelligence for Visually Impaired People

Deep learning based object detection and surrounding environment description for visually impaired people

REAL TIME OBJECT DETECTION FOR VISUALLY CHALLENGED PEOPLE USING MACHINE LEARNING

Enhanced Yolov8 with OpenCV for Blind-Friendly Object Detection and Distance Estimation

A Lightweight Visual Understanding System for Enhanced Assistance to the Visually Impaired Using an Embedded Platform

Obstacle Detection System for Navigation Assistance of Visually Impaired People Based on Deep Learning Techniques

Enabling Social Interaction: A Face Recognition System for Visually Impaired People using OpenCV

Realtime Object Detection and Disease Prediction of Visually Impaired People

Object detection and recognition: using deep learning to assist the visually impaired

Blind Person Assistant: Object Detection

Machine learning and Sensor-Based Multi-Robot System with Voice Recognition for Assisting the Visually Impaired

Improve accessibility for Low Vision and Blind people using Machine Learning and Computer Vision

Image Recognition Using Text and Audio Translation for the Visually Challenged

Embedded Computer Vision for Object Recognition in Smart Devices for the Blind