OBJECT DETECTION AND TEXT TO SPEECH CONVERSION BASED ON YOLOV7 USING DEEP LEARNING

KOTA TEJASWINI, KONDA SAI CHAITHANYA, KETHAVATHU LAKSHMI BAI
DOI: https://doi.org/10.55041/ijsrem18807
2023-04-10
INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
Abstract: Object detection is a computer vision technique that locates objects in images or videos by drawing bounding boxes around them. In this paper, we propose a model that combines deep-learning-based object detection with text-to-speech conversion. The system uses a deep learning model, YOLO (You Only Look Once), to detect objects, and text-to-speech (TTS) to synthesize a voice announcement for each detected object. The system is built with Python and OpenCV, and Google Text-to-Speech (gTTS) is used to convert text into an audio segment. We first compare variants of the YOLO algorithm by training them on the COCO dataset, and then use the variant that gives the best results. After an object is detected, its name is displayed and the voice output is generated using the gTTS module. Our contribution is a visual substitution system that uses feature extraction and matching to recognize objects and provides voice feedback. Index Terms: Object Detection, YOLO, OpenCV, Python, Google Text-to-Speech
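The detect-then-announce flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `build_announcement` and `speak`, the `(label, confidence)` input format, and the 0.5 confidence threshold are all assumptions made for the example. The detector itself is omitted, and the sketch assumes its output has already been reduced to label and confidence pairs. Only the `gTTS(text=..., lang=...)` constructor and its `save` method come from the gTTS library, which needs to be installed separately and requires network access.

```python
from collections import Counter


def build_announcement(detections, min_confidence=0.5):
    """Turn (label, confidence) pairs into one sentence to be spoken.

    `detections` is a hypothetical stand-in for the detector's output;
    the threshold value is an assumption, not taken from the paper.
    """
    # Keep only detections at or above the confidence threshold.
    labels = [label for label, conf in detections if conf >= min_confidence]
    if not labels:
        return "No objects detected."
    # Counter keeps first-seen order, so the sentence follows detection order.
    counts = Counter(labels)
    parts = [f"{n} {label}" for label, n in counts.items()]
    return "Detected: " + ", ".join(parts) + "."


def speak(text, out_path="announcement.mp3"):
    """Synthesize `text` to an MP3 file with gTTS and return the file path."""
    # Imported lazily so the text-building step works without gTTS installed.
    from gtts import gTTS

    gTTS(text=text, lang="en").save(out_path)
    return out_path
```

For example, `build_announcement([("person", 0.91), ("dog", 0.78), ("person", 0.66), ("chair", 0.32)])` returns `"Detected: 2 person, 1 dog."`, which could then be passed to `speak` to produce the audio segment. Keeping the sentence construction separate from the synthesis call lets the text step be checked without audio hardware or a network connection.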