Real-Time Pill Identification for the Visually Impaired Using Deep Learning

Bo Dang, Wenchao Zhao, Yufeng Li, Danqing Ma, Qixuan Yu, Elly Yijun Zhu
2024-05-08
Abstract: The prevalence of mobile technology offers unique opportunities for addressing healthcare challenges, especially for individuals with visual impairments. This paper explores the development and implementation of a deep learning-based mobile application designed to assist blind and visually impaired individuals in real-time pill identification. Utilizing the YOLO framework, the application aims to accurately recognize and differentiate between various pill types through real-time image processing on mobile devices. The system incorporates Text-to-Speech (TTS) to provide immediate auditory feedback, enhancing usability and independence for visually impaired users. Our study evaluates the application's effectiveness in terms of detection accuracy and user experience, highlighting its potential to improve medication management and safety among the visually impaired community.
Keywords: Deep Learning; YOLO Framework; Mobile Application; Visual Impairment; Pill Identification; Healthcare
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses the difficulty visually impaired individuals face in managing medication, in particular identifying pills accurately. Traditional identification methods, such as visual inspection and reading labels, depend on sight, so they are hard for visually impaired people to carry out, and a misidentified pill is a health risk. Relying on other people to identify medication reduces independence and may raise privacy concerns. Some medication packaging carries Braille labels, but not all users can read Braille.

To address these issues, the paper proposes a deep learning-based mobile application that uses the YOLO framework to identify pills in real time. The application captures images of pills through the phone's camera and passes them to a pre-trained YOLOv8 model for pill detection and classification. To improve accuracy, the researchers fine-tuned the model on the Pillbox dataset and applied data augmentation so that it adapts to different lighting conditions and backgrounds. The application also integrates a text-to-speech (TTS) function, so users immediately receive auditory feedback about the detected pills.

With this approach, the researchers aim to reduce the risk of taking the wrong medication and to enhance the autonomy and safety of visually impaired individuals in medication management. The reported experimental results show high pill identification accuracy (mAP reaching 99.5%), which indicates the approach's potential in practical applications.
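
The hand-off from detection to auditory feedback described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Detection` type, the `build_announcement` helper, the 0.6 confidence threshold, and the wording of the spoken messages are all hypothetical, and the summary does not say how the application filters or phrases its results. The sketch assumes the detector has already produced a list of class labels with confidence scores, and it only builds the sentence that would be passed to a TTS engine.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Detection:
    """One detector result (hypothetical structure, not from the paper)."""
    label: str         # predicted pill class name
    confidence: float  # detector score in [0, 1]


def build_announcement(detections: List[Detection],
                       min_confidence: float = 0.6) -> str:
    """Compose one sentence for a TTS engine from detector output.

    The 0.6 threshold is an assumed value chosen for illustration.
    """
    # Drop low-confidence detections so uncertain guesses are not spoken.
    confident = [d for d in detections if d.confidence >= min_confidence]
    if not confident:
        return "No pill recognized. Please adjust the camera and try again."

    # Count pills per label, visiting the most confident detections first
    # so the most reliable label is announced first.
    counts: Dict[str, int] = {}
    for d in sorted(confident, key=lambda d: d.confidence, reverse=True):
        counts[d.label] = counts.get(d.label, 0) + 1

    parts = [f"{label}, count {n}" for label, n in counts.items()]
    return "Detected: " + "; ".join(parts) + "."


if __name__ == "__main__":
    # Example labels are placeholders, not classes from the Pillbox dataset.
    sample = [
        Detection("aspirin", 0.95),
        Detection("ibuprofen", 0.40),
        Detection("aspirin", 0.80),
    ]
    print(build_announcement(sample))
```

Filtering before speaking is a design choice made for this sketch: in a medication setting, a spoken wrong label is arguably worse than asking the user to retake the picture, so uncertain detections are suppressed instead of announced.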