SeeSay: An Assistive Device for the Visually Impaired Using Retrieval Augmented Generation

Melody Yu
2024-10-03
Abstract: In this paper, we present SeeSay, an assistive device designed for individuals with visual impairments. This system leverages large language models (LLMs) for speech recognition and visual querying. It effectively identifies, records, and responds to the user's environment by providing audio guidance using retrieval-augmented generation (RAG). Our experiments demonstrate the system's capability to recognize its surroundings and respond to queries with audio feedback in diverse settings. We hope that the SeeSay system will facilitate users' comprehension and recollection of their surroundings, thereby enhancing their environmental perception, improving navigational capabilities, and boosting overall independence.
Human-Computer Interaction, Social and Information Networks