Helios: An extremely low power event-based gesture recognition for always-on smart eyewear

Prarthana Bhattacharyya, Joshua Mitton, Ryan Page, Owen Morgan, Ben Menzies, Gabriel Homewood, Kemi Jacobs, Paolo Baesso, David Trickett, Chris Mair, Taru Muhonen, Rory Clark, Louis Berridge, Richard Vigars, Iain Wallace
2024-08-26
Abstract: This paper introduces Helios, the first extremely low-power, real-time, event-based hand gesture recognition system designed for all-day use on smart eyewear. As augmented reality (AR) evolves, current smart glasses like the Meta Ray-Bans prioritize visual and wearable comfort at the expense of functionality. Existing human-machine interfaces (HMIs) in these devices, such as capacitive touch and voice controls, present limitations in ergonomics, privacy and power consumption. Helios addresses these challenges by leveraging natural hand interactions for a more intuitive and comfortable user experience. Our system utilizes an extremely low-power and compact 3mmx4mm/20mW event camera to perform natural hand-based gesture recognition for always-on smart eyewear. The camera's output is processed by a convolutional neural network (CNN) running on an NXP Nano UltraLite compute platform, consuming less than 350mW. Helios can recognize seven classes of gestures, including subtle microgestures like swipes and pinches, with 91% accuracy. We also demonstrate real-time performance across 20 users at a remarkably low latency of 60ms. Our user testing results align with the positive feedback we received during our recent successful demo at AWE-USA-2024.
Computer Vision and Pattern Recognition, Human-Computer Interaction, Machine Learning
What problem does this paper attempt to address?
### What problems does this paper attempt to solve?

This paper aims to address the limitations of current smart glasses in functionality and user experience. Specifically, the authors propose Helios, an extremely low-power, event-based gesture recognition system designed for all-day use on smart glasses. The main problems the paper attempts to solve are:

1. **Limitations of existing HMIs (human-machine interfaces)**: Most existing smart glasses rely on capacitive touch and voice control, each of which has drawbacks:
   - **Capacitive touch**: It requires direct contact with the device surface, which may cause discomfort during long-term use, and it is unresponsive under certain conditions (such as when the user is wearing gloves or the screen is wet).
   - **Voice control**: It causes social discomfort, especially when wake words must be spoken in public places, and it is usually turned off to save power.
2. **Balance between functionality and comfort**: Current smart glasses (such as the Meta Ray-Bans) prioritize appearance and social acceptance at the expense of functionality and computing power. As a result, they cannot provide sufficient interaction in practical applications, especially for all-day use.
3. **Challenges in gesture recognition**:
   - **Microgesture recognition**: Recognizing subtle gestures (such as swipes and pinches) is challenging because these movements are small and easily confused with jitter or noise in hand tracking.
   - **Real-time operation and low latency**: For a smooth interaction experience, the gesture recognition system must have low latency, usually completing recognition within 100 milliseconds.
4. **Power consumption**: Existing gesture recognition systems usually consume too much power to meet the battery life requirements of all-day smart glasses. Traditional frame-based cameras also increase power consumption when dealing with motion artifacts and complex lighting conditions.

### Solutions of the Helios system

The Helios system addresses the above problems in the following ways:

- **Event-based camera**: It uses a 3mm × 4mm, 20mW event camera. This camera records only luminance changes, generating a sparse and efficient data stream and thereby reducing power consumption and latency.
- **Convolutional neural network (CNN)**: A CNN running on the NXP Nano UltraLite platform performs gesture recognition. The compute platform consumes less than 350mW, and the system recognizes seven classes of gestures with 91% accuracy at a latency of only 60ms.
- **Natural hand interaction**: Natural gestures (such as swipes and pinches) enable a more intuitive and comfortable user experience, avoiding the limitations of capacitive touch and voice control.
- **Robustness**: The Helios system can distinguish the user's real gestures from noise caused by body movement, avoiding false triggers during daily activities.

In summary, the Helios system achieves extremely low-power, high-accuracy, and low-latency gesture recognition through its hardware design and efficient machine-learning algorithms, significantly improving the functionality and user experience of smart glasses.
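
To make the event-camera point above concrete, the following is a minimal sketch of an idealized event-generation model: a pixel emits an event only when its log-intensity has changed by more than a contrast threshold since its last event. This is an illustration, not the Helios pipeline or the sensor's actual behavior. The function names `frames_to_events` and `make_frame`, the threshold of 0.2, and the toy 4×4 scene are all assumptions made for this example. Real event sensors also operate asynchronously per pixel rather than on discrete frames.

```python
import math


def frames_to_events(frames, threshold=0.2, eps=1e-3):
    """Convert a list of 2D intensity frames into a sparse event list.

    Idealized model (an assumption, not the real sensor): a pixel emits at
    most one event per frame, when its log-intensity differs from the value
    stored at its last event by at least `threshold`.
    Returns a list of (frame_index, row, col, polarity) tuples.
    """
    # Per-pixel reference log-intensity, initialised from the first frame.
    ref = [[math.log(v + eps) for v in row] for row in frames[0]]
    events = []
    for t, frame in enumerate(frames[1:], start=1):
        for r, row in enumerate(frame):
            for c, v in enumerate(row):
                log_v = math.log(v + eps)
                diff = log_v - ref[r][c]
                if abs(diff) >= threshold:
                    # Polarity +1 for brightening, -1 for darkening.
                    events.append((t, r, c, 1 if diff > 0 else -1))
                    ref[r][c] = log_v
    return events


def make_frame(size, bright_col):
    """Hypothetical toy scene: uniform background, one bright pixel in row 1."""
    frame = [[0.5] * size for _ in range(size)]
    frame[1][bright_col] = 1.0
    return frame


# A bright "fingertip" pixel moves one column to the right per frame.
frames = [make_frame(4, col) for col in range(3)]
events = frames_to_events(frames)

pixels_read_by_frame_camera = (len(frames) - 1) * 4 * 4
print(f"{len(events)} events vs {pixels_read_by_frame_camera} frame pixels")
```

In this toy scene only the pixels the bright spot leaves or enters produce events, while the static background produces none. That is the sparsity property the summary credits for the reduced power consumption and latency.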