Real-time eye state recognition using dual convolutional neural network ensemble

Sumeet Saurav, Prashant Gidde, Ravi Saini, Sanjay Singh
DOI: https://doi.org/10.1007/s11554-022-01211-5
2022-03-16
Abstract: Automatic recognition of eye states is essential for diverse computer vision applications such as drowsiness detection, facial emotion recognition (FER), and human–computer interaction (HCI). Existing solutions for eye state detection are either parameter intensive or suffer from a low recognition rate. To tackle these issues, this paper presents the design and implementation of a vision-based system for real-time eye state recognition on a resource-constrained embedded platform. The designed system uses an ensemble of two lightweight convolutional neural networks (CNNs), each trained to extract relevant information from the eye patches. We adopted transfer-learning-based fine-tuning to overcome over-fitting when training the CNNs on small-sample eye state datasets. Once trained, these CNNs are integrated and jointly fine-tuned to achieve enhanced performance. Experimental results demonstrate that the proposed eye state recognizer is robust and computationally efficient. On the ZJU dataset, the proposed dual CNN ensemble (DCNNE) model delivered state-of-the-art recognition accuracy of 97.99%, surpassing the prior best of 97.20% by 0.79 percentage points. The designed model also achieved competitive results on the CEW and MRL datasets. Finally, the designed CNNs are optimized and ported to two different embedded platforms for real-world applications with real-time performance. The complete system runs at 62 frames per second (FPS) on an Nvidia Xavier device and 11 FPS on a low-cost Intel NCS2 embedded platform at a frame resolution of 640 × 480 pixels.
computer science, artificial intelligence; engineering, electrical & electronic; imaging science & photographic technology
What problem does this paper attempt to address?