Convolutional neural network with spatial pyramid pooling for hand gesture recognition

Yong Soon Tan, Kian Ming Lim, Connie Tee, Chin Poo Lee, Cheng Yaw Low
DOI: https://doi.org/10.1007/s00521-020-05337-0
2020-09-15
Neural Computing and Applications
Abstract: Hand gestures provide a means for humans to interact. While hand gestures play a significant role in human–computer interaction, they also break down the communication barrier and simplify communication between the general public and the hearing-impaired community. This paper presents a convolutional neural network (CNN) integrated with spatial pyramid pooling (SPP), dubbed CNN–SPP, for vision-based hand gesture recognition. SPP mitigates the problem found in conventional pooling by stacking multi-level pooling to extend the features fed into a fully connected layer. Given inputs of varying sizes, SPP also yields a fixed-length feature representation. Extensive experiments were conducted to evaluate CNN–SPP performance on two well-known American Sign Language (ASL) datasets and one NUS hand gesture dataset. Our empirical results show that CNN–SPP outperforms other deep learning-based methods.
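The fixed-length property described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name spp_max, the pyramid levels (1, 2, 4), max pooling, and the single-channel list-of-lists feature map are all assumptions chosen for illustration. Each level splits the map into n x n bins whose edges scale with the input size, so the number of pooled values depends only on the levels, not on the input dimensions.

```python
import math


def spp_max(feature_map, levels=(1, 2, 4)):
    """Max-pool one 2-D feature map into n x n bins per pyramid level.

    Hypothetical helper: the levels and the use of max pooling are
    illustrative assumptions, not values taken from the paper.
    """
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin edges scale with the input height, so the bin count is fixed.
            r0, r1 = (i * h) // n, math.ceil((i + 1) * h / n)
            for j in range(n):
                c0, c1 = (j * w) // n, math.ceil((j + 1) * w / n)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


# Two feature maps of different sizes (values are arbitrary).
small = [[r * 4 + c for c in range(4)] for r in range(4)]   # 4 x 4
large = [[r * 9 + c for c in range(9)] for r in range(6)]   # 6 x 9

print(len(spp_max(small)), len(spp_max(large)))
```

Under these assumptions both calls return 1 + 4 + 16 = 21 values, which is what lets a fully connected layer of fixed width follow the pooling stage regardless of input size. In a multi-channel network the same pooling would be applied per channel and the results concatenated.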
computer science, artificial intelligence