Deep Learning‐Enabled MXene/PEDOT:PSS Acoustic Sensor for Speech Recognition and Skin‐Vibration Detection

Huijun Ding, Zhenping Zeng, Ziwei Wang, Xiaolin Li, Tanju Yildirim, Qinlin Xie, Han Zhang, Swelm Wageh, Ahmed A. Al-Ghamdi, Xi Zhang, Bo Wen
DOI: https://doi.org/10.1002/aisy.202200140
IF: 7.298
2022-10-07
Advanced Intelligent Systems
Abstract: Flexible acoustic sensors with high sensitivity, excellent mechanical strength, and easy integration are urgently needed for wearable electronics. MXene holds great promise as a sensing material for this application. However, low flexibility and stability limit the performance of MXene‐based composites. To address these issues, a flexible pressure sensor based on MXene/poly(3,4‐ethylenedioxythiophene)‐poly(styrenesulfonate) (PEDOT:PSS) is fabricated and used as an acoustic sensor exhibiting high sensitivity, fast response time (57 ms), ultra‐thin thickness (30 μm), and remarkable stability. This performance enables the sensor to detect and identify weak muscle movements and skin vibrations, such as those from word pronunciation and the carotid artery pulse. Furthermore, by combining the sensor with the proposed deep learning model based on a number recognition convolutional neural network (NR‐CNN), speech recognition of different pronunciations of numbers that appear frequently in daily conversations can be realized. High recognition accuracy (91%) is achieved by training and testing the proposed NR‐CNN with large amounts of data recorded by the sensor. The results demonstrate that the flexible and wearable MXene/PEDOT:PSS acoustic sensor advances intelligent artificial acoustics and has great potential for applications involving speech recognition and health monitoring.