Advantages and Pitfalls of Dataset Condensation: An Approach to Keyword Spotting with Time-Frequency Representations

Pedro Henrique Pereira, Wesley Beccaro, Miguel Arjona Ramírez
DOI: https://doi.org/10.3390/electronics13112097
IF: 2.9
2024-05-29
Electronics
Abstract: With the exponential growth of data, efficient techniques for extracting relevant information from datasets are increasingly necessary. Reducing the training data can be useful for applications in which storage space or computational resources are limited. In this work, we explore the concept of data condensation (DC) in the context of keyword spotting (KWS) systems. Using deep learning architectures and time-frequency speech representations, we obtain condensed speech signal representations using gradient matching with Efficient Synthetic-Data Parameterization. Through a series of classification experiments, we analyze the performance of the models and the condensed data in terms of accuracy and number of samples per class. We also present results using cross-model techniques, wherein models are trained with condensed data obtained from a different architecture. Our findings demonstrate the potential of data condensation in the speech domain for reducing the size of datasets while retaining their most important information and maintaining high accuracy for the model trained with the condensed dataset. We obtain an accuracy of 80.75% with 30 condensed speech representations per class with ConvNet, an absolute gain of 24.9 percentage points over 30 random samples per class from the original training dataset. However, the cross-model tests expose the limitations of this approach. We also highlight the challenges and opportunities for further improving the accuracy of condensed data obtained and trained with different neural network architectures.
engineering, electrical & electronic; physics, applied; computer science, information systems
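
The gradient-matching idea named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: a linear softmax classifier stands in for the ConvNet, the layer-wise cosine distance is an assumption (a common choice in gradient-matching condensation, not confirmed by the abstract), and all names and toy numbers below are hypothetical. The sketch only computes the matching distance between gradients from a real batch and a synthetic batch; a condensation loop would then update the synthetic samples to reduce that distance.

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def ce_gradient(W, X, y):
    """Gradient of the mean softmax cross-entropy w.r.t. W (classes x features)."""
    n_classes, n_feat = len(W), len(W[0])
    grad = [[0.0] * n_feat for _ in range(n_classes)]
    for x, label in zip(X, y):
        logits = [sum(w * v for w, v in zip(row, x)) for row in W]
        p = softmax(logits)
        for c in range(n_classes):
            err = p[c] - (1.0 if c == label else 0.0)
            for d in range(n_feat):
                grad[c][d] += err * x[d] / len(X)
    return grad

def matching_distance(g_a, g_b, eps=1e-12):
    """Sum over output rows of (1 - cosine similarity); assumed distance."""
    total = 0.0
    for a, b in zip(g_a, g_b):
        dot = sum(u * v for u, v in zip(a, b))
        na = math.sqrt(sum(u * u for u in a))
        nb = math.sqrt(sum(v * v for v in b))
        total += 1.0 - dot / (na * nb + eps)
    return total

# Hypothetical toy setup: 2 classes, 2 features.
W = [[0.1, -0.2], [0.0, 0.3]]
X_real = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
y_real = [0, 1, 0, 1]
X_syn = [[0.8, 0.1], [0.1, 0.9]]  # one synthetic sample per class
y_syn = [0, 1]

g_real = ce_gradient(W, X_real, y_real)
g_syn = ce_gradient(W, X_syn, y_syn)
print("matching distance:", matching_distance(g_real, g_syn))
```

A distance near zero means the synthetic batch pushes the model weights in nearly the same direction as the real batch, which is the property the condensed set is optimized for.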