Speech-based Age and Gender Prediction with Transformers

Felix Burkhardt, Johannes Wagner, Hagen Wierstorf, Florian Eyben, Björn Schuller

2023-06-29

Abstract: We report on the curation of several publicly available datasets for age and gender prediction. Furthermore, we present experiments to predict age and gender with models based on a pre-trained wav2vec 2.0. Depending on the dataset, we achieve an MAE between 7.1 years and 10.8 years for age, and at least 91.1% ACC for gender (female, male, child). Compared to a modelling approach built on handcrafted features, our proposed system shows an improvement of 9% UAR for age and 4% UAR for gender. To make our findings reproducible, we release the best performing model to the community as well as the sample lists of the data splits.
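The three metrics named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' evaluation code: the function names are hypothetical, and the toy labels in the usage note are invented. MAE is the mean absolute difference between true and predicted ages in years, ACC is the fraction of correct labels, and UAR is the mean of the per-class recalls, so every class counts equally regardless of its size. UAR needs class labels, so reporting it for age presupposes some grouping of ages into classes, which is not specified here.

```python
from collections import defaultdict


def mean_absolute_error(y_true, y_pred):
    """Mean absolute difference between true and predicted values (e.g. age in years)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class present in y_true counts equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)
```

On an imbalanced toy set such as true labels `["female", "female", "female", "male", "child"]` and predictions `["female", "female", "female", "child", "child"]`, accuracy is 0.8 while UAR is only 2/3, because the single missed `male` sample drags that class's recall to zero.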

Sound, Audio and Speech Processing

What problem does this paper attempt to address?

### Problems Addressed by the Paper

This paper aims to predict the age and gender of speakers using a transformer architecture based on the pre-trained wav2vec 2.0 model. Specifically, the paper focuses on the following aspects:

1. **Dataset Compilation**: The paper compiles several publicly available datasets for the tasks of age and gender prediction. These datasets include SpeechDat II, CommonVoice, aGender, TIMIT, and VoxCeleb2.
2. **Model Performance Evaluation**: The paper evaluates the model's performance on different datasets, including single-task models (predicting only age or gender) and multi-task models (predicting both age and gender simultaneously).
3. **Cross-Dataset Generalization**: The paper explores the model's generalization ability across different datasets, i.e., how a model trained on one dataset performs on other unseen datasets.
4. **Impact of Model Layers**: The paper studies the impact of the number of transformer layers on model performance to find the optimal balance between accuracy and speed.
5. **Comparison with Traditional Methods**: The paper compares deep learning-based methods with traditional hand-crafted feature-based methods, demonstrating the advantages of deep learning approaches.
6. **Emotion Data Prediction**: The paper also tests the model's performance on emotional speech data, exploring the impact of emotional expression on prediction accuracy.

### Main Contributions

1. **Proposed a New System**: A fine-tuned transformer model to estimate age and gender.
2. **Provided Curated Sample Sets**: Including lists of samples for training, development, and testing, and made them publicly available to the research community.
3. **Compared Single-Task and Multi-Task Models**: Evaluated the performance of single-task and multi-task models.
4. **Reported Cross-Dataset Results**: Examined the model's generalization ability.
5. **Studied the Impact of Transformer Layers**: Determined the number of layers needed to achieve the best balance between accuracy and speed.
6. **Released the Best Performing Model**: Published the best-performing model for public use.

Through these studies, the paper aims to advance the technology in the field of age and gender prediction and provide benchmarks and references for subsequent research.
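The multi-task setup mentioned above (one model predicting both age and gender) can be pictured as a shared encoder feeding two small output heads. The sketch below is only an illustration under stated assumptions, not the paper's architecture: it assumes frame-level embeddings from some encoder are mean-pooled over time, then passed to one linear head that regresses age and one that scores the three gender classes (female, male, child). The helper names, the pooling choice, and the toy weights in the usage note are all hypothetical.

```python
import math

GENDER_CLASSES = ("female", "male", "child")


def mean_pool(frames):
    """Average frame-level embeddings (equal-length vectors) over time."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(frames, age_w, age_b, gender_w, gender_b):
    """Shared pooled embedding feeds both heads (assumed design, for illustration)."""
    pooled = mean_pool(frames)
    age = linear(pooled, [age_w], [age_b])[0]           # regression head
    probs = softmax(linear(pooled, gender_w, gender_b))  # classification head
    return age, dict(zip(GENDER_CLASSES, probs))
```

For example, `predict([[1.0, 0.0], [3.0, 2.0]], [10.0, 5.0], 1.0, [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0])` pools the two frames to `[2.0, 1.0]` and returns an age of 26.0 together with a probability for each gender class. A single-task model would keep only one of the two heads.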

Gender prediction using limited Twitter Data

Textual Pre-Trained Models for Age Screening Across Community Question-Answering

Dawn of the transformer era in speech emotion recognition: closing the valence gap

SEGAA: A Unified Approach to Predicting Age, Gender, and Emotion in Speech

Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Multi-Attention Module through Speech Spectrograms

On the Limitations of Sociodemographic Adaptation with Transformers

Age Recommendation from Texts and Sentences for Children

A Deep Learning Approach to Language-independent Gender Prediction on Twitter

Effective training of convolutional neural networks for face-based gender and age prediction

Transfer Learning with Deep CNNs for Gender Recognition and Age Estimation

ABI Neural Ensemble Model for Gender Prediction Adapt Bar-Ilan Submission for the CLIN29 Shared Task on Gender Prediction

CNN Based Features Extraction for Age Estimation and Gender Classification

A Word Embeddings based Approach for Author Profiling: Gender and Age Prediction

Twitter-Based Gender Recognition Using Transformers

Author Identity Unveiled: Gender and Age Prediction from Textual Patterns using BERT

Age and Gender Prediction using Deep CNNs and Transfer Learning

Enhance Gender and Identity Preservation in Face Aging Simulation for Infants and Toddlers

Age and Gender Prediction Using Convolutional Neural Network

Can Demographic Factors Improve Text Classification? Revisiting Demographic Adaptation in the Age of Transformers

A Hybrid Transformer-Sequencer approach for Age and Gender classification from in-wild facial images