Abstract: Audio-based kinship verification (AKV) is important in many domains, such as home security monitoring, forensic identification, and social network analysis. A key challenge in the task arises from differences in age across samples from different individuals, which can be interpreted as a domain bias in a cross-domain verification task. To address this issue, we introduce the notion of an "age-standardised domain", in which we use an optimised CycleGAN-VC3 network to perform age-audio conversion and generate in-domain audio. A range of features is extracted from the generated audio dataset and fed into a metric learning architecture to verify kinship. Experiments are conducted on the KAN_AV audio dataset, which contains age and kinship labels. The results demonstrate that the method markedly enhances the accuracy of kinship verification, while also offering novel insights for future kinship verification research.

What problem does this paper attempt to address?

This paper addresses the age-domain shift problem in audio-based kinship verification (AKV). Age differences between samples from different individuals can be treated as a domain shift in a cross-domain verification task, and this shift reduces the accuracy of kinship verification.

### Main problems

1. **Domain shift caused by age differences**: In audio-based kinship verification, age differences between individuals change their voice features, which makes it harder for a model to identify kinship correctly.
2. **Deficiencies of existing methods**: Most existing research uses facial images and videos for kinship verification. Audio-based kinship verification has not been fully explored, even though audio data has unique advantages in some scenarios (such as telephone calls).

### Solutions

To solve these problems, the authors propose the following methods:

- **Age-standardised domain**: An optimised CycleGAN-VC3 network converts audio from speakers of different ages to a single intermediate age group, reducing the impact of age differences on voice features.
- **Feature extraction and metric learning**: Multiple features are extracted from the generated audio dataset and passed to a metric-learning architecture for kinship verification.

### Experimental results

The experimental results show that this method significantly improves the accuracy of kinship verification, especially for cross-age-group relationships (such as father-daughter and mother-daughter). Specifically, using the optimised TripletNet model and Wav2Vec features, the overall weighted accuracy on the generated dataset reaches 71.3%, which is about 5% higher than the baseline method.

### Summary

This paper alleviates the age-domain shift problem in audio-based kinship verification by introducing age-conversion techniques, improving the generalisation ability and verification accuracy of the model. Future work can consider other factors, such as gender conversion, to better handle complex kinship verification tasks.
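
The metric-learning step described above can be illustrated with a minimal triplet-loss sketch. This is not the paper's TripletNet: the cosine distance, the margin of 0.2, the decision threshold of 0.5, and the function names `cosine_distance`, `triplet_loss`, and `is_kin` are all assumptions made for illustration, and the toy vectors stand in for features (for example Wav2Vec embeddings) extracted from age-converted audio.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (margin value is an assumption).

    The loss is zero once the kin pair (anchor, positive) is closer
    than the non-kin pair (anchor, negative) by at least `margin`.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

def is_kin(emb_a, emb_b, threshold=0.5):
    """Verify kinship by thresholding the distance (threshold is hypothetical)."""
    return cosine_distance(emb_a, emb_b) < threshold

# Toy embeddings: the parent and child point in similar directions.
parent = [1.0, 0.0]
child = [0.9, 0.1]
stranger = [0.0, 1.0]

print(triplet_loss(parent, child, stranger))
print(is_kin(parent, child), is_kin(parent, stranger))
```

In a trained system the embeddings would come from a network optimised to drive this loss towards zero over many (anchor, kin, non-kin) triplets; the threshold would then be tuned on held-out pairs.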

Audio-based Kinship Verification Using Age Domain Conversion
