GMK Net: Generative Data-Guided Multiple Kernel Network for Multimodal Finger Recognition

Yiwei Huang, Hui Ma, Mingyang Wang
DOI: https://doi.org/10.1016/j.eswa.2024.125953
IF: 8.5
2024-01-01
Expert Systems with Applications
Abstract: Multimodal finger recognition technology that utilizes multiple sources of information, such as fingerprints and finger veins, to determine an individual’s identity has gained attention due to its potential to address the limitations of unimodal forms in terms of security and accuracy. However, existing multi-biometric recognition methods require large-scale data with a consistent number of samples across modalities for network training, which reduces the flexibility and accuracy of recognition. To address this issue from multiple aspects, we propose a Generative data-guided Multiple Kernel Network (GMK Net) for multimodal finger recognition that boosts the aggregation of multiple self-supervised subspaces with generated data. Specifically, on the one hand, to handle the unbalanced distribution of multimodal finger data, a Finger Vein Category Diffusion Model (FV-CDM) is proposed, which generates deep semantic information through a dual-channel network architecture. It is subsequently optimized using the category penalty generated by the auxiliary branch together with the structural penalty, thereby reducing the impact of irrelevant background noise in the finger vein images on noise estimation. On the other hand, to circumvent the ambiguity stemming from self-supervised tasks designed with prior knowledge, a Multiple Kernel Encoder (MKE) is proposed, which adaptively aggregates multiple subspaces with different mapping capabilities to fuse features. Considering the complexity and differences in the image distribution of multimodal finger data, multiple self-supervised and supervised losses are combined to prevent degradation in recognition performance. Experimental results validate the effectiveness of GMK Net, demonstrating its capacity to generate high-fidelity finger vein images to address the problem of data imbalance and to enhance multimodal recognition accuracy without prior knowledge.
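The abstract describes the Multiple Kernel Encoder only at a high level, so the following is a minimal, generic sketch of the underlying idea of weighting several kernel-induced subspaces and summing them. It is not the authors' MKE: the choice of linear, RBF, and polynomial kernels, the softmax weighting, and every function and parameter name here are assumptions made for illustration. In a trained network the weights would presumably be learned; here fixed logits stand in for them.

```python
import numpy as np

def linear_kernel(x, y):
    # Plain inner-product similarity.
    return x @ y.T

def rbf_kernel(x, y, gamma=0.5):
    # Squared Euclidean distances, clipped at zero for numerical safety.
    sq = (x ** 2).sum(1)[:, None] + (y ** 2).sum(1)[None, :] - 2 * x @ y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def poly_kernel(x, y, degree=2, c=1.0):
    return (x @ y.T + c) ** degree

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def aggregate_kernels(x, logits):
    """Combine several kernel matrices with softmax-normalized weights.

    `logits` is a hypothetical stand-in for learnable mixing parameters.
    """
    kernels = np.stack([
        linear_kernel(x, x),
        rbf_kernel(x, x),
        poly_kernel(x, x),
    ])                                    # shape: (num_kernels, n, n)
    weights = softmax(np.asarray(logits, dtype=float))
    combined = np.tensordot(weights, kernels, axes=1)  # weighted sum over kernels
    return combined, weights

# Toy usage: 4 samples with 8-dimensional (random) features.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))
combined, weights = aggregate_kernels(features, logits=[0.0, 1.0, -1.0])
```

Because the weights are non-negative and each component is a valid kernel, the weighted sum is again a valid kernel, which is the property that makes this kind of adaptive aggregation well defined.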