Abstract: Hypergraph learning is an active research topic in machine learning. The performance of a hypergraph learning model depends on the quality of the hypergraph structure, and of its incidence matrix, built from a given feature extraction method. However, existing models build the hypergraph structure from a single feature extraction method, which limits their feature extraction and abstract representation ability. This paper proposes a multimodal feature fusion method that first builds a single-modal hypergraph structure for each feature extraction method and then extends the hypergraph incidence matrices and weight matrices across the modalities. The extended matrices fuse multimodal abstract features and expand the range of the Markov random walk during model learning, giving stronger feature representation ability. However, the extended multimodal incidence matrix is large and computationally expensive. A Laplacian matrix fusion method is therefore proposed: the incidence matrix and weight matrix of each modality are transformed into a Laplacian matrix, and these Laplacian matrices are then combined by weighted superposition for subsequent model training. Tests on four different types of datasets indicate that the hypergraph learning model obtained after multimodal feature fusion achieves better classification performance than the single-modal models. With Laplacian matrix fusion, the average time is reduced by about 40% compared with the extended incidence matrix, classification performance improves further, and the F1 score improves by 8.4%.
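
For reference, the two fusion strategies in the abstract can be written out as follows. This is a sketch under stated assumptions rather than the paper's own formulation: it assumes the normalized hypergraph Laplacian of Zhou, Huang, and Schölkopf (2006), since the abstract does not say which Laplacian is used, and the symbols $H_m$, $W_m$, $D_{v,m}$, $D_{e,m}$, and $\alpha_m$ are notation introduced here for $M$ modalities over the same $n$ vertices.

```latex
% Extended incidence and weight matrices: concatenate the hyperedges of all modalities
H_{\mathrm{ext}} = \begin{bmatrix} H_1 & H_2 & \cdots & H_M \end{bmatrix}
  \in \mathbb{R}^{\,n \times \sum_{m} |E_m|},
\qquad
W_{\mathrm{ext}} = \operatorname{diag}(W_1, W_2, \dots, W_M)

% Per-modality normalized hypergraph Laplacian (assumed form)
\Delta_m = I - D_{v,m}^{-1/2}\, H_m W_m D_{e,m}^{-1} H_m^{\top}\, D_{v,m}^{-1/2}

% Laplacian matrix fusion: weighted superposition with assumed weights \alpha_m \ge 0
\Delta_{\mathrm{fused}} = \sum_{m=1}^{M} \alpha_m\, \Delta_m
```

Under these assumptions, every $\Delta_m$ is $n \times n$ regardless of how many hyperedges a modality contributes, whereas $H_{\mathrm{ext}}$ grows with the total number of hyperedges. That is one plausible reading of why the abstract reports a lower cost for Laplacian fusion than for the extended incidence matrix.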

What problem does this paper attempt to address?

This paper aims to improve the performance of hypergraph learning models through multimodal feature fusion. Most existing hypergraph learning models construct the hypergraph structure from a single feature extraction method, which limits their feature extraction and abstract representation ability. The paper proposes a multimodal feature fusion method that extends the hypergraph incidence matrices and weight matrices of the different modalities. Doing so fuses multimodal abstract features and expands the Markov random walk range during model learning, which strengthens feature representation. Because the extended multimodal incidence matrix is high-dimensional and computationally expensive, the paper also proposes a Laplacian matrix fusion method: the incidence matrix and weight matrix of each modality are transformed into a Laplacian matrix, and these Laplacian matrices are combined by weighted superposition, which reduces the computational cost and further improves model performance.

The main contributions of the paper are:

1. **Construction of a multimodal hypergraph structure**: A unimodal hypergraph structure is first constructed for each feature extraction method. The hypergraph incidence matrices and weight matrices of the different modalities are then extended, and the multimodal abstract features are fused through model training. The test results show that the extended multimodal incidence matrix effectively improves the classification performance of the hypergraph learning model.
2. **Laplacian matrix fusion method**: A high-dimensional multimodal incidence matrix leads to high computational cost, so a Laplacian matrix fusion method is proposed. The hypergraph incidence matrix and weight matrix of each modality are first transformed into a Laplacian matrix, and these Laplacian matrices are then combined by weighted superposition for subsequent model training. The test results show that Laplacian matrix fusion not only reduces the computational cost relative to the extended multimodal incidence matrix but also further improves model performance.

Together, these methods address the limited feature extraction and abstract representation ability of existing hypergraph learning models and improve both the classification performance and the computational efficiency of the model.
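
The two fusion strategies summarized above can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: it assumes the normalized hypergraph Laplacian of Zhou, Huang, and Schölkopf (2006), because the summary does not specify which Laplacian transformation is used. The function names (`hypergraph_laplacian`, `extend_incidence`, `fuse_laplacians`), the fusion weights `alphas`, and the toy matrices are all hypothetical.

```python
import numpy as np


def hypergraph_laplacian(H, w):
    """Assumed form: I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}.

    H is an (n_vertices, n_hyperedges) incidence matrix and w holds one
    weight per hyperedge. Every vertex must belong to at least one
    hyperedge and no hyperedge may be empty, otherwise a degree is zero.
    """
    H = np.asarray(H, dtype=float)
    w = np.asarray(w, dtype=float)
    d_v = H @ w            # vertex degree: total weight of incident hyperedges
    d_e = H.sum(axis=0)    # hyperedge degree: number of vertices it contains
    dv_inv_sqrt = 1.0 / np.sqrt(d_v)
    theta = (dv_inv_sqrt[:, None] * H) @ np.diag(w / d_e) @ (H.T * dv_inv_sqrt[None, :])
    return np.eye(H.shape[0]) - theta


def extend_incidence(Hs, ws):
    """Strategy 1: concatenate the hyperedges (columns) of all modalities."""
    return np.hstack(Hs), np.concatenate(ws)


def fuse_laplacians(Hs, ws, alphas):
    """Strategy 2: weighted superposition of the per-modality Laplacians."""
    return sum(a * hypergraph_laplacian(H, w) for a, H, w in zip(alphas, Hs, ws))


# Toy data: 4 vertices described by two hypothetical modalities.
H_a = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
H_b = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]], dtype=float)
w_a, w_b = np.ones(2), np.ones(3)

H_ext, w_ext = extend_incidence([H_a, H_b], [w_a, w_b])
L_ext = hypergraph_laplacian(H_ext, w_ext)
L_fused = fuse_laplacians([H_a, H_b], [w_a, w_b], alphas=[0.5, 0.5])
```

In this sketch the extended incidence matrix gains a column for every hyperedge of every modality, while each per-modality Laplacian, and therefore the fused one, stays at `n_vertices × n_vertices`. How the fusion weights are chosen is not stated in the summary, so the equal weights here are only a placeholder.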

Multimodal Feature Fusion Based Hypergraph Learning Model
