Personalized Coupled Tensor Decomposition for Multimodal Data Fusion: Uniqueness and Algorithms

Ricardo Augusto Borsoi, Konstantin Usevich, David Brie, Tülay Adali
DOI: https://doi.org/10.1109/TSP.2024.3510680
2024-12-02
Abstract: Coupled tensor decompositions (CTDs) perform data fusion by linking factors from different datasets. Although many CTDs have already been proposed, current works do not address important challenges of data fusion, where: 1) the datasets are often heterogeneous, constituting different "views" of a given phenomenon (multimodality); and 2) each dataset can contain personalized or dataset-specific information, constituting distinct factors that are not coupled with other datasets. In this work, we introduce a personalized CTD framework tackling these challenges. A flexible model is proposed where each dataset is represented as the sum of two components, one related to a common tensor through a multilinear measurement model, and another specific to each dataset. Both the common and distinct components are assumed to admit a polyadic decomposition. This generalizes several existing CTD models. We provide conditions for specific and generic uniqueness of the decomposition that are easy to interpret. These conditions employ uni-mode uniqueness of different individual datasets and properties of the measurement model. Two algorithms are proposed to compute the common and distinct components: a semi-algebraic one and a coordinate-descent optimization method. Experimental results illustrate the advantage of the proposed framework compared with state-of-the-art approaches.
Machine Learning, Signal Processing
What problem does this paper attempt to address?
This paper addresses limitations of existing coupled tensor decompositions (CTDs) in multimodal data fusion, in particular the following two challenges:

1. **Heterogeneity of datasets**: Different datasets usually constitute different "views" of a given phenomenon (multimodality), such as data acquired by different sensors or measurement methods.
2. **Personalized or dataset-specific information**: Each dataset may contain information that is specific to it and is not coupled with the other datasets.

To address these challenges, the authors propose a personalized CTD framework in which each dataset is represented as the sum of two components:

- One component is related to a common tensor through a multilinear measurement model and represents the shared information;
- The other component is specific to each dataset and represents personalized or dataset-specific information.

Both components are assumed to admit a polyadic decomposition. The model generalizes several existing CTD models, and the authors provide conditions for specific and generic uniqueness of the decomposition. They also propose two algorithms to compute the common and distinct components: a semi-algebraic method and an optimization method based on coordinate descent.

### Main contributions

1. **Generalization of existing models**: The framework covers previous work (such as [9] and [40]) while also accommodating a wider range of measurement/degradation models and heterogeneity between datasets.
2. **Multilinear measurement model**: The proposed model applies to various practical problems, such as hyperspectral and neuroimaging data fusion.
3. **Uniqueness conditions that are easy to interpret**: The conditions relate the uniqueness of the overall decomposition to the "weaker" uni-mode uniqueness of the individual measured tensors and to properties of the measurement model.
4. **Algorithm development**: Semi-algebraic and optimization algorithms are proposed to compute the decomposition, and experiments illustrate their advantage over state-of-the-art approaches.

In summary, the paper aims to improve the extraction and interpretation of shared and dataset-specific information in multimodal data fusion by introducing a personalized CTD framework.
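
The two-part model summarized above can be illustrated with a small synthetic example. The following is a minimal sketch, not the authors' implementation: it assumes third-order tensors and assumes the multilinear measurement model acts as one matrix per mode (a mode-wise product), which is only one possible instance of such a model. All names, sizes, and ranks (`cp_tensor`, `mode_products`, `make_dataset`, `I`, `J`, `K`, `R`) are hypothetical choices made for illustration.

```python
import numpy as np

def cp_tensor(A, B, C):
    """Build a third-order tensor from polyadic factors: sum_r a_r o b_r o c_r."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def mode_products(T, M1, M2, M3):
    """Apply one matrix along each mode of T (assumed measurement model)."""
    return np.einsum("ai,bj,ck,ijk->abc", M1, M2, M3, T)

rng = np.random.default_rng(0)

# Common tensor Z with a rank-R polyadic decomposition (hypothetical sizes).
I, J, K, R = 8, 7, 6, 3
A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
Z = cp_tensor(A, B, C)

def make_dataset(out_shape, distinct_rank):
    """Return one dataset: measured common part plus a dataset-specific part."""
    # One measurement matrix per mode, mapping the common size to out_shape.
    meas = [rng.standard_normal((m, n)) for m, n in zip(out_shape, (I, J, K))]
    # Factors of the distinct (personalized) component, not shared across datasets.
    dist = [rng.standard_normal((m, distinct_rank)) for m in out_shape]
    Y = mode_products(Z, *meas) + cp_tensor(*dist)
    return Y, meas, dist

# Two heterogeneous "views": each is degraded along a different mode.
Y1, meas1, dist1 = make_dataset((4, 7, 6), distinct_rank=2)
Y2, meas2, dist2 = make_dataset((8, 3, 6), distinct_rank=1)

# Under this assumed model, the measured common part is itself a polyadic
# decomposition whose factors are the measured common factors.
common1 = mode_products(Z, *meas1)
common1_from_factors = cp_tensor(meas1[0] @ A, meas1[1] @ B, meas1[2] @ C)
same = np.allclose(common1, common1_from_factors)
```

The final check reflects why coupling is possible in this setting: under the mode-wise assumption, every dataset sees the same common factors `A`, `B`, `C` through its own measurement matrices, while the distinct factors appear in one dataset only.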