Abstract: Manifold alignment is a type of data fusion technique that creates a shared low-dimensional representation of data collected from multiple domains, enabling cross-domain learning and improved performance in downstream tasks. This paper presents an approach to manifold alignment using random forests as a foundation for semi-supervised alignment algorithms, leveraging the model's inherent strengths. We focus on enhancing two recently developed graph-based alignment methods by integrating class labels through geometry-preserving proximities derived from random forests. These proximities serve as a supervised initialization for constructing cross-domain relationships that maintain local neighborhood structures, thereby facilitating alignment. Our approach addresses a common limitation in manifold alignment, where existing methods often fail to generate embeddings that capture sufficient information for downstream classification. By contrast, we find that alignment models that use random forest proximities or class-label information achieve improved accuracy on downstream classification tasks, outperforming single-domain baselines. Experiments across multiple datasets show that our method typically enhances cross-domain feature integration and predictive performance, suggesting that random forest proximities offer a practical solution for tasks requiring multimodal data alignment.

What problem does this paper attempt to address?

The problem this paper addresses is that the embeddings produced by existing manifold alignment methods often fail to capture enough information for downstream classification, so classifiers trained on them perform worse than single-domain baselines. Specifically, the representations generated by many existing manifold alignment methods perform poorly when fed to predictive models and do not significantly improve classification accuracy on multimodal data.

To solve this problem, the paper proposes random forest-supervised manifold alignment, which uses the supervision information from a random forest to initialize the manifold alignment algorithm. This enhances cross-domain feature fusion and predictive performance, and thereby improves downstream classification. The paper proposes two methods:

1. **RF-SPUD (Random Forest-Supervised Shortest Path on Union of Domains)**: constructs cross-domain relationships through shortest paths.
2. **RF-MASH (Random Forest-Supervised Manifold Alignment via Stochastic Hopping)**: constructs cross-domain relationships through a diffusion process.

Both methods use Random Forest Geometry- and Accuracy-Preserving proximities (RF-GAP) so that the generated embeddings preserve local neighborhood structure and perform better in downstream classification tasks.

### Formulas and Concepts

- **Random Forest Proximity**: The random forest proximity \(p(x_i, x_j)\) is computed from the trained random forest and represents the similarity between data points \(x_i\) and \(x_j\). It can be used to construct a weighted graph whose edge weights reflect the similarity between data points.

\[ p(x_i, x_j) = \frac{\text{number of trees in which } x_i \text{ and } x_j \text{ fall into the same leaf node}}{\text{number of trees}} \]

- **Cross-Domain Similarity Matrix**: The joint similarity matrix \(P\) contains two within-domain sub-matrices, \(P_X\) and \(P_Y\), which hold the similarities inside each domain, and a cross-domain block \(P_{XY}\):

\[ P = \begin{pmatrix} P_X & P_{XY} \\ P_{YX} & P_Y \end{pmatrix} \]

where \(P_{YX} = P_{XY}^T\).

By introducing random forest proximities, the method can better capture the relationships between multimodal data while maintaining local neighborhood structure, thereby improving the accuracy of downstream classification tasks. Experimental results show that manifold alignment initialized with random forest proximities can significantly improve classification performance on multiple datasets.
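The proximity formula and the block matrix \(P\) above can be illustrated with a minimal Python sketch. It implements the same-leaf proximity exactly as written in the formula, not the RF-GAP weighting, and it is not the authors' code. The helper names `rf_proximity` and `joint_matrix`, the toy two-domain split of one synthetic dataset, and the placeholder cross-domain block built from ten assumed known correspondences are all hypothetical choices made for illustration; RF-SPUD and RF-MASH derive their cross-domain relationships differently (via shortest paths and diffusion, respectively).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def rf_proximity(forest, X):
    """Fraction of trees in which each pair of rows of X shares a leaf."""
    # forest.apply returns the leaf index of every sample in every tree,
    # as an array of shape (n_samples, n_trees).
    leaves = forest.apply(X)
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)


def joint_matrix(P_X, P_Y, P_XY):
    """Assemble the block matrix P, using P_YX = P_XY^T."""
    return np.block([[P_X, P_XY], [P_XY.T, P_Y]])


# Toy setup (assumption): two "domains" are disjoint feature subsets
# of the same 60 labeled samples.
X, y = make_classification(n_samples=60, n_features=8, random_state=0)
X_a, X_b = X[:, :4], X[:, 4:]

rf_a = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_a, y)
rf_b = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_b, y)

P_X = rf_proximity(rf_a, X_a)
P_Y = rf_proximity(rf_b, X_b)

# Placeholder cross-domain block (assumption): unit similarity between
# ten sample pairs whose correspondence is taken as known.
P_XY = np.zeros((X_a.shape[0], X_b.shape[0]))
anchors = np.arange(10)
P_XY[anchors, anchors] = 1.0

P = joint_matrix(P_X, P_Y, P_XY)
print(P.shape)
```

Each within-domain block is symmetric with ones on the diagonal, since every point shares a leaf with itself in every tree. The pairwise comparison uses memory proportional to the number of samples squared times the number of trees, so this form is only practical for small datasets.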

Random Forest-Supervised Manifold Alignment
