Abstract: In this paper, we propose a novel semi-supervised feature selection framework that mines correlations among multiple tasks, and we apply it to different multimedia applications. Instead of independently computing the importance of features for each task, our algorithm leverages shared knowledge from multiple related tasks, thereby improving the performance of feature selection. Note that we build our algorithm on the assumption that different tasks share common structures. The proposed algorithm selects features in a batch mode, by which the correlations between different features are taken into consideration. Besides, considering the fact that labeling a large amount of training data in the real world is both time-consuming and tedious, we adopt manifold learning, which exploits both labeled and unlabeled training data for feature space analysis. Since the objective function is non-smooth and difficult to solve, we propose an iterative algorithm with fast convergence. Extensive experiments on different applications demonstrate that our algorithm outperforms other state-of-the-art feature selection algorithms.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper addresses how to use the correlations among multiple related tasks, together with semi-supervised learning, to improve feature selection. Specifically, the authors propose a novel semi-supervised feature selection framework (Semi-supervised Feature selection by Mining Correlations among multiple tasks, SFMC), which selects features by mining the correlations among multiple tasks and by combining labeled and unlabeled data.

#### Main problems include:

1. **Redundant features in high-dimensional data**:
   - In many computer vision and pattern recognition applications, the dimension of the data representation is usually very high. Many features are noisy or correlated with each other, which reduces the performance of subsequent data analysis tasks.
2. **Information loss in feature selection**:
   - Existing feature selection algorithms usually evaluate the importance of each feature independently, ignoring the correlations between different features. Moreover, they select features for each task separately and fail to mine the correlations among multiple related tasks.
3. **Insufficient labeled data**:
   - In real-world applications, it is unrealistic to manually label a large number of training samples. Therefore, how to effectively use unlabeled data becomes an important issue.
4. **The advantages of multi-task learning are not fully utilized**:
   - Existing research on multi-task learning shows that jointly learning multiple related tasks can improve performance. However, existing feature selection algorithms do not exploit this.

### Solutions

To overcome the above problems, the authors propose the following solutions:

- **Combining semi-supervised learning and multi-task learning**:
  - Use labeled and unlabeled data for feature selection and consider the correlations between different features, thereby improving feature selection performance.
- **Introducing manifold learning**:
  - Explore the structure of multimedia data through manifold learning to better handle feature space analysis.
- **Optimizing the objective function**:
  - Propose a fast-converging iterative algorithm to solve the non-smooth objective function, which is difficult to optimize directly.

### Experimental verification

The authors verify the effectiveness of the proposed method through experiments on video classification, image annotation, human action recognition, and 3D motion data analysis. The experimental results show that the SFMC algorithm outperforms other existing methods in different application scenarios, especially when labeled data is insufficient.

### Summary

The main contribution of this paper lies in combining semi-supervised feature selection and multi-task learning in one framework, which improves feature selection and also copes better with insufficient labeled data.
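
### Illustrative sketch

The ingredients described above (a loss on labeled samples, a manifold term over labeled and unlabeled samples, and a non-smooth regularizer solved iteratively) can be illustrated with a small NumPy sketch. This is not the authors' SFMC objective or code. It rests on several assumptions: a least-squares loss, a k-nearest-neighbour graph Laplacian as the manifold term, and a joint l2,1-norm over the stacked per-task weight matrices as a stand-in for the shared structure across tasks, solved by the standard iteratively reweighted scheme. The names `knn_laplacian` and `select_features` and the parameters `mu`, `gamma`, and `k` are hypothetical.

```python
import numpy as np

def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetrized k-nearest-neighbour graph."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                    # exclude self-neighbours
    idx = np.argsort(d2, axis=1)[:, :k]             # k nearest neighbours per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)                          # make the graph symmetric
    return np.diag(A.sum(axis=1)) - A

def select_features(tasks, mu=0.1, gamma=1.0, k=5, n_iter=30, eps=1e-8):
    """Score features jointly over several tasks (assumed objective, see lead-in).

    tasks: list of (X, Y, labeled_mask) with X of shape (n, d), Y of shape
    (n, c) (e.g. one-hot labels; rows of unlabeled samples are ignored), and
    labeled_mask a boolean vector of length n. All tasks share the same d.
    Returns (scores, Ws): one score per feature and one weight matrix per task.
    """
    d = tasks[0][0].shape[1]
    pre = []
    for X, Y, mask in tasks:
        Xl, Yl = X[mask], Y[mask]
        L = knn_laplacian(X, k)                     # uses labeled + unlabeled rows
        pre.append((Xl.T @ Xl + mu * X.T @ L @ X, Xl.T @ Yl))
    row_w = np.ones(d)                              # diagonal of the reweighting matrix
    for _ in range(n_iter):
        D = np.diag(row_w)
        # With D fixed, each task reduces to a linear system.
        Ws = [np.linalg.solve(A + gamma * D, B) for A, B in pre]
        # Row norms over all tasks couple the tasks: a feature is kept or
        # suppressed for every task at once.
        row_norms = np.linalg.norm(np.hstack(Ws), axis=1)
        row_w = 1.0 / (2.0 * np.maximum(row_norms, eps))
    return row_norms, Ws
```

Features would then be ranked by `scores` in descending order and the top ones kept. Centering, bias terms, and any convergence check are omitted to keep the sketch short.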

Semi-supervised Feature Analysis by Mining Correlations among Multiple Tasks

Semisupervised Feature Analysis by Mining Correlations among Multiple Tasks.

Multi-View Correlated Feature Learning by Uncovering Shared Component.

Feature Selection for Multimedia Analysis by Sharing Information among Multiple Tasks

A Multi-Task Learning Strategy for Unsupervised Clustering Via Explicitly Separating the Commonality

U^2F^2S^2 : Uncovering Feature-level Similarities for Unsupervised Feature Selection

Discriminating Joint Feature Analysis for Multimedia Data Understanding

Adaptive Collaborative Correlation Learning-based Semi-Supervised Multi-Label Feature Selection

Adaptive Structure Discovery for Multimedia Analysis Using Multiple Features.

A Convex Formulation for Semi-Supervised Multi-Label Feature Selection.

Dual Dictionary Learning for Mining a Unified Feature Subspace Between Different Hyperspectral Image Scenes

An Adaptive Semisupervised Feature Analysis for Video Semantic Recognition

Dynamic multi-label feature selection algorithm based on label importance and label correlation

Online Multi-label Streaming Feature Selection with Label Correlation

Label Correlations-Based Multi-Label Feature Selection with Label Enhancement

A Hybrid Feature Selection Approach by Correlation-Based Filters and SVM-RFE

Multi-Feature Fusion Via Hierarchical Regression for Multimedia Analysis

A-SFS: Semi-supervised Feature Selection based on Multi-task Self-supervision

Semisupervised Feature Selection via Structured Manifold Learning

Semi-Supervised Multiview Feature Selection With Adaptive Graph Learning

Online Multi-Label Streaming Feature Selection Based on Label Group Correlation and Feature Interaction