Multi-Task Learning with Group-Specific Feature Space Sharing

Niloofar Yousefi, Michael Georgiopoulos, Georgios C. Anagnostopoulos
DOI: https://doi.org/10.48550/arXiv.1508.03329
2015-08-14
Abstract: When faced with learning a set of inter-related tasks from a limited amount of usable data, learning each task independently may lead to poor generalization performance. Multi-Task Learning (MTL) exploits the latent relations between tasks and overcomes data scarcity limitations by co-learning all these tasks simultaneously to offer improved performance. We propose a novel Multi-Task Multiple Kernel Learning framework based on Support Vector Machines for binary classification tasks. By considering pair-wise task affinity in terms of similarity between a pair's respective feature spaces, the new framework, compared to other similar MTL approaches, offers a high degree of flexibility in determining how similar feature spaces should be, as well as which pairs of tasks should share a common feature space in order to benefit overall performance. The associated optimization problem is solved via a block coordinate descent, which employs a consensus-form Alternating Direction Method of Multipliers algorithm to optimize the Multiple Kernel Learning weights and, hence, to determine task affinities. Empirical evaluation on seven data sets exhibits a statistically significant improvement of our framework's results compared to the ones of several other Clustered Multi-Task Learning methods.
Machine Learning
What problem does this paper attempt to address?
This paper addresses how, in multi-task learning (MTL), a limited amount of data can be used effectively to improve the generalization performance of multiple related tasks. Specifically, the authors point out that traditional MTL methods assume that all tasks are equally related and can share information uniformly. In practice this can lead to "negative transfer", where sharing information between unrelated tasks reduces generalization performance. To address this, the authors propose a new Multi-Task Multiple Kernel Learning (MT-MKL) framework based on support vector machines (SVMs) for binary classification tasks. By considering pair-wise task affinity, that is, the similarity between the feature spaces of a pair of tasks, the framework can flexibly determine which task pairs should share a common feature space, thereby improving overall performance.

### Main contributions

1. **Flexibility**: Compared with traditional methods, the framework allows the feature spaces of a task pair to be either similar or different, and it determines how similar they should be.
2. **Optimization method**: The problem is solved by block coordinate descent, which uses a consensus-form alternating direction method of multipliers (ADMM) algorithm to optimize the MKL weights and thereby determine the task affinities.
3. **Experimental verification**: Experiments on seven data sets show a statistically significant improvement over several other clustered multi-task learning (CMTL) methods.

### Mathematical formula representation

- The affinity between a pair of tasks is measured by the distance between their feature spaces:
  \[ \|\theta_t - \theta_s\|^2 \]
  where $\theta_t$ and $\theta_s$ are the feature space (MKL) weight vectors of task $t$ and task $s$ respectively. A smaller value means the two tasks use more similar feature spaces.
- Regularized risk minimization problem:
  \[ \min_{w \in \Omega(w), \theta \in \Omega(\theta), b} \sum_{t = 1}^T \|w_t\|_2^2 + C \sum_{t = 1}^T \sum_{i = 1}^n [1 - y_i^t f_t(x_i^t)]_+ + \lambda \sum_{t = 1}^{T - 1} \sum_{s > t} \|\theta_t - \theta_s\|^2 \]
  where $[u]_+ = \max(u, 0)$ denotes the hinge function, $f_t$ is the decision function of task $t$, and $C$ and $\lambda$ are non-negative regularization parameters.

With this formulation, the framework can handle the relationships between tasks more flexibly in multi-task learning, thereby improving generalization performance.
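
To make the two formulas above concrete, the following is a minimal Python sketch that evaluates the pairwise affinity penalty and the full regularized objective for given values. It is not the authors' implementation and performs no optimization (no block coordinate descent or ADMM). The function names, the NumPy array layout, and the choice to pass precomputed decision values $f_t(x_i^t)$ and squared norms $\|w_t\|_2^2$ as inputs are assumptions made purely for illustration.

```python
import numpy as np


def pairwise_affinity_penalty(theta):
    """Sum of ||theta_t - theta_s||^2 over all task pairs with s > t.

    theta: array-like of shape (T, M), one weight vector per task
           (hypothetical layout chosen for this sketch).
    """
    theta = np.asarray(theta, dtype=float)
    diffs = theta[:, None, :] - theta[None, :, :]   # shape (T, T, M)
    sq_dists = np.sum(diffs ** 2, axis=-1)          # shape (T, T)
    upper = np.triu_indices(theta.shape[0], k=1)    # index pairs with s > t
    return float(np.sum(sq_dists[upper]))


def hinge(u):
    """Element-wise [u]_+ = max(u, 0)."""
    return np.maximum(u, 0.0)


def mtl_objective(w_sq_norms, decision_values, labels, theta, C, lam):
    """Value of the regularized risk for fixed w, b, theta.

    w_sq_norms:      length-T sequence holding ||w_t||_2^2 for each task
    decision_values: list of T arrays, decision_values[t][i] = f_t(x_i^t)
    labels:          list of T arrays with entries in {-1, +1}
    theta:           array-like of shape (T, M)
    C, lam:          non-negative regularization parameters
    """
    # Hinge loss summed over all tasks and all samples of each task.
    loss = sum(
        float(np.sum(hinge(1.0 - np.asarray(y) * np.asarray(f))))
        for y, f in zip(labels, decision_values)
    )
    return float(np.sum(w_sq_norms)) + C * loss + lam * pairwise_affinity_penalty(theta)
```

In this sketch, a larger `lam` makes differences between the tasks' `theta` vectors more costly, which pushes tasks toward a shared feature space, while `lam = 0` leaves each task free to use its own.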