Expanding boundaries of Gap Safe screening

Cassio F. Dantas, Emmanuel Soubies, Cédric Févotte
DOI: https://doi.org/10.48550/arXiv.2102.10846
2023-01-06
Abstract: Sparse optimization problems are ubiquitous in many fields such as statistics, signal/image processing and machine learning. This has led to the birth of many iterative algorithms to solve them. A powerful strategy to boost the performance of these algorithms is known as safe screening: it allows the early identification of zero coordinates in the solution, which can then be eliminated to reduce the problem's size and accelerate convergence. In this work, we extend the existing Gap Safe screening framework by relaxing the global strong-concavity assumption on the dual cost function. Instead, we exploit local regularity properties, that is, strong concavity on well-chosen subsets of the domain. The non-negativity constraint is also integrated to the existing framework. Besides making safe screening possible to a broader class of functions that includes beta-divergences (e.g., the Kullback-Leibler divergence), the proposed approach also improves upon the existing Gap Safe screening rules on previously applicable cases (e.g., logistic regression). The proposed general framework is exemplified by some notable particular cases: logistic function, beta = 1.5 and Kullback-Leibler divergences. Finally, we showcase the effectiveness of the proposed screening rules with different solvers (coordinate descent, multiplicative-update and proximal gradient algorithms) and different data sets (binary classification, hyperspectral and count data).
Machine Learning, Signal Processing, Optimization and Control
What problem does this paper attempt to address?
The paper sets out to broaden the scope of the Gap Safe screening framework, chiefly by relaxing the assumption that the dual cost function is globally strongly concave. Specifically, the authors exploit local regularity (strong concavity on selected subdomains) and integrate non-negativity constraints, which makes Gap Safe screening rules applicable to a broader class of functions, including β-divergences (such as the Kullback-Leibler divergence). They also improve the existing Gap Safe screening rules in cases that were already covered, such as logistic regression.

### Core Contributions of the Paper

1. **Relaxing the Global Strong Concavity Assumption**: Instead of requiring the dual cost function to be strongly concave on its whole domain, the authors only require strong concavity on specific subdomains. This lets the Gap Safe screening rule apply to a wider class of functions.
2. **Introducing Non-negativity Constraints**: Non-negativity constraints are incorporated into the existing framework, so that Gap Safe screening can handle problems with non-negative solutions.
3. **Improving Existing Screening Rules**: In cases where Gap Safe screening was already applicable (such as logistic regression), the new approach improves upon the existing rules.
4. **Application Examples**: The paper demonstrates the effectiveness of the new rules with different solvers (coordinate descent, multiplicative-update, and proximal gradient algorithms) and different types of data sets (binary classification, hyperspectral, and count data).
### Mathematical Formula Representation

The key formulas involved in the paper are the following:

- **Original Form of the Optimization Problem**:
  \[ x^\star \in \arg\min_{x \in C} P_\lambda(x) := F(Ax) + \lambda \Omega(x) \]
  where \(A\in\mathbb{R}^{m\times n}\), \(F:\mathbb{R}^m\rightarrow\mathbb{R}\), \(\Omega:\mathbb{R}^n\rightarrow\mathbb{R}_+\), \(C\subseteq\mathbb{R}^n\), and \(\lambda > 0\).
- **Dual Problem**:
  \[ \theta^\star=\arg\max_{\theta\in\Delta_A} D_\lambda(\theta) := -\sum_{i = 1}^m f_i^*(-\lambda\theta_i) \]
  where \(\Delta_A=\{\theta\in\mathbb{R}^m\mid\forall g\in G,\ \Omega_g(\varphi(A_g^T\theta))\leq1\}\cap\text{dom}(D_\lambda)\). Here the data-fidelity term is separable, \(F(z)=\sum_{i=1}^m f_i(z_i)\), and \(f_i^*\) denotes the convex conjugate of \(f_i\).
- **Gap Safe Sphere**:
  \[ B(\theta, r),\quad\text{where}\quad r = \sqrt{\frac{2\,\text{Gap}_\lambda(x,\theta)}{\alpha_S}} \]
  Here \(\text{Gap}_\lambda(x,\theta) = P_\lambda(x) - D_\lambda(\theta)\) is the duality gap and \(\alpha_S\) is the strong concavity constant of \(D_\lambda\) on the subdomain \(S\).

Through these improvements, the authors expand the application scope of the Gap Safe screening rule and improve its performance on previously covered problems such as logistic regression.
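To make the sphere test concrete, here is a minimal Python sketch for the classical Lasso case, \(F(Ax)=\tfrac12\|y-Ax\|_2^2\) and \(\Omega=\|\cdot\|_1\) with \(C=\mathbb{R}^n\). This is the standard setting in which \(D_\lambda\) is strongly concave on its whole domain with constant \(\alpha=\lambda^2\), not the local rules proposed in the paper. The function name `gap_safe_screen_lasso`, the choice of dual point (a rescaled residual), and the toy data are assumptions made purely for illustration.

```python
import numpy as np


def gap_safe_screen_lasso(A, y, x, lam):
    """Gap Safe sphere test for the Lasso (illustrative sketch, hypothetical helper).

    Returns a boolean mask of coordinates that can be safely set to zero,
    together with the sphere radius and the duality gap.
    """
    residual = y - A @ x

    # Dual feasible point: rescale the residual so that ||A^T theta||_inf <= 1.
    theta = residual / max(lam, np.max(np.abs(A.T @ residual)))

    # Primal and dual objectives for the quadratic loss.
    primal = 0.5 * residual @ residual + lam * np.sum(np.abs(x))
    dual = lam * (y @ theta) - 0.5 * lam**2 * (theta @ theta)
    gap = max(primal - dual, 0.0)  # clip tiny negative values due to round-off

    # Global strong concavity constant of the Lasso dual (assumption: quadratic loss).
    alpha = lam**2
    radius = np.sqrt(2.0 * gap / alpha)

    # Coordinate j is screened if the dual constraint cannot be active
    # anywhere in the sphere B(theta, radius).
    scores = np.abs(A.T @ theta) + radius * np.linalg.norm(A, axis=0)
    return scores < 1.0, radius, gap


# Toy usage with an identity design, where the Lasso solution is a soft-thresholding of y.
A = np.eye(4)
y = np.array([3.0, 0.5, -2.0, 0.1])
lam = 1.0
x = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)
mask, radius, gap = gap_safe_screen_lasso(A, y, x, lam)
print(mask, radius, gap)
```

In this toy example the primal point is already optimal, so the gap and the radius are essentially zero and the test flags exactly the coordinates whose entries of `y` are smaller than `lam` in magnitude. In practice the test would be re-run inside a solver as the gap shrinks. The rules proposed in the paper replace the global constant \(\alpha\) with a local constant \(\alpha_S\) valid on a subdomain \(S\), which this sketch does not attempt to reproduce.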