Abstract: We apply the network Lasso to classify partially labeled data points which are characterized by high-dimensional feature vectors. In order to learn an accurate classifier from limited amounts of labeled data, we borrow statistical strength, via an intrinsic network structure, across the dataset. The resulting logistic network Lasso amounts to a regularized empirical risk minimization problem using the total variation of a classifier as a regularizer. This minimization problem is a non-smooth convex optimization problem which we solve using a primal-dual splitting method. This method is appealing for big data applications as it can be implemented as a highly scalable message passing algorithm.

What problem does this paper attempt to address?

The paper addresses the classification of partially labeled networked data. Specifically, it considers data points that are represented by high-dimensional feature vectors and related to one another through an intrinsic network structure. With only limited labeled data, traditional classification methods may lack the statistical strength needed to learn an accurate classifier. The paper therefore proposes **Logistic Network Lasso (lnLasso)**, which exploits the network structure of the data to improve classification performance.

### Specific description of the problem

1. **Partially labeled data**: In practical applications, obtaining a large amount of labeled data is usually expensive and time-consuming. Learning an effective classifier from a small amount of labeled data is therefore an important challenge.
2. **High-dimensional feature vectors**: Each data point is represented by a high-dimensional feature vector, which makes the classification task more complex.
3. **Network structure**: The data points are connected by an intrinsic network structure (such as a social network or a literature citation network), and this structure can provide additional information for classification.

### Solutions

To address the above challenges, the paper proposes the following solutions:

- **Logistic Network Lasso (lnLasso)**: This method learns the classifier by regularized empirical risk minimization. The regularization term is the total variation (TV) of the classifier, which encourages the classifier to be approximately constant on densely connected subgraphs (clusters). The optimization problem can be expressed as:

  \[ \hat{w} \in \arg\min_{w \in C} \hat{E}(w)+\lambda\|w\|_{TV} \]

  where:

  - \(\hat{E}(w)\) is the empirical risk, which measures the error of the classifier on the training set.
  - \(\|w\|_{TV}=\sum_{\{i, j\} \in E} A_{ij}\|w(j)-w(i)\|\) is the total variation regularization term, which measures how much the classifier differs between adjacent nodes.
  - \(\lambda\) is the regularization parameter, which balances the empirical risk against the regularization term.
- **Large-scale scalability**: To handle large-scale data sets, the paper solves this problem with an algorithm based on a primal-dual splitting method. The algorithm can be implemented as message passing over the network structure and scales well.

### Main contributions

1. **Novel implementation method**: Logistic Network Lasso is solved efficiently by applying the primal-dual method.
2. **Convergence proof**: The convergence of the proposed primal-dual method is proved.
3. **Experimental verification**: The effectiveness of the method is verified on data sets with chain-like and grid-like structures.

In conclusion, the paper proposes an efficient classification method that combines network structure with partially labeled data to meet the challenges posed by high-dimensional features and limited labeled data.
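To make the objective described above concrete, the following is a minimal Python sketch that evaluates the empirical risk plus \(\lambda\) times the total variation on a toy chain graph. It is not the paper's solver and does not implement the primal-dual method. Several details are assumptions that the summary does not specify: the empirical risk is taken to be the mean logistic loss over labeled nodes with labels in {-1, +1}, the classifier at node i is the linear map w(i)^T x(i), and the norm in the TV term is Euclidean. All function and variable names are hypothetical.

```python
import numpy as np


def total_variation(w, edges, weights):
    # Sum over edges {i, j} of A_ij * ||w(j) - w(i)||; Euclidean norm is assumed.
    return float(sum(a * np.linalg.norm(w[j] - w[i])
                     for (i, j), a in zip(edges, weights)))


def logistic_empirical_risk(w, X, y, labeled):
    # Assumed form of the empirical risk: mean logistic loss over labeled nodes,
    # with labels in {-1, +1} and a node-wise linear classifier w(i)^T x(i).
    margins = np.array([y[i] * (w[i] @ X[i]) for i in labeled])
    return float(np.mean(np.logaddexp(0.0, -margins)))


def lnlasso_objective(w, X, y, labeled, edges, weights, lam):
    # Empirical risk plus lambda times the total variation of w.
    return (logistic_empirical_risk(w, X, y, labeled)
            + lam * total_variation(w, edges, weights))


# Toy chain graph with 4 nodes, unit edge weights, and 2-dimensional features.
edges = [(0, 1), (1, 2), (2, 3)]
weights = [1.0, 1.0, 1.0]
X = np.tile(np.array([1.0, 0.0]), (4, 1))
y = np.array([1, 0, 0, -1])   # entries for unlabeled nodes are ignored
labeled = [0, 3]              # only the two end nodes carry labels

# Candidate 1: the same weight vector at every node (zero total variation).
w_const = np.tile(np.array([1.0, 0.0]), (4, 1))
# Candidate 2: piecewise constant, changing once across the middle edge.
w_jump = np.array([[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]])

for lam in (0.1, 1.0):
    print(lam,
          lnlasso_objective(w_const, X, y, labeled, edges, weights, lam),
          lnlasso_objective(w_jump, X, y, labeled, edges, weights, lam))
```

With these toy values, a small \(\lambda\) favors the classifier that changes across the middle edge, because it fits both labels. A larger \(\lambda\) favors the constant classifier, because the TV penalty outweighs the gain in empirical risk.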

Classifying Partially Labeled Networked Data via Logistic Network Lasso

Analysis of Network Lasso for Semi-Supervised Regression

Regularized Multinomial Regression Method for Hyperspectral Data Classification Via Pathwise Coordinate Optimization

On the Duality between Network Flows and Network Lasso

Network linear discriminant analysis

Network Lasso: Clustering and Optimization in Large Graphs

Penalized polytomous ordinal logistic regression using cumulative logits. Application to network inference of zero-inflated variables

A Bayesian approach to multi-task learning with network lasso

The joint graphical lasso for inverse covariance estimation across multiple classes

Robust adaptive LASSO in high-dimensional logistic regression

LOGISTIC REGRESSION WITH NETWORK STRUCTURE

Single-Label Multi-Class Image Classification by Deep Logistic Regression

OT-LLP: Optimal Transport for Learning from Label Proportions

High Dimensional Logistic Regression Under Network Dependence

Differential Network Analysis via the Lasso Penalized D-Trace Loss

High-dimensional classification by sparse logistic regression

Discriminative Subnetworks with Regularized Spectral Learning for Global-State Network Data

Differential Network Analysis Via Lasso Penalized D-trace Loss

Weighted Lasso Estimates for Sparse Logistic Regression: Non-Asymptotic Properties with Measurement Errors

Robust and Efficient Network Reconstruction in Complex System via Adaptive Signal Lasso

Class-Distributed Learning for Multinomial Logistic Regression with High Dimensional Features and a Large Number of Classes