Sparse learning of maximum likelihood model for optimization of complex loss function

Ning Zhang, Prathamesh Chandrasekar
DOI: https://doi.org/10.48550/arXiv.1511.05743
2015-11-18
Abstract: Traditional machine learning methods usually minimize a simple loss function to learn a predictive model, and then use a complex performance measure to evaluate the prediction performance. However, minimizing a simple loss function cannot guarantee optimal performance under the complex measure. In this paper, we study the problem of optimizing the complex performance measure directly to obtain a predictive model. We propose to construct a maximum likelihood model for this problem, and to learn the model parameter, we minimize a complex loss function corresponding to the desired complex performance measure. To optimize the loss function, we approximate the upper bound of the complex loss. We also propose to impose sparsity on the model parameter to obtain a sparse model. An objective is constructed by combining the upper bound of the loss function and the sparsity of the model parameter, and we develop an iterative algorithm to minimize it by using the fast iterative shrinkage-thresholding algorithm framework. Experiments on optimizing three different complex performance measures, including F-score, receiver operating characteristic curve, and recall precision curve break even point, over three real-world applications (aircraft event recognition of civil aviation safety, intrusion detection in wireless mesh networks, and image classification) show the advantages of the proposed method over state-of-the-art methods.
Machine Learning
What problem does this paper attempt to address?
The main problem this paper addresses is a mismatch between training and evaluation. Traditional machine-learning methods optimize simple loss functions (such as hinge loss or logistic loss) during training, but the resulting predictions are evaluated at test time with complex performance metrics such as F-score, area under the ROC curve (AUROC), and recall-precision break-even point (RPBEP). Optimizing a simple loss does not directly optimize the complex metric of interest, so optimal performance on that metric cannot be guaranteed, and the model may perform poorly in practical applications.

To solve this problem, the authors propose a method that obtains a prediction model by directly optimizing the complex loss function corresponding to the required complex performance metric. Specifically, they construct a maximum-likelihood model and learn its parameters by minimizing the complex loss function. To obtain a sparse model, they also introduce a sparsity constraint on the parameters. The resulting objective function is minimized by an iterative algorithm based on the fast iterative shrinkage-thresholding algorithm (FISTA) framework.

### Main contributions

1. **Propose for the first time** the use of a maximum-likelihood model to construct a prediction model that optimizes complex losses.
2. **Construct a new optimization problem** that takes into account both the sparsity of the model and the minimization of complex losses.
3. **Develop a new iterative algorithm** to solve the proposed minimization problem, together with a new method to approximate the upper bound of complex losses. The upper bound is represented as a logarithmic function and optimized with the FISTA algorithm.
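The FISTA-with-sparsity idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm: this summary does not give the paper's upper bound on the complex loss, so the ordinary mean logistic loss is used here as a stand-in smooth term, and the sparsity constraint is assumed to be an L1 penalty. All function names (`soft_threshold`, `logistic_grad`, `objective`, `fista_l1`) are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def logistic_grad(w, X, y):
    """Gradient of the mean logistic loss, labels y in {-1, +1}.
    Assumption: this smooth loss stands in for the paper's upper bound."""
    margins = y * (X @ w)
    # 1 / (1 + exp(m)) written with tanh for numerical stability
    coeff = -y * 0.5 * (1.0 - np.tanh(0.5 * margins))
    return X.T @ coeff / X.shape[0]

def objective(w, X, y, lam):
    """Mean logistic loss plus the L1 penalty lam * ||w||_1."""
    margins = y * (X @ w)
    return np.mean(np.logaddexp(0.0, -margins)) + lam * np.sum(np.abs(w))

def fista_l1(X, y, lam, n_iter=200):
    """FISTA for the L1-regularized stand-in objective."""
    n, d = X.shape
    # Lipschitz constant of the gradient of the mean logistic loss
    lipschitz = np.linalg.norm(X, 2) ** 2 / (4.0 * n)
    w = np.zeros(d)   # current iterate
    z = w.copy()      # extrapolated (momentum) point
    t = 1.0
    for _ in range(n_iter):
        # gradient step at z, followed by shrinkage-thresholding
        w_next = soft_threshold(z - logistic_grad(z, X, y) / lipschitz,
                                lam / lipschitz)
        # momentum update that distinguishes FISTA from plain ISTA
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w

# Usage on synthetic data: only the first two features carry signal.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[:2] = [2.0, -3.0]
y_demo = np.where(X_demo @ w_true >= 0, 1.0, -1.0)
w_hat = fista_l1(X_demo, y_demo, lam=0.01)
```

Swapping `logistic_grad` for the gradient of a different smooth upper bound would leave the rest of the loop unchanged, which is the practical appeal of the FISTA framework for a sparsity-regularized objective like this one.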
### Experimental verification

The authors evaluate the proposed method in three practical application scenarios:

- **Aviation event recognition**: Predict whether an aircraft landing is normal or abnormal.
- **Wireless mesh network intrusion detection**: Detect network attacks.
- **Image classification**: Use the large-scale image dataset ImageNet for multi-class classification.

The experimental results show that the proposed method outperforms several existing state-of-the-art methods in optimizing complex performance metrics such as F-score, AUROC, and RPBEP, especially when dealing with large-scale data.

### Conclusion

The paper reports that directly optimizing complex loss functions and introducing sparse constraints improves the performance of prediction models on complex performance metrics. The method is validated in multiple practical application scenarios, which demonstrates its advantages in optimizing such metrics.