Two-pass AUC optimization

Xun LUAN, Wei GAO
DOI: https://doi.org/10.11992/tis.201706079
2018-01-01
Abstract: The area under an ROC curve (AUC) is an important performance index for class-imbalanced learning, cost-sensitive learning, learning to rank, etc. Traditional AUC optimization requires the entire dataset to be stored because AUC is defined over pairs of positive and negative instances. To solve this problem, the one-pass AUC (OPAUC) algorithm was introduced previously; it scans the data only once and stores the first- and second-order statistics. However, in many real applications, the second-order statistics require high storage and are computationally costly, especially for high-dimensional datasets. We introduce the two-pass AUC (TPAUC) optimization, which calculates the means of the positive and negative instances in the first pass and then uses the stochastic gradient descent method in the second pass. The new algorithm requires storage of the first-order statistics but not the second-order statistics; hence, the efficiency is improved. Finally, experiments verify the effectiveness of the proposed algorithm.
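The two-pass structure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the surrogate loss or the update rule, so everything beyond "class means in pass one, stochastic gradient descent in pass two" is an assumption made for illustration: the sketch uses a regularized pairwise least-squares loss in which the opposite-class instance of each pair is replaced by its class mean, labels are taken to be +1 and -1, and the names `first_pass_means`, `second_pass_sgd`, `tpauc_sketch`, `lr`, and `reg` are hypothetical. It should not be read as the authors' exact algorithm.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pairwise_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def first_pass_means(data, dim):
    """Pass 1: accumulate per-class sums and return the class means.

    Only first-order statistics (O(dim) memory) are kept.
    Assumes labels are +1 / -1 and both classes occur at least once.
    """
    sums = {1: [0.0] * dim, -1: [0.0] * dim}
    counts = {1: 0, -1: 0}
    for x, y in data:
        counts[y] += 1
        for j in range(dim):
            sums[y][j] += x[j]
    return {y: [s / counts[y] for s in sums[y]] for y in (1, -1)}


def second_pass_sgd(data, means, dim, lr=0.01, reg=1e-3):
    """Pass 2: SGD on an ASSUMED surrogate loss (not taken from the paper).

    Per-instance loss: 0.5 * (1 - w.d)^2 + 0.5 * reg * |w|^2,
    where d = y * (x - mean of the opposite class).
    """
    w = [0.0] * dim
    for x, y in data:
        opposite = means[-y]
        d = [y * (xj - oj) for xj, oj in zip(x, opposite)]
        err = dot(w, d) - 1.0  # derivative of 0.5 * (1 - w.d)^2 w.r.t. w.d
        w = [wj - lr * (err * dj + reg * wj) for wj, dj in zip(w, d)]
    return w


def tpauc_sketch(data, dim, lr=0.01, reg=1e-3):
    """Run both passes over a re-iterable dataset of (features, label) pairs."""
    means = first_pass_means(data, dim)
    return second_pass_sgd(data, means, dim, lr, reg)


if __name__ == "__main__":
    # Synthetic, class-imbalanced toy data (purely illustrative).
    rng = random.Random(0)
    data = []
    for _ in range(400):
        y = 1 if rng.random() < 0.2 else -1
        data.append(([rng.gauss(y, 0.7), rng.gauss(y, 0.7)], y))
    w = tpauc_sketch(data, dim=2)
    pos = [dot(w, x) for x, y in data if y == 1]
    neg = [dot(w, x) for x, y in data if y == -1]
    print("training AUC:", pairwise_auc(pos, neg))
```

In this sketch the only state carried between passes is two mean vectors, so memory grows linearly with the feature dimension, whereas storing second-order statistics would need a covariance matrix that grows quadratically. Here `data` is an in-memory list for simplicity; in a streaming setting it would be read from storage twice.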