Are Algorithms Directly Optimizing IR Measures Really Direct?
Tie‐Yan Liu
2008-01-01
Abstract: In information retrieval (IR), the goal of ranking is to construct and return a ranked list of relevant documents to the user. For a given query, the ranked list should satisfy the user’s information need as fully as possible. The quality of the returned list is evaluated with performance measures such as Normalized Discounted Cumulative Gain (NDCG) and Mean Average Precision (MAP). Many learning-to-rank algorithms, which automatically learn a ranking function by optimizing specially designed objective functions, have been proposed to solve the ranking problem. Intuitively, the IR performance measures themselves are the ideal objective functions for learning a ranking function. However, measures such as NDCG and MAP are non-smooth and non-differentiable with respect to the parameters of the ranking function. Most existing learning-to-rank algorithms are therefore designed to optimize objective functions that are only loosely related to the IR performance measures. As a result, such algorithms may achieve only suboptimal values of the IR performance measures even when they optimize their own objective functions very well. It is therefore highly desirable for learning-to-rank algorithms to optimize IR performance measures directly, or as directly as possible. To tackle this challenge, several approaches, such as SoftRank and SVM-MAP, have been proposed. Although these algorithms achieve good empirical performance, some questions remain unanswered: a) can a ranking function learned by direct optimization of an IR performance measure still perform well on unseen queries with respect to that measure? b) how directly do the proposed approaches optimize the IR performance measures?
In this report, we attempt to answer these questions. We first point out that, under some conditions, a ranking function learned by direct optimization of an IR performance measure can also perform well on unseen queries with respect to that measure. Then, to study how directly previous approaches optimize IR performance measures, we propose a directness evaluation metric. Based on this metric, we analyze SoftRank and present the corresponding results.
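The non-smoothness of NDCG mentioned in the abstract can be illustrated with a small sketch. It is not taken from the report: the toy documents, the one-parameter scorer `w * x + b`, and the helper names `dcg`, `ndcg`, and `ndcg_at` are all hypothetical, and the gain `2^rel - 1` with a `log2` position discount is one common NDCG variant assumed here. The point is only that NDCG depends on the scores through the induced ordering, so it is piecewise constant in the parameter and jumps where two documents swap positions.

```python
import math

def dcg(relevances):
    """DCG of relevance labels given in ranked order (assumed gain: 2^rel - 1)."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))

def ndcg(scores, labels):
    """NDCG of the ranking obtained by sorting documents by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    ideal_dcg = dcg(sorted(labels, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical toy query: three documents, one feature each, with fixed offsets.
features = [1.0, 2.0, 3.0]
offsets = [2.5, 1.0, 0.0]
labels = [0, 1, 2]  # graded relevance; the third document is the most relevant

def ndcg_at(w):
    """NDCG as a function of the single model parameter w (score = w * x + b)."""
    scores = [w * x + b for x, b in zip(features, offsets)]
    return ndcg(scores, labels)

if __name__ == "__main__":
    # The ordering changes only near w = 1.0, 1.25 and 1.5 for this toy data;
    # between those points NDCG is flat, so its gradient in w is zero.
    for w in (0.2, 0.8, 1.1, 1.4, 2.0):
        print(f"w = {w:.1f}  NDCG = {ndcg_at(w):.4f}")
```

Because the function is flat almost everywhere and discontinuous at the swap points, gradient-based optimizers receive no useful signal from it, which is why surrogate or smoothed objectives are used in practice.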