A Mathematical Information Retrieval System Based on RankBoost

Ke Yuan, Liangcai Gao, Yuehan Wang, Xiaohan Yi, Zhi Tang
DOI: https://doi.org/10.1145/2910896.2925460
2016-01-01
Abstract: Mathematical Information Retrieval (MIR) systems are designed to help users find related formulae and better understand the formulae in scientific documents. However, the ranking models of nearly all existing MIR systems are based on the tf-idf model, and little effort has been made to explore features beyond the relevance between the query formula and related formulae. In this paper, we investigate a supervised ranking approach (RankBoost) in an MIR system, considering not only the relevance between a query formula and related formulae, but also features of the query formula itself and a rich set of features of the documents in which the related formulae appear. Experimental results show that our system outperforms state-of-the-art MIR systems.
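
To make the supervised ranking idea concrete, here is a minimal sketch of the general RankBoost procedure (pairwise boosting with threshold weak rankers). It is not the authors' implementation: the abstract does not specify the features or weak learners, so the feature vectors, the `train_rankboost` and `score` helpers, and the toy data are all assumptions made for illustration. Each candidate formula is represented by a feature vector, and training pairs state which of two candidates should rank higher.

```python
import math

def train_rankboost(X, pairs, rounds=10):
    """Sketch of RankBoost with threshold weak rankers h(x) = [x[f] > theta].

    X     : list of feature vectors (hypothetical features, one per candidate)
    pairs : list of (worse, better) index pairs into X
    Returns a list of (feature, threshold, alpha) triples.
    """
    n_features = len(X[0])
    D = [1.0 / len(pairs)] * len(pairs)  # uniform weights over pairs
    model = []
    for _ in range(rounds):
        best = None
        # Pick the weak ranker whose weighted pairwise agreement |r| is largest.
        for f in range(n_features):
            for theta in sorted({x[f] for x in X}):
                r = sum(d * ((X[b][f] > theta) - (X[w][f] > theta))
                        for d, (w, b) in zip(D, pairs))
                if best is None or abs(r) > abs(best[0]):
                    best = (r, f, theta)
        r, f, theta = best
        if abs(r) < 1e-12:
            break  # no weak ranker separates the remaining pairs
        r = max(min(r, 1 - 1e-9), -1 + 1e-9)  # keep the log finite
        alpha = 0.5 * math.log((1 + r) / (1 - r))
        model.append((f, theta, alpha))
        # Down-weight pairs the weak ranker orders correctly, then renormalize.
        D = [d * math.exp(alpha * ((X[w][f] > theta) - (X[b][f] > theta)))
             for d, (w, b) in zip(D, pairs)]
        z = sum(D)
        D = [d / z for d in D]
    return model

def score(model, x):
    """Final ranking score: weighted sum of the weak rankers' outputs."""
    return sum(alpha * (x[f] > theta) for f, theta, alpha in model)

# Toy data: three candidates, two made-up features each.
X = [[0.9, 3.0], [0.5, 1.0], [0.1, 2.0]]
pairs = [(1, 0), (2, 1), (2, 0)]  # (worse, better): 0 above 1 above 2
model = train_rankboost(X, pairs, rounds=5)
ranking = sorted(range(len(X)), key=lambda i: score(model, X[i]), reverse=True)
```

In a setting like the one the abstract describes, the feature vector could combine query-formula relevance with query-level and document-level features, but which features to use is a design choice this sketch does not address.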