Learning to Rank for Biomedical Information Retrieval
Bo Xu, Hongfei Lin, Yuan Lin, Yunlong Ma, Liang Yang, Jian Wang, Zhihao Yang
DOI: https://doi.org/10.1109/bibm.2015.7359729
2015-01-01
Abstract: The number of research articles in the biomedical domain has increased exponentially, making it increasingly difficult for biologists to manually capture all the information they need. Information retrieval technologies can help obtain that information automatically. However, applying these technologies directly to the biomedical domain is challenging because of domain-specific characteristics, such as the abundance of terminology. To improve the effectiveness of biomedical information retrieval, we propose a novel framework based on learning to rank, a state-of-the-art family of information retrieval methods that has proved effective at ranking documents by their degree of relevance. Within the framework, we address the abundance of terminology by constructing ranking models that not only retrieve the most relevant documents but also diversify the search results, increasing the completeness of the result list for a given query. For model training, we propose two novel document labeling strategies and combine several traditional retrieval models as learning features. We also investigate the usefulness of different learning to rank approaches in our framework. Experimental results on the TREC Genomics datasets demonstrate that the proposed framework effectively improves the performance of biomedical information retrieval.
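The idea of combining traditional retrieval models as learning features can be illustrated with a minimal sketch. It assumes a simple pairwise linear ranker trained with perceptron-style updates, which is a stand-in and not necessarily one of the learning to rank approaches evaluated in the paper. The feature names (a BM25-like score and a language-model-like score), the toy values, and the graded labels are all hypothetical. The sketch does not cover the proposed labeling strategies or result diversification.

```python
from itertools import combinations


def score(weights, features):
    """Linear ranking score: weighted sum of retrieval-model scores."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise_ranker(queries, epochs=20, lr=0.1):
    """Learn weights so that more relevant documents score higher.

    queries: list of queries, each a list of (features, label) pairs,
    where features are scores from traditional retrieval models
    (hypothetical here) and label is a graded relevance judgment.
    """
    n_features = len(queries[0][0][0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        for docs in queries:
            for (xa, ya), (xb, yb) in combinations(docs, 2):
                if ya == yb:
                    continue  # equal labels carry no preference
                better, worse = (xa, xb) if ya > yb else (xb, xa)
                diff = [b - w for b, w in zip(better, worse)]
                # Update only when the pair is not ordered correctly.
                if score(weights, diff) <= 0:
                    weights = [w + lr * d for w, d in zip(weights, diff)]
    return weights


def rank(weights, docs):
    """Return document indices sorted by descending model score."""
    return sorted(range(len(docs)), key=lambda i: -score(weights, docs[i][0]))


# Hypothetical toy data: features are [bm25_like_score, lm_like_score],
# labels are graded relevance (2 = relevant, 1 = partial, 0 = not relevant).
toy_queries = [
    [([0.9, 0.2], 2), ([0.5, 0.4], 1), ([0.1, 0.8], 0)],
    [([0.8, 0.1], 2), ([0.4, 0.3], 1), ([0.2, 0.9], 0)],
]

weights = train_pairwise_ranker(toy_queries)
ranking = rank(weights, toy_queries[0])
```

In this toy setup the learned weights order each query's documents by their relevance labels. A real system would use many more features and a held-out evaluation on judged queries.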