Learning to Predict Citation-Based Impact Measures

Luca Weihs, Oren Etzioni
DOI: https://doi.org/10.1109/jcdl.2017.7991559
2017-06-01
Abstract: Citations implicitly encode a community's judgment of a paper's importance and thus provide a unique signal by which to study scientific impact. Efforts in understanding and refining this signal are reflected in the probabilistic modeling of citation networks and the proliferation of citation-based impact measures such as Hirsch's h-index. While these efforts focus on understanding the past and present, they leave open the question of whether scientific impact can be predicted into the future. Recent work addressing this deficiency has employed linear and simple probabilistic models; we show that these results can be handily outperformed by leveraging non-linear techniques. In particular, we find that these AI methods can predict measures of scientific impact for papers and authors, namely citation rates and h-indices, with surprising accuracy, even 10 years into the future. Moreover, we demonstrate how existing probabilistic models for paper citations can be extended to better incorporate refined prior knowledge. While predictions of "scientific impact" should be approached with healthy skepticism, our results improve upon prior efforts and form a baseline against which future progress can be easily judged.
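For readers unfamiliar with the h-index mentioned in the abstract, the following is a minimal sketch of its standard definition: the largest h such that an author has at least h papers with at least h citations each. The function name `h_index` and the sample citation counts are illustrative assumptions; this is not the paper's prediction method, only the quantity being predicted.

```python
from typing import Iterable


def h_index(citations: Iterable[int]) -> int:
    """Return the largest h such that at least h papers have >= h citations each.

    `citations` holds one citation count per paper (hypothetical input format).
    """
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            # Counts only decrease from here, so no larger h is possible.
            break
    return h


# Illustrative (made-up) citation counts for one author's five papers.
example_counts = [10, 8, 5, 4, 3]
example_h = h_index(example_counts)
```

With the made-up counts above, four papers have at least four citations each, but there are not five papers with at least five, so the h-index is 4.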