Modeling the author bias between two on-line computer science citation databases

Vaclav Petricek, Ingemar J. Cox, Hui Han, Isaac G. Councill, C. Lee Giles
DOI: https://doi.org/10.1145/1062745.1062869
2005-01-01
Abstract: We examine the differences and similarities between two on-line computer science citation databases, DBLP and CiteSeer. The database entries in DBLP are inserted manually, while the CiteSeer entries are obtained autonomously. We show that the CiteSeer database contains considerably fewer single-author papers. This bias can be modeled by an exponential process with an intuitive explanation. The model permits us to predict that the DBLP database covers approximately 30% of the entire literature of Computer Science.