Who Wrote the Web? Revisiting Influential Author Identification Research Applicable to Information Retrieval
Martin Potthast, Sarah Braun, Tolga Buz, Fabian Duffhauss, Florian Friedrich, Jörg Marvin Gülzow, Jakob Köhler, Winfried Lötzsch, Fabian Müller, Maike Elisa Müller, Robert Paßmann, Bernhard Reinke, Lucas Rettenmeier, Thomas Rometsch, Timo Sommer, Michael Träger, Sebastian Wilhelm, Benno Stein, Efstathios Stamatatos, Matthias Hagen
DOI: https://doi.org/10.1007/978-3-319-30671-1_29
2016-01-01
Abstract: In this paper, we revisit author identification research by conducting a new kind of large-scale reproducibility study: we select 15 of the most influential papers for author identification and recruit a group of students to reimplement them from scratch. Since no open source implementations have been released for the selected papers to date, our public release will have a significant impact on researchers entering the field. This way, we lay the groundwork for integrating author identification with information retrieval to eventually scale the former to the web. Furthermore, we assess the reproducibility of all reimplemented papers in detail, and conduct the first comparative evaluation of all approaches on three well-known corpora.