Surveying Stylometry Techniques and Applications
Tempestt Neal, Kalaivani Sundararajan, Aneez Fatima, Yiming Yan, Yingfei Xiang, Damon Woodard
DOI: https://doi.org/10.1145/3132039
IF: 16.6
2018-11-30
ACM Computing Surveys
Abstract: The analysis of authorial style, termed stylometry, assumes that style is quantifiably measurable for evaluation of distinctive qualities. Stylometry research has yielded several methods and tools over the past 200 years to handle a variety of challenging cases. This survey reviews several articles within five prominent subtasks: authorship attribution, authorship verification, authorship profiling, stylochronometry, and adversarial stylometry. Discussions on datasets, features, experimental techniques, and recent approaches are provided. Further, a current research challenge lies in the inability of authorship analysis techniques to scale to a large number of authors with few text samples. Here, we perform an extensive performance analysis on a corpus of 1,000 authors to investigate authorship attribution, verification, and clustering using 14 algorithms from the literature. Finally, several remaining research challenges are discussed, along with descriptions of various open-source and commercial software that may be useful for stylometry subtasks.
computer science, theory & methods