Vandalism Detection in Wikipedia: a Bag-of-Words Classifier Approach

Amit Belani
DOI: https://doi.org/10.48550/arXiv.1001.0700
2010-01-05
Abstract: A bag-of-words-based probabilistic classifier is trained using regularized logistic regression to detect vandalism in the English Wikipedia. Isotonic regression is used to calibrate the class membership probabilities. Learning curve, reliability, ROC, and cost analyses are performed.
Machine Learning, Computers and Society, Information Retrieval
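
The pipeline named in the abstract (bag-of-words features, regularized logistic regression, isotonic calibration) can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the whitespace tokenizer, the L2 penalty, the batch gradient descent settings, the toy edits, and all function names are assumptions chosen for illustration, and the calibration is fit on the training scores only to keep the example short (a held-out set would normally be used).

```python
import math
from collections import Counter, defaultdict

def bag_of_words(text):
    """Token counts from lowercased whitespace tokens (assumed tokenizer)."""
    return Counter(text.lower().split())

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(weights, bias, text):
    """Uncalibrated probability that an edit text is vandalism."""
    feats = bag_of_words(text)
    return sigmoid(bias + sum(weights.get(t, 0.0) * c for t, c in feats.items()))

def train_logreg(docs, labels, l2=0.01, lr=0.5, epochs=200):
    """Batch gradient descent on log loss with an L2 penalty on the weights."""
    feats = [bag_of_words(d) for d in docs]
    weights, bias, n = {}, 0.0, len(docs)
    for _ in range(epochs):
        grad_w, grad_b = defaultdict(float), 0.0
        for x, y in zip(feats, labels):
            z = bias + sum(weights.get(t, 0.0) * c for t, c in x.items())
            err = sigmoid(z) - y
            grad_b += err
            for t, c in x.items():
                grad_w[t] += err * c
        for t, g in grad_w.items():
            w = weights.get(t, 0.0)
            weights[t] = w - lr * (g / n + l2 * w)
        bias -= lr * grad_b / n
    return weights, bias

def isotonic_fit(scores, labels):
    """Pool-adjacent-violators: non-decreasing step function as (upper, value) pairs."""
    blocks = []  # each block holds [sum of labels, count, largest score in block]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge backwards while the block means violate monotonicity.
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            sum_y, count, upper = blocks.pop()
            blocks[-1][0] += sum_y
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(b[2], b[0] / b[1]) for b in blocks]

def isotonic_predict(model, s):
    """Calibrated probability for a raw score s."""
    for upper, value in model:
        if s <= upper:
            return value
    return model[-1][1]

# Hypothetical toy edits: label 1 = vandalism, 0 = regular edit.
docs = ["you suck lol", "lol lol stupid page", "stupid idiots suck",
        "added citation to reference", "fixed typo in reference section",
        "updated population citation"]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logreg(docs, labels)
raw_scores = [score(weights, bias, d) for d in docs]
calibrator = isotonic_fit(raw_scores, labels)
print(isotonic_predict(calibrator, score(weights, bias, "lol stupid")))
```

The calibrator is kept separate from the classifier so that the raw scores can still be used for ranking (as in an ROC analysis), while the calibrated values are the ones read as probabilities.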