What are You Saying? Using Topic to Detect Financial Misreporting

Nerissa C. Brown, W. Brooke Elliott
DOI: https://doi.org/10.2139/ssrn.2803733
2016-01-01
SSRN Electronic Journal
Abstract: This study uses a machine learning technique to assess whether the thematic content of financial statement disclosures (labeled as topic) is incrementally informative in predicting intentional misreporting. Using a Bayesian topic modeling algorithm, we determine and empirically quantify the topic content of a large collection of 10-K narratives spanning the 1994 to 2012 period. We find that the algorithm produces a valid set of semantically meaningful topics that are predictive of financial misreporting based on samples of SEC enforcement actions (AAERs) and reporting irregularities identified from financial restatements and 10-K filing amendments. Our out-of-sample tests indicate that topic significantly improves the detection of financial misreporting by as much as 59% when added to models based on commonly used financial and textual style variables. Furthermore, models that incorporate topic as an additional predictor significantly outperform traditional models when detecting long-duration misreporting events. Taken together, our results suggest that the content of annual report narratives and the attention devoted to each topic are useful signals in detecting financial misreporting.