"Don't quote me on that": Finding Mixtures of Sources in News Articles

Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara
DOI: https://doi.org/10.48550/arXiv.2104.09656
2021-04-19
Computation and Language
Abstract: Journalists publish statements provided by people, or sources, to contextualize current events, help voters make informed decisions, and hold powerful individuals accountable. In this work, we construct an ontological labeling system for sources based on each source's affiliation and role. We build a probabilistic model to infer these attributes for named sources and to describe news articles as mixtures of these sources. Our model outperforms existing mixture modeling and co-clustering approaches and correctly infers source-type in 80% of expert-evaluated trials. Such work can facilitate research in downstream tasks like opinion and argumentation mining, representing a first step towards machine-in-the-loop computational journalism systems.