Abstract: Current social science efforts automatically populate event databases of "who did what to whom?" tuples by applying event extraction (EE) to text such as news. These event databases are used to analyze sociopolitical dynamics between actor pairs (dyads) in, e.g., international relations. While most EE methods rely heavily on rules or supervised learning, zero-shot event extraction could allow researchers to flexibly specify arbitrary event classes for new research questions. Unfortunately, we find that current zero-shot EE methods, as well as a naive zero-shot approach of simple generative language model (LM) prompting, perform poorly for dyadic event extraction; most suffer from word sense ambiguity, modality sensitivity, and computational inefficiency. We address these challenges with a new fine-grained, multi-stage, instruction-following generative LM pipeline, and propose a Monte Carlo approach to deal with, and even take advantage of, the nondeterminism of generative outputs. Our pipeline includes explicit stages of linguistic analysis (synonym generation, contextual disambiguation, argument realization, event modality), improving control and interpretability compared to purely neural methods. This method outperforms other zero-shot EE approaches, and outperforms naive applications of generative LMs by at least 17 F1 percentage points. The pipeline's filtering mechanism greatly improves computational efficiency, allowing it to perform as few as 12% of the queries that a previous zero-shot method uses. Finally, we demonstrate our pipeline's application to dyadic international relations analysis.

Political DEBATE: Efficient Zero-shot and Few-shot Classifiers for Political Text

Selecting Between BERT and GPT for Text Classification in Political Science Research

Automatic Text Classification With Large Language Models: A Review of openai for Zero- and Few-Shot Classification

Deciphering Political Entity Sentiment in News with Large Language Models: Zero-Shot and Few-Shot Strategies

Analysis of Socially Unacceptable Discourse with Zero-shot Learning

PoliPrompt: A High-Performance Cost-Effective LLM-Based Text Classification Framework for Political Science

Political Footprints: Political Discourse Analysis using Pre-Trained Word Vectors

A Monte Carlo Language Model Pipeline for Zero-Shot Sociopolitical Event Extraction

Deceptively simple: An outsider's perspective on natural language processing

Leveraging Codebook Knowledge with NLI and ChatGPT for Zero-Shot Political Relation Classification

Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification

A Context-Aware Approach for Detecting Check-Worthy Claims in Political Debates

The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions

Balancing Transparency and Accuracy: A Comparative Analysis of Rule-Based and Deep Learning Models in Political Bias Classification

Detecting Statements in Text: A Domain-Agnostic Few-Shot Solution

Automated stance detection in complex topics and small languages: The challenging case of immigration in polarizing news media

Examining Political Rhetoric with Epistemic Stance Detection

From Experts to the Public: Governing Multimodal Language Models in Politically Sensitive Video Analysis

Topic Classification for Political Texts with Pretrained Language Models

Zero is Not Hero Yet: Benchmarking Zero-Shot Performance of LLMs for Financial Tasks