Renard: A Modular Pipeline for Extracting Character Networks from Narrative Texts

Arthur Amalvy, Vincent Labatut, Richard Dufour
DOI: https://doi.org/10.21105/joss.06574
2024-07-02
Abstract: Renard (Relationships Extraction from NARrative Documents) is a Python library that allows users to define custom natural language processing (NLP) pipelines to extract character networks from narrative texts. Contrary to the few existing tools, Renard can extract dynamic networks, as well as the more common static networks. Renard pipelines are modular: users can choose the implementation of each NLP subtask needed to extract a character network. This allows users to specialize pipelines to particular types of texts and to study the impact of each subtask on the extracted network.
Computation and Language
What problem does this paper attempt to address?
This paper addresses how to automatically extract static and dynamic character networks from narrative texts while letting users customize the natural language processing (NLP) workflow to their needs. Most existing tools extract only static networks and offer no way to study how errors in individual NLP subtasks affect the quality of the extracted network. The paper introduces Renard, a Python library built around modular pipelines: users choose the implementation of each NLP subtask, which lets them specialize a pipeline to a particular type of text and measure the impact of each step on the resulting network. Renard is aimed at digital humanities researchers as well as NLP researchers and practitioners, who can use it to quickly extract character networks for literary analysis or as input to other NLP tasks.
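
To make the idea of a modular pipeline concrete, here is a minimal sketch in plain Python. It is not Renard's API: every name in it (tokenize, make_gazetteer_ner, cooccurrence_graph, run_pipeline) is hypothetical and chosen for illustration only. It assumes that characters are detected by matching a fixed list of names and that two characters interact when they appear in the same sentence. Each step is an interchangeable function, so one subtask can be swapped without touching the others.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(state):
    # Split the raw text into sentences, then each sentence into word tokens.
    sentences = re.split(r"(?<=[.!?])\s+", state["text"].strip())
    state["sentences"] = [re.findall(r"[A-Za-z]+", s) for s in sentences if s]
    return state


def make_gazetteer_ner(names):
    # Toy character detection (an assumption for this sketch):
    # a token counts as a character mention if it is in a fixed name list.
    names = set(names)

    def ner(state):
        state["mentions"] = [
            [tok for tok in sent if tok in names] for sent in state["sentences"]
        ]
        return state

    return ner


def cooccurrence_graph(state):
    # Static network: edge weight = number of sentences in which
    # both characters are mentioned.
    edges = Counter()
    for mentions in state["mentions"]:
        for a, b in combinations(sorted(set(mentions)), 2):
            edges[(a, b)] += 1
    state["edges"] = dict(edges)
    return state


def run_pipeline(steps, text):
    # Run each step in order; every step reads and extends a shared state dict.
    state = {"text": text}
    for step in steps:
        state = step(state)
    return state


text = "Alice met Bob. Bob argued with Carol. Alice and Bob left."
steps = [tokenize, make_gazetteer_ner(["Alice", "Bob", "Carol"]), cooccurrence_graph]
result = run_pipeline(steps, text)
print(result["edges"])
```

Replacing make_gazetteer_ner with a different detection function, while keeping the other two steps fixed, is one way to observe how a single subtask changes the extracted network.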