Actuarial Applications of Natural Language Processing Using Transformers: Case Studies for Using Text Features in an Actuarial Context

Andreas Troxler, Jürg Schelldorfer

2023-09-25

Abstract: This tutorial demonstrates workflows to incorporate text data into actuarial classification and regression tasks. The main focus is on methods employing transformer-based models. A dataset of car accident descriptions with an average length of 400 words, available in English and German, and a dataset with short property insurance claims descriptions are used to demonstrate these techniques. The case studies tackle challenges related to a multi-lingual setting and long input sequences. They also show ways to interpret model output, to assess and improve model performance, by fine-tuning the models to the domain of application or to a specific prediction task. Finally, the tutorial provides practical approaches to handle classification tasks in situations with no or only few labeled data, including but not limited to ChatGPT. The results achieved by using the language-understanding skills of off-the-shelf natural language processing (NLP) models with only minimal pre-processing and fine-tuning clearly demonstrate the power of transfer learning for practical applications.

Computation and Language

What problem does this paper attempt to address?

This paper addresses how natural language processing (NLP) techniques, especially Transformer-based models, can be used to process text data in actuarial science and apply it to classification and regression tasks. Specifically, it focuses on the following aspects:

1. **Challenges in a multilingual environment**: How to maintain the effectiveness and accuracy of the model when the text data spans multiple languages (such as English and German).
2. **Processing of long input sequences**: How to process long text inputs effectively, for example descriptions of traffic accidents with an average length of 400 words.
3. **Model interpretability**: How to make the model's predictions or classifications more transparent, which is especially important in actuarial science, where decisions must be explainable.
4. **Situations with a small amount of labeled data**: How to handle classification tasks when labeled data is limited or absent, including but not limited to using pre-trained models such as ChatGPT for information extraction.
5. **Model performance evaluation and improvement**: How to evaluate and improve model performance by fine-tuning the model to a specific application domain or prediction task.

The paper demonstrates these techniques on two real datasets:

- **Automobile accident description dataset**: Approximately 7,000 automobile accident descriptions in English (partially translated into German), together with some tabular data (such as the number of vehicles involved and whether there were any casualties).
- **Property insurance claim record dataset**: Approximately 6,000 property insurance claim records. Each record includes the claim amount, a brief English description, and labels covering 9 different disaster types.

Through these case studies, the paper shows how Transformer models can be used to deal with the above-mentioned challenges and provides practical methods for applying them.
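One common way to handle the long-input challenge listed above is to split a description into overlapping windows, encode each window separately, and average the resulting vectors. The sketch below illustrates that general idea only; it is not taken from the paper. The names `split_into_windows`, `mean_pool`, and `toy_encoder` are hypothetical, the window and stride sizes are arbitrary, and the encoder is a stand-in for a real transformer. Real models also count subword tokens rather than words, so actual limits would be applied after tokenization.

```python
from typing import List, Sequence


def split_into_windows(words: Sequence[str], window: int = 128,
                       stride: int = 96) -> List[List[str]]:
    """Split a word list into overlapping windows of at most `window` words.

    Consecutive windows start `stride` words apart, so they overlap by
    `window - stride` words. Sizes here are illustrative assumptions.
    """
    if window <= 0 or stride <= 0 or stride > window:
        raise ValueError("require 0 < stride <= window")
    windows = []
    start = 0
    while True:
        windows.append(list(words[start:start + window]))
        if start + window >= len(words):
            break  # the current window already reaches the end of the text
        start += stride
    return windows


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def toy_encoder(chunk: Sequence[str]) -> List[float]:
    """Hypothetical stand-in for a transformer encoder.

    Returns two simple features (word count, mean word length) so the
    example runs without downloading a model.
    """
    count = len(chunk)
    mean_len = sum(len(w) for w in chunk) / max(count, 1)
    return [float(count), mean_len]


# A synthetic 400-word "description", matching the average length quoted above.
description = ["word"] * 400
chunks = split_into_windows(description)
doc_vector = mean_pool([toy_encoder(c) for c in chunks])
```

In practice `toy_encoder` would be replaced by a call to a pre-trained model that returns one embedding per window, and the pooled `doc_vector` could then feed a downstream classifier or regression model alongside tabular features.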
