Abstract: News is a pertinent source of information on financial risks and stress factors, which nevertheless is challenging to harness due to the sparse and unstructured nature of natural text. We propose an approach based on distributional semantics and deep learning with neural networks to model and link text to a scarce set of bank distress events. Through unsupervised training, we learn semantic vector representations of news articles as predictors of distress events. The predictive model that we learn can signal coinciding stress with an aggregated index at bank or European level, while crucially allowing for automatic extraction of text descriptions of the events, based on passages with high stress levels. The method offers insight that models based on other types of data cannot provide, while offering a general means for interpreting this type of semantic-predictive model. We model bank distress with data on 243 events and 6.6M news articles for 101 large European banks.

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper aims to identify and describe bank distress through news text. Specifically, the authors propose a method based on distributional semantics and deep learning that links news articles to a small number of bank distress events, so that distress can be both predicted and described.

### Main Background Issues

1. **Importance of Timely Information**:
   - The global financial crisis triggered numerous regulatory innovations, but progress in obtaining timely information on bank vulnerabilities and risks has been limited.
   - Accounting data, although rich in information, is reported infrequently and released with a delay.
   - Market data can reflect imbalances, stress, and volatility, but it lacks descriptive information and is limited to publicly listed companies.
2. **Limitations of Existing Methods**:
   - Sentiment analysis typically relies on manually constructed sentiment lexicons, which are incomplete and difficult to adapt to specific tasks.
   - Data-driven methods provide good predictive performance but still leave room for improvement in semantic modeling.
3. **Potential of Text Data**:
   - News text is an important and information-rich source for understanding bank distress, but its sparse and unstructured nature makes it difficult to use.

### Research Objectives

- **Predict Bank Distress**: Extract semantic representations from news text using deep learning models to predict bank distress events.
- **Describe Distress Events**: Provide quantitative predictions and also automatically extract textual descriptions of distress events from the news, which makes the model easier to interpret.

### Method Overview

1. **Data Preparation**:
   - Use data on 101 large European banks, covering 243 distress events from 2007Q3 to 2012Q2.
   - Collect 6.6M news articles from the Reuters online archives and identify the articles that relate to the target banks.
2. **Deep Learning Model**:
   - **Pre-training**: Use the Distributed Memory Model to learn document vectors that capture semantic information in news reports. This step is unsupervised.
   - **Supervised Learning**: Train a neural network model to predict bank distress events from the document vectors.
3. **Stress Index and Description Extraction**:
   - Generate a bank stress index by aggregating article-level stress scores.
   - Use the trained semantic representations and the strength of the prediction signal to extract highly relevant paragraphs and keywords from articles, providing detailed event descriptions.

### Experimental Results

- **Predictive Performance**: The model achieved an area under the ROC curve of 0.710 on the test set, compared with 0.5 for a random classifier.
- **Stress Index**: The generated stress index reflects the temporal dynamics of bank distress, especially during the 2008 financial crisis.
- **Description Extraction**: By extracting high-ranking keywords and article paragraphs, the model provided detailed descriptions of distress events during specific periods, which aids interpretability.

### Conclusion

The method proposed in this paper predicts bank distress and also automatically extracts textual descriptions of distress events, providing new tools and perspectives for understanding and responding to bank distress.
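
The stress index and the reported ROC metric can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the summary does not say how article-level scores are aggregated (a plain mean per bank and quarter is used here), the bank names and scores are invented, and the helper names `stress_index` and `roc_auc` are hypothetical rather than taken from the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical article-level records: (bank, quarter, stress score in [0, 1]).
# All values are invented for illustration; none come from the paper.
ARTICLES = [
    ("BankA", "2008Q3", 0.82),
    ("BankA", "2008Q3", 0.64),
    ("BankA", "2008Q4", 0.91),
    ("BankB", "2008Q3", 0.12),
    ("BankB", "2008Q4", 0.30),
]


def stress_index(articles):
    """Average article-level stress scores per (bank, quarter).

    The mean is an assumed aggregation rule, not the paper's stated one.
    """
    grouped = defaultdict(list)
    for bank, quarter, score in articles:
        grouped[(bank, quarter)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one,
    counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    for (bank, quarter), value in sorted(stress_index(ARTICLES).items()):
        print(f"{bank} {quarter}: {value:.2f}")
    # Toy labels (1 = distress) and scores, again purely illustrative.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

A value of 0.5 from `roc_auc` corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the 0.710 figure above should be read. The pairwise loop is quadratic in the number of examples, so a rank-based formulation would be preferable on a corpus of realistic size.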

Detect & Describe: Deep learning of bank stress in the news

Bank distress in the news: Describing events through deep learning

Deep learning bank distress from news and numerical financial data

Deep Learning for Assessing Banks’ Distress from News and Numerical Financial Data

Management of Norovirus gastroenteritis in the community.

Predicting Distresses using Deep Learning of Text Segments in Annual Reports

Textual Data Mining for Financial Fraud Detection: A Deep Learning Approach

Why do banks fail? An investigation via text mining

A deep learning approach of financial distress recognition combining text

Predicting financial distress using current reports: A novel deep learning method based on user-response-guided attention

Deep Learning based Topic Analysis on Financial Emerging Event Tweets

Stress detection using natural language processing and machine learning over social interactions

Improving financial distress prediction using textual sentiment of annual reports

From Text to Bank Interrelation Maps

Research on Deep Learning-Based Financial Risk Prediction

A novel semisupervised learning method with textual information for financial distress prediction

Detection of Temporality at Discourse Level on Financial News by Combining Natural Language Processing and Machine Learning

Large-Scale Textual Datasets and Deep Learning for the Prediction of Depressed Symptoms

Can Text-Based Statistical Models Reveal Impending Banking Crises?

Interfacing learning methods for anomaly detection in multi-country financial stress indicators

A Novel Distributed Representation of News (DRNews) for Stock Market Predictions