BioFinBERT: Finetuning Large Language Models (LLMs) to Analyze Sentiment of Press Releases and Financial Text Around Inflection Points of Biotech Stocks

Valentina Aparicio, Daniel Gordon, Sebastian G. Huayamares, Yuhuai Luo
2024-01-20
Abstract: Large language models (LLMs) are deep learning algorithms being used to perform natural language processing tasks in various fields, from social sciences to finance and biomedical sciences. Developing and training a new LLM can be very computationally expensive, so it is becoming a common practice to take existing LLMs and finetune them with carefully curated datasets for desired applications in different fields. Here, we present BioFinBERT, a finetuned LLM to perform financial sentiment analysis of public text associated with stocks of companies in the biotechnology sector. The stocks of biotech companies developing highly innovative and risky therapeutic drugs tend to respond very positively or negatively upon a successful or failed clinical readout or regulatory approval of their drug, respectively. These clinical or regulatory results are disclosed by the biotech companies via press releases, which are followed by a significant stock response in many cases. In our attempt to design an LLM capable of analyzing the sentiment of these press releases, we first finetuned BioBERT, a biomedical language representation model designed for biomedical text mining, using financial textual databases. Our finetuned model, termed BioFinBERT, was then used to perform financial sentiment analysis of various biotech-related press releases and financial text around inflection points that significantly affected the price of biotech stocks.
General Finance, Computational Finance, Trading and Market Microstructure
What problem does this paper attempt to address?
This paper asks how large language models (LLMs) can be used to analyze the sentiment of biotech companies' press releases and financial reports in order to quantify and predict the stock price movements of these companies. Specifically, the researchers are concerned with how biotech stock prices respond after a company announces an important clinical, regulatory, financial, or commercial event. These events are usually announced through press releases and may lead to significant increases or decreases in stock prices. The main objectives of the study include:

1. **Evaluating FinBERT**: First, use FinBERT to conduct sentiment analysis on press releases and 10-Q reports before and after important inflection points to determine whether it can accurately predict stock price movements.
2. **Fine-tuning BioBERT**: Further train BioBERT so that it can perform financial sentiment analysis, with particular attention to its performance on press releases containing clinical and biomedical terms.
3. **Proposing a trading strategy**: Based on the sentiment analysis results for positive value-driven events in press releases, design and back-test a trading strategy, and evaluate its performance under different holding periods.

Through these objectives, the study aims to explore how large language models can improve the understanding and prediction of biotech stock market dynamics.
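The third objective, back-testing a sentiment-driven strategy under different holding periods, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names `holding_period_return` and `backtest`, the event format, the sentiment threshold, the rule of entering at the event-day price, and the toy prices and scores. The sentiment scores stand in for the output of whichever classifier is used.

```python
from statistics import mean


def holding_period_return(prices, entry_idx, holding_days):
    """Simple return from buying at prices[entry_idx] and selling holding_days later.

    If the holding period runs past the end of the series, exit at the last price.
    """
    exit_idx = min(entry_idx + holding_days, len(prices) - 1)
    return prices[exit_idx] / prices[entry_idx] - 1.0


def backtest(events, prices, holding_days, threshold=0.5):
    """Average holding-period return over events with sentiment above threshold.

    events: list of dicts with hypothetical keys "ticker", "day", "sentiment".
    prices: dict mapping ticker -> list of daily prices.
    Returns None when no event passes the threshold.
    """
    returns = [
        holding_period_return(prices[e["ticker"]], e["day"], holding_days)
        for e in events
        if e["sentiment"] > threshold
    ]
    return mean(returns) if returns else None


# Toy data (fabricated for illustration only, not results from the study).
toy_prices = {
    "AAA": [10.0, 12.0, 15.0, 12.0],
    "BBB": [20.0, 18.0, 16.0, 10.0],
}
toy_events = [
    {"ticker": "AAA", "day": 0, "sentiment": 0.9},   # positive press release
    {"ticker": "BBB", "day": 0, "sentiment": -0.7},  # negative press release
]

# Compare the same long-only rule across several holding periods.
for days in (1, 2, 3):
    print(days, backtest(toy_events, toy_prices, days))
```

Only the positive event is traded, so the loop compares one entry rule across holding periods of one, two, and three days. A realistic back-test would also need to account for when the press release became public relative to the entry price, transaction costs, and position sizing, none of which this sketch models.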