GPT-Signal: Generative AI for Semi-automated Feature Engineering in the Alpha Research Process

Yining Wang, Jinman Zhao, Yuri Lawryshyn
2024-10-24
Abstract: In the trading process, financial signals often imply the time to buy and sell assets to generate excess returns compared to a benchmark (e.g., an index). Alpha is the portion of an asset's return that is not explained by exposure to this benchmark, and the alpha research process is a popular technique aiming at developing strategies to generate alphas and gain excess returns. Feature Engineering, a significant pre-processing procedure in machine learning and data analysis that helps extract and create transformed features from raw data, plays an important role in algorithmic trading strategies and the alpha research process. With the recent development of Generative Artificial Intelligence (Gen AI) and Large Language Models (LLMs), we present a novel way of leveraging GPT-4 to generate new return-predictive formulaic alphas, making alpha mining a semi-automated process, and saving time and energy for investors and traders.
Computational Engineering, Finance, and Science
What problem does this paper attempt to address?
This paper addresses how to automatically generate new financial signals (alpha signals) that predict stock returns, so that investors and traders can improve their excess returns. Specifically, it uses large language models (LLMs), in particular GPT-4, to semi-automate feature engineering: generating new, meaningful financial signals and evaluating their performance.

### Specific Background of the Problem

1. **Limitations of Traditional Financial Signals**
   - Traditional signals such as the price-to-earnings ratio (P/E), the price-to-book ratio (P/B), and return on equity (ROE) are widely used in stock market analysis and prediction, but they are mature, and it is difficult to extract further new information from them.
2. **Importance of Feature Engineering**
   - Feature engineering plays an important role in machine learning and data analysis, especially in algorithmic trading strategies and alpha research. It extracts and creates transformed features from raw data, thereby improving the predictive ability of the model.
3. **Challenges of Existing Methods**
   - Historically, feature engineering and formulaic alpha research have relied either on human intuition and experience or on complex algorithms. These approaches can be overly subjective or too time-consuming, and they require in-depth professional knowledge.

### Solutions in the Paper

The paper proposes using generative artificial intelligence (Gen AI) and large language models (LLMs), in particular GPT-4, to generate new signals for predicting stock returns. The advantages of this method include:

1. **Automatically Generate New Signals**
   - GPT-4 can generate new, predictive financial signals based on the provided historical data and existing signals.
   - These new signals are not simply a combination of existing signals, but are created through non-linear and high-order combinations.
2. **Improve Efficiency and Innovation Ability**
   - The method can significantly save investors and traders time and energy, while providing innovative financial signals to help them make better decisions.
3. **Wide Applicability**
   - The newly generated signals outperform the existing benchmark signals in different industries (such as information technology, healthcare, and energy), showing wide applicability and robustness.

### Experimental Verification

To verify the effectiveness of the newly generated signals, the paper runs experiments on data from companies in the S&P 500 index. The new signals are evaluated with methods such as Fama-MacBeth two-step regression and a Spearman rank correlation matrix. The results show that the new signals perform well in multiple industries and can significantly improve the predictive ability of the model.

### Summary

In general, by introducing large language models such as GPT-4, the paper semi-automates feature engineering, generates new, predictive financial signals, and provides more efficient and innovative tools for investors and traders.
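
The two evaluation tools named under Experimental Verification can be illustrated with a minimal sketch. It is not the paper's implementation: it uses synthetic data in place of S&P 500 data, a single signal, and a simplified, characteristic-style Fama-MacBeth procedure in which the signal itself is the regressor (one cross-sectional regression per period, then a time-series average of the slopes). All function names (`ranks`, `spearman`, `ols_slope`, `fama_macbeth`, `simulate`) and parameter values are hypothetical, chosen only for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def ols_slope(x, y):
    """Slope of a univariate OLS regression of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def fama_macbeth(signals, returns):
    """Average per-period cross-sectional slope and its t-statistic.

    signals[t][i] is the signal of stock i in period t and returns[t][i]
    is the return it is meant to predict (an assumed data layout).
    """
    slopes = [ols_slope(s, r) for s, r in zip(signals, returns)]
    average = mean(slopes)
    t_stat = average / (stdev(slopes) / sqrt(len(slopes)))
    return average, t_stat


def simulate(periods=60, stocks=50, true_slope=0.02, noise=0.05, seed=0):
    """Synthetic panel: return = true_slope * signal + Gaussian noise."""
    rng = random.Random(seed)
    signals, returns = [], []
    for _ in range(periods):
        s = [rng.gauss(0, 1) for _ in range(stocks)]
        r = [true_slope * x + rng.gauss(0, noise) for x in s]
        signals.append(s)
        returns.append(r)
    return signals, returns


if __name__ == "__main__":
    signals, returns = simulate()
    average, t_stat = fama_macbeth(signals, returns)
    # Mean per-period rank correlation between the signal and returns.
    rank_ic = mean(spearman(s, r) for s, r in zip(signals, returns))
    print(f"mean slope={average:.4f}  t-stat={t_stat:.2f}  rank IC={rank_ic:.3f}")
```

A large t-statistic on the averaged slope suggests the signal is priced consistently across periods rather than in a few of them. The same `spearman` function can be applied to every pair of signals to build a rank correlation matrix, which is one way to check that a generated signal is not just a monotone transformation of an existing one.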