Automatic Financial Feature Construction

Jie Fang, Shutao Xia, Jianwu Lin, Yong Jiang
DOI: https://doi.org/10.48550/arXiv.1912.06236
2020-10-03
Abstract: In the automatic financial feature construction task, the state-of-the-art technique uses reverse Polish expressions to represent features and then uses genetic programming (GP) to evolve them. In this paper, we propose a new framework based on neural networks, the alpha discovery neural network (ADNN). We make several contributions. First, we make full use of neural networks' strength in feature extraction to construct highly informative features. Second, we use domain knowledge to design the objective function, batch size, and sampling rules. Third, we use pre-training to replace the GP evolution process. According to the universal approximation theorem for neural networks, pre-training can conduct a more effective and explainable evolution process. Experiments show that ADNN produces more diversified and more informative features than GP. In addition, ADNN can serve as a data augmentation algorithm that further improves the performance of financial features constructed by GP.
Machine Learning, Pricing of Securities, Trading and Market Microstructure
What problem does this paper attempt to address?
This paper addresses automatic feature construction in finance, in particular how to automatically generate useful, diverse, and information-rich features from financial time-series data. Specifically:

1. **Limitations of existing methods**:
   - The current state-of-the-art approach uses genetic programming (GP) with reverse Polish notation to represent features and generates new features through an evolutionary process. However, this method has the following problems:
     - **High feature similarity**: Features generated by GP tend to be very similar and lack diversity.
     - **Insufficient information**: These features do not contain more useful information than those constructed by human experts.
     - **Low evolutionary efficiency**: The evolutionary process of GP behaves more like a search than an effective evolution.
2. **Proposed new method**:
   - The paper proposes a new neural-network-based framework, the Alpha Discovery Neural Network (ADNN), to overcome the above problems. The main improvements of ADNN include:
     - **Utilizing the feature extraction ability of neural networks**: Build more informative features using the feature extraction ability of deep neural networks.
     - **Designing a reasonable optimization objective function**: Use domain knowledge to design the objective function, batch size, and sampling rules.
     - **Using pre-training instead of the evolutionary process of genetic programming**: Based on the universal approximation theorem for neural networks, use pre-training to replace the GP evolutionary process and obtain a more effective and interpretable evolution.
     - **Introducing model stealing techniques**: Bring sufficient diversity to the network through model stealing techniques.
     - **Quantifying feature diversity**: Propose multiple methods to quantify the diversity of the generated features.
3. **Experimental verification**:
   - Experiments show that ADNN generates more diverse and more information-rich features than GP. In addition, ADNN can be used as a data augmentation algorithm to further improve the performance of existing technical indicators.

In summary, the goal of this paper is to develop a more efficient, more interpretable, and more diverse automatic financial feature construction method in order to improve opportunity discovery and investment returns in financial trading.
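To make the GP baseline described above more concrete, the following is a minimal sketch of how a feature written in reverse Polish notation could be evaluated over price series, together with one simple way to measure how similar a set of features is. Everything in it is an illustrative assumption rather than the paper's implementation: the operator table `OPS`, the functions `eval_rpn` and `mean_abs_correlation`, the field names `close` and `open`, and the choice of mean absolute Pearson correlation as a diversity measure are all hypothetical.

```python
import numpy as np

# Hypothetical operator table (name -> (arity, function)); not taken from the paper.
OPS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
    "neg": (1, lambda a: -a),
}


def eval_rpn(tokens, data):
    """Evaluate a reverse-Polish feature expression over named input series."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            arity, fn = OPS[tok]
            if len(stack) < arity:
                raise ValueError(f"not enough operands for {tok!r}")
            args = stack[-arity:]      # operands in the order they were pushed
            del stack[-arity:]
            stack.append(fn(*args))
        elif tok in data:
            stack.append(np.asarray(data[tok], dtype=float))
        else:
            stack.append(float(tok))   # anything else is treated as a numeric constant
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


def mean_abs_correlation(features):
    """Mean absolute pairwise Pearson correlation; lower suggests more diversity."""
    corr = np.corrcoef(np.vstack(features))
    upper = corr[np.triu_indices_from(corr, k=1)]  # each unordered pair once
    return float(np.mean(np.abs(upper)))


# Toy data, made up for illustration only.
data = {"close": [10, 11, 12, 11], "open": [9, 11, 11, 12]}

f1 = eval_rpn(["close", "open", "sub"], data)              # close - open
f2 = eval_rpn(["close", "open", "sub", "2", "mul"], data)  # 2 * (close - open)
f3 = eval_rpn(["open", "neg"], data)                       # -open

print(mean_abs_correlation([f1, f2]))  # rescaled copy of f1: maximally similar
print(mean_abs_correlation([f1, f3]))  # a different expression: less similar
```

In this toy case `f2` is only a rescaling of `f1`, so the two are perfectly correlated. That is the kind of redundancy the summary refers to as high feature similarity, and a correlation-based score is one plausible way to detect it.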