TemporalPaD: a reinforcement-learning framework for temporal feature representation and dimension reduction

Xuechen Mu, Zhenyu Huang, Kewei Li, Haotian Zhang, Xiuli Wang, Yusi Fan, Kai Zhang, Fengfeng Zhou
2024-09-27
Abstract: Recent advancements in feature representation and dimension reduction have highlighted their crucial role in enhancing the efficacy of predictive modeling. This work introduces TemporalPaD, a novel end-to-end deep learning framework designed for temporal pattern datasets. TemporalPaD integrates reinforcement learning (RL) with neural networks to achieve concurrent feature representation and feature reduction. The framework consists of three cooperative modules: a Policy Module, a Representation Module, and a Classification Module, structured based on the Actor-Critic (AC) framework. The Policy Module, responsible for dimensionality reduction through RL, functions as the actor, while the Representation Module for feature extraction and the Classification Module collectively serve as the critic. We comprehensively evaluate TemporalPaD using 29 UCI datasets, a well-known benchmark for validating feature reduction algorithms, through 10 independent tests and 10-fold cross-validation. Additionally, given that TemporalPaD is specifically designed for time series data, we apply it to a real-world DNA classification problem involving enhancer category and enhancer strength. The results demonstrate that TemporalPaD is an efficient and effective framework for achieving feature reduction, applicable to both structured data and sequence datasets. The source code of the proposed TemporalPaD is freely available as supplementary material to this article and at http://www.healthinformaticslab.org/supp/.
Machine Learning, Artificial Intelligence, Genomics
What problem does this paper attempt to address?
This paper addresses how to perform feature representation and dimension reduction effectively on time-series data. Specifically, the authors propose TemporalPaD, a new end-to-end deep-learning framework that performs feature representation and feature dimension reduction simultaneously and is designed for time-series data.

### Problem Background

With the development of big-data technology and storage solutions, data volumes have grown exponentially. This growth strains data-storage capacity and demands efficient algorithms for extracting valuable information from massive datasets. Feature Representation (FR) and Dimension Reduction (DR) are key techniques for dealing with this problem:

- **Feature Representation**: Reduce data redundancy by extracting the most relevant information.
- **Dimension Reduction**: Simplify the model by reducing the number of features and improve computational efficiency.

However, existing methods for time-series data fall short in feature representation and dimension reduction, and in particular lack a unified solution that integrates the two tasks.

### Proposal of TemporalPaD

To address these problems, the authors propose TemporalPaD, a framework based on Reinforcement Learning (RL) and neural networks for feature representation and dimension reduction of time-series data. The main innovations of TemporalPaD include:

1. **Integrating Reinforcement Learning and Neural Networks**: Use RL to guide the neural network in feature selection and dimension reduction.
2. **Modular Design**: The framework consists of three cooperating modules, structured on the Actor-Critic (AC) framework:
   - **Policy Module**: Responsible for dimension reduction through RL, acting as the "actor".
   - **Representation Module**: Responsible for feature extraction; together with the Classification Module, it serves as the "critic".
   - **Classification Module**: Responsible for classification tasks; together with the Representation Module, it serves as the "critic".
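
The actor/critic split described above can be illustrated with a toy sketch. This is not the authors' implementation: the per-feature Bernoulli policy, the nearest-centroid stand-in for the critic, the `penalty` term, and all function names (`sample_mask`, `critic_reward`, `train_policy`, `make_toy_data`) are assumptions made here for illustration. It only shows the general idea of a policy that samples keep/drop decisions and is updated with REINFORCE from a classification-based reward.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_mask(logits, rng):
    """Actor (hypothetical): sample a keep (1) / drop (0) action per feature."""
    probs = [sigmoid(v) for v in logits]
    mask = [1 if rng.random() < p else 0 for p in probs]
    return mask, probs


def critic_reward(mask, X, y, penalty=0.02):
    """Stand-in critic: nearest-centroid accuracy on the kept features,
    minus a small cost per kept feature (assumed reward shaping)."""
    kept = [i for i, m in enumerate(mask) if m]
    if not kept:
        return 0.0
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[i] for r in rows) / len(rows) for i in kept]
    correct = 0
    for x, t in zip(X, y):
        pred = min(
            centroids,
            key=lambda c: sum((x[i] - m) ** 2 for i, m in zip(kept, centroids[c])),
        )
        correct += pred == t
    return correct / len(X) - penalty * len(kept)


def train_policy(X, y, steps=300, lr=0.5, seed=0):
    """REINFORCE with a moving-average baseline over per-feature logits."""
    rng = random.Random(seed)
    logits = [0.0] * len(X[0])
    baseline = 0.0
    for _ in range(steps):
        mask, probs = sample_mask(logits, rng)
        reward = critic_reward(mask, X, y)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # Gradient of log Bernoulli(a; sigmoid(logit)) w.r.t. the logit is (a - p).
        logits = [v + lr * advantage * (a - p) for v, a, p in zip(logits, mask, probs)]
    return [sigmoid(v) for v in logits]


def make_toy_data(n=200, seed=1):
    """Toy data: feature 0 carries the label; features 1-3 are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([label * 2.0 + rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(3)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    keep_probs = train_policy(X, y)
    print([round(p, 2) for p in keep_probs])
```

Calling `train_policy` returns one keep-probability per feature. In the framework described above the critic is a learned Representation Module plus Classification Module; the fixed nearest-centroid scorer here is only a placeholder that keeps the sketch dependency-free.
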
### Experimental Verification

To verify the effectiveness of TemporalPaD, the authors conducted the following experiments:

- **UCI Dataset Evaluation**: Evaluate on 29 UCI benchmark datasets through 10 independent tests and 10-fold cross-validation, covering multiple fields (such as health, business, and physics).
- **DNA Classification Problem**: Apply TemporalPaD to a real-world DNA classification problem involving enhancer category and enhancer strength.

### Results

The experimental results indicate that TemporalPaD achieves feature reduction efficiently and effectively on both structured data and sequence datasets. It reduces the feature dimension while maintaining or even improving classification performance.

In conclusion, the paper proposes the TemporalPaD framework as an efficient and effective solution for feature representation and dimension reduction in time-series data.
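
The evaluation protocol mentioned under Experimental Verification can be read as 10 independent repetitions of 10-fold cross-validation. The sketch below shows one way to set that up. The source does not say how folds were drawn, whether they were stratified, or which seeds were used, so the shuffling scheme, the seeds, and the helper names (`k_fold_indices`, `repeated_cv`, `majority_baseline`) are assumptions.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed):
    """Shuffle indices once, then deal them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv(X, y, fit_predict, k=10, repeats=10):
    """Return one mean accuracy per repetition.

    fit_predict(X_train, y_train, X_test) is any user-supplied model
    that returns a list of predicted labels for X_test.
    """
    scores = []
    for seed in range(repeats):  # a different shuffle per repetition (assumed)
        fold_acc = []
        for held_out in k_fold_indices(len(X), k, seed):
            test = set(held_out)
            train = [i for i in range(len(X)) if i not in test]
            preds = fit_predict(
                [X[i] for i in train],
                [y[i] for i in train],
                [X[i] for i in held_out],
            )
            hits = sum(p == y[i] for p, i in zip(preds, held_out))
            fold_acc.append(hits / len(held_out))
        scores.append(mean(fold_acc))
    return scores


def majority_baseline(X_train, y_train, X_test):
    """Placeholder model: always predict the most common training label."""
    top = max(set(y_train), key=y_train.count)
    return [top] * len(X_test)
```

Any classifier with the same `fit_predict` signature can replace `majority_baseline`. Reporting the mean and spread of the returned per-repetition scores gives a summary comparable across feature-reduction methods.
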