Reinforcement Feature Transformation for Polymer Property Performance Prediction

Xuanming Hu, Dongjie Wang, Wangyang Ying, Yanjie Fu
2024-09-24
Abstract: Polymer property performance prediction aims to forecast specific features or attributes of polymers, which has become an efficient approach to measuring their performance. However, existing machine learning models face challenges in effectively learning polymer representations due to low-quality polymer datasets, which consequently impact their overall performance. This study focuses on improving polymer property performance prediction tasks by reconstructing an optimal and explainable descriptor representation space. Nevertheless, prior research such as feature engineering and representation learning can only partially solve this task since they are either labor-intensive or unexplainable. This raises two issues: 1) automatic transformation and 2) explainable enhancement. To tackle these issues, we propose our unique Traceable Group-wise Reinforcement Generation Perspective. Specifically, we redefine the reconstruction of the representation space into an interactive process, combining nested generation and selection. Generation creates meaningful descriptors, and selection eliminates redundancies to control descriptor sizes. Our approach employs cascading reinforcement learning with three Markov Decision Processes, automating descriptor and operation selection, and descriptor crossing. We utilize a group-wise generation strategy to explore and enhance reward signals for cascading agents. Ultimately, we conduct experiments to demonstrate the effectiveness of our proposed framework.
Machine Learning
What problem does this paper attempt to address?
This paper aims to solve two key problems in polymer performance prediction:

1. **Automatic Transformation**: How to automatically transform descriptors to enhance polymer performance prediction?
2. **Explainable Enhancement**: How to ensure that the reconstructed descriptor space is interpretable?

### Background and Challenges

Polymer performance prediction is an important research area that evaluates polymers by predicting specific features or properties. However, existing machine-learning models struggle with low-quality polymer datasets, which degrades their overall performance. Traditional feature-engineering methods can partially solve this problem, but they often require a great deal of manual intervention and lack interpretability. Existing representation-learning methods can extract meaningful latent representations, but these representations are usually opaque and difficult to interpret. This is a major obstacle in polymer performance prediction, which requires both high prediction accuracy and reliable understanding.

### Solutions

To solve the above problems, the authors propose a framework named "Traceable Group-wise Reinforcement Generation Perspective". The main features of this framework are as follows:

1. **Iterative Generation and Selection Strategy**:
   - **Generation Step**: Generate new descriptors from one or two descriptors through mathematical transformations (such as \( f_1\times f_2 \), \( f_1 - f_2 \), \( \sin(f_1) \), etc.).
   - **Selection Step**: Control the size of the descriptor set, eliminate redundant descriptors, and ensure that the generated descriptor set is both effective and interpretable.
2. **Three-layer Markov Decision Processes (MDPs)**:
   - **Descriptor Group Selection**: Select two descriptor groups and one operation through three cascading agents, respectively. Each agent makes decisions based on the selection results of the previous agent, forming an automatically correlated and collaborative decision-making structure.
   - **Operation Selection**: Select appropriate mathematical operations (such as squaring, exponentiation, logarithm, addition, multiplication, division, etc.) to generate new descriptors.
   - **Descriptor Group Crossing**: Generate multiple new descriptors through the crossing between descriptor groups to accelerate the reconstruction of the representation space.
3. **Optimization Objectives**:
   - **Interpretability**: Provide a traceable generation process so that the meaning of each generated descriptor can be understood.
   - **Self-Optimization**: Automatically generate the optimal descriptor set without the need for professional knowledge in the field of materials science.
   - **Efficiency Improvement**: Accelerate generation and exploration and improve learning efficiency through group operations and reward signal enhancement.

### Method Overview

1. **Descriptor Clustering**: Cluster the original descriptors into different groups by maximizing the within-group feature similarity and the between-group feature difference.
2. **Descriptor Generation and Selection**: Use the cascading reinforcement learning method to guide three agents to select informative descriptor groups and operations to generate new descriptors.
3. **Performance Evaluation and Feedback**: Use the generated descriptors for the polymer performance prediction task, collect the prediction accuracy as reward feedback, and update the policy parameters of the agents.
4. **Redundancy Elimination**: Reduce redundant descriptors through the selection step and continue iterating until the maximum limit is reached.
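
The generation step and group crossing described above can be illustrated with a minimal Python sketch. The operation tables, the function names (`cross_groups`, `transform_group`), and the toy descriptor values are assumptions made for illustration, not the authors' implementation. The point is only that naming each new descriptor after its operation and parent descriptors keeps the generation process traceable.

```python
import math
from itertools import product

# Hypothetical operation tables; the summary mentions operations of this kind.
UNARY_OPS = {
    "square": lambda x: x * x,
    "sin": math.sin,
}
BINARY_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def cross_groups(descriptors, group_a, group_b, op_name):
    """Apply one binary operation to every pair drawn from two descriptor groups.

    `descriptors` maps a descriptor name to its list of per-sample values.
    Each generated descriptor is named after the operation and its parents,
    so its origin can be read directly from the name.
    """
    op = BINARY_OPS[op_name]
    generated = {}
    for a, b in product(group_a, group_b):
        name = f"{op_name}({a},{b})"
        generated[name] = [op(x, y) for x, y in zip(descriptors[a], descriptors[b])]
    return generated


def transform_group(descriptors, group, op_name):
    """Apply one unary operation to every descriptor in a single group."""
    op = UNARY_OPS[op_name]
    return {f"{op_name}({a})": [op(x) for x in descriptors[a]] for a in group}


# Toy data: three descriptors measured on three samples (illustrative values).
descriptors = {
    "f1": [1.0, 2.0, 3.0],
    "f2": [0.5, 0.5, 2.0],
    "f3": [4.0, 1.0, 0.0],
}

crossed = cross_groups(descriptors, ["f1"], ["f2", "f3"], "mul")
squared = transform_group(descriptors, ["f1"], "square")
print(sorted(crossed))
print(squared)
```

Crossing a group of size m with a group of size n yields m × n new descriptors in a single step, which is why group-wise generation explores the space faster than generating one descriptor at a time.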
### Formulas

- **Descriptor Group-Group Distance**:

  \[ d_{\text{is}}(C_i, C_j)=\frac{1}{|C_i|\cdot|C_j|}\sum_{f_i\in C_i}\sum_{f_j\in C_j}\frac{|MI(f_i, y)-MI(f_j, y)|}{MI(f_i, f_j)+\epsilon} \]

  where \( C_i \) and \( C_j \) are two different descriptor groups, \( |C_i| \) and \( |C_j| \) are the numbers of descriptors in \( C_i \) and \( C_j \) respectively, and \( f_