Improved Long Short-Term Memory-based Wastewater Treatment Simulators for Deep Reinforcement Learning

Esmaeel Mohammadi, Daniel Ortiz-Arroyo, Mikkel Stokholm-Bjerregaard, Aviaja Anna Hansen, Petar Durdevic
2024-03-22
Abstract: Although Deep Reinforcement Learning (DRL) has shown outstanding results in robotics and games, it remains challenging to apply to the optimization of industrial processes such as wastewater treatment. One of the challenges is the lack of a simulation environment that represents the actual plant accurately enough to train DRL policies. The stochasticity and non-linearity of wastewater treatment data lead to unstable and incorrect model predictions over long time horizons. One possible reason for this incorrect simulation behavior is compounding error, the accumulation of errors throughout the simulation. Compounding error occurs because the model uses its own predictions as inputs at each time step, so the error between the actual data and the prediction accumulates as the simulation continues. We implemented two methods to improve the trained models for wastewater treatment data, which resulted in more accurate simulators: (1) using the model's own predictions as inputs during training as a means of correction, and (2) changing the loss function to account for the long-term predicted shape (dynamics). The experimental results showed that these methods can improve simulator behavior, measured by Dynamic Time Warping over a one-year simulation, by up to 98% compared to the base model. These improvements show significant promise for creating simulators of biological processes that do not need pre-existing knowledge of the process but instead depend exclusively on time series data obtained from the system.
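The compounding error described in the abstract can be illustrated with a toy example. The sketch below is not the paper's LSTM or plant data: `true_step` and `model_step` are hypothetical scalar systems chosen only to show how a small one-step model error grows when the model is fed its own predictions.

```python
def true_step(x, u):
    # Hypothetical "plant": stable first-order dynamics (an assumption for illustration).
    return 0.95 * x + 0.10 * u


def model_step(x, u):
    # Hypothetical learned model with a small parameter error (0.97 instead of 0.95).
    return 0.97 * x + 0.10 * u


def one_step_errors(x0, inputs):
    """Absolute error when the model is always fed the true previous state."""
    errors, x = [], x0
    for u in inputs:
        x_next = true_step(x, u)
        errors.append(abs(model_step(x, u) - x_next))
        x = x_next
    return errors


def rollout_errors(x0, inputs):
    """Absolute error when the model is fed its own previous prediction (simulator mode)."""
    errors, x_true, x_sim = [], x0, x0
    for u in inputs:
        x_true = true_step(x_true, u)
        x_sim = model_step(x_sim, u)
        errors.append(abs(x_sim - x_true))
    return errors


inputs = [1.0] * 50  # constant control input, chosen arbitrarily
open_loop = one_step_errors(1.0, inputs)
closed_loop = rollout_errors(1.0, inputs)
print(f"one-step error at final step: {open_loop[-1]:.4f}")
print(f"rollout error at final step:  {closed_loop[-1]:.4f}")
```

Both error sequences start from the same value at the first step, but only the rollout error keeps growing, because each prediction becomes the next input.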
Machine Learning, Artificial Intelligence, Systems and Control
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper aims to address the issue of inaccurate simulation environments encountered when applying Deep Reinforcement Learning (DRL) in wastewater treatment processes. Specifically, existing simulators face instability and error accumulation problems when predicting the long-term dynamic behavior of wastewater treatment systems. These issues are mainly due to the randomness and non-linear characteristics of wastewater treatment data, causing models to perform poorly in long-term predictions. The proposed method in the paper aims to improve the accuracy of simulators by enhancing the training model, thereby better supporting the application of DRL algorithms in optimizing wastewater treatment.

### Main Contributions

1. **Improved Single-Step Prediction Model**: Transforming a single-step prediction model based on actual data into a simulator capable of generating system states.
2. **Impact of Training Structures**: Investigating the impact of different training structures (including control and random methods) on model adaptability and learning efficiency, with results showing that introducing randomness can significantly improve the simulation accuracy of the model.
3. **Combination of Shape Loss and Temporal Loss**: Combining shape loss and temporal loss in the model improvement steps to study their impact on the simulator.
4. **Creating Simulators for Wastewater Treatment Processes Using Only Time Series Data from SCADA Systems**: These simulators can mimic the behavior of actual treatment plants and are used to train deep reinforcement learning algorithms.

### Method Overview

1. **Data and Methods**:
   - Using the dataset from Denmark's Kolding Central WWTP, data preprocessing is performed, including normalization and feature selection.
   - Using Long Short-Term Memory (LSTM) models to represent the dynamic system and training it as a simulator.
2. **Model Transformation into Simulator**:
   - At each time step, the simulator uses its predicted state as input to predict the next state.
   - By introducing the Data as Demonstrator (DaD) method, the accumulation of prediction errors is reduced.
3. **Iterative Improvement**:
   - Optimizing the trained model under new training methods to reduce prediction errors in the simulation.
   - Introducing randomness to make the prediction time range a random variable, enhancing the model's robustness and flexibility.
4. **Experimental Setup**:
   - Designing four experiments to study the impact of different lengths and continuity of training batches on model performance.
   - Evaluating the improvement of the model by calculating simulation loss.

### Conclusion

Through these improvement methods, the paper demonstrates the possibility of creating more accurate simulators in the wastewater treatment process. These simulators can not only better predict the behavior of the system but also support the training of deep reinforcement learning algorithms, thereby optimizing the wastewater treatment process.
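The idea of feeding the model's own predictions back into training, with a randomly drawn prediction horizon, can be sketched roughly as follows. This is a minimal toy under stated assumptions, not the paper's implementation: a scalar least-squares model stands in for the LSTM, `plant` is a hypothetical nonlinear system, and `fit_linear`, `dad_pairs`, and `rollout_error` are illustrative names. The correction step follows the general Data as Demonstrator idea of pairing each predicted state with the true next state.

```python
import random


def plant(x, u):
    # Hypothetical nonlinear system standing in for the real process (assumption).
    return 0.9 * x + 0.2 * u - 0.05 * x * x


def fit_linear(pairs):
    """Least-squares fit of x_next ~ a*x + b*u (toy stand-in for training the LSTM)."""
    sxx = sum(x * x for (x, u), _ in pairs)
    sxu = sum(x * u for (x, u), _ in pairs)
    suu = sum(u * u for (x, u), _ in pairs)
    sxy = sum(x * y for (x, u), y in pairs)
    suy = sum(u * y for (x, u), y in pairs)
    det = sxx * suu - sxu * sxu
    return (sxy * suu - suy * sxu) / det, (suy * sxx - sxy * sxu) / det


def dad_pairs(model, states, inputs, rng, max_horizon):
    """Roll the model out for a random horizon from a random start and pair each
    predicted state with the TRUE next state, giving corrective training examples."""
    a, b = model
    start = rng.randrange(len(inputs) - 1)
    horizon = rng.randint(1, max_horizon)  # horizon is itself a random variable
    pairs, x_hat = [], states[start]
    for t in range(start, min(start + horizon, len(inputs))):
        x_hat = a * x_hat + b * inputs[t]  # model's estimate of states[t + 1]
        if t + 1 < len(inputs):
            pairs.append(((x_hat, inputs[t + 1]), states[t + 2]))
    return pairs


def rollout_error(model, states, inputs):
    """Mean absolute error of a full closed-loop simulation from the first state."""
    a, b = model
    x_hat, total = states[0], 0.0
    for t, u in enumerate(inputs):
        x_hat = a * x_hat + b * u
        total += abs(x_hat - states[t + 1])
    return total / len(inputs)


rng = random.Random(0)
inputs = [rng.random() for _ in range(300)]
states = [1.0]
for u in inputs:
    states.append(plant(states[-1], u))

# Base model: ordinary one-step training on true (state, input) -> next state pairs.
dataset = [((states[t], inputs[t]), states[t + 1]) for t in range(len(inputs))]
model = fit_linear(dataset)
base_error = rollout_error(model, states, inputs)

# Iterative improvement: aggregate corrective pairs from rollouts, then retrain.
for _ in range(5):
    for _ in range(20):
        dataset += dad_pairs(model, states, inputs, rng, max_horizon=30)
    model = fit_linear(dataset)
dad_error = rollout_error(model, states, inputs)
print(f"rollout error before: {base_error:.5f}, after: {dad_error:.5f}")
```

In this toy setting the second error is not guaranteed to be smaller than the first. The sketch is only meant to show the data flow: roll out, collect predicted states, relabel them with true targets, and retrain.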