Abstract: Although Deep Reinforcement Learning (DRL) has shown outstanding results in robotics and games, it remains challenging to apply to the optimization of industrial processes such as wastewater treatment. One challenge is the lack of a simulation environment that represents the actual plant accurately enough to train DRL policies. The stochasticity and non-linearity of wastewater treatment data lead to unstable and incorrect model predictions over long time horizons. One possible reason for this incorrect simulation behavior is compounding error, the accumulation of errors throughout the simulation. Compounding error occurs because the model uses its own predictions as inputs at each time step, so the error between the actual data and the prediction accumulates as the simulation continues. We implemented two methods to improve the trained models for wastewater treatment data, which resulted in more accurate simulators: (1) using the model's prediction data as input in the training step as a tool of correction, and (2) changing the loss function to consider the long-term predicted shape (dynamics). The experimental results showed that these methods can improve the behavior of simulators, measured by Dynamic Time Warping over a year, by up to 98% compared to the base model. These improvements show significant promise for creating simulators of biological processes that need no pre-existing knowledge of the process and instead depend exclusively on time series data obtained from the system.
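To make the compounding-error mechanism concrete, here is a minimal Python sketch. The "plant" and the "model" are hypothetical scalar linear systems chosen purely for illustration (they are assumptions, not the paper's LSTM or its plant data). The sketch compares the error when the model is fed the true state at every step with the error when it is fed its own previous prediction.

```python
def true_step(x):
    # Hypothetical "plant": a stable first-order system (illustrative assumption).
    return 0.9 * x + 1.0


def model_step(x):
    # Hypothetical learned model with a small parameter error (illustrative assumption).
    return 0.92 * x + 1.0


def one_step_errors(x0, steps):
    # The model always receives the true state, so errors cannot accumulate.
    errors, x = [], x0
    for _ in range(steps):
        errors.append(abs(model_step(x) - true_step(x)))
        x = true_step(x)
    return errors


def rollout_errors(x0, steps):
    # Simulation mode: the model receives its own previous prediction as input.
    errors, x_true, x_model = [], x0, x0
    for _ in range(steps):
        x_true = true_step(x_true)
        x_model = model_step(x_model)
        errors.append(abs(x_model - x_true))
    return errors


one_step = one_step_errors(5.0, 50)
rollout = rollout_errors(5.0, 50)
print("final one-step error:", one_step[-1])
print("final rollout error: ", rollout[-1])
```

Both runs start with the same first-step error, but in this toy setting the free-running error ends up roughly an order of magnitude larger than the one-step error, even though the model's per-step mistake is small.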

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper aims to address the issue of inaccurate simulation environments encountered when applying Deep Reinforcement Learning (DRL) in wastewater treatment processes. Specifically, existing simulators face instability and error accumulation problems when predicting the long-term dynamic behavior of wastewater treatment systems. These issues are mainly due to the randomness and non-linear characteristics of wastewater treatment data, causing models to perform poorly in long-term predictions. The proposed method in the paper aims to improve the accuracy of simulators by enhancing the training model, thereby better supporting the application of DRL algorithms in optimizing wastewater treatment.

### Main Contributions

1. **Improved Single-Step Prediction Model**: Transforming a single-step prediction model based on actual data into a simulator capable of generating system states.
2. **Impact of Training Structures**: Investigating the impact of different training structures (including control and random methods) on model adaptability and learning efficiency, with results showing that introducing randomness can significantly improve the simulation accuracy of the model.
3. **Combination of Shape Loss and Temporal Loss**: Combining shape loss and temporal loss in the model improvement steps to study their impact on the simulator.
4. **Creating Simulators for Wastewater Treatment Processes Using Only Time Series Data from SCADA Systems**: These simulators can mimic the behavior of actual treatment plants and are used to train deep reinforcement learning algorithms.

### Method Overview

1. **Data and Methods**:
   - Using the dataset from Denmark's Kolding Central WWTP, data preprocessing is performed, including normalization and feature selection.
   - Using Long Short-Term Memory (LSTM) models to represent the dynamic system and training it as a simulator.
2. **Model Transformation into Simulator**:
   - At each time step, the simulator uses its predicted state as input to predict the next state.
   - By introducing the Data as Demonstrator (DaD) method, the accumulation of prediction errors is reduced.
3. **Iterative Improvement**:
   - Optimizing the trained model under new training methods to reduce prediction errors in the simulation.
   - Introducing randomness to make the prediction time range a random variable, enhancing the model's robustness and flexibility.
4. **Experimental Setup**:
   - Designing four experiments to study the impact of different lengths and continuity of training batches on model performance.
   - Evaluating the improvement of the model by calculating simulation loss.

### Conclusion

Through these improvement methods, the paper demonstrates the possibility of creating more accurate simulators in the wastewater treatment process. These simulators can not only better predict the behavior of the system but also support the training of deep reinforcement learning algorithms, thereby optimizing the wastewater treatment process.
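The simulator loop and the DaD-style correction described in the method overview can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: a scalar linear model fitted by least squares stands in for the LSTM, the time series is synthetic, and the names `fit_linear`, `rollout`, and `dad_train` are hypothetical. The core idea is that the model is rolled out on its own predictions for a randomly drawn horizon, and each predicted state is paired with the true next state as an extra training example before refitting.

```python
import math
import random


def fit_linear(pairs):
    # Least-squares fit of y = a * x + b on (x, y) pairs (stand-in for LSTM training).
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    var = sum((x - mx) ** 2 for x, _ in pairs)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    a = cov / var
    return a, my - a * mx


def rollout(model, x0, steps):
    # Simulator mode: each prediction becomes the input for the next step.
    a, b = model
    xs, x = [], x0
    for _ in range(steps):
        x = a * x + b
        xs.append(x)
    return xs


def dad_train(series, iterations=5, max_horizon=20, seed=0):
    rng = random.Random(seed)
    base = list(zip(series[:-1], series[1:]))  # ordinary one-step training pairs
    data = list(base)
    model = fit_linear(data)
    for _ in range(iterations):
        # Random start and random horizon (the horizon is a random variable).
        start = rng.randrange(len(series) - 2)
        horizon = rng.randint(1, min(max_horizon, len(series) - 2 - start))
        preds = rollout(model, series[start], horizon)
        for k, pred in enumerate(preds):
            t = start + k + 1  # pred approximates series[t]
            # Correction pair: from the model's own state, aim at the true next state.
            data.append((pred, series[t + 1]))
        model = fit_linear(data)  # refit on the aggregated data
    return model, len(data) - len(base)


# Synthetic series (assumption): a linear system with a periodic disturbance.
series = [5.0]
for t in range(200):
    series.append(0.9 * series[-1] + 1.0 + 0.5 * math.sin(t / 3))

model, added = dad_train(series)
print("fitted (a, b):", model, "correction pairs added:", added)
```

Whether the aggregated data actually improves long-horizon behavior depends on the model class and the data, so this sketch only shows the mechanics of the training loop and makes no claim about accuracy gains.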

Improved Long Short-Term Memory-based Wastewater Treatment Simulators for Deep Reinforcement Learning
