Single-step deep reinforcement learning for open-loop control of laminar and turbulent flows

H. Ghraieb, J. Viquerat, A. Larcher, P. Meliga, E. Hachem
DOI: https://doi.org/10.1111/j.1365-246X.2006.02979.x
2021-03-24
Abstract: This research gauges the ability of deep reinforcement learning (DRL) techniques to assist the optimization and control of fluid mechanical systems. It combines a novel, "degenerate" version of the proximal policy optimization (PPO) algorithm, that trains a neural network in optimizing the system only once per learning episode, and an in-house stabilized finite elements environment implementing the variational multiscale (VMS) method, that computes the numerical reward fed to the neural network. Three prototypical examples of separated flows in two dimensions are used as testbed for developing the methodology, each of which adds a layer of complexity due either to the unsteadiness of the flow solutions, or the sharpness of the objective function, or the dimension of the control parameter space. Relevance is carefully assessed by comparing systematically to reference data obtained by canonical direct and adjoint methods. Beyond adding value to the shallow literature on this subject, these findings establish the potential of single-step PPO for reliable black-box optimization of computational fluid dynamics (CFD) systems, which paves the way for future progress in optimal flow control using this new class of methods.
Optimization and Control, Machine Learning
What problem does this paper attempt to address?
This paper addresses how deep reinforcement learning (DRL) can be used to optimize and control fluid mechanical systems, and in particular how the single-step Proximal Policy Optimization (single-step PPO) algorithm can achieve effective open-loop control of laminar and turbulent flows. Specifically, the researchers developed a new "degenerate" version of the PPO algorithm suited to learning state-independent optimal policies, as in open-loop control problems. The method aims to reduce the drag of hydrodynamic systems while improving control efficiency.

### Background and Objectives of the Paper

Flow control refers to the ability to steer a flow toward a more desirable state, and the field is of great societal and economic significance. For example, in marine transportation or air traffic, reducing the overall drag by just a few percentage points can significantly reduce fossil fuel consumption and carbon dioxide emissions, saving billions of dollars annually. Many other scenarios involving fluid mechanical systems call for similar engineering design improvements: the aviation industry focuses on reducing structural vibrations and radiated noise under unsteady flow conditions, while microfluidics and combustion processes benefit from enhanced mixing.

### Research Methods

1. **Deep Reinforcement Learning (DRL)**: Combines reinforcement learning with deep neural networks and learns, through interaction with the environment, which actions maximize long-term rewards.
2. **Single-step Proximal Policy Optimization (single-step PPO)**: A "degenerate" PPO algorithm suited to open-loop control problems, where the optimal policy does not depend on the state and the neural network gets only one attempt per learning episode at optimizing the system.
3. **Computational Fluid Dynamics (CFD) Environment**: An in-house stabilized finite element environment implementing the Variational Multiscale (VMS) method computes the numerical reward fed to the neural network.

### Experimental Setup

- **Test Cases**:
  - Maximize the average lift of the NACA 0012 airfoil.
  - Reduce the fluctuating lift of two side-by-side cylinders.
  - Reduce drag by placing small control cylinders under laminar and turbulent flow conditions.
  - Control the fluidic pinball in turbulence (an equilateral triangle arrangement of three rotating cylinders) to reduce drag.
- **Evaluation Methods**:
  - Evaluate convergence and accuracy by comparing with in-house DNS data.
  - Verify the effectiveness of the method by comparing with the results of the adjoint method.

### Main Results

- **Laminar and Turbulent Flow Control**: Single-step PPO performs well under both laminar and turbulent flow conditions, especially in reducing drag. For example, for a square cylinder at a Reynolds number of several thousand, the drag is reduced by 30%, which is consistent with experimental data in the literature.
- **Fluidic Pinball Control**: Through the so-called stern effect (the front cylinder rotates slowly and the downstream cylinders rotate in the opposite direction to reduce the fluid flow in the middle), the drag is reduced by nearly 60%.

### Conclusions

This study demonstrates the effectiveness and reliability of single-step PPO in open-loop control, especially for problems with high-dimensional parameter spaces. DRL is still in its early stages compared with traditional optimization techniques (such as evolutionary strategies or genetic algorithms), but its application prospects in fluid mechanics are broad.
Future work will further optimize single-step PPO and extend its range of application, so that it can be better combined with advanced numerical methods and with multiscale, multiphysics computational fluid dynamics (CFD).
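
The single-step PPO idea listed under Research Methods can be illustrated on a toy problem. The following is a minimal NumPy sketch, not the authors' implementation. It assumes a plain diagonal Gaussian policy whose mean and log standard deviation are updated directly by gradient ascent (the paper trains a neural network instead). `toy_reward` is a hypothetical quadratic stand-in for the CFD solver, and the hyperparameters and the `sigma_min` floor are illustrative choices made here for numerical stability.

```python
import numpy as np

TARGET = np.array([0.5, -0.3])  # hypothetical optimum of the toy objective


def toy_reward(actions):
    """Hypothetical stand-in for the CFD solver: one reward per sampled action."""
    return -np.sum((actions - TARGET) ** 2, axis=1)


def log_prob(actions, mu, log_sigma):
    """Log-density of a diagonal Gaussian policy, up to an additive constant."""
    z = (actions - mu) / np.exp(log_sigma)
    return np.sum(-0.5 * z ** 2 - log_sigma, axis=1)


def single_step_ppo(reward_fn, dim=2, n_episodes=300, n_envs=8, n_epochs=16,
                    lr=0.002, clip=0.2, sigma_min=0.05, seed=0):
    """Sketch of single-step PPO: every episode is one action, one reward."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)                     # policy mean (assumed initial guess)
    log_sigma = np.full(dim, np.log(0.5))  # policy log standard deviation
    for _ in range(n_episodes):
        # Sample one action per environment; there are no state transitions.
        actions = mu + np.exp(log_sigma) * rng.standard_normal((n_envs, dim))
        rewards = reward_fn(actions)       # in practice: one CFD run per action
        adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        logp_old = log_prob(actions, mu, log_sigma)
        for _ in range(n_epochs):
            sigma = np.exp(log_sigma)
            z = (actions - mu) / sigma
            ratio = np.exp(log_prob(actions, mu, log_sigma) - logp_old)
            # PPO clipped surrogate: samples whose ratio has already left the
            # trust region in the favourable direction contribute no gradient.
            clipped = ((adv > 0) & (ratio > 1 + clip)) | \
                      ((adv < 0) & (ratio < 1 - clip))
            w = np.where(clipped, 0.0, adv * ratio)[:, None]
            # Analytic gradients of the Gaussian log-density.
            mu = mu + lr * np.mean(w * z / sigma, axis=0)
            log_sigma = log_sigma + lr * np.mean(w * (z ** 2 - 1.0), axis=0)
            # Assumed floor on sigma, added here to keep the steps bounded.
            log_sigma = np.maximum(log_sigma, np.log(sigma_min))
    return mu, np.exp(log_sigma)


if __name__ == "__main__":
    mu, sigma = single_step_ppo(toy_reward)
    print("learned mean:", mu, "learned std:", sigma)
```

Calling `single_step_ppo(toy_reward)` returns the learned mean and standard deviation; on this toy objective the mean should drift toward the hypothetical optimum `TARGET`. In a real setting each reward evaluation would be a full flow simulation, so the number of sampled actions per episode (`n_envs` here) largely sets the computational cost.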