Abstract: Imposing known physical constraints, such as conservation laws, during neural network training introduces an inductive bias that can improve accuracy, reliability, convergence, and data efficiency for modeling physical dynamics. While such constraints can be softly imposed via loss function penalties, recent advancements in differentiable physics and optimization improve performance by incorporating PDE-constrained optimization as individual layers in neural networks. This enables a stricter adherence to physical constraints. However, imposing hard constraints significantly increases computational and memory costs, especially for complex dynamical systems. This is because it requires solving an optimization problem over a large number of points in a mesh, representing spatial and temporal discretizations, which greatly increases the complexity of the constraint. To address this challenge, we develop a scalable approach to enforce hard physical constraints using Mixture-of-Experts (MoE), which can be used with any neural network architecture. Our approach imposes the constraint over smaller decomposed domains, each of which is solved by an "expert" through differentiable optimization. During training, each expert independently performs a localized backpropagation step by leveraging the implicit function theorem; the independence of each expert allows for parallelization across multiple GPUs. Compared to standard differentiable optimization, our scalable approach achieves greater accuracy in the neural PDE solver setting for predicting the dynamics of challenging non-linear systems. We also improve training stability and require significantly less computation time during both training and inference stages.
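The localized backpropagation step mentioned in the abstract relies on the implicit function theorem: if a solution w*(theta) satisfies g(w*, theta) = 0, then dw*/dtheta = -(dg/dw)^(-1) dg/dtheta, so the gradient can be obtained without differentiating through the solver iterations. The following is a minimal sketch of that idea on a scalar toy problem. The constraint `g(w, theta) = w^3 + theta*w - 1` and all function names are assumptions made for illustration, not the paper's constraint or code.

```python
# Toy illustration of implicit differentiation (not the paper's implementation).
# Hypothetical constraint: g(w, theta) = w^3 + theta * w - 1 = 0.

def residual(w, theta):
    return w**3 + theta * w - 1.0

def d_residual_dw(w, theta):
    return 3.0 * w**2 + theta

def d_residual_dtheta(w, theta):
    return w

def solve_constraint(theta, w0=1.0, tol=1e-12, max_iter=50):
    """Solve g(w, theta) = 0 for w with Newton's method."""
    w = w0
    for _ in range(max_iter):
        step = residual(w, theta) / d_residual_dw(w, theta)
        w -= step
        if abs(step) < tol:
            break
    return w

def implicit_gradient(theta):
    """dw*/dtheta from the implicit function theorem, evaluated at the solution."""
    w_star = solve_constraint(theta)
    return -d_residual_dtheta(w_star, theta) / d_residual_dw(w_star, theta)

theta = 2.0
eps = 1e-6
# Central finite difference through the solver, used only as a check.
fd_gradient = (solve_constraint(theta + eps) - solve_constraint(theta - eps)) / (2 * eps)
ift_gradient = implicit_gradient(theta)
print(ift_gradient, fd_gradient)
```

The implicit gradient needs only the derivatives of the constraint at the converged solution, which is why each expert can backpropagate locally and independently of how many solver iterations it ran.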

What problem does this paper attempt to address?

This paper addresses the computational and memory cost of strictly enforcing physical constraints (such as conservation laws) during neural network training, especially for complex dynamical systems. Constraints can be imposed softly through loss function penalties, but this can cause optimization difficulties and convergence issues, and it cannot guarantee that the constraints hold at inference. Hard constraints, enforced through differentiable optimization layers, adhere to the physics more strictly, but they require solving an optimization problem over a large number of mesh points, which is expensive.

The paper proposes a scalable approach that uses Mixture-of-Experts (MoE) to enforce hard physical constraints. The constraint is applied over smaller decomposed domains, each solved independently by an "expert" through differentiable optimization. During training, each expert performs a localized backpropagation step using the implicit function theorem. Because the experts are independent, the forward and backward passes can be parallelized across multiple GPUs, which reduces computation time and improves training stability. Compared to standard differentiable optimization, the approach predicts the dynamics of challenging nonlinear systems more accurately and requires significantly less computation time during both training and inference.

The main contributions of the paper are:

1. Introducing a physics-informed hard-constraint Mixture-of-Experts training framework (PI-HC-MoE) that imposes hard physical constraints on neural networks by solving constrained optimization problems in a scalable way.
2. Instantiating this method in a neural PDE solver setting and demonstrating it on two challenging nonlinear problems (diffusion-sorption and turbulent Navier-Stokes equations), where it significantly improves accuracy compared to soft constraints and standard hard-constraint differentiable optimization.
3. Showing sub-linear scaling in execution time for PI-HC-MoE compared to standard differentiable optimization, with improved efficiency as the number of sampling points in the spatio-temporal domain increases.
4. Providing open-source code to promote replicability and further research.

Overall, the paper proposes an effective method for reducing the computational burden of enforcing hard physical constraints in complex systems: it decomposes constraint enforcement into parallel tasks, improving both the accuracy and the efficiency of neural networks that simulate physical phenomena.
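The decomposition described above can be sketched as follows. This is a toy example under stated assumptions, not the PI-HC-MoE implementation: the constraint is a hypothetical 1D condition u'(x) = cos(x), the solution is u(x) = w1*x + w2*x^2 with fixed basis functions standing in for network outputs, each "expert" solves a small linear least-squares problem on its own subdomain, and a thread pool stands in for multiple GPUs.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def basis_derivatives(x):
    # d/dx of the hypothetical basis functions phi_1(x) = x and phi_2(x) = x^2.
    return (1.0, 2.0 * x)

def target(x):
    # Right-hand side of the toy constraint u'(x) = cos(x).
    return math.cos(x)

def solve_expert(points):
    """Least-squares weights (w1, w2) for one subdomain via 2x2 normal equations."""
    s_aa = s_ab = s_bb = s_af = s_bf = 0.0
    for x in points:
        a, b = basis_derivatives(x)
        f = target(x)
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_af += a * f
        s_bf += b * f
    det = s_aa * s_bb - s_ab * s_ab
    w1 = (s_af * s_bb - s_bf * s_ab) / det
    w2 = (s_aa * s_bf - s_ab * s_af) / det
    return w1, w2

def residual_sq(points, weights):
    """Sum of squared constraint residuals over the given points."""
    w1, w2 = weights
    total = 0.0
    for x in points:
        a, b = basis_derivatives(x)
        total += (w1 * a + w2 * b - target(x)) ** 2
    return total

# 64 mesh points on [0, 3], split into 4 contiguous subdomains (one per expert).
mesh = [3.0 * i / 63 for i in range(64)]
num_experts = 4
chunk = len(mesh) // num_experts
subdomains = [mesh[k * chunk:(k + 1) * chunk] for k in range(num_experts)]

# Experts are independent, so they can be solved concurrently.
with ThreadPoolExecutor(max_workers=num_experts) as pool:
    expert_weights = list(pool.map(solve_expert, subdomains))

global_residual = residual_sq(mesh, solve_expert(mesh))
local_residual = sum(residual_sq(p, w) for p, w in zip(subdomains, expert_weights))
print(global_residual, local_residual)
```

Each expert solves a problem one quarter the size of the global one, and in this toy setting the summed local residual is no larger than the global residual because the global weights are a feasible choice for every subdomain.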

Scaling physics-informed hard constraints with mixture-of-experts
