The key ideas are to gradually expand the solution interval and leverage ensemble learning.
At the beginning (a), only the residual points (pre-sampled to fill the entire solution domain) that lie close to the points sampled at the known initial conditions (solid blue rectangles) are deployed for PDE loss computation. This kicks off the training, in which multiple PINN models initialized with different weights are trained, forming an ensemble of PINNs. After training for a certain number of iterations (b), some of the previously deployed residual points are promoted to “pseudo points” if the variance of the ensemble predictions is sufficiently small at those locations. Afterward, the solution domain is expanded (c): a new set of residual points is deployed if they are close enough to the “pseudo points”. The iteration continues (d) until all residual points are deployed for loss calculation.
The above walk-through assumes that only initial conditions are known. In other cases, where observation data is available inside the simulation domain, the solution interval can expand from those locations as well.
According to the authors’ implementation, this strategy introduces several new hyperparameters: the distance threshold for deploying new residual points, the distance threshold for turning deployed residual points into “pseudo points”, and the variance threshold for deciding whether the ensemble members agree with each other. Please refer to the original paper for more details.
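To make the expansion logic concrete, here is a minimal NumPy sketch of one expansion step. All names and default thresholds (`delta_deploy`, `var_threshold`) are illustrative stand-ins, not the paper's actual API, and the sketch omits the paper's additional distance criterion for pseudo points:

```python
import numpy as np

def expansion_step(ensemble_preds, deployed, candidates, points,
                   delta_deploy=0.1, var_threshold=1e-4):
    """One illustrative expansion step (simplified; names are hypothetical).

    ensemble_preds: (n_models, n_points) predictions of each ensemble member
    deployed, candidates: boolean masks over the pre-sampled residual points
    points: (n_points, d) coordinates of the residual points
    """
    # Deployed points where the ensemble members agree become "pseudo points".
    # (The paper also applies a distance threshold here, omitted for brevity.)
    variance = ensemble_preds.var(axis=0)
    pseudo = deployed & (variance < var_threshold)

    # Deploy a candidate residual point if it lies within delta_deploy
    # of at least one pseudo point.
    pp = points[pseudo]
    dists = np.linalg.norm(points[:, None, :] - pp[None, :, :], axis=-1)
    near_pseudo = (dists < delta_deploy).any(axis=1)
    new_deployed = deployed | (candidates & near_pseudo)
    return new_deployed, pseudo
```

Calling this step repeatedly, interleaved with ensemble training, reproduces the (a)–(d) loop described above: the deployed region grows outward only where the ensemble already agrees.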
2.3 Why the solution might work
The gradual expansion of the solution interval mimics the behavior of the classical numerical solvers that “propagate” the solution from the initial and boundary conditions to the entire domain. As a result, the physical causality is respected, which potentially contributes to more stable PINN training.
By leveraging an ensemble of PINNs, the proposed algorithm has a higher chance of escaping the wrong solutions that traditional, single-PINN training often converges to. This explains why the proposed algorithm can reach a higher level of accuracy and reliability.
The paper extensively benchmarked the proposed strategy’s performance on five diverse problems, each representing a unique physical phenomenon:
- Convection equation: this equation models transport phenomena, which are fundamental in various fields, including fluid dynamics, heat transfer, and mass transfer. In these scenarios, understanding the movement of quantities such as energy, mass, or momentum, primarily due to differential forces or gradients, is crucial.
- Reaction system: this problem category models chemical reactions. From basic chemistry classes to complex bioengineering processes, understanding reaction kinetics and equilibrium can be the difference between successful experiments or industrial processes and catastrophic failures.
- Reaction-diffusion equation: this equation models reactions combined with the diffusion of substances. This type of problem is pivotal in areas like biology, where they describe processes like pattern formation in developmental biology, or in chemistry, where they model the spread and interaction of different species in a medium.
- Diffusion equation (with periodic boundary conditions): this type of equation is central to understanding heat conduction, fluid flow, and Brownian motion, among other things. Periodic boundary conditions mean that the system behavior repeats over time or space, a common assumption in problems dealing with cyclic or repeating systems.
- Diffusion equation (same equation as shown above, but with boundary conditions of the Dirichlet-type): Dirichlet-type boundary conditions stipulate the values a solution must take on the boundary of the domain.
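For reference, the canonical one-dimensional forms of these equations look like the following (with generic coefficients β, ρ, ν; the paper's exact setups and coefficient values may differ):

```latex
% Canonical 1D forms; the last two problems share the diffusion PDE
% and differ only in boundary conditions (periodic vs. Dirichlet).
\begin{align}
  &\text{Convection:}          && \partial_t u + \beta\, \partial_x u = 0 \\
  &\text{Reaction:}            && \partial_t u = \rho\, u (1 - u) \\
  &\text{Reaction--diffusion:} && \partial_t u = \nu\, \partial_{xx} u + \rho\, u (1 - u) \\
  &\text{Diffusion:}           && \partial_t u = \nu\, \partial_{xx} u
\end{align}
```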
The benchmark studies showed that:
- The proposed algorithm provides stable training and shows competitive performance for all considered problems;
- The proposed algorithm is generally stable with little sensitivity to the hyperparameter choices.
Additionally, the paper suggested that the following practices can help improve PINN accuracy:
- Normalizing the inputs (both spatial input x and temporal input t) to [-1, 1];
- Fine-tuning the PINN training with L-BFGS after the usual Adam optimization.
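As a minimal sketch of the first practice, here is one way to map raw coordinates to [-1, 1] with NumPy (the helper name is mine, not the paper's):

```python
import numpy as np

def normalize_to_unit_interval(z):
    """Affinely map an array of coordinates onto [-1, 1] (hypothetical helper)."""
    z_min, z_max = z.min(), z.max()
    return 2.0 * (z - z_min) / (z_max - z_min) - 1.0

# e.g. spatial points in [0, 2] and times in [0, 5] both end up in [-1, 1]
x = normalize_to_unit_interval(np.linspace(0.0, 2.0, 50))
t = normalize_to_unit_interval(np.linspace(0.0, 5.0, 50))
```

Note that the same affine map (with the training-set min/max) must also be applied to any points used at inference time.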
2.5 Strengths and Weaknesses

Strengths:
- Stabilizes PINN training and yields competitive performance.
- Does not need a pre-defined schedule of interval expansion (in contrast to the time-adaptive strategies explained in the “Alternatives” section).
- Treats time and space equally.
- Flexible enough to allow easy incorporation of measurements at arbitrary locations.
- Confidence intervals of the predictions are obtained automatically thanks to the ensemble approach.

Weaknesses:
- Computationally more expensive than vanilla PINNs (common to all ensemble approaches).
- Introduces extra hyperparameters (although the proposed solution is shown to be insensitive to their values).
Other methods that address a similar problem are the family of time-adaptive techniques, which heavily influenced the current approach. Time-adaptive techniques split the time interval [T₁, T₂] into multiple sub-intervals and solve the equation on each sub-interval sequentially, with a separate PINN per sub-interval. The solution at the end of one sub-interval then serves as the initial condition for the next.
The current algorithm inherits the merits of time-adaptive techniques, namely a more accurate and stable propagation of the solution in time. On top of that, it removes the constraint of requiring a pre-defined schedule of interval expansion. As a result, the current approach is more flexible in incorporating known measurements at arbitrary locations.
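The time-marching structure of the time-adaptive baseline can be sketched in a few lines. Here `train_pinn` is a placeholder for a routine that trains one PINN on a sub-interval and returns its solution at the end time; everything else is generic Python:

```python
import numpy as np

def time_adaptive_solve(t1, t2, n_sub, train_pinn, u0):
    """Sketch of time-adaptive PINN training over a pre-defined schedule.

    train_pinn(t_start, t_end, ic) is a placeholder: it trains one PINN on
    [t_start, t_end] with initial condition ic and returns the state at t_end.
    """
    edges = np.linspace(t1, t2, n_sub + 1)
    ic = u0
    for t_start, t_end in zip(edges[:-1], edges[1:]):
        # Each sub-interval gets its own PINN; its terminal state becomes
        # the initial condition of the next sub-interval.
        ic = train_pinn(t_start, t_end, ic)
    return ic
```

The fixed `edges` array is exactly the pre-defined schedule that the ensemble-based approach dispenses with: there, the deployed region grows wherever the ensemble agrees, rather than along a grid fixed in advance.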