Derivative-Informed Neural Operators

Conventional neural operator training

Neural operators are neural network surrogates for maps between function spaces, with a parametrization that is independent of any specific numerical discretization of the target map. They often arise as surrogates for parametric PDE solution maps and are typically trained in the \(L^2_\mu\) parametric Bochner space:

\[\min_w \mathbb{E}_{m \sim \mu}\left[\|u(m) - u_w(m)\|^2_{\mathcal{U}} \right]\]
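
In a discretized setting, this objective is estimated by an empirical risk over samples \(m_i \sim \mu\) with solver-generated outputs \(u(m_i)\). Below is a minimal sketch in Python/JAX, assuming a hypothetical `surrogate(w, m)` evaluated on a fixed discretization and approximating the \(\mathcal{U}\)-norm by a Euclidean norm.

```python
import jax
import jax.numpy as jnp

def l2_risk(w, surrogate, m_batch, u_batch):
    """Monte Carlo estimate of E_{m ~ mu} ||u(m) - u_w(m)||_U^2.

    `surrogate(w, m)` is a hypothetical neural operator on a fixed
    discretization; a mass-matrix weighting could replace the plain 2-norm.
    """
    preds = jax.vmap(lambda m: surrogate(w, m))(m_batch)   # (N, n_u)
    return jnp.mean(jnp.sum((preds - u_batch) ** 2, axis=-1))
```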

DINO training

Derivative-informed neural operators (DINOs) are neural operators trained to learn both an operator and its (Fréchet) derivatives; that is, the training is formulated in the \(H^1_\mu\) Bochner space (or, more generally, \(H^k_\mu\) with \(k\in \mathbb{N}\)):

\[\min_w \mathbb{E}_{m \sim \mu}\left[\|u(m) - u_w(m)\|^2_{\mathcal{U}} + \|\mathcal{D}u(m) - \mathcal{D}u_w(m)\|^2_{HS(\mathcal{M},\mathcal{U})} \right]\]
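
Assuming the Jacobians \(\mathcal{D}u(m_i)\) have been precomputed and stored as dense arrays, and approximating the Hilbert–Schmidt norm by the Frobenius norm on the discretized spaces, a corresponding empirical \(H^1_\mu\) loss might look as follows; this is a sketch under those assumptions, not the implementation used in our work.

```python
import jax
import jax.numpy as jnp

def h1_risk(w, surrogate, m_batch, u_batch, J_batch):
    """Empirical H^1_mu (DINO) risk: output misfit plus Jacobian misfit.

    `J_batch` holds precomputed Jacobians Du(m_i) as dense (n_u, n_m) arrays;
    the surrogate Jacobian is obtained by automatic differentiation.
    """
    def per_sample(m, u_true, J_true):
        u_pred = surrogate(w, m)
        J_pred = jax.jacrev(lambda mm: surrogate(w, mm))(m)
        return jnp.sum((u_pred - u_true) ** 2) + jnp.sum((J_pred - J_true) ** 2)
    return jnp.mean(jax.vmap(per_sample)(m_batch, u_batch, J_batch))
```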

The key issue for DINO is to devise representations of the neural operator and of the training problem that make the derivative \(\mathcal{D}u\) cheap to represent, so that both the offline generation of training data and the DINO training itself remain efficient.
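
One illustrative possibility, stated here as an assumption for the sketch rather than the specific formulation used in our work, is to learn the map between fixed reduced bases of the input and output spaces (hypothetical matrices `Psi_in`, `V_out`); the surrogate Jacobian then factors through a small dense matrix, which keeps both data generation and training cheap.

```python
import jax

def reduced_surrogate(w, m, Psi_in, V_out, net):
    """Hypothetical reduced-basis surrogate: u_w(m) = V_out @ net(w, Psi_in.T @ m)."""
    return V_out @ net(w, Psi_in.T @ m)

def reduced_surrogate_jacobian(w, m, Psi_in, V_out, net):
    """Chain rule: Du_w(m) = V_out @ Dnet @ Psi_in.T.

    Only the small dense Jacobian of `net` is ever formed; training can match
    it directly against similarly reduced Jacobians of the true map.
    """
    r = Psi_in.T @ m                                # reduced input coordinates
    Dnet = jax.jacrev(lambda rr: net(w, rr))(r)     # (r_out, r_in), small and dense
    return V_out @ Dnet @ Psi_in.T
```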

Advantages of DINO

Improved generalization per unit compute

Derivatives (e.g., Jacobians) of implicitly defined functions can often be computed and compressed at significantly lower cost than the function itself. Consider an implicitly defined parametric PDE map in strong residual form:

\[m \mapsto u(m) \quad \text{such that} \quad R(u,m) = 0\]

Differentiating the residual identity \(R(u(m),m) = 0\) with respect to \(m\) (the implicit function theorem) then yields the derivative:

\[\mathcal{D}u(m) = - \left[\frac{\partial R (u,m)}{\partial u}\right]^{-1}\frac{\partial R(u,m)}{\partial m}\]

Notably, when utilizing (sparse) direct solvers, the factors of \(\frac{\partial R (u,m)}{\partial u}\) need only be computed once, after which the Jacobian can be compressed, matrix-free, at marginal additional cost. Other amortizations are possible in other settings, such as (i) the reuse of expensive preconditioners and (ii) implicit time integrators. In these cases the DINO formulation brings in additional, inexpensive training data; it thereby achieves empirically better \(L^2_\mu\) generalization accuracy per unit compute spent on training data.
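
As a concrete illustration of this amortization, assuming a sparse finite-dimensional discretization and hypothetical matrices `R_u` (\(\partial R/\partial u\)) and `R_m` (\(\partial R/\partial m\)), a single sparse LU factorization is reused across all columns:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def jacobian_via_factorization(R_u, R_m):
    """Apply Du(m) = -(dR/du)^{-1} dR/dm using one sparse LU factorization.

    The LU factors of `R_u` are computed once and reused for every column of
    `R_m`, so the marginal cost per column is a pair of triangular solves; the
    same factors can instead drive matrix-free randomized compression.
    """
    lu = splu(sp.csc_matrix(R_u))                       # factorize once
    R_m = R_m.toarray() if sp.issparse(R_m) else np.asarray(R_m)
    return -np.column_stack([lu.solve(R_m[:, j]) for j in range(R_m.shape[1])])
```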

Improved accuracy in optimization and inference tasks

Many decision-support tasks for complex physical systems can

  1. be formulated as optimization problems, or
  2. require derivative information for their efficient solution.

Such tasks include Bayesian inference, stochastic optimization (including optimal design and optimal control), and optimal experimental design. DINOs are uniquely suited to these tasks, as they control not only the operator approximation error but also the error in its derivative(s). This leads to more accurate approximations of optimization gradients and of stationary points.
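
For instance, the gradient of a downstream objective defined through the surrogate, here a hypothetical data-misfit functional, is obtained by automatic differentiation; because DINO training controls the error in \(\mathcal{D}u_w\), such surrogate gradients track the PDE-based gradients more faithfully.

```python
import jax
import jax.numpy as jnp

def misfit(m, w, surrogate, B, d):
    """Hypothetical data-misfit functional Phi(m) = 0.5 ||B u_w(m) - d||^2."""
    r = B @ surrogate(w, m) - d
    return 0.5 * jnp.dot(r, r)

# Gradient with respect to m (the first argument), usable inside gradient-based
# optimization or MCMC proposals; its accuracy rests on the accuracy of Du_w.
grad_misfit = jax.grad(misfit)
```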

DINOs for Bayesian Inference

Please see our preprint on efficient Markov chain Monte Carlo, led by my collaborator Lianghao Cao.

DINOs for PDE-constrained Optimization Under Uncertainty

Please see our preprint on efficient PDE-constrained optimization under uncertainty, led by my collaborator Dingcheng Luo. We compare DINOs against conventional \(L^2_\mu\) neural operators (NOs) and a reference PDE-solver-based benchmark. Our results generally demonstrate that DINOs are \(10\times\) more accurate than NOs and \(10\times\) more sample efficient than the reference PDE-solver-based implementation, even when amortized over only one risk-averse optimization problem. When amortizing over many such problems, the benefits of DINO increase substantially.

DINOs are able to control viscous flow fields governed by expensive-to-estimate risk measures, such as the conditional value-at-risk (CVaR), at an \(\mathcal{O}(10^7)\) online speedup with little loss of accuracy.
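
For context, and purely as an illustration rather than the estimator used in the preprint: Monte Carlo estimation of CVaR requires many samples of the quantity of interest per optimization iterate, each of which would otherwise require a PDE solve; this is where the online speedup of the surrogate enters.

```python
import jax.numpy as jnp

def cvar_estimate(q_samples, alpha=0.95):
    """Monte Carlo CVaR_alpha: mean of the worst (1 - alpha) fraction of samples.

    `q_samples` holds quantity-of-interest values computed from surrogate
    evaluations at parameter samples m_i ~ mu (hypothetical setup).
    """
    q = jnp.sort(jnp.asarray(q_samples))        # ascending; larger = worse
    k = int(jnp.ceil(alpha * q.shape[0]))       # index of the alpha-quantile
    return jnp.mean(q[k:])                      # average over the upper tail
```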

Uncontrolled flow field

Optimally controlled flow field

Application to 2011 Tōhoku earthquake

Through our NSF RISE grant, with collaborators Thorsten Becker, Simone Puel, and Umberto Villa, we are solving Bayesian inverse problems surrounding the 2011 M9 Tōhoku earthquake. Specifically, we seek to infer the subsurface elastic properties as well as the fault slip, both under uncertainty.

Image credit to my collaborator Simone Puel.

Collaboration

My collaborators on DINO-related topics include Nick Alger, Thorsten Becker, Michael Brennan, Lianghao Cao, Joshua Chen, Peng Chen, Blake Christierson, Omar Ghattas, Xindi Gong, Nikola Kovachki, Dingcheng Luo, Youssef Marzouk, Simone Puel, Umberto Villa, Josephine Westermann, Boyuan (John) Yao, Jakob Zech and Ziheng (Marshall) Zhang.