Gaussian Processes, Neural Networks, and Uncertainty Quantification
📽 Slides: Open presentation
A single nonlinear FEM simulation may take hours to days, yet uncertainty propagation, Bayesian calibration, optimization, and sensitivity analysis all require thousands of model evaluations.
Surrogate model (metamodel): a cheap approximation \(\hat{\mathcal{M}}(\boldsymbol{\theta}) \approx \mathcal{M}(\boldsymbol{\theta})\) trained on a limited number of FEM evaluations.
| Method | Strengths | Limitations |
|---|---|---|
| Polynomial Response Surface | Simple, cheap | Poor for nonlinear/high-dim |
| Kriging / Gaussian Process (GP) | Uncertainty estimates, flexible | Scales as \(O(n^3)\) |
| Radial Basis Functions (RBF) | Mesh-free, good interpolation | No uncertainty |
| Neural Networks (NN) | Very flexible, scales well | Data-hungry, no intrinsic UQ |
| Polynomial Chaos Expansion (PCE) | Intrinsic UQ, spectral accuracy | Curse of dimensionality |
A GP defines a distribution over functions: \[ f(\mathbf{x}) \sim \mathcal{GP}(m(\mathbf{x}),\, k(\mathbf{x},\mathbf{x}')), \]
where \(m(\cdot)\) is the mean function and \(k(\cdot,\cdot)\) the kernel (covariance function).
Prediction at new point \(\mathbf{x}_*\): \[ \mu_* = k(\mathbf{x}_*,\mathbf{X})[\mathbf{K} + \sigma_n^2\mathbf{I}]^{-1}\mathbf{y}, \qquad \sigma_*^2 = k(\mathbf{x}_*,\mathbf{x}_*) - k(\mathbf{x}_*,\mathbf{X})[\mathbf{K}+\sigma_n^2\mathbf{I}]^{-1}k(\mathbf{X},\mathbf{x}_*). \]
GP gives both prediction and uncertainty estimate — essential for active learning.
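The posterior formulas above can be sketched in a few lines of NumPy. A minimal 1-D sketch with a squared-exponential kernel; the toy data, length-scale, and noise level are illustrative, not from the lecture:

```python
import numpy as np

def rbf_kernel(A, B, ell=0.3, sf=1.0):
    """Squared-exponential kernel k(x, x') = sf^2 exp(-(x - x')^2 / (2 ell^2))."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

def gp_predict(X, y, Xs, ell=0.3, sf=1.0, sn=1e-3):
    """Posterior mean and variance at 1-D test points Xs."""
    K = rbf_kernel(X, X, ell, sf) + sn**2 * np.eye(len(X))   # K + sn^2 I
    Ks = rbf_kernel(Xs, X, ell, sf)                          # k(x*, X)
    mu = Ks @ np.linalg.solve(K, y)                          # posterior mean
    v = np.linalg.solve(K, Ks.T)
    var = rbf_kernel(Xs, Xs, ell, sf).diagonal() - np.sum(Ks * v.T, axis=1)
    return mu, var

# Toy data: five "expensive simulations" of f(x) = sin(2 pi x)
X = np.linspace(0.0, 1.0, 5)
y = np.sin(2 * np.pi * X)
mu, var = gp_predict(X, y, np.array([0.1, 0.5, 0.9]))
```

With a small noise term the posterior mean nearly interpolates the training data, while the variance shrinks to roughly `sn**2` at observed points and grows between them.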
Common kernels include the squared exponential (RBF) and the Matérn family.
For constitutive modeling: the Matérn class often better matches the finite smoothness of plastic responses.
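A minimal comparison of the two kernels (unit variance, illustrative length-scale) shows the Matérn 5/2 decaying more slowly at large lags, consistent with its finite smoothness:

```python
import numpy as np

def k_rbf(r, ell=1.0):
    # Squared-exponential: sample paths are infinitely differentiable
    return np.exp(-0.5 * (r / ell) ** 2)

def k_matern52(r, ell=1.0):
    # Matern nu = 5/2: sample paths are only twice differentiable,
    # often a better match for elastic-plastic responses
    s = np.sqrt(5.0) * np.abs(r) / ell
    return (1.0 + s + s**2 / 3.0) * np.exp(-s)

r = np.linspace(0.0, 3.0, 7)
k_rbf(r), k_matern52(r)   # both equal 1 at r = 0, Matern has the heavier tail
```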
Data-driven constitutive models replace the analytical form with a NN mapping: \[ \boldsymbol{\sigma}_{n+1} = \text{NN}(\boldsymbol{\varepsilon}_n, \boldsymbol{\varepsilon}_{n+1}, \boldsymbol{\sigma}_n, \boldsymbol{\alpha}_n;\, \mathbf{w}), \]
where \(\mathbf{w}\) are trained weights.
Key considerations: data coverage of loading paths, frame invariance, thermodynamic consistency, and generalization beyond the training domain.
PINNs embed the governing PDEs in the loss function: \[ \mathcal{L} = \mathcal{L}_\text{data} + \lambda_\text{pde}\mathcal{L}_\text{pde} + \lambda_\text{bc}\mathcal{L}_\text{bc} \]
Applied to constitutive modeling: the physics term penalizes violations of the constitutive relations or balance laws (e.g., the dissipation inequality).
Sources of uncertainty:
| Source | Type | Treatment |
|---|---|---|
| Model parameters \(\boldsymbol{\theta}\) | Epistemic | Bayesian inference, MC |
| Model form error | Epistemic | Model comparison, discrepancy |
| Experimental noise | Aleatoric | Statistical noise model |
| Numerical error | Epistemic | Mesh convergence, verification |
Propagation: if \(\boldsymbol{\theta} \sim p(\boldsymbol{\theta})\), then \(\mathbf{y} = \mathcal{M}(\boldsymbol{\theta})\) has a distribution \(p(\mathbf{y})\).
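A minimal Monte Carlo propagation sketch, with a cheap closed-form stand-in for \(\mathcal{M}\) and hypothetical input distributions (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def model(E, sy):
    """Stand-in for an expensive FEM run: a cheap nonlinear map of
    Young's modulus E and yield stress sy (hypothetical response)."""
    return sy * np.sqrt(1.0 + 0.01 * E)

# Assumed input uncertainty: E ~ N(200, 10), sy ~ N(250, 20)
N = 20_000
E = rng.normal(200.0, 10.0, N)
sy = rng.normal(250.0, 20.0, N)
y = model(E, sy)          # empirical samples of p(y)
y.mean(), y.std()         # summary statistics of the output distribution
```

In practice the same loop runs over a surrogate rather than the full model, which is what makes \(N \sim 10^4\) evaluations affordable.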
Represent the output as an expansion in orthogonal polynomials of the input random variables: \[ \mathcal{M}(\boldsymbol{\xi}) \approx \sum_{\boldsymbol{\alpha}\in\mathcal{A}} c_{\boldsymbol{\alpha}}\Psi_{\boldsymbol{\alpha}}(\boldsymbol{\xi}) \]
where \(\boldsymbol{\xi}\) are standardised random inputs and \(\Psi_{\boldsymbol{\alpha}}\) are orthogonal polynomials matched to the input distribution (Hermite for Gaussian, Legendre for uniform).
Coefficients \(c_{\boldsymbol{\alpha}}\) computed via non-intrusive sampling (regression or sparse quadrature). Mean and variance extracted analytically from the coefficients.
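A regression-based (non-intrusive) 1-D sketch, assuming a Gaussian input and probabilists' Hermite polynomials; the stand-in model \(\exp(0.3\xi)\) is illustrative:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander
from math import factorial

rng = np.random.default_rng(1)

def model(xi):
    # Stand-in for the expensive model, xi ~ N(0, 1)
    return np.exp(0.3 * xi)

# Non-intrusive regression: sample inputs, evaluate the model, least-squares fit
p = 4                                    # max polynomial degree
xi = rng.standard_normal(2000)
Psi = hermevander(xi, p)                 # He_0 ... He_p (probabilists' Hermite)
c, *_ = np.linalg.lstsq(Psi, model(xi), rcond=None)

# Moments extracted analytically from the coefficients, using E[He_n^2] = n!
mean = c[0]
var = sum(c[n] ** 2 * factorial(n) for n in range(1, p + 1))
```

For this lognormal output the exact moments are \(e^{0.045}\) and \(e^{0.09}(e^{0.09}-1)\), so the degree-4 expansion can be checked directly.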
Five-step process: (1) design of experiments, (2) run the full model at the design points, (3) fit the surrogate, (4) validate against held-out runs, (5) deploy for UQ or optimization.
Key principle: Exploit structure (polynomial bases, kernel functions) to achieve accuracy with few training points.
Input-to-output propagation:
Given uncertain parameters \(\mathbf{p}\), compute distribution of output \(\mathbf{y} = \mathcal{M}(\mathbf{p})\).
Methods:
| Method | Cost | Convergence | Suitability |
|---|---|---|---|
| Monte Carlo | \(O(N)\) evaluations | Slow, \(O(N^{-1/2})\) | Any dimension \(d\) |
| Quasi-MC | \(O(N)\) evaluations | \(O((\log N)^d / N)\) | Effective in practice for high \(d\) |
| Polynomial Chaos | Grows rapidly with \(d\) | Spectral for smooth responses | Intrinsic UQ, low \(d\) |
| Stochastic collocation | Grid-based, curse of dimensionality | Fast for smooth responses | Low to medium \(d\) |
For high-dimensional problems, the practical route is Monte Carlo sampling on a cheap surrogate.
Rather than sampling uniformly, use the surrogate's uncertainty estimates to guide the placement of new training points.
Result: Efficient use of expensive evaluations; converges faster than static DoE.
GPs naturally provide uncertainty → perfect for active learning.
NNs require external UQ (e.g., ensemble, Bayesian approximation, MC dropout).
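A greedy active-learning loop can be sketched as follows; the 1-D GP, candidate pool, and stand-in model are all illustrative choices:

```python
import numpy as np

def k(A, B, ell=0.2):
    # Squared-exponential kernel (unit variance)
    return np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / ell**2)

def gp(X, y, Xs, sn=1e-3):
    """GP posterior mean and variance at test points Xs."""
    K = k(X, X) + sn**2 * np.eye(len(X))
    Ks = k(Xs, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, var

f = lambda x: np.sin(4.0 * x)          # stand-in for the expensive model
X = np.array([0.0, 1.0])               # initial design
y = f(X)
cand = np.linspace(0.0, 1.0, 201)      # candidate pool

for _ in range(6):                     # active-learning loop
    _, var = gp(X, y, cand)
    xn = cand[np.argmax(var)]          # most uncertain candidate
    X = np.append(X, xn)               # run the "expensive" model there
    y = np.append(y, f(xn))

mu, var = gp(X, y, cand)               # final surrogate over the pool
```

Each iteration spends one expensive evaluation where the surrogate is least certain, so the posterior variance collapses across the whole domain faster than with a fixed uniform design.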
Once uncertainty is propagated, quantify which parameters drive output variability:
Global sensitivity indices (Sobol’): \[ S_i = \frac{\mathrm{Var}_{p_i}\!\left[\mathbb{E}[\mathcal{M} \mid p_i]\right]}{\mathrm{Var}[\mathcal{M}]}, \qquad S_{ij}: \text{two-way interaction indices.} \]
First-order \(S_i\): main effect of parameter \(\mathbf{p}_i\) alone. Total \(S_{T_i}\): includes all interactions involving \(\mathbf{p}_i\).
Computation: Can extract analytically from PCE coefficients or estimate via Monte Carlo sampling from surrogate.
Application: Focus calibration/experiment on high-\(S_i\) parameters; neglect low-sensitivity ones.
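First-order indices can be estimated with a pick-freeze (Saltelli-type) Monte Carlo scheme; the additive stand-in model below has exact values \(S_1 = 0.2\), \(S_2 = 0.8\) for checking:

```python
import numpy as np

rng = np.random.default_rng(2)

def model(x):
    # Stand-in model on uniform inputs; additive, so S_1 = 0.2, S_2 = 0.8
    return x[:, 0] + 2.0 * x[:, 1]

N, d = 50_000, 2
A = rng.uniform(size=(N, d))           # two independent sample matrices
B = rng.uniform(size=(N, d))
fA, fB = model(A), model(B)
V = np.var(np.concatenate([fA, fB]))   # total output variance

S = np.empty(d)
for i in range(d):
    ABi = A.copy()
    ABi[:, i] = B[:, i]                # swap in column i only ("pick-freeze")
    S[i] = np.mean(fB * (model(ABi) - fA)) / V   # first-order estimator
```

When a PCE surrogate is available, the same indices follow analytically from sums of squared coefficients, with no extra sampling.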
Paradigm shift: Replace analytical constitutive law with learned model from data.
Example: Direct mapping \[ \boldsymbol{\sigma}_{n+1} = \mathcal{NN}(\boldsymbol{\varepsilon}_n, \boldsymbol{\varepsilon}_{n+1}, \boldsymbol{\sigma}_n, \boldsymbol{\alpha}_n; \mathbf{w}) \]
Advantages:
- Captures complex, multi-scale behavior
- No need for an analytical model form
- Can learn from heterogeneous data (DIC, X-ray, simulations)
Challenges:
- Require diverse loading paths (not just uniaxial tension)
- Ensure frame invariance (use tensor invariants or equivariant networks)
- Enforce thermodynamic consistency (dissipation inequality as a constraint)
- Generalization beyond the training domain
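The mapping can be sketched as a small feed-forward network; the layer sizes are arbitrary and the weights below are untrained placeholders, so only the input/output structure follows the equation above:

```python
import numpy as np

rng = np.random.default_rng(3)

def mlp_stress_update(eps_n, eps_np1, sig_n, alpha_n, w):
    """One-hidden-layer map from the 1-D state to sigma_{n+1} (sketch)."""
    x = np.array([eps_n, eps_np1, sig_n, alpha_n])
    h = np.tanh(w["W1"] @ x + w["b1"])
    return w["W2"] @ h + w["b2"]

# Hypothetical weights: in practice trained on FEM / experimental stress
# paths; random placeholders here just to show the shapes involved.
w = {"W1": rng.normal(size=(16, 4)), "b1": np.zeros(16),
     "W2": rng.normal(size=(1, 16)), "b2": np.zeros(1)}

sig_np1 = mlp_stress_update(0.001, 0.002, 210.0, 0.0, w)
```

In a full tensorial setting the inputs would be invariants of the strain and internal-variable tensors rather than raw components, which is one way to build in frame invariance.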
PINNs approach: Embed physics as soft constraint in loss function: \[ \mathcal{L} = \mathcal{L}_\text{data} + \lambda\mathcal{L}_\text{physics} \]
where \(\mathcal{L}_\text{physics}\) penalizes violation of constitutive relations or balance laws.
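A 1-D sketch of such a composite loss, using non-negative tangent stiffness as an illustrative physics constraint (the two-parameter "network" and penalty weight are assumptions, not the lecture's formulation):

```python
import numpy as np

def sig_hat(w, eps):
    # Tiny polynomial "network": sigma_hat = w0 * eps + w1 * eps**3
    return w[0] * eps + w[1] * eps**3

def loss(w, eps_d, sig_d, eps_c, lam=10.0):
    data = np.mean((sig_hat(w, eps_d) - sig_d) ** 2)
    # Physics penalty: tangent stiffness d sigma / d eps must stay >= 0
    tangent = w[0] + 3.0 * w[1] * eps_c**2
    phys = np.mean(np.minimum(tangent, 0.0) ** 2)
    return data + lam * phys

rng = np.random.default_rng(4)
eps_d = np.linspace(-0.01, 0.01, 21)
sig_d = 200e3 * eps_d + rng.normal(0.0, 0.1, eps_d.size)   # noisy "tests"
eps_c = np.linspace(-0.02, 0.02, 50)                        # collocation pts

good = loss(np.array([200e3, 0.0]), eps_d, sig_d, eps_c)    # physical fit
bad = loss(np.array([200e3, -1e9]), eps_d, sig_d, eps_c)    # softening fit
```

Note that the collocation points extend beyond the data range: the physics term constrains the model where no measurements exist, which is the main appeal of the PINN formulation.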
Current use cases:
- Parameter calibration from test data (Bayesian inverse problems)
- Real-time digital twins (replace expensive FEM in online control)
- Robustness analysis (how sensitive is a design to material uncertainty?)
- Multi-scale modeling (microscale surrogate used inside macroscale FEM)
Emerging directions:
- Hybrid models: combine classical physics with learned corrections
- Multifidelity surrogates: leverage both cheap and expensive simulations
- Domain adaptation: transfer surrogates across similar materials
- Operator learning: learn the entire PDE solution operator (DeepONet, FNO)
When to use surrogates: If you need >100 model evaluations AND budget/time is constrained.
Constitutive Models — Technion | Website