13 L11 — Data-Driven & Surrogate Modeling
Gaussian Processes, Neural Networks, and Uncertainty Quantification
13.1 Motivation: The Computational Cost Problem
A single nonlinear FEM simulation may take hours to days.
Applications requiring thousands of evaluations:
- Parameter calibration (inverse problem)
- Uncertainty propagation / Monte Carlo
- Design optimization
- Real-time digital twins
Surrogate model (metamodel): a cheap approximation \(\hat{\mathcal{M}}(\boldsymbol{\theta}) \approx \mathcal{M}(\boldsymbol{\theta})\) trained on a limited number of FEM evaluations.
13.2 Types of Surrogate Models
| Method | Strengths | Limitations |
|---|---|---|
| Polynomial Response Surface | Simple, cheap | Poor for nonlinear/high-dim |
| Kriging / Gaussian Process (GP) | Uncertainty estimates, flexible | Scales as \(O(n^3)\) |
| Radial Basis Functions (RBF) | Mesh-free, good interpolation | No uncertainty |
| Neural Networks (NN) | Very flexible, scales well | Data-hungry; no intrinsic UQ |
| Polynomial Chaos Expansion (PCE) | Intrinsic UQ, spectral accuracy | Curse of dimensionality |
13.3 Gaussian Process Regression
A GP defines a distribution over functions: \[ f(\mathbf{x}) \sim \mathcal{GP}(m(\mathbf{x}),\, k(\mathbf{x},\mathbf{x}')), \]
where \(m(\cdot)\) is the mean function and \(k(\cdot,\cdot)\) the kernel (covariance function).
Prediction at new point \(\mathbf{x}_*\): \[ \mu_* = k(\mathbf{x}_*,\mathbf{X})[\mathbf{K} + \sigma_n^2\mathbf{I}]^{-1}\mathbf{y}, \qquad \sigma_*^2 = k(\mathbf{x}_*,\mathbf{x}_*) - k(\mathbf{x}_*,\mathbf{X})[\mathbf{K}+\sigma_n^2\mathbf{I}]^{-1}k(\mathbf{X},\mathbf{x}_*). \]
A GP delivers both a prediction and an uncertainty estimate — essential for active learning.
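The posterior formulas above can be implemented directly; the sketch below is a minimal NumPy version (the squared-exponential kernel and its hyperparameters are illustrative, not tuned):

```python
import numpy as np

def rbf_kernel(A, B, sigma_f=1.0, ell=1.0):
    """Squared-exponential kernel k(x, x') = sigma_f^2 exp(-|x - x'|^2 / 2 ell^2)."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / ell**2)

def gp_predict(X, y, X_star, sigma_n=1e-3, **kern):
    """Posterior mean mu_* and variance sigma_*^2 per the equations above."""
    K = rbf_kernel(X, X, **kern) + sigma_n**2 * np.eye(len(X))
    K_s = rbf_kernel(X_star, X, **kern)
    mu = K_s @ np.linalg.solve(K, y)              # k_*^T [K + sigma_n^2 I]^{-1} y
    v = np.linalg.solve(K, K_s.T)                 # [K + sigma_n^2 I]^{-1} k(X, x_*)
    var = rbf_kernel(X_star, X_star, **kern).diagonal() - np.sum(K_s * v.T, axis=1)
    return mu, var

# Usage: interpolate a 1-D response from 5 "FEM" samples of sin(2*pi*x).
X = np.linspace(0, 1, 5)[:, None]
y = np.sin(2 * np.pi * X[:, 0])
mu, var = gp_predict(X, y, np.array([[0.25], [0.125]]), ell=0.3)
# mu[0] reproduces the training value at x = 0.25; var grows away from the data.
```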
13.4 Kernels for Mechanics
Common kernels:
- Squared exponential (RBF): \(k(r) = \sigma_f^2\exp(-r^2/2l^2)\) — smooth, infinitely differentiable
- Matérn 3/2: \(k(r) = \sigma_f^2(1+\sqrt{3}r/l)\exp(-\sqrt{3}r/l)\) — once-differentiable, better for response curves
- Periodic: for cyclic loading data
For constitutive modeling: the Matérn class often better matches the finite smoothness of plastic responses.
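As a quick comparison, both stationary kernels can be evaluated as functions of the distance \(r\) (a minimal sketch; \(\sigma_f = l = 1\) are placeholder values):

```python
import numpy as np

def k_rbf(r, sigma_f=1.0, ell=1.0):
    """Squared-exponential: infinitely differentiable (very smooth) samples."""
    return sigma_f**2 * np.exp(-r**2 / (2 * ell**2))

def k_matern32(r, sigma_f=1.0, ell=1.0):
    """Matern 3/2: once-differentiable samples, better for plasticity-like data."""
    a = np.sqrt(3.0) * r / ell
    return sigma_f**2 * (1.0 + a) * np.exp(-a)

r = np.linspace(0.0, 3.0, 7)
smooth, rough = k_rbf(r), k_matern32(r)
# Both start at sigma_f^2 and decay monotonically; the Matern tail is heavier
# at large r, so distant points retain slightly more correlation.
```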
13.5 Neural Networks for Constitutive Models
Data-driven constitutive models replace the analytical form with a NN mapping: \[ \boldsymbol{\sigma}_{n+1} = \text{NN}(\boldsymbol{\varepsilon}_n, \boldsymbol{\varepsilon}_{n+1}, \boldsymbol{\sigma}_n, \boldsymbol{\alpha}_n;\, \mathbf{w}), \]
where \(\mathbf{w}\) are trained weights.
Key considerations:
- Thermodynamic consistency: encode dissipation inequality as constraint or via special architecture
- Frame invariance: train on invariants, not raw tensor components
- Generalization: needs diverse loading paths, not just monotonic tension
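The frame-invariance point can be illustrated by feeding the network isotropic invariants instead of raw components. The helper below is a hypothetical feature map (equivariant architectures are an alternative); the rotation check confirms the invariance:

```python
import numpy as np

def strain_invariants(eps):
    """Isotropic invariants of a symmetric 3x3 strain tensor — frame-invariant
    NN input features (a hypothetical feature choice for illustration)."""
    I1 = np.trace(eps)
    I2 = 0.5 * (np.trace(eps)**2 - np.trace(eps @ eps))
    I3 = np.linalg.det(eps)
    return np.array([I1, I2, I3])

# Invariance check: rotate the tensor and compare the features.
eps = np.array([[1.0, 0.2, 0.0],
                [0.2, 0.5, 0.1],
                [0.0, 0.1, -0.3]])
c, s = np.cos(0.7), np.sin(0.7)
Q = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])  # rotation about z
same = np.allclose(strain_invariants(eps), strain_invariants(Q @ eps @ Q.T))
```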
13.6 Physics-Informed Neural Networks (PINNs)
PINNs embed the governing PDEs in the loss function: \[ \mathcal{L} = \mathcal{L}_\text{data} + \lambda_\text{pde}\mathcal{L}_\text{pde} + \lambda_\text{bc}\mathcal{L}_\text{bc} \]
Applied to constitutive modeling:
- Enforce constitutive relations as soft constraints
- Can learn from heterogeneous full-field data (DIC images)
- Enables simultaneous identification + simulation
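To make the composite loss concrete, the toy sketch below fits a quadratic trial function (standing in for the network) to \(u'' = 2\) on \([0,1]\) with \(u(0) = u(1) = 0\). Because this model is linear in its parameters, the weighted sum \(\mathcal{L}_\text{data} + \lambda_\text{pde}\mathcal{L}_\text{pde} + \lambda_\text{bc}\mathcal{L}_\text{bc}\) can be minimized in closed form; a real PINN would use automatic differentiation and gradient descent, but the loss structure is the same:

```python
import numpy as np

# Trial function u(x; w) = w0 + w1 x + w2 x^2; exact solution is u = x^2 - x.
x_d = np.linspace(0.1, 0.9, 5)
y_d = x_d**2 - x_d                        # "measured" data (noise-free here)

lam_pde, lam_bc = 1.0, 10.0
# Stack sqrt(lambda)-weighted residual equations A w = b; least squares on
# this system minimizes L_data + lam_pde*L_pde + lam_bc*L_bc exactly.
A_data = np.column_stack([np.ones_like(x_d), x_d, x_d**2])
A_pde = np.sqrt(lam_pde) * np.array([[0.0, 0.0, 2.0]])   # u'' = 2 w2
A_bc = np.sqrt(lam_bc) * np.array([[1.0, 0.0, 0.0],      # u(0) = 0
                                   [1.0, 1.0, 1.0]])     # u(1) = 0
A = np.vstack([A_data, A_pde, A_bc])
b = np.concatenate([y_d, [np.sqrt(lam_pde) * 2.0], [0.0, 0.0]])
w, *_ = np.linalg.lstsq(A, b, rcond=None)
# w approaches (0, -1, 1), i.e. u = x^2 - x: data, PDE, and BC terms agree.
```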
13.7 Uncertainty Quantification (UQ)
Sources of uncertainty:
| Source | Type | Treatment |
|---|---|---|
| Model parameters \(\boldsymbol{\theta}\) | Epistemic | Bayesian inference, MC |
| Model form error | Epistemic | Model comparison, discrepancy |
| Experimental noise | Aleatoric | Statistical noise model |
| Numerical error | Epistemic | Mesh convergence, verification |
Propagation: if \(\boldsymbol{\theta} \sim p(\boldsymbol{\theta})\), then \(\mathbf{y} = \mathcal{M}(\boldsymbol{\theta})\) has a distribution \(p(\mathbf{y})\).
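A brute-force propagation sketch; the closed-form shear-modulus map \(G = E/(2(1+\nu))\) is a hypothetical cheap stand-in for an expensive FEM run, and the input scatter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(theta):
    """Stand-in for the expensive map M(theta): shear modulus from (E, nu)."""
    E, nu = theta[..., 0], theta[..., 1]
    return E / (2.0 * (1.0 + nu))

# theta ~ p(theta): Young's modulus and Poisson ratio with assumed scatter.
theta = np.column_stack([rng.normal(210e3, 5e3, 10_000),   # E  [MPa]
                         rng.normal(0.30, 0.01, 10_000)])  # nu [-]
y = model(theta)                         # samples from p(y)
mean, std = y.mean(), y.std()            # summary statistics of the output
```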
13.8 Polynomial Chaos Expansion
Represent the output as an expansion in orthogonal polynomials of the input random variables: \[ \mathcal{M}(\boldsymbol{\xi}) \approx \sum_{\boldsymbol{\alpha}\in\mathcal{A}} c_{\boldsymbol{\alpha}}\Psi_{\boldsymbol{\alpha}}(\boldsymbol{\xi}) \]
where \(\boldsymbol{\xi}\) are standardised random inputs and \(\Psi_{\boldsymbol{\alpha}}\) are Hermite/Legendre polynomials.
Coefficients \(c_{\boldsymbol{\alpha}}\) computed via non-intrusive sampling (regression or sparse quadrature). Mean and variance extracted analytically from the coefficients.
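A minimal non-intrusive PCE by regression, for the toy output \(\mathcal{M}(\xi) = \xi^2\) with a standard-normal input, so the exact answer is known (mean 1, variance 2):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander
from math import factorial

rng = np.random.default_rng(1)

# Expand M(xi) = xi^2 in probabilists' Hermite polynomials He_n(xi),
# orthogonal w.r.t. the standard normal with E[He_n He_m] = n! * delta_nm.
xi = rng.standard_normal(200)
y = xi**2                                  # stand-in model output
P = 3                                      # truncation degree
V = hermevander(xi, P)                     # columns He_0 .. He_P at samples
c, *_ = np.linalg.lstsq(V, y, rcond=None)  # regression for coefficients

# Moments follow analytically from the coefficients:
mean = c[0]
var = sum(c[n]**2 * factorial(n) for n in range(1, P + 1))
# Since xi^2 = He_2(xi) + 1 exactly: mean = 1, var = 1^2 * 2! = 2.
```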
13.9 Surrogate Modeling Workflow
Five-step process:
1. Design of Experiments (DoE): Strategically sample the parameter space (e.g., Latin Hypercube Sampling).
2. Training Data: Run the expensive FEM at the sampled points; collect input-output pairs.
3. Surrogate Fit: Train the model (GP, NN, PCE) on the collected data.
4. Validation: Test accuracy on held-out data; compare predictions vs. the true model.
5. Deployment: Use the fast surrogate for UQ, optimization, inverse problems, sensitivity analysis.
Key principle: Exploit structure (polynomial bases, kernel functions) to achieve accuracy with few training points.
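The five steps can be sketched end to end; `expensive_model` is a hypothetical cheap stand-in for the FEM run, and the surrogate is a full quadratic response surface:

```python
import numpy as np

rng = np.random.default_rng(2)

def expensive_model(x):
    """Stand-in for the FEM run (hypothetical cheap example on [0,1]^2)."""
    return np.sin(3 * x[:, 0]) + x[:, 1]**2

# 1. DoE: Latin Hypercube Sample — exactly one point per stratum per dimension.
def lhs(n, d):
    return np.column_stack([(rng.permutation(n) + rng.random(n)) / n
                            for _ in range(d)])

X = lhs(20, 2)
# 2. Training data: run the "expensive" model at the DoE points.
y = expensive_model(X)
# 3. Surrogate fit: full quadratic response surface via least squares.
def features(X):
    return np.column_stack([np.ones(len(X)), X[:, 0], X[:, 1],
                            X[:, 0]**2, X[:, 1]**2, X[:, 0] * X[:, 1]])
c, *_ = np.linalg.lstsq(features(X), y, rcond=None)
# 4. Validation: RMSE on held-out points.
X_val = rng.random((200, 2))
err = np.sqrt(np.mean((features(X_val) @ c - expensive_model(X_val))**2))
# 5. Deployment: features(X_new) @ c is now essentially free to evaluate.
```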
13.10 Uncertainty Quantification Framework
Input-to-output propagation:
Given uncertain parameters \(\mathbf{p}\), compute distribution of output \(\mathbf{y} = \mathcal{M}(\mathbf{p})\).
Methods:
| Method | Cost | Convergence | Suitability |
|---|---|---|---|
| Monte Carlo | \(N\) model runs | \(O(N^{-1/2})\), dimension-independent | Any \(d\) |
| Quasi-Monte Carlo | \(N\) model runs | \(O(\log^d N / N)\) | Moderate-to-high \(d\) |
| Polynomial Chaos | Grows with basis size | Spectral (smooth outputs) | Low \(d\); intrinsic UQ |
| Stochastic collocation | Tensor/sparse grids | Fast for smooth responses | Low-to-medium \(d\) |
For high-dimensional problems with many parameters, surrogate + Monte Carlo is practical.
13.11 Active Learning and Adaptive Sampling
Rather than uniform sampling, use surrogate uncertainty estimates to guide new training point placement:
- Evaluate surrogate at candidate points; compute prediction variance \(\sigma_*^2\).
- Select points with high uncertainty / high model disagreement.
- Evaluate true model at selected point; add to training data.
- Refit surrogate and repeat.
Result: Efficient use of expensive evaluations; converges faster than static DoE.
GPs naturally provide uncertainty → perfect for active learning.
NNs require external UQ (e.g., ensemble, Bayesian approximation, MC dropout).
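A minimal 1-D version of this loop, with a tiny hand-rolled GP; the test function, lengthscale, and evaluation budget are all illustrative:

```python
import numpy as np

def k(a, b, ell=0.2):
    """Squared-exponential kernel for 1-D inputs."""
    return np.exp(-(a[:, None] - b[None, :])**2 / (2 * ell**2))

def gp_fit_predict(x, y, x_cand, noise=1e-6):
    """Posterior mean and variance at the candidate points."""
    K = k(x, x) + noise * np.eye(len(x))
    mu = k(x_cand, x) @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(k(x_cand, x) * np.linalg.solve(K, k(x, x_cand)).T, axis=1)
    return mu, var

def true_model(x):                      # stand-in for the expensive solver
    return np.sin(4 * np.pi * x) * x

x_train = np.array([0.0, 0.5, 1.0])    # tiny initial DoE
y_train = true_model(x_train)
x_cand = np.linspace(0, 1, 201)        # candidate pool

for _ in range(10):                    # active-learning loop
    mu, var = gp_fit_predict(x_train, y_train, x_cand)
    x_new = x_cand[np.argmax(var)]     # most uncertain candidate
    x_train = np.append(x_train, x_new)
    y_train = np.append(y_train, true_model(x_new))

mu, var = gp_fit_predict(x_train, y_train, x_cand)
# The maximum posterior variance shrinks as points fill the uncertain gaps.
```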
13.12 Sensitivity Analysis: Identifying Important Parameters
Once uncertainty is propagated, quantify which parameters drive output variability:
Global sensitivity indices (Sobol’): \[ S_i = \frac{\mathrm{Var}_{p_i}\!\left[\mathbb{E}[\mathcal{M} \mid p_i]\right]}{\mathrm{Var}[\mathcal{M}]} \]
First-order \(S_i\): main effect of parameter \(p_i\) alone; second-order \(S_{ij}\): two-way interaction between \(p_i\) and \(p_j\). Total \(S_{T_i}\): includes all interactions involving \(p_i\).
Computation: Can extract analytically from PCE coefficients or estimate via Monte Carlo sampling from surrogate.
Application: Focus calibration/experiment on high-\(S_i\) parameters; neglect low-sensitivity ones.
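First-order indices can be estimated from the surrogate by the classic pick-freeze (Sobol'/Saltelli-style) estimator; the linear test model below is hypothetical and chosen so the exact indices are known:

```python
import numpy as np

rng = np.random.default_rng(3)

def model(p):
    """Hypothetical surrogate: cheap enough for brute-force sampling."""
    return p[:, 0] + 2.0 * p[:, 1]      # exact indices: S1 = 0.2, S2 = 0.8

N, d = 100_000, 2
A, B = rng.random((N, d)), rng.random((N, d))   # two independent sample sets
yA, yB = model(A), model(B)
V = yA.var()

S = np.empty(d)
for i in range(d):
    Ci = B.copy()
    Ci[:, i] = A[:, i]                  # "freeze" parameter i at A's values
    S[i] = (np.mean(yA * model(Ci)) - yA.mean() * yB.mean()) / V
# S approaches (0.2, 0.8): p2's coefficient is twice p1's,
# so it carries four times the variance share.
```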
13.13 Data-Driven Constitutive Models
Paradigm shift: Replace analytical constitutive law with learned model from data.
Example: Direct mapping \[ \boldsymbol{\sigma}_{n+1} = \mathcal{NN}(\boldsymbol{\varepsilon}_n, \boldsymbol{\varepsilon}_{n+1}, \boldsymbol{\sigma}_n, \boldsymbol{\alpha}_n; \mathbf{w}) \]
Advantages:
- Captures complex, multi-scale behavior
- No need for an analytical model form
- Can learn from heterogeneous data (DIC, X-ray, simulations)
Challenges:
- Requires diverse loading paths (not just uniaxial tension)
- Ensuring frame invariance (use tensor invariants or equivariant networks)
- Enforcing thermodynamic consistency (dissipation inequality as constraint)
- Generalization beyond the training domain
PINNs approach: Embed physics as soft constraint in loss function: \[ \mathcal{L} = \mathcal{L}_\text{data} + \lambda\mathcal{L}_\text{physics} \]
where \(\mathcal{L}_\text{physics}\) penalizes violation of constitutive relations or balance laws.
13.14 Industrial Applications and Outlook
Current use cases:
- Parameter calibration from test data (Bayesian inverse problems)
- Real-time digital twins (replace expensive FEM in online control)
- Robustness analysis (how sensitive is a design to material uncertainty?)
- Multi-scale modeling (surrogate for the microscale → used in macroscale FEM)
Emerging directions:
- Hybrid models: combine classical physics with learned corrections
- Multifidelity surrogates: leverage both cheap and expensive simulations
- Domain adaptation: transfer surrogates across similar materials
- Operator learning: learn the entire PDE solution operator (DeepONet, FNO)
When to use surrogates: If you need >100 model evaluations AND budget/time is constrained.