L11 — Data-Driven & Surrogate Modeling

Gaussian Processes, Neural Networks, and Uncertainty Quantification

📽 Slides: Open presentation

Motivation: The Computational Cost Problem

A single nonlinear FEM simulation may take hours to days.

Applications requiring thousands of evaluations:

  • Parameter calibration (inverse problem)
  • Uncertainty propagation / Monte Carlo
  • Design optimization
  • Real-time digital twins

Surrogate model (metamodel): a cheap approximation \(\hat{\mathcal{M}}(\boldsymbol{\theta}) \approx \mathcal{M}(\boldsymbol{\theta})\) trained on a limited number of FEM evaluations.

Types of Surrogate Models

| Method | Strengths | Limitations |
| --- | --- | --- |
| Polynomial response surface | Simple, cheap | Poor for nonlinear/high-dimensional problems |
| Kriging / Gaussian process (GP) | Uncertainty estimates, flexible | Scales as \(O(n^3)\) |
| Radial basis functions (RBF) | Mesh-free, good interpolation | No uncertainty estimate |
| Neural networks (NN) | Very flexible, scales well | Need lots of data, no built-in UQ |
| Polynomial chaos expansion (PCE) | Intrinsic UQ, spectral accuracy | Curse of dimensionality |

Gaussian Process Regression

A GP defines a distribution over functions: \[ f(\mathbf{x}) \sim \mathcal{GP}(m(\mathbf{x}),\, k(\mathbf{x},\mathbf{x}')), \]

where \(m(\cdot)\) is the mean function and \(k(\cdot,\cdot)\) the kernel (covariance function).

Prediction at new point \(\mathbf{x}_*\): \[ \mu_* = k(\mathbf{x}_*,\mathbf{X})[\mathbf{K} + \sigma_n^2\mathbf{I}]^{-1}\mathbf{y}, \qquad \sigma_*^2 = k(\mathbf{x}_*,\mathbf{x}_*) - k(\mathbf{x}_*,\mathbf{X})[\mathbf{K}+\sigma_n^2\mathbf{I}]^{-1}k(\mathbf{X},\mathbf{x}_*). \]

GP gives both prediction and uncertainty estimate — essential for active learning.
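As a minimal sketch (not from the slides), the predictive equations above can be implemented directly in NumPy; the squared-exponential kernel and all hyperparameters below are illustrative choices:

```python
import numpy as np

def rbf_kernel(A, B, sf=1.0, ell=0.15):
    """Squared-exponential kernel k(x, x') = sf^2 exp(-(x - x')^2 / (2 ell^2))."""
    return sf**2 * np.exp(-(A[:, None] - B[None, :]) ** 2 / (2 * ell**2))

def gp_predict(X, y, Xs, sn=1e-3, sf=1.0, ell=0.15):
    """Posterior mean mu_* and variance sigma_*^2 from the equations above."""
    K = rbf_kernel(X, X, sf, ell) + sn**2 * np.eye(len(X))  # K + sn^2 I
    Ks = rbf_kernel(Xs, X, sf, ell)                         # k(x_*, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = sf**2 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, var

# Toy 1-D "FEM response" observed at six training points
X = np.linspace(0.0, 1.0, 6)
y = np.sin(2 * np.pi * X)
mu, var = gp_predict(X, y, np.array([0.5, 3.0]))
# the variance grows toward the prior sf^2 far from the data
```

Note that the same Cholesky/solve of \(\mathbf{K}+\sigma_n^2\mathbf{I}\) serves both the mean and the variance, which is why GP prediction is cheap once the \(O(n^3)\) factorization is done.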

Kernels for Mechanics

Common kernels:

  • Squared exponential (RBF): \(k(r) = \sigma_f^2\exp(-r^2/2l^2)\) — smooth, infinitely differentiable
  • Matérn 3/2: \(k(r) = \sigma_f^2(1+\sqrt{3}r/l)\exp(-\sqrt{3}r/l)\) — once-differentiable, better for response curves
  • Periodic: for cyclic loading data

For constitutive modeling: the Matérn class often better matches the finite smoothness of plastic responses.
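To make the comparison concrete, the two stationary kernels can be written as functions of the distance \(r\) (the values of \(\sigma_f\) and \(l\) here are illustrative):

```python
import numpy as np

def k_rbf(r, sf=1.0, ell=1.0):
    """Squared exponential: infinitely differentiable sample paths."""
    return sf**2 * np.exp(-r**2 / (2 * ell**2))

def k_matern32(r, sf=1.0, ell=1.0):
    """Matern 3/2: once-differentiable sample paths."""
    a = np.sqrt(3.0) * np.abs(r) / ell
    return sf**2 * (1.0 + a) * np.exp(-a)

# Both equal sf^2 at r = 0, but the Matern kernel decays faster at
# moderate r, encoding rougher (less smooth) functions.
```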

Neural Networks for Constitutive Models

Data-driven constitutive models replace the analytical form with a NN mapping: \[ \boldsymbol{\sigma}_{n+1} = \text{NN}(\boldsymbol{\varepsilon}_n, \boldsymbol{\varepsilon}_{n+1}, \boldsymbol{\sigma}_n, \boldsymbol{\alpha}_n;\, \mathbf{w}), \]

where \(\mathbf{w}\) are trained weights.

Key considerations:

  • Thermodynamic consistency: encode dissipation inequality as constraint or via special architecture
  • Frame invariance: train on invariants, not raw tensor components
  • Generalization: needs diverse loading paths, not just monotonic tension
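A deliberately minimal sketch of such a stress-update mapping for a scalar (1-D) case; the one-hidden-layer architecture and the random stand-in weights are purely illustrative, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random weights stand in for the trained parameters w of the NN
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)

def nn_stress_update(eps_n, eps_np1, sig_n, alpha_n):
    """sigma_{n+1} = NN(eps_n, eps_{n+1}, sigma_n, alpha_n; w), scalar version."""
    x = np.array([eps_n, eps_np1, sig_n, alpha_n])
    h = np.tanh(W1 @ x + b1)      # hidden layer
    return float(W2 @ h + b2)     # predicted sigma_{n+1}

sig = nn_stress_update(0.001, 0.002, 200.0, 0.0)
```

In practice the inputs would be invariants (for frame invariance) and normalized to comparable scales before entering the network.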

Physics-Informed Neural Networks (PINNs)

PINNs embed the governing PDEs in the loss function: \[ \mathcal{L} = \mathcal{L}_\text{data} + \lambda_\text{pde}\mathcal{L}_\text{pde} + \lambda_\text{bc}\mathcal{L}_\text{bc} \]

Applied to constitutive modeling:

  • Enforce constitutive relations as soft constraints
  • Can learn from heterogeneous full-field data (DIC images)
  • Enables simultaneous identification + simulation
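A schematic of the composite loss above, written for precomputed residual arrays (the names and weights are placeholders; in a real PINN the PDE and BC residuals come from automatic differentiation of the network):

```python
import numpy as np

def pinn_loss(pred, target, pde_residual, bc_residual,
              lam_pde=1.0, lam_bc=1.0):
    """L = L_data + lam_pde * L_pde + lam_bc * L_bc, each a mean-square term."""
    L_data = np.mean((pred - target) ** 2)
    L_pde = np.mean(pde_residual ** 2)
    L_bc = np.mean(bc_residual ** 2)
    return L_data + lam_pde * L_pde + lam_bc * L_bc

# Perfect fit and zero residuals give zero loss
loss = pinn_loss(np.array([1.0, 2.0]), np.array([1.0, 2.0]),
                 np.zeros(3), np.zeros(2))
```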

Uncertainty Quantification (UQ)

Sources of uncertainty:

| Source | Type | Treatment |
| --- | --- | --- |
| Model parameters \(\boldsymbol{\theta}\) | Epistemic | Bayesian inference, MC |
| Model form error | Epistemic | Model comparison, discrepancy |
| Experimental noise | Aleatoric | Statistical noise model |
| Numerical error | Epistemic | Mesh convergence, verification |

Propagation: if \(\boldsymbol{\theta} \sim p(\boldsymbol{\theta})\), then \(\mathbf{y} = \mathcal{M}(\boldsymbol{\theta})\) has a distribution \(p(\mathbf{y})\).
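Propagation by brute-force sampling can be sketched with a cheap analytic stand-in for \(\mathcal{M}\) (in practice this is exactly where a surrogate replaces the FEM model):

```python
import numpy as np

rng = np.random.default_rng(1)

def model(theta):
    """Stand-in for the expensive model M(theta): a cheap analytic map."""
    return theta[..., 0] ** 2 + 0.5 * theta[..., 1]

# theta ~ p(theta): two independent standard-normal parameters
theta = rng.standard_normal((100_000, 2))
y = model(theta)                  # pushes p(theta) through the model

mean_y, std_y = y.mean(), y.std()
# Analytically: E[y] = 1 and Var[y] = 2 + 0.25 = 2.25, i.e. std 1.5
```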

Polynomial Chaos Expansion

Represent the output as an expansion in orthogonal polynomials of the input random variables: \[ \mathcal{M}(\boldsymbol{\xi}) \approx \sum_{\boldsymbol{\alpha}\in\mathcal{A}} c_{\boldsymbol{\alpha}}\Psi_{\boldsymbol{\alpha}}(\boldsymbol{\xi}) \]

where \(\boldsymbol{\xi}\) are standardised random inputs and \(\Psi_{\boldsymbol{\alpha}}\) are Hermite/Legendre polynomials.

Coefficients \(c_{\boldsymbol{\alpha}}\) computed via non-intrusive sampling (regression or sparse quadrature). Mean and variance extracted analytically from the coefficients.
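A 1-D sketch of the non-intrusive regression route with probabilists' Hermite polynomials; the toy model is chosen to lie in the basis, so the coefficients (and hence mean and variance) are recovered essentially exactly:

```python
import numpy as np

rng = np.random.default_rng(2)

def hermite_design(xi, p=3):
    """Probabilists' Hermite polynomials He_0..He_p, orthogonal under N(0,1)."""
    He = [np.ones_like(xi), xi, xi**2 - 1.0, xi**3 - 3.0 * xi]
    return np.stack(He[: p + 1], axis=1)

model = lambda xi: 1.0 + 2.0 * xi + 0.5 * (xi**2 - 1.0)  # "expensive" model

xi = rng.standard_normal(200)                 # non-intrusive sample of xi
c, *_ = np.linalg.lstsq(hermite_design(xi), model(xi), rcond=None)

mean = c[0]                                       # E[M] = c_0
var = c[1]**2 * 1 + c[2]**2 * 2 + c[3]**2 * 6     # Var = sum c_a^2 E[Psi_a^2]
```

The variance formula uses \(\mathbb{E}[He_n^2] = n!\), which is what "mean and variance extracted analytically from the coefficients" amounts to.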

Surrogate Modeling Workflow

Five-step process:

  1. Design of Experiments (DoE): Strategically sample the parameter space (e.g., Latin Hypercube Sampling).
  2. Training Data: Run expensive FEM at sampled points; collect input-output pairs.
  3. Surrogate Fit: Train model (GP, NN, PCE) on collected data.
  4. Validation: Test accuracy on held-out data; compare predictions vs. true model.
  5. Deployment: Use fast surrogate for UQ, optimization, inverse problems, sensitivity analysis.
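Step 1 can be sketched with a basic Latin Hypercube sampler (an illustrative implementation, not any particular library's API): each of the \(n\) strata per dimension receives exactly one point.

```python
import numpy as np

rng = np.random.default_rng(3)

def latin_hypercube(n, d):
    """n samples in [0, 1]^d with one point per stratum in each dimension."""
    samples = np.empty((n, d))
    for j in range(d):
        perm = rng.permutation(n)                  # shuffle stratum order
        samples[:, j] = (perm + rng.random(n)) / n  # jitter within stratum
    return samples

X = latin_hypercube(8, 2)   # 8 training points in a 2-D parameter space
```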

Key principle: Exploit structure (polynomial bases, kernel functions) to achieve accuracy with few training points.

Uncertainty Quantification Framework

Input-to-output propagation:

Given uncertain parameters \(\mathbf{p}\), compute distribution of output \(\mathbf{y} = \mathcal{M}(\mathbf{p})\).

Methods:

| Method | Cost | Accuracy | Suitability |
| --- | --- | --- | --- |
| Monte Carlo | \(O(N)\) | Slow, \(O(N^{-1/2})\) | Converges for any \(d\) |
| Quasi-Monte Carlo | \(O(N\log N)\) | \(O(\log^d N / N)\) | High-dimension friendly |
| Polynomial chaos | Spectral | Spectral (if low \(d\)) | Intrinsic UQ |
| Collocation | Grid-based | Curse of dimensionality | Low-to-medium \(d\) |

For high-dimensional problems with many parameters, surrogate + Monte Carlo is practical.

Active Learning and Adaptive Sampling

Rather than uniform sampling, use surrogate uncertainty estimates to guide new training point placement:

  1. Evaluate surrogate at candidate points; compute prediction variance \(\sigma_*^2\).
  2. Select points with high uncertainty / high model disagreement.
  3. Evaluate true model at selected point; add to training data.
  4. Refit surrogate and repeat.

Result: Efficient use of expensive evaluations; converges faster than static DoE.
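The loop can be sketched with a GP posterior-variance criterion on a 1-D toy problem (kernel, noise level, and candidate grid are all illustrative):

```python
import numpy as np

def k(a, b, ell=0.2):
    """Illustrative squared-exponential kernel with unit variance."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell**2))

def posterior_var(X, Xc, sn2=1e-6):
    """GP prediction variance at candidates Xc given training inputs X."""
    K = k(X, X) + sn2 * np.eye(len(X))
    Ks = k(Xc, X)
    return 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)

f = lambda x: np.sin(6 * x)            # stand-in for the expensive model
X = np.array([0.0, 1.0])               # initial design
yv = f(X)
cand = np.linspace(0.0, 1.0, 101)      # candidate pool

for _ in range(5):                     # steps 1-4 above
    var = posterior_var(X, cand)
    x_new = cand[np.argmax(var)]       # most uncertain candidate
    X = np.append(X, x_new)            # "run FEM" there and refit
    yv = np.append(yv, f(x_new))       # stored for the refit; the variance
                                       # criterion itself depends only on X
```

Starting from the endpoints, the first selected point falls in the middle of the largest data gap, and the maximum predictive variance shrinks with every iteration.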

GPs naturally provide uncertainty → perfect for active learning.

NNs require external UQ (e.g., ensemble, Bayesian approximation, MC dropout).
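One simple external-UQ recipe is a deep ensemble: train several networks independently and read uncertainty from their disagreement. A toy stand-in, with random linear "models" in place of trained NNs:

```python
import numpy as np

rng = np.random.default_rng(5)

# M random linear models stand in for M independently trained networks
M = 10
weights = rng.normal(1.0, 0.1, size=M)

def ensemble_predict(x):
    """Mean prediction plus the ensemble spread as an uncertainty proxy."""
    preds = np.array([w * x for w in weights])
    return preds.mean(), preds.std()

mu, spread = ensemble_predict(2.0)
```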

Sensitivity Analysis: Identifying Important Parameters

Once uncertainty is propagated, quantify which parameters drive output variability:

Global sensitivity indices (Sobol’): \[ S_i = \frac{\mathrm{Var}_{p_i}\!\left[\mathbb{E}[\mathcal{M} \mid p_i]\right]}{\mathrm{Var}[\mathcal{M}]}, \qquad S_{ij} = \text{(second-order, two-way interactions)} \]

First-order \(S_i\): main effect of parameter \(\mathbf{p}_i\) alone. Total \(S_{T_i}\): includes all interactions involving \(\mathbf{p}_i\).

Computation: Can extract analytically from PCE coefficients or estimate via Monte Carlo sampling from surrogate.

Application: Focus calibration/experiment on high-\(S_i\) parameters; neglect low-sensitivity ones.
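First-order indices can be estimated from a (cheap) surrogate by a pick-freeze Monte Carlo scheme; the toy additive model below has known indices \(S_1 = 0.8\), \(S_2 = 0.2\):

```python
import numpy as np

rng = np.random.default_rng(4)

def f(x):
    """Toy surrogate: Var = 2^2 + 1^2 = 5, so S_1 = 0.8 and S_2 = 0.2."""
    return 2.0 * x[:, 0] + 1.0 * x[:, 1]

N, d = 50_000, 2
A = rng.standard_normal((N, d))      # two independent sample matrices
B = rng.standard_normal((N, d))
yA = f(A)

S = np.empty(d)
for i in range(d):
    C = B.copy()
    C[:, i] = A[:, i]                # C shares only x_i with A
    yC = f(C)
    # cov(yA, yC) isolates the variance contribution of x_i
    S[i] = (np.mean(yA * yC) - np.mean(yA) * np.mean(yC)) / np.var(yA)
```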

Data-Driven Constitutive Models

Paradigm shift: Replace analytical constitutive law with learned model from data.

Example: Direct mapping \[ \boldsymbol{\sigma}_{n+1} = \mathcal{NN}(\boldsymbol{\varepsilon}_n, \boldsymbol{\varepsilon}_{n+1}, \boldsymbol{\sigma}_n, \boldsymbol{\alpha}_n; \mathbf{w}) \]

Advantages:

  • Captures complex, multi-scale behavior
  • No need for an analytical model form
  • Can learn from heterogeneous data (DIC, X-ray, simulations)

Challenges:

  • Requires diverse loading paths (not just uniaxial tension)
  • Frame invariance (use tensor invariants or equivariant networks)
  • Thermodynamic consistency (dissipation inequality as a constraint)
  • Generalization beyond the training domain

PINNs approach: Embed physics as soft constraint in loss function: \[ \mathcal{L} = \mathcal{L}_\text{data} + \lambda\mathcal{L}_\text{physics} \]

where \(\mathcal{L}_\text{physics}\) penalizes violation of constitutive relations or balance laws.

Industrial Applications and Outlook

Current use cases:

  • Parameter calibration from test data (Bayesian inverse problems)
  • Real-time digital twins (replace expensive FEM in on-line control)
  • Robustness analysis (how sensitive is the design to material uncertainty?)
  • Multi-scale modeling (surrogate for the microscale, used inside macroscale FEM)

Emerging directions:

  • Hybrid models: combine classical physics with learned corrections
  • Multi-fidelity surrogates: leverage both cheap and expensive simulations
  • Domain adaptation: transfer surrogates across similar materials
  • Operator learning: learn the entire PDE solution operator (DeepONet, FNO)

When to use surrogates: If you need >100 model evaluations AND budget/time is constrained.