3  Deriving the Algebraic System and Galerkin’s Method

Tip: Companion notebooks

Work through these hands-on notebooks alongside this chapter:

3.1 Setup

3.2 Review: The Weak Form

Code
```{python}
#| code-fold: true
print_styled("Assuming constant A, B, C and Dirichlet BCs v(0)=v(L)=0:")
weak_term1 = -Integral(A * diff(v, x) * diff(y, x), (x, 0, L))
weak_term2 = Integral(v * B * diff(y, x), (x, 0, L))
weak_term3 = Integral(v * C * y, (x, 0, L))
rhs_term = Integral(v * F, (x, 0, L))
weak_form_eq = Eq(weak_term1 + weak_term2 + weak_term3, rhs_term)
print_styled("The weak form is:")
Lprint(weak_form_eq)
```

Assuming constant A, B, C and Dirichlet BCs v(0)=v(L)=0:

The weak form is:

\(\displaystyle - \int\limits_{0}^{L} A \frac{d}{d x} v{\left(x \right)} \frac{d}{d x} y{\left(x \right)}\, dx + \int\limits_{0}^{L} B v{\left(x \right)} \frac{d}{d x} y{\left(x \right)}\, dx + \int\limits_{0}^{L} C v{\left(x \right)} y{\left(x \right)}\, dx = \int\limits_{0}^{L} F{\left(x \right)} v{\left(x \right)}\, dx\)
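As a quick sanity check, the integration-by-parts step behind this weak form can be verified symbolically. The sketch below (a minimal illustration, with hypothetical concrete choices $v = x(L-x)$ and $y = x^2$) confirms that the boundary term vanishes when $v(0)=v(L)=0$:

```python
import sympy as sp

x, L = sp.symbols('x L', positive=True)
A = sp.Symbol('A')

# Concrete test/trial functions; v satisfies the Dirichlet conditions v(0) = v(L) = 0
v = x * (L - x)
y = x**2

# Integration by parts: int_0^L v*A*y'' dx = [A v y']_0^L - int_0^L A v' y' dx,
# where the boundary term vanishes because v(0) = v(L) = 0.
lhs = sp.integrate(v * A * sp.diff(y, x, 2), (x, 0, L))
rhs = -sp.integrate(A * sp.diff(v, x) * sp.diff(y, x), (x, 0, L))

assert sp.simplify(lhs - rhs) == 0  # both sides equal A*L**3/3
```

The same identity is what turns the $A\,y''$ term of the strong form into the $-A\,v'y'$ term above.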

3.3 Discretization Using Shape Functions

Code
```{python}
#| code-fold: true
print_styled("The weak form is:")
Lprint(weak_form_eq)
print_styled("Substitute Discretized Approximations")
print_styled(r"Approximate solution $y(x)$ and test function $v(x)$ using shape functions $\phi_k(x)$:")

y_approx = Sum(a[j] * phi(j,x), (j, 1, N))
v_approx = Sum(b[i] * phi(i,x), (i, 1, N))

print_styled("Approximate solution $y(x)$:")
Lprint(Eq(y, y_approx, evaluate=False))

print_styled("Test function $v(x)$:")
Lprint(Eq(v, v_approx, evaluate=False))

print_styled("Derivatives:")
dy_dx_approx = Derivative(y_approx, x)
dv_dx_approx = Derivative(v_approx, x)
Lprint(Eq(Derivative(y,x), dy_dx_approx, evaluate=False))
Lprint(Eq(Derivative(v,x), dv_dx_approx, evaluate=False))

print_styled("Substitute sums into the weak form terms:")
term1_subst_sym = weak_term1.subs({v: v_approx, y: y_approx})
term2_subst_sym = weak_term2.subs({v: v_approx, y: y_approx})
term3_subst_sym = weak_term3.subs({v: v_approx, y: y_approx})
rhs_subst_sym = rhs_term.subs({v: v_approx})

print_styled(r"Term 1 ($-\int A v' y' dx$):")
Lprint(term1_subst_sym)
print_styled(r"Term 2 ($\int B v y' dx$):")
Lprint(term2_subst_sym)
print_styled(r"Term 3 ($\int C v y dx$):")
Lprint(term3_subst_sym)
print_styled(r"RHS Term ($\int v F dx$):")
Lprint(rhs_subst_sym)
```

The weak form is:

\(\displaystyle - \int\limits_{0}^{L} A \frac{d}{d x} v{\left(x \right)} \frac{d}{d x} y{\left(x \right)}\, dx + \int\limits_{0}^{L} B v{\left(x \right)} \frac{d}{d x} y{\left(x \right)}\, dx + \int\limits_{0}^{L} C v{\left(x \right)} y{\left(x \right)}\, dx = \int\limits_{0}^{L} F{\left(x \right)} v{\left(x \right)}\, dx\)

Substitute Discretized Approximations

Approximate solution \(y(x)\) and test function \(v(x)\) using shape functions \(\phi_k(x)\):

Approximate solution \(y(x)\):

\(\displaystyle y{\left(x \right)} = \sum_{j=1}^{N} {a}_{j} \phi_{j}(x)\)

Test function \(v(x)\):

\(\displaystyle v{\left(x \right)} = \sum_{i=1}^{N} {b}_{i} \phi_{i}(x)\)

Derivatives:

\(\displaystyle \frac{d}{d x} y{\left(x \right)} = \frac{\partial}{\partial x} \sum_{j=1}^{N} {a}_{j} \phi_{j}(x)\)

\(\displaystyle \frac{d}{d x} v{\left(x \right)} = \frac{\partial}{\partial x} \sum_{i=1}^{N} {b}_{i} \phi_{i}(x)\)

Substitute sums into the weak form terms:

Term 1 (\(-\int A v' y' dx\)):

\(\displaystyle - \int\limits_{0}^{L} A \left(\frac{\partial}{\partial x} \sum_{j=1}^{N} {a}_{j} \phi_{j}(x)\right) \frac{\partial}{\partial x} \sum_{i=1}^{N} {b}_{i} \phi_{i}(x)\, dx\)

Term 2 (\(\int B v y' dx\)):

\(\displaystyle \int\limits_{0}^{L} B \left(\frac{\partial}{\partial x} \sum_{j=1}^{N} {a}_{j} \phi_{j}(x)\right) \sum_{i=1}^{N} {b}_{i} \phi_{i}(x)\, dx\)

Term 3 (\(\int C v y dx\)):

\(\displaystyle \int\limits_{0}^{L} C \left(\sum_{j=1}^{N} {a}_{j} \phi_{j}(x)\right) \sum_{i=1}^{N} {b}_{i} \phi_{i}(x)\, dx\)

RHS Term (\(\int v F dx\)):

\(\displaystyle \int\limits_{0}^{L} F{\left(x \right)} \sum_{i=1}^{N} {b}_{i} \phi_{i}(x)\, dx\)

3.4 Rearranging the Summation and Integrals

Code
```{python}
#| code-fold: true
print_styled("Since the sums are finite, we may swap integration and summation, then factor out $b_i$.")
print_styled(r"The weak form $\sum_{\text{term}} (\text{Term}) = \text{RHS}$ becomes:")
Lprint(r"\sum_{i=1}^N b_i \left( [\text{Inner part for Term 1}]_i + [\text{Inner part for Term 2}]_i + [\text{Inner part for Term 3}]_i \right) = \sum_{i=1}^N b_i [\text{Inner part for RHS}]_i")

print_styled(r"Where the inner parts (coefficients of $b_i$) are:")

term1_inner = Sum( -A * Integral(diff(phi(i,x), x) * diff(phi(j,x), x), (x, 0, L)) * a[j], (j, 1, N))
term2_inner = Sum(  B * Integral(phi(i,x) * diff(phi(j,x), x), (x, 0, L)) * a[j], (j, 1, N))
term3_inner = Sum(  C * Integral(phi(i,x) * phi(j,x), (x, 0, L)) * a[j], (j, 1, N))
rhs_inner = Integral(phi(i,x) * F, (x, 0, L))

print_styled("Inner part for Term 1 (coefficient of $b_i$):")
Lprint(term1_inner)
print_styled("Inner part for Term 2 (coefficient of $b_i$):")
Lprint(term2_inner)
print_styled("Inner part for Term 3 (coefficient of $b_i$):")
Lprint(term3_inner)
print_styled("Inner part for RHS (coefficient of $b_i$):")
Lprint(rhs_inner)
```

Since the sums are finite, we may swap integration and summation, then factor out \(b_i\).

The weak form \(\sum_{\text{term}} (\text{Term}) = \text{RHS}\) becomes:

\(\displaystyle \sum_{i=1}^N b_i \left( [\text{Inner part for Term 1}]_i + [\text{Inner part for Term 2}]_i + [\text{Inner part for Term 3}]_i \right) = \sum_{i=1}^N b_i [\text{Inner part for RHS}]_i\)

Where the inner parts (coefficients of \(b_i\)) are:

Inner part for Term 1 (coefficient of \(b_i\)):

\(\displaystyle \sum_{j=1}^{N} - A {a}_{j} \int\limits_{0}^{L} \frac{d}{d x} \phi_{i}(x) \frac{d}{d x} \phi_{j}(x)\, dx\)

Inner part for Term 2 (coefficient of \(b_i\)):

\(\displaystyle \sum_{j=1}^{N} B {a}_{j} \int\limits_{0}^{L} \phi_{i}(x) \frac{d}{d x} \phi_{j}(x)\, dx\)

Inner part for Term 3 (coefficient of \(b_i\)):

\(\displaystyle \sum_{j=1}^{N} C {a}_{j} \int\limits_{0}^{L} \phi_{i}(x) \phi_{j}(x)\, dx\)

Inner part for RHS (coefficient of \(b_i\)):

\(\displaystyle \int\limits_{0}^{L} F{\left(x \right)} \phi_{i}(x)\, dx\)
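For a concrete basis these inner integrals can be evaluated in closed form. As an illustration (a hypothetical choice, not the shape functions used later in the book), take $\phi_k(x) = \sin(k\pi x/L)$, which satisfies the Dirichlet conditions; SymPy confirms the familiar orthogonality relations:

```python
import sympy as sp

x = sp.Symbol('x', positive=True)
L = sp.Symbol('L', positive=True)

def phi(k):
    # Hypothetical global basis satisfying phi_k(0) = phi_k(L) = 0
    return sp.sin(k * sp.pi * x / L)

# The two building blocks of the inner parts above
mass  = lambda i, j: sp.integrate(phi(i) * phi(j), (x, 0, L))
stiff = lambda i, j: sp.integrate(sp.diff(phi(i), x) * sp.diff(phi(j), x), (x, 0, L))

assert sp.simplify(mass(1, 2)) == 0                       # off-diagonal vanishes
assert sp.simplify(mass(1, 1) - L/2) == 0                 # diagonal: L/2
assert sp.simplify(stiff(1, 2)) == 0
assert sp.simplify(stiff(1, 1) - sp.pi**2/(2*L)) == 0     # diagonal: (pi)^2/(2L)
```

With this basis, the double sums collapse because only the $i = j$ entries of the inner integrals survive.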

3.5 Obtaining N Equations

Code
```{python}
#| code-fold: true
print_styled("The rearranged form is:")
print_styled("$\\sum_{i=1}^N b_i ( \\text{LHS}_i - \\text{RHS}_i ) = 0$.")
print_styled("Since this must hold for arbitrary $b_i$, the term in parentheses must be zero for each $i$:")
print_styled("  $\\text{LHS}_i - \\text{RHS}_i = 0$.")

equation_for_i = Eq(term1_inner + term2_inner + term3_inner, rhs_inner)
print_styled("This gives N equations (for $i = 1, ..., N$):")
Lprint(equation_for_i)
```

The rearranged form is:

\(\sum_{i=1}^N b_i ( \text{LHS}_i - \text{RHS}_i ) = 0\).

Since this must hold for arbitrary \(b_i\), the term in parentheses must be zero for each \(i\):

\(\text{LHS}_i - \text{RHS}_i = 0\).

This gives N equations (for \(i = 1, ..., N\)):

\(\displaystyle \sum_{j=1}^{N} - A {a}_{j} \int\limits_{0}^{L} \frac{d}{d x} \phi_{i}(x) \frac{d}{d x} \phi_{j}(x)\, dx + \sum_{j=1}^{N} B {a}_{j} \int\limits_{0}^{L} \phi_{i}(x) \frac{d}{d x} \phi_{j}(x)\, dx + \sum_{j=1}^{N} C {a}_{j} \int\limits_{0}^{L} \phi_{i}(x) \phi_{j}(x)\, dx = \int\limits_{0}^{L} F{\left(x \right)} \phi_{i}(x)\, dx\)

3.6 Assembling the Matrix System

Code
```{python}
#| code-fold: true
print_styled("The $i$-th equation:")
print_styled("$\\sum_{j=1}^N K_{ij} a_j = F_i$")
print_styled("leads to the matrix form:")
print_styled("$K a = F$")

K_ij = -A * Integral(diff(phi(i,x), x) * diff(phi(j,x), x), (x, 0, L)) + \
        B * Integral(phi(i,x) * diff(phi(j,x), x), (x, 0, L)) + \
        C * Integral(phi(i,x) * phi(j,x), (x, 0, L))
F_i = Integral(phi(i,x) * F, (x, 0, L))

print_styled("Where the matrix/vector components are:")
Lprint(Eq(Symbol('K_{ij}'), K_ij))
Lprint(Eq(Symbol('F_i'), F_i))

K_sym = MatrixSymbol('K', N, N)
a_sym = MatrixSymbol('a', N, 1)
F_sym = MatrixSymbol('F', N, 1)
matrix_eq = Eq(K_sym * a_sym, F_sym, evaluate=False)
print_styled("The matrix system:")
Lprint(matrix_eq)

print_styled("Writing the full system for $i=1, ..., N$ gives the matrix form $Ka=F$:")

latex_matrix_form = r"""
\begin{bmatrix}
K_{11} & K_{12} & \dots & K_{1N} \\
K_{21} & K_{22} & \dots & K_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
K_{N1} & K_{N2} & \dots & K_{NN}
\end{bmatrix}
\begin{bmatrix}
a_{1} \\
a_{2} \\
\vdots \\
a_{N}
\end{bmatrix}
=
\begin{bmatrix}
F_{1} \\
F_{2} \\
\vdots \\
F_{N}
\end{bmatrix}
"""
Lprint(latex_matrix_form)

print_styled("This linear system $Ka = F$ can be solved for the unknown coefficients vector $a = [a_1, a_2, ..., a_N]^T$.")
```

The \(i\)-th equation:

\(\sum_{j=1}^N K_{ij} a_j = F_i\)

leads to the matrix form:

\(K a = F\)

Where the matrix/vector components are:

\(\displaystyle K_{ij} = - A \int\limits_{0}^{L} \frac{d}{d x} \phi_{i}(x) \frac{d}{d x} \phi_{j}(x)\, dx + B \int\limits_{0}^{L} \phi_{i}(x) \frac{d}{d x} \phi_{j}(x)\, dx + C \int\limits_{0}^{L} \phi_{i}(x) \phi_{j}(x)\, dx\)

\(\displaystyle F_{i} = \int\limits_{0}^{L} F{\left(x \right)} \phi_{i}(x)\, dx\)

The matrix system:

\(\displaystyle K a = F\)

Writing the full system for \(i=1, ..., N\) gives the matrix form \(Ka=F\):

\(\displaystyle \begin{bmatrix} K_{11} & K_{12} & \dots & K_{1N} \\ K_{21} & K_{22} & \dots & K_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ K_{N1} & K_{N2} & \dots & K_{NN} \end{bmatrix} \begin{bmatrix} a_{1} \\ a_{2} \\ \vdots \\ a_{N} \end{bmatrix} = \begin{bmatrix} F_{1} \\ F_{2} \\ \vdots \\ F_{N} \end{bmatrix}\)

This linear system \(Ka = F\) can be solved for the unknown coefficients vector \(a = [a_1, a_2, ..., a_N]^T\).
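As a sketch of how this assembly might look in practice, the snippet below (an illustration with NumPy; the helper `assemble` is hypothetical) builds $K$ and $F$ for constant $A$, $B$, $C$, $F$ using piecewise-linear hat functions on a uniform mesh, whose element integrals have well-known closed forms:

```python
import numpy as np

def assemble(A, B, C, Fconst, L, N):
    """Assemble K and F for hat functions phi_1..phi_N at the interior
    nodes of a uniform mesh with spacing h = L / (N + 1).

    Closed-form integrals for linear hat functions:
      int phi_i' phi_j' dx = 2/h (i=j), -1/h (|i-j|=1)
      int phi_i  phi_j' dx = 0   (i=j), +1/2 (j=i+1), -1/2 (j=i-1)
      int phi_i  phi_j  dx = 2h/3 (i=j), h/6 (|i-j|=1)
    """
    h = L / (N + 1)
    K = np.zeros((N, N))
    for i in range(N):
        K[i, i] = -A * 2/h + C * 2*h/3            # B-term vanishes on the diagonal
        if i + 1 < N:
            K[i, i+1] = -A * (-1/h) + B * 0.5 + C * h/6
            K[i+1, i] = -A * (-1/h) - B * 0.5 + C * h/6
    F = np.full(N, Fconst * h)    # int_0^L F phi_i dx = F*h for constant F
    return K, F

# Example: y'' = 1 on (0,1), y(0) = y(1) = 0  (A=1, B=0, C=0)
K, F = assemble(A=1.0, B=0.0, C=0.0, Fconst=1.0, L=1.0, N=99)
a = np.linalg.solve(K, F)
# Exact solution is y = x*(x-1)/2, so y(0.5) = -0.125; a[49] is the node at x=0.5
```

For this model problem the piecewise-linear Galerkin solution is exact at the nodes, a classical 1D result, so `a[49]` reproduces $y(0.5) = -0.125$ to machine precision.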

3.7 Note on Boundary Conditions

Code
```{python}
#| code-fold: true
print_styled("We assumed $y(0)=0$ and $y(L)=0$. How are these applied to the matrix system?")
print_styled("If using nodal basis functions where:")
print_styled(r"$\phi_k(x_m) = \delta_{km}$ at the nodes $x_m$,")
print_styled("then:")
print_styled(r"$y(x_1) = y(0) = \sum_j a_j \phi_j(x_1) = a_1$.")
print_styled("Similarly,")
print_styled(r"$y(x_N) = y(L) = a_N$.")
print_styled("So, for $y(0)=y(L)=0$, we know $a_1 = 0$ and $a_N = 0$.")
print_styled("These known values must be incorporated when solving $Ka=F$. Common methods include:")
print_styled("  - Modifying the K matrix and F vector (e.g., setting rows/columns to enforce the condition).")
print_styled("  - Partitioning the system into known and unknown degrees of freedom.")
print_styled("Handling boundary conditions precisely is a key part of FEM implementation and will be detailed later.")
```

We assumed \(y(0)=0\) and \(y(L)=0\). How are these applied to the matrix system?

If using nodal basis functions where:

\(\phi_k(x_m) = \delta_{km}\) at the nodes \(x_m\),

then:

\(y(x_1) = y(0) = \sum a_j \phi_j(x_1) = a_1\).

Similarly,

\(y(x_N) = y(L) = a_N\).

So, for \(y(0)=y(L)=0\), we know \(a_1 = 0\) and \(a_N = 0\).

These known values must be incorporated when solving \(Ka=F\). Common methods include:

  • Modifying the K matrix and F vector (e.g., setting rows/columns to enforce the condition).
  • Partitioning the system into known and unknown degrees of freedom.

Handling boundary conditions precisely is a key part of FEM implementation and will be detailed later.
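A minimal sketch of the first approach (row/column modification) on a small system; the helper `apply_dirichlet` is hypothetical, and the pattern shown keeps a symmetric matrix symmetric:

```python
import numpy as np

def apply_dirichlet(K, F, dof, value):
    """Enforce a_dof = value in K a = F: move the known column's
    contribution to the RHS, zero the row and column, and pin the
    diagonal so the dof-th equation reads a_dof = value."""
    K = K.copy(); F = F.copy()
    F -= K[:, dof] * value   # shift the known contribution to the RHS
    K[dof, :] = 0.0
    K[:, dof] = 0.0
    K[dof, dof] = 1.0
    F[dof] = value
    return K, F

# Small demo: 3x3 system with a_1 = 0 and a_3 = 0 enforced (indices 0 and 2)
K = np.array([[ 2., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  2.]])
F = np.array([1., 1., 1.])
for dof in (0, 2):
    K, F = apply_dirichlet(K, F, dof, 0.0)
a = np.linalg.solve(K, F)
# a[0] and a[2] are pinned to 0; the middle equation reduces to 2*a[1] = 1
```

The alternative, partitioning, deletes the constrained rows and columns instead of overwriting them; both yield the same values for the free degrees of freedom.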

3.8 Weighted Residuals and Galerkin’s Method

Code
```{python}
#| code-fold: true
print_styled("Let's revisit the core idea from a different perspective.")
print_styled(r"Our original differential equation is $A(y) = F$, where $A(\cdot)$ denotes the full differential operator (an overloaded use of the symbol, distinct from the coefficient $A$).")
def differential_operator(y_var):
    return A * diff(y_var, x, 2) + B * diff(y_var, x) + C * y_var
eq = sp.Eq(differential_operator(y), F)
Lprint(eq)
print_styled(r"Next, we represent $y(x)$ as a linear combination of basis functions $\phi_j(x)$, resulting in an approximated function $y^N(x) \approx y(x)$:")

yn = Function('y^N')(x)
Lprint(Eq(yn, y_approx, evaluate=False))

print_styled("Now, we can define the residual $r^N(x)$ for the approximate solution $y^N(x)$ given by $A(y^N)-F=r^N$:")
A_op = differential_operator  # reuse the operator defined above
r_N = Symbol('r^N')
residual_def = Eq(r_N, A_op(y_approx) - F, evaluate=False)
Lprint(residual_def)
print_styled("The residual $r^N(x)$ is the error in the approximation. The goal of FEM is to make this residual 'small' in some sense.")
PI = sp.Symbol(r'\Pi', positive=True)
print_styled("Since $r^N$ is a function on $[0,L]$, we can measure its 'size' using:")
Lprint(Eq(PI, Integral(r_N**2, (x, 0, L)), evaluate=False))
print_styled("A possible approach is to minimize $\\Pi$ by taking the derivative with respect to the coefficients $a_j$ and setting it to zero.")
```

Let’s revisit the core idea from a different perspective.

Our original differential equation is \(A(y) = F\), where \(A(\cdot)\) denotes the full differential operator (an overloaded use of the symbol, distinct from the coefficient \(A\)).

\(\displaystyle A \frac{d^{2}}{d x^{2}} y{\left(x \right)} + B \frac{d}{d x} y{\left(x \right)} + C y{\left(x \right)} = F{\left(x \right)}\)

Next, we represent \(y(x)\) as a linear combination of basis functions \(\phi_j(x)\), resulting in an approximated function \(y^N(x) \approx y(x)\):

\(\displaystyle y^{N}{\left(x \right)} = \sum_{j=1}^{N} {a}_{j} \phi_{j}(x)\)

Now, we can define the residual \(r^N(x)\) for the approximate solution \(y^N(x)\) given by \(A(y^N)-F=r^N\):

\(\displaystyle r^{N} = A \sum_{j=1}^{N} {a}_{j} \frac{d^{2}}{d x^{2}} \phi_{j}(x) + B \sum_{j=1}^{N} {a}_{j} \frac{d}{d x} \phi_{j}(x) + C \sum_{j=1}^{N} {a}_{j} \phi_{j}(x) - F{\left(x \right)}\)

The residual \(r^N(x)\) is the error in the approximation. The goal of FEM is to make this residual ‘small’ in some sense.

Since \(r^N\) is a function on \([0,L]\), we can measure its ‘size’ using:

\(\displaystyle \Pi = \int\limits_{0}^{L} \left(r^{N}\right)^{2}\, dx\)

A possible approach is to minimize \(\Pi\) by taking the derivative with respect to the coefficients \(a_j\) and setting it to zero.
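A one-coefficient illustration of this least-squares idea, assuming $A=1$, $B=C=0$, $F=1$ on $[0,1]$ and the hypothetical trial function $y^N = a_1\,x(1-x)$:

```python
import sympy as sp

x, a1 = sp.symbols('x a1')

# One-term trial function satisfying y(0) = y(1) = 0 (a hypothetical choice)
yN = a1 * x * (1 - x)

# Residual of y'' = 1 (A=1, B=0, C=0, F=1) and its squared L2 norm
r = sp.diff(yN, x, 2) - 1            # r^N = -2*a1 - 1
Pi = sp.integrate(r**2, (x, 0, 1))   # Pi(a1) = (2*a1 + 1)**2

# Minimize: d(Pi)/d(a1) = 0
best = sp.solve(sp.diff(Pi, a1), a1)[0]
# best == -1/2, so yN = x*(x-1)/2 -- here the exact solution itself
```

In this special case the residual can be driven to zero exactly, so minimizing $\Pi$ recovers the exact solution; in general the minimum of $\Pi$ is positive and the result is only an approximation.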

3.8.1 The Galerkin Choice

Code
```{python}
#| code-fold: true
print_styled("In the Galerkin approach, we define weighting functions $w_i(x)$.")
w = Function('w')
weighted_residual_eq = Eq(Integral(r_N * w(i, x), (x, 0, L)), 0)
Lprint(weighted_residual_eq)
print_styled("The **Method of Weighted Residuals** requires the residual to be orthogonal to the weighting functions $w_i(x)$.")
print_styled("The **Galerkin Method** makes a specific choice: use the *same* functions for weighting as for the basis/shape functions.")
print_styled("This means we set $w_i(x) = \\phi_i(x)$:")
galerkin_eq_from_WR = weighted_residual_eq.subs({w(i,x): phi(i,x)})
Lprint(galerkin_eq_from_WR)
print_styled(r"This must hold for $i = 1, 2, ..., N$.")

print_styled("Substituting the definition of $r^N$ into the Galerkin equation gives:")
galerkin_expanded = galerkin_eq_from_WR.subs({r_N: A_op(y_approx) - F})
Lprint(galerkin_expanded)

print_styled(r"Applying integration by parts to the second-derivative term, with the boundary term vanishing because $\phi_i(0)=\phi_i(L)=0$, we obtain:")
print_styled(r"$\int_0^L \left( -A \frac{d\phi_i}{dx} \sum_{j=1}^N a_j \frac{d\phi_j}{dx} + B \phi_i \sum_{j=1}^N a_j \frac{d\phi_j}{dx} + C \phi_i \sum_{j=1}^N a_j \phi_j - \phi_i F \right) dx = 0$")

print_styled("This is the *exact same* $i$-th equation we derived earlier from the weak form.")
print_styled("Therefore, the Galerkin method provides a powerful theoretical justification: it controls the residual by forcing it to be orthogonal to the very basis functions used for the approximation.")
```

In the Galerkin approach, we define weighting functions \(w_i(x)\).

\(\displaystyle \int\limits_{0}^{L} r^{N} w{\left(i,x \right)}\, dx = 0\)

The Method of Weighted Residuals requires the residual to be orthogonal to the weighting functions \(w_i(x)\).

The Galerkin Method makes a specific choice: use the same functions for weighting as for the basis/shape functions.

This means we set \(w_i(x) = \phi_i(x)\):

\(\displaystyle \int\limits_{0}^{L} r^{N} \phi_{i}(x)\, dx = 0\)

This must hold for \(i = 1, 2, ..., N\).

Substituting the definition of \(r^N\) into the Galerkin equation gives:

\(\displaystyle \int\limits_{0}^{L} \left(A \sum_{j=1}^{N} {a}_{j} \frac{d^{2}}{d x^{2}} \phi_{j}(x) + B \sum_{j=1}^{N} {a}_{j} \frac{d}{d x} \phi_{j}(x) + C \sum_{j=1}^{N} {a}_{j} \phi_{j}(x) - F{\left(x \right)}\right) \phi_{i}(x)\, dx = 0\)

Applying integration by parts to the second-derivative term, with the boundary term vanishing because \(\phi_i(0)=\phi_i(L)=0\), we obtain:

\(\int_0^L \left( -A \frac{d\phi_i}{dx} \sum_{j=1}^N a_j \frac{d\phi_j}{dx} + B \phi_i \sum_{j=1}^N a_j \frac{d\phi_j}{dx} + C \phi_i \sum_{j=1}^N a_j \phi_j - \phi_i F \right) dx = 0\)

This is the exact same \(i\)-th equation we derived earlier from the weak form.

Therefore, the Galerkin method provides a powerful theoretical justification: it controls the residual by forcing it to be orthogonal to the very basis functions used for the approximation.
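To close the loop, the sketch below carries the Galerkin equations through to a number for the model problem $y'' = 1$ on $(0,1)$ with $y(0)=y(1)=0$, using the hypothetical sine basis $\phi_k(x) = \sin(k\pi x)$ (an illustration, not the shape functions used elsewhere in the book), for which $K$ is diagonal by orthogonality:

```python
import numpy as np

# Galerkin system for y'' = 1 with phi_k(x) = sin(k*pi*x) on (0,1):
#   K_kk = -int_0^1 (phi_k')^2 dx = -(k*pi)**2 / 2   (off-diagonals vanish)
#   F_k  =  int_0^1 1 * phi_k dx = (1 - cos(k*pi)) / (k*pi)
N = 9
k = np.arange(1, N + 1)
K_diag = -(k * np.pi)**2 / 2
F_vec = (1 - np.cos(k * np.pi)) / (k * np.pi)
a = F_vec / K_diag            # diagonal system: solve entrywise

def y_approx(x):
    return sum(a[m] * np.sin((m + 1) * np.pi * x) for m in range(N))

# Exact solution is y = x*(x - 1)/2, so y(0.5) = -0.125
err = abs(y_approx(0.5) - (-0.125))
```

Even-numbered modes drop out (\(F_k = 0\) for even \(k\)), and with only the first few odd modes the midpoint value already agrees with the exact \(-0.125\) to within about \(10^{-4}\).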