
Direct Methods for Linear Systems

Learn to solve linear systems $Ax = b$ using direct methods. From Gaussian elimination to LU decomposition, understand the algorithms, their stability, and when to use them.

Learning Objectives
  • Implement Gaussian elimination with back substitution
  • Understand and apply pivoting strategies
  • Compute LU decomposition and use it for multiple right-hand sides
  • Apply Cholesky factorization for symmetric positive definite matrices
  • Analyze condition numbers and their effect on solution accuracy
  • Exploit special matrix structures for efficiency

1. Gaussian Elimination

Gaussian elimination transforms a linear system into upper triangular form, which is then solved by back substitution.

Algorithm: Gaussian Elimination

Input: Matrix $A$, vector $b$

Forward Elimination: For $k = 1, \ldots, n-1$:

  1. For $i = k+1, \ldots, n$: compute multiplier $m_{ik} = a_{ik}/a_{kk}$
  2. Update row $i$: $a_{ij} \leftarrow a_{ij} - m_{ik} a_{kj}$ for $j = k, \ldots, n$
  3. Update $b_i \leftarrow b_i - m_{ik} b_k$

Back Substitution: For $k = n, n-1, \ldots, 1$:

$$x_k = \frac{1}{a_{kk}} \left( b_k - \sum_{j=k+1}^{n} a_{kj} x_j \right)$$

Theorem 6.1: Operation Count

Gaussian elimination requires:

  • Forward elimination: $\frac{2n^3}{3} + O(n^2)$ operations
  • Back substitution: $n^2$ operations

Example: Gaussian Elimination

Solve the system:

$$\begin{pmatrix} 2 & 1 & -1 \\ -3 & -1 & 2 \\ -2 & 1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 8 \\ -11 \\ -3 \end{pmatrix}$$

After forward elimination, back substitution gives $x_3 = -1$, $x_2 = 3$, $x_1 = 2$.
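The two phases above map almost line for line onto code. Below is a minimal NumPy sketch (no pivoting, so every pivot $a_{kk}$ is assumed nonzero), run on this example system:

```python
import numpy as np

def gauss_solve(A, b):
    """Gaussian elimination without pivoting, followed by back substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    # Forward elimination: for each pivot column k, zero out the entries below a[k, k]
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]        # multiplier m_ik
            A[i, k:] -= m * A[k, k:]     # update row i
            b[i] -= m * b[k]             # update the right-hand side
    # Back substitution
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        x[k] = (b[k] - A[k, k + 1:] @ x[k + 1:]) / A[k, k]
    return x

A = np.array([[2.0, 1, -1], [-3, -1, 2], [-2, 1, 2]])
b = np.array([8.0, -11, -3])
print(gauss_solve(A, b))   # [ 2.  3. -1.]
```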

2. Pivoting Strategies

When the pivot $a_{kk}$ is zero or very small, we need to reorder rows (and possibly columns) to maintain numerical stability.

Definition 6.1: Partial Pivoting

At step $k$, find the row $p \geq k$ with $|a_{pk}| = \max_{i \geq k} |a_{ik}|$, then swap rows $k$ and $p$.

This ensures $|m_{ik}| \leq 1$ for all multipliers.
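A sketch of how partial pivoting slots into the elimination loop from Section 1; the function name `gauss_solve_pp` and its structure are ours, not prescribed by the text:

```python
import numpy as np

def gauss_solve_pp(A, b):
    """Gaussian elimination with partial pivoting, followed by back substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        # Partial pivoting: bring the largest |a_ik| with i >= k into the pivot position
        p = k + np.argmax(np.abs(A[k:, k]))
        if p != k:
            A[[k, p]] = A[[p, k]]        # swap rows k and p of A
            b[[k, p]] = b[[p, k]]        # apply the same swap to b
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]        # |m_ik| <= 1 by construction
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        x[k] = (b[k] - A[k, k + 1:] @ x[k + 1:]) / A[k, k]
    return x
```

With the swap in place, systems whose first pivot is zero (for example $a_{11} = 0$) no longer break the elimination.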

Definition 6.2: Complete Pivoting

Search both rows and columns: find $\max_{i,j \geq k} |a_{ij}|$ and swap both rows and columns. More stable but rarely needed in practice.

Remark:

Scaled partial pivoting: considers relative element sizes by comparing $|a_{ik}|/s_i$, where $s_i = \max_j |a_{ij}|$ is the row scale factor.

Note:

Partial pivoting is almost always sufficient and is the default in production software. The growth factor (maximum element during elimination divided by maximum in original matrix) is typically small with partial pivoting.

3. LU Decomposition

Definition 6.3: LU Factorization

Express $A$ as $A = LU$ where:

  • $L$ is lower triangular (with 1s on the diagonal for Doolittle)
  • $U$ is upper triangular

Theorem 6.2: Existence of LU

If all leading principal minors of $A$ are nonzero, then $A$ has a unique LU factorization with unit diagonal on $L$.

Algorithm: Solving with LU

To solve $Ax = b$ given $A = LU$:

  1. Forward substitution: Solve $Ly = b$ for $y$
  2. Back substitution: Solve $Ux = y$ for $x$

Each solve costs $O(n^2)$, which makes LU efficient for multiple right-hand sides.
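A minimal sketch of the two triangular solves, assuming $L$ and $U$ are already available as NumPy arrays (the helper name `lu_solve_triangular` is ours):

```python
import numpy as np

def lu_solve_triangular(L, U, b):
    """Solve LUx = b: forward substitution for Ly = b, then back substitution for Ux = y."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                      # forward substitution, O(n^2)
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):          # back substitution, O(n^2)
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x
```

Once $A = LU$ has been computed, calling this for each new right-hand side reuses the $O(n^3)$ factorization.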

Definition 6.4: LU with Pivoting

With partial pivoting, we get $PA = LU$ where $P$ is a permutation matrix.

To solve: $Ax = b \Rightarrow PAx = Pb \Rightarrow LUx = Pb$
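In practice, libraries package the factor-once, solve-many pattern with pivoting built in. A sketch using SciPy (assuming it is installed): `lu_factor` computes $PA = LU$ with partial pivoting, and `lu_solve` applies the permutation and both triangular solves.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 1, -1], [-3, -1, 2], [-2, 1, 2]])
B = np.array([[8.0, 0], [-11, 1], [-3, 0]])   # two right-hand sides, one per column

lu, piv = lu_factor(A)         # O(n^3) factorization with partial pivoting
X = lu_solve((lu, piv), B)     # each column solved in O(n^2)
print(X[:, 0])                 # [ 2.  3. -1.], the system from Section 1
```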

Example: LU Decomposition

$$A = \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix} = LU$$
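A compact Doolittle sketch (no pivoting, so the leading principal minors are assumed nonzero as in Theorem 6.2) reproduces this factorization:

```python
import numpy as np

def lu_doolittle(A):
    """Doolittle LU factorization without pivoting: A = LU with unit diagonal on L."""
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(n):
        U[k, k:] = A[k, k:] - L[k, :k] @ U[:k, k:]                          # row k of U
        L[k + 1:, k] = (A[k + 1:, k] - L[k + 1:, :k] @ U[:k, k]) / U[k, k]  # column k of L
    return L, U

L, U = lu_doolittle(np.array([[2.0, 1], [4, 3]]))
print(L)   # [[1. 0.], [2. 1.]]
print(U)   # [[2. 1.], [0. 1.]]
```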

4. Special Matrix Structures

Definition 6.5: Cholesky Factorization

For symmetric positive definite $A$: $A = LL^T$, where $L$ is lower triangular with positive diagonal entries.

$$l_{jj} = \sqrt{a_{jj} - \sum_{k=1}^{j-1} l_{jk}^2}, \quad l_{ij} = \frac{1}{l_{jj}}\left(a_{ij} - \sum_{k=1}^{j-1} l_{ik}l_{jk}\right) \quad (i > j)$$

Remark:

Cholesky factorization requires roughly $\frac{n^3}{3}$ operations (about half the $\frac{2n^3}{3}$ cost of LU) and is numerically stable without pivoting for SPD matrices.
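A column-by-column sketch of the recurrences above, checked against NumPy's built-in `np.linalg.cholesky` on a small SPD matrix chosen here for illustration:

```python
import numpy as np

def cholesky_lower(A):
    """Cholesky factorization A = L L^T for symmetric positive definite A."""
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for j in range(n):
        L[j, j] = np.sqrt(A[j, j] - L[j, :j] @ L[j, :j])                    # diagonal entry l_jj
        L[j + 1:, j] = (A[j + 1:, j] - L[j + 1:, :j] @ L[j, :j]) / L[j, j]  # entries below the diagonal
    return L

A = np.array([[4.0, 2, 2], [2, 5, 3], [2, 3, 6]])    # small SPD test matrix (our example)
L = cholesky_lower(A)
print(np.allclose(L @ L.T, A))                        # True
print(np.allclose(L, np.linalg.cholesky(A)))          # True: matches NumPy's factor
```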

Definition 6.6: Tridiagonal Systems

For a tridiagonal matrix $A$ with sub-, main, and super-diagonal entries $a_i, b_i, c_i$, the Thomas algorithm solves the system in $O(n)$ operations:

$$\begin{pmatrix} b_1 & c_1 & & \\ a_2 & b_2 & c_2 & \\ & \ddots & \ddots & \ddots \\ & & a_n & b_n \end{pmatrix}$$

Algorithm: Thomas Algorithm

Forward sweep (with right-hand side $d$): For $i = 2, \ldots, n$:

$$w = a_i / b_{i-1}, \quad b_i \leftarrow b_i - w c_{i-1}, \quad d_i \leftarrow d_i - w d_{i-1}$$

Back substitution: $x_n = d_n/b_n$, then for $i = n-1, \ldots, 1$:

$$x_i = (d_i - c_i x_{i+1}) / b_i$$
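A sketch of the algorithm with the three diagonals and the right-hand side passed as vectors; the indexing convention (with `a[0]` and `c[n-1]` unused) is ours:

```python
import numpy as np

def thomas(a, b, c, d):
    """Thomas algorithm: a = sub-diagonal, b = diagonal, c = super-diagonal, d = right-hand side."""
    n = len(b)
    b, d = b.astype(float).copy(), d.astype(float).copy()
    for i in range(1, n):                    # forward sweep, O(n)
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    x = np.zeros(n)
    x[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):           # back substitution
        x[i] = (d[i] - c[i] * x[i + 1]) / b[i]
    return x

n = 5
a = np.full(n, -1.0); b = np.full(n, 2.0); c = np.full(n, -1.0); d = np.ones(n)
A = np.diag(np.full(n, 2.0)) + np.diag(np.full(n - 1, -1.0), -1) + np.diag(np.full(n - 1, -1.0), 1)
print(np.allclose(A @ thomas(a, b, c, d), d))   # True
```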

5. Error Analysis and Condition Numbers

Definition 6.7: Matrix Norms

Common matrix norms:

  • $\|A\|_1 = \max_j \sum_i |a_{ij}|$ (max column sum)
  • $\|A\|_\infty = \max_i \sum_j |a_{ij}|$ (max row sum)
  • $\|A\|_2 = \sqrt{\rho(A^T A)}$ (spectral norm)
  • $\|A\|_F = \sqrt{\sum_{i,j} |a_{ij}|^2}$ (Frobenius norm)
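All four norms are available through `np.linalg.norm`; a quick check on a small example matrix:

```python
import numpy as np

A = np.array([[1.0, -2], [3, 4]])
print(np.linalg.norm(A, 1))        # max column sum: 6.0
print(np.linalg.norm(A, np.inf))   # max row sum: 7.0
print(np.linalg.norm(A, 2))        # spectral norm (largest singular value)
print(np.linalg.norm(A, 'fro'))    # Frobenius norm: sqrt(30)
```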

Definition 6.8: Condition Number

$$\kappa(A) = \|A\| \cdot \|A^{-1}\|$$

For the 2-norm: $\kappa_2(A) = \sigma_{\max} / \sigma_{\min}$ (ratio of largest to smallest singular value).
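A quick check of the singular-value characterization, on a nearly singular matrix chosen here for illustration:

```python
import numpy as np

A = np.array([[1.0, 2], [2, 4.001]])      # nearly singular: det = 0.001
print(np.linalg.cond(A))                   # kappa_2(A), roughly 2.5e4
s = np.linalg.svd(A, compute_uv=False)
print(s[0] / s[-1])                        # sigma_max / sigma_min, same value
print(np.linalg.cond(A, np.inf))           # condition number in the infinity norm
```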

Theorem 6.3: Error Bound

If $Ax = b$ and $(A + \delta A)(x + \delta x) = b + \delta b$, then, to first order in the perturbations:

$$\frac{\|\delta x\|}{\|x\|} \leq \kappa(A) \left( \frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|} \right)$$
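A small numerical illustration of the bound in the special case $\delta A = 0$ (the matrix and the perturbation size are our choices):

```python
import numpy as np

A = np.array([[1.0, 2], [2, 4.001]])          # same ill-conditioned matrix as above
x = np.ones(2)
b = A @ x

rng = np.random.default_rng(0)
db = 1e-8 * rng.standard_normal(2)             # perturb only b (delta A = 0)
x_pert = np.linalg.solve(A, b + db)

rel_err = np.linalg.norm(x_pert - x) / np.linalg.norm(x)
bound = np.linalg.cond(A) * np.linalg.norm(db) / np.linalg.norm(b)
print(rel_err <= bound, rel_err, bound)        # the observed error stays within the bound
```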

Example: Ill-Conditioned System

The Hilbert matrix $H_{ij} = 1/(i+j-1)$ is extremely ill-conditioned:

$\kappa(H_5) \approx 4.8 \times 10^5$

$\kappa(H_{10}) \approx 1.6 \times 10^{13}$

Solving systems with such matrices loses many digits of accuracy.
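This is easy to reproduce. The sketch below (using SciPy's `hilbert`; otherwise build $H_{ij} = 1/(i+j-1)$ directly) solves $Hx = H\mathbf{1}$, whose exact solution is the vector of ones, and measures the error:

```python
import numpy as np
from scipy.linalg import hilbert   # assumes SciPy is available

for n in (5, 10):
    H = hilbert(n)
    x_true = np.ones(n)
    x_num = np.linalg.solve(H, H @ x_true)
    err = np.linalg.norm(x_num - x_true)
    print(n, np.linalg.cond(H), err)   # the error grows with the condition number
```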

Remark:

Rule of thumb: If $\kappa(A) \approx 10^k$ and you work in $d$-digit precision, expect about $d - k$ correct digits in the solution.

Practice Quiz

Direct Linear Systems Quiz (10 questions)

  1. Gaussian elimination without pivoting transforms $Ax = b$ into: (Easy)
  2. The computational complexity of Gaussian elimination for an $n \times n$ system is: (Easy)
  3. LU decomposition factors $A$ as: (Easy)
  4. Partial pivoting in Gaussian elimination selects the pivot by: (Medium)
  5. For a symmetric positive definite matrix, the Cholesky factorization gives: (Medium)
  6. The condition number $\kappa(A)$ measures: (Medium)
  7. A tridiagonal system can be solved in: (Medium)
  8. If $\kappa(A) = 10^{10}$ and we use double precision, we can expect to lose approximately: (Hard)
  9. The Doolittle LU decomposition sets: (Hard)
  10. For the $p$-norm, which inequality relates the condition number to relative errors? (Hard)

Frequently Asked Questions

When should I use LU vs. Gaussian elimination?

Single right-hand side: Both are equivalent in cost.

Multiple right-hand sides: Use LU. Factor once ($O(n^3)$), then solve each system in $O(n^2)$.

Why is pivoting necessary?

Without pivoting, a zero or near-zero pivot causes division by zero or severe error amplification.

Partial pivoting keeps multipliers bounded ($|m_{ik}| \leq 1$), limiting error growth. It's essential for numerical stability.

When should I use Cholesky vs. LU?

Cholesky: When $A$ is symmetric positive definite. It's twice as fast, more stable, and needs no pivoting.

LU: For general (non-symmetric or indefinite) matrices.

How do I know if my matrix is ill-conditioned?

Compute $\kappa(A)$ using your linear algebra library. If $\kappa \approx 1/\epsilon_{\text{mach}}$ (about $10^{16}$ for double precision), the matrix is effectively singular. For $\kappa > 10^{10}$, be cautious about the accuracy of results.

What if the matrix is sparse?

Use sparse matrix data structures and algorithms (sparse LU with fill-reducing orderings). Libraries like UMFPACK, SuperLU, or SuiteSparse are designed for this. Direct methods may still produce significant fill-in; iterative methods (next chapter) may be better.
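A minimal sparse sketch using SciPy's SuperLU interface (assuming SciPy is installed), factoring a large tridiagonal matrix stored in compressed sparse column format:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import splu   # sparse LU via SuperLU

n = 100_000
A = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

lu = splu(A)                    # sparse LU with a fill-reducing column ordering
x = lu.solve(b)
print(np.allclose(A @ x, b))    # True
```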