SC-03
Course 3

Spline Interpolation and Data Fitting

Learn piecewise polynomial interpolation to avoid Runge's phenomenon, construct smooth cubic splines, and apply least squares methods for data fitting.

Learning Objectives
  • Understand piecewise interpolation and its advantages
  • Construct cubic spline interpolating functions
  • Apply different boundary conditions for splines
  • Formulate and solve least squares fitting problems
  • Understand orthogonal polynomials and their properties
  • Compare interpolation versus fitting approaches

1. Piecewise Interpolation

High-degree polynomial interpolation can lead to oscillations (Runge's phenomenon). Piecewise interpolation uses low-degree polynomials on each subinterval.

Definition 3.1: Piecewise Polynomial Interpolation

Given a partition $a = x_0 < x_1 < \cdots < x_n = b$, a piecewise polynomial of degree $k$, denoted $S_k(x)$, is a function such that on each subinterval $[x_{i-1}, x_i]$, $S_k(x)$ is a polynomial of degree at most $k$.

Example: Piecewise Linear Interpolation

On each subinterval $[x_{i-1}, x_i]$:

$$S_1(x) = y_{i-1} \frac{x - x_i}{x_{i-1} - x_i} + y_i \frac{x - x_{i-1}}{x_i - x_{i-1}}$$
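As a quick check, NumPy's np.interp evaluates exactly this piecewise linear formula between the nodes; a minimal sketch using the classic Runge function as test data:

```python
import numpy as np

# Sample the Runge function at 11 equally spaced nodes and evaluate
# the piecewise linear interpolant S_1 between them.
f = lambda x: 1.0 / (1.0 + 25.0 * x**2)
nodes = np.linspace(-1.0, 1.0, 11)          # a = x_0 < ... < x_n = b
x = np.linspace(-1.0, 1.0, 1001)

s1 = np.interp(x, nodes, f(nodes))          # implements the S_1 formula above
print("max |f - S_1| =", np.max(np.abs(f(x) - s1)))
```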

Theorem 3.1: Piecewise Linear Error

If $f \in C^2[a, b]$, the error in piecewise linear interpolation satisfies:

$$|f(x) - S_1(x)| \leq \frac{h^2}{8} \max_{a \leq x \leq b} |f''(x)|$$

where $h = \max_i (x_i - x_{i-1})$.

Definition 3.2: Piecewise Cubic Hermite Interpolation

On each subinterval $[x_{i-1}, x_i]$, use a cubic polynomial matching $f(x_{i-1})$, $f(x_i)$, $f'(x_{i-1})$, $f'(x_i)$.

Error: $|f(x) - H(x)| \leq \frac{h^4}{384} \max |f^{(4)}(x)|$
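SciPy ships a ready-made class for this construction; a small sketch comparing the actual error against the $h^4/384$ bound for $f = \sin$ (where $\max |f^{(4)}| = 1$):

```python
import numpy as np
from scipy.interpolate import CubicHermiteSpline

# Match f and f' at each node, as in Definition 3.2.
nodes = np.linspace(0.0, np.pi, 5)
H = CubicHermiteSpline(nodes, np.sin(nodes), np.cos(nodes))

x = np.linspace(0.0, np.pi, 1001)
h = np.max(np.diff(nodes))
print("max error:", np.max(np.abs(np.sin(x) - H(x))))
print("bound h^4/384:", h**4 / 384)
```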

2. Cubic Spline Interpolation

Definition 3.3: Cubic Spline

A cubic spline $s(x)$ on $[a, b]$ with nodes $a = x_0 < x_1 < \cdots < x_n = b$ satisfies:

  1. $s(x_i) = y_i$ for $i = 0, 1, \ldots, n$ (interpolation)
  2. $s(x)$ is a cubic polynomial on each $[x_i, x_{i+1}]$
  3. $s(x) \in C^2[a, b]$ (continuous second derivative)

Definition 3.4: Boundary Conditions

Common boundary conditions for cubic splines:

  • Natural (Free): $s''(a) = s''(b) = 0$
  • Clamped: $s'(a) = f'(a), \; s'(b) = f'(b)$
  • Periodic: $s'(a) = s'(b), \; s''(a) = s''(b)$
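All three conditions are available in scipy.interpolate.CubicSpline via its bc_type argument; a minimal sketch (the clamped values $s'(a) = s'(b) = 1$ come from $f = \sin$ on $[0, 2\pi]$):

```python
import numpy as np
from scipy.interpolate import CubicSpline

x = np.linspace(0.0, 2.0 * np.pi, 9)
y = np.sin(x)
y[-1] = y[0]                                 # enforce exact periodicity

splines = {
    "natural":  CubicSpline(x, y, bc_type='natural'),             # s''(a) = s''(b) = 0
    "clamped":  CubicSpline(x, y, bc_type=((1, 1.0), (1, 1.0))),  # s'(a) = s'(b) = 1
    "periodic": CubicSpline(x, y, bc_type='periodic'),
}
t = np.linspace(0.0, 2.0 * np.pi, 1001)
for name, s in splines.items():
    print(name, "max error:", np.max(np.abs(np.sin(t) - s(t))))
```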

Theorem 3.2: Spline System

Let $h_i = x_{i+1} - x_i$ and $m_i = s'(x_i)$. The spline conditions give a tridiagonal system:

$$(1 - \alpha_i) m_{i-1} + 2m_i + \alpha_i m_{i+1} = \beta_i, \quad i = 1, \ldots, n-1$$

where $\alpha_i = \frac{h_{i-1}}{h_{i-1} + h_i}$ and $\beta_i$ depends on the function values.

Note:

The tridiagonal system can be solved efficiently in $O(n)$ time using the Thomas algorithm (tridiagonal matrix algorithm).
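A minimal sketch of the Thomas algorithm, assuming the three diagonals arrive as arrays `a` (sub), `b` (main), `c` (super) with `a[0]` and `c[-1]` unused; the helper name `thomas` is ours, not a library routine:

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system in O(n).
    a: sub-diagonal, b: main diagonal, c: super-diagonal, d: right-hand
    side, all of length n (a[0] and c[-1] are ignored)."""
    n = len(b)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                    # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom                 # cp[-1] is never used
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):           # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Quick check against a dense solver on a diagonally dominant system
# (the spline system above is diagonally dominant in the same way):
n = 6
a, b, c = np.ones(n), 4.0 * np.ones(n), np.ones(n)
d = np.arange(1.0, n + 1)
A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
print(np.allclose(thomas(a, b, c, d), np.linalg.solve(A, d)))  # True
```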

Remark:

Cubic splines are widely used in computer graphics, CAD systems, and data visualization because they provide smooth curves with minimal oscillation.

3. Least Squares Fitting

Unlike interpolation (which passes through all points), fitting finds a function that best approximates the data in some sense, allowing for measurement errors.

Definition 3.5: Least Squares Problem

Given data $(x_i, y_i)$ for $i = 1, \ldots, n$, find parameters $(a, b)$ for $\phi(x) = a + bx$ that minimize:

$$\sum_{i=1}^{n} \delta_i^2 = \sum_{i=1}^{n} (y_i - a - bx_i)^2$$

Theorem 3.3: Normal Equations

Setting partial derivatives to zero gives the normal equations. In matrix form with $X = (\mathbf{1}, \mathbf{x})$ and $\alpha = (a, b)^T$:

$$X^T X \alpha = X^T y$$

The solution is $\alpha = (X^T X)^{-1} X^T y$.

Example: Linear Regression

For data $(1, 2), (2, 3), (3, 5), (4, 4)$:

Design matrix: $X = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{pmatrix}$

Normal equations: $X^T X = \begin{pmatrix} 4 & 10 \\ 10 & 30 \end{pmatrix}$, $X^T y = \begin{pmatrix} 14 \\ 39 \end{pmatrix}$

Solution: $a = 1.5$, $b = 0.8$, so $y = 1.5 + 0.8x$
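The worked example is easy to reproduce numerically; a short check with NumPy (in practice np.linalg.lstsq, which factors $X$ directly, is preferred over forming $X^T X$):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.0, 5.0, 4.0])
X = np.column_stack([np.ones_like(x), x])   # design matrix (1, x)

alpha = np.linalg.solve(X.T @ X, X.T @ y)   # normal equations
print(alpha)                                 # [1.5 0.8]

alpha2, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(alpha, alpha2))            # True
```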

Definition 3.6: General Least Squares

For fitting $\phi(x) = \sum_{k=0}^{m} a_k P_k(x)$ where $P_k$ are basis functions:

$$\min_{a_0, \ldots, a_m} \sum_{i=1}^{n} w_i \left( y_i - \sum_{k=0}^{m} a_k P_k(x_i) \right)^2$$

where $w_i > 0$ are weights.
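A sketch of the weighted problem, assuming the basis values are tabulated in a matrix `P` with `P[i, k] = P_k(x_i)`; scaling each row by $\sqrt{w_i}$ reduces it to ordinary least squares (the helper name `weighted_lstsq` is ours):

```python
import numpy as np

def weighted_lstsq(P, y, w):
    """Minimize sum_i w_i * (y_i - sum_k a_k P_k(x_i))^2 for w_i > 0."""
    sw = np.sqrt(w)
    # Row-scaling by sqrt(w_i) turns the weighted problem into an
    # ordinary least squares problem solved by QR/SVD.
    a, *_ = np.linalg.lstsq(P * sw[:, None], y * sw, rcond=None)
    return a

# Example: weighted quadratic fit with the monomial basis P_k(x) = x^k.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.0, 4.2, 8.1, 16.5])
w = np.array([1.0, 1.0, 1.0, 1.0, 0.2])     # distrust the last point
P = np.vander(x, 3, increasing=True)         # columns: 1, x, x^2
print(weighted_lstsq(P, y, w))
```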

4. Best Approximation

Definition 3.7: Best Square Approximation

For continuous functions with inner product $(f, g) = \int_a^b \rho(x) f(x) g(x) \, dx$, find $p_n \in P_n$ minimizing:

$$\|f - p_n\| = \sqrt{(f - p_n, f - p_n)}$$

Remark:

The Gram matrix for polynomial approximation on $[0, 1]$ with $\rho(x) = 1$ is the Hilbert matrix:

$$H_{ij} = \frac{1}{i + j - 1}$$

This matrix is extremely ill-conditioned, making the problem numerically unstable.
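The growth is easy to observe; a short check using scipy.linalg.hilbert (the condition number grows roughly like $e^{3.5n}$):

```python
import numpy as np
from scipy.linalg import hilbert

for n in (4, 8, 12):
    print(n, np.linalg.cond(hilbert(n)))
# By n = 12 the condition number is near 1e16, so in double precision
# the monomial-basis normal equations lose essentially all accuracy.
```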

5. Orthogonal Polynomials

Using orthogonal polynomials as basis functions eliminates the ill-conditioning problem.

Definition 3.8: Orthogonal Polynomials

Polynomials $\{P_0, P_1, P_2, \ldots\}$ are orthogonal on $[a, b]$ with weight $w(x)$ if:

$$(P_m, P_n) = \int_a^b w(x) P_m(x) P_n(x) \, dx = 0 \quad \text{for } m \neq n$$

Legendre Polynomials

Orthogonal on $[-1, 1]$ with $w(x) = 1$:

$$P_0 = 1, \; P_1 = x, \; P_2 = \frac{1}{2}(3x^2 - 1), \; P_3 = \frac{1}{2}(5x^3 - 3x)$$

Recurrence: $(n+1)P_{n+1} = (2n+1)xP_n - nP_{n-1}$
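A minimal evaluation routine built directly on this recurrence (the helper name `legendre` is ours):

```python
import numpy as np

def legendre(n, x):
    """Evaluate P_n(x) via (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = np.ones_like(x), np.asarray(x, dtype=float)
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

x = np.linspace(-1.0, 1.0, 5)
print(np.allclose(legendre(3, x), 0.5 * (5 * x**3 - 3 * x)))  # True
```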

Chebyshev Polynomials

Orthogonal on $[-1, 1]$ with $w(x) = 1/\sqrt{1 - x^2}$:

$$T_n(x) = \cos(n \arccos x)$$

Recurrence: $T_{n+1} = 2xT_n - T_{n-1}$
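The closed form and the recurrence agree on $[-1, 1]$, which makes a quick sanity check possible:

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 7)
t_prev, t = np.ones_like(x), x.copy()        # T_0, T_1
for n in range(1, 5):
    t_prev, t = t, 2 * x * t - t_prev        # T_{n+1} = 2x T_n - T_{n-1}
print(np.allclose(t, np.cos(5 * np.arccos(x))))  # True
```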

Laguerre Polynomials

Orthogonal on $[0, \infty)$ with $w(x) = e^{-x}$:

$$L_n(x) = e^x \frac{d^n}{dx^n}(x^n e^{-x})$$

Theorem 3.4: Three-Term Recurrence

Any sequence of orthogonal polynomials (taken in monic form) satisfies a three-term recurrence relation:

$$P_{k+1}(x) = (x - \alpha_k) P_k(x) - \beta_{k-1} P_{k-1}(x)$$

where $\alpha_k = \frac{(xP_k, P_k)}{(P_k, P_k)}$ and $\beta_k = \frac{(P_{k+1}, P_{k+1})}{(P_k, P_k)}$.
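A sketch of the Stieltjes procedure implied by this theorem: compute $\alpha_k$ and $\beta_k$ from the inner products (approximated here with Gauss-Legendre quadrature for $w(x) = 1$ on $[-1, 1]$) and generate the monic polynomials:

```python
import numpy as np

# Discrete inner product (f, g) = sum_j w_j f(x_j) g(x_j) approximating
# the integral on [-1, 1] with weight w(x) = 1.
nodes, weights = np.polynomial.legendre.leggauss(50)
inner = lambda f, g: np.sum(weights * f * g)

p_prev = np.zeros_like(nodes)                # P_{-1} = 0
p = np.ones_like(nodes)                      # P_0 = 1
for k in range(3):
    alpha = inner(nodes * p, p) / inner(p, p)
    beta = inner(p, p) / inner(p_prev, p_prev) if k > 0 else 0.0
    p_prev, p = p, (nodes - alpha) * p - beta * p_prev

# P_3 should be the monic Legendre polynomial x^3 - (3/5) x:
print(np.allclose(p, nodes**3 - 0.6 * nodes))  # True
```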

Practice Quiz

Spline and Fitting Quiz

  1. (Easy) A cubic spline $s(x)$ on $[a, b]$ with $n+1$ nodes must satisfy $s(x) \in$:
  2. (Medium) How many conditions are needed to uniquely determine a cubic spline with $n+1$ nodes?
  3. (Easy) What is the main advantage of piecewise linear interpolation over high-degree polynomial interpolation?
  4. (Easy) In least squares fitting, we minimize:
  5. (Medium) The normal equations for linear least squares fitting $y = a + bx$ can be written as:
  6. (Medium) Natural cubic spline boundary conditions specify:
  7. (Medium) Legendre polynomials $P_n(x)$ are orthogonal on $[-1, 1]$ with weight function:
  8. (Hard) The error in piecewise linear interpolation is $O(h^k)$ where $k$ equals:
  9. (Hard) The Hilbert matrix $H_{ij} = 1/(i+j-1)$ in best approximation is problematic because:
  10. (Hard) Chebyshev polynomials $T_n(x)$ satisfy the recurrence $T_{n+1}(x) =$ :

Frequently Asked Questions

What's the difference between interpolation and fitting?

Interpolation passes exactly through all data points, while fitting finds the best approximation, which may not pass through any point.

Use interpolation when data is exact; use fitting when data contains measurement errors.

Why are cubic splines preferred over higher-degree splines?

Cubic splines provide a good balance: they are smooth enough ($C^2$ continuity) for most applications while avoiding the oscillations of higher-degree polynomials. Among all $C^2$ interpolants, the natural cubic spline also minimizes the total curvature $\int_a^b s''(x)^2 \, dx$.

When should I use clamped vs natural boundary conditions?

Clamped (specified derivatives): When you know the slopes at endpoints, such as from physical constraints or additional measurements.

Natural (zero second derivatives): When no endpoint information is available. The spline behaves like a physical spline that relaxes at the ends.

Why use orthogonal polynomials for fitting?

Orthogonal polynomials diagonalize the Gram matrix, eliminating the ill-conditioning problem. Each coefficient can be computed independently, and adding higher-degree terms doesn't change previously computed coefficients.

How do I choose between different orthogonal polynomial families?

Choose based on the domain and weight function that matches your problem:

  • Legendre: Uniform weight on finite interval
  • Chebyshev: Minimizes maximum error, good for uniform approximation
  • Laguerre: Semi-infinite domain with exponential decay
  • Hermite: Entire real line with Gaussian weight