LA-7.1: Inner Product Definition

Inner products generalize the dot product to abstract vector spaces, providing a way to measure lengths, angles, and orthogonality. They form the foundation for the geometric structure of Hilbert spaces and underpin applications from quantum mechanics to signal processing.

This module introduces the axiomatic definition of inner products, explores key examples on various spaces, and establishes fundamental inequalities that govern the geometry of inner product spaces.

Learning Objectives
  • Define inner products on real and complex vector spaces
  • Understand the three axioms: linearity, conjugate symmetry, positive definiteness
  • Compute inner products on standard spaces like ℝⁿ, ℂⁿ, and function spaces
  • Derive the induced norm from an inner product
  • Prove and apply the Cauchy-Schwarz inequality
  • Understand the triangle inequality and parallelogram law
  • Recognize inner product spaces as special normed spaces
  • Apply inner products to define angles between vectors
Prerequisites
  • Vector spaces and subspaces (LA-2.1-2.3)
  • Linear independence and bases (LA-2.4-2.5)
  • Complex numbers and conjugation (LA-1.2)
  • Basic properties of norms and absolute values

1. Definition of Inner Product

An inner product equips a vector space with geometric structure—the ability to measure lengths and angles. We begin with the abstract definition, then explore concrete examples.

Definition 7.1: Inner Product (Real Vector Space)

Let $V$ be a vector space over $\mathbb{R}$. An inner product on $V$ is a function $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$ satisfying:

  1. Linearity in first argument: $\langle \alpha x + \beta y, z \rangle = \alpha\langle x, z \rangle + \beta\langle y, z \rangle$
  2. Symmetry: $\langle x, y \rangle = \langle y, x \rangle$
  3. Positive definiteness: $\langle x, x \rangle \geq 0$, with equality iff $x = 0$
Remark 7.1: Bilinearity

For real inner products, symmetry plus linearity in the first argument implies linearity in the second argument:

$$\langle x, \alpha y + \beta z \rangle = \langle \alpha y + \beta z, x \rangle = \alpha\langle y, x \rangle + \beta\langle z, x \rangle = \alpha\langle x, y \rangle + \beta\langle x, z \rangle$$

Thus real inner products are bilinear (linear in both arguments).

Definition 7.2: Inner Product (Complex Vector Space)

Let $V$ be a vector space over $\mathbb{C}$. An inner product on $V$ is a function $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{C}$ satisfying:

  1. Linearity in first argument: $\langle \alpha x + \beta y, z \rangle = \alpha\langle x, z \rangle + \beta\langle y, z \rangle$
  2. Conjugate symmetry: $\langle x, y \rangle = \overline{\langle y, x \rangle}$
  3. Positive definiteness: $\langle x, x \rangle \geq 0$ (a real number), with equality iff $x = 0$
Remark 7.2: Sesquilinearity

Complex inner products are sesquilinear (conjugate-linear in the second argument):

$$\langle x, \alpha y \rangle = \overline{\langle \alpha y, x \rangle} = \overline{\alpha}\,\overline{\langle y, x \rangle} = \bar{\alpha}\langle x, y \rangle$$

Note: Some texts use the opposite convention (linear in the second argument, conjugate-linear in the first); that convention is common in physics. We take linearity in the first argument, as is standard in most linear algebra texts.

Remark 7.3: Why Conjugate Symmetry?

Conjugate symmetry ensures $\langle x, x \rangle$ is always real:

$$\langle x, x \rangle = \overline{\langle x, x \rangle} \implies \langle x, x \rangle \in \mathbb{R}$$

Without this, positive definiteness wouldn't make sense for complex spaces.

Definition 7.3: Inner Product Space

A vector space $V$ equipped with an inner product $\langle \cdot, \cdot \rangle$ is called an inner product space (or pre-Hilbert space).

Definition 7.3a: Hilbert Space

A Hilbert space is a complete inner product space—one where every Cauchy sequence converges. Finite-dimensional inner product spaces are automatically complete (and hence Hilbert spaces).

Remark 7.3a: Equivalent Formulations

The inner product axioms can be stated in several equivalent ways:

  • Alternative 1: Linearity in the second argument, conjugate-linearity in the first (the convention common in physics)
  • Alternative 2: Bilinear + symmetric + positive definite (real case only)
  • Alternative 3: Sesquilinear + Hermitian + positive definite

These are all equivalent up to which argument is linear vs. conjugate-linear.

Theorem 7.0: Inner Product from Positive Definite Matrix

For any positive definite Hermitian matrix $A$, the function $\langle x, y \rangle_A = y^H A x$ defines an inner product on $\mathbb{C}^n$.

Proof:

Linearity: $\langle \alpha x + \beta y, z \rangle_A = z^H A (\alpha x + \beta y) = \alpha\, z^H A x + \beta\, z^H A y = \alpha\langle x, z \rangle_A + \beta\langle y, z \rangle_A$

Conjugate symmetry: $\langle y, x \rangle_A = x^H A y = (y^H A^H x)^* = (y^H A x)^* = \overline{\langle x, y \rangle_A}$

Positive definiteness: Since $A$ is positive definite, $\langle x, x \rangle_A = x^H A x > 0$ for all $x \neq 0$.
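As a quick numerical sanity check (a minimal NumPy sketch, not part of the original module), the snippet below builds a random Hermitian positive definite matrix and spot-checks the three axioms for $\langle x, y \rangle_A = y^H A x$ on sample vectors; the helper name `ip` is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a Hermitian positive definite matrix A = M^H M + I.
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = M.conj().T @ M + np.eye(3)

def ip(x, y):
    """Inner product <x, y>_A = y^H A x (linear in the first argument)."""
    return y.conj().T @ A @ x

x, y, z = (rng.normal(size=3) + 1j * rng.normal(size=3) for _ in range(3))
a, b = 2 - 1j, 0.5 + 3j

# Linearity in the first argument
assert np.isclose(ip(a * x + b * y, z), a * ip(x, z) + b * ip(y, z))
# Conjugate symmetry
assert np.isclose(ip(x, y), np.conj(ip(y, x)))
# Positive definiteness: <x, x>_A is real and positive
assert np.isclose(ip(x, x).imag, 0) and ip(x, x).real > 0
print("all three axioms hold on this sample")
```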

Example 7.0a: Verifying Inner Product Axioms

Show that $\langle f, g \rangle = \int_0^1 f(x)g(x) \, dx$ is an inner product on $C[0,1]$.

Linearity: $\langle \alpha f + \beta g, h \rangle = \int_0^1 (\alpha f + \beta g)h = \alpha \int_0^1 fh + \beta \int_0^1 gh$

Symmetry: $\langle g, f \rangle = \int_0^1 gf = \int_0^1 fg = \langle f, g \rangle$

Positive definiteness: $\langle f, f \rangle = \int_0^1 f^2 \geq 0$, with equality iff $f = 0$ (for continuous $f$).

Remark 7.3b: Semi-Inner Products

If we relax positive definiteness to $\langle x, x \rangle \geq 0$ (allowing $\langle x, x \rangle = 0$ for some $x \neq 0$), we get a semi-inner product or pseudo-inner product. This induces a seminorm rather than a norm.

Remark 7.3c: Historical Note

The concept of inner product was developed in the early 20th century. Key contributors include:

  • David Hilbert (1862-1943): Formalized infinite-dimensional spaces (Hilbert spaces)
  • John von Neumann (1903-1957): Axiomatized quantum mechanics using Hilbert spaces
  • Frigyes Riesz (1880-1956): Proved the Riesz representation theorem
Example 7.0b: Non-Example: Not an Inner Product

Consider $\langle x, y \rangle = x_1 y_1 - x_2 y_2$ on $\mathbb{R}^2$.

For $x = (1, 1)^T$: $\langle x, x \rangle = 1 - 1 = 0$ but $x \neq 0$.

This violates positive definiteness, so it's NOT an inner product. (It's a symmetric bilinear form, but indefinite.)

Theorem 7.0a: Inner Product from Norm (Polarization)

If a norm $\|\cdot\|$ satisfies the parallelogram law, then it comes from an inner product defined by the polarization identity:

$$\langle x, y \rangle = \frac{1}{4}\left(\|x + y\|^2 - \|x - y\|^2\right)$$

for real spaces. For complex spaces, use the full polarization formula.

Remark 7.3d: Why Inner Products Matter

Inner products provide:

  • Geometry: Length, angle, orthogonality—making algebra geometric
  • Optimization: Projections give best approximations
  • Analysis: Completeness leads to Hilbert spaces and functional analysis
  • Physics: Quantum mechanics is built on complex Hilbert spaces
  • Applications: Signal processing, machine learning, statistics

2. Standard Examples

Example 7.1: Euclidean Inner Product on ℝⁿ

The standard (Euclidean, dot) inner product on $\mathbb{R}^n$:

$$\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n = x^T y$$

For $x = (1, 2, 3)^T$ and $y = (4, -1, 2)^T$:

$$\langle x, y \rangle = 1(4) + 2(-1) + 3(2) = 4 - 2 + 6 = 8$$
Example 7.2: Standard Inner Product on ℂⁿ

The standard inner product on $\mathbb{C}^n$:

$$\langle x, y \rangle = \sum_{i=1}^{n} x_i \overline{y_i} = y^H x$$

where $y^H = \bar{y}^T$ is the conjugate transpose (Hermitian transpose).

For $x = (1+i, 2)^T$ and $y = (1, i)^T$:

$$\langle x, y \rangle = (1+i)(1) + 2(-i) = 1 + i - 2i = 1 - i$$
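The conjugation is easy to get wrong in code. A small NumPy illustration (assuming the convention above; note that `np.vdot` conjugates its first argument, so the argument order matters):

```python
import numpy as np

x = np.array([1 + 1j, 2])
y = np.array([1, 1j])

# <x, y> = sum_i x_i * conj(y_i)  (linear in the first argument)
ip = np.sum(x * np.conj(y))
print(ip)  # (1-1j), matching the worked example

# Caution: np.vdot(a, b) = sum conj(a_i) * b_i, so under our convention
# <x, y> corresponds to np.vdot(y, x), not np.vdot(x, y).
assert np.isclose(ip, np.vdot(y, x))
```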
Example 7.3: Weighted Inner Product

For positive weights $w_1, \ldots, w_n > 0$, define on $\mathbb{R}^n$:

$$\langle x, y \rangle_w = \sum_{i=1}^{n} w_i x_i y_i$$

This is useful when different coordinates have different importance or units.

Example 7.4: L² Inner Product on Function Spaces

On continuous functions $C[a,b]$:

$$\langle f, g \rangle = \int_a^b f(t)g(t)\, dt$$

For $f(t) = t$ and $g(t) = t^2$ on $[0,1]$:

$$\langle f, g \rangle = \int_0^1 t \cdot t^2\, dt = \int_0^1 t^3\, dt = \frac{1}{4}$$
Example 7.5: Inner Product on Polynomials

On $P_n(\mathbb{R})$ (polynomials of degree $\leq n$):

$$\langle p, q \rangle = \int_{-1}^{1} p(x)q(x)\, dx$$

Legendre polynomials are orthogonal with respect to this inner product.

Example 7.6: Frobenius Inner Product on Matrices

On $M_{m \times n}(\mathbb{R})$:

$$\langle A, B \rangle = \operatorname{tr}(A^T B) = \sum_{i,j} a_{ij} b_{ij}$$

This treats a matrix as a vector of $mn$ entries.

Example 7.7: Matrix Inner Product (General)

For any positive definite matrix $A \in M_n(\mathbb{R})$:

$$\langle x, y \rangle_A = x^T A y$$

This defines a valid inner product. When $A = I$, we recover the standard inner product.

Example 7.7a: Sequence Space ℓ²

The space $\ell^2$ of square-summable sequences $(x_1, x_2, \ldots)$:

$$\langle x, y \rangle = \sum_{n=1}^{\infty} x_n \overline{y_n}$$

This is an infinite-dimensional Hilbert space, fundamental in functional analysis.

Example 7.7b: Weighted L² Inner Product

For a positive weight function $w(t) > 0$:

$$\langle f, g \rangle_w = \int_a^b f(t)g(t)w(t)\, dt$$

Different weights give different orthogonal polynomial families:

  • $w(t) = 1$ on $[-1,1]$: Legendre polynomials
  • $w(t) = e^{-t}$ on $[0,\infty)$: Laguerre polynomials
  • $w(t) = e^{-t^2}$ on $(-\infty, \infty)$: Hermite polynomials
  • $w(t) = (1-t^2)^{-1/2}$ on $(-1,1)$: Chebyshev polynomials
Example 7.7c: Inner Product on Complex Matrices

On $M_{m \times n}(\mathbb{C})$, the Frobenius inner product (written with the conjugate on the second factor to match our first-argument-linear convention):

$$\langle A, B \rangle = \operatorname{tr}(B^H A) = \sum_{i,j} a_{ij} \overline{b_{ij}}$$

This induces the Frobenius norm: $\|A\|_F = \sqrt{\operatorname{tr}(A^H A)} = \sqrt{\sum_{i,j} |a_{ij}|^2}$
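A brief NumPy check (illustrative only) that the trace formula agrees with the entrywise sum and with NumPy's built-in Frobenius norm, shown here for real matrices so no conjugation is involved:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

# Frobenius inner product: trace(A^T B) equals the sum of entrywise products.
ip_trace = np.trace(A.T @ B)
ip_entrywise = np.sum(A * B)
assert np.isclose(ip_trace, ip_entrywise)

# The induced norm agrees with NumPy's built-in Frobenius matrix norm.
assert np.isclose(np.sqrt(np.trace(A.T @ A)), np.linalg.norm(A, "fro"))
print(ip_trace)  # 5.0
```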

Remark 7.4a: Classification of Inner Products

Inner products can be classified by their domain:

  • Finite-dimensional: $\mathbb{R}^n$, $\mathbb{C}^n$, polynomial spaces $P_n$
  • Sequence spaces: $\ell^2$, $\ell^p$ (though $\ell^p$ for $p \neq 2$ is not an inner product space)
  • Function spaces: $L^2$, $C[a,b]$, $H^1$ (Sobolev spaces)

3. Induced Norm

Definition 7.4: Induced Norm

Every inner product induces a norm (notion of length):

$$\|x\| = \sqrt{\langle x, x \rangle}$$
Theorem 7.1: Norm Properties

The induced norm satisfies:

  1. Non-negativity: $\|x\| \geq 0$, with $\|x\| = 0$ iff $x = 0$
  2. Homogeneity: $\|\alpha x\| = |\alpha| \cdot \|x\|$
  3. Triangle inequality: $\|x + y\| \leq \|x\| + \|y\|$
Proof:

(1) Follows directly from positive definiteness of inner product.

(2) $\|\alpha x\|^2 = \langle \alpha x, \alpha x \rangle = |\alpha|^2 \langle x, x \rangle = |\alpha|^2 \|x\|^2$

(3) Follows from Cauchy-Schwarz (proven next).

Example 7.8: Euclidean Norm

On $\mathbb{R}^n$ with the standard inner product:

$$\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$$

For $x = (3, 4)^T$: $\|x\| = \sqrt{9 + 16} = 5$

Example 7.9: L² Norm

On $C[a,b]$:

$$\|f\|_2 = \sqrt{\int_a^b |f(t)|^2\, dt}$$

For $f(t) = \sin(t)$ on $[0, \pi]$: $\|f\|_2 = \sqrt{\pi/2}$

Remark 7.4: Not All Norms Come from Inner Products

The 1-norm $\|x\|_1 = \sum_i |x_i|$ and the $\infty$-norm $\|x\|_\infty = \max_i |x_i|$ do NOT come from any inner product. Only norms satisfying the parallelogram law (see below) are induced by inner products.

Theorem 7.1a: Distance Function

The induced norm defines a metric (distance function):

$$d(x, y) = \|x - y\| = \sqrt{\langle x - y, x - y \rangle}$$

This satisfies: (1) $d(x,y) \geq 0$, (2) $d(x,y) = 0 \iff x = y$, (3) $d(x,y) = d(y,x)$, (4) $d(x,z) \leq d(x,y) + d(y,z)$.

Example 7.9a: Computing Distances

In $\mathbb{R}^3$ with the standard inner product, find the distance between $x = (1, 2, 3)^T$ and $y = (4, 0, 1)^T$:

$$d(x,y) = \|x - y\| = \|(-3, 2, 2)\| = \sqrt{9 + 4 + 4} = \sqrt{17}$$

Example 7.9b: L² Distance Between Functions

The $L^2$ distance between $f(t) = t$ and $g(t) = t^2$ on $[0,1]$:

$$d(f,g) = \sqrt{\int_0^1 (t - t^2)^2\, dt} = \sqrt{\int_0^1 (t^2 - 2t^3 + t^4)\, dt} = \sqrt{\frac{1}{3} - \frac{1}{2} + \frac{1}{5}} = \sqrt{\frac{10 - 15 + 6}{30}} = \sqrt{\frac{1}{30}}$$
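The same distance can be approximated numerically; the sketch below uses a simple midpoint rule in NumPy (the grid size is an arbitrary choice):

```python
import numpy as np

# Midpoint-rule approximation of the L^2 distance between f(t)=t and g(t)=t^2 on [0,1].
n = 100_000
t = (np.arange(n) + 0.5) / n           # midpoints of n equal subintervals
integrand = (t - t**2) ** 2
dist = np.sqrt(np.sum(integrand) / n)  # sqrt of the approximate integral

print(dist, np.sqrt(1 / 30))           # both ~0.18257
```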
Theorem 7.1b: Norm Squared Expansion

For any vectors $x, y$:

$$\|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}\langle x, y \rangle + \|y\|^2$$
$$\|x - y\|^2 = \|x\|^2 - 2\operatorname{Re}\langle x, y \rangle + \|y\|^2$$

Proof:

Expand using the definition of the norm:

$$\|x + y\|^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle$$

By conjugate symmetry: $\langle x, y \rangle + \langle y, x \rangle = \langle x, y \rangle + \overline{\langle x, y \rangle} = 2\operatorname{Re}\langle x, y \rangle$

Corollary 7.0a: Real Inner Products

For real inner product spaces:

$$\|x + y\|^2 = \|x\|^2 + 2\langle x, y \rangle + \|y\|^2$$
Remark 7.4b: Unit Vectors

A vector $x$ with $\|x\| = 1$ is called a unit vector. For any nonzero $x$, the vector $\hat{x} = \frac{x}{\|x\|}$ is the normalization of $x$ (a unit vector in the same direction).

Example 7.9c: Normalizing a Vector

Normalize $x = (3, 4)^T$:

$$\|x\| = \sqrt{9 + 16} = 5, \quad \hat{x} = \frac{1}{5}(3, 4)^T = (0.6, 0.8)^T$$

Verify: $\|\hat{x}\| = \sqrt{0.36 + 0.64} = 1$

4. Cauchy-Schwarz Inequality

Theorem 7.2: Cauchy-Schwarz Inequality

For any vectors $x, y$ in an inner product space:

$$|\langle x, y \rangle| \leq \|x\| \cdot \|y\|$$

Equality holds if and only if $x$ and $y$ are linearly dependent.

Proof:

If $y = 0$, both sides are $0$. Assume $y \neq 0$.

For any scalar $t$, positive definiteness gives:

$$0 \leq \|x - ty\|^2 = \langle x - ty, x - ty \rangle = \|x\|^2 - 2\operatorname{Re}\!\left(\bar{t}\langle x, y \rangle\right) + |t|^2\|y\|^2$$

Choose $t = \frac{\langle x, y \rangle}{\|y\|^2}$ (the optimal value):

$$0 \leq \|x\|^2 - \frac{|\langle x, y \rangle|^2}{\|y\|^2}$$

Rearranging: $|\langle x, y \rangle|^2 \leq \|x\|^2 \|y\|^2$

Taking square roots gives the result. Equality holds iff $x = ty$ for some scalar $t$.

Corollary 7.1: Triangle Inequality
$$\|x + y\| \leq \|x\| + \|y\|$$

Proof:

$$\|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}\langle x, y \rangle + \|y\|^2 \leq \|x\|^2 + 2|\langle x, y \rangle| + \|y\|^2$$

By Cauchy-Schwarz:

$$\leq \|x\|^2 + 2\|x\|\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2$$
Example 7.10: Cauchy-Schwarz in ℝⁿ

For $x = (1, 2, 3)^T$ and $y = (1, 1, 1)^T$:

  • $\langle x, y \rangle = 1 + 2 + 3 = 6$
  • $\|x\| = \sqrt{14}$, $\|y\| = \sqrt{3}$
  • $\|x\| \cdot \|y\| = \sqrt{42} \approx 6.48$

Indeed, $6 \leq 6.48$.
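A quick NumPy verification of this example, plus the equality case for proportional vectors (an illustrative sketch only):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 1.0, 1.0])

lhs = abs(np.dot(x, y))                      # |<x, y>| = 6
rhs = np.linalg.norm(x) * np.linalg.norm(y)  # sqrt(14) * sqrt(3) = sqrt(42)
assert lhs <= rhs
print(lhs, rhs)  # 6.0  6.4807...

# Equality case: y proportional to x (linearly dependent vectors)
y2 = 2.5 * x
assert np.isclose(abs(np.dot(x, y2)), np.linalg.norm(x) * np.linalg.norm(y2))
```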

Example 7.11: Cauchy-Schwarz for Functions

For $f, g \in L^2[a,b]$:

$$\left| \int_a^b f(t)g(t)\, dt \right| \leq \sqrt{\int_a^b |f(t)|^2\, dt} \cdot \sqrt{\int_a^b |g(t)|^2\, dt}$$
Remark 7.5: Importance of Cauchy-Schwarz

The Cauchy-Schwarz inequality is one of the most important inequalities in mathematics:

  • Proves the triangle inequality for norms
  • Defines angles between vectors
  • Bounds correlations in probability
  • Proves Hölder's inequality (generalization)
Example 7.11a: Cauchy-Schwarz for Sums

For sequences $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$:

$$\left(\sum_{i=1}^n a_i b_i\right)^2 \leq \left(\sum_{i=1}^n a_i^2\right)\left(\sum_{i=1}^n b_i^2\right)$$

Example: $(1 \cdot 2 + 2 \cdot 1)^2 = 16 \leq (1 + 4)(4 + 1) = 25$

Theorem 7.2a: Alternative Proof via Discriminant

Consider the quadratic $q(t) = \|x + ty\|^2 = \|y\|^2 t^2 + 2\operatorname{Re}\langle x,y\rangle\, t + \|x\|^2$ for real $t$.

Since $q(t) \geq 0$ for all real $t$, the discriminant must be non-positive:

$$4(\operatorname{Re}\langle x,y\rangle)^2 - 4\|x\|^2\|y\|^2 \leq 0$$

This gives $|\operatorname{Re}\langle x,y\rangle| \leq \|x\|\|y\|$. In the complex case, replace $x$ by $\lambda x$ with $|\lambda| = 1$ chosen so that $\lambda\langle x, y\rangle = |\langle x, y\rangle|$; the same bound then yields $|\langle x,y\rangle| \leq \|x\|\|y\|$.

Corollary 7.1a: Equality in Cauchy-Schwarz

Equality $|\langle x, y \rangle| = \|x\| \cdot \|y\|$ holds if and only if:

  • $x = 0$ or $y = 0$, or
  • $x = \lambda y$ for some scalar $\lambda$

In other words, equality holds iff $x$ and $y$ are linearly dependent.

Example 7.11b: When Equality Holds

For $x = (2, 4)^T$ and $y = (1, 2)^T = \frac{1}{2}x$:

$$\langle x, y \rangle = 2 + 8 = 10, \quad \|x\| = \sqrt{20}, \quad \|y\| = \sqrt{5}$$
$$|\langle x, y \rangle| = 10 = \sqrt{20}\sqrt{5} = \|x\|\|y\|$$

Equality holds because $x = 2y$ (linearly dependent).

Theorem 7.2b: Reverse Cauchy-Schwarz

For nonzero vectors $x, y$ in a real inner product space, let $\theta$ be the angle between them. If $\sin\theta \neq 0$, then:

$$\|x\| \cdot \|y\| \leq \frac{\|x + y\| \cdot \|x - y\|}{2\,|\sin\theta|}$$

This follows from $\|x+y\|^2\|x-y\|^2 = (\|x\|^2 + \|y\|^2)^2 - 4\langle x,y\rangle^2$, together with $(\|x\|^2+\|y\|^2)^2 \geq 4\|x\|^2\|y\|^2$ and $\langle x,y\rangle^2 = \|x\|^2\|y\|^2\cos^2\theta$.

Example 7.11c: Cauchy-Schwarz in Probability

For random variables $X, Y$ with finite second moments:

$$|E[XY]|^2 \leq E[X^2] \cdot E[Y^2]$$

This is Cauchy-Schwarz with $\langle X, Y \rangle = E[XY]$, the $L^2$ inner product.

Equality holds iff $X$ and $Y$ are linearly related (one is a constant multiple of the other).

Corollary 7.1b: Correlation Bound

The correlation coefficient $\rho = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y}$ satisfies:

$$-1 \leq \rho \leq 1$$

This is a direct consequence of Cauchy-Schwarz applied to centered random variables.

Remark 7.5a: Generalizations

Cauchy-Schwarz generalizes to:

  • Hölder's inequality: $\|fg\|_1 \leq \|f\|_p \|g\|_q$ where $1/p + 1/q = 1$
  • Minkowski's inequality: the triangle inequality for $L^p$ norms
  • Bessel's inequality: bounds on Fourier coefficients
Example 7.11d: Proving AM-HM via Cauchy-Schwarz

For positive $a_1, \ldots, a_n$:

$$\frac{a_1 + \cdots + a_n}{n} \geq \frac{n}{\frac{1}{a_1} + \cdots + \frac{1}{a_n}}$$

Apply Cauchy-Schwarz to $(\sqrt{a_1}, \ldots, \sqrt{a_n})$ and $(1/\sqrt{a_1}, \ldots, 1/\sqrt{a_n})$: this gives $n^2 \leq \left(\sum a_i\right)\left(\sum 1/a_i\right)$, which rearranges to the inequality above.

5. Angles and Orthogonality

Definition 7.5: Angle Between Vectors

For nonzero vectors $x, y$ in a real inner product space, the angle $\theta$ between them is defined by:

$$\cos\theta = \frac{\langle x, y \rangle}{\|x\| \cdot \|y\|}$$

By Cauchy-Schwarz, $|\cos\theta| \leq 1$, so $\theta \in [0, \pi]$ is well-defined.

Definition 7.6: Orthogonality

Vectors $x$ and $y$ are orthogonal (perpendicular), written $x \perp y$, if:

$$\langle x, y \rangle = 0$$

This corresponds to $\theta = 90°$ (or $\pi/2$ radians).

Example 7.12: Orthogonal Vectors in ℝ³

Vectors $x = (1, 0, 1)^T$ and $y = (1, 0, -1)^T$:

$$\langle x, y \rangle = 1(1) + 0(0) + 1(-1) = 0$$

So $x \perp y$.

Example 7.13: Angle Calculation

For $x = (1, 1)^T$ and $y = (1, 0)^T$:

$$\cos\theta = \frac{1 \cdot 1 + 1 \cdot 0}{\sqrt{2} \cdot 1} = \frac{1}{\sqrt{2}}$$

So $\theta = 45° = \pi/4$.
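In code, the angle formula is one line plus a clip to guard against rounding; this small NumPy helper (the name `angle` is just illustrative) reproduces the example:

```python
import numpy as np

def angle(x, y):
    """Angle between nonzero vectors via cos(theta) = <x,y> / (||x|| ||y||)."""
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(c, -1.0, 1.0))  # clip guards against values just outside [-1, 1]

print(np.degrees(angle(np.array([1.0, 1.0]), np.array([1.0, 0.0]))))  # 45.0
```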

Theorem 7.3: Pythagorean Theorem

If $x \perp y$, then:

$$\|x + y\|^2 = \|x\|^2 + \|y\|^2$$

Proof:

$$\|x + y\|^2 = \langle x + y, x + y \rangle = \|x\|^2 + 2\operatorname{Re}\langle x, y \rangle + \|y\|^2 = \|x\|^2 + \|y\|^2$$

since $\langle x, y \rangle = 0$.

Corollary 7.2: Generalized Pythagorean Theorem

If $x_1, \ldots, x_n$ are pairwise orthogonal:

$$\|x_1 + \cdots + x_n\|^2 = \|x_1\|^2 + \cdots + \|x_n\|^2$$
Definition 7.7: Orthogonal Complement

For a subset $S \subseteq V$, the orthogonal complement is:

$$S^\perp = \{v \in V : \langle v, s \rangle = 0 \text{ for all } s \in S\}$$
Theorem 7.4: Orthogonal Complement is a Subspace

For any subset $S$, $S^\perp$ is a subspace of $V$.

Proof:

Zero vector: $\langle 0, s \rangle = 0$ for all $s$, so $0 \in S^\perp$.

Closure: If $x, y \in S^\perp$ and $\alpha, \beta$ are scalars:

$$\langle \alpha x + \beta y, s \rangle = \alpha\langle x, s \rangle + \beta\langle y, s \rangle = 0$$
Example 7.13a: Orthogonal Vectors in Function Space

On $C[-\pi, \pi]$ with $\langle f, g \rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx$:

The functions $\sin(nx)$ and $\cos(mx)$ are orthogonal for all integers $n, m$:

$$\langle \sin(nx), \cos(mx) \rangle = \int_{-\pi}^{\pi} \sin(nx)\cos(mx)\,dx = 0$$

This orthogonality is the foundation of Fourier series.

Example 7.13b: Angle Between Functions

Find the angle between $f(x) = 1$ and $g(x) = x$ on $[-1, 1]$:

$$\langle f, g \rangle = \int_{-1}^{1} 1 \cdot x\, dx = 0$$

Since $\langle f, g \rangle = 0$, the angle is $\theta = 90°$ — they are orthogonal!
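Both orthogonality claims can be checked numerically with a midpoint-rule approximation of the integral inner product; a small NumPy sketch (grid size is an arbitrary choice):

```python
import numpy as np

# Midpoint-rule inner product on [-pi, pi].
n = 200_000
t = -np.pi + (np.arange(n) + 0.5) * (2 * np.pi / n)

def ip(f, g):
    return np.sum(f(t) * g(t)) * (2 * np.pi / n)

# sin(2x) and cos(3x) are (numerically) orthogonal on [-pi, pi].
print(ip(lambda x: np.sin(2 * x), lambda x: np.cos(3 * x)))  # ~0

# f(x) = 1 and g(x) = x are orthogonal on [-1, 1] (odd integrand).
s = -1 + (np.arange(n) + 0.5) * (2 / n)
print(np.sum(1.0 * s) * (2 / n))  # ~0
```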

Theorem 7.4a: Orthogonality and Linear Independence

A set of nonzero pairwise orthogonal vectors is linearly independent.

Proof:

Suppose $\alpha_1 v_1 + \cdots + \alpha_n v_n = 0$ with $v_i \perp v_j$ for $i \neq j$.

Take the inner product with $v_k$:

$$0 = \left\langle \sum_i \alpha_i v_i, v_k \right\rangle = \sum_i \alpha_i \langle v_i, v_k \rangle = \alpha_k \|v_k\|^2$$

Since $v_k \neq 0$, we have $\|v_k\|^2 > 0$, so $\alpha_k = 0$.

Corollary 7.2a: Maximum Orthogonal Set

In an $n$-dimensional inner product space, any orthogonal set has at most $n$ nonzero vectors.

Definition 7.7a: Orthonormal Set

A set $\{e_1, \ldots, e_n\}$ is orthonormal if:

$$\langle e_i, e_j \rangle = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$

That is, the vectors are pairwise orthogonal and each has unit length.

Example 7.13c: Standard Orthonormal Basis

The standard basis $\{e_1, \ldots, e_n\}$ of $\mathbb{R}^n$ is orthonormal:

$$e_1 = (1,0,\ldots,0)^T, \quad e_2 = (0,1,\ldots,0)^T, \quad \ldots$$

We have $\langle e_i, e_j \rangle = \delta_{ij}$ for the standard inner product.

Example 7.13d: Another Orthonormal Basis in ℝ²

The rotated basis $\{u_1, u_2\}$ where:

$$u_1 = \frac{1}{\sqrt{2}}(1, 1)^T, \quad u_2 = \frac{1}{\sqrt{2}}(1, -1)^T$$

Verify: $\|u_1\| = \|u_2\| = 1$ and $\langle u_1, u_2 \rangle = \frac{1}{2}(1 - 1) = 0$.

Remark 7.6a: Coordinates in Orthonormal Basis

If $\{e_1, \ldots, e_n\}$ is orthonormal, coordinates are easy to find:

$$v = \sum_{i=1}^n \langle v, e_i \rangle e_i$$

The coefficient of $e_i$ is simply $\langle v, e_i \rangle$ — no system of equations needed!

Theorem 7.4b: Parseval's Identity

If $\{e_1, \ldots, e_n\}$ is an orthonormal basis and $v = \sum_i c_i e_i$:

$$\|v\|^2 = \sum_{i=1}^n |c_i|^2 = \sum_{i=1}^n |\langle v, e_i \rangle|^2$$

The squared norm equals the sum of squared coefficients — a generalized Pythagorean theorem.
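A short NumPy illustration using the orthonormal basis of Example 7.13d: the coordinates are plain inner products, and Parseval's identity holds for them (an illustrative sketch only):

```python
import numpy as np

# Orthonormal basis of R^2 from Example 7.13d.
u1 = np.array([1.0, 1.0]) / np.sqrt(2)
u2 = np.array([1.0, -1.0]) / np.sqrt(2)

v = np.array([3.0, 5.0])
c1, c2 = np.dot(v, u1), np.dot(v, u2)   # coordinates are just inner products

# Reconstruction and Parseval's identity.
assert np.allclose(v, c1 * u1 + c2 * u2)
assert np.isclose(np.dot(v, v), c1**2 + c2**2)
print(c1, c2)  # 5.656..., -1.414...
```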

6. Parallelogram Law and Polarization

Theorem 7.5: Parallelogram Law

In any inner product space:

$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$$

Proof:

Expand both sides using the inner product:

$$\|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}\langle x, y \rangle + \|y\|^2$$
$$\|x - y\|^2 = \|x\|^2 - 2\operatorname{Re}\langle x, y \rangle + \|y\|^2$$

Adding these gives the result.

Remark 7.6: Geometric Interpretation

The parallelogram law states that the sum of the squares of the diagonals of a parallelogram equals the sum of the squares of all four sides. This is a fundamental property that characterizes inner product spaces.

Theorem 7.6: Characterization of Inner Product Norms

A normed vector space $(V, \|\cdot\|)$ is an inner product space (with the norm induced by that inner product) if and only if the norm satisfies the parallelogram law.

Example 7.14: 1-Norm Fails Parallelogram Law

On $\mathbb{R}^2$ with the 1-norm, take $x = (1, 0)$, $y = (0, 1)$:

  • $\|x + y\|_1 = \|(1,1)\|_1 = 2$
  • $\|x - y\|_1 = \|(1,-1)\|_1 = 2$
  • LHS: $4 + 4 = 8$
  • RHS: $2(1) + 2(1) = 4$

Since $8 \neq 4$, the 1-norm doesn't come from an inner product.
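The same check in NumPy, comparing the 2-norm (which satisfies the law) with the 1-norm (which does not); `parallelogram_gap` is an illustrative helper name:

```python
import numpy as np

def parallelogram_gap(x, y, ord=None):
    """LHS - RHS of the parallelogram law for a given vector norm (ord as in np.linalg.norm)."""
    n = lambda v: np.linalg.norm(v, ord)
    return n(x + y)**2 + n(x - y)**2 - 2 * n(x)**2 - 2 * n(y)**2

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(parallelogram_gap(x, y))          # 0.0 -> the 2-norm satisfies the law
print(parallelogram_gap(x, y, ord=1))   # 4.0 -> the 1-norm violates it (8 vs 4)
```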

Theorem 7.7: Polarization Identity (Real Case)

In a real inner product space:

$$\langle x, y \rangle = \frac{1}{4}\left(\|x + y\|^2 - \|x - y\|^2\right)$$
Theorem 7.8: Polarization Identity (Complex Case)

In a complex inner product space:

$$\langle x, y \rangle = \frac{1}{4}\left(\|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2\right)$$
Remark 7.7: Significance of Polarization

The polarization identity shows that the inner product is completely determined by the norm. If you know all the lengths, you can compute all the inner products (and hence all the angles).

Example 7.14a: Using Polarization Identity

Given $\|x\| = 3$, $\|y\| = 4$, $\|x+y\| = 5$ in a real inner product space, find $\langle x, y \rangle$:

$$\|x+y\|^2 = \|x\|^2 + 2\langle x,y\rangle + \|y\|^2$$
$$25 = 9 + 2\langle x,y\rangle + 16 \implies \langle x,y\rangle = 0$$

The vectors are orthogonal! (This is the 3-4-5 right triangle.)
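A tiny NumPy sketch of polarization in action: recovering $\langle x, y \rangle$ from lengths alone, here for a concrete pair of perpendicular legs of length 3 and 4 (illustrative only):

```python
import numpy as np

def inner_from_norms(x, y):
    """Recover the real inner product from lengths via the polarization identity."""
    n = np.linalg.norm
    return 0.25 * (n(x + y)**2 - n(x - y)**2)

x = np.array([3.0, 0.0])
y = np.array([0.0, 4.0])
print(inner_from_norms(x, y), np.dot(x, y))  # 0.0  0.0  (the 3-4-5 right triangle)
```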

Theorem 7.8a: Apollonius Identity

For any vectors $x, y$, any point $z$, and the midpoint $m = \frac{x+y}{2}$:

$$\|x - z\|^2 + \|y - z\|^2 = 2\|m - z\|^2 + \frac{1}{2}\|x - y\|^2$$

This relates the distances from a point $z$ to the endpoints and midpoint of a segment.

Proof:

Apply the parallelogram law to the vectors $x - z$ and $y - z$:

$$\|x-z\|^2 + \|y-z\|^2 = \frac{1}{2}\|(x-z)+(y-z)\|^2 + \frac{1}{2}\|(x-z)-(y-z)\|^2 = \frac{1}{2}\|x+y-2z\|^2 + \frac{1}{2}\|x-y\|^2 = 2\|m-z\|^2 + \frac{1}{2}\|x-y\|^2$$
Remark 7.7a: Jordan-von Neumann Theorem

The parallelogram law completely characterizes inner product spaces among normed spaces. This deep result (Jordan-von Neumann, 1935) shows that the parallelogram law is the only additional axiom needed to get from a normed space to an inner product space.

Example 7.14b: Verifying the Parallelogram Law

In $\mathbb{R}^2$ with $x = (1, 2)^T$, $y = (3, 1)^T$:

  • $\|x+y\|^2 = \|(4,3)\|^2 = 16 + 9 = 25$
  • $\|x-y\|^2 = \|(-2,1)\|^2 = 4 + 1 = 5$
  • LHS: $25 + 5 = 30$
  • RHS: $2\|x\|^2 + 2\|y\|^2 = 2(5) + 2(10) = 30$

Both sides equal 30 ✓

Theorem 7.8b: Polarization in Higher Dimensions

In a real inner product space, the polarization identity can be written as:

$$4\langle x, y \rangle = \|x+y\|^2 - \|x-y\|^2$$

In complex inner product spaces, we need all four terms:

$$4\langle x, y \rangle = \|x+y\|^2 - \|x-y\|^2 + i\|x+iy\|^2 - i\|x-iy\|^2$$

7. Important Properties

Theorem 7.9: Continuity of Inner Product

The inner product is continuous: if $x_n \to x$ and $y_n \to y$ (in norm), then:

$$\langle x_n, y_n \rangle \to \langle x, y \rangle$$

Proof:

$$|\langle x_n, y_n \rangle - \langle x, y \rangle| \leq |\langle x_n - x, y_n \rangle| + |\langle x, y_n - y \rangle|$$

By Cauchy-Schwarz:

$$\leq \|x_n - x\| \cdot \|y_n\| + \|x\| \cdot \|y_n - y\| \to 0$$
Theorem 7.10: Reverse Triangle Inequality
$$\big|\, \|x\| - \|y\| \,\big| \leq \|x - y\|$$

Proof:

By the triangle inequality: $\|x\| = \|(x-y) + y\| \leq \|x-y\| + \|y\|$

So $\|x\| - \|y\| \leq \|x-y\|$. By symmetry, $\|y\| - \|x\| \leq \|x-y\|$.

Theorem 7.11: Properties of Orthogonal Complement

For subspaces $U, W$ of an inner product space $V$:

  1. $U \subseteq W \implies W^\perp \subseteq U^\perp$
  2. $U \cap U^\perp = \{0\}$
  3. $U \subseteq (U^\perp)^\perp$
  4. In finite dimensions: $V = U \oplus U^\perp$
Example 7.15: Orthogonal Complement in ℝ³

Let $U = \operatorname{span}\{(1, 0, 0), (0, 1, 0)\}$ (the $xy$-plane in $\mathbb{R}^3$).

Then $U^\perp = \operatorname{span}\{(0, 0, 1)\}$ (the $z$-axis).

And $\mathbb{R}^3 = U \oplus U^\perp$.

Remark 7.8: Finite vs Infinite Dimensions

In finite-dimensional spaces, $U^{\perp\perp} = U$ always. In infinite dimensions, we only have $U \subseteq U^{\perp\perp}$, with equality iff $U$ is closed.

Theorem 7.11a: Dimension Formula for Orthogonal Complement

For a subspace $U$ of a finite-dimensional inner product space $V$:

$$\dim(U) + \dim(U^\perp) = \dim(V)$$

Proof:

Since $V = U \oplus U^\perp$ (direct sum), the dimensions add.

Example 7.15a: Computing Orthogonal Complement

In $\mathbb{R}^3$, let $U = \operatorname{span}\{(1, 1, 0)^T\}$. Find $U^\perp$.

A vector $(x, y, z)^T \in U^\perp$ iff $\langle (x,y,z), (1,1,0) \rangle = x + y = 0$.

So $U^\perp = \{(x, -x, z) : x, z \in \mathbb{R}\} = \operatorname{span}\{(1,-1,0)^T, (0,0,1)^T\}$.

Check: $\dim(U) + \dim(U^\perp) = 1 + 2 = 3$

Theorem 7.11b: Best Approximation Property

Let $U$ be a closed subspace of an inner product space $V$. For any $v \in V$, there exists a unique $u \in U$ minimizing $\|v - u\|$. This minimizer is characterized by:

$$v - u \in U^\perp$$
Remark 7.8a: Projection

The vector $u$ in the theorem is called the orthogonal projection of $v$ onto $U$, written $P_U v$ or $\operatorname{proj}_U v$. It minimizes the distance from $v$ to $U$.

Example 7.15b: Projection in ℝ³

Project $v = (1, 2, 3)^T$ onto the $xy$-plane $U = \{(x,y,0)\}$:

$$P_U v = (1, 2, 0)^T$$

The residual $v - P_U v = (0, 0, 3)^T \in U^\perp$ (the $z$-axis).

Theorem 7.11c: Projection Formula

If $\{e_1, \ldots, e_k\}$ is an orthonormal basis for the subspace $U$:

$$P_U v = \sum_{i=1}^{k} \langle v, e_i \rangle e_i$$

Proof:

We need $v - P_U v \perp U$. For any $e_j$:

$$\left\langle v - \sum_i \langle v, e_i \rangle e_i,\; e_j \right\rangle = \langle v, e_j \rangle - \sum_i \langle v, e_i \rangle \langle e_i, e_j \rangle = \langle v, e_j \rangle - \langle v, e_j \rangle = 0$$
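A minimal NumPy sketch of this projection formula; the `project` helper (an illustrative name) assumes the basis vectors passed in are already orthonormal:

```python
import numpy as np

def project(v, basis):
    """Orthogonal projection of v onto span(basis), where the basis vectors are orthonormal."""
    return sum(np.dot(v, e) * e for e in basis)

# Project onto the xy-plane of R^3 (orthonormal basis e1, e2).
basis = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
v = np.array([1.0, 2.0, 3.0])
p = project(v, basis)

print(p)                                                  # [1. 2. 0.]
print(np.dot(v - p, basis[0]), np.dot(v - p, basis[1]))   # residual is orthogonal to U
```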

8. Inner Products and Matrices

Theorem 7.12: Matrix Representation of Inner Products

Let $V$ be an $n$-dimensional real inner product space with basis $\{e_1, \ldots, e_n\}$. Define the Gram matrix:

$$G_{ij} = \langle e_i, e_j \rangle$$

Then for $x = \sum_i x_i e_i$ and $y = \sum_j y_j e_j$ (identifying vectors with their coordinate columns):

$$\langle x, y \rangle = x^T G y$$
Definition 7.8: Positive Definite Matrix

A symmetric matrix $G \in M_n(\mathbb{R})$ is positive definite if:

$$x^T G x > 0 \quad \text{for all } x \neq 0$$

Equivalently, all eigenvalues of $G$ are positive.

Theorem 7.13: Characterization of Inner Product Matrices

$\langle x, y \rangle = x^T G y$ defines an inner product on $\mathbb{R}^n$ if and only if $G$ is symmetric and positive definite.

Example 7.16: Non-Standard Inner Product

Let $G = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$. Check positive definiteness:

$$x^T G x = 2x_1^2 + 2x_1 x_2 + 2x_2^2 = x_1^2 + (x_1 + x_2)^2 + x_2^2 > 0$$

for $x \neq 0$. So $\langle x, y \rangle = x^T G y$ is a valid inner product.

Remark 7.9: Standard Basis

For the standard inner product on $\mathbb{R}^n$ with the standard basis, $G = I$. A basis whose Gram matrix is $G = I$ is called an orthonormal basis.

Theorem 7.14: Hermitian Matrices for Complex Inner Products

On $\mathbb{C}^n$, $\langle x, y \rangle = y^H G x$ (written to stay linear in the first argument) defines an inner product iff $G$ is Hermitian ($G = G^H$) and positive definite.

Example 7.17: Checking Hermitian Positive Definiteness

Is $G = \begin{pmatrix} 2 & i \\ -i & 3 \end{pmatrix}$ positive definite?

Check $G = G^H$: yes (the conjugate transpose equals $G$).

Eigenvalues: $\lambda = \frac{5 \pm \sqrt{25-20}}{2} = \frac{5 \pm \sqrt{5}}{2} > 0$

Both are positive, so $G$ defines a valid inner product.

Theorem 7.14a: Cholesky Decomposition

A symmetric matrix $G$ is positive definite if and only if it can be written as:

$$G = L L^T$$

where $L$ is lower triangular with positive diagonal entries. This is the Cholesky decomposition.

Example 7.17a: Cholesky Decomposition Example

Factor $G = \begin{pmatrix} 4 & 2 \\ 2 & 5 \end{pmatrix}$:

$$L = \begin{pmatrix} 2 & 0 \\ 1 & 2 \end{pmatrix}, \quad L L^T = \begin{pmatrix} 4 & 2 \\ 2 & 5 \end{pmatrix} = G$$

The positive diagonal of $L$ confirms that $G$ is positive definite.
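NumPy computes this factorization directly; a short sketch (the failure case illustrates how a matrix that is not positive definite is rejected):

```python
import numpy as np

G = np.array([[4.0, 2.0], [2.0, 5.0]])

L = np.linalg.cholesky(G)      # lower triangular factor with positive diagonal
assert np.allclose(L @ L.T, G)
print(L)
# [[2. 0.]
#  [1. 2.]]

# np.linalg.cholesky raises LinAlgError for a matrix that is not positive definite:
try:
    np.linalg.cholesky(np.array([[1.0, 2.0], [2.0, 1.0]]))
except np.linalg.LinAlgError as e:
    print("not positive definite:", e)
```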

Theorem 7.14b: Sylvester's Criterion

A symmetric matrix $G$ is positive definite iff all leading principal minors are positive:

$$\det(G_1) > 0, \quad \det(G_2) > 0, \quad \ldots, \quad \det(G_n) > 0$$

where $G_k$ is the upper-left $k \times k$ submatrix.

Example 7.17b: Using Sylvester's Criterion

Is $G = \begin{pmatrix} 3 & 1 \\ 1 & 2 \end{pmatrix}$ positive definite?

  • $\det(G_1) = 3 > 0$
  • $\det(G_2) = 6 - 1 = 5 > 0$

Yes, $G$ is positive definite.

Remark 7.9a: Change of Basis

If we change basis via an invertible matrix $P$ whose columns express the new basis vectors in the old coordinates (so old coordinates satisfy $x = P x'$), the Gram matrix transforms as:

$$G' = P^T G P$$

The inner product is unchanged: $x_1'^T G' x_2' = x_1^T G x_2$.

Theorem 7.14c: Diagonalization of Gram Matrix

Every real symmetric positive definite $G$ can be diagonalized by an orthogonal matrix $Q$:

$$G = Q \Lambda Q^T$$

where $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ with $\lambda_i > 0$.

9. Common Mistakes

Forgetting conjugation in complex inner products

For $\mathbb{C}^n$, use $\langle x, y \rangle = \sum_i x_i \bar{y}_i$, NOT $\sum_i x_i y_i$. Without conjugation, $\langle x, x \rangle$ can be negative or even non-real!

Assuming all norms come from inner products

Only norms satisfying the parallelogram law are induced by inner products. The 1-norm and ∞-norm are NOT inner product norms.

Confusing positive definite with positive semidefinite

Positive definite: $\langle x, x \rangle > 0$ for $x \neq 0$. Positive semidefinite allows $\langle x, x \rangle = 0$ for some $x \neq 0$. Only positive definite forms are inner products.

Wrong linearity convention

Some texts use linearity in the second argument (the convention common in physics). Be consistent! We use linearity in the first argument, as is standard in most linear algebra texts.

Misapplying Cauchy-Schwarz

The inequality is $|\langle x, y \rangle| \leq \|x\| \cdot \|y\|$, NOT $\langle x, y \rangle \leq \|x\| \cdot \|y\|$. Don't forget the absolute value!

Confusing inner product with norm

The inner product $\langle x, y \rangle$ takes two arguments and can be negative. The norm $\|x\|$ takes one argument and is always non-negative.

Forgetting to check positive definiteness

When defining a new "inner product," always verify all three axioms. Positive definiteness is often the hardest to check—make sure $\langle x, x \rangle = 0$ only when $x = 0$.

Assuming orthogonality is transitive

If $x \perp y$ and $y \perp z$, it does NOT follow that $x \perp z$. Example: $(1,0) \perp (0,1)$ and $(0,1) \perp (1,0)$, but $(1,0) \not\perp (1,0)$.

Mixing up triangle inequality directions

Triangle inequality: $\|x + y\| \leq \|x\| + \|y\|$. Reverse triangle inequality: $\big|\|x\| - \|y\|\big| \leq \|x - y\|$. Don't confuse them!

10. Applications

Quantum Mechanics

Quantum states live in complex Hilbert spaces. The inner product $\langle \psi | \phi \rangle$ gives probability amplitudes. Orthogonal states are distinguishable.

Signal Processing

The $L^2$ inner product measures signal correlation. Orthogonal signals don't interfere. Fourier analysis uses orthogonal basis functions.

Statistics & Machine Learning

Covariance is an inner product. Correlation = cosine of angle. Kernel methods use inner products in feature spaces.

Computer Graphics

Lighting calculations use dot products (inner products). Surface normals and view directions determine shading.

Numerical Analysis

Least squares uses orthogonal projections. Krylov methods (like conjugate gradient) exploit inner products for efficient solving.

Approximation Theory

Best approximations minimize distance (norm). Orthogonal polynomials (Legendre, Chebyshev) arise from different inner products.

Remark 7.10a: Application: Least Squares

Given an inconsistent system $Ax = b$, least squares finds $\hat{x}$ minimizing $\|Ax - b\|^2$. The solution satisfies the normal equations:

$$A^T A \hat{x} = A^T b$$

Geometrically, $A\hat{x}$ is the orthogonal projection of $b$ onto the column space of $A$.

Example 7.18: Least Squares Fit

Fit a line $y = ax + b$ to the points $(0, 1), (1, 2), (2, 2)$:

$$A = \begin{pmatrix} 0 & 1 \\ 1 & 1 \\ 2 & 1 \end{pmatrix}, \quad b = \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}$$

Normal equations: $A^T A = \begin{pmatrix} 5 & 3 \\ 3 & 3 \end{pmatrix}$, $A^T b = \begin{pmatrix} 6 \\ 5 \end{pmatrix}$

Solving: $a = 1/2$, $b = 7/6$, so the best-fit line is $y = \frac{1}{2}x + \frac{7}{6}$.
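The same fit in NumPy, once via the normal equations and once via the built-in least squares routine (an illustrative sketch):

```python
import numpy as np

# Design matrix and observations from Example 7.18.
A = np.array([[0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0]])
b = np.array([1.0, 2.0, 2.0])

# Solve the normal equations A^T A x = A^T b ...
coef_normal = np.linalg.solve(A.T @ A, A.T @ b)
# ... or use NumPy's least squares solver directly.
coef_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(coef_normal)  # [0.5  1.1666...]  i.e. slope 1/2, intercept 7/6
assert np.allclose(coef_normal, coef_lstsq)
```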

Remark 7.10b: Application: Fourier Series

Any function $f \in L^2[-\pi, \pi]$ has a Fourier expansion (converging in the $L^2$ norm):

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left(a_n \cos(nx) + b_n \sin(nx)\right)$$

The coefficients are inner products: $a_n = \frac{1}{\pi}\langle f, \cos(nx) \rangle$ and $b_n = \frac{1}{\pi}\langle f, \sin(nx) \rangle$.

Remark 7.10c: Application: Data Science

In machine learning and data science:

  • Cosine similarity: Measures similarity between documents/vectors
  • PCA: Finds orthogonal directions of maximum variance
  • Kernel methods: Inner products in high-dimensional feature spaces

11. Key Takeaways

Inner Product Axioms

  • Linearity in first argument
  • Conjugate symmetry
  • Positive definiteness

Induced Norm

$$\|x\| = \sqrt{\langle x, x \rangle}$$

Measures "length" of vectors

Cauchy-Schwarz

$$|\langle x, y \rangle| \leq \|x\| \cdot \|y\|$$

Most important inequality!

Orthogonality

$$x \perp y \iff \langle x, y \rangle = 0$$

Generalizes perpendicularity

Chapter Summary

Inner products generalize the familiar dot product to abstract vector spaces, providing the geometric concepts of length, angle, and orthogonality. The Cauchy-Schwarz inequality is the cornerstone result that makes everything work.


Essential Formulas to Remember

Core Definitions

  • Inner product axioms: linearity, conjugate symmetry, positive definiteness
  • Induced norm: $\|x\| = \sqrt{\langle x, x \rangle}$
  • Orthogonality: $x \perp y \iff \langle x, y \rangle = 0$

Key Inequalities

  • Cauchy-Schwarz: $|\langle x, y \rangle| \leq \|x\| \|y\|$
  • Triangle: $\|x + y\| \leq \|x\| + \|y\|$
  • Parallelogram: $\|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2$

Connections to Other Topics

Linear Maps

Adjoint operators, unitary/orthogonal matrices

Eigenvalues

Spectral theorem, orthogonal diagonalization

Applications

SVD, least squares, Fourier analysis

12. What's Next?

With inner products mastered, you're ready for:

  • Orthogonality (LA-7.2): Orthonormal bases, orthogonal sets, and their special properties
  • Gram-Schmidt (LA-7.3): The algorithm to construct orthonormal bases from any basis
  • Orthogonal Projections (LA-7.4): Best approximations and least squares
  • Spectral Theorem (LA-7.5): Diagonalization of self-adjoint operators

Study Tips for This Chapter

  • Practice computing inner products in different spaces (ℝⁿ, ℂⁿ, function spaces)
  • Memorize the Cauchy-Schwarz inequality and its proof—it's fundamental
  • Always check all three axioms when verifying an inner product
  • Draw pictures! Orthogonality, projections, and angles have geometric meaning
  • Connect back to familiar dot product intuition from ℝ²/ℝ³

Practice Problems to Try

  1. Verify that the weighted inner product $\langle x, y \rangle_w = \sum_i w_i x_i y_i$ satisfies all the axioms
  2. Prove Cauchy-Schwarz using the quadratic discriminant method
  3. Show that $\{1, x - \frac{1}{2}\}$ is orthogonal on $[0, 1]$
  4. Find the orthogonal complement of $\operatorname{span}\{(1, 1, 1)\}$ in $\mathbb{R}^3$
  5. Prove that the parallelogram law fails for the 1-norm

Quick Reference

Key Formulas

  • Norm: $\|x\| = \sqrt{\langle x, x \rangle}$
  • Angle: $\cos\theta = \frac{\langle x, y \rangle}{\|x\|\|y\|}$
  • Cauchy-Schwarz: $|\langle x,y\rangle| \leq \|x\|\|y\|$
  • Pythagorean: $x \perp y \Rightarrow \|x+y\|^2 = \|x\|^2 + \|y\|^2$
  • Parallelogram: $\|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2$
  • Polarization (real): $\langle x, y \rangle = \frac{1}{4}(\|x+y\|^2 - \|x-y\|^2)$

Standard Inner Products

  • $\mathbb{R}^n$: $\langle x, y \rangle = x^T y = \sum_i x_i y_i$
  • $\mathbb{C}^n$: $\langle x, y \rangle = \sum_i x_i \bar{y}_i = y^H x$
  • $L^2[a,b]$: $\langle f, g \rangle = \int_a^b f(t)\overline{g(t)}\, dt$
  • Matrices: $\langle A, B \rangle = \operatorname{tr}(B^H A) = \sum_{i,j} a_{ij}\overline{b_{ij}}$
  • Weighted: $\langle x, y \rangle_w = \sum_i w_i x_i y_i$
  • General: $\langle x, y \rangle_G = y^H G x$ ($G$ positive definite)

Inner Product Axioms

  • Linearity: $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$
  • Conjugate symmetry: $\langle y, x \rangle = \overline{\langle x, y \rangle}$
  • Positive definiteness: $\langle x, x \rangle \geq 0$, with $\langle x, x \rangle = 0 \iff x = 0$

Important Definitions

  • Orthogonal: $x \perp y \iff \langle x, y \rangle = 0$
  • Unit vector: $\|x\| = 1$
  • Orthogonal complement: $S^\perp = \{v : \langle v, s \rangle = 0 \;\; \forall s \in S\}$

Frequently Asked Questions

What's the difference between inner product and dot product?

The dot product is a specific inner product on ℝⁿ. An inner product is a more general concept that can be defined on any vector space satisfying the axioms (linearity, conjugate symmetry, positive definiteness). Every inner product induces a notion of 'dot product' in its space.

Why do complex inner products use conjugation?

Without conjugation, ⟨x,x⟩ could be negative or complex for nonzero x. Conjugate symmetry ensures ⟨x,x⟩ is always real, and positive definiteness makes it positive for x ≠ 0. This is essential for defining a valid norm.

What does positive definiteness guarantee?

It guarantees that ||x|| = √⟨x,x⟩ is a valid norm: (1) ||x|| ≥ 0, (2) ||x|| = 0 iff x = 0, (3) ||cx|| = |c|||x||. Without it, we couldn't measure 'length' properly.

How is Cauchy-Schwarz useful?

It's fundamental! It proves the triangle inequality, defines angles between vectors (since |cos θ| ≤ 1), bounds correlations in statistics, and appears throughout analysis, probability, and physics.

Can every norm come from an inner product?

No! A norm comes from an inner product if and only if it satisfies the parallelogram law: ||x+y||² + ||x-y||² = 2||x||² + 2||y||². The 1-norm and ∞-norm fail this test.

What's the significance of the polarization identity?

It shows that if you know the norm, you can recover the inner product (when it exists). This means all geometric information is encoded in lengths alone—angles are derived quantities.

Why study inner products beyond ℝⁿ?

Function spaces like L²[a,b] are infinite-dimensional inner product spaces crucial for Fourier analysis, quantum mechanics, and differential equations. The same geometric intuition (length, angle, orthogonality) extends to these spaces.

What's a Hilbert space?

A complete inner product space—meaning every Cauchy sequence converges. ℝⁿ and ℂⁿ are finite-dimensional Hilbert spaces. L²[a,b] is an infinite-dimensional example, fundamental in functional analysis.

How do inner products relate to matrices?

For finite-dimensional spaces, ⟨x,y⟩ = xᵀAy for some positive definite matrix A. The standard inner product uses A = I. Changing A changes the geometry (lengths and angles) of the space.

What's the difference between sesquilinear and bilinear?

Bilinear means linear in both arguments. Sesquilinear (Latin: 'one-and-a-half linear') means linear in one argument, conjugate-linear in the other. Complex inner products are sesquilinear; real inner products are bilinear.