
Limit Theorems for Random Variables

Convergence concepts and asymptotic behavior of random sequences

8-10 Hours Study · 12 Lessons · Advanced Level · 4 Key Topics
Learning Objectives
  • Understand the different types of convergence for sequences of random variables
  • Master the fundamental laws of large numbers and their applications
  • Apply the central limit theorem to practical probability problems
  • Analyze convergence properties using characteristic functions
  • Distinguish between weak and strong convergence concepts
Key Topics Overview
Essential concepts in limit theorems for random variables

  • Convergence in Distribution: weak convergence of distribution functions
  • Convergence in Probability: probability convergence and Slutsky's lemma
  • Law of Large Numbers: weak and strong laws of large numbers
  • Central Limit Theorem: normal approximation for sums of random variables

Convergence in Distribution
Weak convergence and distribution function limits

Definition and Core Concepts

Convergence in distribution describes the limiting behavior of random variable sequences through their distribution functions, focusing on convergence at continuity points.

Distribution Function Weak Convergence:

Let $\{F_n(x)\}$ be a sequence of distribution functions and let $F(x)$ be a distribution function. We say $F_n$ converges weakly to $F$, denoted $F_n \stackrel{w}{\to} F$, if

$$\lim_{n\to\infty}F_n(x)=F(x)$$

for all continuity points $x$ of $F(x)$.
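The role of continuity points can be seen in a tiny worked example. The sketch below (all names illustrative) takes $F_n$ to be the distribution function of a point mass at $1/n$; the weak limit $F$ is the distribution function of a point mass at $0$, even though $F_n(0) \not\to F(0)$ at the discontinuity point $x=0$:

```python
# Point masses at 1/n converge weakly to a point mass at 0.
# Pointwise convergence holds at every continuity point of the
# limit F, and fails exactly at the discontinuity point x = 0.

def F_n(x, n):
    """CDF of the point mass at 1/n."""
    return 1.0 if x >= 1.0 / n else 0.0

def F(x):
    """CDF of the point mass at 0 (the weak limit)."""
    return 1.0 if x >= 0.0 else 0.0

# At any continuity point of F (any x != 0), F_n(x) -> F(x):
assert F_n(-0.5, 10**6) == F(-0.5) == 0.0
assert F_n(0.5, 10**6) == F(0.5) == 1.0

# At the discontinuity point x = 0, pointwise convergence fails,
# which is exactly why weak convergence excludes such points:
print(F_n(0.0, 10**6), F(0.0))  # 0.0 vs 1.0
```

This is why the definition only requires convergence at continuity points: demanding it everywhere would rule out this perfectly natural limit.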

Random Variable Convergence in Distribution:

If random variables $\xi_n$ have distribution functions $F_n$ that converge weakly to the distribution function $F$ of a random variable $\xi$, then:

$$\xi_n \stackrel{d}{\to} \xi$$

Key Properties and Theorems

Helly's Theorem (First)

Any sequence of distribution functions $\{F_n\}$ contains a subsequence $\{F_{n_k}\}$ that converges weakly to some non-decreasing, right-continuous function $F$ with $0 \leq F(x) \leq 1$ (which need not itself be a distribution function, since mass can escape to infinity).

Helly's Theorem (Second)

If $F_n \stackrel{w}{\to} F$, where $F$ is a distribution function, and $g(x)$ is bounded and continuous, then:

$$\int_{-\infty}^{\infty}g(x)\,dF_n(x) \to \int_{-\infty}^{\infty}g(x)\,dF(x)$$

Lévy Continuity Theorem

$\xi_n \stackrel{d}{\to} \xi$ if and only if their characteristic functions converge pointwise: $f_n(t) \to f(t)$ for all $t \in \mathbb{R}$.

Poisson Approximation

If $\xi_n \sim B(n,p_n)$ and $\lim_{n\to\infty}np_n=\lambda>0$, then:

$$\xi_n \stackrel{d}{\to} P(\lambda)$$
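The Poisson approximation can be checked numerically. The sketch below (with the illustrative choice $\lambda = 3$) computes the total variation distance between $B(n, \lambda/n)$ and $P(\lambda)$, truncated at a large cutoff, and watches it shrink as $n$ grows:

```python
import math

# Total variation distance between B(n, lambda/n) and P(lambda),
# truncated at kmax (the neglected tail mass is negligible here).

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def tv_distance(n, lam, kmax=100):
    # 0.5 * sum_k |P(X = k) - P(Y = k)|
    p = lam / n
    return 0.5 * sum(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
                     for k in range(kmax + 1))

lam = 3.0
d10, d100, d1000 = (tv_distance(n, lam) for n in (10, 100, 1000))
print(d10, d100, d1000)  # the distance shrinks roughly like 1/n
```

The decay rate of roughly $\lambda^2/n$ is consistent with Le Cam's classical bound on this approximation.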
Convergence in Probability
Probability convergence and related theorems

Definition and Properties

Convergence in Probability:

Random variables $\xi_n$ converge in probability to $\xi$, denoted $\xi_n \stackrel{P}{\to} \xi$, if for every $\varepsilon > 0$:

$$\lim_{n\to\infty}P(|\xi_n-\xi|\geq\varepsilon)=0$$

Relationship with Convergence in Distribution:

  • Convergence in probability ⇒ convergence in distribution
  • If $\xi_n \stackrel{d}{\to} c$ for a constant $c$, then $\xi_n \stackrel{P}{\to} c$
  • In general, convergence in distribution does not imply convergence in probability
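The classical counterexample for the last point can be sketched in a few lines (setup illustrative): take $\xi$ uniform on $\{-1,+1\}$ and $\xi_n = -\xi$ for every $n$. Then $\xi_n$ has the same distribution as $\xi$, so $\xi_n \stackrel{d}{\to} \xi$ trivially, yet $|\xi_n - \xi| = 2$ with probability 1:

```python
import random

# xi ~ uniform on {-1, +1}; xi_n = -xi has the same distribution,
# so convergence in distribution is automatic, but the gap never shrinks.

random.seed(0)
samples = [random.choice([-1, 1]) for _ in range(10_000)]

# Same distribution: empirical means of xi and -xi are both near 0.
mean_xi = sum(samples) / len(samples)
mean_neg = sum(-x for x in samples) / len(samples)

# But |xi_n - xi| = |-x - x| = 2 on every sample point:
gaps = [abs(-x - x) for x in samples]
assert all(g == 2 for g in gaps)
print(mean_xi, mean_neg)
```

So equality of distributions says nothing about the random variables being close on the underlying sample space.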

Slutsky's Lemma

Slutsky's Lemma provides rules for combining convergent sequences. If $\xi_n \stackrel{d}{\to} \xi$ and $\eta_n \stackrel{P}{\to} c$, where $c$ is a constant, then:

Addition:

$$\xi_n+\eta_n \stackrel{d}{\to} \xi+c$$

Subtraction:

$$\xi_n-\eta_n \stackrel{d}{\to} \xi-c$$

Multiplication:

$$\xi_n\eta_n \stackrel{d}{\to} c\xi$$

Division:

$$\xi_n/\eta_n \stackrel{d}{\to} \xi/c \quad (c \neq 0)$$
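The division rule is what justifies studentizing: $\sqrt{n}\,\bar{X}$ converges in distribution to $N(0,\sigma^2)$ by the CLT, while the sample standard deviation converges in probability to $\sigma$, so their ratio is asymptotically $N(0,1)$. A simulation sketch (sample sizes and the Uniform$(-1,1)$ population are illustrative choices):

```python
import math
import random

def sample_std(xs):
    """Sample standard deviation (divisor n - 1)."""
    n = len(xs)
    m = sum(xs) / n
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))

random.seed(42)

def studentized_mean(n):
    xs = [random.uniform(-1.0, 1.0) for _ in range(n)]  # mean 0, var 1/3
    mean = sum(xs) / n
    s = sample_std(xs)                # -> sqrt(1/3) in probability
    return math.sqrt(n) * mean / s    # -> N(0,1) by Slutsky's division rule

t_values = [studentized_mean(500) for _ in range(2000)]

# N(0,1) puts probability 0.5 below zero and about 0.683 inside [-1, 1];
# the empirical fractions should be close.
frac_below_zero = sum(t <= 0 for t in t_values) / len(t_values)
frac_within_one = sum(-1 <= t <= 1 for t in t_values) / len(t_values)
print(frac_below_zero, frac_within_one)
```

Replacing the unknown $\sigma$ by a consistent estimate costs nothing in the limit, which is exactly Slutsky's point.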
Law of Large Numbers
Weak and strong convergence of sample means

Weak Law of Large Numbers (WLLN)

The weak law describes convergence in probability of sample means to population means.

| Theorem | Conditions | Conclusion |
|---|---|---|
| Bernoulli WLLN | $\xi_n \sim \text{Bernoulli}(p)$, i.i.d. | $\frac{1}{n}\sum_{k=1}^n\xi_k \stackrel{P}{\to} p$ |
| Chebyshev WLLN | Independent, $\frac{1}{n^2}\sum_{k=1}^n\text{Var}\,\xi_k \to 0$ | $\frac{1}{n}\sum_{k=1}^n\xi_k - \frac{1}{n}\sum_{k=1}^n E\xi_k \stackrel{P}{\to} 0$ |
| Khintchine WLLN | i.i.d., $E\lvert\xi_1\rvert<\infty$, $E\xi_1=\mu$ | $\frac{1}{n}\sum_{k=1}^n\xi_k \stackrel{P}{\to} \mu$ |
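Khintchine's WLLN can be checked empirically. The sketch below (Exp(1) population with $\mu = 1$, and the tolerance $\varepsilon = 0.1$, both illustrative) estimates $P(|\bar{\xi}_n - \mu| \geq \varepsilon)$ for a small and a large $n$:

```python
import random

# Estimate the probability that the sample mean of n i.i.d. Exp(1)
# variables (mu = 1) misses mu by at least eps, via Monte Carlo.

random.seed(1)
eps = 0.1

def miss_probability(n, reps=1000):
    misses = 0
    for _ in range(reps):
        mean = sum(random.expovariate(1.0) for _ in range(n)) / n
        if abs(mean - 1.0) >= eps:
            misses += 1
    return misses / reps

p_small, p_large = miss_probability(20), miss_probability(500)
print(p_small, p_large)  # the miss probability drops as n grows
```

For $n = 20$ the sample mean misses by $0.1$ more often than not, while for $n = 500$ it almost never does, which is convergence in probability made visible.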

Strong Law of Large Numbers (SLLN)

Almost Sure Convergence:

$\xi_n$ converges almost surely to $\xi$, denoted $\xi_n \stackrel{a.s.}{\to} \xi$, if there exists a set $\Omega_0$ with $P(\Omega_0)=0$ such that for all $\omega \in \Omega\setminus\Omega_0$:

$$\lim_{n\to\infty}\xi_n(\omega)=\xi(\omega)$$
| Theorem | Conditions | Conclusion |
|---|---|---|
| Borel SLLN | $\xi_n \sim \text{Bernoulli}(p)$, i.i.d. | $\frac{1}{n}\sum_{k=1}^n\xi_k \stackrel{a.s.}{\to} p$ |
| Kolmogorov SLLN | i.i.d. | $\frac{1}{n}\sum_{k=1}^n\xi_k \stackrel{a.s.}{\to} \mu \iff E\lvert\xi_1\rvert<\infty,\ \mu=E\xi_1$ |
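Almost sure convergence is a statement about individual sample paths, which a single long simulation can illustrate. The sketch below (Borel's SLLN with $p = 0.5$; the path length is an illustrative choice) follows one coin-flip trajectory and measures how far the running relative frequency strays from $p$ over an early and a late stretch:

```python
import random

# One long coin-flip trajectory: along a single sample path the
# running relative frequency settles down near p = 0.5.

random.seed(7)
flips = [random.random() < 0.5 for _ in range(200_000)]

running_means = []
count = 0
for i, flip in enumerate(flips, start=1):
    count += flip
    running_means.append(count / i)

# Worst deviation from p over an early and a late segment of the path:
tail_dev_early = max(abs(m - 0.5) for m in running_means[1_000:10_000])
tail_dev_late = max(abs(m - 0.5) for m in running_means[100_000:])
print(tail_dev_early, tail_dev_late)
```

The WLLN only says each fixed-$n$ deviation is small with high probability; the SLLN says the whole tail of this one trajectory stays close to $p$, which is what the late-segment maximum captures.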

Convergence Relationships

Almost sure convergence ⇒ Convergence in probability ⇒ Convergence in distribution

Strong convergence implies weak convergence, but not vice versa.

Central Limit Theorem
Normal approximation for sums of random variables

Core Principle

The Central Limit Theorem states that the sum of a large number of independent random variables, when properly normalized, converges in distribution to a normal distribution, regardless of the individual distributions.

Universal Normal Convergence:

$$\frac{\sum_{k=1}^n \xi_k - E\left[\sum_{k=1}^n \xi_k\right]}{\sqrt{\text{Var}\left[\sum_{k=1}^n \xi_k\right]}} \stackrel{d}{\to} N(0,1)$$

Major Central Limit Theorems

de Moivre-Laplace Theorem

Conditions: $S_n \sim B(n,p)$, $q=1-p$

Result: as $n \to \infty$:

$$\frac{S_n - np}{\sqrt{npq}} \stackrel{d}{\to} N(0,1)$$

This provides normal approximation to binomial probabilities.
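A quick numerical check of this approximation, with illustrative numbers ($n = 1000$, $p = 0.3$, cutoff $b = 320$): compare the exact binomial CDF with $\Phi((b-np)/\sqrt{npq})$, and with the continuity-corrected version that replaces $b$ by $b + \tfrac12$:

```python
import math

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n, p = 1000, 0.3
q = 1.0 - p
b = 320

# Exact P(S_n <= b) summed term by term:
exact = sum(math.comb(n, k) * p**k * q**(n - k) for k in range(b + 1))

approx = Phi((b - n * p) / math.sqrt(n * p * q))
approx_cc = Phi((b + 0.5 - n * p) / math.sqrt(n * p * q))  # continuity correction
print(exact, approx, approx_cc)
```

Both approximations land within a couple of percentage points of the exact value; the continuity correction typically tightens the agreement further.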

Lindeberg-Lévy Theorem

Conditions: $\{\xi_n\}$ i.i.d., $E\xi_1=a$, $0<\text{Var}\,\xi_1=\sigma^2<\infty$

Result:

$$\frac{\sum_{k=1}^n\xi_k - na}{\sqrt{n}\,\sigma} \stackrel{d}{\to} N(0,1)$$

The classical CLT for identical distributions.
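A simulation sketch of Lindeberg–Lévy (sample sizes and the Exp(1) population, for which $a = \sigma = 1$, are illustrative choices): standardized sums should look standard normal, so the empirical CDF should track $\Phi$:

```python
import math
import random

# Standardize sums of n i.i.d. Exp(1) variables (a = 1, sigma = 1)
# and compare their empirical CDF with the standard normal CDF.

random.seed(3)

def standardized_sum(n):
    s = sum(random.expovariate(1.0) for _ in range(n))
    return (s - n * 1.0) / (math.sqrt(n) * 1.0)

zs = [standardized_sum(400) for _ in range(3000)]

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

for x in (-1.0, 0.0, 1.0):
    emp = sum(z <= x for z in zs) / len(zs)
    print(x, emp, Phi(x))  # empirical CDF vs normal CDF
```

The exponential distribution is strongly skewed, yet at $n = 400$ its standardized sums are already hard to tell apart from a standard normal, which is the "regardless of the individual distributions" part of the theorem.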

Lindeberg-Feller Theorem

Conditions: $\{\xi_n\}$ independent, satisfying the Lindeberg condition for every $\tau>0$:

$$\frac{1}{B_n^2}\sum_{k=1}^n\int_{|x-E\xi_k|\geq\tau B_n}(x-E\xi_k)^2\,dF_k(x) \to 0$$

where $B_n^2=\sum_{k=1}^n\text{Var}\,\xi_k$.

Result:

$$\frac{\sum_{k=1}^n\xi_k - E\left[\sum_{k=1}^n\xi_k\right]}{B_n} \stackrel{d}{\to} N(0,1)$$

This is the most general form of the CLT for independent, non-identically distributed summands.

Lyapunov Theorem

Conditions: $\{\xi_n\}$ independent, and there exists $\delta>0$ such that:

$$\frac{1}{B_n^{2+\delta}}\sum_{k=1}^n E|\xi_k-E\xi_k|^{2+\delta} \to 0$$

Result: same conclusion as the Lindeberg-Feller theorem.

The Lyapunov condition implies the Lindeberg condition, so it is a sufficient condition that is often easier to verify.
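For concrete summands the Lyapunov ratio can often be computed in closed form. A sketch with $\delta = 1$ and independent Bernoulli$(1/2)$ summands (an illustrative case with easy moments): here $E|\xi_k - E\xi_k|^3 = 1/8$, $\text{Var}\,\xi_k = 1/4$, so $B_n^2 = n/4$ and the ratio is $(n/8)/(n/4)^{3/2} = 1/\sqrt{n} \to 0$:

```python
# Lyapunov ratio with delta = 1 for n independent Bernoulli(1/2)
# summands; the closed-form value is 1/sqrt(n), which tends to 0,
# so Lyapunov's (and hence Lindeberg's) condition holds.

def lyapunov_ratio(n):
    third_abs_moment = 1.0 / 8.0   # E|xi_k - E xi_k|^3 = (1/2)^3
    B_n_sq = n / 4.0               # sum of variances, Var xi_k = 1/4
    return (n * third_abs_moment) / B_n_sq ** 1.5

ratios = [lyapunov_ratio(n) for n in (10, 100, 1000, 10_000)]
print(ratios)  # decreases like 1/sqrt(n)
```

Checking the ratio at a few values of $n$ confirms the $1/\sqrt{n}$ decay, e.g. `lyapunov_ratio(100)` is exactly $0.1$.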

Practical Applications

Binomial Approximation

For $B(n,p)$ with large $n$:

$$P(a \leq S_n \leq b) \approx \Phi\left(\frac{b-np}{\sqrt{npq}}\right) - \Phi\left(\frac{a-np}{\sqrt{npq}}\right)$$

Confidence Intervals

For the sample mean $\bar{X}$ of a large sample:

$$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

provides an approximate confidence interval for the population mean.
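The practical meaning of "approximate" is that the interval's coverage is close to the nominal level even for non-normal data. A coverage sketch (illustrative setup: known $\sigma$, Exp(1) population so $\mu = \sigma = 1$, nominal level 95%):

```python
import math
import random

# Estimate the coverage of mean +/- z_{0.025} * sigma / sqrt(n)
# for Exp(1) data, by repeated sampling.

random.seed(5)
z_975 = 1.959963984540054      # z_{alpha/2} for a 95% interval
n, reps = 200, 2000
mu = sigma = 1.0

covered = 0
for _ in range(reps):
    xs = [random.expovariate(1.0) for _ in range(n)]
    mean = sum(xs) / n
    half_width = z_975 * sigma / math.sqrt(n)
    if mean - half_width <= mu <= mean + half_width:
        covered += 1

coverage = covered / reps
print(coverage)  # close to the nominal 0.95
```

Despite the skewness of the exponential population, at $n = 200$ the CLT already delivers coverage near 95%.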

Quality Control

Control charts use CLT to determine if process means have shifted from target values using sample statistics.

Survey Sampling

Polling and survey results rely on CLT to estimate population proportions and construct margin of error bounds.

Supporting Theorems and Inequalities
Important auxiliary results for limit theorems

Borel-Cantelli Lemma

First Lemma: If $\sum_{n=1}^{\infty}P(A_n)<\infty$, then:

$$P\left(\limsup_{n\to\infty}A_n\right)=0$$

Second Lemma: If the events $A_n$ are independent and $\sum_{n=1}^{\infty}P(A_n)=\infty$, then:

$$P\left(\limsup_{n\to\infty}A_n\right)=1$$

Used to prove almost sure convergence results.
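The first lemma is easy to see in simulation. A sketch with the illustrative events $A_n = \{U_n < 1/n^2\}$ for independent uniforms $U_n$: since $\sum P(A_n) = \sum 1/n^2 < \infty$, almost surely only finitely many $A_n$ occur, all with small indices:

```python
import random

# Simulate independent events A_n = {U_n < 1/n^2}. Since the
# probabilities are summable, only finitely many occur (a.s.).

random.seed(11)

def occurrences(n_max):
    return [n for n in range(1, n_max + 1) if random.random() < 1.0 / n**2]

occurred = occurrences(100_000)
print(occurred)  # a short list: A_1 is certain, then very few more
```

The expected number of occurrences is $\sum_{n\ge 1} 1/n^2 = \pi^2/6 \approx 1.64$, so even over 100,000 trials the list stays tiny.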

Kolmogorov's Inequality

For independent $\{\xi_n\}$ with finite variances:

$$P\left(\max_{1\leq j\leq n}\left|\sum_{k=1}^j(\xi_k-E\xi_k)\right|\geq\varepsilon\right) \leq \frac{1}{\varepsilon^2}\sum_{k=1}^n\text{Var}\,\xi_k$$

Provides bounds on maximum partial sum deviations.
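An empirical check of the inequality (parameters illustrative): for i.i.d. $\pm 1$ steps, each with variance 1, the bound on the maximal partial sum reads $P(\max_{j\le n}|S_j| \ge \varepsilon) \le n/\varepsilon^2$:

```python
import random

# Estimate P(max_{j<=n} |S_j| >= eps) for a +/-1 random walk and
# compare it with Kolmogorov's bound sum(Var)/eps^2 = n/eps^2.

random.seed(2)
n, eps, reps = 100, 25.0, 2000
bound = n / eps**2   # = 0.16

exceed = 0
for _ in range(reps):
    s, max_abs = 0, 0
    for _ in range(n):
        s += random.choice((-1, 1))
        max_abs = max(max_abs, abs(s))
    if max_abs >= eps:
        exceed += 1

empirical = exceed / reps
print(empirical, bound)  # empirical probability sits below the bound
```

The bound is far from tight here (the walk's standard deviation at $n = 100$ is only 10), which is typical: Kolmogorov's inequality trades sharpness for the uniform control over all partial sums needed in SLLN proofs.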

Kolmogorov Three-Series Theorem

For independent $\{\xi_n\}$, fix $c>0$ and let $\xi_n'=\xi_n I(|\xi_n|\leq c)$. Then $\sum_{n=1}^{\infty}\xi_n$ converges a.s. if and only if all three series converge:

1. $\sum_{n=1}^{\infty} P(|\xi_n|>c)<\infty$
2. $\sum_{n=1}^{\infty}E\xi_n'$ converges
3. $\sum_{n=1}^{\infty}\text{Var}\,\xi_n'<\infty$

Hájek-Rényi Inequality

For independent $\{\xi_n\}$ and positive non-increasing constants $\{C_n\}$:

$$P\left(\max_{m\leq j\leq n}C_j\left|\sum_{k=1}^j(\xi_k-E\xi_k)\right|\geq\varepsilon\right) \leq \frac{1}{\varepsilon^2}\left(C_m^2\sum_{j=1}^m\text{Var}\,\xi_j + \sum_{j=m+1}^nC_j^2\text{Var}\,\xi_j\right)$$

Generalization of Kolmogorov's inequality.