MathIsimple
Advanced Topic
8-10 Hours

Limit Theorems for Random Variables

Advanced Level
Convergence Theory

Core Concepts

Convergence in Distribution

Definition

Distribution functions $F_n$ converge weakly to $F$, denoted $F_n \stackrel{w}{\to} F$, if:

$$\lim_{n\to\infty} F_n(x) = F(x)$$

for all continuity points $x$ of $F$.

Lévy Continuity Theorem

$$\xi_n \stackrel{d}{\to} \xi \iff f_n(t) \to f(t) \text{ for all } t \in \mathbb{R}$$

Convergence in distribution is equivalent to convergence of characteristic functions.

Convergence in Probability

Definition

$$\xi_n \stackrel{P}{\to} \xi \iff \lim_{n\to\infty} P(|\xi_n - \xi| \geq \varepsilon) = 0 \text{ for all } \varepsilon > 0$$

Convergence in probability is stronger than convergence in distribution.

Slutsky's Lemma: If $\xi_n \stackrel{d}{\to} \xi$ and $\eta_n \stackrel{P}{\to} c$ for a constant $c$, then $\xi_n + \eta_n \stackrel{d}{\to} \xi + c$, $\xi_n \eta_n \stackrel{d}{\to} c\xi$, and so on.
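Slutsky's Lemma can be checked by simulation. The sketch below (my own illustration; the choice of Uniform(0,1) draws, $Y_n = 2 + 1/n$, and the sample sizes are assumptions, not from the text) combines a sequence converging in distribution to $N(0,1)$ with a sequence converging to the constant 2, so the sum should look approximately $N(2,1)$:

```python
import random
import statistics

random.seed(1)

# Slutsky illustration: Z_n = standardized mean of n Uniform(0,1) draws,
# which tends to N(0,1) in distribution; Y_n = 2 + 1/n tends to the
# constant 2. By Slutsky, Z_n + Y_n should look roughly N(2, 1).
def z_n(n):
    mean = sum(random.random() for _ in range(n)) / n
    return (mean - 0.5) / ((1 / 12) ** 0.5 / n ** 0.5)

n, reps = 500, 2000
samples = [z_n(n) + 2 + 1 / n for _ in range(reps)]
print(round(statistics.mean(samples), 2), round(statistics.stdev(samples), 2))
```

The printed sample mean and standard deviation should be close to 2 and 1, matching the $N(2,1)$ limit predicted by the lemma.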

Law of Large Numbers

Weak Law of Large Numbers (WLLN)

Theorem:

If $\{\xi_n\}$ are i.i.d. with $E|\xi_1| < \infty$, then $\frac{1}{n}\sum_{k=1}^n \xi_k \stackrel{P}{\to} E\xi_1$.

The sample mean converges in probability to the population mean.
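The WLLN is easy to see numerically. In this sketch (the Exponential(1) distribution, seed, and sample sizes are my own choices for illustration), the population mean is 1, and the sample means concentrate near it as $n$ grows:

```python
import random

random.seed(0)

# WLLN illustration: sample means of i.i.d. Exponential(1) draws
# (population mean 1) concentrate near 1 as n grows.
def sample_mean(n):
    return sum(random.expovariate(1.0) for _ in range(n)) / n

for n in (10, 100, 10_000):
    print(n, round(sample_mean(n), 3))
```

For small $n$ the sample mean can be far from 1; by $n = 10{,}000$ it is typically within a few hundredths.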

Strong Law of Large Numbers (SLLN)

Kolmogorov SLLN:

If $\{\xi_n\}$ are i.i.d., then $\frac{1}{n}\sum_{k=1}^n \xi_k \stackrel{a.s.}{\to} E\xi_1 \iff E|\xi_1| < \infty$.

The sample mean converges almost surely to the population mean if and only if the expectation is finite.

Central Limit Theorem

Lindeberg-Lévy CLT

Theorem:

If $\{\xi_n\}$ are i.i.d. with $E\xi_1 = \mu$ and $0 < \text{Var}(\xi_1) = \sigma^2 < \infty$, then:

$$\frac{\sum_{k=1}^n \xi_k - n\mu}{\sqrt{n}\,\sigma} \stackrel{d}{\to} N(0,1)$$

The normalized sum converges in distribution to the standard normal distribution.
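A quick simulation makes the CLT concrete. In this sketch (the Exponential(1) distribution, $n = 200$, and the repetition count are assumptions for illustration), $\mu = \sigma = 1$, so the normalized sum should be roughly standard normal, and about 68.3% of draws should land in $[-1, 1]$:

```python
import math
import random

random.seed(2)

# CLT illustration: for i.i.d. Exponential(1) draws (mu = sigma = 1),
# the normalized sum (S_n - n*mu)/(sqrt(n)*sigma) is roughly N(0,1),
# so about 68.3% of draws should land in [-1, 1].
def normalized_sum(n):
    s = sum(random.expovariate(1.0) for _ in range(n))
    return (s - n) / math.sqrt(n)

n, reps = 200, 3000
inside = sum(abs(normalized_sum(n)) <= 1 for _ in range(reps)) / reps
print(round(inside, 3))
```

The printed fraction should be close to $\Phi(1) - \Phi(-1) \approx 0.683$, even though the exponential distribution itself is highly skewed.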

de Moivre-Laplace Theorem

Normal Approximation to Binomial:

If $S_n \sim B(n,p)$ with $q = 1-p$, then for large $n$:

$$\frac{S_n - np}{\sqrt{npq}} \stackrel{d}{\to} N(0,1)$$

This gives a normal approximation to binomial probabilities, useful when $n$ is large.

Worked Examples

Example 1: Applying CLT
Problem:

If $X_1, \ldots, X_{100}$ are i.i.d. with $E[X_i] = 5$ and $\text{Var}(X_i) = 4$, approximate $P(\sum_{i=1}^{100} X_i > 520)$.

Solution:
  1. By the CLT: $\frac{\sum X_i - 500}{\sqrt{100} \times 2} = \frac{\sum X_i - 500}{20} \sim N(0,1)$ approximately
  2. Transform: $P(\sum X_i > 520) = P\left(\frac{\sum X_i - 500}{20} > 1\right)$
  3. Standard normal: $P(Z > 1) = 1 - \Phi(1) = 1 - 0.8413 = 0.1587$
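The arithmetic in Example 1 can be reproduced with the standard normal CDF, expressed via the error function:

```python
import math

# Recomputing Example 1: z = (520 - 100*5) / (sqrt(100)*2) = 1,
# and P(Z > 1) = 1 - Phi(1), with Phi written via math.erf.
mu, var, n = 5, 4, 100
z = (520 - n * mu) / math.sqrt(n * var)
p = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))
print(z, round(p, 4))  # 1.0 0.1587
```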
Example 2: Normal Approximation to Binomial
Problem:

For $X \sim B(100, 0.3)$, approximate $P(25 \leq X \leq 35)$ using the normal approximation.

Solution:
  1. Parameters: $\mu = np = 30$, $\sigma = \sqrt{npq} = \sqrt{21} \approx 4.58$
  2. Standardize: $P(25 \leq X \leq 35) \approx P\left(\frac{24.5-30}{4.58} \leq Z \leq \frac{35.5-30}{4.58}\right)$
  3. Calculate: $P(-1.20 \leq Z \leq 1.20) = 2\Phi(1.20) - 1 = 2(0.8849) - 1 = 0.7698$
  4. Note: the continuity correction (24.5 and 35.5) gives a better approximation
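Example 2 can be checked against the exact binomial probability, since the binomial PMF is easy to sum directly:

```python
import math

# Checking Example 2: exact Binomial(100, 0.3) probability vs. the
# normal approximation with continuity correction.
n, p = 100, 0.3
q = 1 - p
mu, sigma = n * p, math.sqrt(n * p * q)

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

exact = sum(math.comb(n, k) * p**k * q**(n - k) for k in range(25, 36))
approx = phi((35.5 - mu) / sigma) - phi((24.5 - mu) / sigma)
print(round(exact, 4), round(approx, 4))
```

The approximation agrees with the exact sum to within about a percentage point, which is typical for $n = 100$ with the continuity correction applied.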

Theorem Proofs

Rigorous proofs of fundamental limit theorems

Weak Law of Large Numbers
For i.i.d. random variables $X_1, X_2, \ldots, X_n$ with $E[X_i] = \mu$ and $\text{Var}(X_i) = \sigma^2 < \infty$, $\lim_{n \to \infty} P(|\bar{X}_n - \mu| \geq \epsilon) = 0$ for any $\epsilon > 0$.

Proof (using Chebyshev's Inequality)

Step 1: Compute Moments of the Sample Mean

Since the $X_i$ are i.i.d.:

$$E[\bar{X}_n] = E\left[\frac{1}{n}\sum_{i=1}^n X_i\right] = \frac{1}{n}\sum_{i=1}^n E[X_i] = \mu$$

$$\text{Var}(\bar{X}_n) = \text{Var}\left(\frac{1}{n}\sum_{i=1}^n X_i\right) = \frac{1}{n^2}\sum_{i=1}^n \text{Var}(X_i) = \frac{\sigma^2}{n}$$
Step 2: Apply Chebyshev's Inequality

For any $\epsilon > 0$:

$$P(|\bar{X}_n - \mu| \geq \epsilon) \leq \frac{\text{Var}(\bar{X}_n)}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2}$$
Step 3: Take the Limit

$$\lim_{n \to \infty} P(|\bar{X}_n - \mu| \geq \epsilon) \leq \lim_{n \to \infty} \frac{\sigma^2}{n\epsilon^2} = 0$$

Since probabilities are non-negative, the limit equals zero, which proves convergence in probability. $\blacksquare$ Note that this proof assumes finite variance; the general WLLN stated earlier requires only $E|\xi_1| < \infty$ and needs a different argument (e.g. truncation or characteristic functions).
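The Chebyshev step can be checked numerically. In this sketch (the Uniform(0,1) distribution, $\epsilon = 0.1$, and the sample sizes are my own choices), $\sigma^2 = 1/12$, and the simulated deviation frequency should fall below the bound $\sigma^2/(n\epsilon^2)$:

```python
import random

random.seed(3)

# Numeric check of the Chebyshev step: for Uniform(0,1) draws
# (sigma^2 = 1/12), the simulated frequency of |mean - 0.5| >= eps
# must sit below the bound sigma^2 / (n * eps^2).
n, eps, reps = 100, 0.1, 2000

def deviates(n):
    m = sum(random.random() for _ in range(n)) / n
    return abs(m - 0.5) >= eps

freq = sum(deviates(n) for _ in range(reps)) / reps
bound = (1 / 12) / (n * eps**2)
print(freq, round(bound, 4))
```

The bound ($\approx 0.083$ here) is far from tight: the true deviation probability at these parameters is tiny, which is typical of Chebyshev-type estimates.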

Central Limit Theorem (Sketch)
For i.i.d. random variables $X_1, X_2, \ldots, X_n$ with $E[X_i] = \mu$ and $0 < \text{Var}(X_i) = \sigma^2 < \infty$, $\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \stackrel{d}{\to} N(0,1)$.

Proof Sketch (via Characteristic Functions)

Step 1: Standardize Variables

Let $Z_i = (X_i - \mu)/\sigma$. Then $E[Z_i] = 0$ and $\text{Var}(Z_i) = 1$.

Define $S_n^* = \frac{1}{\sqrt{n}}\sum_{i=1}^n Z_i$. We want to show $S_n^* \stackrel{d}{\to} N(0,1)$.

Step 2: Characteristic Function

Let $\phi(t) = E[e^{itZ_i}]$ be the characteristic function of $Z_i$. By Taylor expansion:

$$\phi(t) = 1 + itE[Z_i] - \frac{t^2}{2}E[Z_i^2] + o(t^2) = 1 - \frac{t^2}{2} + o(t^2)$$
Step 3: Characteristic Function of the Sum

Since the $Z_i$ are i.i.d., the CF of $S_n^*$ is:

$$\phi_{S_n^*}(t) = \left[\phi\left(\frac{t}{\sqrt{n}}\right)\right]^n = \left[1 - \frac{t^2}{2n} + o\left(\frac{t^2}{n}\right)\right]^n$$
Step 4: Take the Limit

$$\lim_{n \to \infty} \phi_{S_n^*}(t) = \lim_{n \to \infty} \left[1 - \frac{t^2}{2n}\right]^n = e^{-t^2/2}$$

This is the characteristic function of $N(0,1)$. By the Lévy continuity theorem, convergence of characteristic functions implies convergence in distribution. $\blacksquare$
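The final limit of the sketch, $\left[1 - \frac{t^2}{2n}\right]^n \to e^{-t^2/2}$, can be verified numerically (the choice $t = 1.5$ and the values of $n$ are my own):

```python
import math

# The final limit of the CLT sketch: (1 - t^2/(2n))^n approaches
# e^{-t^2/2} as n grows, checked at t = 1.5.
t = 1.5
target = math.exp(-t**2 / 2)
for n in (10, 100, 10_000):
    print(n, round((1 - t**2 / (2 * n))**n, 6))
print("limit:", round(target, 6))
```

By $n = 10{,}000$ the finite product agrees with $e^{-t^2/2}$ to about five decimal places.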

Strong Law of Large Numbers (Outline)
For i.i.d. random variables $X_1, X_2, \ldots$ with $E|X_i| < \infty$ and $E[X_i] = \mu$, $P(\lim_{n \to \infty} \bar{X}_n = \mu) = 1$ (almost sure convergence).

Key Ideas

Step 1: Truncation

Truncate the variables to reduce them to the bounded case, then show the truncation error vanishes.

Step 2: Borel-Cantelli Lemma

Show that $\sum_{n=1}^\infty P(|\bar{X}_n - \mu| > \epsilon) < \infty$ for any $\epsilon > 0$.

By Borel-Cantelli, this implies $P(|\bar{X}_n - \mu| > \epsilon \text{ i.o.}) = 0$, proving almost sure convergence.

Step 3: Fourth Moment Method

For the bounded case with fourth moments, use:

$$\sum_{n=1}^\infty P(|\bar{X}_n - \mu| > \epsilon) \leq \sum_{n=1}^\infty \frac{E[(\bar{X}_n - \mu)^4]}{\epsilon^4} < \infty$$

The full proof under only $E|X_1| < \infty$ requires more sophisticated techniques (martingale theory or ergodic theory). $\square$
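Almost sure convergence is about individual sample paths, which a simulation can illustrate. In this sketch (Bernoulli(0.5) indicators, the seed, and the checkpoints are my own choices), a single running-mean path settles at 0.5 and stays there:

```python
import random

random.seed(4)

# Illustrating almost sure convergence: a single running-mean path of
# i.i.d. Bernoulli(0.5) indicators settles at 0.5 and stays there.
total = 0
checkpoints = (100, 10_000, 100_000)
means = []
for k in range(1, 100_001):
    total += random.random() < 0.5
    if k in checkpoints:
        means.append(total / k)
print([round(m, 4) for m in means])
```

The WLLN only says each fixed-$n$ snapshot is probably close to 0.5; the SLLN says the whole path, followed forever, converges with probability 1.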

Practice Quiz

1. What is convergence in distribution?
2. What is the relationship between convergence in probability and convergence in distribution?
3. What does the Weak Law of Large Numbers state?
4. What is the Central Limit Theorem?
5. What is Slutsky's Lemma used for?
6. What is almost sure convergence?
7. What does the Strong Law of Large Numbers state?
8. What is the relationship between convergence types?
9. What is the Lévy Continuity Theorem?
10. What is the de Moivre-Laplace Theorem?

Frequently Asked Questions

What is the difference between weak and strong law of large numbers?

The weak law gives convergence in probability of the sample mean to the population mean. The strong law gives almost sure convergence, which is a stronger statement because it holds with probability 1.

When can I use the Central Limit Theorem?

Use the Central Limit Theorem for sums or averages of many independent variables with finite variance. After centering and scaling, the distribution approaches a normal law, which justifies common approximations in inference.

What is the relationship between different types of convergence?

Almost sure convergence implies convergence in probability, and convergence in probability implies convergence in distribution. The reverse implications do not hold in general, except for special cases such as convergence in distribution to a constant.

How do I apply Slutsky's Lemma?

Slutsky's Lemma is used when one sequence converges in distribution and another converges in probability to a constant. It lets you combine them through sums, products, and ratios while preserving convergence in distribution.

What is the practical importance of limit theorems?

Limit theorems justify sample averages, normal approximations, and asymptotic statistical procedures. They are the mathematical backbone of estimation, confidence intervals, and hypothesis testing.
