
Limit Theorems for Random Variables


Core Concepts

Convergence in Distribution

Definition

Distribution functions $F_n$ converge weakly to $F$, denoted $F_n \stackrel{w}{\to} F$, if:

$$\lim_{n\to\infty}F_n(x)=F(x)$$

for all continuity points $x$ of $F$.

Lévy Continuity Theorem

$$\xi_n \stackrel{d}{\to} \xi \iff f_n(t) \to f(t) \text{ for all } t \in \mathbb{R}$$

where $f_n$ and $f$ are the characteristic functions of $\xi_n$ and $\xi$. Convergence in distribution is equivalent to pointwise convergence of characteristic functions.

Convergence in Probability

Definition

$$\xi_n \stackrel{P}{\to} \xi \iff \lim_{n\to\infty}P(|\xi_n-\xi|\geq\varepsilon)=0 \text{ for all } \varepsilon > 0$$

Convergence in probability is stronger than convergence in distribution.

Slutsky's Lemma: If $\xi_n \stackrel{d}{\to} \xi$ and $\eta_n \stackrel{P}{\to} c$, then $\xi_n + \eta_n \stackrel{d}{\to} \xi + c$, $\xi_n\eta_n \stackrel{d}{\to} c\xi$, and, when $c \neq 0$, $\xi_n/\eta_n \stackrel{d}{\to} \xi/c$.
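
To see Slutsky's Lemma in action, here is a minimal simulation sketch in Python (the distribution, sample sizes, and variable names are illustrative choices, not part of the original statement): the standardized sample mean plays $\xi_n$, the sample variance plays $\eta_n$ with $c = 1$, and the combined statistic $\xi_n/\sqrt{\eta_n}$ should still look standard normal.

```python
import numpy as np
from scipy import stats

# Illustrative sketch of Slutsky's lemma (all parameters are assumptions).
# xi: standardized sample mean of Exp(1) draws -> N(0,1) in distribution (CLT).
# eta: sample variance of the same draws -> 1 in probability (WLLN).
rng = np.random.default_rng(0)
n, reps = 1_000, 5_000

samples = rng.exponential(scale=1.0, size=(reps, n))
xi = np.sqrt(n) * (samples.mean(axis=1) - 1.0)   # Exp(1) has mean 1, variance 1
eta = samples.var(axis=1, ddof=1)                # consistent estimator of Var = 1

# Slutsky: xi / sqrt(eta) should still be approximately N(0,1).
ratio = xi / np.sqrt(eta)
print(stats.kstest(ratio, "norm"))               # small KS statistic => good fit
```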

Law of Large Numbers

Weak Law of Large Numbers (WLLN)

Theorem:

If $\{\xi_n\}$ are i.i.d. with $E|\xi_1| < \infty$, then

$$\frac{1}{n}\sum_{k=1}^n \xi_k \stackrel{P}{\to} E\xi_1$$

The sample mean converges in probability to the population mean.
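
A quick simulation sketch of the WLLN (the distribution and parameters are illustrative assumptions): the empirical frequency of deviations of at least $\varepsilon$ shrinks toward 0 as $n$ grows.

```python
import numpy as np

# WLLN sketch: empirical P(|mean - mu| >= eps) for U(0,1) draws, mu = 0.5.
rng = np.random.default_rng(1)
mu, eps, reps = 0.5, 0.05, 2_000

for n in (100, 1_000, 10_000):
    means = rng.uniform(0.0, 1.0, size=(reps, n)).mean(axis=1)
    print(n, np.mean(np.abs(means - mu) >= eps))   # -> 0 as n grows
```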

Strong Law of Large Numbers (SLLN)

Kolmogorov SLLN:

If $\{\xi_n\}$ are i.i.d., then

$$\frac{1}{n}\sum_{k=1}^n \xi_k \stackrel{a.s.}{\to} E\xi_1 \iff E|\xi_1| < \infty$$

The sample mean converges almost surely to the population mean if and only if the expectation is finite.
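
A single-path sketch of the SLLN (again with illustrative choices): almost sure convergence is a statement about individual realizations, so we watch one running mean settle at the expectation.

```python
import numpy as np

# SLLN sketch: along one realization, the running mean of i.i.d. Exp(1)
# draws settles at E[xi_1] = 1.
rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=100_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
print(running_mean[[99, 999, 9_999, 99_999]])   # drifts toward 1
```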

Central Limit Theorem

Lindeberg-Lévy CLT

Theorem:

If $\{\xi_n\}$ are i.i.d. with $E\xi_1 = \mu$ and $0 < \text{Var}(\xi_1) = \sigma^2 < \infty$, then:

$$\frac{\sum_{k=1}^n \xi_k - n\mu}{\sqrt{n}\,\sigma} \stackrel{d}{\to} N(0,1)$$

The normalized sum converges in distribution to standard normal distribution.
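
A simulation sketch of the theorem under assumed Exp(1) inputs, a deliberately skewed distribution: the normalized sums nevertheless pass a normality check for large $n$.

```python
import numpy as np
from scipy import stats

# CLT sketch: normalized sums of skewed Exp(1) draws look standard normal.
rng = np.random.default_rng(3)
n, reps = 2_000, 5_000
mu, sigma = 1.0, 1.0                         # Exp(1): mean 1, variance 1

sums = rng.exponential(scale=1.0, size=(reps, n)).sum(axis=1)
z = (sums - n * mu) / (np.sqrt(n) * sigma)
print(stats.kstest(z, "norm"))               # KS statistic near 0 for large n
```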

de Moivre-Laplace Theorem

Normal Approximation to Binomial:

If $S_n \sim B(n,p)$ with $q = 1-p$, then for large $n$:

$$\frac{S_n - np}{\sqrt{npq}} \stackrel{d}{\to} N(0,1)$$

This provides a normal approximation to binomial probabilities, useful when $n$ is large.

Worked Examples

Example 1: Applying CLT
Problem:

If $X_1, \ldots, X_{100}$ are i.i.d. with $E[X_i] = 5$ and $\text{Var}(X_i) = 4$, approximate $P\left(\sum_{i=1}^{100} X_i > 520\right)$.

Solution:
  1. By CLT: $\frac{\sum X_i - 500}{\sqrt{100} \times 2} = \frac{\sum X_i - 500}{20}$ is approximately $N(0,1)$
  2. Transform: $P(\sum X_i > 520) = P\left(\frac{\sum X_i - 500}{20} > 1\right)$
  3. Standard normal: $P(Z > 1) = 1 - \Phi(1) = 1 - 0.8413 = 0.1587$ (a numerical check follows below)
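
The computation in step 3 can be checked with SciPy (a quick verification, not part of the original solution):

```python
from scipy import stats

# Check of Example 1: P(Z > 1) for Z ~ N(0,1).
print(1 - stats.norm.cdf(1.0))   # 0.1586...
print(stats.norm.sf(1.0))        # same value via the survival function
```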
Example 2: Normal Approximation to Binomial
Problem:

For $X \sim B(100, 0.3)$, approximate $P(25 \leq X \leq 35)$ using the normal approximation.

Solution:
  1. Parameters: $\mu = np = 30$, $\sigma = \sqrt{npq} = \sqrt{21} \approx 4.58$
  2. Standardize with continuity correction: $P(25 \leq X \leq 35) \approx P\left(\frac{24.5-30}{4.58} \leq Z \leq \frac{35.5-30}{4.58}\right)$
  3. Calculate: $P(-1.20 \leq Z \leq 1.20) = 2\Phi(1.20) - 1 = 2(0.8849) - 1 = 0.7698$
  4. Note: the continuity correction (using 24.5 and 35.5) gives a better approximation; a comparison with the exact binomial value follows below.
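
A sketch comparing the approximation with the exact binomial probability (the exact value is computed with SciPy; everything else follows the example):

```python
from scipy import stats

# Example 2 check: exact binomial probability vs. the normal approximation
# with continuity correction.
n, p = 100, 0.3
exact = stats.binom.cdf(35, n, p) - stats.binom.cdf(24, n, p)   # P(25 <= X <= 35)

mu, sigma = n * p, (n * p * (1 - p)) ** 0.5
approx = stats.norm.cdf((35.5 - mu) / sigma) - stats.norm.cdf((24.5 - mu) / sigma)

print(exact, approx)   # both close to 0.77
```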

Theorem Proofs

Rigorous proofs of fundamental limit theorems

Weak Law of Large Numbers
For i.i.d. random variables $X_1, X_2, \ldots, X_n$ with $E[X_i] = \mu$ and $\text{Var}(X_i) = \sigma^2 < \infty$:

$$\lim_{n \to \infty} P(|\bar{X}_n - \mu| \geq \epsilon) = 0 \quad \text{for any } \epsilon > 0$$

(This Chebyshev-based proof assumes finite variance; the general WLLN stated earlier requires only $E|\xi_1| < \infty$.)

Proof (using Chebyshev's Inequality)

Step 1: Compute Moments of the Sample Mean

Since the $X_i$ are i.i.d.:

$$E[\bar{X}_n] = E\left[\frac{1}{n}\sum_{i=1}^n X_i\right] = \frac{1}{n}\sum_{i=1}^n E[X_i] = \mu$$

$$\text{Var}(\bar{X}_n) = \text{Var}\left(\frac{1}{n}\sum_{i=1}^n X_i\right) = \frac{1}{n^2}\sum_{i=1}^n \text{Var}(X_i) = \frac{\sigma^2}{n}$$
Step 2: Apply Chebyshev's Inequality

For any $\epsilon > 0$:

$$P(|\bar{X}_n - \mu| \geq \epsilon) \leq \frac{\text{Var}(\bar{X}_n)}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2}$$
Step 3: Take the Limit

$$\lim_{n \to \infty} P(|\bar{X}_n - \mu| \geq \epsilon) \leq \lim_{n \to \infty} \frac{\sigma^2}{n\epsilon^2} = 0$$

Since probabilities are non-negative, this proves convergence in probability. \blacksquare
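
A numeric sketch of steps 2-3 (U(0,1) draws and the sample sizes are illustrative assumptions): the empirical deviation probability stays below the Chebyshev bound $\sigma^2/(n\epsilon^2)$, and both shrink as $n$ grows.

```python
import numpy as np

# Chebyshev bound vs. empirical deviation probability of the sample mean.
# For U(0,1) draws, sigma^2 = 1/12 and mu = 0.5.
rng = np.random.default_rng(4)
sigma2, eps, reps = 1 / 12, 0.02, 1_000

for n in (500, 2_000, 8_000):
    means = rng.uniform(0.0, 1.0, size=(reps, n)).mean(axis=1)
    empirical = np.mean(np.abs(means - 0.5) >= eps)
    print(n, empirical, sigma2 / (n * eps**2))   # empirical <= Chebyshev bound
```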

Central Limit Theorem (Sketch)
For i.i.d. random variables $X_1, X_2, \ldots, X_n$ with $E[X_i] = \mu$ and $0 < \text{Var}(X_i) = \sigma^2 < \infty$:

$$\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \stackrel{d}{\to} N(0,1)$$

Proof Sketch (via Characteristic Functions)

Step 1: Standardize Variables

Let $Z_i = (X_i - \mu)/\sigma$. Then $E[Z_i] = 0$ and $\text{Var}(Z_i) = 1$.

Define $S_n^* = \frac{1}{\sqrt{n}}\sum_{i=1}^n Z_i$. We want to show $S_n^* \stackrel{d}{\to} N(0,1)$.

Step 2: Characteristic Function

Let $\phi(t) = E[e^{itZ_i}]$ be the characteristic function of $Z_i$. By Taylor expansion:

$$\phi(t) = 1 + itE[Z_i] - \frac{t^2}{2}E[Z_i^2] + o(t^2) = 1 - \frac{t^2}{2} + o(t^2)$$
Step 3: Characteristic Function of the Sum

Since the $Z_i$ are i.i.d., the characteristic function of $S_n^*$ is:

$$\phi_{S_n^*}(t) = \left[\phi\left(\frac{t}{\sqrt{n}}\right)\right]^n = \left[1 - \frac{t^2}{2n} + o\left(\frac{t^2}{n}\right)\right]^n$$
Step 4: Take the Limit

$$\lim_{n \to \infty} \phi_{S_n^*}(t) = \lim_{n \to \infty} \left[1 - \frac{t^2}{2n}\right]^n = e^{-t^2/2}$$

This is the characteristic function of N(0,1)N(0,1). By the Lévy continuity theorem, convergence of CFs implies convergence in distribution. \blacksquare
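
The limit in Step 4 can be made concrete numerically. The sketch below assumes $Z_i = \pm 1$ with probability $1/2$ each, whose exact characteristic function is $\phi(t) = \cos t$; the test point $t = 1.5$ is an arbitrary choice.

```python
import numpy as np

# [phi(t/sqrt(n))]^n -> exp(-t^2/2) for Z = +/-1 equally likely (mean 0, var 1).
def phi(t):
    return np.cos(t)   # exact CF of this Z

t = 1.5
for n in (10, 100, 1_000, 10_000):
    print(n, phi(t / np.sqrt(n)) ** n)
print("limit:", np.exp(-t**2 / 2))   # e^{-1.125} ~ 0.3247
```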

Strong Law of Large Numbers (Outline)
For i.i.d. random variables $X_1, X_2, \ldots$ with $E|X_i| < \infty$ and $E[X_i] = \mu$:

$$P\left(\lim_{n \to \infty} \bar{X}_n = \mu\right) = 1 \quad \text{(almost sure convergence)}$$

Key Ideas

Step 1: Truncation

Truncate the variables to reduce them to the bounded case, then show the truncation error vanishes.

Step 2: Borel-Cantelli Lemma

Show that $\sum_{n=1}^\infty P(|\bar{X}_n - \mu| > \epsilon) < \infty$ for any $\epsilon > 0$.

By Borel-Cantelli, this implies $P(|\bar{X}_n - \mu| > \epsilon \text{ i.o.}) = 0$, proving almost sure convergence.

Step 3: Fourth Moment Method

For the bounded case with finite fourth moments, use:

$$\sum_{n=1}^\infty P(|\bar{X}_n - \mu| > \epsilon) \leq \sum_{n=1}^\infty \frac{E[(\bar{X}_n - \mu)^4]}{\epsilon^4} < \infty$$

The series converges because $E[(\bar{X}_n - \mu)^4] = O(n^{-2})$ when fourth moments are finite.

The full proof requires more sophisticated techniques (martingale theory or ergodic theory). \square
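
A small numeric sketch of the fourth moment method (bounded U(0,1) draws; the sample sizes are illustrative): the scaled fourth central moment $n^2\,E[(\bar{X}_n - \mu)^4]$ stabilizes, consistent with the $O(n^{-2})$ decay that makes the series summable.

```python
import numpy as np

# Fourth central moment of the sample mean decays like 1/n^2 for U(0,1) draws.
rng = np.random.default_rng(5)
reps = 20_000

for n in (50, 200, 800):
    means = rng.uniform(0.0, 1.0, size=(reps, n)).mean(axis=1)
    m4 = np.mean((means - 0.5) ** 4)
    print(n, m4, m4 * n**2)   # m4 * n^2 stabilizes near a constant
```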

Practice Quiz
1. What is convergence in distribution?
2. What is the relationship between convergence in probability and convergence in distribution?
3. What does the Weak Law of Large Numbers state?
4. What is the Central Limit Theorem?
5. What is Slutsky's Lemma used for?
6. What is almost sure convergence?
7. What does the Strong Law of Large Numbers state?
8. What is the relationship between convergence types?
9. What is the Lévy Continuity Theorem?
10. What is the de Moivre-Laplace Theorem?

Frequently Asked Questions

What is the difference between weak and strong law of large numbers?

The Weak Law states convergence in probability: for any fixed tolerance, the probability that the sample mean deviates from the population mean tends to 0 as $n \to \infty$. The Strong Law states almost sure convergence: the sample mean converges to the population mean with probability 1. The Strong Law's conclusion is strictly stronger; for i.i.d. sequences, both hold precisely when the expectation is finite.

When can I use the Central Limit Theorem?

CLT applies when you have a sum of many independent random variables with finite variance. The normalized sum (subtract mean, divide by standard deviation) converges to standard normal. This allows normal approximations for sums, averages, and sample statistics.

What is the relationship between different types of convergence?

Almost sure convergence is strongest, implying convergence in probability, which in turn implies convergence in distribution. The reverse implications are not generally true. However, convergence in distribution to a constant implies convergence in probability to that constant.

How do I apply Slutsky's Lemma?

Use Slutsky's Lemma when you have one sequence converging in distribution and another converging in probability to a constant. You can then combine them: sums, products, and ratios (with a nonzero constant limit in the denominator) all converge in distribution to the expected limits. This is essential for asymptotic statistics.

What is the practical importance of limit theorems?

Limit theorems justify statistical methods: LLN justifies using sample means as estimates, CLT enables normal approximations for confidence intervals and hypothesis tests, and convergence theory provides theoretical foundation for asymptotic inference. They are fundamental to all statistical practice.
