MathIsimple
Foundation Topic
5-7 Hours

Mathematical Statistics Fundamentals

Master the foundational concepts of mathematical statistics from population theory to statistical inference

The Logic of Mathematical Statistics

What is Mathematical Statistics?

Mathematical statistics is the science of reasoning and decision-making under uncertainty, focusing on inferring unknown population characteristics from observed sample data.

$$\text{Raw Data} \xrightarrow{\text{Statistics}} \text{Information} \xrightarrow{\text{Inference}} \text{Decisions}$$
Objective

Infer the population distribution $F(x)$ from a random sample $X_1, X_2, \ldots, X_n$

Methodology

Uses probability theory as its foundation to develop inference procedures with quantifiable reliability

Probability Theory vs Mathematical Statistics
| Aspect | Probability Theory | Mathematical Statistics |
| --- | --- | --- |
| Direction | Population → Sample (Forward) | Sample → Population (Inverse) |
| Known Information | Distribution $F$ is known | Distribution $F$ is unknown |
| Question Type | What samples will we get? | What is the population? |

Population & Samples

Statistical Population

A statistical population is the complete collection of all individuals or units under study, characterized mathematically by its cumulative distribution function:

$$F(x) = P(X \leq x)$$
Key Characteristics:
  • Can be finite or infinite
  • Population parameters ($\mu$, $\sigma^2$) are typically unknown constants
  • May belong to a parametric family: $F(x; \theta)$, where $\theta \in \Theta$ is unknown
Simple Random Sample (i.i.d.)

A simple random sample is a collection of $n$ random variables

$$X_1, X_2, \ldots, X_n$$

that are independent and identically distributed (i.i.d.), each with the same distribution as the population.

Two Essential Conditions:

1. Independence

$$X_1, X_2, \ldots, X_n \text{ are mutually independent}$$

2. Identical Distribution

$$X_i \sim F \quad \text{for all } i = 1, 2, \ldots, n$$

Sample Statistics & Empirical Distribution

Sample Mean
$$\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$$

Unbiasedness

$$E[\bar{X}] = \mu$$

Variance

$$\text{Var}(\bar{X}) = \frac{\sigma^2}{n}$$
Sample Variance
$$S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2$$

Why $n-1$? Bessel's correction makes $S^2$ an unbiased estimator of the population variance: $E[S^2] = \sigma^2$
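The effect of Bessel's correction can be checked numerically. The following is a minimal simulation sketch, not a proof: the normal population, the sample size of 5, the value $\sigma = 2$, and the helper names `variance_estimates` and `average_estimates` are all assumptions chosen for illustration.

```python
import random

def variance_estimates(sample):
    """Return (divide-by-n, divide-by-(n-1)) variance estimates for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations
    return ss / n, ss / (n - 1)

def average_estimates(n=5, trials=20000, sigma=2.0, seed=0):
    """Average both estimators over many simulated N(0, sigma^2) samples."""
    rng = random.Random(seed)
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        biased, unbiased = variance_estimates(sample)
        biased_total += biased
        unbiased_total += unbiased
    return biased_total / trials, unbiased_total / trials

biased_avg, unbiased_avg = average_estimates()
# Theory: dividing by n averages to (n-1)/n * sigma^2 = 3.2,
# dividing by n-1 averages to sigma^2 = 4.0 (up to simulation noise).
print(f"divide by n:     {biased_avg:.3f}")
print(f"divide by n - 1: {unbiased_avg:.3f}")
```

With small samples the gap between the two averages is clearly visible; it shrinks as $n$ grows because the factor $(n-1)/n$ approaches 1.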

Empirical Distribution Function
$$F_n(x) = \frac{1}{n}\sum_{i=1}^n \mathbb{1}\{X_i \leq x\}$$

Glivenko-Cantelli Theorem

$$\sup_x |F_n(x) - F(x)| \xrightarrow{a.s.} 0 \quad \text{as } n \to \infty$$

The empirical distribution function converges uniformly to the true distribution function, almost surely.
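To see the uniform convergence in action, one can compute $\sup_x |F_n(x) - F(x)|$ for samples of increasing size. The sketch below assumes a Uniform(0, 1) population, so that $F(x) = x$ on $[0, 1]$; the function name `sup_distance_to_uniform` is hypothetical.

```python
import random

def sup_distance_to_uniform(sample):
    """sup_x |F_n(x) - F(x)| when F is the Uniform(0, 1) CDF, F(x) = x."""
    xs = sorted(sample)
    n = len(xs)
    # F_n jumps from (i-1)/n to i/n at the i-th order statistic, so the
    # supremum is reached at (or just before) one of the sample points.
    return max(max(i / n - x, x - (i - 1) / n) for i, x in enumerate(xs, start=1))

rng = random.Random(42)
for n in (10, 100, 1000, 10000):
    sample = [rng.random() for _ in range(n)]
    print(n, round(sup_distance_to_uniform(sample), 4))
```

The printed distances should shrink roughly like $1/\sqrt{n}$, although any single run is subject to random fluctuation.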

Fundamental Theorems

Weak Law of Large Numbers (WLLN)
For i.i.d. samples with finite mean $\mu$ and variance $\sigma^2$, the sample mean $\bar{X}_n$ converges in probability to $\mu$.

Mathematical Statement

$$\lim_{n \to \infty} P(|\bar{X}_n - \mu| \geq \epsilon) = 0 \quad \text{for any } \epsilon > 0$$
1. Moments of Sample Mean

$$E[\bar{X}_n] = \mu, \quad \text{Var}(\bar{X}_n) = \frac{\sigma^2}{n}$$

2. Apply Chebyshev's Inequality

$$P(|\bar{X}_n - \mu| \geq \epsilon) \leq \frac{\sigma^2}{n\epsilon^2}$$

3. Take the Limit

$$\lim_{n \to \infty} \frac{\sigma^2}{n\epsilon^2} = 0$$

The probability is nonnegative and bounded above by a quantity that tends to 0, so $P(|\bar{X}_n - \mu| \geq \epsilon) \to 0$. $\blacksquare$
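The Chebyshev bound used in the proof can be compared with simulated tail probabilities. This is an illustrative sketch under assumptions not in the text: a Uniform(0, 1) population (so $\mu = 1/2$, $\sigma^2 = 1/12$), $\epsilon = 0.05$, and the hypothetical helper names `chebyshev_bound` and `empirical_tail`.

```python
import random

def chebyshev_bound(sigma2, n, eps):
    """Upper bound sigma^2 / (n * eps^2) on P(|mean - mu| >= eps), capped at 1."""
    return min(1.0, sigma2 / (n * eps ** 2))

def empirical_tail(n, eps, trials=2000, seed=0):
    """Estimate P(|mean - mu| >= eps) for Uniform(0, 1) samples (mu = 1/2)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() for _ in range(n)) / n
        if abs(mean - 0.5) >= eps:
            hits += 1
    return hits / trials

sigma2 = 1 / 12  # variance of Uniform(0, 1)
for n in (10, 100, 1000):
    # Columns: n, simulated tail probability, Chebyshev upper bound
    print(n, empirical_tail(n, eps=0.05), round(chebyshev_bound(sigma2, n, 0.05), 4))
```

The bound is typically far from tight, but it is enough for the proof: both columns head to 0 as $n$ grows.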

Central Limit Theorem (CLT)
For i.i.d. samples with mean $\mu$ and finite variance $\sigma^2$, the standardized sample mean converges in distribution to $N(0,1)$.

Mathematical Statement

$$\sqrt{n}\left(\frac{\bar{X}_n - \mu}{\sigma}\right) \xrightarrow{d} N(0,1)$$
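1. Standardize Variables

Let $Z_i = (X_i - \mu)/\sigma$. Then $E[Z_i] = 0$ and $\text{Var}(Z_i) = 1$, and the standardized sample mean is $S_n^* = \sqrt{n}\,(\bar{X}_n - \mu)/\sigma = \frac{1}{\sqrt{n}}\sum_{i=1}^n Z_i$.

2. MGF Expansion

Assuming the MGF of $Z_i$ exists in a neighborhood of 0, it satisfies $M_Z(t) = 1 + \frac{t^2}{2} + o(t^2)$ as $t \to 0$.

3. Take the Limit

By independence, $M_{S_n^*}(t) = \left[M_Z(t/\sqrt{n})\right]^n$, so

$$\lim_{n \to \infty} M_{S_n^*}(t) = \lim_{n \to \infty} \left[1 + \frac{t^2}{2n} + o\!\left(\frac{1}{n}\right)\right]^n = e^{t^2/2}$$

$e^{t^2/2}$ is the MGF of $N(0,1)$. By the continuity theorem for MGFs, $S_n^*$ converges in distribution to $N(0,1)$. $\blacksquare$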
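A quick simulation can make the theorem concrete for a population that is far from normal. The sketch below assumes an Exponential(1) population (mean 1, standard deviation 1), $n = 200$, and a hypothetical helper `standardized_means`; it compares the simulated distribution of the standardized mean with the standard normal CDF at a few points.

```python
import math
import random
from statistics import NormalDist

def standardized_means(n, trials, seed=0):
    """Simulate sqrt(n) * (mean - mu) / sigma for Exponential(1) samples."""
    rng = random.Random(seed)
    mu = sigma = 1.0  # Exponential(1) has mean 1 and standard deviation 1
    values = []
    for _ in range(trials):
        mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
        values.append(math.sqrt(n) * (mean - mu) / sigma)
    return values

z = standardized_means(n=200, trials=5000)
for q in (-1.0, 0.0, 1.0, 2.0):
    empirical = sum(v <= q for v in z) / len(z)
    # Columns: point q, simulated P(Z <= q), standard normal CDF at q
    print(q, round(empirical, 3), round(NormalDist().cdf(q), 3))
```

The two probability columns should agree to roughly two decimal places; the remaining gap reflects both simulation noise and the skewness of the exponential distribution at finite $n$.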

Examples

1. Example: Sample Mean Properties

Problem: For a population with mean $\mu = 100$ and variance $\sigma^2 = 25$, find the expected value and variance of the sample mean for a sample of size $n = 16$.

Solution:

By the properties of sample mean:

$$E[\bar{X}] = \mu = 100$$

$$\text{Var}(\bar{X}) = \frac{\sigma^2}{n} = \frac{25}{16} = 1.5625$$
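These two values can be sanity-checked by simulation. The problem does not specify the shape of the population, so the sketch below assumes a normal population purely for illustration; any distribution with the same mean and variance should give similar results.

```python
import random
from statistics import fmean, pvariance

mu, sigma2, n = 100.0, 25.0, 16

# Theoretical values from the formulas E[mean] = mu and Var(mean) = sigma^2 / n
print(mu, sigma2 / n)

# Simulated values: draw many samples of size n and look at their means
rng = random.Random(1)
sigma = sigma2 ** 0.5
means = [fmean([rng.gauss(mu, sigma) for _ in range(n)]) for _ in range(20000)]
print(round(fmean(means), 2), round(pvariance(means), 3))
```

The second line of output should land close to 100 and 1.5625, up to simulation noise.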
2. Example: Empirical Distribution

Problem: Given a sample $X_1 = 2, X_2 = 5, X_3 = 3, X_4 = 5, X_5 = 7$, construct the empirical distribution function $F_n(x)$.

Solution:

The empirical CDF is a step function:

$$F_n(x) = \begin{cases} 0 & x < 2 \\ 0.2 & 2 \leq x < 3 \\ 0.4 & 3 \leq x < 5 \\ 0.8 & 5 \leq x < 7 \\ 1 & x \geq 7 \end{cases}$$
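The step function above can be reproduced in a few lines. This is one possible sketch using the standard library's `bisect` module; the function name `ecdf` is an illustrative choice, not something defined in the text.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return F_n as a function: the fraction of sample points <= x."""
    xs = sorted(sample)
    n = len(xs)
    # bisect_right counts how many sorted values are <= x
    return lambda x: bisect_right(xs, x) / n

F_n = ecdf([2, 5, 3, 5, 7])
for x in (1, 2, 3, 4.9, 5, 7):
    print(x, F_n(x))
```

Note the jump of size 0.4 at $x = 5$: the value 5 appears twice in the sample, so it contributes $2/5$ rather than $1/5$.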
3. Example: CLT Application

Problem: For a population with $\mu = 50$ and $\sigma = 10$, approximate $P(\bar{X}_{100} > 52)$ using the CLT.

Solution:

By the CLT, $\bar{X}_{100}$ is approximately $N(50, 1)$, since $\text{Var}(\bar{X}_{100}) = \sigma^2/n = 100/100 = 1$.

$$P(\bar{X}_{100} > 52) = P\left(\frac{\bar{X}_{100} - 50}{1} > 2\right) \approx 1 - \Phi(2) \approx 0.0228$$
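The same normal approximation can be evaluated without a $\Phi$ table. A short sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50.0, 10.0, 100

# CLT approximation: the sample mean is roughly N(mu, sigma^2 / n)
approx = NormalDist(mu=mu, sigma=sigma / sqrt(n))
p = 1.0 - approx.cdf(52.0)
print(round(p, 4))  # 0.0228
```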

Practice Quiz

Test your understanding with 10 multiple-choice questions

1. What is the fundamental difference between probability theory and mathematical statistics?
2. A statistical population is characterized by:
3. For a simple random sample $X_1, X_2, \ldots, X_n$ to be i.i.d., what conditions must be satisfied?
4. Why do we divide by $n-1$ instead of $n$ when calculating sample variance?
5. The empirical distribution function $F_n(x)$ converges to the true distribution $F(x)$ according to:
6. What does the Weak Law of Large Numbers state?
7. According to the Central Limit Theorem, the standardized sample mean converges to:
8. A statistic is a function that:
9. Order statistics $X_{(1)} \leq X_{(2)} \leq \cdots \leq X_{(n)}$ are:
10. The sample mean $\bar{X}$ has which property?

Frequently Asked Questions

What is the difference between mathematical statistics and probability theory?

Probability theory works forwards from known distributions to predict sample behavior, while mathematical statistics works backwards from observed samples to infer unknown population characteristics. Probability asks "what samples will we get?", while statistics asks "what is the population?".

Why do we divide by n-1 instead of n when calculating sample variance?

Dividing by $n-1$ (Bessel's correction) makes the sample variance an unbiased estimator of the population variance. Deviations are measured from the sample mean rather than the true population mean, and the sample mean is by construction the point that minimizes the sum of squared deviations. Dividing that sum by $n$ therefore underestimates $\sigma^2$ on average, and dividing by $n-1$ corrects this exactly.
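For readers who want the algebra behind this answer, here is a short derivation sketch. It uses only the i.i.d. assumption and the facts $E[(X_i - \mu)^2] = \sigma^2$ and $\text{Var}(\bar{X}) = \sigma^2/n$ stated earlier.

```latex
\begin{aligned}
\sum_{i=1}^n (X_i - \bar{X})^2
  &= \sum_{i=1}^n (X_i - \mu)^2 - n(\bar{X} - \mu)^2, \\
E\left[\sum_{i=1}^n (X_i - \bar{X})^2\right]
  &= n\sigma^2 - n \cdot \frac{\sigma^2}{n} = (n-1)\sigma^2, \\
E[S^2]
  &= \frac{1}{n-1}\,(n-1)\sigma^2 = \sigma^2.
\end{aligned}
```

The first line follows from expanding $(X_i - \mu)^2 = \big((X_i - \bar{X}) + (\bar{X} - \mu)\big)^2$ and noting that the cross term sums to zero because $\sum_i (X_i - \bar{X}) = 0$.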

What does i.i.d. mean and why is it important?

I.i.d. stands for "independent and identically distributed". It means each observation comes from the same distribution and doesn't depend on other observations. This assumption is crucial because it allows us to apply powerful statistical theorems like the Law of Large Numbers and Central Limit Theorem.

What is a statistical population?

A statistical population is the complete collection of all individuals or units under study, characterized by a distribution function $F(x)$. It can be finite (e.g., all students in a school) or infinite (e.g., all possible measurements of a physical quantity). The population distribution typically contains unknown parameters we want to estimate.

How large should my sample size be?

Sample size depends on several factors: desired precision, population variability, confidence level, and the specific inference goal. Generally, larger samples provide more precise estimates. Rules of thumb include $n \geq 30$ for CLT applications, but power analysis provides more rigorous sample size determination for specific tests.
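As one concrete illustration of how precision, variability, and confidence level interact, the sketch below uses the common normal-approximation formula $n = \lceil (z\sigma/E)^2 \rceil$ for estimating a mean to within a margin of error $E$. It assumes $\sigma$ is known (or well estimated) and that the CLT approximation is adequate; the function name `sample_size_for_mean` and the example numbers are hypothetical, and this is not a substitute for a proper power analysis.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin (normal approximation)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil((z * sigma / margin) ** 2)

print(sample_size_for_mean(sigma=10, margin=2))  # 97
print(sample_size_for_mean(sigma=10, margin=1))  # 385
```

Halving the margin of error roughly quadruples the required sample size, which reflects the $1/\sqrt{n}$ rate at which the standard deviation of $\bar{X}$ shrinks.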
