MathIsimple – Simple, Friendly Math Tools & Learning

The Logic of Mathematical Statistics

What is Mathematical Statistics?

Mathematical statistics is the science of reasoning and decision-making under uncertainty, focusing on inferring unknown population characteristics from observed sample data.

\text{Raw Data} \xrightarrow{\text{Statistics}} \text{Information} \xrightarrow{\text{Inference}} \text{Decisions}

Objective

Infer population distribution $F(x)$ from random sample $X_1, X_2, \ldots, X_n$

Methodology

Uses probability theory as its foundation to develop inference procedures with quantifiable reliability

Probability Theory vs Mathematical Statistics

Aspect	Probability Theory	Mathematical Statistics
Direction	Population → Sample (Forward)	Sample → Population (Inverse)
Known Information	Distribution F is known	Distribution F is unknown
Question Type	What samples will we get?	What is the population?

Population & Samples

Statistical Population

A statistical population is the complete collection of all individuals or units under study, characterized mathematically by its cumulative distribution function:

F(x) = P(X \leq x)

Key Characteristics:

Can be finite or infinite
Population parameters ($\mu$, $\sigma^2$) are typically unknown constants
May belong to a parametric family: $F(x; \theta)$ where $\theta \in \Theta$ is unknown

Simple Random Sample (i.i.d.)

A simple random sample is a collection of $n$ random variables:

X_1, X_2, \ldots, X_n

that are independent and identically distributed (i.i.d.), each with the same distribution as the population.

Two Essential Conditions:

1. Independence

X_1, X_2, \ldots, X_n \text{ are mutually independent}

2. Identical Distribution

X_i \sim F \quad \text{for all } i = 1, 2, \ldots, n

Sample Statistics & Empirical Distribution

Sample Mean

\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i

Unbiasedness

E[\bar{X}] = \mu

Variance

\text{Var}(\bar{X}) = \frac{\sigma^2}{n}

Sample Variance

S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2

Why n-1? Bessel's correction makes $S^2$ an unbiased estimator: $E[S^2] = \sigma^2$
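The unbiasedness claims above can be checked exactly on a toy case. The sketch below is only an illustration: it assumes a hypothetical three-value population (the values 1, 2, 6 are made up), draws every possible i.i.d. sample of size $n = 2$, and averages the statistics over all of them using exact fractions. Dividing by $n-1$ recovers $\sigma^2$, while dividing by $n$ underestimates it.

```python
from fractions import Fraction
from itertools import product


def mean(xs):
    """Exact arithmetic mean of a sequence of Fractions."""
    return sum(xs, Fraction(0)) / len(xs)


def sum_sq_dev(xs):
    """Sum of squared deviations from the mean of xs."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


# Hypothetical population: each value is equally likely on every draw.
population = [Fraction(v) for v in (1, 2, 6)]
mu = mean(population)
sigma2 = sum_sq_dev(population) / len(population)  # population variance

n = 2
# Every i.i.d. sample of size n; all of them are equally likely.
samples = list(product(population, repeat=n))

expected_xbar = mean([mean(s) for s in samples])
var_xbar = mean([(mean(s) - mu) ** 2 for s in samples])
expected_s2 = mean([sum_sq_dev(s) / (n - 1) for s in samples])   # divide by n-1
expected_biased = mean([sum_sq_dev(s) / n for s in samples])     # divide by n

print("E[X-bar]      =", expected_xbar, " (mu =", mu, ")")
print("Var(X-bar)    =", var_xbar, " (sigma^2/n =", sigma2 / n, ")")
print("E[S^2]        =", expected_s2, " (sigma^2 =", sigma2, ")")
print("E[divide by n]=", expected_biased)
```

Because the enumeration is exhaustive, the equalities hold exactly rather than approximately; a simulation with random draws would only show them up to sampling noise.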

Empirical Distribution Function

F_n(x) = \frac{1}{n}\sum_{i=1}^n \mathbb{1}\{X_i \leq x\}

Glivenko-Cantelli Theorem

\sup_x |F_n(x) - F(x)| \xrightarrow{a.s.} 0 \quad \text{as } n \to \infty

The empirical distribution function converges uniformly to the true distribution function, almost surely.
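As a rough illustration of $F_n$ and of the Glivenko-Cantelli statement, the sketch below builds the empirical distribution function and measures $\sup_x |F_n(x) - F(x)|$ against a known continuous $F$. The helper names `make_ecdf` and `sup_distance`, the choice of Uniform(0, 1) as the true distribution, and the seed are all assumptions made for this example.

```python
import random
from bisect import bisect_right


def make_ecdf(sample):
    """Return F_n, where F_n(x) is the fraction of sample points <= x."""
    data = sorted(sample)
    n = len(data)

    def F_n(x):
        return bisect_right(data, x) / n

    return F_n


def sup_distance(sample, cdf):
    """sup_x |F_n(x) - F(x)| for a continuous cdf F.

    F_n only changes at the sample points, so it is enough to compare F
    with the value of F_n just before and just after each jump.
    """
    data = sorted(sample)
    n = len(data)
    return max(
        max(abs((i + 1) / n - cdf(x)), abs(i / n - cdf(x)))
        for i, x in enumerate(data)
    )


def uniform_cdf(x):
    """CDF of Uniform(0, 1), the assumed true distribution here."""
    return min(max(x, 0.0), 1.0)


rng = random.Random(0)  # fixed seed so the run is reproducible
for n in (10, 100, 10_000):
    sample = [rng.random() for _ in range(n)]
    print(n, sup_distance(sample, uniform_cdf))
```

The printed distances should shrink as $n$ grows, although any single run is random and need not decrease at every step.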

Fundamental Theorems

Weak Law of Large Numbers (WLLN)

For i.i.d. samples with finite mean $\mu$ and finite variance $\sigma^2$, the sample mean $\bar{X}_n$ converges in probability to $\mu$.

Mathematical Statement

\lim_{n \to \infty} P(|\bar{X}_n - \mu| \geq \epsilon) = 0 \quad \text{for any } \epsilon > 0

1. Moments of Sample Mean

E[\bar{X}_n] = \mu, \quad \text{Var}(\bar{X}_n) = \frac{\sigma^2}{n}

2. Apply Chebyshev's Inequality

P(|\bar{X}_n - \mu| \geq \epsilon) \leq \frac{\text{Var}(\bar{X}_n)}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2}

3. Take the Limit

\lim_{n \to \infty} \frac{\sigma^2}{n\epsilon^2} = 0

The probability is nonnegative and bounded above by a quantity that tends to 0, so its limit is 0. $\blacksquare$
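To see the Chebyshev step in numbers, the sketch below compares the bound $\sigma^2/(n\epsilon^2)$ with a simulated tail probability. It assumes Uniform(0, 1) data (so $\mu = 1/2$ and $\sigma^2 = 1/12$), and the function names, trial count, and seed are illustrative choices rather than anything prescribed by the proof.

```python
import random


def chebyshev_bound(sigma2, n, eps):
    """Bound sigma^2 / (n * eps^2) on P(|X-bar_n - mu| >= eps).

    Capped at 1, since a probability cannot exceed 1 anyway.
    """
    return min(1.0, sigma2 / (n * eps ** 2))


def empirical_tail(n, eps, trials=2000, seed=0):
    """Fraction of simulated means of n Uniform(0,1) draws with |X-bar - 0.5| >= eps."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        xbar = sum(rng.random() for _ in range(n)) / n
        if abs(xbar - 0.5) >= eps:
            misses += 1
    return misses / trials


sigma2 = 1 / 12  # variance of Uniform(0, 1)
eps = 0.05
for n in (10, 100, 1000):
    print(n, chebyshev_bound(sigma2, n, eps), empirical_tail(n, eps))
```

The bound is usually far from tight; the proof only needs it to tend to 0 as $n \to \infty$.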

Central Limit Theorem (CLT)

For i.i.d. samples with mean $\mu$ and finite variance $\sigma^2$, the standardized sample mean converges in distribution to $N(0,1)$.

Mathematical Statement

\sqrt{n}\left(\frac{\bar{X}_n - \mu}{\sigma}\right) \xrightarrow{d} N(0,1)

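1. Standardize Variables

Let $Z_i = (X_i - \mu)/\sigma$. Then $E[Z_i]=0$, $\text{Var}(Z_i)=1$, and the standardized sample mean is $S_n^* = \sqrt{n}(\bar{X}_n - \mu)/\sigma = \frac{1}{\sqrt{n}}\sum_{i=1}^n Z_i$.

2. MGF Expansion

Assume the MGF of $Z_i$ exists in a neighborhood of 0 (a stronger condition than finite variance; the general case uses characteristic functions instead). Near 0, $M_Z(t) = 1 + \frac{t^2}{2} + o(t^2)$.

3. Take the Limit

By independence, $M_{S_n^*}(t) = \left[M_Z(t/\sqrt{n})\right]^n = \left[1 + \frac{t^2}{2n} + o(1/n)\right]^n$, so

\lim_{n \to \infty} M_{S_n^*}(t) = e^{t^2/2}

$e^{t^2/2}$ is the MGF of $N(0,1)$. By the continuity theorem for MGFs, $S_n^*$ converges in distribution to $N(0,1)$. $\blacksquare$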
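The limit in the last step can be checked numerically for one concrete case. The sketch below assumes each $Z_i$ is $+1$ or $-1$ with probability $1/2$ (a choice made only for this example), for which $M_Z(t) = \cosh(t)$ and the MGF of the standardized sum $S_n^* = (Z_1 + \cdots + Z_n)/\sqrt{n}$ is $\cosh(t/\sqrt{n})^n$.

```python
import math


def mgf_standardized_sum(t, n):
    """MGF of S_n^* = (Z_1 + ... + Z_n) / sqrt(n) for Z_i = +1 or -1 with probability 1/2.

    Here M_Z(t) = cosh(t), so by independence M_{S_n^*}(t) = cosh(t / sqrt(n)) ** n.
    """
    return math.cosh(t / math.sqrt(n)) ** n


def mgf_standard_normal(t):
    """MGF of N(0, 1): exp(t^2 / 2)."""
    return math.exp(t * t / 2)


t = 1.0
for n in (1, 10, 100, 10_000):
    print(n, mgf_standardized_sum(t, n), mgf_standard_normal(t))
```

For this distribution the values increase toward $e^{t^2/2}$ as $n$ grows; other distributions with an MGF near 0 approach the same limit, possibly from a different direction.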

Examples

Example 1: Sample Mean Properties

Problem: For a population with mean $\mu = 100$ and variance $\sigma^2 = 25$, find the expected value and variance of the sample mean for a sample of size $n = 16$.

Solution:

By the properties of sample mean:

E[\bar{X}] = \mu = 100

\text{Var}(\bar{X}) = \frac{\sigma^2}{n} = \frac{25}{16} = 1.5625

Example 2: Empirical Distribution

Problem: Given a sample $X_1 = 2, X_2 = 5, X_3 = 3, X_4 = 5, X_5 = 7$, construct the empirical distribution function $F_n(x)$.

Solution:

The empirical CDF is a step function that jumps by $1/5$ at each observation (by $2/5$ at $x = 5$, which appears twice):

F_n(x) = \begin{cases} 0 & x < 2 \\ 0.2 & 2 \leq x < 3 \\ 0.4 & 3 \leq x < 5 \\ 0.8 & 5 \leq x < 7 \\ 1 & x \geq 7 \end{cases}

Example 3: CLT Application

Problem: For a population with $\mu = 50$ and $\sigma = 10$, approximate $P(\bar{X}_{100} > 52)$ using CLT.

Solution:

By CLT, $\bar{X}_{100} \approx N(50, 1)$ since $\text{Var}(\bar{X}_{100}) = \sigma^2/n = 100/100 = 1$.

P(\bar{X}_{100} > 52) = P\left(\frac{\bar{X}_{100} - 50}{1} > 2\right) \approx 1 - \Phi(2) \approx 0.0228
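The same approximation can be reproduced with Python's standard library. This is a small sketch using `statistics.NormalDist`; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

mu, sigma, n = 50, 10, 100
standard_error = sigma / n ** 0.5   # sqrt(Var(X-bar_n)) = sigma / sqrt(n)

# CLT approximation: X-bar_100 is roughly N(mu, standard_error^2).
approx = NormalDist(mu=mu, sigma=standard_error)
tail = 1 - approx.cdf(52)           # P(X-bar_100 > 52)

print(round(tail, 4))  # 0.0228
```

Note that `NormalDist` takes the standard deviation, not the variance, so the standard error $\sigma/\sqrt{n}$ is passed rather than $\sigma^2/n$ (the two happen to coincide in this example because both equal 1).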

Practice Quiz

Test your understanding with 10 multiple-choice questions


1. What is the fundamental difference between probability theory and mathematical statistics?
2. A statistical population is characterized by:
3. For a simple random sample $X_1, X_2, \ldots, X_n$ to be i.i.d., what conditions must be satisfied?
4. Why do we divide by $n-1$ instead of $n$ when calculating sample variance?
5. The empirical distribution function $F_n(x)$ converges to the true distribution $F(x)$ according to:
6. What does the Weak Law of Large Numbers state?
7. According to the Central Limit Theorem, the standardized sample mean converges to:
8. A statistic is a function that:
9. Order statistics $X_{(1)} \leq X_{(2)} \leq \cdots \leq X_{(n)}$ are:
10. The sample mean $\bar{X}$ has which property?

Frequently Asked Questions

What is the difference between mathematical statistics and probability theory?

Probability theory works forwards from known distributions to predict sample behavior, while mathematical statistics works backwards from observed samples to infer unknown population characteristics. Probability asks "what samples will we get?", while statistics asks "what is the population?".

Why do we divide by n-1 instead of n when calculating sample variance?

Dividing by $n-1$ (Bessel's correction) makes the sample variance an unbiased estimator of the population variance. This correction accounts for the fact that we're using the sample mean rather than the true population mean, which introduces a slight underestimation that $n-1$ corrects.

What does i.i.d. mean and why is it important?

I.i.d. stands for "independent and identically distributed". It means each observation comes from the same distribution and doesn't depend on other observations. This assumption is crucial because it allows us to apply powerful statistical theorems like the Law of Large Numbers and Central Limit Theorem.

What is a statistical population?

A statistical population is the complete collection of all individuals or units under study, characterized by a distribution function $F(x)$. It can be finite (e.g., all students in a school) or infinite (e.g., all possible measurements of a physical quantity). The population distribution typically contains unknown parameters we want to estimate.

How large should my sample size be?

Sample size depends on several factors: desired precision, population variability, confidence level, and the specific inference goal. Generally, larger samples provide more precise estimates. Rules of thumb include $n \geq 30$ for CLT applications, but power analysis provides more rigorous sample size determination for specific tests.
