Lesson 3-3: Normal & Sampling Distributions

Normal Distribution & Sampling Distributions

Master the normal distribution, z-scores, Central Limit Theorem, and sampling distributions for statistical inference and probability calculations.

Learning Objectives

Normal Distribution

Understand the bell curve, standardization, and probability calculations

Central Limit Theorem

Learn how sample means approach normality

t-Distribution

Use when population standard deviation is unknown

Confidence Intervals

Construct intervals for means and proportions

Core Knowledge Points

Normal Distribution Fundamentals

Probability Density Function

f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}

Parameters: μ (mean) and σ (standard deviation)

Standardization (Z-scores)

Z = \frac{X - \mu}{\sigma}

Converts any normal distribution to standard normal: Z ~ N(0,1)

Cumulative Distribution Function

P(X \leq x) = \Phi\left(\frac{x-\mu}{\sigma}\right)

Φ(z) is the standard normal CDF, available in tables or software
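The three formulas above can be tried out numerically. The sketch below is one possible way to do it with Python's standard-library `statistics.NormalDist`; the parameters (mean 100, standard deviation 15) and the point x = 115 are chosen only for illustration.

```python
from statistics import NormalDist

# Illustrative parameters (assumed, not prescribed by the formulas).
mu, sigma = 100.0, 15.0
dist = NormalDist(mu=mu, sigma=sigma)

x = 115.0
z = (x - mu) / sigma                # standardization: Z = (X - mu) / sigma
density = dist.pdf(x)               # f(x), the height of the bell curve at x
prob_direct = dist.cdf(x)           # P(X <= x) on the original scale
prob_via_z = NormalDist().cdf(z)    # Phi(z) on the standard normal scale

print(f"z = {z:.2f}")
print(f"f({x}) = {density:.5f}")
print(f"P(X <= {x}) = {prob_direct:.4f} (direct) vs {prob_via_z:.4f} (via z)")
```

Both probabilities agree, which is the point of standardization: any normal probability can be read off the standard normal CDF.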

Central Limit Theorem (CLT)

Statement

For independent random samples from a population with mean μ and finite standard deviation σ, the sampling distribution of the sample mean approaches a normal distribution as the sample size n increases, regardless of the shape of the population distribution. For large n, approximately:

\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)

Standard Error

SE = \frac{\sigma}{\sqrt{n}}

Measures how much the sample mean varies from sample to sample; a smaller SE means a more precise estimate of μ

Sample Size Rule

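Rule of thumb: n ≥ 30 is usually enough for approximate normality; n ≥ 15 is often enough if the population is roughly normal, and heavily skewed populations may need larger samples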
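A small simulation can make the theorem concrete. The sketch below assumes a strongly right-skewed population (exponential with rate 1, so μ = 1 and σ = 1) and a sample size of 36; both choices, the seed, and the number of repetitions are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1 (mu = 1, sigma = 1, right-skewed).
mu, sigma = 1.0, 1.0
n = 36                 # size of each sample
num_samples = 10_000   # number of simulated samples

sample_means = []
for _ in range(num_samples):
    sample = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(sum(sample) / n)

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)
theoretical_se = sigma / sqrt(n)

# Share of sample means within one standard error of mu (about 0.68 if normal).
within_one_se = sum(abs(m - mu) <= theoretical_se for m in sample_means) / num_samples

print(f"mean of sample means: {mean_of_means:.3f} (theory: {mu})")
print(f"sd of sample means:   {sd_of_means:.3f} (theory: {theoretical_se:.3f})")
print(f"share within 1 SE:    {within_one_se:.3f}")
```

Even though individual observations are far from normal, the simulated sample means should cluster around μ with spread close to σ/√n.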

t-Distribution

When to Use

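When the population standard deviation σ is unknown and is estimated by the sample standard deviation s. The difference from the normal distribution matters most for small samples (n < 30), where the population should also be approximately normal.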

t-Statistic

t = \frac{\bar{X} - \mu}{s/\sqrt{n}}

where s is the sample standard deviation
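As a rough illustration of the t-statistic, the sketch below computes it for a small made-up sample. The data values and the hypothesized mean `mu0` are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of n = 8 measurements and a hypothesized population mean.
sample = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0]
mu0 = 10.0

n = len(sample)
x_bar = mean(sample)
s = stdev(sample)                        # sample standard deviation (n - 1 in the denominator)
t_stat = (x_bar - mu0) / (s / sqrt(n))   # t = (x_bar - mu) / (s / sqrt(n))
df = n - 1                               # degrees of freedom

print(f"x_bar = {x_bar:.3f}, s = {s:.3f}, t = {t_stat:.3f}, df = {df}")
```

Note that `statistics.stdev` divides by n − 1, which matches the definition of s used here; `statistics.pstdev` would divide by n instead.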

Properties

• Symmetric and bell-shaped like normal distribution
• Heavier tails than normal distribution
• Degrees of freedom = n - 1
• Approaches normal distribution as df → ∞
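The heavier tails and the convergence to the normal distribution can be seen by comparing critical values. Python's standard library has no t-distribution, so the sketch below integrates the t density numerically (Simpson's rule) and inverts it by bisection. The helper names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical, and a statistics package would normally be used instead.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(t: float, df: float) -> float:
    """Density of Student's t with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + t * t / df))

def t_cdf(t: float, df: float, steps: int = 1000) -> float:
    """P(T <= t) for t >= 0: Simpson's rule on [0, t] plus symmetry about 0."""
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence: float, df: float) -> float:
    """Two-sided critical value t_{alpha/2, df} by bisection (assumes it is below 100)."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z_crit = NormalDist().inv_cdf(0.975)
for df in (5, 24, 1000):
    print(f"df = {df:4d}: t critical = {t_critical(0.95, df):.3f}")
print(f"normal:    z critical = {z_crit:.3f}")
```

The critical values shrink toward the normal value of about 1.96 as df grows; for df = 24 the result should be close to the tabulated 2.064.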

Confidence Intervals

For Population Mean (σ known)

\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}

For Population Mean (σ unknown)

\bar{x} \pm t_{\alpha/2, df} \cdot \frac{s}{\sqrt{n}}

For Population Proportion

\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}

where $\hat{p} = \frac{x}{n}$ is the sample proportion and x is the number of successes in the sample
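The z-based intervals above translate directly into code. The following is a minimal sketch using only the standard library; the function names and the example inputs are hypothetical. The t-based interval is omitted here because the standard library does not provide t critical values.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def mean_interval_known_sigma(x_bar, sigma, n, confidence=0.95):
    """x_bar +/- z * sigma / sqrt(n), for a population mean with sigma known."""
    margin = z_critical(confidence) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

def proportion_interval(successes, n, confidence=0.95):
    """p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n), for a population proportion."""
    p_hat = successes / n
    margin = z_critical(confidence) * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical inputs, for illustration only.
mean_ci = mean_interval_known_sigma(x_bar=50.0, sigma=12.0, n=36)
prop_ci = proportion_interval(successes=120, n=400)
print(f"95% CI for the mean:       ({mean_ci[0]:.2f}, {mean_ci[1]:.2f})")
print(f"95% CI for the proportion: ({prop_ci[0]:.4f}, {prop_ci[1]:.4f})")
```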

Comprehensive Worked Examples

Example 1: Normal Distribution Probability Calculation

Problem Statement

The heights of adult men are normally distributed with mean 70 inches and standard deviation 3 inches. Find the probability that a randomly selected man is between 68 and 72 inches tall.

Solution Steps

Step 1: Identify the distribution

X ~ N(70, 3²) where μ = 70 and σ = 3

Step 2: Standardize the values

For x = 68: $z_1 = \frac{68 - 70}{3} \approx -0.67$

For x = 72: $z_2 = \frac{72 - 70}{3} \approx 0.67$

Step 3: Calculate the probability

P(68 \leq X \leq 72) \approx P(-0.67 \leq Z \leq 0.67) = \Phi(0.67) - \Phi(-0.67)

= 0.7486 - 0.2514 = 0.4972

Answer

The probability that a randomly selected man is between 68 and 72 inches tall is approximately 49.7%.
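The table lookup can be cross-checked in software. The sketch below, using the standard-library `NormalDist`, computes the probability once without rounding the z-scores and once with z rounded to ±0.67 as in the hand calculation.

```python
from statistics import NormalDist

# Heights: X ~ N(70, 3^2), as given in the problem statement.
heights = NormalDist(mu=70, sigma=3)
exact = heights.cdf(72) - heights.cdf(68)   # no rounding of z

# Same calculation with z rounded to two decimals, as when using a z-table.
std_normal = NormalDist()
table_style = std_normal.cdf(0.67) - std_normal.cdf(-0.67)

print(f"unrounded z:  {exact:.4f}")
print(f"z = +/-0.67:  {table_style:.4f}")
```

The two results differ slightly in the third decimal place because 2/3 was rounded to 0.67; both round to about 50%, consistent with the answer above.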

Example 2: Central Limit Theorem Application

Problem Statement

A population has mean 50 and standard deviation 12. If we take random samples of size 36, what is the probability that the sample mean will be between 48 and 52?

Solution Steps

Step 1: Apply CLT

Since n = 36 ≥ 30, the sampling distribution is approximately normal:

\bar{X} \sim N\left(50, \frac{12^2}{36}\right) = N(50, 4)

Step 2: Calculate standard error

SE = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{36}} = 2

Step 3: Standardize and calculate probability

For $\bar{x} = 48$ : $z_1 = \frac{48 - 50}{2} = -1$

For $\bar{x} = 52$ : $z_2 = \frac{52 - 50}{2} = 1$

P(48 \leq \bar{X} \leq 52) = P(-1 \leq Z \leq 1) = \Phi(1) - \Phi(-1) = 0.8413 - 0.1587 = 0.6826

Answer

The probability that the sample mean will be between 48 and 52 is approximately 68.3%.
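The same steps can be wrapped in a small helper. This is only a sketch that assumes the normal approximation from the CLT holds; the function name `prob_sample_mean_between` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_between(mu, sigma, n, low, high):
    """P(low <= X_bar <= high) under the CLT normal approximation."""
    se = sigma / sqrt(n)                       # standard error of the mean
    sampling_dist = NormalDist(mu=mu, sigma=se)
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)

p = prob_sample_mean_between(mu=50, sigma=12, n=36, low=48, high=52)
print(f"P(48 <= X_bar <= 52) = {p:.4f}")
```

Software gives about 0.6827, while four-decimal table values give 0.6826; the small gap comes from rounding in the table.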

Example 3: Confidence Interval Construction

Problem Statement

A sample of 25 students has a mean test score of 78 with a standard deviation of 8. Construct a 95% confidence interval for the population mean.

Solution Steps

Step 1: Identify the situation

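The population σ is unknown and is estimated by s = 8, so use the t-distribution. Because n = 25 is small, this assumes the test scores are approximately normally distributed.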

Step 2: Find critical value

df = n - 1 = 24, α = 0.05, so t₀.₀₂₅,₂₄ = 2.064

Step 3: Calculate margin of error

ME = t_{\alpha/2, df} \cdot \frac{s}{\sqrt{n}} = 2.064 \cdot \frac{8}{\sqrt{25}} = 2.064 \cdot 1.6 = 3.30

Step 4: Construct the interval

\bar{x} \pm ME = 78 \pm 3.30 = (74.7, 81.3)

Answer

We are 95% confident that the true population mean test score lies between 74.7 and 81.3.
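The interval arithmetic can be reproduced in a few lines. In this sketch the critical value 2.064 is taken from the t-table, as in Step 2, and passed in as a parameter; the function name `t_interval` is hypothetical.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """x_bar +/- t_crit * s / sqrt(n); t_crit is supplied from a t-table."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

low, high = t_interval(x_bar=78, s=8, n=25, t_crit=2.064)
print(f"95% CI: ({low:.1f}, {high:.1f})")
```

Passing the critical value explicitly keeps the sketch free of third-party dependencies; with a statistics package the value could be computed from df = 24 instead.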

Practice Problems

Normal Distribution Probability

Given: IQ scores are normally distributed with mean 100 and standard deviation 15

Find: Probability that a randomly selected person has an IQ between 85 and 115

Solution

Standardize the values:

For x = 85:

z_1 = \frac{85 - 100}{15} = -1

For x = 115:

z_2 = \frac{115 - 100}{15} = 1

Calculate the probability:

P(85 \leq X \leq 115) = P(-1 \leq Z \leq 1) = \Phi(1) - \Phi(-1) = 0.8413 - 0.1587 = 0.6826

Answer: The probability is 0.6826 or 68.26%

Central Limit Theorem Application

Given: Population mean μ = 25, standard deviation σ = 6, sample size n = 49

Find: Probability that the sample mean exceeds 26

Solution

Apply Central Limit Theorem:

\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) = N\left(25, \frac{6^2}{49}\right) \approx N(25, 0.735)

Calculate z-score:

z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{26 - 25}{6/\sqrt{49}} = \frac{1}{6/7} = \frac{7}{6} \approx 1.17

Find the probability:

P(\bar{X} > 26) = P(Z > 1.17) = 1 - \Phi(1.17) = 1 - 0.8790 = 0.1210

Answer: The probability is 0.1210 or 12.10%

Confidence Interval Construction

Given: Sample of 16 measurements with mean 42.5 and standard deviation 3.2

Find: 90% confidence interval for the population mean

Solution

Find critical value and degrees of freedom:

df = n - 1 = 16 - 1 = 15, α = 0.10, t₀.₀₅,₁₅ = 1.753

Calculate margin of error:

ME = t_{\alpha/2, df} \cdot \frac{s}{\sqrt{n}} = 1.753 \cdot \frac{3.2}{\sqrt{16}} = 1.753 \cdot 0.8 = 1.40

Construct the confidence interval:

\bar{x} \pm ME = 42.5 \pm 1.40 = (41.1, 43.9)

Answer: We are 90% confident that the true population mean lies between 41.1 and 43.9

Advanced Topics

Sampling Distribution of Proportions

For large samples (np ≥ 10 and n(1-p) ≥ 10), the sampling distribution of sample proportions is approximately normal:

\hat{p} \sim N\left(p, \frac{p(1-p)}{n}\right)

This allows us to construct confidence intervals and perform hypothesis tests for population proportions.
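A minimal sketch of this approximation is shown below. It checks the large-sample condition before building the normal model; the function name and the inputs (p = 0.4, n = 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_sampling_dist(p: float, n: int) -> NormalDist:
    """Normal approximation to the sampling distribution of p_hat."""
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("need np >= 10 and n(1-p) >= 10 for the normal approximation")
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

# Hypothetical example: true proportion 0.4, samples of size 100.
dist = proportion_sampling_dist(p=0.4, n=100)
prob = dist.cdf(0.45)   # P(p_hat <= 0.45)
print(f"SE = {dist.stdev:.4f}, P(p_hat <= 0.45) = {prob:.4f}")
```

Raising an error when the condition fails is one design choice among several; a warning, or a switch to an exact binomial calculation, would also be reasonable.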

Chi-Square Distribution

Used for inference about a population variance and for goodness-of-fit tests. If $Z_1, Z_2, \ldots, Z_k$ are independent standard normal variables:

\chi^2 = Z_1^2 + Z_2^2 + \cdots + Z_k^2 \sim \chi^2(k)

where k is the degrees of freedom.
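The definition can be checked by simulation. The sketch below builds chi-square draws directly as sums of squared standard normals and compares them with the known mean k and variance 2k of the chi-square distribution; k = 5, the seed, and the number of draws are illustrative assumptions.

```python
import random
from statistics import mean, variance

random.seed(1)  # fixed seed so the run is repeatable

def chi_square_draw(k: int) -> float:
    """One draw from chi-square(k): the sum of k squared standard normals."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

k = 5
draws = [chi_square_draw(k) for _ in range(20_000)]
sim_mean = mean(draws)
sim_var = variance(draws)

print(f"simulated mean:     {sim_mean:.2f} (theory: {k})")
print(f"simulated variance: {sim_var:.2f} (theory: {2 * k})")
```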

F-Distribution

Used for comparing variances and in ANOVA. If $U \sim \chi^2(d_1)$ and $V \sim \chi^2(d_2)$ are independent:

F = \frac{U/d_1}{V/d_2} \sim F(d_1, d_2)

where d₁ and d₂ are the degrees of freedom.
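The ratio construction can also be simulated. This sketch draws F values as a ratio of two independent scaled chi-square variables and compares the simulated mean with the known value d₂/(d₂ − 2), which holds for d₂ > 2; the degrees of freedom, seed, and number of draws are illustrative assumptions.

```python
import random
from statistics import mean

random.seed(2)  # fixed seed so the run is repeatable

def chi_square_draw(k: int) -> float:
    """One draw from chi-square(k): the sum of k squared standard normals."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

def f_draw(d1: int, d2: int) -> float:
    """One draw from F(d1, d2) as (U / d1) / (V / d2) with independent U and V."""
    return (chi_square_draw(d1) / d1) / (chi_square_draw(d2) / d2)

d1, d2 = 4, 20
draws = [f_draw(d1, d2) for _ in range(20_000)]
sim_mean = mean(draws)
theoretical_mean = d2 / (d2 - 2)

print(f"simulated mean: {sim_mean:.3f} (theory: {theoretical_mean:.3f})")
```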

Common Misconceptions

"All data follows a normal distribution"

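This is false. Many real-world distributions are skewed, bimodal, or have other shapes. The normal model works well for quantities that are sums or averages of many small independent effects, which is the situation the CLT describes, but normality should be checked rather than assumed.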

"Z-scores are always between -3 and 3"

For normally distributed data, about 99.7% of z-scores fall within this range, but more extreme values are possible. Z-scores can theoretically be any real number.
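To put a number on how rare such values are under a normal model, the short sketch below computes the probability of a z-score outside ±3 and the count this implies in a hypothetical data set of one million observations.

```python
from statistics import NormalDist

std_normal = NormalDist()
p_within = std_normal.cdf(3) - std_normal.cdf(-3)   # P(-3 <= Z <= 3)
p_outside = 1 - p_within                            # P(|Z| > 3)

# Hypothetical data set size, for illustration only.
expected_per_million = p_outside * 1_000_000

print(f"P(|Z| <= 3) = {p_within:.4f}")
print(f"P(|Z| > 3)  = {p_outside:.4f}")
print(f"expected count beyond 3 sd per million: {expected_per_million:.0f}")
```

So in a large normal sample, thousands of observations beyond three standard deviations are expected rather than surprising.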

"The CLT applies to any sample size"

The CLT requires sufficiently large sample sizes. For very small samples or highly skewed populations, the normal approximation may not be appropriate.

Real-World Applications

Quality Control

• Manufacturing process monitoring
• Product dimension tolerances
• Defect rate estimation
• Six Sigma methodology

Medical Research

• Drug efficacy studies
• Clinical trial design
• Diagnostic test accuracy
• Treatment outcome prediction

Finance and Economics

• Risk assessment models
• Portfolio optimization
• Market analysis
• Economic forecasting

Social Sciences

• Survey data analysis
• Educational assessment
• Psychological testing
• Public opinion polling

Summary

The normal distribution is fundamental to statistical inference, providing a model for many natural phenomena and serving as the foundation for the Central Limit Theorem. Understanding z-scores, standardization, and sampling distributions enables us to make probabilistic statements about population parameters.

Key concepts: Normal distribution properties, standardization, Central Limit Theorem, t-distribution, confidence intervals, and their applications in real-world statistical inference.

Practical skills: Calculate probabilities using normal distributions, construct confidence intervals, and understand when to use different distributions based on sample size and known/unknown parameters.
