
Hypothesis Testing

Master the fundamental principles of statistical hypothesis testing: from basic concepts and error analysis to advanced methods and real-world applications in statistical inference.

Essential Definitions

Core concepts in hypothesis testing theory

Null Hypothesis (H₀)

The baseline hypothesis under test, typically containing '=', '≥', or '≤', representing the status quo or no effect condition.

Mathematical:

H_0: \theta \in \Theta_0

Example:

H₀: μ = μ₀ (population mean equals specified value)

Alternative Hypothesis (H₁)

The hypothesis that contradicts H₀, typically containing '≠', '>', or '<', representing what we're trying to detect.

Mathematical:

H_1: \theta \in \Theta_1

Example:

H₁: μ ≠ μ₀ (two-sided), H₁: μ > μ₀ (right-sided)

Type I Error (α)

The probability of rejecting H₀ when it is actually true (false positive). Controlled by significance level.

Mathematical:

\alpha(\theta) = P_{\theta}(X \in D \mid \theta \in \Theta_0)

Example:

α = 0.05 means 5% chance of false rejection

Type II Error (β)

The probability of failing to reject H₀ when H₁ is true (false negative). Related to statistical power.

Mathematical:

\beta(\theta) = P_{\theta}(X \in \overline{D} \mid \theta \in \Theta_1)

Example:

Power = 1 - β measures test's ability to detect true effects

Hypothesis Construction

Hypothesis Construction Principles
Guidelines for properly formulating null and alternative hypotheses

Complementary Hypotheses

H₀ and H₁ must be mutually exclusive and collectively exhaustive

Mathematical:
\Theta_0 \cap \Theta_1 = \emptyset \text{ and } \Theta_0 \cup \Theta_1 = \Theta
Example: For μ: H₀: μ = μ₀ vs H₁: μ ≠ μ₀

Status Quo in H₀

H₀ typically represents the current belief, no change, or no effect

Mathematical:
H_0: \text{parameter} = \text{claimed value}
Example: Testing drug effectiveness: H₀: drug has no effect

Burden of Proof

H₁ represents what requires evidence to establish (burden of proof)

Mathematical:
H_1: \text{what we want to detect}
Example: Proving guilt: H₀: innocent, H₁: guilty

Directionality

Choose one-sided or two-sided based on research question

Mathematical:
\text{Two-sided: } H_1: \theta \neq \theta_0 \text{, One-sided: } H_1: \theta > \theta_0
Example: Quality control often uses one-sided tests
Types of Hypothesis Tests
Classification based on alternative hypothesis structure

Two-Sided (Two-Tailed)

Structure: H₀: θ = θ₀ vs H₁: θ ≠ θ₀

Rejection Region: T < c₁ or T > c₂

Example: Testing if population mean differs from specified value

Applications:

  • Quality assurance
  • Scientific experiments
  • A/B testing

Right-Sided (Upper-Tailed)

Structure: H₀: θ ≤ θ₀ vs H₁: θ > θ₀

Rejection Region: T > c

Example: Testing if new process increases efficiency

Applications:

  • Process improvement
  • Treatment effectiveness
  • Performance enhancement

Left-Sided (Lower-Tailed)

Structure: H₀: θ ≥ θ₀ vs H₁: θ < θ₀

Rejection Region: T < c

Example: Testing if new method reduces error rate

Applications:

  • Cost reduction
  • Risk minimization
  • Error rate improvement

Error Analysis & Statistical Power

Understanding Type I and Type II errors, and optimizing test power

Type I Error (False Positive)
Rejecting true H₀ - the probability of a false alarm
\alpha = \max_{\theta \in \Theta_0} P_{\theta}(\text{Reject } H_0)

Characteristics:

  • Controlled by significance level α
  • Common values: α = 0.01, 0.05, 0.10
Real-world analogy: Convicting an innocent person (α-error in justice system)
Type II Error (False Negative)
Failing to reject false H₀ - missing a true effect
\beta(\theta) = P_{\theta}(\text{Accept } H_0 \mid \theta \in \Theta_1)

Characteristics:

  • Depends on true parameter value
  • Related to test power: Power = 1 - β
Real-world analogy: Failing to detect a disease when it's present
Power Function
Probability of correctly rejecting H₀ when it's false
g(\theta) = P_{\theta}(X \in D) = P_{\theta}(\text{Reject } H_0)

Key Properties:

  • For θ ∈ Θ₀: g(θ) = α(θ) (Type I error rate)
  • For θ ∈ Θ₁: g(θ) = 1 - β(θ) (Power)

Factors Affecting Power:

  • Sample size (larger n increases power)
  • Significance level (larger α increases power)
Error Trade-off Relationship
For fixed sample size, Type I and Type II errors are inversely related
\text{As } \alpha \downarrow \text{, then } \beta \uparrow \text{ (for fixed } n\text{)}

Solutions to Improve Both Errors:

  • Increase sample size n (reduces both errors)
  • Use more informative experimental design
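To make these relationships concrete, the following minimal Python sketch (using NumPy and SciPy; all parameter values are illustrative) computes the power function g(μ) of a right-sided z-test for a normal mean with known σ, and shows how power responds to the sample size n and the significance level α:

```python
# Power function of a right-sided z-test for H0: mu <= mu0 vs H1: mu > mu0,
# assuming X_i ~ N(mu, sigma^2) with sigma known (illustrative values).
import numpy as np
from scipy import stats

def power(mu_true, mu0=0.0, sigma=1.0, n=25, alpha=0.05):
    # Rejection region: Xbar > mu0 + z_alpha * sigma / sqrt(n)
    z_alpha = stats.norm.ppf(1 - alpha)
    crit = mu0 + z_alpha * sigma / np.sqrt(n)
    # g(mu) = P_mu(Xbar > crit), where Xbar ~ N(mu, sigma^2 / n)
    return stats.norm.sf(crit, loc=mu_true, scale=sigma / np.sqrt(n))

# Power increases with the sample size n ...
for n in (10, 25, 100):
    print(f"n = {n:3d}   power at mu = 0.3: {power(0.3, n=n):.3f}")

# ... and with the significance level alpha (the Type I / Type II trade-off).
for a in (0.01, 0.05, 0.10):
    print(f"alpha = {a:.2f}   power at mu = 0.3: {power(0.3, alpha=a):.3f}")
```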

Neyman-Pearson Principle

Optimal test construction for simple hypotheses

Neyman-Pearson Principle
The fundamental framework for controlling Type I error while minimizing Type II error

Principle Statement:

Control the maximum Type I error probability at level α, and among all such tests, choose the one with minimum Type II error (maximum power).

\sup_{\theta \in \Theta_0} \alpha(\theta) \leq \alpha

Key Advantages:

  • Provides clear error control framework
  • Enables comparison between different tests
  • Forms basis for optimal test construction
  • Widely applicable across statistical problems
Significance Level (α)
The maximum allowable Type I error probability
α = 0.01

Very strong evidence required

Common in: Medical trials, Safety testing
α = 0.05

Standard in most fields

Common in: Social sciences, Quality control
Critical Value Selection:
\text{Choose critical value } c \text{ such that } \sup_{\theta \in \Theta_0} P(T > c \mid H_0) = \alpha
Optimal Test Construction
Steps to construct tests following Neyman-Pearson principle
  1. Specify H₀ and H₁ clearly
  2. Choose significance level α and identify test statistic T
  3. Determine rejection region D to satisfy the size constraint
Optimality Concept:

Among all tests with the same significance level, choose the one with the highest power (the Uniformly Most Powerful test, when it exists)

Testing Procedure & Decision Rules

General Hypothesis Testing Procedure
Standard framework for conducting statistical hypothesis tests
Step 1: Formulate Hypotheses & Choose Test

State H₀ and H₁, then select appropriate test statistic

Key Details:
  • Define parameter of interest
  • Select test statistic based on assumptions
Example:
H_0: \mu = 50 \text{ vs } H_1: \mu \neq 50, \text{ use } T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}
Step 2: Determine Rejection Region

Based on H₁ direction and significance level α

Key Details:
  • Two-sided: |T| > c
  • One-sided: T > c (right-sided) or T < -c (left-sided)
Example:
\text{For } \alpha = 0.05\text{, two-sided: } |T| > t_{0.025}(n-1)
Step 3: Calculate Test Statistic

Compute statistic value using sample data

Key Details:
  • Substitute sample values
  • Verify calculation accuracy
Example:
t = \frac{15.2 - 15.0}{0.8/\sqrt{25}} = 1.25
Step 4: Make Decision & Report P-value

Compare statistic to critical value and calculate P-value

Key Details:
  • State decision clearly
  • Report P-value
Example:
\text{Since } |1.25| < 2.064\text{, fail to reject } H_0\text{; P-value} = 0.224
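The worked numbers in Steps 3-4 can be reproduced with a short SciPy sketch from the summary statistics in the example (x̄ = 15.2, s = 0.8, n = 25, μ₀ = 15.0):

```python
# One-sample two-sided t-test from summary statistics (values from the example above).
import numpy as np
from scipy import stats

xbar, s, n, mu0, alpha = 15.2, 0.8, 25, 15.0, 0.05
t_stat = (xbar - mu0) / (s / np.sqrt(n))         # = 1.25
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)    # ~ 2.064
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)  # ~ 0.224

print(f"t = {t_stat:.3f}, critical value = {t_crit:.3f}, p-value = {p_value:.3f}")
print("reject H0" if abs(t_stat) > t_crit else "fail to reject H0")
```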
Decision Rules and Interpretation
Guidelines for making and interpreting test decisions

Critical Value Approach

Compare test statistic to critical value

Decision Rule:
\text{If } |T| > c_{\alpha/2} \text{, reject } H_0

Advantages: Direct comparison, Clear decision boundary

Disadvantages: Doesn't show strength of evidence

P-value Approach

Compare P-value to significance level

Decision Rule:
\text{If P-value} < \alpha \text{, reject } H_0

Advantages: Shows strength of evidence, More informative

Disadvantages: Can be misinterpreted

Interpretation Guidelines

Reject H₀:

Strong evidence against H₀ in favor of H₁

Fail to Reject H₀:

Insufficient evidence to reject H₀ (not proof of H₀)

Common Mistakes:

  • Don't say 'accept H₀' - we never prove H₀
  • Don't confuse statistical and practical significance
  • P-value is not probability that H₀ is true

Core Theorem Proofs

Neyman-Pearson Lemma
Most Powerful Test Construction

For testing simple hypotheses H₀: θ = θ₀ vs H₁: θ = θ₁, the likelihood ratio test is the most powerful test of size α.

Theorem Statement

\text{Let } \phi(\mathbf{x}) = \begin{cases} 1 & \text{if } \frac{L(\theta_1; \mathbf{x})}{L(\theta_0; \mathbf{x})} > k \\ 0 & \text{otherwise} \end{cases}

Then φ is the most powerful level-α test for H₀ vs H₁.

Proof Steps

Step 1: Define Simple Hypothesis Problem

Consider simple hypothesis testing: H₀: θ = θ₀ versus H₁: θ = θ₁. Let X = (X₁, ..., Xₙ) be the data vector with likelihood functions L₀(x) = L(θ₀; x) and L₁(x) = L(θ₁; x).

H_0: \theta = \theta_0 \quad \text{vs} \quad H_1: \theta = \theta_1
Step 2: Construct Likelihood Ratio Rejection Region

Define the rejection region based on likelihood ratio: D_k = {x : L₁(x)/L₀(x) > k}, where k is chosen to satisfy the size constraint. This forms the basis of the likelihood ratio test.

D_k = \left\{\mathbf{x} : \lambda(\mathbf{x}) = \frac{L_1(\mathbf{x})}{L_0(\mathbf{x})} > k\right\}
Step 3: Verify Significance Level Constraint

Choose k such that the test has exactly size α: P_θ₀(X ∈ D_k) = α. Under regularity conditions, there exists such a k. This ensures the Type I error rate is controlled at level α.

\alpha = P_{\theta_0}(\mathbf{X} \in D_k) = \int_{D_k} L_0(\mathbf{x}) \, d\mathbf{x}
Step 4: Compare with Any Same-Level Test

Let D' be any other rejection region satisfying P_θ₀(X ∈ D') ≤ α. We need to show that P_θ₁(X ∈ D_k) ≥ P_θ₁(X ∈ D'), i.e., the LRT has maximum power among all level-α tests.

\text{Goal: } P_{\theta_1}(\mathbf{X} \in D_k) \geq P_{\theta_1}(\mathbf{X} \in D') \text{ for all } D' \text{ with } P_{\theta_0}(\mathbf{X} \in D') \leq \alpha
Step 5: Apply Indicator Function Algebra

For x ∈ D_k we have L₁(x) > kL₀(x), while for x ∉ D_k we have L₁(x) ≤ kL₀(x). Writing each test's power as an integral of L₁ over its rejection region and cancelling the common part D_k ∩ D', the power difference reduces to integrals over D_k \ D' and D' \ D_k.

P_{\theta_1}(D_k) - P_{\theta_1}(D') = \int_{D_k \setminus D'} L_1(\mathbf{x}) \, d\mathbf{x} - \int_{D' \setminus D_k} L_1(\mathbf{x}) \, d\mathbf{x}
Step 6: Conclude Optimality (UMP)

Since L₁(x) > kL₀(x) on D_k \ D' and L₁(x) ≤ kL₀(x) on D' \ D_k, the difference above is at least k[P_θ₀(D_k \ D') - P_θ₀(D' \ D_k)] = k[P_θ₀(D_k) - P_θ₀(D')] ≥ k(α - α) = 0. Hence P_θ₁(X ∈ D_k) ≥ P_θ₁(X ∈ D'): the likelihood ratio test is the most powerful level-α test, and since the alternative is simple it is trivially UMP (Uniformly Most Powerful) as well.

\therefore \phi_k \text{ is the UMP level-}\alpha\text{ test for } H_0 \text{ vs } H_1

Example Application

For testing H₀: μ = 0 vs H₁: μ = 1 in N(μ, 1), the LRT reduces to rejecting H₀ when X̄ > c, which is the most powerful test.
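A minimal sketch of that example, assuming an illustrative sample size n = 20 and α = 0.05: under H₀ the sample mean X̄ ~ N(0, 1/n), so the most powerful test rejects when X̄ exceeds c = z_α/√n, and its power is evaluated at the simple alternative μ = 1:

```python
# Neyman-Pearson test for H0: mu = 0 vs H1: mu = 1 with X_i ~ N(mu, 1);
# the likelihood ratio is increasing in xbar, so reject when xbar > c.
import numpy as np
from scipy import stats

n, alpha = 20, 0.05                                        # illustrative choices
c = stats.norm.ppf(1 - alpha) / np.sqrt(n)                 # P_{mu=0}(Xbar > c) = alpha
power = stats.norm.sf(c, loc=1.0, scale=1 / np.sqrt(n))    # P_{mu=1}(Xbar > c)
print(f"reject H0 when xbar > {c:.3f}; power at mu = 1 is {power:.4f}")
```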

Wilks' Theorem
Asymptotic Distribution of GLRT

Under regularity conditions, the generalized likelihood ratio statistic -2 log λ converges in distribution to a chi-square distribution as sample size approaches infinity.

Theorem Statement

-2\log \lambda(\mathbf{X}) \xrightarrow{d} \chi^2(r) \text{ as } n \to \infty

where r = dim(Θ) - dim(Θ₀) is the difference in parameter dimensions.

Proof Steps

Step 1: Define Likelihood Ratio Statistic

Consider the generalized likelihood ratio Λ(X) = L(θ̂₀; X) / L(θ̂; X), where θ̂ is the unrestricted MLE and θ̂₀ is the MLE under H₀: θ ∈ Θ₀. The statistic ranges from 0 to 1.

\lambda(\mathbf{X}) = \frac{\sup_{\theta \in \Theta_0} L(\theta; \mathbf{X})}{\sup_{\theta \in \Theta} L(\theta; \mathbf{X})} = \frac{L(\hat{\theta}_0; \mathbf{X})}{L(\hat{\theta}; \mathbf{X})}
Step 2: Logarithmic Transformation

Consider the log-likelihood ratio: -2 log Λ = 2[ℓ(θ̂) - ℓ(θ̂₀)], where ℓ(θ) = log L(θ; X) is the log-likelihood. This transformation is monotone and more analytically tractable.

-2\log\lambda = 2[\ell(\hat{\theta}) - \ell(\hat{\theta}_0)]
Step 3: Taylor Expansion

Expand ℓ(θ̂) and ℓ(θ̂₀) around the true θ₀ (assuming H₀ is true). Using Taylor's theorem to second order, we get quadratic forms involving the score and information matrix.

\ell(\hat{\theta}) \approx \ell(\theta_0) + \nabla\ell(\theta_0)^T(\hat{\theta} - \theta_0) - \frac{1}{2}(\hat{\theta} - \theta_0)^T I(\theta_0)(\hat{\theta} - \theta_0)
Step 4: Apply Central Limit Theorem

By the asymptotic normality of the MLE, we have √n(θ̂ - θ₀) →ᵈ N(0, I(θ₀)⁻¹), where I(θ₀) is the Fisher information matrix. This is a fundamental result in maximum likelihood theory.

\sqrt{n}(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, I(\theta_0)^{-1})
Step 5: Apply Law of Large Numbers

The Fisher information matrix can be consistently estimated by the observed information. By LLN, the empirical information converges to the true Fisher information: În → I(θ₀) in probability.

\hat{I}_n = -\frac{1}{n}\nabla^2 \ell(\hat{\theta}) \xrightarrow{P} I(\theta_0)
Step 6: Conclude Chi-Square Distribution

Combining steps 3-5 with Slutsky's theorem, -2 log λ asymptotically equals a quadratic form of a multivariate normal vector, which follows a χ²(r) distribution, where r is the difference in dimensions between full and restricted parameter spaces.

-2\log\lambda \xrightarrow{d} \chi^2(r), \quad r = \dim(\Theta) - \dim(\Theta_0)

Example Application

Testing H₀: μ₁ = μ₂ = μ₃ in three normal populations, -2 log λ follows approximately χ²(2) for large samples.
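A brief sketch of how the χ²(2) reference distribution from Wilks' theorem would be used in that example; the observed value of -2 log λ below is purely illustrative:

```python
# Chi-square reference distribution for the generalized likelihood ratio statistic.
from scipy import stats

r = 2                               # dim(Theta) - dim(Theta_0): two equality restrictions
observed = 7.1                      # illustrative observed value of -2 log(lambda)
crit = stats.chi2.ppf(0.95, df=r)   # ~ 5.99 for alpha = 0.05
p_value = stats.chi2.sf(observed, df=r)
print(f"critical value = {crit:.2f}, p-value = {p_value:.3f}")
```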

Common Statistical Tests

Standard tests for normal populations and common parameters

Single Normal Population Tests
Tests for normal population parameters with different scenarios

U-Test (Z-Test)

Scenario:

Testing population mean with known variance

Hypotheses:

H₀: μ = μ₀ vs H₁: μ ≠ μ₀, μ > μ₀, or μ < μ₀

Assumptions:

  • X ~ N(μ, σ²)
  • σ² known
  • Random sample

Test Statistic:

U = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1) \text{ under } H_0

Rejection Regions:

  • Two-sided: |U| > u_{α/2}
  • Right-sided: U > u_α
  • Left-sided: U < -u_α

Example Application:

Testing if mean height = 170cm with σ = 5cm known
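A minimal sketch of that U-test with hypothetical summary data (n = 40 measurements, x̄ = 171.2 cm, and σ = 5 cm assumed known):

```python
# Two-sided z-test (U-test) for a normal mean with known sigma (hypothetical data).
import numpy as np
from scipy import stats

xbar, mu0, sigma, n, alpha = 171.2, 170.0, 5.0, 40, 0.05
u = (xbar - mu0) / (sigma / np.sqrt(n))
u_crit = stats.norm.ppf(1 - alpha / 2)
p_value = 2 * stats.norm.sf(abs(u))
print(f"U = {u:.3f}, critical value = {u_crit:.3f}, p-value = {p_value:.3f}")
```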

T-Test

Scenario:

Testing population mean with unknown variance

Hypotheses:

H₀: μ = μ₀ vs H₁: μ ≠ μ₀, μ > μ₀, or μ < μ₀

Assumptions:

  • X ~ N(μ, σ²)
  • σ² unknown
  • Random sample

Test Statistic:

T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t(n-1) \text{ under } H_0

Rejection Regions:

  • Two-sided: |T| > t_{α/2}(n-1)
  • Right-sided: T > t_α(n-1)
  • Left-sided: T < -t_α(n-1)

Example Application:

Testing if new teaching method improves test scores
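For raw data, SciPy's built-in one-sample t-test can be used directly; the scores below are hypothetical, and the test is right-sided against H₀: μ = 75 (the `alternative` argument requires SciPy ≥ 1.6):

```python
# Right-sided one-sample t-test on hypothetical raw scores.
from scipy import stats

scores = [78, 82, 74, 80, 77, 85, 79, 76, 81, 83]
result = stats.ttest_1samp(scores, popmean=75, alternative='greater')
print(f"t = {result.statistic:.3f}, p-value = {result.pvalue:.4f}")
```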

Two Sample Comparison Tests
Tests comparing parameters between two independent populations

Two-Sample T-Test

Scenario:

Comparing means with unknown but equal variances

Hypotheses:

H₀: μ_X = μ_Y vs H₁: μ_X ≠ μ_Y, μ_X > μ_Y, or μ_X < μ_Y

Assumptions:

  • X ~ N(μ_X, σ²), Y ~ N(μ_Y, σ²)
  • σ² unknown but equal
  • Independent samples

Test Statistic:

T = \frac{\bar{X} - \bar{Y}}{S_w\sqrt{1/m + 1/n}} \sim t(m+n-2) \text{ under } H_0

Pooled Variance:

S_w^2 = \frac{(m-1)S_X^2 + (n-1)S_Y^2}{m+n-2}

Rejection Regions:

  • Two-sided: |T| > t_{α/2}(m+n-2)
  • Right-sided: T > t_α(m+n-2)
  • Left-sided: T < -t_α(m+n-2)

Example Application:

Comparing test scores between two teaching methods
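A sketch of the pooled two-sample t-test on hypothetical score data for two teaching methods; passing `equal_var=True` makes SciPy's `ttest_ind` use the pooled variance S_w² defined above:

```python
# Pooled-variance two-sample t-test (hypothetical scores for two teaching methods).
from scipy import stats

method_a = [72, 75, 78, 71, 77, 74, 79, 73]
method_b = [70, 68, 74, 69, 72, 71, 67, 73]
result = stats.ttest_ind(method_a, method_b, equal_var=True)
print(f"t = {result.statistic:.3f}, p-value = {result.pvalue:.4f}")
```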

Generalized Likelihood Ratio Test

A general method for constructing hypothesis tests using likelihood functions

Generalized Likelihood Ratio Test (GLRT)
A general method for constructing hypothesis tests using likelihood functions

Motivation:

When optimal tests don't exist or are unknown, GLRT provides a systematic approach

Principle:

Compare maximum likelihood under full parameter space to maximum likelihood under null hypothesis constraint

GLRT Construction

Likelihood Ratio Definition:

\lambda(\tilde{x}) = \frac{\sup_{\theta \in \Theta} L(\theta; \tilde{x})}{\sup_{\theta \in \Theta_0} L(\theta; \tilde{x})} = \frac{L(\hat{\theta}; \tilde{x})}{L(\hat{\theta}_0; \tilde{x})}

Components:

  • L(θ; x̃): Likelihood function
  • θ̂: Unrestricted MLE (global maximum)
  • θ̂₀: Restricted MLE under H₀ (constrained maximum)
  • λ(x̃): Likelihood ratio statistic

Test Rule:

\text{Reject } H_0 \text{ if } \lambda(\tilde{X}) > c

Critical Value:

\text{Choose } c \text{ such that } \sup_{\theta \in \Theta_0} P_{\theta}(\lambda(\tilde{X}) > c) \leq \alpha
GLRT Examples

Normal Mean Test (σ² unknown)

Hypotheses: H₀: μ = μ₀ vs H₁: μ ≠ μ₀
Global MLE:
\hat{\mu} = \bar{X}, \quad \hat{\sigma}^2 = \frac{1}{n}\sum(X_i - \bar{X})^2
Restricted MLE:
\hat{\mu}_0 = \mu_0, \quad \hat{\sigma}_0^2 = \frac{1}{n}\sum(X_i - \mu_0)^2
Likelihood Ratio:
\lambda = \left(1 + \frac{t^2}{n-1}\right)^{n/2}
Test Statistic:
t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}
Equivalence: \lambda \text{ monotone in } |t| \Rightarrow \text{GLRT equivalent to t-test}
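This equivalence can be checked numerically. The sketch below simulates a small sample, computes λ directly from the two plugged-in likelihoods (for the normal model the ratio reduces to (σ̂₀²/σ̂²)^(n/2)), and compares it with the closed form (1 + t²/(n-1))^(n/2):

```python
# Numerical check that lambda = (1 + t^2/(n-1))^(n/2) for the normal-mean GLRT.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.3, scale=1.0, size=15)      # simulated sample
mu0, n = 0.0, len(x)

xbar = x.mean()
sigma2_hat = np.mean((x - xbar) ** 2)            # unrestricted MLE of sigma^2
sigma2_0 = np.mean((x - mu0) ** 2)               # restricted MLE under H0
lam_direct = (sigma2_0 / sigma2_hat) ** (n / 2)  # sup_Theta L / sup_Theta0 L

t = (xbar - mu0) / (x.std(ddof=1) / np.sqrt(n))
lam_formula = (1 + t ** 2 / (n - 1)) ** (n / 2)
print(np.isclose(lam_direct, lam_formula))       # True: the two expressions agree
```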
Large Sample Properties

Wilks' Theorem:

\text{Under regularity conditions: } 2\log\lambda(\tilde{X}) \xrightarrow{d} \chi^2(r) \text{ as } n \to \infty

where r = \dim(\Theta) - \dim(\Theta_0)

Applications:

  • Provides approximate critical values for large samples
  • Enables testing in complex models where exact distributions unknown

Limitations:

  • Requires large sample sizes for accuracy
  • May not be optimal for specific alternatives

Confidence Intervals & Hypothesis Testing

Confidence Interval - Hypothesis Test Duality
Fundamental two-way relationship between interval estimation and hypothesis testing

There's a one-to-one correspondence between confidence intervals and hypothesis tests at the same confidence/significance level

Test → Interval

From acceptance regions to confidence sets

Formula:
C(\tilde{x}) = \{\theta_0 : \tilde{x} \in A(\theta_0)\}

Explanation: The confidence set contains all parameter values that would not be rejected by the test

Example: If |t| ≤ t_{α/2}(n-1) is the acceptance region, then the confidence interval is x̄ ± t_{α/2}(n-1) × s/√n

Interval → Test

From confidence sets to acceptance regions

Formula:
A(\theta_0) = \{\tilde{x} : \theta_0 \in C(\tilde{x})\}

Explanation: Accept H₀: θ = θ₀ if and only if θ₀ lies within the confidence interval

Example: Reject H₀: μ = μ₀ if μ₀ falls outside the 95% confidence interval for μ

Practical Implications:

  • Confidence intervals provide range of plausible parameter values
  • Hypothesis tests provide binary decisions about specific values
  • Intervals more informative for practical decision making
Duality Examples

Normal Mean (σ unknown)

Confidence Interval:
\left[\bar{x} - t_{\alpha/2}(n-1) \frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}(n-1) \frac{s}{\sqrt{n}}\right]
Hypothesis Test:
\text{Reject } H_0: \mu = \mu_0 \text{ if } \mu_0 \notin \text{CI}
Interpretation: 95% CI gives all μ₀ values that would not be rejected at α = 0.05
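A small numerical check of this duality on a hypothetical sample: a candidate value μ₀ is rejected by the two-sided t-test at α = 0.05 exactly when it falls outside the 95% confidence interval:

```python
# Confidence interval / t-test duality check on a hypothetical sample.
import numpy as np
from scipy import stats

x = np.array([15.1, 15.4, 14.9, 15.6, 15.2, 15.0, 15.3, 14.8])
n, alpha = len(x), 0.05
xbar, s = x.mean(), x.std(ddof=1)
half_width = stats.t.ppf(1 - alpha / 2, df=n - 1) * s / np.sqrt(n)
ci = (xbar - half_width, xbar + half_width)

for mu0 in (15.0, 16.0):
    p = stats.ttest_1samp(x, popmean=mu0).pvalue
    inside = ci[0] <= mu0 <= ci[1]
    print(f"mu0 = {mu0}: inside 95% CI = {inside}, p-value = {p:.4f}, reject = {p < alpha}")
```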

Real-World Applications

Practical applications of hypothesis testing across different domains

Quality Control Testing
Monitor production processes to ensure specifications are met

Common Scenarios:

  • Testing if mean product dimension meets target specification
  • Monitoring process variability within acceptable limits
  • Detecting shifts in production quality over time

Typical Tests:

One-sample t-test
Chi-square test for variance
Control charts

Key Considerations:

  • Economic consequences
  • Cost of Type I vs Type II errors
  • Sample size planning
Medical Research
Evaluate treatment effectiveness and drug safety

Common Scenarios:

  • Testing if new treatment improves patient outcomes
  • Comparing side effect rates between treatments
  • Establishing bioequivalence between generic and brand drugs

Typical Tests:

Two-sample t-test
Chi-square independence test
Equivalence testing

Key Considerations:

  • Patient safety
  • Regulatory requirements
  • Ethical implications
A/B Testing
Compare different versions to optimize performance

Common Scenarios:

  • Testing if new website design increases conversion rate
  • Comparing marketing campaign effectiveness
  • Evaluating user interface changes

Typical Tests:

Two-proportion z-test
Two-sample t-test
Chi-square test

Key Considerations:

  • Business impact
  • Sample size constraints
  • Multiple testing corrections
Environmental Monitoring
Assess compliance with environmental standards

Common Scenarios:

  • Testing if pollutant levels exceed safety thresholds
  • Monitoring changes in ecosystem health indicators
  • Evaluating effectiveness of environmental interventions

Typical Tests:

One-sample tests
Trend analysis
Non-parametric tests

Key Considerations:

  • Regulatory compliance
  • Public health impact
  • Measurement uncertainty

Frequently Asked Questions

Common Questions & Misconceptions
Clear explanations of fundamental concepts and common confusions in hypothesis testing

What is the fundamental difference between H₀ and H₁?

The null hypothesis H₀ is the hypothesis we try to challenge (usually representing "no effect" or "no difference"), while the alternative hypothesis H₁ is what we seek evidence to support. In hypothesis testing, we always start from the premise "assume H₀ is true," then see if the data provides strong enough evidence to reject it. This asymmetry reflects the "skepticism" principle in the scientific method.

H_0: \theta \in \Theta_0 \text{ vs } H_1: \theta \in \Theta_1, \quad \Theta_0 \cap \Theta_1 = \emptyset

Why control Type I error instead of Type II error?

This stems from the philosophical foundation of the Neyman-Pearson principle. Type I error (rejecting true H₀) usually has more serious consequences because it means we incorrectly claim to have discovered some effect. Type II error (failing to reject false H₀) merely means we haven't found sufficient evidence. In scientific research, we prefer "better to miss than to wrongly assert."

\alpha = P(\text{Reject } H_0 \mid H_0 \text{ true}) \leq 0.05

What does "fail to reject H₀" mean? Why not say "accept H₀"?

This is one of the most common misunderstandings in hypothesis testing. "Fail to reject H₀" only means the data did not provide strong enough evidence to refute H₀, not that H₀ is necessarily true. It is like a court verdict: "insufficient evidence to convict" is not the same as "proven innocent." We can never prove H₀ is true; we can only say the data are compatible with H₀.

Key Point: Absence of evidence is not evidence of absence

How to choose between one-sided and two-sided tests?

This depends on your research question. If you only care whether the parameter deviates in one direction (e.g., "does the new drug improve efficacy"), use a one-sided test. If you care whether the parameter differs from a value (regardless of direction), use a two-sided test. Principle: decide based on substantive research questions, not data, and determine before seeing the data.

Comparison: One-sided tests have higher power but can only detect differences in one direction; two-sided tests are more conservative but can detect both directions

What is the relationship between P-value and significance level α?

The P-value is the probability of observing the current data or more extreme data under the assumption that H₀ is true. α is the threshold we set beforehand. Decision rule: if P-value < α, reject H₀. Note that the P-value is not "the probability that H₀ is true" (that's a Bayesian posterior probability concept).

\text{P-value} = P(T \geq t_{obs} \mid H_0), \quad \text{Reject } H_0 \text{ if P-value} < \alpha

When to use t-test vs z-test (U-test)?

The key is whether the population variance is known. If population variance σ² is known, use z-test (U-test); if σ² is unknown and needs to be estimated with sample variance, use t-test. In practice, population variance is usually unknown, so t-test is more common. When sample size is large (n > 30), the t-distribution approximates the normal distribution, and results are similar.

U = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1) \quad \text{vs} \quad T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t(n-1)

Why did α = 0.05 become the "standard"?

This is mainly a historical convention rather than a mathematical necessity. R.A. Fisher proposed 0.05 in the 1920s as a threshold for regarding a result as "suspicious," and it later became the conventional standard. In practice, the choice of α should be based on the characteristics of the field and the costs of errors: medical research often uses 0.01 (stricter), while exploratory research may use 0.10 (more lenient). Importantly, α should be fixed before data collection and clearly stated in reports.

Historical Note: Fisher originally described 0.05 as a "convenient approximation," not an absolute standard

What is the connection between hypothesis testing and confidence intervals?

They have a precise duality relationship. At the same significance level, if the parameter value θ₀ falls within the (1-α) confidence interval, then we cannot reject H₀: θ = θ₀ at level α, and vice versa. Confidence intervals provide more information than hypothesis tests: they not only tell us whether to reject a specific value but also give the range of all plausible parameter values.

\theta_0 \in \text{CI}_{1-\alpha} \Leftrightarrow \text{Fail to reject } H_0: \theta = \theta_0 \text{ at level } \alpha