Advanced Theory
4-6 Hours

Sufficient & Complete Statistics

Master the theory of sufficient and complete statistics for optimal estimation

Essential Definitions

Core concepts in sufficient and complete statistics

Sufficient Statistic

A statistic T(X̃) that captures all the information about θ contained in the sample: given T = t, the conditional distribution of X̃ does not depend on θ.

P(\tilde{X} = \tilde{x} \mid T = t; \theta) \text{ is independent of } \theta
Complete Statistic

A statistic T such that the only function of T with zero expectation for every θ is the zero function (with probability 1).

E_{\theta}[\phi(T)] = 0 \;\; \forall \theta \;\Rightarrow\; P_{\theta}(\phi(T) = 0) = 1 \;\; \forall \theta
Factorization Theorem

T(X̃) is sufficient for θ if and only if the joint density can be factored as p(x̃;θ) = g(T(x̃);θ)h(x̃).

p(\tilde{x};\theta) = g(T(\tilde{x});\theta) \times h(\tilde{x})
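
For a concrete reading of the factorization, consider a sample X₁, ..., Xₙ from B(1,p) (the Bernoulli case of the binomial example listed below); this is a standard worked sketch:

p(\tilde{x};p) = \prod_{i=1}^n p^{x_i}(1-p)^{1-x_i} = \underbrace{p^{\sum x_i}(1-p)^{\,n-\sum x_i}}_{g(T(\tilde{x});\,p)} \cdot \underbrace{1}_{h(\tilde{x})}, \qquad T(\tilde{x}) = \sum_{i=1}^n x_i

The first factor depends on the data only through T and the second does not involve p, so T = \sum X_i is sufficient for p.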

Sufficient Statistics

Statistics that capture all parameter information from the sample

Key Properties of Sufficient Statistics
  • A sufficient statistic captures all the information about the parameter contained in the sample
  • Simplifies inference without loss of information about the parameter
  • Reduces data dimensionality while preserving statistical properties
  • Forms the foundation for optimal estimation theory
Sufficient Statistics Examples

Binomial B(n,p): T = \sum X_i is sufficient for p

Normal N(μ,σ²): T = (\sum X_i, \sum X_i^2) is sufficient for (\mu, \sigma^2)

Poisson P(λ): T = \sum X_i is sufficient for \lambda

Uniform U(0,θ): T = X_{(n)} is sufficient for \theta
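
The Poisson entry can also be checked empirically: conditional on T = ΣXᵢ = t, each Xᵢ should follow a Binomial(t, 1/n) law no matter what λ is. The sketch below is a minimal simulation of that check, assuming NumPy is available; the values of n, t, and λ are arbitrary illustration choices.

```python
# A small empirical check (not a proof) that the conditional law of the data
# given T = sum(X_i) does not depend on lambda in the Poisson model.
# Theory: given T = t, each X_i is marginally Binomial(t, 1/n).
import numpy as np

rng = np.random.default_rng(0)
n, t_fixed, reps = 5, 10, 200_000

def conditional_x1_distribution(lam):
    """Empirical pmf of X_1 among simulated samples with sum(X) == t_fixed."""
    samples = rng.poisson(lam, size=(reps, n))
    x1 = samples[samples.sum(axis=1) == t_fixed, 0]
    return np.bincount(x1, minlength=t_fixed + 1) / len(x1)

for lam in (1.5, 3.0):
    print(f"lambda = {lam}:", np.round(conditional_x1_distribution(lam)[:5], 3))
# Both printed rows should be close to Binomial(10, 1/5) probabilities,
# i.e. the same regardless of lambda -- the hallmark of sufficiency.
```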

Rao-Blackwell Theorem
Demonstrates how sufficient statistics improve estimation efficiency

Statement: If T is sufficient for θ and φ(X̃) is an unbiased estimator of g(θ), then:

\hat{g}(T) = E[\phi(\tilde{X}) \mid T]

is also unbiased for g(θ) with \text{Var}_\theta(\hat{g}(T)) \leq \text{Var}_\theta(\phi(\tilde{X}))

Example:

Binomial B(1,p) with sample X₁,...,Xₙ. Original estimator: φ(X̃) = X₁ with Var(X₁) = p(1-p). Sufficient statistic: T = ΣXᵢ. Improved estimator: ĝ(T) = E[X₁|T] = T/n = X̄ with Var(X̄) = p(1-p)/n ≤ Var(X₁).
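
A quick Monte Carlo sanity check of this variance reduction, assuming NumPy; the sample size and p below are arbitrary illustration choices:

```python
# Monte Carlo comparison of the original unbiased estimator X_1 with its
# Rao-Blackwellization E[X_1 | sum X_i] = X_bar in the Bernoulli B(1, p) model.
import numpy as np

rng = np.random.default_rng(1)
n, p, reps = 20, 0.3, 100_000

samples = rng.binomial(1, p, size=(reps, n))
phi = samples[:, 0]                        # original estimator: first observation
rao_blackwellized = samples.mean(axis=1)   # E[X_1 | T] = X_bar

print("Var(X_1)   ≈", phi.var(),               "  theory:", p * (1 - p))
print("Var(X_bar) ≈", rao_blackwellized.var(), "  theory:", p * (1 - p) / n)
```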

Complete Statistics

Statistics for which the only function with zero expectation under every parameter value is the zero function

Complete Statistic Concept
Completeness ensures uniqueness of unbiased estimators based on the statistic

Key Property: If E_{\theta}[\phi(T)] = 0 for all θ ∈ Θ, then P_{\theta}(\phi(T) = 0) = 1 for all θ ∈ Θ

Examples:

  • Binomial B(n,p): T ~ B(n,p) is complete
  • Normal N(μ,σ²): T = (ΣXᵢ, ΣXᵢ²) is complete
  • Uniform U(0,θ): T = X₍ₙ₎ is complete
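
To see what a completeness argument looks like, here is the standard sketch for the binomial entry above (φ as in the definition):

E_p[\phi(T)] = \sum_{t=0}^{n} \phi(t)\binom{n}{t} p^{t}(1-p)^{n-t} = (1-p)^n \sum_{t=0}^{n} \phi(t)\binom{n}{t}\left(\frac{p}{1-p}\right)^{t}

If this vanishes for every p ∈ (0,1), the polynomial in ρ = p/(1-p) has all coefficients equal to zero, so φ(t) = 0 for t = 0, 1, ..., n, and T is complete.
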
Lehmann-Scheffé Theorem
Core theorem for constructing unique UMVUE using sufficient complete statistics

Statement: If S is a sufficient complete statistic for θ and φ(X̃) is an unbiased estimator of g(θ), then:

\hat{g} = E[\phi(\tilde{X}) \mid S] \text{ is the unique UMVUE of } g(\theta)

Applications:

  • Normal N(μ,σ²): X̄ is UMVUE of μ; S² is UMVUE of σ²
  • Poisson P(λ): X̄ is UMVUE of λ
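
As a brief check of the normal application (a standard argument, with S² = Σ(Xᵢ − X̄)²/(n−1)): S² is unbiased, since

E[S^2] = \frac{1}{n-1}\, E\!\left[\sum_{i=1}^n (X_i - \bar{X})^2\right] = \frac{(n-1)\sigma^2}{n-1} = \sigma^2,

and both X̄ and S² are functions of the complete sufficient statistic (ΣXᵢ, ΣXᵢ²), so by Lehmann-Scheffé each is the unique UMVUE of its expectation.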

Basu's Theorem

Establishes independence between sufficient complete statistics and ancillary statistics

Theorem Statement
If T is a sufficient complete statistic for θ and V is an ancillary statistic, then T and V are independent

Mathematical Statement: T \text{ complete sufficient}, V \text{ ancillary} \implies T \perp V

Example:

In N(μ, σ²), (X̄, S²) is sufficient complete for (μ, σ²). Sample skewness is ancillary for (μ, σ²). By Basu's theorem, (X̄, S²) and sample skewness are independent.

Examples and Solutions

Practical applications of sufficient and complete statistics

Example 1: Finding Sufficient Statistic

Problem: For a sample X₁, X₂, ..., Xₙ from Poisson P(λ), find a sufficient statistic for λ.

Solution:

The joint pmf is: p(\tilde{x};\lambda) = \prod_{i=1}^n \frac{\lambda^{x_i} e^{-\lambda}}{x_i!} = \lambda^{\sum x_i} e^{-n\lambda} \Big/ \prod x_i!

By the factorization theorem: g(T;\lambda) = \lambda^{T} e^{-n\lambda} where T = \sum X_i, and h(\tilde{x}) = 1 \big/ \prod x_i!. Therefore, T = \sum X_i is sufficient for λ.
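
The same conclusion can be verified directly from the definition of sufficiency: since T = \sum X_i \sim P(n\lambda), the conditional pmf is

P(\tilde{X} = \tilde{x} \mid T = t) = \frac{\prod_{i=1}^n \lambda^{x_i} e^{-\lambda} / x_i!}{(n\lambda)^{t} e^{-n\lambda} / t!} = \frac{t!}{\prod_{i=1}^n x_i!}\left(\frac{1}{n}\right)^{t}, \qquad \sum_i x_i = t,

a multinomial distribution that does not involve λ.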

Example 2: UMVUE Construction

Problem: For X₁, X₂, ..., Xₙ ~ N(μ, σ²) with known σ², find the UMVUE of μ.

Solution:

Step 1: T = \sum X_i is sufficient and complete for μ (by the factorization theorem and completeness of the normal family).

Step 2: X_1 is an unbiased estimator of μ.

Step 3: By the Lehmann-Scheffé theorem, \hat{\mu} = E[X_1 \mid T] = T/n = \bar{X} is the unique UMVUE of μ.
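
The conditional expectation in Step 3 follows from a symmetry argument: the Xᵢ are exchangeable, so each has the same conditional expectation given T, and therefore

E[X_1 \mid T] = \frac{1}{n} \sum_{i=1}^n E[X_i \mid T] = \frac{1}{n} E\!\left[\sum_{i=1}^n X_i \;\Big|\; T\right] = \frac{T}{n} = \bar{X}.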

Example 3: Applying Basu's Theorem

Problem: For X₁, X₂, ..., Xₙ ~ N(μ, σ²), show that X̄ and S² are independent.

Solution:

Step 1: Fix σ². Then \bar{X} is a sufficient and complete statistic for μ (normal location family with known variance).

Step 2: S² is ancillary for μ: (n-1)S^2/\sigma^2 \sim \chi^2(n-1), so its distribution does not depend on μ.

Step 3: By Basu's theorem, X̄ and S² are independent for each fixed σ², and hence for every (μ, σ²).
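
A quick empirical illustration of this independence, assuming NumPy (a near-zero correlation is consistent with, though not a proof of, independence; μ, σ, and n are arbitrary choices):

```python
# Empirical illustration of Basu's theorem in the normal model: the sample
# mean and the sample variance of the same sample should be independent,
# so their correlation across many replications should be near zero.
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 1.0, 2.0, 10, 100_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)              # sample means
s2 = samples.var(axis=1, ddof=1)         # unbiased sample variances S^2

print("corr(X_bar, S^2) ≈", np.corrcoef(xbar, s2)[0, 1])   # expect ~ 0
```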

Practice Quiz

Test your understanding with 10 multiple-choice questions

1. For a population X ~ N(μ,σ²) with sample X₁, X₂, ..., Xₙ, which statement about the sample mean X̄ is correct?
2. For a population X ~ U(0,θ) with sample X₁, X₂, ..., Xₙ, the maximum likelihood estimator (MLE) is:
3. The mean squared error (MSE) formula is:
4. For a population X ~ B(1,p) with sample X₁, X₂, ..., Xₙ, which statistic is sufficient for p?
5. For a population X ~ P(λ) with sample X₁, X₂, ..., Xₙ, the Cramér-Rao lower bound for unbiased estimators of λ is:
6. In the linear regression model Yᵢ = β₀ + β₁xᵢ + εᵢ, the least squares estimator for β₁ is:
7. If T(X̃) is a sufficient complete statistic for θ and φ(X̃) is an unbiased estimator of g(θ), then E[φ(X̃)|T] is:
8. For a population X ~ E(λ) (exponential distribution) with sample X₁, X₂, ..., Xₙ, the method of moments estimator for λ is:
9. Which statement about efficient estimators is correct?
10. For a population X ~ N(μ,σ²) with sample X₁, X₂, ..., Xₙ, which is an unbiased estimator of σ²?

Frequently Asked Questions

Common questions about sufficient and complete statistics

What is the intuitive understanding of Sufficient Statistics?
Intuitively, a sufficient statistic is like 'lossless compression'. It compresses the raw data into a simpler value (the statistic), but in this process, no information about the unknown parameter θ is lost. If you have the sufficient statistic, the original data is redundant for inferring θ.
Key Point: Lossless compression of the sample's information about θ
How to determine if a statistic is sufficient?
The most commonly used method is the Factorization Theorem. If the joint density function can be factored into two parts: one containing only the statistic T and parameter θ, and another containing only the sample x (independent of θ), then T is sufficient.
f(x \mid \theta) = g(T(x); \theta)\, h(x)
What is a Complete Statistic and why is it important?
Completeness is a uniqueness requirement: if a function g(T) has expectation 0 for all parameter values, then the function itself must be 0 almost surely. Completeness ensures uniqueness in inference based on sufficient statistics, which is used in the Lehmann-Scheffé theorem to determine UMVUE.
E_{\theta}[g(T)] = 0 \;\; \forall\theta \implies P_{\theta}(g(T)=0)=1 \;\; \forall\theta
What is the relationship between Lehmann-Scheffé Theorem and UMVUE?
This theorem combines Rao-Blackwell and completeness, providing a clear method to find UMVUE (Uniformly Minimum Variance Unbiased Estimator): if an unbiased estimator depends only on a complete sufficient statistic, then it is the UMVUE. This is one of the most important results in parameter estimation theory.
Key Point: Complete Sufficient → Uniqueness of UMVUE
What are the practical applications of Basu's Theorem?
Basu's theorem states: if T is a complete sufficient statistic and V is an ancillary statistic, then T and V are independent. This conclusion simplifies many derivations, such as proving that the sample mean and sample variance are independent under normal distribution, without needing to compute complex joint distributions.
Key Point: Complete Sufficient ⊥ Ancillary