Master the theory of sufficient and complete statistics for optimal estimation
Core concepts in sufficient and complete statistics
A statistic T(X̃) that carries all the information about θ contained in the sample. Given T = t, the conditional distribution of X̃ is independent of θ.
A statistic T for which the only function of T with zero expectation for all θ is the zero function (with probability 1).
T(X̃) is sufficient for θ if and only if the joint density can be factored as p(x̃;θ) = g(T(x̃);θ)h(x̃).
Statistics that capture all parameter information from the sample
Binomial B(n,p): T = ΣᵢXᵢ is sufficient for p
Normal N(μ,σ²): T = (ΣᵢXᵢ, ΣᵢXᵢ²) is sufficient for (μ,σ²)
Poisson P(λ): T = ΣᵢXᵢ is sufficient for λ
Uniform U(0,θ): T = X₍ₙ₎ = max(X₁,…,Xₙ) is sufficient for θ
T(X̃) is sufficient for θ if and only if the joint density/mass function can be written as:

p(x̃;θ) = g(T(x̃);θ) · h(x̃)

where g depends on the data only through T(x̃) and h(x̃) does not depend on θ.
Problem:
Given a random sample X₁, …, Xₙ from the Poisson distribution P(λ), use the Factorization Theorem to find a sufficient statistic for λ.
Solution:
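A minimal derivation, assuming the Poisson sample above: write the joint pmf and split off everything that involves λ.

$$p(\tilde{x};\lambda)=\prod_{i=1}^{n}\frac{e^{-\lambda}\lambda^{x_i}}{x_i!}=\underbrace{e^{-n\lambda}\,\lambda^{\sum_i x_i}}_{g(T(\tilde{x});\,\lambda)}\cdot\underbrace{\prod_{i=1}^{n}\frac{1}{x_i!}}_{h(\tilde{x})}$$

By the Factorization Theorem, T(X̃) = ΣᵢXᵢ is sufficient for λ.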
Key Insight:
For Poisson distributions, the sum of observations contains all the information about λ. The individual values and their factorial terms provide no additional information beyond the sum.
Theorem Statement:
If T is sufficient for θ and φ(X̃) is an unbiased estimator of g(θ), then:
φ*(T) = E[φ(X̃) | T] is also unbiased for g(θ), with Var_θ(φ*(T)) ≤ Var_θ(φ(X̃)) for all θ.
Proof:
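The argument is two applications of conditioning: unbiasedness follows from the law of iterated expectations, and the variance inequality from the law of total variance.

$$E_\theta[\varphi^*(T)]=E_\theta\big[E[\varphi(\tilde{X})\mid T]\big]=E_\theta[\varphi(\tilde{X})]=g(\theta)$$

$$\mathrm{Var}_\theta(\varphi(\tilde{X}))=\mathrm{Var}_\theta\big(E[\varphi\mid T]\big)+E_\theta\big[\mathrm{Var}(\varphi\mid T)\big]\;\ge\;\mathrm{Var}_\theta(\varphi^*(T))$$

Sufficiency is what makes φ*(T) a legitimate estimator: E[φ(X̃) | T] does not involve θ.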
Problem:
For a random sample with sufficient statistic T(X̃), start with a crude unbiased estimator φ(X̃) and use Rao-Blackwell to improve it by conditioning on T(X̃).
Solution:
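As a minimal illustration (the Poisson family is assumed here for concreteness): take X₁, …, Xₙ i.i.d. P(λ), the crude unbiased estimator φ = X₁, and the sufficient statistic T = ΣᵢXᵢ. Given T = t, the observations are exchangeable and sum to t, so each has conditional mean t/n:

$$\varphi^*(T)=E[X_1\mid T]=\frac{T}{n}=\bar{X}$$

The improved estimator is the sample mean, which is the MLE of λ and attains the Cramér-Rao bound.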
Key Insight:
Rao-Blackwell can transform even a crude unbiased estimator with infinite variance into an efficient estimator such as the MLE. Always condition on sufficient statistics to improve estimators.
Statistics that ensure uniqueness of unbiased estimators
Completeness ensures uniqueness of unbiased estimators based on the statistic
Problem:
Show that T = ΣᵢXᵢ is complete for p when X₁, …, Xₙ are i.i.d. Bernoulli(p) with 0 < p < 1.
Solution:
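A sketch, assuming the Bernoulli setup above (so T = ΣᵢXᵢ ~ B(n,p)): suppose E_p[g(T)] = 0 for all p ∈ (0,1). Then

$$\sum_{t=0}^{n}g(t)\binom{n}{t}p^{t}(1-p)^{n-t}=(1-p)^{n}\sum_{t=0}^{n}g(t)\binom{n}{t}\rho^{t}=0,\qquad\rho=\frac{p}{1-p}$$

As p ranges over (0,1), ρ ranges over (0,∞), so the polynomial in ρ vanishes identically, every coefficient g(t)·C(n,t) is zero, and hence g(t) = 0 for t = 0, …, n. Thus g(T) = 0 with probability 1, proving completeness.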
Key Insight:
The completeness proof relies on the fact that a polynomial that is identically zero must have all zero coefficients. This technique works for many discrete exponential family distributions.
Theorem Statement:
If S is a sufficient complete statistic for θ and φ(X̃) is an unbiased estimator of g(θ), then:
φ*(S) = E[φ(X̃) | S] is the unique UMVUE of g(θ).
Proof:
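A short sketch combining the two earlier results: by Rao-Blackwell, φ*(S) = E[φ(X̃) | S] is unbiased for g(θ) with variance no larger than that of φ. If h₁(S) and h₂(S) are both unbiased for g(θ), then E_θ[h₁(S) − h₂(S)] = 0 for all θ, and completeness forces h₁(S) = h₂(S) almost surely. So there is essentially one unbiased estimator that is a function of S, and Rao-Blackwellizing any unbiased estimator lands on it; hence φ*(S) is the unique UMVUE.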
Corollary:
If h(S) is a function of a sufficient complete statistic S with E_θ[h(S)] = g(θ), then h(S) is the unique UMVUE of g(θ).
Problem:
For a random sample X₁, …, Xₙ from N(μ,σ²) with both parameters unknown, find the UMVUE for σ².
Solution:
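A sketch of the standard argument: for N(μ,σ²), the pair (X̄, S²) is a complete sufficient statistic for (μ,σ²) (a full-rank exponential family over an open parameter set). The sample variance

$$S^{2}=\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^{2},\qquad E_{\mu,\sigma^{2}}[S^{2}]=\sigma^{2}$$

is an unbiased function of the complete sufficient statistic, so by Lehmann-Scheffé it is the unique UMVUE of σ².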
Key Insight:
The sample variance is not just an unbiased estimator; it is the unique UMVUE of σ². This demonstrates the power of the Lehmann-Scheffé theorem in identifying optimal estimators.
Basu's Theorem: If T is a sufficient complete statistic for θ and V is an ancillary statistic (its distribution does not depend on θ), then T and V are independent for all θ ∈ Θ.
Step-by-step mathematical derivations of fundamental theorems
Let X have pdf/pmf f(x|θ). A statistic T(X) is sufficient for θ if and only if f(x|θ) = g(T(x)|θ)h(x) for some functions g and h.
This theorem allows us to find sufficient statistics by simple inspection of the density function.
Assume the factorization holds. We show T is sufficient by computing the conditional probability P(X=x | T=t).
If T(x) ≠ t, this probability is 0. If T(x) = t, substitute f(x|θ) = g(t|θ)h(x) into both numerator and denominator.
The term g(t|θ) factors out of the sum and cancels with the numerator.
The result depends only on x and h(x), not on θ. Thus, T is sufficient.
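Explicitly, in the discrete case:

$$P_\theta(X=x\mid T=t)=\frac{f(x\mid\theta)}{\sum_{y:T(y)=t}f(y\mid\theta)}=\frac{g(t\mid\theta)h(x)}{g(t\mid\theta)\sum_{y:T(y)=t}h(y)}=\frac{h(x)}{\sum_{y:T(y)=t}h(y)}$$

which is free of θ.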
Assume T is sufficient. Then P(X=x|T=t) is independent of θ. Let this be k(x,t).
Write the joint probability as conditional × marginal. Define g(t|θ) = P(T=t|θ) and h(x) = P(X=x|T=T(x)).
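Then, for any x with T(x) = t:

$$f(x\mid\theta)=P_\theta(X=x)=P_\theta(T=t)\,P(X=x\mid T=t)=g(t\mid\theta)\,h(x)$$

The conditional factor carries no θ precisely because T is sufficient, which completes the factorization.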
If T is a complete sufficient statistic for θ, and V is an ancillary statistic, then T and V are independent.
Powerful tool for proving independence without finding joint distributions.
Let A be any event involving V (e.g., V ∈ B). Let η(t) = P(V ∈ B | T=t).
Since T is sufficient, the conditional distribution of X given T is independent of θ. Since V is a function of X, its conditional distribution given T is also independent of θ.
Consider the expectation of η(T) over T. By the law of iterated expectations: E_θ[η(T)] = E_θ[P(V ∈ B | T)] = P_θ(V ∈ B).
Since V is ancillary, P(V ∈ B) is a constant c independent of θ.
Consider the function g(T) = η(T) - c. Its expectation is E[η(T) - c] = c - c = 0 for all θ.
Since T is complete, g(T) must be zero almost surely; thus η(T) = c a.s. That is, P(V ∈ B | T) = P(V ∈ B) for every event B, so T and V are independent.
| Aspect | Sufficient Statistics | Complete Statistics |
|---|---|---|
| Core Purpose | Capture all information about θ contained in the sample | Ensure uniqueness of unbiased estimators based on the statistic |
| Mathematical Criterion | Conditional distribution of X̃ given T = t is independent of θ (equivalently, p(x̃;θ) = g(T(x̃);θ)h(x̃)) | E_θ[g(T)] = 0 for all θ implies g(T) = 0 with probability 1 |
| Statistical Role | Lossless data reduction; conditioning on T improves estimators (Rao-Blackwell) | At most one function of T is unbiased for a given g(θ) (Lehmann-Scheffé) |
| Independence | Sufficiency alone does not guarantee independence from ancillary statistics | Complete sufficient statistics are independent of every ancillary statistic (Basu) |
| Combined Power | Conditioning an unbiased estimator on T never increases variance | An unbiased function of a complete sufficient statistic is the unique UMVUE |
How to apply sufficient and complete statistics in practice
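A minimal simulation sketch (assumptions: NumPy and the Poisson example from the Rao-Blackwell section; the variable names are illustrative, not from the original text). It checks numerically that conditioning the crude estimator X₁ on the sufficient statistic ΣᵢXᵢ preserves unbiasedness while shrinking variance:

```python
import numpy as np

rng = np.random.default_rng(0)

lam, n, reps = 3.0, 10, 100_000  # true λ, sample size, Monte Carlo replications

# Draw `reps` independent samples of size n from Poisson(λ).
samples = rng.poisson(lam, size=(reps, n))

# Crude unbiased estimator of λ: the first observation alone.
crude = samples[:, 0].astype(float)

# Rao-Blackwellized estimator: E[X1 | sum(X)] = sum(X)/n, the sample mean.
improved = samples.mean(axis=1)

print(f"crude:         mean={crude.mean():.4f}  var={crude.var():.4f}")
print(f"Rao-Blackwell: mean={improved.mean():.4f}  var={improved.var():.4f}")
# Both means come out close to λ (unbiasedness preserved), while the
# variance drops from about λ to about λ/n, matching Var(X̄) = λ/n.
```

With n = 10 the Rao-Blackwellized variance is roughly one tenth of the crude one, exactly the improvement the theorem guarantees.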
Common questions about sufficient and complete statistics