Master the fundamental concepts of sufficient and complete statistics, their theoretical foundations, and applications in optimal statistical inference.
A statistic T(X̃) is sufficient for θ if it carries all the information about θ contained in the sample: given T = t, the conditional distribution of X̃ does not depend on θ.
P(X̃ = x̃ | T = t; θ) does not depend on θ
A statistic T is complete if the only function of T with zero expectation under every θ is the zero function (with probability 1).
E_θ[φ(T)] = 0 ∀θ ⇒ P_θ(φ(T) = 0) = 1 ∀θ
T(X̃) is sufficient for θ if and only if the joint density can be factored as p(x̃;θ) = g(T(x̃);θ)h(x̃).
p(x̃;θ) = g(T(x̃);θ) × h(x̃)
Binomial B(n,p): T = ΣXᵢ (total successes) is sufficient for p
Normal N(μ,σ²): T = (ΣXᵢ, ΣXᵢ²) is sufficient for (μ,σ²)
Poisson P(λ): T = ΣXᵢ (total count) is sufficient for λ
Uniform U(0,θ): T = X₍ₙ₎ (maximum order statistic) is sufficient for θ
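The Bernoulli/Binomial case can be sanity-checked numerically: the conditional probability of a particular sample given T = ΣXᵢ works out to 1/C(n,t) for every p. A minimal stdlib-only sketch (the helper `cond_prob_given_T` is ours, not a standard function):

```python
from math import comb

def cond_prob_given_T(x, p):
    """P(X̃ = x̃ | T = t; p) for an i.i.d. Bernoulli(p) sample:
    joint p^t (1-p)^(n-t) divided by P(T = t) = C(n,t) p^t (1-p)^(n-t)."""
    n, t = len(x), sum(x)
    joint = p ** t * (1 - p) ** (n - t)
    marginal = comb(n, t) * p ** t * (1 - p) ** (n - t)
    return joint / marginal

x = [1, 0, 1, 1, 0]                                  # t = 3 successes out of n = 5
probs = [cond_prob_given_T(x, p) for p in (0.1, 0.5, 0.9)]
# every entry equals 1/C(5,3): the conditional law is free of p
```

The p-dependent factor cancels between numerator and denominator, which is exactly the factorization-theorem mechanism at work.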
Rao-Blackwell theorem: If T is sufficient for θ and φ(X̃) is an unbiased estimator of g(θ), then:
ĝ(T) = E[φ(X̃)|T] is also unbiased for g(θ) with Var_θ(ĝ(T)) ≤ Var_θ(φ(X̃))
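The variance reduction from conditioning on T can be seen in a small Monte Carlo sketch. The setup below is our own choice (Poisson model, n = 10, λ = 2): 1{X₁=0} is unbiased for e^{-λ}, and since X₁ | T=t is Binomial(t, 1/n), its Rao-Blackwellization is E[1{X₁=0} | T=t] = ((n-1)/n)ᵗ:

```python
import math
import random
import statistics

def poisson(lam):
    """Knuth's multiplicative Poisson sampler (fine for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

random.seed(0)
n, lam, reps = 10, 2.0, 20000
raw, rb = [], []
for _ in range(reps):
    x = [poisson(lam) for _ in range(n)]
    t = sum(x)
    raw.append(1.0 if x[0] == 0 else 0.0)   # unbiased for e^{-λ}, but crude
    rb.append(((n - 1) / n) ** t)           # E[1{X₁=0} | T = t]: Rao-Blackwellized
# both averages are ≈ e^{-2} ≈ 0.135; conditioning on T cuts the variance sharply
```

Both estimators target the same quantity, but the conditional-expectation version has much smaller spread, as the theorem guarantees.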
Completeness ensures uniqueness of unbiased estimators based on the statistic
If E_θ[φ(T)] = 0 for all θ ∈ Θ, then P_θ(φ(T) = 0) = 1 for all θ ∈ Θ
If E_p[φ(T)] = Σₖ φ(k)C(n,k)pᵏ(1-p)ⁿ⁻ᵏ = 0 for all p ∈ (0,1), dividing by (1-p)ⁿ and substituting θ = p/(1-p) gives the polynomial identity Σₖ φ(k)C(n,k)θᵏ = 0 for all θ > 0. A polynomial vanishing on an interval has all coefficients zero, so φ(k) = 0 for k = 0,…,n.
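The coefficient argument can be checked mechanically: evaluating the polynomial at n+1 distinct values of θ gives a linear system whose matrix has nonzero determinant, so the only solution is φ ≡ 0. A stdlib-only sketch for n = 3 (the tiny `det` helper is ours):

```python
from math import comb

def det(m):
    """Determinant by cofactor expansion along the first row (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

n = 3
thetas = [1, 2, 3, 4]                                 # n+1 distinct values of θ > 0
# Row for θ_j encodes  Σ_k φ(k) C(n,k) θ_j^k = 0
M = [[comb(n, k) * th ** k for k in range(n + 1)] for th in thetas]
d = det(M)
# exact integer arithmetic: d ≠ 0 forces φ(0) = ... = φ(n) = 0,
# i.e. T = ΣXᵢ is complete for the Binomial family
```

The matrix is a scaled Vandermonde matrix, which is why distinct θ values guarantee a nonzero determinant.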
Completeness of T = (ΣXᵢ, ΣXᵢ²) follows from the full-rank exponential-family structure of the normal, together with properties of the gamma distribution: if E[φ(T)] = 0 for all (μ,σ²), then φ must be zero almost everywhere.
If E_θ[φ(T)] = ∫₀^θ φ(t)(ntⁿ⁻¹/θⁿ)dt = 0 for all θ > 0, multiplying by θⁿ gives ∫₀^θ φ(t)ntⁿ⁻¹dt = 0; differentiating with respect to θ then gives φ(θ)nθⁿ⁻¹ = 0, so φ(t) = 0 for almost all t > 0.
For the N(0,σ²) family, T = X₁ is not complete: φ(X₁) = X₁ has E[φ(X₁)] = 0 for all σ², but P(X₁ = 0) = 0 ≠ 1
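A simulation makes this failure of completeness tangible (the parameters σ ∈ {0.5, 1, 2} and 50,000 draws are our choices): the sample mean of X₁ is ≈ 0 under every σ, yet essentially no draw is exactly zero:

```python
import random
import statistics

random.seed(3)
means = []
last_draws = []
for sigma in (0.5, 1.0, 2.0):
    draws = [random.gauss(0.0, sigma) for _ in range(50000)]
    means.append(statistics.fmean(draws))   # E[X₁] = 0 for every σ
    last_draws = draws
# every mean ≈ 0, but P(X₁ = 0) = 0: φ(X₁) = X₁ witnesses incompleteness
nonzero_frac = sum(1 for v in last_draws if v != 0.0) / len(last_draws)
```

A zero-mean function of the statistic that is not almost surely zero is exactly what the definition of completeness rules out.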
Lehmann-Scheffé theorem: If S is a complete sufficient statistic for θ and φ(X̃) is an unbiased estimator of g(θ), then:
ĝ = E[φ(X̃)|S] is the unique UMVUE of g(θ)
If h(S) is a function of a complete sufficient statistic S with E_θ[h(S)] = g(θ), then h(S) is the unique UMVUE of g(θ)
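As a worked instance (our choice of family: U(0,θ) with θ = 3, n = 5): X₍ₙ₎ is complete and sufficient, and E[X₍ₙ₎] = nθ/(n+1), so h(X₍ₙ₎) = (n+1)/n · X₍ₙ₎ is unbiased and therefore the unique UMVUE of θ. A quick simulation confirms the unbiasedness:

```python
import random
import statistics

random.seed(1)
n, theta, reps = 5, 3.0, 40000
ests = []
for _ in range(reps):
    x = [random.uniform(0, theta) for _ in range(n)]
    # X₍ₙ₎ is complete and sufficient; E[X₍ₙ₎] = nθ/(n+1),
    # so (n+1)/n · X₍ₙ₎ is the UMVUE by Lehmann-Scheffé
    ests.append((n + 1) / n * max(x))
est_mean = statistics.fmean(ests)
# est_mean ≈ θ = 3
```

Note the contrast with 2X̄, which is also unbiased for θ but has larger variance; the Lehmann-Scheffé route through X₍ₙ₎ yields the optimal estimator.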
Basu's theorem: If T is a complete sufficient statistic for θ and V is an ancillary statistic (its distribution does not depend on θ), then T and V are independent for every θ ∈ Θ
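The statement above is Basu's theorem. For the N(μ, 1) family, X̄ is complete sufficient for μ and S² is ancillary, so the two are independent; a simulation (our parameters: μ = 1, n = 5) shows their sample correlation is near zero:

```python
import random
import statistics

random.seed(2)
n, reps = 5, 20000
xbars, s2s = [], []
for _ in range(reps):
    x = [random.gauss(1.0, 1.0) for _ in range(n)]  # N(μ, 1), σ known
    xbars.append(statistics.fmean(x))               # complete sufficient for μ
    s2s.append(statistics.variance(x))              # ancillary: its law is free of μ
# sample correlation between X̄ and S², computed by hand
mx, ms = statistics.fmean(xbars), statistics.fmean(s2s)
cov = sum((a - mx) * (b - ms) for a, b in zip(xbars, s2s)) / reps
r = cov / (statistics.pstdev(xbars) * statistics.pstdev(s2s))
# Basu: X̄ ⟂ S², so r ≈ 0
```

This is the classical route to proving the independence of the sample mean and sample variance under normality without any direct distributional computation.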
| Aspect | Sufficient Statistics | Complete Statistics |
|---|---|---|
| Core Purpose | Captures all parameter information in the sample | Ensures uniqueness of unbiased functions |
| Mathematical Criterion | Factorization theorem: p(x̃;θ) = g(T(x̃);θ)h(x̃) | Zero-expectation functions vanish: E_θ[φ(T)] = 0 ∀θ ⟹ φ(T) = 0 a.s. |
| Statistical Role | Data reduction without information loss | Uniqueness guarantee for optimal estimators |
| Relationship | Sufficient ⇏ complete (e.g., U(θ-1/2, θ+1/2)) | Complete ⇏ sufficient (depends on the family) |
| Combined Power | With completeness: enables the Lehmann-Scheffé theorem | With sufficiency: yields the unique UMVUE |