Advanced Theory
4-6 Hours

Sufficient & Complete Statistics

Master the theory of sufficient and complete statistics for optimal estimation

Learning Objectives
  • Master the concepts of sufficient statistics and their role in statistical inference
  • Understand the Fisher-Neyman Factorization Theorem and its applications
  • Learn about complete statistics and their importance in optimal estimation
  • Apply the Rao-Blackwell and Lehmann-Scheffé theorems in practice
  • Explore the relationship between sufficiency and completeness
  • Understand Basu's theorem and independence properties

Essential Definitions

Core concepts in sufficient and complete statistics

Sufficient Statistic

A statistic T(X̃) that contains all information about θ contained in the sample. Given T=t, the conditional distribution of X̃ is independent of θ.

P(\tilde{X} = \tilde{x} \mid T = t; \theta) \text{ is independent of } \theta
Complete Statistic

A statistic T for which the only function of T with expectation zero for every θ is the zero function (with probability 1).

E_\theta[\phi(T)] = 0 \; \forall \theta \Rightarrow P_\theta(\phi(T) = 0) = 1 \; \forall \theta
Factorization Theorem

T(X̃) is sufficient for θ if and only if the joint density can be factored as p(x̃;θ) = g(T(x̃);θ)h(x̃).

p(\tilde{x};\theta) = g(T(\tilde{x});\theta) \times h(\tilde{x})

Sufficient Statistics

Statistics that capture all parameter information from the sample

Concept of Sufficient Statistics
A sufficient statistic captures all the information about the parameter contained in the sample

Key Properties:

Simplifies inference without loss of information about the parameter
Reduces data dimensionality while preserving statistical properties
Forms the foundation for optimal estimation theory

Examples:

Binomial B(n,p): T = \sum X_i is sufficient for p

Normal N(μ,σ²): T = (\sum X_i, \sum X_i^2) is sufficient for (\mu, \sigma^2)

Poisson P(λ): T = \sum X_i is sufficient for \lambda

Uniform U(0,θ): T = X_{(n)} is sufficient for \theta
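
The claim behind these examples can be checked empirically. The sketch below (assuming NumPy is available; the sample size, the values p = 0.2 and 0.7, and the seed are arbitrary illustrative choices) conditions Bernoulli samples on T = ΣXᵢ and shows that the conditional distribution of the full sample pattern is the same for different p, which is exactly what sufficiency of T asserts.

import numpy as np

# Minimal sketch: for Bernoulli(p) samples, the conditional distribution of the
# sample pattern given T = sum(X_i) should not depend on p (sufficiency of T).
rng = np.random.default_rng(0)
n, t, reps = 3, 1, 200_000

for p in (0.2, 0.7):
    samples = rng.binomial(1, p, size=(reps, n))
    keep = samples[samples.sum(axis=1) == t]          # condition on T = t
    patterns, counts = np.unique(keep, axis=0, return_counts=True)
    freq = counts / counts.sum()
    print(f"p = {p}: conditional frequencies of patterns with T = {t}: {np.round(freq, 3)}")
# Both runs give roughly (1/3, 1/3, 1/3): the conditional law of the sample
# given T is free of p, exactly what sufficiency of T = sum(X_i) asserts.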

Fisher-Neyman Factorization Theorem
The fundamental criterion for identifying sufficient statistics

Theorem Statement:

T(X̃) is sufficient for θ if and only if the joint density/mass function can be written as:

p(\tilde{x};\theta) = g(T(\tilde{x});\theta) \times h(\tilde{x})

Components:

1. g(T(x̃);θ): depends on the data only through T(x̃) and on the parameter θ
2. h(x̃): depends on the data x̃ but is independent of the parameter θ

Detailed Examples:

Normal N(μ,σ²)
Joint Density:
p(\tilde{x};\mu,\sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\{-\sum(x_i-\mu)^2/(2\sigma^2)\}
Factorization:
g(T;\mu,\sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\{\mu\sum x_i/\sigma^2 - n\mu^2/(2\sigma^2) - \sum x_i^2/(2\sigma^2)\}, \quad h(\tilde{x}) = 1
Sufficient Statistic:
T = (\sum x_i, \sum x_i^2)
Poisson P(λ)
Joint Density:
p(\tilde{x};\lambda) = \lambda^{\sum x_i} e^{-n\lambda} / \prod (x_i!)
Factorization:
g(T;\lambda) = \lambda^{T} e^{-n\lambda}, \quad h(\tilde{x}) = 1/\prod (x_i!)
Sufficient Statistic:
T = \sum x_i
Uniform U(0,θ)
Joint Density:
p(\tilde{x};\theta) = \theta^{-n} I\{0 \leq x_{(1)} \leq x_{(n)} \leq \theta\}
Factorization:
g(T;\theta) = \theta^{-n} I\{T \leq \theta\}, \quad h(\tilde{x}) = I\{0 \leq x_{(1)} \leq T\}
Sufficient Statistic:
T = X_{(n)}
Example: Finding Sufficient Statistic for Poisson Distribution

Problem:

Given a random sample X_1, \ldots, X_n \sim \text{Poisson}(\lambda), use the Factorization Theorem to find a sufficient statistic for \lambda.

Solution:

  1. Write the joint probability mass function:
    p(\mathbf{x}; \lambda) = \prod_{i=1}^n \frac{\lambda^{x_i} e^{-\lambda}}{x_i!}
  2. Simplify the product:
    = \frac{\lambda^{\sum_{i=1}^n x_i} e^{-n\lambda}}{\prod_{i=1}^n x_i!}
  3. Factor into g and h:
    = \underbrace{\lambda^{\sum x_i} e^{-n\lambda}}_{g(T(\mathbf{x}); \lambda)} \times \underbrace{\frac{1}{\prod x_i!}}_{h(\mathbf{x})}
    where T(\mathbf{x}) = \sum_{i=1}^n x_i
  4. Conclusion: Since the joint PMF factors as g(T(\mathbf{x}); \lambda) h(\mathbf{x}), where g depends on the data only through T = \sum X_i and h is independent of \lambda, by the Factorization Theorem, T = \sum_{i=1}^n X_i is sufficient for \lambda.

Key Insight:

For Poisson distributions, the sum of observations contains all information about \lambda. The individual values and their factorial terms don't provide additional information beyond the sum.
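
As a quick numerical cross-check of the factorization (not part of the original derivation; SciPy is assumed available and the two data vectors are arbitrary), the likelihood ratio of two Poisson samples with the same total should not change with λ, because the λ-dependent factor g(Σxᵢ; λ) cancels:

import numpy as np
from scipy.stats import poisson

# Sketch: two samples with the same total. By the factorization
# p(x; lam) = g(sum x; lam) * h(x), their likelihood ratio equals
# h(x)/h(y) and therefore should not change with lambda.
x = np.array([0, 2, 5, 1])   # sum = 8 (arbitrary illustrative data)
y = np.array([2, 2, 2, 2])   # sum = 8 as well

for lam in (0.5, 1.0, 3.0, 10.0):
    ratio = poisson.pmf(x, lam).prod() / poisson.pmf(y, lam).prod()
    print(f"lambda = {lam:5.1f}  likelihood ratio = {ratio:.6f}")
# The ratio is the same for every lambda: all dependence on lambda enters
# through g(sum x_i; lambda), which cancels because the sums agree.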

Rao-Blackwell Theorem
Demonstrates how sufficient statistics improve estimation efficiency

Theorem Statement:

If T is sufficient for θ and φ(X̃) is an unbiased estimator of g(θ), then:

\hat{g}(T) = E[\phi(\tilde{X}) \mid T]

is also unbiased for g(θ) with

\text{Var}_\theta(\hat{g}(T)) \leq \text{Var}_\theta(\phi(\tilde{X}))

Proof:

  1. Step 1 (Verify Unbiasedness): We first show that \hat{\theta}^* = E[\hat{\theta} \mid T] is unbiased. Because T is sufficient, the conditional distribution of \hat{\theta} given T does not involve \theta, so \hat{\theta}^* is a genuine statistic. By the law of iterated expectations:
    E[\hat{\theta}^*] = E[E[\hat{\theta} \mid T]] = E[\hat{\theta}]
    Since \hat{\theta} is unbiased for \theta, we have E[\hat{\theta}] = \theta, thus:
    E[\hat{\theta}^*] = \theta
  2. Step 2 (Law of Total Variance): Recall the variance decomposition formula:
    \text{Var}(X) = E[\text{Var}(X \mid Y)] + \text{Var}(E[X \mid Y])
    Applying this to \hat{\theta} conditioned on T:
    \text{Var}(\hat{\theta}) = E[\text{Var}(\hat{\theta} \mid T)] + \text{Var}(E[\hat{\theta} \mid T])
  3. Step 3 (Substitute Improved Estimator): Recognize that by definition:
    E[\hat{\theta} \mid T] = \hat{\theta}^*
    Substituting into the variance decomposition:
    \text{Var}(\hat{\theta}) = E[\text{Var}(\hat{\theta} \mid T)] + \text{Var}(\hat{\theta}^*)
  4. Step 4 (Non-negativity of Conditional Variance): By fundamental properties of variance, conditional variance is always non-negative:
    \text{Var}(\hat{\theta} \mid T) \geq 0 \quad \text{for all } T
    Taking expectations on both sides:
    E[\text{Var}(\hat{\theta} \mid T)] \geq 0
  5. Step 5 (Derive Variance Inequality): From Step 3, rearrange to isolate \text{Var}(\hat{\theta}^*):
    \text{Var}(\hat{\theta}^*) = \text{Var}(\hat{\theta}) - E[\text{Var}(\hat{\theta} \mid T)]
    Since E[\text{Var}(\hat{\theta} \mid T)] \geq 0 from Step 4:
    \text{Var}(\hat{\theta}^*) \leq \text{Var}(\hat{\theta})
  6. Step 6 (Characterize Equality): Equality holds when:
    E[\text{Var}(\hat{\theta} \mid T)] = 0
    Since \text{Var}(\hat{\theta} \mid T) \geq 0, this requires:
    \text{Var}(\hat{\theta} \mid T) = 0 \quad \text{almost surely}
  7. Step 7 (Zero Variance Implies Constant): A random variable with zero conditional variance is constant given the conditioning variable:
    \text{Var}(\hat{\theta} \mid T) = 0 \quad \Rightarrow \quad \hat{\theta} = E[\hat{\theta} \mid T] = \hat{\theta}^*
    This means \hat{\theta} is already a function of the sufficient statistic T alone.
  8. Step 8 (Conclusion): We have proven:
    E[\hat{\theta}^*] = \theta \quad \text{and} \quad \text{Var}(\hat{\theta}^*) \leq \text{Var}(\hat{\theta})
    with equality if and only if \hat{\theta} is already a function of T alone. \blacksquare

Key Implications:

Sufficient statistics allow variance reduction without bias
Optimal unbiased estimators must be functions of sufficient statistics
Provides systematic method for improving estimators

Practical Example:

Setup: Binomial B(1,p) with sample X₁,...,Xₙ
Original Estimator: φ(X̃) = X₁ with Var(X₁) = p(1-p)
Sufficient Statistic: T = ΣXᵢ is sufficient for p
Improved Estimator: ĝ(T) = E[X₁|T] = T/n = X̄ with Var(X̄) = p(1-p)/n ≤ Var(X₁)
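
A small Monte Carlo sketch of this Bernoulli example (NumPy assumed; n, p, and the seed are arbitrary choices) confirms that both estimators are unbiased while the Rao-Blackwellized version has roughly 1/n of the variance:

import numpy as np

# Sketch of the Bernoulli example: the crude unbiased estimator X_1 versus its
# Rao-Blackwellized version E[X_1 | T] = T/n = X_bar.
rng = np.random.default_rng(1)
n, p, reps = 20, 0.3, 100_000

samples = rng.binomial(1, p, size=(reps, n))
crude = samples[:, 0]                 # phi(X) = X_1
improved = samples.mean(axis=1)       # E[X_1 | T] = T/n

print("E[X_1]     ≈", crude.mean(),    " (target p =", p, ")")
print("E[X_bar]   ≈", improved.mean())
print("Var(X_1)   ≈", crude.var(),     " vs p(1-p)   =", p * (1 - p))
print("Var(X_bar) ≈", improved.var(),  " vs p(1-p)/n =", p * (1 - p) / n)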
Example: Improving Estimator via Rao-Blackwell

Problem:

For X_1, \ldots, X_n \sim \text{Exponential} with mean \theta = 1/\lambda, start with the crude unbiased estimator \hat{\theta}_1 = X_1. Use Rao-Blackwell to improve it with the sufficient statistic T = \sum X_i.

Solution:

  1. Verify unbiasedness of the initial estimator:
    E[X_1] = \int_0^\infty x \cdot \frac{1}{\theta} e^{-x/\theta} \, dx = \theta
    so \hat{\theta}_1 = X_1 is unbiased, but it uses only one observation and has variance \text{Var}(X_1) = \theta^2.
  2. Identify the sufficient statistic: T = \sum_{i=1}^n X_i \sim \Gamma(n, 1/\theta), which is sufficient (and complete) for \theta.
  3. Apply Rao-Blackwell:
    \hat{\theta}^* = E[X_1 \mid T]
    By symmetry, X_1, \ldots, X_n are exchangeable given T:
    E[X_i \mid T] = E[X_j \mid T] \text{ for all } i, j
  4. Use linearity:
    n \cdot E[X_1 \mid T] = E\left[\sum_{i=1}^n X_i \mid T\right] = T
    so the improved estimator is:
    \hat{\theta}^* = \frac{T}{n} = \bar{X}
    which is also the MLE of \theta.
  5. Variance comparison:
    \text{Var}(X_1) = \theta^2, \qquad \text{Var}(\bar{X}) = \frac{\theta^2}{n}
    so \bar{X} attains the Cramér-Rao lower bound for \theta.
  6. Remark on the rate \lambda = 1/\theta: no unbiased estimator of \lambda exists based on X_1 alone (and 1/\bar{X} is biased); the UMVUE of \lambda is (n-1)/T, the unbiased function of the complete sufficient statistic given by Lehmann-Scheffé.

Key Insight:

Rao-Blackwell turns a crude unbiased estimator that uses a single observation into the efficient estimator \bar{X}, which attains the Cramér-Rao lower bound. Always condition on a sufficient statistic to improve an unbiased estimator.
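
The conditional-expectation step can also be checked by simulation. In the sketch below (NumPy assumed; the mean values, sample size, and seed are arbitrary), X₁/T has mean close to 1/n whatever the true mean θ is, which is the empirical counterpart of E[X₁ | T] = T/n, and the variances match θ² and θ²/n:

import numpy as np

# Sketch for the exponential example: check that E[X_1 | T] = T/n by verifying
# that X_1 / T has mean 1/n whatever the true mean theta is, and compare the
# variances of the crude estimator X_1 and the improved estimator X_bar.
rng = np.random.default_rng(2)
n, reps = 10, 200_000

for theta in (0.5, 4.0):                       # true mean 1/lambda
    x = rng.exponential(theta, size=(reps, n))
    T = x.sum(axis=1)
    print(f"theta = {theta}:")
    print("  mean of X_1 / T ≈", (x[:, 0] / T).mean(), " (expect 1/n =", 1 / n, ")")
    print("  Var(X_1)        ≈", x[:, 0].var(),        " (theory theta^2   =", theta**2, ")")
    print("  Var(X_bar)      ≈", (T / n).var(),        " (theory theta^2/n =", theta**2 / n, ")")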

Complete Statistics

Statistics that ensure uniqueness of unbiased estimators

Complete Statistics
Statistics for which any function with zero expectation for all θ must be the zero function

Intuitive Understanding:

Completeness ensures uniqueness of unbiased estimators based on the statistic

E_\theta[\phi(T)] = 0 \text{ for all } \theta \in \Theta \implies P_\theta(\phi(T) = 0) = 1 \text{ for all } \theta \in \Theta

Statistical Significance:

Guarantees uniqueness of UMVUE when combined with sufficiency
Eliminates non-zero functions with zero expectation
Essential for Lehmann-Scheffé theorem applications
Examples of Complete Statistics
Binomial B(n,p), 0 < p < 1
Statistic: T ~ B(n,p)
Proof Outline:
If E_p[\phi(T)] = \sum \phi(k)\binom{n}{k}p^k(1-p)^{n-k} = 0 for all p \in (0,1), substituting \theta = p/(1-p) gives the polynomial \sum \phi(k)\binom{n}{k}\theta^k = 0. Since the polynomial is zero for all \theta > 0, all coefficients must be zero, so \phi(k) = 0.
Conclusion: T is complete
Normal N(μ,σ²), μ ∈ ℝ, σ > 0
Statistic: T = (ΣXᵢ, ΣXᵢ²)
Proof Outline:
N(μ,σ²) with μ ∈ ℝ, σ > 0 is a full-rank two-parameter exponential family, and the natural sufficient statistic of a full-rank exponential family is complete; hence T = (ΣXᵢ, ΣXᵢ²) is complete.
Conclusion: T is complete
Uniform U(0,θ), θ > 0
Statistic: T = X₍ₙ₎ with density p_T(t;θ) = nt^{n-1}/θ^n, 0 < t < θ
Proof Outline:
If E_\theta[\phi(T)] = \int_0^\theta \phi(t)\frac{nt^{n-1}}{\theta^n}dt = 0 for all \theta > 0, then \int_0^\theta \phi(t)\,nt^{n-1}\,dt = 0 for all \theta > 0; differentiating with respect to \theta gives \phi(\theta)\,n\theta^{n-1} = 0, so \phi(t) = 0 for almost all t > 0.
Conclusion: T is complete
Non-Complete Example: Normal N(0,σ²), σ > 0
Statistic: X₁
Counterexample:

φ(X₁) = X₁ has E[φ(X₁)] = 0 for all σ², but P(X₁ = 0) = 0 ≠ 1

Conclusion: X₁ is not complete (the family of distributions of X₁ is not complete)
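
The counterexample is easy to see numerically. This sketch (NumPy assumed; the σ values and seed are arbitrary) shows that φ(X₁) = X₁ averages to roughly zero under every σ even though it is never the zero statistic:

import numpy as np

# Sketch of the non-completeness counterexample: phi(X_1) = X_1 has expectation 0
# under N(0, sigma^2) for every sigma, yet P(X_1 = 0) = 0, so phi is not the
# zero function and X_1 cannot be complete for this family.
rng = np.random.default_rng(3)
for sigma in (0.5, 1.0, 5.0):
    x1 = rng.normal(0.0, sigma, size=500_000)
    print(f"sigma = {sigma}: E[phi(X_1)] ≈ {x1.mean():+.4f},  P(phi(X_1) = 0) = {np.mean(x1 == 0):.4f}")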
Example: Proving Completeness for Binomial Distribution

Problem:

Show that T = \sum_{i=1}^n X_i is complete for p when X_1, \ldots, X_n \sim \text{Binomial}(1, p) with 0 < p < 1.

Solution:

  1. Set up the expectation condition: Suppose E_p[\phi(T)] = 0 for all p \in (0, 1). We need to show P_p(\phi(T) = 0) = 1.
  2. Write the expectation explicitly:
    E_p[\phi(T)] = \sum_{k=0}^n \phi(k) \binom{n}{k} p^k (1-p)^{n-k} = 0
    for all p \in (0, 1).
  3. Substitute parameter transformation: Let \theta = \frac{p}{1-p}, so p = \frac{\theta}{1+\theta} and 1-p = \frac{1}{1+\theta}.
  4. Rewrite the expectation:
    E_p[\phi(T)] = \sum_{k=0}^n \phi(k) \binom{n}{k} \left(\frac{\theta}{1+\theta}\right)^k \left(\frac{1}{1+\theta}\right)^{n-k}
    = \frac{1}{(1+\theta)^n} \sum_{k=0}^n \phi(k) \binom{n}{k} \theta^k = 0
  5. Since the denominator is positive:
    \sum_{k=0}^n \phi(k) \binom{n}{k} \theta^k = 0
    for all \theta > 0.
  6. Polynomial argument: The left-hand side is a polynomial in \theta of degree at most n. If this polynomial is zero for all \theta > 0, then all its coefficients must be zero:
    \phi(k) \binom{n}{k} = 0 \quad \text{for all } k = 0, 1, \ldots, n
  7. Since binomial coefficients are positive:
    \phi(k) = 0 \quad \text{for all } k = 0, 1, \ldots, n
  8. Conclusion: Therefore P_p(\phi(T) = 0) = 1 for all p \in (0, 1), which proves that T is complete.

Key Insight:

The completeness proof relies on the fact that a polynomial that is identically zero must have all zero coefficients. This technique works for many discrete exponential family distributions.
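
The polynomial step can also be checked symbolically. The sketch below (SymPy assumed; n = 4 is an arbitrary illustrative choice) builds the polynomial Σ φ(k) C(n,k) θ^k and confirms that forcing it to vanish identically forces every φ(k) to zero:

import sympy as sp

# Symbolic sketch of the polynomial argument (SymPy assumed available).
# E_p[phi(T)] = 0 for all p becomes sum_k phi(k) C(n,k) theta^k = 0 for all theta > 0;
# a polynomial vanishing identically must have all coefficients zero.
n = 4
theta = sp.symbols('theta', positive=True)
phi = sp.symbols(f'phi0:{n + 1}')                 # phi(0), ..., phi(n)

poly = sp.expand(sum(phi[k] * sp.binomial(n, k) * theta**k for k in range(n + 1)))
coeffs = sp.Poly(poly, theta).all_coeffs()[::-1]  # coefficient of theta^k sits at index k
print(coeffs)                                     # [phi0, 4*phi1, 6*phi2, 4*phi3, phi4]
print(sp.solve(coeffs, phi))                      # forces every phi(k) to be 0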

Lehmann-Scheffé Theorem
Core theorem for constructing unique UMVUE using sufficient complete statistics

Theorem Statement:

If S is a sufficient complete statistic for θ and φ(X̃) is an unbiased estimator of g(θ), then:

\hat{g} = E[\phi(\tilde{X}) \mid S] \text{ is the unique UMVUE of } g(\theta)

Proof:

  1. Step 1 (Strategy): The proof is written for estimating \theta itself; the general case g(\theta) is identical. Let \hat{\theta} = E[\phi(\tilde{X}) \mid T] be the candidate estimator, where T is the sufficient complete statistic, and suppose \tilde{\theta} is any other unbiased estimator of \theta. We will show that \text{Var}(\hat{\theta}) \leq \text{Var}(\tilde{\theta}), with equality only when \tilde{\theta} coincides with \hat{\theta}.
  2. Step 2 (Apply Rao-Blackwell): By the Rao-Blackwell theorem, define:
    \tilde{\theta}^* = E[\tilde{\theta} \mid T]
    Then \tilde{\theta}^* is also unbiased and:
    \text{Var}(\tilde{\theta}^*) \leq \text{Var}(\tilde{\theta})
  3. Step 3 (Function of Sufficient Statistic): Since \tilde{\theta}^* = E[\tilde{\theta} \mid T], it is a function of T alone, say:
    \tilde{\theta}^* = h(T)
    for some function h.
  4. Step 4 (Both are Unbiased Functions of T): Writing \hat{\theta} = g(T) (it too is a function of T), we have two unbiased estimators based on T:
    E[\hat{\theta}] = E[g(T)] = \theta
    E[\tilde{\theta}^*] = E[h(T)] = \theta
  5. Step 5 (Use Completeness): Consider their difference:
    E[\hat{\theta} - \tilde{\theta}^*] = E[g(T) - h(T)] = \theta - \theta = 0
    Since T is complete and g(T) - h(T) is a function of T with expectation zero for every \theta:
    P(g(T) - h(T) = 0) = 1
    Therefore: \hat{\theta} = \tilde{\theta}^* almost surely.
  6. Step 6 (Conclude Minimum Variance): Since \hat{\theta} = \tilde{\theta}^* and \text{Var}(\tilde{\theta}^*) \leq \text{Var}(\tilde{\theta}):
    \text{Var}(\hat{\theta}) = \text{Var}(\tilde{\theta}^*) \leq \text{Var}(\tilde{\theta})
    This holds for any unbiased estimator \tilde{\theta}, so \hat{\theta} has minimum variance among all unbiased estimators.
  7. Step 7 (Uniqueness of UMVUE): If there were another UMVUE \tilde{\theta}', the same argument shows:
    E[\tilde{\theta}' \mid T] = \hat{\theta}
    and since \tilde{\theta}' already attains the minimum variance, equality in Rao-Blackwell forces \tilde{\theta}' to be a function of T; by completeness \tilde{\theta}' = \hat{\theta} almost surely. Thus the UMVUE is unique. \blacksquare

Corollary:

If h(S) is a function of sufficient complete statistic S with E_θ[h(S)] = g(θ), then h(S) is the unique UMVUE of g(θ)

Applications:

Normal N(μ,σ²)
Sufficient Complete: S = (\sum X_i, \sum X_i^2)
Parameter: \mu
UMVUE: \bar{X} = \frac{\sum X_i}{n}
Verification: E[\bar{X}] = \mu and \bar{X} is a function of S
Normal N(μ,σ²)
Sufficient Complete: S = (\sum X_i, \sum X_i^2)
Parameter: \sigma^2
UMVUE: S^2 = \frac{\sum(X_i - \bar{X})^2}{n-1}
Verification: E[S^2] = \sigma^2 and S^2 is a function of S
Poisson P(λ)
Sufficient Complete: S = \sum X_i
Parameter: \lambda
UMVUE: \bar{X} = \frac{S}{n}
Verification: E[\bar{X}] = \lambda and \bar{X} is a function of S
Example: Constructing UMVUE for Normal Variance

Problem:

For X_1, \ldots, X_n \sim N(\mu, \sigma^2) with both parameters unknown, find the UMVUE for \sigma^2.

Solution:

  1. Identify sufficient complete statistic: For the normal distribution, T = (\sum X_i, \sum X_i^2) is sufficient and complete for (\mu, \sigma^2).
  2. Find an unbiased estimator: Consider \hat{\sigma}^2_1 = X_1^2 - X_1 X_2. We check:
    E[X_1^2] = \text{Var}(X_1) + [E(X_1)]^2 = \sigma^2 + \mu^2
    E[X_1 X_2] = E[X_1]E[X_2] = \mu^2
    Therefore: E[\hat{\sigma}^2_1] = (\sigma^2 + \mu^2) - \mu^2 = \sigma^2 (unbiased)
  3. Apply Lehmann-Scheffé: The UMVUE is:
    \hat{\sigma}^2_{UMVUE} = E[\hat{\sigma}^2_1 \mid T]
  4. Compute the conditional expectation: By symmetry of the observations given T:
    E[X_1^2 \mid T] = \frac{1}{n} \sum_{i=1}^n E[X_i^2 \mid T] = \frac{1}{n} E\left[\sum X_i^2 \mid T\right] = \frac{1}{n} \sum X_i^2
  5. Similarly for cross terms:
    E[X_1 X_2 \mid T] = \frac{1}{n(n-1)} \sum_{i \neq j} E[X_i X_j \mid T] = \frac{1}{n(n-1)} \left[\left(\sum X_i\right)^2 - \sum X_i^2\right]
  6. Combine results:
    \hat{\sigma}^2_{UMVUE} = \frac{1}{n} \sum X_i^2 - \frac{1}{n(n-1)} \left[\left(\sum X_i\right)^2 - \sum X_i^2\right]
    = \frac{1}{n-1} \left[\sum X_i^2 - \frac{1}{n}\left(\sum X_i\right)^2\right]
    = \frac{1}{n-1} \sum (X_i - \bar{X})^2 = S^2
  7. Verification: We know E[S^2] = \sigma^2 and S^2 is a function of the sufficient complete statistic T. Therefore, S^2 is the unique UMVUE for \sigma^2.

Key Insight:

The sample variance S^2 is not just an unbiased estimator; it is the unique UMVUE. This demonstrates the power of the Lehmann-Scheffé theorem in identifying optimal estimators.
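
A Monte Carlo sketch (NumPy assumed; μ, σ², n, and the seed are arbitrary) makes the improvement concrete: both the crude estimator X₁² - X₁X₂ and S² average to σ², but the UMVUE has a far smaller variance:

import numpy as np

# Sketch comparing the crude unbiased estimator X_1^2 - X_1*X_2 with the UMVUE S^2.
rng = np.random.default_rng(4)
mu, sigma2, n, reps = 2.0, 3.0, 15, 200_000

x = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
crude = x[:, 0]**2 - x[:, 0] * x[:, 1]        # unbiased but noisy
s2 = x.var(axis=1, ddof=1)                    # sample variance, the UMVUE

print("E[crude]   ≈", crude.mean(), "  E[S^2]     ≈", s2.mean(), " (target sigma^2 =", sigma2, ")")
print("Var(crude) ≈", crude.var(),  "  Var(S^2)   ≈", s2.var())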

Basu's Theorem
Establishes independence between sufficient complete statistics and ancillary statistics

Theorem Statement:

If T is a sufficient complete statistic for θ and V is an ancillary statistic (distribution independent of θ), then T and V are independent for all θ ∈ Θ

Key Concepts:

Ancillary statistic: distribution does not depend on θ
Independence follows from sufficiency and completeness
Useful for proving independence in specific problems

Example Application:

Setup: Normal N(μ,σ²) with sample X₁,...,Xₙ
Sufficient Complete: (\bar{X}, S^2) is sufficient and complete for (\mu, \sigma^2)
Sample skewness: \text{Skew} = \frac{\sqrt{n} \sum(X_i - \bar{X})^3}{(\sum(X_i - \bar{X})^2)^{3/2}}
Reasoning: Skewness distribution is invariant under location-scale transformations Yᵢ = (Xᵢ-μ)/σ
Conclusion: (X̄, S²) and Skew are independent by Basu's theorem
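
Independence is hard to verify exhaustively by simulation, but a cheap necessary consequence can be checked (NumPy and SciPy assumed; the normal parameters, sample size, and seed are arbitrary): the empirical correlations between the sample skewness and each of X̄ and S² should be near zero.

import numpy as np
from scipy.stats import skew

# Sketch of the Basu example: across many normal samples, the sample skewness
# should be independent of (X_bar, S^2); here we only check the (weaker)
# consequence that their empirical correlations are near zero.
rng = np.random.default_rng(5)
n, reps = 30, 50_000

x = rng.normal(1.0, 2.0, size=(reps, n))
xbar, s2 = x.mean(axis=1), x.var(axis=1, ddof=1)
g1 = skew(x, axis=1)                          # sample skewness of each replication

print("corr(X_bar, skew) ≈", np.corrcoef(xbar, g1)[0, 1])
print("corr(S^2,  skew) ≈", np.corrcoef(s2, g1)[0, 1])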

Rigorous Theorem Proofs

Step-by-step mathematical derivations of fundamental theorems

Fisher-Neyman Factorization Theorem
Fundamental Criterion for Sufficiency

Let X have pdf/pmf f(x|θ). A statistic T(X) is sufficient for θ if and only if f(x|θ) = g(T(x)|θ)h(x) for some functions g and h.

Theorem Statement

f(x|\theta) = g(T(x)|\theta) h(x) \quad \forall x \in \mathcal{X}, \theta \in \Theta

This theorem allows us to find sufficient statistics by simple inspection of the density function.

Proof Steps (Discrete Case)

1. Discrete Case - Sufficiency (⇒)

Assume the factorization holds. We show T is sufficient by computing the conditional probability.

P(X=x \mid T=t) = \frac{P(X=x, T=t)}{P(T=t)}

2. Substitute Factorization

If T(x) ≠ t, the probability is 0. If T(x) = t, substitute f(x|θ) = g(t|θ)h(x).

P(X=x \mid T=t) = \frac{g(t\mid\theta)h(x)}{\sum_{y:T(y)=t} g(t\mid\theta)h(y)}

3. Cancel Parameter Dependence

The term g(t|θ) factors out of the sum and cancels with the numerator.

P(X=x \mid T=t) = \frac{g(t\mid\theta)h(x)}{g(t\mid\theta)\sum_{y:T(y)=t} h(y)} = \frac{h(x)}{\sum_{y:T(y)=t} h(y)}

4. Conclusion for Sufficiency

The result depends only on x and h(x), not on θ. Thus, T is sufficient.

P(X=x \mid T=t) \text{ is independent of } \theta

5. Discrete Case - Necessity (⇐)

Assume T is sufficient. Then P(X=x|T=t) is independent of θ. Let this be k(x,t).

f(x\mid\theta) = P(X=x \mid \theta) = P(X=x, T=T(x) \mid \theta)

6. Construct Functions

Write the joint probability as conditional × marginal. Define g(t|θ) = P(T=t|θ) and h(x) = P(X=x|T=T(x)).

f(x\mid\theta) = P(T=T(x)\mid\theta)\, P(X=x \mid T=T(x)) = g(T(x)\mid\theta)h(x)

Example Application

For Poisson(\lambda), f(\mathbf{x}|\lambda) = e^{-n\lambda}\lambda^{\sum x_i}/\prod x_i! = \underbrace{[e^{-n\lambda}\lambda^{\sum x_i}]}_{g(T|\lambda)} \times \underbrace{[1/\prod x_i!]}_{h(\mathbf{x})}. Thus T = \sum X_i is sufficient.
Basu's Theorem
Independence of Statistics

If T is a complete sufficient statistic for θ, and V is an ancillary statistic, then T and V are independent.

Theorem Statement

T \text{ complete sufficient}, V \text{ ancillary} \implies T \perp V

Powerful tool for proving independence without finding joint distributions.

Proof Steps

1. Define Conditional Probability

Let A be any event involving V (e.g., V ∈ B). Let η(t) = P(V ∈ B | T=t).

P(V \in B \mid T=t) = \eta(t)

2. Use Sufficiency

Since T is sufficient, the conditional distribution of X given T is independent of θ. Since V is a function of X, its conditional distribution given T is also independent of θ.

\eta(t) \text{ does not depend on } \theta

3. Compute Expectation

Consider the expectation of η(T) over T. By the law of iterated expectations:

E[\eta(T)] = E[P(V \in B \mid T)] = P(V \in B)

4. Use Ancillarity

Since V is ancillary, P(V ∈ B) is a constant c independent of θ.

E[\eta(T)] = c

5. Construct Zero-Mean Function

Consider the function g(T) = η(T) - c. Its expectation is E[η(T) - c] = c - c = 0 for all θ.

E_\theta[\eta(T) - c] = 0 \quad \forall \theta

6. Apply Completeness

Since T is complete, g(T) must be zero almost surely. Thus η(T) = c a.s.

P(V \in B \mid T) = P(V \in B) \implies T \perp V

Example Application

In N(\mu, \sigma^2) with \sigma known, \bar{X} is complete sufficient for \mu and S^2 is ancillary for \mu. Thus \bar{X} \perp S^2.
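
The same conclusion can be probed numerically. In this sketch (NumPy assumed; μ, σ, n, and the seed are arbitrary, with σ treated as known), S² behaves the same on samples with small and large X̄, and the empirical correlation between X̄ and S² is near zero:

import numpy as np

# Sketch for the X_bar ⊥ S^2 application (sigma treated as known): compare the
# behaviour of S^2 on the two halves split by X_bar; independence predicts that
# the conditional behaviour of S^2 does not change with X_bar.
rng = np.random.default_rng(6)
n, reps, mu, sigma = 25, 100_000, 0.0, 1.0

x = rng.normal(mu, sigma, size=(reps, n))
xbar, s2 = x.mean(axis=1), x.var(axis=1, ddof=1)

low = xbar < np.median(xbar)
print("E[S^2 | X_bar below median] ≈", s2[low].mean())
print("E[S^2 | X_bar above median] ≈", s2[~low].mean())
print("corr(X_bar, S^2) ≈", np.corrcoef(xbar, s2)[0, 1])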
Sufficiency vs Completeness: Key Differences
Aspect | Sufficient Statistics | Complete Statistics
Core Purpose | Captures all parameter information from the sample data | Ensures uniqueness of unbiased functions
Mathematical Criterion | Factorization theorem: p(\tilde{x};\theta) = g(T(\tilde{x});\theta)h(\tilde{x}) | Functions with zero expectation are zero: E[\phi(T)] = 0 \implies \phi(T) = 0 \text{ a.s.}
Statistical Role | Data reduction without information loss | Uniqueness guarantee for optimal estimators
Relationship | Not necessarily complete (e.g., U(\theta-1/2, \theta+1/2)) | Not necessarily sufficient (depends on the family)
Combined Power | With completeness: enables the Lehmann-Scheffé theorem | With sufficiency: constructs the unique UMVUE

Practical Applications

How to apply sufficient and complete statistics in practice

UMVUE Construction
Use sufficient complete statistics to find unique optimal unbiased estimators
1
Identify sufficient complete statistic using factorization theorem
2
Find any unbiased estimator for the parameter
3
Apply Lehmann-Scheffé theorem: E[estimator|sufficient complete] = UMVUE (a worked sketch follows this list)
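
As an illustration of the recipe (not taken from the course text; NumPy assumed, with n, λ, and the seed chosen arbitrarily), consider estimating g(λ) = e^{-λ} = P(X = 0) for a Poisson sample: T = ΣXᵢ is sufficient and complete, the indicator 1{X₁ = 0} is a crude unbiased estimator, and conditioning on T gives the UMVUE ((n-1)/n)^T, since X₁ | T = t ~ Binomial(t, 1/n).

import numpy as np

# Illustrative run of the three-step recipe for Poisson(lambda), target g(lambda) = e^{-lambda}:
# 1) T = sum(X_i) is sufficient and complete;
# 2) phi = 1{X_1 = 0} is unbiased for e^{-lambda};
# 3) E[phi | T] = ((n-1)/n)^T is the UMVUE (conditioning: X_1 | T=t ~ Binomial(t, 1/n)).
rng = np.random.default_rng(7)
n, lam, reps = 8, 1.3, 200_000

x = rng.poisson(lam, size=(reps, n))
T = x.sum(axis=1)
crude = (x[:, 0] == 0).astype(float)          # step 2: crude unbiased estimator
umvue = ((n - 1) / n) ** T                    # step 3: Rao-Blackwell / Lehmann-Scheffé

print("target e^{-lambda} =", np.exp(-lam))
print("E[crude]   ≈", crude.mean(), "  E[UMVUE]   ≈", umvue.mean())
print("Var(crude) ≈", crude.var(),  "  Var(UMVUE) ≈", umvue.var())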
Variance Improvement
Apply Rao-Blackwell theorem to reduce estimator variance
1
Start with any unbiased estimator
2
Find sufficient statistic for the parameter
3
Compute conditional expectation given sufficient statistic
4
Result has smaller or equal variance
Independence Testing
Use Basu's theorem to establish independence properties
1
Identify sufficient complete statistic
2
Verify ancillary property of other statistic
3
Apply Basu's theorem to conclude independence
4
Use independence for further inference

Frequently Asked Questions

Common questions about sufficient and complete statistics

What is the intuitive understanding of Sufficient Statistics?
Intuitively, a sufficient statistic is like "lossless compression". It compresses the raw data into a simpler value (the statistic), but in this process, no information about the unknown parameter θ is lost. If you have the sufficient statistic, the original data is redundant for inferring θ.
Key Point: Lossless compression of data information
How to determine if a statistic is sufficient?
The most commonly used method is the Factorization Theorem. If the joint density function can be factored into two parts: one containing only the statistic T and the parameter θ, and another containing only the sample x (independent of θ), then T is sufficient. The definition method (conditional distribution independent of θ) is usually difficult to verify directly.
f(x|\theta) = g(T(x)|\theta)h(x)
What is a Complete Statistic and why is it important?
Completeness is a uniqueness requirement: if a function g(T) has expectation 0 for all parameter values, then the function itself must be 0 almost surely. Completeness ensures uniqueness in inference based on sufficient statistics, which is used in the Lehmann-Scheffé theorem to determine UMVUE (Uniformly Minimum Variance Unbiased Estimator).
E[g(T)] = 0 \; \forall\theta \implies P(g(T)=0)=1
Comparison: Sufficiency ensures no information loss, completeness ensures estimator uniqueness
Is a sufficient statistic unique?
No. If T is a sufficient statistic, then any one-to-one function of T is also sufficient. For example, if ΣXᵢ is sufficient, then X̄ is also sufficient. We usually focus on the Minimal Sufficient Statistic, which has the highest degree of compression among all sufficient statistics.
Key Point: Minimal sufficient is the most compressed form
What is an Ancillary Statistic?
An ancillary statistic is a statistic whose distribution does not depend on the parameter θ. Although it contains no information about θ itself, it is often combined with sufficient statistics (as in Basu's theorem) to prove independence, or used as auxiliary information in conditional inference.
Example: The sample variance S^2 is ancillary for \mu in N(\mu, 1)
What are the practical applications of Basu's Theorem?
Basu's theorem is very powerful. It states: if T is a complete sufficient statistic and V is an ancillary statistic, then T and V are independent. This conclusion simplifies many derivations, such as proving that the sample mean and sample variance are independent under normal distribution, without needing to compute complex joint distributions.
Key Point: Complete Sufficient ⊥ Ancillary
What does the Rao-Blackwell Theorem tell us?
The Rao-Blackwell theorem shows that taking the conditional expectation given a sufficient statistic can improve any unbiased estimator. Specifically, if δ(X) is an unbiased estimator of θ, then E[δ(X)|T] is still unbiased, but its variance is never greater than that of δ(X). This is the payoff that sufficiency delivers.
\text{Var}(E[\delta|T]) \leq \text{Var}(\delta)
What is the relationship between Lehmann-Scheffé Theorem and UMVUE?
This theorem combines Rao-Blackwell and completeness, providing a clear method to find UMVUE (Uniformly Minimum Variance Unbiased Estimator): if an unbiased estimator depends only on a complete sufficient statistic, then it is the UMVUE. This is one of the most important results in parameter estimation theory.
Key Point: Complete Sufficient → Uniqueness of UMVUE