Master parameter estimation methods and their optimality properties
The Method of Moments estimates parameters by setting sample moments equal to population moments and solving for parameters.
Population Moment: $\mu_k = E[X^k]$, the $k$-th raw moment of the distribution (a function of the parameters).
Sample Moment: $m_k = \frac{1}{n}\sum_{i=1}^{n} X_i^k$, the $k$-th raw moment of the data.
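As a quick illustration (a sketch, not from the original text): matching the first two sample moments to those of a $N(\mu, \sigma^2)$ population recovers both parameters. The true values $\mu = 2$, $\sigma = 3$ and the sample size are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=10_000)  # hypothetical sample

# Sample raw moments m_k = (1/n) * sum(x_i**k)
m1 = x.mean()
m2 = np.mean(x**2)

# Population moments of N(mu, sigma^2): E[X] = mu, E[X^2] = sigma^2 + mu^2.
# Setting m1 = mu and m2 = sigma^2 + mu^2 and solving:
mu_hat = m1
sigma2_hat = m2 - m1**2

print(mu_hat, sigma2_hat)  # should land near 2.0 and 9.0
```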
MLE finds the parameter value that makes the observed data most likely. It's the gold standard for point estimation due to optimal asymptotic properties.
Likelihood Function: $L(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$, viewed as a function of $\theta$ with the data held fixed.
Score Function: $U(\theta) = \frac{\partial}{\partial \theta} \ln L(\theta)$; the MLE solves $U(\hat{\theta}) = 0$.
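When no closed form is available, the MLE can be found numerically by minimizing the negative log-likelihood. A minimal sketch for $\operatorname{Exp}(\lambda)$ data (sample size, seed, and true rate are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=500)  # true rate lambda = 1/scale = 0.5

# Negative log-likelihood of Exp(lambda): -n*ln(lambda) + lambda*sum(x)
def neg_log_lik(lam):
    return -x.size * np.log(lam) + lam * x.sum()

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 10.0), method="bounded")
print(res.x, 1 / x.mean())  # numeric optimum matches the closed form 1/x-bar
```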
Unbiasedness: $E[\hat{\theta}] = \theta$ for every $\theta$; the estimator is correct on average.
Efficiency: an unbiased estimator is efficient if $\operatorname{Var}(\hat{\theta})$ attains the Cramér-Rao lower bound $1/I(\theta)$.
Consistency: $\hat{\theta}_n \xrightarrow{P} \theta$ as $n \to \infty$; more data drives the estimate to the true value.
Mean Squared Error: $\operatorname{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$.
Bias-Variance Decomposition: $\operatorname{MSE}(\hat{\theta}) = \operatorname{Var}(\hat{\theta}) + [\operatorname{Bias}(\hat{\theta})]^2$, where $\operatorname{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta$.
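A Monte Carlo check of the decomposition (a sketch with assumed settings: $n = 10$ draws from $N(0, 4)$, using the biased $1/n$ variance estimator):

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma2, reps = 10, 4.0, 200_000

samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
est = samples.var(axis=1, ddof=0)  # biased (1/n) variance estimator

mse = np.mean((est - sigma2) ** 2)
bias = est.mean() - sigma2
var = est.var()
print(mse, var + bias**2)  # the two numbers agree: MSE = Var + Bias^2
```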
For any unbiased estimator $\hat{\theta}$ of $\theta$, the variance satisfies: $\operatorname{Var}(\hat{\theta}) \ge \dfrac{1}{I(\theta)}$ (the Cramér-Rao lower bound, CRLB).
Fisher Information: $I(\theta) = E\!\left[\left(\frac{\partial}{\partial\theta} \ln f(X \mid \theta)\right)^2\right] = -E\!\left[\frac{\partial^2}{\partial\theta^2} \ln f(X \mid \theta)\right]$.
Efficiency: $e(\hat{\theta}) = \dfrac{1/I(\theta)}{\operatorname{Var}(\hat{\theta})} \le 1$, with equality exactly when $\hat{\theta}$ attains the CRLB.
Theorem (Cramér-Rao Lower Bound): for any unbiased estimator $\hat{\theta}$ of $\theta$, $\operatorname{Var}(\hat{\theta}) \ge 1/I(\theta)$.
Define the score function $U = \frac{\partial}{\partial\theta} \ln f(X \mid \theta)$. Note: $E[U] = 0$ and $\operatorname{Var}(U) = I(\theta)$.
Since $\hat{\theta}$ is unbiased, $E[\hat{\theta}] = \theta$; differentiating this identity under the integral sign gives $\operatorname{Cov}(\hat{\theta}, U) = 1$.
Apply Cauchy-Schwarz: $1 = [\operatorname{Cov}(\hat{\theta}, U)]^2 \le \operatorname{Var}(\hat{\theta})\,\operatorname{Var}(U) = \operatorname{Var}(\hat{\theta})\, I(\theta)$.
Rearranging gives: $\operatorname{Var}(\hat{\theta}) \ge \dfrac{1}{I(\theta)}$.
Theorem (Rao-Blackwell):
Let $\hat{\theta}$ be an unbiased estimator of $\theta$ and $T$ a sufficient statistic. Define: $\tilde{\theta} = E[\hat{\theta} \mid T]$ (sufficiency guarantees this does not depend on $\theta$, so it is a valid estimator).
Then $\tilde{\theta}$ is also unbiased and $\operatorname{Var}(\tilde{\theta}) \le \operatorname{Var}(\hat{\theta})$.
By the law of iterated expectations: $E[\tilde{\theta}] = E[E[\hat{\theta} \mid T]] = E[\hat{\theta}] = \theta$.
By the law of total variance: $\operatorname{Var}(\hat{\theta}) = \operatorname{Var}(E[\hat{\theta} \mid T]) + E[\operatorname{Var}(\hat{\theta} \mid T)] = \operatorname{Var}(\tilde{\theta}) + E[\operatorname{Var}(\hat{\theta} \mid T)]$.
Since $E[\operatorname{Var}(\hat{\theta} \mid T)] \ge 0$, we have $\operatorname{Var}(\tilde{\theta}) \le \operatorname{Var}(\hat{\theta})$.
Theorem (Lehmann-Scheffé):
Let $T$ be a complete sufficient statistic for $\theta$. If $\tilde{\theta} = g(T)$ is an unbiased estimator based solely on $T$, then $\tilde{\theta}$ is the unique UMVUE.
For any other unbiased estimator $\hat{\theta}$, apply Rao-Blackwell: $E[\hat{\theta} \mid T]$ is an unbiased function of $T$ with $\operatorname{Var}(E[\hat{\theta} \mid T]) \le \operatorname{Var}(\hat{\theta})$.
By completeness: $E[\hat{\theta} \mid T] = g(T)$ almost surely, since any two unbiased functions of a complete statistic must coincide.
Therefore $g(T)$ has minimum variance among all unbiased estimators and is unique.
Problem: Given a sample $X_1, \dots, X_n$ from $\operatorname{Exp}(\lambda)$, find the Method of Moments estimator for $\lambda$.
Solution:
Population first moment: $E[X] = \dfrac{1}{\lambda}$
Sample first moment: $\bar{X} = \dfrac{1}{n} \sum_{i=1}^{n} X_i$
Set equal: $\bar{X} = \dfrac{1}{\hat{\lambda}}$, so $\hat{\lambda}_{\mathrm{MM}} = \dfrac{1}{\bar{X}}$.
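A quick numeric sanity check of $\hat{\lambda}_{\mathrm{MM}} = 1/\bar{X}$ (sample size and true rate are assumed for illustration; note NumPy parameterizes the exponential by scale $= 1/\lambda$):

```python
import numpy as np

rng = np.random.default_rng(4)
lam_true = 2.5
x = rng.exponential(scale=1 / lam_true, size=100_000)  # NumPy uses scale = 1/lambda

lam_mm = 1 / x.mean()  # Method of Moments estimate
print(lam_mm)          # should land near 2.5
```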
Problem: Find the MLE of $\mu$ and $\sigma^2$ for a sample $X_1, \dots, X_n$ from $N(\mu, \sigma^2)$.
Solution:
Log-likelihood: $\ell(\mu, \sigma^2) = -\dfrac{n}{2} \ln(2\pi\sigma^2) - \dfrac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2$
Differentiate w.r.t. $\mu$: $\dfrac{\partial \ell}{\partial \mu} = \dfrac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu) = 0$
Solve: $\hat{\mu} = \bar{X}$
Differentiate w.r.t. $\sigma^2$ and solve: $\dfrac{\partial \ell}{\partial \sigma^2} = -\dfrac{n}{2\sigma^2} + \dfrac{1}{2\sigma^4} \sum_{i=1}^{n} (x_i - \mu)^2 = 0 \implies \hat{\sigma}^2 = \dfrac{1}{n} \sum_{i=1}^{n} (x_i - \bar{X})^2$
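The closed forms map directly to code; a sketch with assumed true values $\mu = 1$, $\sigma = 2$ (ddof=0 gives the $1/n$ MLE divisor, and SciPy's norm.fit computes the same MLE numerically):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
x = rng.normal(loc=1.0, scale=2.0, size=5_000)

mu_hat = x.mean()            # MLE of mu: the sample mean
sigma2_hat = x.var(ddof=0)   # MLE of sigma^2: divides by n, not n-1

loc, scale = norm.fit(x)     # scipy's MLE fit agrees with the closed forms
print(mu_hat, sigma2_hat)
print(loc, scale**2)
```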
Problem: Given $X_1, \dots, X_n \sim \operatorname{Poisson}(\lambda)$ i.i.d., find the MLE of $\lambda$ and verify it achieves the CRLB.
Solution:
Log-likelihood: $\ell(\lambda) = -n\lambda + \left(\sum_{i=1}^{n} x_i\right) \ln \lambda - \sum_{i=1}^{n} \ln(x_i!)$
Score: $U(\lambda) = \dfrac{\partial \ell}{\partial \lambda} = -n + \dfrac{\sum_i x_i}{\lambda}$
MLE: setting $U(\hat{\lambda}) = 0$ gives $\hat{\lambda} = \bar{X}$
Fisher information: $I_n(\lambda) = -E\!\left[\dfrac{\partial^2 \ell}{\partial \lambda^2}\right] = \dfrac{E\!\left[\sum_i X_i\right]}{\lambda^2} = \dfrac{n}{\lambda}$
CRLB: $\operatorname{Var}(\hat{\lambda}) \ge \dfrac{1}{I_n(\lambda)} = \dfrac{\lambda}{n}$
Actual variance: $\operatorname{Var}(\bar{X}) = \dfrac{\operatorname{Var}(X_1)}{n} = \dfrac{\lambda}{n}$, which achieves the CRLB exactly!
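A simulation confirming that $\operatorname{Var}(\bar{X})$ matches the bound $\lambda/n$ (replication count and parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
lam, n, reps = 3.0, 50, 200_000

x = rng.poisson(lam, size=(reps, n))
lam_hat = x.mean(axis=1)  # MLE for each simulated sample

print(lam_hat.var())  # empirical Var(lambda-hat)
print(lam / n)        # CRLB lambda/n = 0.06; the two agree
```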
Use MLE when you need optimal asymptotic properties and can compute the likelihood. Use Method of Moments for quick estimates, complex likelihoods, or as starting values for iterative MLE. MLE is generally preferred for its efficiency and invariance property.
An estimator is efficient if it achieves the Cramér-Rao lower bound: $\operatorname{Var}(\hat{\theta}) = \dfrac{1}{I(\theta)}$. This means no other unbiased estimator has lower variance. MLE is asymptotically efficient under regularity conditions.
Dividing by $n-1$ makes the estimator unbiased: $E[S^2] = \sigma^2$ for $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$. We "lose one degree of freedom" because we estimate the mean from the same data. The MLE uses $\frac{1}{n}$ (biased) but the bias vanishes for large samples.
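This is easy to verify empirically; in NumPy the divisor is controlled by ddof (a sketch with assumed $\sigma^2 = 9$ and a small $n = 5$ so the bias is visible):

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 5, 500_000
samples = rng.normal(0.0, 3.0, size=(reps, n))  # true sigma^2 = 9

print(samples.var(axis=1, ddof=1).mean())  # near 9.0: dividing by n-1 is unbiased
print(samples.var(axis=1, ddof=0).mean())  # near 9 * (n-1)/n = 7.2: the MLE is biased down
```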
Unbiasedness ($E[\hat{\theta}_n] = \theta$) is a finite-sample property: on average across repeated samples of size $n$, the estimate equals the true value. Consistency ($\hat{\theta}_n \xrightarrow{P} \theta$) is an asymptotic property: as $n \to \infty$, the estimate converges to the true value.
Two equivalent methods: (1) $I(\theta) = E\!\left[\left(\frac{\partial}{\partial\theta} \ln f(X \mid \theta)\right)^2\right]$, the expected squared score, or (2) $I(\theta) = -E\!\left[\frac{\partial^2}{\partial\theta^2} \ln f(X \mid \theta)\right]$, the negative expected Hessian. Often method (2) is easier. For $n$ i.i.d. observations, total information is $I_n(\theta) = n\, I(\theta)$.
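A symbolic check that the two methods agree, using SymPy on the Bernoulli($p$) log-pmf (a sketch; the helper E for the two-point expectation is ad hoc, not a SymPy API):

```python
import sympy as sp

p, x = sp.symbols("p x", positive=True)
log_f = x * sp.log(p) + (1 - x) * sp.log(1 - p)  # Bernoulli(p) log-pmf

score_sq = sp.diff(log_f, p) ** 2
neg_hess = -sp.diff(log_f, p, 2)

# Two-point expectation over x in {0, 1}, with P(X=1) = p (ad hoc helper)
def E(expr):
    return sp.simplify(p * expr.subs(x, 1) + (1 - p) * expr.subs(x, 0))

print(E(score_sq))  # 1/(p*(1-p)): method (1), expected squared score
print(E(neg_hess))  # 1/(p*(1-p)): method (2), negative expected Hessian
```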