Master the numerical characteristics of random variables, including expectation, variance, moments, and characteristic functions. These are essential tools for describing and analyzing probability distributions.
The fundamental measure of central tendency for random variables
Mathematical expectation is the 'weighted average' of a random variable's values, where weights are corresponding probabilities, reflecting the central tendency of the random variable.
Must satisfy the 'absolute convergence' condition so that the value does not depend on the order of summation/integration:
Discrete case: Eξ = \sum_{k=1}^{\infty} x_k p_k, provided \sum_{k=1}^{\infty} |x_k| p_k < \infty
Continuous case: Eξ = \int_{-\infty}^{\infty} x p(x) dx, provided \int_{-\infty}^{\infty} |x| p(x) dx < \infty
The Riemann-Stieltjes integral Eξ = \int_{-\infty}^{\infty} x dF(x) unifies the discrete (summation) and continuous (integration) cases
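A minimal numerical sketch of the two cases (the Poisson and exponential parameters below are arbitrary illustrative choices, not from the text):

```python
import numpy as np
from scipy import integrate, stats

# Discrete case: Eξ = Σ_k x_k p_k, here for ξ ~ Poisson(3)
lam = 3.0
k = np.arange(200)  # truncate the sum; tail mass beyond 200 is negligible
discrete_mean = np.sum(k * stats.poisson.pmf(k, lam))

# Continuous case: Eξ = ∫ x p(x) dx, here for ξ ~ Exp(2) with density 2e^{-2x}
rate = 2.0
continuous_mean, _ = integrate.quad(lambda x: x * rate * np.exp(-rate * x), 0, np.inf)

print(discrete_mean)    # ≈ 3.0  (= λ)
print(continuous_mean)  # ≈ 0.5  (= 1/λ)
```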
If a ≤ ξ ≤ b, then a ≤ Eξ ≤ b; if ξ ≤ η, then Eξ ≤ Eη
Importance: Preserves ordering relationships
Linearity: E(aξ + bη) = aEξ + bEη; linear combinations preserve expectation (independence not required)
Importance: Most fundamental property for calculations
Product of independent variables: E(ξη) = Eξ · Eη
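A quick simulation sketch of both properties (the distribution choices below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
xi = rng.exponential(scale=2.0, size=n)       # E[ξ] = 2
eta = rng.normal(loc=3.0, scale=1.0, size=n)  # E[η] = 3, independent of ξ

# Linearity (no independence needed): E[2ξ - η] = 2·2 - 3 = 1
print(np.mean(2 * xi - eta))  # ≈ 1.0

# Product rule for independent variables: E[ξη] = E[ξ] E[η] = 6
print(np.mean(xi * eta))      # ≈ 6.0
```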
Markov's inequality: P(|ξ| ≥ ε) ≤ E|ξ| / ε for any ε > 0
Application: Bounds tail probabilities using only the first moment
Cauchy-Schwarz inequality: (E[ξη])² ≤ E[ξ²] E[η²]; equality iff P(η = t₀ξ) = 1 for some constant t₀
Application: Fundamental inequality connecting second moments
Jensen's inequality: g(Eξ) ≤ E[g(ξ)] for convex g; under strict convexity, equality holds iff P(ξ = Eξ) = 1
Application: Relates function of expectation to expectation of function
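A numerical sanity check of Markov's and Jensen's inequalities on an Exp(1) sample (an illustrative sketch, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
xi = rng.exponential(scale=1.0, size=10**6)  # nonnegative, E[ξ] = 1

# Markov: P(ξ ≥ ε) ≤ E[ξ]/ε
eps = 3.0
print(np.mean(xi >= eps), np.mean(xi) / eps)  # ≈ 0.0498 ≤ ≈ 0.333

# Jensen with the convex function g(x) = x²: g(E[ξ]) ≤ E[g(ξ)]
print(np.mean(xi) ** 2, np.mean(xi ** 2))     # ≈ 1.0 ≤ ≈ 2.0
```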
Distribution | Parameters | Expectation E[ξ] |
---|---|---|
Bernoulli Ber(p) | p: success probability | p |
Binomial B(n,p) | n: trials, p: success probability | np |
Poisson P(λ) | λ: rate parameter | λ |
Geometric Geo(p) | p: success probability | 1/p |
Uniform U[a,b] | a, b: interval endpoints | (a+b)/2 |
Exponential Exp(λ) | λ: rate parameter | 1/λ |
Normal N(μ,σ²) | μ: mean, σ²: variance | μ |
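The table entries can be spot-checked with scipy.stats; note that scipy's parameterizations differ from the table's (expon takes scale = 1/λ, uniform takes a left endpoint and a width). Parameter values are arbitrary:

```python
from scipy import stats

print(stats.bernoulli(0.3).mean())    # p = 0.3
print(stats.binom(10, 0.3).mean())    # np = 3.0
print(stats.poisson(4.0).mean())      # λ = 4.0
print(stats.geom(0.25).mean())        # 1/p = 4.0
print(stats.uniform(2, 3).mean())     # (a+b)/2 = 3.5 for U[2, 5]
print(stats.expon(scale=0.5).mean())  # 1/λ = 0.5 for λ = 2
print(stats.norm(1.0, 2.0).mean())    # μ = 1.0
```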
Measuring dispersion and joint variability between random variables
Variance measures how much a random variable deviates from its mean, defined as the expectation of the squared deviation: Var(ξ) = E[(ξ - Eξ)²]
The computational formula Var(ξ) = E[ξ²] - (E[ξ])² avoids direct calculation of the deviations
σ(ξ) = √Var(ξ), having the same units as the random variable
More interpretable than variance due to matching units
Covariance Cov(ξ, η) = E[(ξ - Eξ)(η - Eη)] measures how two variables 'jointly deviate' from their respective means
Correlation coefficient ρ = Cov(ξ, η) / (σ(ξ) σ(η)). Purpose: eliminates scale effects from covariance
ξ and η independent ⇒ uncorrelated (converse not generally true)
ξ = cos θ, η = sin θ where θ ~ U[0,2π]: uncorrelated but not independent (see the simulation sketch after this list)
For bivariate normal: uncorrelated ⟺ independent
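A simulation sketch of the cos/sin counterexample (seed and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
theta = rng.uniform(0, 2 * np.pi, size=10**6)
xi, eta = np.cos(theta), np.sin(theta)

# Sample covariance is ≈ 0: ξ and η are uncorrelated
print(np.cov(xi, eta)[0, 1])               # ≈ 0.0

# Yet they are functionally dependent: ξ² + η² = 1 always
print(np.max(np.abs(xi**2 + eta**2 - 1)))  # ≈ 0 up to floating-point error
```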
Distribution | Variance Var(ξ) | Note |
---|---|---|
Bernoulli Ber(p) | p(1-p) | Maximum at p = 1/2 |
Binomial B(n,p) | np(1-p) | n times the Bernoulli variance |
Poisson P(λ) | λ | Mean equals variance |
Geometric Geo(p) | (1-p)/p² | Decreases with higher success probability |
Uniform U[a,b] | (b-a)²/12 | Depends only on interval width |
Exponential Exp(λ) | 1/λ² | Variance is the square of the mean |
Normal N(μ,σ²) | σ² | Direct parameter specification |
Unified framework for describing distribution characteristics
Raw moment mₖ = E[ξᵏ]: k-th power expectation about the origin
Central moment cₖ = E[(ξ - Eξ)ᵏ]: k-th power expectation about the mean
Property: If the absolute moment Mₙ = E|ξ|ⁿ < ∞, then Mₖ < ∞ for 0 < k ≤ n (existence of a higher moment implies existence of all lower ones)
Skewness c₃ / c₂^{3/2}: measures asymmetry of the distribution
Kurtosis c₄ / c₂² - 3: measures 'peakedness' relative to the normal distribution, for which this value is 0
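A sketch computing skewness and excess kurtosis from central moments and comparing with scipy's built-ins, using an Exp(1) sample (skewness 2, excess kurtosis 6):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.exponential(size=10**6)

d = x - x.mean()
c2, c3, c4 = np.mean(d**2), np.mean(d**3), np.mean(d**4)  # central moments

print(c3 / c2**1.5, stats.skew(x))        # both ≈ 2
print(c4 / c2**2 - 3, stats.kurtosis(x))  # both ≈ 6 (scipy reports excess kurtosis)
```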
The most powerful tool for analyzing probability distributions
Essence: f(t) = E[e^{itξ}], the Fourier-Stieltjes transform of the distribution; it always exists because |e^{itξ}| = 1, so no convergence conditions are needed
Key Advantage: One-to-one correspondence with distribution functions
|f(t)| ≤ f(0) = 1, f(-t) = f̄(t) (conjugate)
f(t) is uniformly continuous on ℝ
Non-negative definiteness: ΣᵢΣⱼ f(tᵢ-tⱼ)λᵢλ̄ⱼ ≥ 0 for any tᵢ ∈ ℝ, λᵢ ∈ ℂ
Significance: the CF of a sum of independent variables is the product of their CFs (simplifies finding distributions of sums)
Application: Extract moments through derivatives at the origin: E[ξᵏ] = f⁽ᵏ⁾(0) / iᵏ when E|ξ|ᵏ < ∞
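A symbolic sketch of moment extraction, differentiating the Poisson CF at the origin (sympy-based, illustrative only):

```python
import sympy as sp

t, lam = sp.symbols('t lambda', positive=True)
f = sp.exp(lam * (sp.exp(sp.I * t) - 1))  # CF of Poisson(λ)

# E[ξ^k] = f^(k)(0) / i^k
m1 = sp.simplify(sp.diff(f, t, 1).subs(t, 0) / sp.I)
m2 = sp.simplify(sp.diff(f, t, 2).subs(t, 0) / sp.I**2)
print(m1)  # λ
print(m2)  # λ² + λ, consistent with Var(ξ) = λ
```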
Distribution | Characteristic Function f(t) | Parameters |
---|---|---|
Degenerate P(ξ=c)=1 | e^{ict} | c: constant |
Bernoulli Ber(p) | 1 - p + pe^{it} | p: success probability |
Binomial B(n,p) | (1 - p + pe^{it})ⁿ | n: trials, p: probability |
Poisson P(λ) | e^{λ(e^{it} - 1)} | λ: rate parameter |
Uniform U[a,b] | (e^{itb} - e^{ita}) / (it(b - a)) | a, b: interval endpoints |
Exponential Exp(λ) | λ / (λ - it) | λ: rate parameter |
Normal N(μ,σ²) | e^{iμt - σ²t²/2} | μ: mean, σ²: variance |
Cauchy Cauchy(a,b) | e^{iat - b\|t\|} | a: location, b: scale |
For continuity points x₁ < x₂ of the distribution function F(x):
F(x₂) - F(x₁) = lim_{T→∞} (1/2π) ∫₋T^T [(e^{-itx₁} - e^{-itx₂}) / (it)] f(t) dt
Recovers distribution function from characteristic function
Characteristic function uniquely determines distribution
Foundation for distribution identification via characteristic functions
If ∫₋∞^∞ |f(t)| dt < ∞, then ξ is continuous with density p(x) = (1/2π) ∫₋∞^∞ e^{-itx} f(t) dt
Direct recovery of probability density from characteristic function
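A numerical sketch of the density inversion for the standard normal CF f(t) = e^{-t²/2}; since f is real and even here, only the cosine part of e^{-itx} contributes:

```python
import numpy as np
from scipy import integrate, stats

f = lambda t: np.exp(-t**2 / 2)  # CF of N(0, 1)
x = 1.0

# p(x) = (1/2π) ∫ e^{-itx} f(t) dt reduces to a cosine integral
density, _ = integrate.quad(lambda t: np.cos(t * x) * f(t), -np.inf, np.inf)
density /= 2 * np.pi

print(density, stats.norm.pdf(x))  # both ≈ 0.24197
```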
The cornerstone of multivariate probability theory
An n-dimensional random vector ξ = (ξ₁,...,ξₙ)′ has a multivariate normal distribution if its characteristic function is f(t) = exp(i t′a - t′Σt / 2), t ∈ ℝⁿ, where a ∈ ℝⁿ and Σ is a non-negative definite n×n matrix
Notation: ξ ~ N(a, Σ)
Any subset of components has multivariate normal distribution
Subscript (k) denotes corresponding subvector/submatrix
Linear transformations preserve multivariate normality: if ξ ~ N(a, Σ), then Cξ + b ~ N(Ca + b, CΣC′)
Components are independent if and only if uncorrelated
Significance: Unique property not shared by other multivariate distributions
Partition ξ = (ξ₁′, ξ₂′)′ with the corresponding partitions a = (a₁′, a₂′)′ and Σ = (Σ₁₁ Σ₁₂; Σ₂₁ Σ₂₂)
Given ξ₁ = x₁, the conditional distribution is ξ₂ | ξ₁ = x₁ ~ N(a₂·₁, Σ₂₂·₁), where a₂·₁ = a₂ + Σ₂₁Σ₁₁⁻¹(x₁ - a₁) and Σ₂₂·₁ = Σ₂₂ - Σ₂₁Σ₁₁⁻¹Σ₁₂ (Σ₁₁ assumed invertible)
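A small numpy sketch of the conditional-distribution formulas with arbitrary made-up numbers (1-dimensional ξ₁, 2-dimensional ξ₂):

```python
import numpy as np

a1, a2 = np.array([1.0]), np.array([0.0, 2.0])
S11 = np.array([[2.0]])
S12 = np.array([[0.5, 1.0]])
S21 = S12.T
S22 = np.array([[1.0, 0.3], [0.3, 1.5]])

x1 = np.array([2.0])  # observed value of ξ₁

# a_{2·1} = a₂ + Σ₂₁ Σ₁₁⁻¹ (x₁ - a₁)
cond_mean = a2 + S21 @ np.linalg.solve(S11, x1 - a1)
# Σ_{22·1} = Σ₂₂ - Σ₂₁ Σ₁₁⁻¹ Σ₁₂
cond_cov = S22 - S21 @ np.linalg.solve(S11, S12)

print(cond_mean)  # [0.25, 2.5]
print(cond_cov)   # [[0.875, 0.05], [0.05, 1.0]]
```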
Step-by-step solutions to typical numerical characteristics problems
For ξ ~ P(λ), find E[ξ²]
Use the computational variance formula rather than direct calculation: E[ξ²] = Var(ξ) + (E[ξ])² = λ + λ²
If ξ and η are independent with E[ξ]=2, Var(ξ)=1, E[η]=3, Var(η)=2, find Var(2ξ - η + 1)
Constants don't affect variance, and independence allows the additive rule: Var(2ξ - η + 1) = 4 Var(ξ) + Var(η) = 4·1 + 2 = 6
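A simulation check of this answer; normal distributions are one convenient choice matching the given means and variances:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10**6
xi = rng.normal(2.0, 1.0, size=n)            # E[ξ] = 2, Var(ξ) = 1
eta = rng.normal(3.0, np.sqrt(2.0), size=n)  # E[η] = 3, Var(η) = 2, independent

print(np.var(2 * xi - eta + 1))  # ≈ 6.0 = 4·Var(ξ) + Var(η)
```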
If characteristic function is f(t) = e^{2it - 3t²/2}, identify the distribution
Matching f(t) = e^{2it - 3t²/2} against the normal CF e^{iμt - σ²t²/2} gives μ = 2 and σ² = 3, so ξ ~ N(2, 3); the exponential-quadratic form is distinctive of the normal distribution
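As a final sanity check, the empirical CF of N(2, 3) samples should match f(t) = e^{2it - 3t²/2} (test point and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(2.0, np.sqrt(3.0), size=10**6)  # candidate distribution N(2, 3)

t = 0.7  # arbitrary test point
empirical_cf = np.mean(np.exp(1j * t * x))
claimed_cf = np.exp(2j * t - 3 * t**2 / 2)
print(empirical_cf, claimed_cf)  # agree to a few decimal places
```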