Digital Characteristics & Characteristic Functions

Mathematical Expectation

The fundamental measure of central tendency for random variables

Definition and Essence

Core Definition

Definition:

Mathematical expectation is the 'weighted average' of a random variable's values, where weights are corresponding probabilities, reflecting the central tendency of the random variable.

Convergence Condition:

Must satisfy 'absolute convergence' condition to avoid order dependency in summation/integration:

Discrete Case
E\xi = \sum_{k=1}^{\infty} x_k p_k

provided \sum_{k=1}^{\infty} |x_k| p_k < \infty

Continuous Case
E\xi = \int_{-\infty}^{\infty} x\, p(x)\, dx

provided \int_{-\infty}^{\infty} |x| p(x) dx < \infty

General Form Case
E\xi = \int_{-\infty}^{\infty} x\, dF(x)

Stieltjes integral unifies discrete (summation) and continuous (integration) cases
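
As a sanity check on the forms above, the following Python sketch (parameters and truncation are arbitrary choices, assuming NumPy and SciPy are available) approximates Eξ by a truncated sum in the discrete case and by numerical integration in the continuous case:

```python
import numpy as np
from scipy import integrate, stats

# Discrete case: E[xi] = sum_k x_k p_k, truncated at a large k (approximation)
lam = 3.0
ks = np.arange(0, 200)
discrete_mean = np.sum(ks * stats.poisson.pmf(ks, lam))
print(discrete_mean)        # ~ 3.0, i.e. lambda

# Continuous case: E[xi] = integral of x p(x) dx for an Exp(rate = 2) density
rate = 2.0
continuous_mean, _ = integrate.quad(lambda x: x * rate * np.exp(-rate * x), 0, np.inf)
print(continuous_mean)      # ~ 0.5, i.e. 1/rate
```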

Fundamental Properties

Key Properties

Monotonicity

If a ≤ ξ ≤ b, then a ≤ Eξ ≤ b; if ξ ≤ η, then Eξ ≤ Eη

Importance: Preserves ordering relationships

Linearity
E\left(\sum_{i=1}^n c_i \xi_i + b\right) = \sum_{i=1}^n c_i E\xi_i + b

Linear combinations preserve expectation (independence not required)

Importance: Most fundamental property for calculations

Independence Property
\text{If } \xi_1, \ldots, \xi_n \text{ are independent, then } E(\xi_1 \cdots \xi_n) = E\xi_1 \cdots E\xi_n

Product of independent variables equals product of expectations
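
A quick Monte Carlo sketch (the sample size and the particular distributions are arbitrary choices) illustrating both linearity and the product rule for independent variables:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
xi = rng.exponential(scale=2.0, size=n)        # E[xi] = 2
eta = rng.normal(loc=3.0, scale=1.0, size=n)   # E[eta] = 3, drawn independently

# Linearity: E(5*xi - 2*eta + 7) = 5*2 - 2*3 + 7 = 11
print(np.mean(5 * xi - 2 * eta + 7))           # ~ 11

# Independence product rule: E(xi * eta) = E[xi] * E[eta] = 6
print(np.mean(xi * eta))                       # ~ 6
```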

Important Inequalities

Markov Inequality
P(|\xi| \geq \varepsilon) \leq \frac{E|\xi|}{\varepsilon}

Condition: for ε > 0

Application: Bounds probability using first moment

Cauchy-Schwarz Inequality
|E(\xi\eta)|^2 \leq E\xi^2 \cdot E\eta^2

Condition: equality iff P(η = t₀ξ) = 1 for some constant t₀

Application: Fundamental inequality connecting second moments

Jensen Inequality
\text{If } g(x) \text{ is convex, then } g(E\xi) \leq E g(\xi)

Condition: strict convexity: equality iff P(ξ = Eξ) = 1

Application: Relates function of expectation to expectation of function
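
These three inequalities can be checked empirically. The sketch below (illustrative distributions chosen for this note, not part of the original material) compares both sides of the Markov, Cauchy-Schwarz, and Jensen inequalities on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
xi = rng.exponential(scale=1.0, size=n)        # E|xi| = 1
eta = rng.standard_normal(n)

# Markov: P(|xi| >= eps) <= E|xi| / eps
eps = 3.0
print(np.mean(np.abs(xi) >= eps), np.mean(np.abs(xi)) / eps)   # ~0.05 <= ~0.33

# Cauchy-Schwarz: |E(xi*eta)|^2 <= E[xi^2] * E[eta^2]
print(np.mean(xi * eta) ** 2, np.mean(xi**2) * np.mean(eta**2))

# Jensen with convex g(x) = x^2: g(E xi) <= E g(xi)
print(np.mean(xi) ** 2, np.mean(xi**2))        # ~1 <= ~2
```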

Common Distribution Expectations
| Distribution | Parameters | Expectation E[ξ] |
| --- | --- | --- |
| Bernoulli Ber(p) | p: success probability | p |
| Binomial B(n,p) | n: trials, p: success probability | np |
| Poisson P(λ) | λ: rate parameter | λ |
| Geometric Geo(p) | p: success probability | 1/p |
| Uniform U[a,b] | a, b: interval endpoints | (a+b)/2 |
| Exponential Exp(λ) | λ: rate parameter | 1/λ |
| Normal N(μ,σ²) | μ: mean, σ²: variance | μ |

Variance & Covariance

Measuring dispersion and joint variability between random variables

Variance: Measuring Dispersion

Definition and Formula

Definition:

Variance measures how much a random variable deviates from its mean, defined as the expectation of squared deviation:

\text{Var}\,\xi = E(\xi - E\xi)^2 = E\xi^2 - (E\xi)^2
Key Point:

The computational formula E[ξ²] - (E[ξ])² avoids direct calculation of deviations

Properties:
  • Var ξ = 0 ⟺ P(ξ = c) = 1 for some constant c (degenerate distribution)
  • Translation invariance: Var(ξ + b) = Var ξ (constants don't affect dispersion)
  • Scaling property: Var(cξ) = c²Var ξ (dispersion scales quadratically)
  • Independence property: If ξ₁,...,ξₙ independent, then Var(Σξᵢ) = ΣVar(ξᵢ)
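
A minimal numerical check of these variance properties, using arbitrary example distributions (assuming NumPy is available):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
xi = rng.gamma(shape=2.0, scale=1.0, size=n)   # Var(xi) = shape * scale^2 = 2
eta = rng.uniform(0.0, 6.0, size=n)            # Var(eta) = 6^2 / 12 = 3, independent

# Computational formula: Var(xi) = E[xi^2] - (E[xi])^2
print(np.mean(xi**2) - np.mean(xi)**2, np.var(xi))

# Translation and scaling: Var(5*xi + 100) = 25 * Var(xi)
print(np.var(5 * xi + 100), 25 * np.var(xi))

# Independence: Var(xi + eta) = Var(xi) + Var(eta), i.e. ~ 5
print(np.var(xi + eta), np.var(xi) + np.var(eta))
```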

Standard Deviation

Definition:

σ(ξ) = √Var(ξ), having the same units as the random variable

Advantage:

More interpretable than variance due to matching units

Chebyshev Inequality

Statement:

For any ε > 0:

P(|\xi - E\xi| \geq \varepsilon) \leq \frac{\text{Var}\,\xi}{\varepsilon^2}
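
Chebyshev's bound is easy to verify on simulated data; the sketch below (standard normal sample and ε = 2 are arbitrary choices) shows that the bound holds but can be quite loose:

```python
import numpy as np

rng = np.random.default_rng(3)
xi = rng.standard_normal(1_000_000)            # E[xi] = 0, Var(xi) = 1

eps = 2.0
lhs = np.mean(np.abs(xi - xi.mean()) >= eps)   # empirical P(|xi - E xi| >= eps)
rhs = xi.var() / eps**2                        # Chebyshev bound
print(lhs, rhs)                                # ~0.046 <= 0.25: holds, but loosely
```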
Covariance: Measuring Joint Variability
Definition:
\text{Cov}(\xi, \eta) = E[(\xi - E\xi)(\eta - E\eta)] = E(\xi\eta) - E\xi \cdot E\eta

Measures how two variables 'jointly deviate' from their respective means

Properties

Symmetry: Cov(ξ,η) = Cov(η,ξ)
Linearity: Cov(aξ+b, cη+d) = ac·Cov(ξ,η)
Distributivity: Cov(Σξᵢ, Σηⱼ) = ΣᵢΣⱼCov(ξᵢ,ηⱼ)
Variance relationship: Var(ξ+η) = Var(ξ) + Var(η) + 2Cov(ξ,η)
Correlation Coefficient: Standardized Dependence
Formula:
r_{\xi\eta} = \frac{\text{Cov}(\xi,\eta)}{\sqrt{\text{Var}\,\xi \cdot \text{Var}\,\eta}}

Purpose: Eliminates scale effects from covariance

Properties

Bounded: |r_{ξη}| ≤ 1
Perfect positive correlation: r_{ξη} = 1 ⟺ P((ξ-Eξ)/√Varξ = (η-Eη)/√Varη) = 1
Perfect negative correlation: r_{ξη} = -1 ⟺ P((ξ-Eξ)/√Varξ = -(η-Eη)/√Varη) = 1
Uncorrelated: r_{ξη} = 0 ⟺ Cov(ξ,η) = 0 ⟺ E(ξη) = Eξ·Eη
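
The covariance and correlation formulas can be reproduced directly from a sample; the following sketch (an arbitrary linear-plus-noise relationship) compares the hand-computed values against NumPy's built-in estimators:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500_000
xi = rng.standard_normal(n)
eta = 2.0 * xi + rng.normal(scale=0.5, size=n)   # linear relation plus noise

cov = np.mean(xi * eta) - np.mean(xi) * np.mean(eta)   # E(xi*eta) - E[xi] E[eta]
r = cov / np.sqrt(np.var(xi) * np.var(eta))
print(cov, r)

# Same quantities from NumPy's estimators (bias=True matches the population formula)
print(np.cov(xi, eta, bias=True)[0, 1], np.corrcoef(xi, eta)[0, 1])
```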

Independence vs Uncorrelatedness

General Rule:

ξ and η independent ⇒ uncorrelated (converse not generally true)

Counterexample:

ξ = cos θ, η = sin θ where θ ~ U[0,2π]: uncorrelated but not independent

Special Case:

For bivariate normal: uncorrelated ⟺ independent
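
The counterexample above is easy to simulate. The sketch below draws θ uniformly on [0, 2π] and shows that ξ = cos θ and η = sin θ have correlation near zero even though they are functionally dependent (ξ² + η² = 1):

```python
import numpy as np

rng = np.random.default_rng(5)
theta = rng.uniform(0.0, 2.0 * np.pi, size=1_000_000)
xi, eta = np.cos(theta), np.sin(theta)

# Uncorrelated: the sample correlation is essentially zero
print(np.corrcoef(xi, eta)[0, 1])

# Not independent: xi^2 + eta^2 = 1 exactly, so xi constrains eta
print(np.max(np.abs(xi**2 + eta**2 - 1.0)))    # ~ 0 (floating-point error only)
```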

Common Distribution Variances
| Distribution | Variance Var(ξ) | Note |
| --- | --- | --- |
| Bernoulli Ber(p) | p(1-p) | Maximum at p = 1/2 |
| Binomial B(n,p) | np(1-p) | n times the Bernoulli variance |
| Poisson P(λ) | λ | Mean equals variance |
| Geometric Geo(p) | (1-p)/p² | Decreases with higher success probability |
| Uniform U[a,b] | (b-a)²/12 | Depends only on interval width |
| Exponential Exp(λ) | 1/λ² | Variance is the square of the mean |
| Normal N(μ,σ²) | σ² | Direct parameter specification |

Moment Theory

Unified framework for distribution characteristics using moments

Moments: Unified Framework for Distribution Characteristics

Raw Moments

m_k = E\xi^k

k-th power expectation about origin

Examples:
  • m₁ = Eξ (mean)
  • m₂ = Eξ² (second moment)

Central Moments

c_k = E(\xi - E\xi)^k

k-th power expectation about mean

Examples:
  • c₁ = 0 (always)
  • c₂ = Var ξ (variance)

Absolute Moments

M_\alpha = E|\xi|^\alpha

Property: If Mₙ < ∞, then Mₖ < ∞ for 0 < k ≤ n

Shape Characteristics from Moments

Skewness Coefficient

\gamma_1 = \frac{c_3}{c_2^{3/2}}

Measures asymmetry of distribution

Values:
  • γ₁ > 0: right-skewed (positive skew)
  • γ₁ < 0: left-skewed (negative skew)
  • γ₁ = 0: symmetric (e.g., normal distribution)

Kurtosis Coefficient

\gamma_2 = \frac{c_4}{c_2^2} - 3

Measures 'peakedness' relative to normal distribution

Values:
  • γ₂ > 0: leptokurtic (more peaked than normal)
  • γ₂ < 0: platykurtic (flatter than normal)
  • γ₂ = 0: mesokurtic (normal peakedness)
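
As a concrete illustration (assuming SciPy is available; the exponential example is an arbitrary choice), the sample skewness and excess kurtosis of Exp(1) should come out near their theoretical values γ₁ = 2 and γ₂ = 6, while a normal sample gives values near zero:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
expo = rng.exponential(scale=1.0, size=1_000_000)   # Exp(1): gamma1 = 2, gamma2 = 6
norm = rng.standard_normal(1_000_000)

print(stats.skew(expo), stats.kurtosis(expo))   # ~ 2 and ~ 6 (excess kurtosis)
print(stats.skew(norm), stats.kurtosis(norm))   # both ~ 0 for the normal
```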

Characteristic Functions

Fourier transforms for probability distributions

Definition and Fundamental Properties
Definition:
f(t) = E e^{it\xi}, \quad t \in \mathbb{R}, \quad \text{where } i^2 = -1

Essence: Fourier transform of the random variable's distribution; since |e^{itξ}| = 1, the expectation always exists, with no convergence condition required

Key Advantage: One-to-one correspondence with distribution functions

Mathematical Forms

Discrete Case
f(t) = \sum_{k=1}^{\infty} e^{itx_k} p_k
Continuous Case
f(t) = \int_{-\infty}^{\infty} e^{itx} p(x)\, dx
General Case
f(t) = \int_{-\infty}^{\infty} e^{itx}\, dF(x)
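
A short numerical sketch of the discrete form (a Poisson distribution with an arbitrary rate, truncated summation), compared against the closed-form characteristic function listed in the table further below:

```python
import numpy as np
from scipy import stats

t = 1.3                                  # arbitrary evaluation point
lam = 2.5                                # arbitrary Poisson rate

# Discrete form: f(t) = sum_k e^{i t x_k} p_k, truncated at k = 200
ks = np.arange(0, 200)
cf_sum = np.sum(np.exp(1j * t * ks) * stats.poisson.pmf(ks, lam))

# Closed form for the Poisson CF: exp(lam * (e^{it} - 1))
cf_closed = np.exp(lam * (np.exp(1j * t) - 1.0))
print(cf_sum, cf_closed)                 # the two values should agree closely
```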
Key Properties

Fundamental Properties

Boundedness and Conjugacy

|f(t)| ≤ f(0) = 1, f(-t) = f̄(t) (conjugate)

Uniform Continuity

f(t) is uniformly continuous on ℝ

Non-negative Definiteness

ΣᵢΣⱼ f(tᵢ-tⱼ)λᵢλ̄ⱼ ≥ 0 for any tᵢ ∈ ℝ, λᵢ ∈ ℂ

Independence Property
\text{If } \xi_1, \ldots, \xi_n \text{ are independent and } \eta = \sum_{i=1}^n \xi_i, \text{ then } f_\eta(t) = \prod_{i=1}^n f_{\xi_i}(t)

Significance: CF of sum = product of CFs (simplifies distribution calculations)

Moment Generation
\text{If } E|\xi|^n < \infty, \text{ then } f^{(k)}(0) = i^k E\xi^k \text{ for } 0 \leq k \leq n

Application: Extract moments through derivatives at origin

Linear Transformation
\text{If } \eta = a\xi + b, \text{ then } f_\eta(t) = e^{itb} f_\xi(at)
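
The independence and moment-generation properties can both be checked with an empirical characteristic function. The sketch below (exponential samples, the point t, and the step size are arbitrary choices) compares the CF of a sum with the product of CFs, and approximates f'(0) = iEξ by a finite difference:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000
xi1 = rng.exponential(scale=1.0, size=n)
xi2 = rng.exponential(scale=1.0, size=n)       # independent of xi1

def ecf(sample, t):
    """Empirical characteristic function: the sample mean of e^{i t xi}."""
    return np.mean(np.exp(1j * t * sample))

t = 0.7
# Independence: CF of the sum equals the product of the individual CFs
print(ecf(xi1 + xi2, t), ecf(xi1, t) * ecf(xi2, t))

# Moment generation: f'(0) = i * E[xi], approximated by a central difference
h = 1e-4
print((ecf(xi1, h) - ecf(xi1, -h)) / (2 * h), 1j * np.mean(xi1))   # both ~ 1j
```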

Common Characteristic Functions
| Distribution | Characteristic Function f(t) | Parameters |
| --- | --- | --- |
| Degenerate P(ξ = c) = 1 | e^{itc} | c: constant |
| Bernoulli Ber(p) | pe^{it} + (1-p) | p: success probability |
| Binomial B(n,p) | (pe^{it} + (1-p))^n | n: trials, p: probability |
| Poisson P(λ) | e^{λ(e^{it} - 1)} | λ: rate parameter |
| Uniform U[a,b] | (e^{itb} - e^{ita}) / (it(b-a)) | a, b: interval endpoints |
| Exponential Exp(λ) | (1 - it/λ)^{-1} | λ: rate parameter |
| Normal N(μ,σ²) | e^{itμ - σ²t²/2} | μ: mean, σ²: variance |
| Cauchy Cauchy(a,b) | e^{ita - b\lvert t\rvert} | a: location, b: scale |
Fundamental Theorems

Inversion Formula

Statement:

For continuity points x₁ < x₂ of distribution function F(x):

F(x_2) - F(x_1) = \lim_{T \to \infty} \frac{1}{2\pi} \int_{-T}^{T} \frac{e^{-itx_1} - e^{-itx_2}}{it} f(t)\, dt
Significance:

Recovers distribution function from characteristic function

Uniqueness Theorem

Statement:

Characteristic function uniquely determines distribution

f_1(t) = f_2(t) \text{ for all } t \in \mathbb{R} \;\Rightarrow\; F_1(x) = F_2(x) \text{ for all } x \in \mathbb{R}
Significance:

Foundation for distribution identification via characteristic functions

Fourier Inversion

Statement:

If ∫₋∞^∞ |f(t)| dt < ∞, then ξ is continuous with density:

p(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} f(t)\, dt
Significance:

Direct recovery of probability density from characteristic function
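
The Fourier inversion formula can be evaluated numerically. The sketch below recovers the N(2, 3) density at a single point from its characteristic function (the truncation limit and test point are arbitrary choices; SciPy's quadrature is assumed available):

```python
import numpy as np
from scipy import integrate

mu, sigma2 = 2.0, 3.0
cf = lambda t: np.exp(1j * t * mu - sigma2 * t**2 / 2.0)   # CF of N(mu, sigma^2)

def density_from_cf(x, cutoff=50.0):
    """p(x) = (1/2pi) * integral of e^{-itx} f(t) dt, truncated to |t| <= cutoff."""
    integrand = lambda t: np.real(np.exp(-1j * t * x) * cf(t))  # imaginary part cancels
    value, _ = integrate.quad(integrand, -cutoff, cutoff)
    return value / (2.0 * np.pi)

x = 1.0
exact = np.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)
print(density_from_cf(x), exact)         # the two values should agree closely
```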

Multivariate Normal Distribution

Properties and conditional distributions of multivariate normal

Definition and Structure
Definition:

n-dimensional random vector ξ = (ξ₁,...,ξₙ)' has multivariate normal distribution if its characteristic function is:

f(t) = \exp\left\{ i\, t'a - \frac{1}{2} t'\Sigma t \right\}, \quad t \in \mathbb{R}^n

Notation: ξ ~ N(a, Σ)

Parameters

a = (a₁,...,aₙ)' = Eξ (mean vector)
Σ = E[(ξ-a)(ξ-a)'] (n×n non-negative definite covariance matrix)

Fundamental Properties

Marginal Normality

Any subset of components has multivariate normal distribution

\text{If } \xi \sim N(a, \Sigma), \text{ then } \xi_{(k)} \sim N(a_{(k)}, \Sigma_{(k)})

Subscript (k) denotes corresponding subvector/submatrix

Linear Transformation Invariance

Linear combinations preserve multivariate normality

\text{If } \xi \sim N(a, \Sigma) \text{ and } \eta = C\xi \text{ for an } m \times n \text{ matrix } C, \text{ then } \eta \sim N(Ca, C\Sigma C')
Independence-Uncorrelatedness Equivalence

Components are independent if and only if uncorrelated

\xi_i \text{ and } \xi_j \text{ independent} \;\Leftrightarrow\; \text{Cov}(\xi_i, \xi_j) = 0

Significance: Unique property not shared by other multivariate distributions

Conditional Distribution


Setup:

Partition ξ = (ξ₁', ξ₂')' with corresponding mean and covariance partitions:

a = (a_1', a_2')'
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}
Result:

Given ξ₁ = x₁, the conditional distribution is ξ₂|ξ₁ = x₁ ~ N(a₂·₁, Σ₂₂·₁) where:

a_{2 \cdot 1} = a_2 + \Sigma_{21} \Sigma_{11}^{-1} (x_1 - a_1)
\Sigma_{22 \cdot 1} = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}
Interpretation:
  • Conditional mean: linear regression of ξ₂ on ξ₁
  • Conditional covariance: Σ₂₂·₁ does not depend on the observed value x₁
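
The conditional-distribution formulas translate directly into a few lines of linear algebra. The sketch below uses an arbitrary 2-dimensional example (the mean vector, covariance matrix, and observed x₁ are illustrative values only):

```python
import numpy as np

# Illustrative 2-dimensional example; the numbers below are arbitrary
a = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.5]])

a1, a2 = a[:1], a[1:]
S11, S12 = Sigma[:1, :1], Sigma[:1, 1:]
S21, S22 = Sigma[1:, :1], Sigma[1:, 1:]

x1 = np.array([1.5])                     # observed value of xi_1

# Conditional mean and covariance from the formulas above
cond_mean = a2 + S21 @ np.linalg.inv(S11) @ (x1 - a1)
cond_cov = S22 - S21 @ np.linalg.inv(S11) @ S12
print(cond_mean, cond_cov)
```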

Worked Examples

Step-by-step solutions to digital characteristics problems

Poisson Distribution Second Moment
Problem:

For ξ ~ P(λ), find E[ξ²]

Solution Steps:
  1. Step 1: Use known properties: E[ξ] = λ, Var(ξ) = λ
  2. Step 2: Apply the variance formula: Var(ξ) = E[ξ²] - (E[ξ])²
  3. Step 3: Substitute the values: λ = E[ξ²] - λ²
  4. Step 4: Solve for E[ξ²]: E[ξ²] = λ + λ² = λ(1 + λ)
Key Point:

Use the computational variance formula rather than direct calculation

Linear Combination Variance
Problem:

If ξ and η are independent with E[ξ]=2, Var(ξ)=1, E[η]=3, Var(η)=2, find Var(2ξ - η + 1)

Solution Steps:
  1. Step 1: Apply variance properties: Var(aX) = a²Var(X), Var(X + c) = Var(X)
  2. Step 2: Use independence: Var(X + Y) = Var(X) + Var(Y) for independent X, Y
  3. Step 3: Expand: Var(2ξ - η + 1) = Var(2ξ) + Var(-η) + Var(1)
  4. Step 4: Substitute: 2²Var(ξ) + (-1)²Var(η) + 0 = 4(1) + 1(2) = 6
Key Point:

Constants don't affect variance; independence allows additive variance rule
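
A simulation check of this result (the choice of normal distributions is arbitrary; only the stated means and variances matter):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 1_000_000
# Any independent xi, eta with the stated moments work; normals are one choice
xi = rng.normal(loc=2.0, scale=1.0, size=n)             # E = 2, Var = 1
eta = rng.normal(loc=3.0, scale=np.sqrt(2.0), size=n)   # E = 3, Var = 2

print(np.var(2 * xi - eta + 1))   # ~ 6, matching 4*Var(xi) + Var(eta)
```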

Characteristic Function Identification
Problem:

If characteristic function is f(t) = e^{2it - 3t²/2}, identify the distribution

Solution Steps:
  1. Step 1: Compare with the normal CF template: f(t) = e^{itμ - σ²t²/2}
  2. Step 2: Match the linear term: itμ = 2it, so μ = 2
  3. Step 3: Match the quadratic term: σ²t²/2 = 3t²/2, so σ² = 3
  4. Step 4: Conclude: ξ ~ N(2, 3)
Key Point:

Normal distribution characteristic function has distinctive exponential quadratic form

Practice Quiz
1. What is the correct formula for mathematical expectation of a continuous random variable?
2. Which property of expectation is INCORRECT?
3. The computational formula for variance is:
4. If X and Y are uncorrelated, what can we conclude?
5. The correlation coefficient ρ_{XY} satisfies:
6. What is the characteristic function of X ~ N(μ, σ²)?
7. If Var(X) = 0, what can we conclude?
8. The skewness coefficient γ₁ = 0 indicates:
9. What is the key advantage of characteristic functions?
10. For independent X and Y with CFs f_X(t) and f_Y(t), the CF of X + Y is:

Frequently Asked Questions

What is the difference between expectation and variance?

Expectation E[X] measures the center or average value of a distribution (first moment), while variance Var(X) = E[(X-μ)²] measures dispersion or spread around the mean (second central moment). A distribution can have the same mean but different variances, indicating different levels of uncertainty.

Why do we need characteristic functions when we have moment generating functions?

Characteristic functions f(t) = E[e^{itX}] always exist for any distribution (bounded by 1), while moment generating functions M(t) = E[e^{tX}] may not exist if moments don't exist (e.g., Cauchy distribution). CF uniquely determines the distribution and is especially powerful for proving limit theorems in probability theory.

What does it mean when correlation coefficient is zero?

ρ_{XY} = 0 means X and Y are uncorrelated: Cov(X,Y) = 0 or E[XY] = E[X]E[Y]. This implies no linear relationship, but doesn't rule out nonlinear dependence. Independence always implies uncorrelatedness, but uncorrelated doesn't imply independence (except for bivariate normal distributions).

How do I interpret skewness and kurtosis?

Skewness (γ₁) measures asymmetry: γ₁ > 0 (right-skewed, long right tail), γ₁ < 0 (left-skewed), γ₁ = 0 (symmetric). Kurtosis (γ₂) measures tail heaviness relative to normal: γ₂ > 0 (heavy tails, more outliers), γ₂ < 0 (light tails), γ₂ = 0 (normal-like tails). These help identify appropriate distributions for modeling.

Can variance be negative?

No, variance is always non-negative: Var(X) = E[(X-μ)²] ≥ 0 since it's an expectation of a squared term. Var(X) = 0 if and only if X is constant with probability 1. Standard deviation σ = √Var(X) is also non-negative and has the same units as X.
