Master the essential probability theory foundations needed for stochastic processes: random experiments, conditional probability, independence, and Bayes' theorem
Four comprehensive modules building from basic concepts to advanced applications
A random experiment is a procedure that:
• Can be repeated under identical conditions
• Has a well-defined set of possible outcomes
• Produces an outcome that cannot be predicted with certainty

Examples:
- Tossing a coin: outcomes {H, T}
- Rolling a die: outcomes {1, 2, 3, 4, 5, 6}
- Drawing a card from a deck: 52 possible outcomes
The sample space S (or Ω) is the set of all possible outcomes of a random experiment.

Properties:
- Contains every possible outcome
- Outcomes are mutually exclusive
- Exactly one outcome occurs per experiment

Types:
- Finite: S = {1, 2, 3, 4, 5, 6} (die roll)
- Countably infinite: S = {1, 2, 3, ...} (number of trials until success)
- Uncountably infinite: S = [0, 1] (random number between 0 and 1)
An event is a subset of the sample space S.

Types of Events:
- Simple event: contains exactly one outcome
- Compound event: contains multiple outcomes
- Certain event: S (always occurs)
- Impossible event: ∅ (never occurs)

Event Operations:
- Union: A ∪ B (A or B occurs)
- Intersection: A ∩ B (both A and B occur)
- Complement: A^c (A does not occur)
- Difference: A - B = A ∩ B^c (A occurs but B does not)
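The event operations above map directly onto Python's set operators. A minimal sketch for a single die roll (the names S, A, B are illustrative, not from any library):

```python
# Event operations on a die-roll sample space, using Python sets.
S = {1, 2, 3, 4, 5, 6}   # sample space for one die roll
A = {2, 4, 6}            # event "even number"
B = {4, 5, 6}            # event "greater than 3"

union = A | B            # A ∪ B: A or B occurs
intersection = A & B     # A ∩ B: both occur
complement_A = S - A     # A^c: A does not occur
difference = A - B       # A - B = A ∩ B^c: A occurs but B does not

print(union)         # {2, 4, 5, 6}
print(intersection)  # {4, 6}
print(complement_A)  # {1, 3, 5}
print(difference)    # {2}
```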
Kolmogorov's axioms define probability as a function P: Events → [0, 1] satisfying:

Axiom 1 (Non-negativity): P(A) ≥ 0 for all events A
Axiom 2 (Normalization): P(S) = 1
Axiom 3 (Countable Additivity): For mutually exclusive events A₁, A₂, ...: P(A₁ ∪ A₂ ∪ ...) = P(A₁) + P(A₂) + ...

Important consequences:
- P(∅) = 0
- P(A^c) = 1 - P(A)
- If A ⊆ B, then P(A) ≤ P(B)
- P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
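For a finite sample space with equally likely outcomes, these consequences can be checked numerically. A minimal sketch (the function name P and the events A, B are illustrative):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # one fair die roll

def P(event):
    """Classical (equally likely) probability: |event| / |S|."""
    return Fraction(len(event), len(S))

A = {2, 4, 6}  # even
B = {4, 5, 6}  # greater than 3

# Consequences of the axioms, verified for this sample space:
assert P(set()) == 0                       # P(∅) = 0
assert P(S - A) == 1 - P(A)                # complement rule
assert P(A | B) == P(A) + P(B) - P(A & B)  # inclusion-exclusion
print(P(A | B))  # 2/3
```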
Conditional probability P(B|A) is the probability of event B occurring given that event A has occurred.

Definition: P(B|A) = P(A ∩ B) / P(A), provided P(A) > 0

Interpretation:
- Updates our probability assessment based on new information
- Restricts the sample space to the conditioning event A
- Foundation for Bayesian inference
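The definition can be computed directly for a finite sample space. A small sketch with illustrative events (knowing the roll is even, what is the chance it exceeds 3?):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # one fair die roll

def P(event):
    """Classical probability: |event| / |S|."""
    return Fraction(len(event), len(S))

def P_given(B, A):
    """P(B|A) = P(A ∩ B) / P(A), defined when P(A) > 0."""
    return P(A & B) / P(A)

A = {2, 4, 6}  # condition: the roll is even
B = {4, 5, 6}  # event: the roll exceeds 3

# Of the three even outcomes {2, 4, 6}, two (4 and 6) exceed 3:
print(P_given(B, A))  # 2/3
```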
Conditional probability satisfies all probability axioms:
1. P(B|A) ≥ 0 for all events B
2. P(S|A) = 1
3. For mutually exclusive events B₁, B₂, ...: P(B₁ ∪ B₂ ∪ ...|A) = P(B₁|A) + P(B₂|A) + ...

Additional properties:
- P(B^c|A) = 1 - P(B|A)
- P(B₁ ∪ B₂|A) = P(B₁|A) + P(B₂|A) - P(B₁ ∩ B₂|A)
From the definition of conditional probability:

For two events: P(A ∩ B) = P(A) × P(B|A) = P(B) × P(A|B)

For multiple events (chain rule):
P(A₁ ∩ A₂ ∩ ... ∩ Aₙ) = P(A₁) × P(A₂|A₁) × P(A₃|A₁ ∩ A₂) × ... × P(Aₙ|A₁ ∩ ... ∩ Aₙ₋₁)

This is fundamental for:
- Sequential experiments
- Tree diagrams
- Markov chains
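The chain rule in action: the probability of drawing three aces in a row from a standard 52-card deck without replacement, as a product of conditional probabilities.

```python
from fractions import Fraction

# Chain rule: P(A1 ∩ A2 ∩ A3) = P(A1) * P(A2|A1) * P(A3|A1 ∩ A2)
# Drawing without replacement, each factor's denominator shrinks by one
# and the count of remaining aces shrinks with each ace drawn.
p = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
print(p)  # 1/5525
```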
If B₁, B₂, ..., Bₙ form a partition of the sample space (mutually exclusive and exhaustive), then for any event A:

P(A) = P(A|B₁)P(B₁) + P(A|B₂)P(B₂) + ... + P(A|Bₙ)P(Bₙ) = Σᵢ P(A|Bᵢ)P(Bᵢ)

This allows us to compute P(A) by conditioning on different scenarios.

Applications:
- Quality control with multiple suppliers
- Medical diagnosis with different diseases
- System reliability analysis
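A sketch of the quality-control application: the supplier shares and defect rates below are made-up numbers for illustration, not from the source.

```python
# Law of total probability, conditioning on which supplier a part came from.
# Hypothetical figures: three suppliers with shares 50%, 30%, 20% and
# per-supplier defect rates 1%, 2%, 3%.
priors = [0.5, 0.3, 0.2]           # P(B_i): share of parts from supplier i
defect_rates = [0.01, 0.02, 0.03]  # P(A|B_i): defect rate of supplier i

# P(A) = Σ_i P(A|B_i) P(B_i)
p_defect = sum(rate * prior for rate, prior in zip(defect_rates, priors))
print(round(p_defect, 3))  # 0.017
```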
Bayes' theorem provides a way to update probabilities based on new evidence:

P(Bⱼ|A) = P(A|Bⱼ)P(Bⱼ) / [Σᵢ P(A|Bᵢ)P(Bᵢ)] = P(A|Bⱼ)P(Bⱼ) / P(A)

Components:
- P(Bⱼ): prior probability (before observing A)
- P(Bⱼ|A): posterior probability (after observing A)
- P(A|Bⱼ): likelihood (probability of evidence given hypothesis)
- P(A): marginal probability (normalizing constant)

This is the foundation of Bayesian statistics and machine learning.
A diagnostic test for a rare disease:
- Disease prevalence: P(D) = 0.001 (0.1% of population)
- Test sensitivity: P(+|D) = 0.99 (99% detection rate)
- Test specificity: P(-|D^c) = 0.95 (95% correct negative rate)

Question: If the test is positive, what is the probability of having the disease?

Solution using Bayes' theorem:
P(D|+) = P(+|D)P(D) / [P(+|D)P(D) + P(+|D^c)P(D^c)]
= (0.99)(0.001) / [(0.99)(0.001) + (0.05)(0.999)]
= 0.00099 / 0.05094
≈ 0.0194, or about 1.94%

Despite a positive test, the probability of having the disease is only about 2%, because the disease is so rare!
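The same calculation, written out in code so each ingredient of Bayes' theorem is named explicitly:

```python
# Bayes' theorem applied to the diagnostic-test numbers above.
p_d = 0.001               # prior: prevalence P(D)
p_pos_d = 0.99            # likelihood: sensitivity P(+|D)
p_neg_nd = 0.95           # specificity P(-|D^c)
p_pos_nd = 1 - p_neg_nd   # false-positive rate P(+|D^c) = 0.05

# Marginal P(+) via the law of total probability:
p_pos = p_pos_d * p_d + p_pos_nd * (1 - p_d)
# Posterior P(D|+):
p_d_pos = p_pos_d * p_d / p_pos

print(round(p_d_pos, 4))  # 0.0194
```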
Two events A and B are independent if: P(A ∩ B) = P(A) × P(B)

Equivalent conditions (when P(A) > 0 and P(B) > 0):
- P(B|A) = P(B)
- P(A|B) = P(A)

Independence means:
- Knowledge of A doesn't change the probability of B
- The events don't influence each other
- Independence is different from mutual exclusivity (which means P(A ∩ B) = 0)
For events A₁, A₂, ..., Aₙ to be mutually independent:

P(Aᵢ₁ ∩ Aᵢ₂ ∩ ... ∩ Aᵢₖ) = P(Aᵢ₁) × P(Aᵢ₂) × ... × P(Aᵢₖ)

for every subset {i₁, i₂, ..., iₖ} of {1, 2, ..., n} with k ≥ 2. This requires checking 2ⁿ - n - 1 conditions!

Pairwise independence: every pair is independent
- P(Aᵢ ∩ Aⱼ) = P(Aᵢ) × P(Aⱼ) for all i ≠ j
- Does NOT imply mutual independence
- Easier to verify (only n(n-1)/2 conditions)
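The classic counterexample showing pairwise independence without mutual independence: flip two fair coins and consider "first is heads", "second is heads", and "the flips match". A sketch with illustrative names:

```python
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=2))  # sample space of two fair coin flips

def P(event):
    """Classical probability: |event| / |S|."""
    return Fraction(len(event), len(S))

A = {s for s in S if s[0] == "H"}   # first flip is heads
B = {s for s in S if s[1] == "H"}   # second flip is heads
C = {s for s in S if s[0] == s[1]}  # the two flips match

# Every pair is independent:
assert P(A & B) == P(A) * P(B)
assert P(A & C) == P(A) * P(C)
assert P(B & C) == P(B) * P(C)

# ...but the triple is not mutually independent:
print(P(A & B & C), P(A) * P(B) * P(C))  # 1/4 1/8
```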
If A and B are independent, then:
1. A and B^c are independent
2. A^c and B are independent
3. A^c and B^c are independent
4. More generally, every event determined by A (namely ∅, A, A^c, S) is independent of every event determined by B

For independent events:
- P(A ∪ B) = P(A) + P(B) - P(A)P(B) = 1 - P(A^c)P(B^c)
- If P(A) > 0 and P(B) > 0, then A and B cannot be mutually exclusive

Applications:
- Repeated trials (coin flips, die rolls)
- Component reliability in systems
- Statistical sampling
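These complement properties can be verified on a die roll, using two events that happen to be independent (illustrative choices, not from any library):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # one fair die roll

def P(event):
    """Classical probability: |event| / |S|."""
    return Fraction(len(event), len(S))

A = {2, 4, 6}     # even:        P(A) = 1/2
B = {1, 2, 3, 4}  # at most 4:   P(B) = 2/3

# A and B are independent: P(A ∩ B) = 2/6 = (1/2)(2/3)
assert P(A & B) == P(A) * P(B)

# ...and so are all the complement combinations:
Ac, Bc = S - A, S - B
assert P(Ac & B) == P(Ac) * P(B)
assert P(A & Bc) == P(A) * P(Bc)
assert P(Ac & Bc) == P(Ac) * P(Bc)

# Independent events with positive probability must overlap:
print(P(A & B))  # 1/3, so A and B are not mutually exclusive
```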
Key formulas from this module that you'll use throughout stochastic processes
Now that you've mastered probability theory prerequisites, you're ready to explore random variables, mathematical expectation, and advance toward stochastic processes!