Probability Theory Prerequisites

Master the essential probability theory foundations needed for stochastic processes: random experiments, conditional probability, independence, and Bayes' theorem

8-12 hours total · 4 comprehensive modules · Beginner to Intermediate

🎯 Learning Objectives

By completing this module, you will:

  • Understand random experiments and sample spaces
  • Master conditional probability and multiplication rules
  • Apply Bayes' theorem to real-world problems
  • Distinguish between independence and mutual exclusion

Prerequisites:

  • Basic set theory (unions, intersections, complements)
  • Elementary algebra and fractions
  • Logical reasoning skills

Learning Modules

Four comprehensive modules building from basic concepts to advanced applications

1. Probability Foundations
Random experiments, sample spaces, events, and basic probability definitions
Beginner
2-3 hours · 4 topics

Topics covered:

  • Random Experiments & Sample Spaces
  • Events and Event Operations
  • Probability Axioms & Properties
  • Classical Probability Definition

Key formulas:

P(A) ≥ 0 (Non-negativity)
P(S) = 1 (Normalization)
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
2. Conditional Probability
Conditional probability, multiplication rules, and probability trees
Beginner to Intermediate
2-3 hours · 4 topics

Topics covered:

  • Definition of Conditional Probability
  • Multiplication Rule
  • Chain Rule for Multiple Events
  • Probability Trees & Applications

Key formulas:

P(B|A) = P(A ∩ B) / P(A)
P(A ∩ B) = P(A) × P(B|A)
P(A₁ ∩ A₂ ∩ ... ∩ Aₙ) = P(A₁)P(A₂|A₁)...P(Aₙ|A₁ ∩ ... ∩ Aₙ₋₁)
3. Total Probability & Bayes' Theorem
Law of total probability, Bayes' theorem, and practical applications
Intermediate
2-3 hours · 4 topics

Topics covered:

  • Partition of Sample Space
  • Law of Total Probability
  • Bayes' Theorem & Applications
  • Prior & Posterior Probabilities

Key formulas:

P(A) = Σᵢ P(A|Bᵢ)P(Bᵢ)
P(Bⱼ|A) = P(A|Bⱼ)P(Bⱼ) / Σᵢ P(A|Bᵢ)P(Bᵢ)
P(H|E) = P(E|H)P(H) / P(E)
4. Independence
Statistical independence, mutual independence, and applications
Intermediate
1-2 hours · 4 topics

Topics covered:

  • Definition of Independence
  • Pairwise vs Mutual Independence
  • Independence of Multiple Events
  • Applications & Examples

Key formulas:

P(A ∩ B) = P(A) × P(B)
P(B|A) = P(B) (when A and B are independent)
P(A₁ ∩ A₂ ∩ ... ∩ Aₙ) = P(A₁)P(A₂)...P(Aₙ)
1. Probability Foundations

Random Experiments
A random experiment is a procedure that:
• Can be repeated under identical conditions
• Has a well-defined set of possible outcomes
• Produces an outcome that cannot be predicted with certainty

Examples:
- Tossing a coin: outcomes {H, T}
- Rolling a die: outcomes {1, 2, 3, 4, 5, 6}
- Drawing a card from a deck: 52 possible outcomes
Sample Space
The sample space S (or Ω) is the set of all possible outcomes of a random experiment.

Properties:
- Contains every possible outcome
- Outcomes are mutually exclusive
- Exactly one outcome occurs per experiment

Types:
- Finite: S = {1, 2, 3, 4, 5, 6} (die roll)
- Countably infinite: S = {1, 2, 3, ...} (number of trials until success)
- Uncountably infinite: S = [0, 1] (random number between 0 and 1)
Events
An event is a subset of the sample space S.

Types of Events:
- Simple event: contains exactly one outcome
- Compound event: contains multiple outcomes
- Certain event: S (always occurs)
- Impossible event: ∅ (never occurs)

Event Operations:
- Union: A ∪ B (A or B occurs)
- Intersection: A ∩ B (both A and B occur)  
- Complement: A^c (A does not occur)
- Difference: A - B = A ∩ B^c
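
To make these operations concrete, here is a minimal Python sketch that models events as sets over a die-roll sample space (the particular events A and B are illustration choices, not from the text above):

```python
# Event operations modeled with Python set operations on a die-roll sample space.
S = {1, 2, 3, 4, 5, 6}   # sample space
A = {2, 4, 6}            # event: the roll is even
B = {4, 5, 6}            # event: the roll is greater than 3

print(A | B)   # union A ∪ B: A or B occurs        -> {2, 4, 5, 6}
print(A & B)   # intersection A ∩ B: both occur    -> {4, 6}
print(S - A)   # complement A^c: A does not occur  -> {1, 3, 5}
print(A - B)   # difference A - B = A ∩ B^c        -> {2}
```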
Probability Axioms
Kolmogorov's axioms define probability as a function P: Events → [0,1] satisfying:

Axiom 1 (Non-negativity): P(A) ≥ 0 for all events A
Axiom 2 (Normalization): P(S) = 1  
Axiom 3 (Countable Additivity): For mutually exclusive events A₁, A₂, ...:
P(A₁ ∪ A₂ ∪ ...) = P(A₁) + P(A₂) + ...

Important consequences:
- P(∅) = 0
- P(A^c) = 1 - P(A)
- If A ⊆ B, then P(A) ≤ P(B)
- P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
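
A quick way to internalize these consequences is to verify them numerically. The sketch below uses the classical definition P(A) = |A| / |S| on a fair die; the events are illustration choices:

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # fair die: equally likely outcomes

def P(event):
    """Classical probability: P(A) = |A| / |S|."""
    return Fraction(len(event), len(S))

A = {2, 4, 6}   # even roll
B = {4, 5, 6}   # roll greater than 3

# Inclusion-exclusion: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
assert P(A | B) == P(A) + P(B) - P(A & B)
# Complement rule: P(A^c) = 1 - P(A)
assert P(S - A) == 1 - P(A)
# Monotonicity: {4, 6} ⊆ B implies P({4, 6}) ≤ P(B)
assert P({4, 6}) <= P(B)
print("P(A ∪ B) =", P(A | B))   # 2/3
```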
2. Conditional Probability

Definition
Conditional probability P(B|A) is the probability of event B occurring given that event A has occurred.

Definition: P(B|A) = P(A ∩ B) / P(A), provided P(A) > 0

Interpretation:
- Updates our probability assessment based on new information
- Restricts our sample space to the condition A
- Foundation for Bayesian inference
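
As a sanity check, the sketch below computes P(B|A) two ways on a fair die: from the definition, and by restricting the sample space to A and recounting (the events are illustration choices):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

def P(event):
    return Fraction(len(event), len(S))

A = {4, 5, 6}   # condition: roll greater than 3
B = {2, 4, 6}   # event: roll is even

# Definition: P(B|A) = P(A ∩ B) / P(A)
by_definition = P(A & B) / P(A)

# Restricted-sample-space view: count B's outcomes inside A only.
by_restriction = Fraction(len(A & B), len(A))

assert by_definition == by_restriction
print(by_definition)   # 2/3
```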
Properties of Conditional Probability
Conditional probability satisfies all probability axioms:

1. P(B|A) ≥ 0 for all events B
2. P(S|A) = 1
3. For mutually exclusive events B₁, B₂, ...:
   P(B₁ ∪ B₂ ∪ ...|A) = P(B₁|A) + P(B₂|A) + ...

Additional properties:
- P(B^c|A) = 1 - P(B|A)
- P(B₁ ∪ B₂|A) = P(B₁|A) + P(B₂|A) - P(B₁ ∩ B₂|A)
Multiplication Rule
From the definition of conditional probability:

For two events: P(A ∩ B) = P(A) × P(B|A) = P(B) × P(A|B)

For multiple events (chain rule):
P(A₁ ∩ A₂ ∩ ... ∩ Aₙ) = P(A₁) × P(A₂|A₁) × P(A₃|A₁ ∩ A₂) × ... × P(Aₙ|A₁ ∩ ... ∩ Aₙ₋₁)

This is fundamental for:
- Sequential experiments
- Tree diagrams
- Markov chains
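
As a worked instance of the chain rule, here is the probability of drawing three aces in a row from a standard 52-card deck without replacement, computed in exact arithmetic:

```python
from fractions import Fraction

# P(A₁ ∩ A₂ ∩ A₃) = P(A₁) · P(A₂|A₁) · P(A₃|A₁ ∩ A₂)
# Aᵢ = "the i-th card drawn is an ace", drawing without replacement.
p = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
print(p)          # 1/5525
print(float(p))   # ≈ 0.000181
```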
3. Total Probability & Bayes' Theorem

Law of Total Probability
If B₁, B₂, ..., Bₙ form a partition of the sample space (mutually exclusive and exhaustive), then for any event A:

P(A) = P(A|B₁)P(B₁) + P(A|B₂)P(B₂) + ... + P(A|Bₙ)P(Bₙ)
     = Σᵢ P(A|Bᵢ)P(Bᵢ)

This allows us to compute P(A) by conditioning on different scenarios.

Applications:
- Quality control with multiple suppliers
- Medical diagnosis with different diseases
- System reliability analysis
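
The sketch below applies the law to a hypothetical quality-control setup with three suppliers; the supplier shares and defect rates are made-up illustration values:

```python
# Suppliers B1, B2, B3 partition the parts; A = "part is defective".
priors = {"B1": 0.50, "B2": 0.30, "B3": 0.20}   # P(Bᵢ): share of parts
defect = {"B1": 0.01, "B2": 0.02, "B3": 0.05}   # P(A|Bᵢ): defect rate

# Law of total probability: P(A) = Σᵢ P(A|Bᵢ) P(Bᵢ)
p_defect = sum(defect[b] * priors[b] for b in priors)
print(p_defect)   # 0.021
```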
Bayes' Theorem
Bayes' theorem provides a way to update probabilities based on new evidence:

P(Bⱼ|A) = P(A|Bⱼ)P(Bⱼ) / [Σᵢ P(A|Bᵢ)P(Bᵢ)]
        = P(A|Bⱼ)P(Bⱼ) / P(A)

Components:
- P(Bⱼ): prior probability (before observing A)
- P(Bⱼ|A): posterior probability (after observing A)  
- P(A|Bⱼ): likelihood (probability of evidence given hypothesis)
- P(A): marginal probability (normalizing constant)

This is the foundation of Bayesian statistics and machine learning.
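
A minimal generic sketch might look like this; the function name bayes_posterior is an assumption, and the numbers continue the hypothetical supplier example above:

```python
def bayes_posterior(priors, likelihoods):
    """Posterior P(Bj|A) for every hypothesis Bj in a partition.

    priors:      dict mapping hypothesis -> prior P(Bj)
    likelihoods: dict mapping hypothesis -> likelihood P(A|Bj)
    """
    # Marginal P(A) via the law of total probability (normalizing constant).
    p_a = sum(likelihoods[b] * priors[b] for b in priors)
    return {b: likelihoods[b] * priors[b] / p_a for b in priors}

# Given a defective part, which supplier most likely made it?
posterior = bayes_posterior(
    priors={"B1": 0.50, "B2": 0.30, "B3": 0.20},
    likelihoods={"B1": 0.01, "B2": 0.02, "B3": 0.05},
)
print(posterior)   # B1 ≈ 0.238, B2 ≈ 0.286, B3 ≈ 0.476
```

Note how the evidence shifts probability toward B3: it supplies only 20% of the parts but has the highest defect rate.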
Medical Diagnosis Example
A diagnostic test for a rare disease:
- Disease prevalence: P(D) = 0.001 (0.1% of population)
- Test sensitivity: P(+|D) = 0.99 (99% detection rate)
- Test specificity: P(-|D^c) = 0.95 (95% correct negative rate)

Question: If test is positive, what's the probability of having the disease?

Solution using Bayes' theorem:
P(D|+) = P(+|D)P(D) / [P(+|D)P(D) + P(+|D^c)P(D^c)]
       = (0.99)(0.001) / [(0.99)(0.001) + (0.05)(0.999)]
       = 0.00099 / 0.05094
       ≈ 0.0194 or 1.94%

Despite a positive test, the probability is only ~2% due to the low disease prevalence!
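
The arithmetic is easy to verify in a few lines of Python:

```python
# Reproducing the medical diagnosis calculation above.
p_d = 0.001   # prevalence P(D)
sens = 0.99   # sensitivity P(+|D)
spec = 0.95   # specificity P(-|D^c), so P(+|D^c) = 1 - spec = 0.05

p_pos = sens * p_d + (1 - spec) * (1 - p_d)   # P(+) by total probability
p_d_pos = sens * p_d / p_pos                  # Bayes' theorem
print(f"P(+)   = {p_pos:.5f}")    # 0.05094
print(f"P(D|+) = {p_d_pos:.4f}")  # 0.0194
```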
4. Independence

Definition of Independence
Two events A and B are independent if:
P(A ∩ B) = P(A) × P(B)

Equivalent conditions (when P(A) > 0 and P(B) > 0):
- P(B|A) = P(B)
- P(A|B) = P(A)
- P(A ∩ B) = P(A) × P(B)

Independence means:
- Knowledge of A doesn't change probability of B
- Events don't influence each other
- Different from mutually exclusive (which means P(A ∩ B) = 0)
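
The contrast with mutual exclusion shows up immediately on a fair die. In the sketch below (events chosen for illustration), A and B satisfy the product condition, while A and C are mutually exclusive yet dependent:

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

def P(event):
    return Fraction(len(event), len(S))

A = {2, 4, 6}      # even
B = {1, 2, 3, 4}   # at most 4
C = {1, 3, 5}      # odd

# Independent: P(A ∩ B) = 1/3 = (1/2)(2/3) = P(A)P(B)
print(P(A & B) == P(A) * P(B))   # True
# Mutually exclusive but NOT independent: P(A ∩ C) = 0 ≠ P(A)P(C) = 1/4
print(P(A & C) == P(A) * P(C))   # False
```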
Multiple Events Independence
For events A₁, A₂, ..., Aₙ to be mutually independent:

P(Aᵢ₁ ∩ Aᵢ₂ ∩ ... ∩ Aᵢₖ) = P(Aᵢ₁) × P(Aᵢ₂) × ... × P(Aᵢₖ)

for every subset {i₁, i₂, ..., iₖ} of {1, 2, ..., n}.

This requires checking 2ⁿ - n - 1 conditions!

Pairwise independence: Every pair is independent
- P(Aᵢ ∩ Aⱼ) = P(Aᵢ) × P(Aⱼ) for all i ≠ j
- Does NOT imply mutual independence
- Easier to verify (only n(n-1)/2 conditions)
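
The standard counterexample uses two fair coin tosses: each pair of the three events below is independent, but the triple fails the product condition, so pairwise independence does not imply mutual independence:

```python
from fractions import Fraction
from itertools import product

S = set(product("HT", repeat=2))   # four equally likely outcomes

def P(event):
    return Fraction(len(event), len(S))

A = {s for s in S if s[0] == "H"}    # first toss heads
B = {s for s in S if s[1] == "H"}    # second toss heads
C = {s for s in S if s[0] == s[1]}   # both tosses agree

# Every pair satisfies the product condition...
for X, Y in [(A, B), (A, C), (B, C)]:
    assert P(X & Y) == P(X) * P(Y)

# ...but the triple does not: 1/4 ≠ 1/8.
print(P(A & B & C), P(A) * P(B) * P(C))   # 1/4 1/8
```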
Properties of Independence
If A and B are independent:

1. A and B^c are independent
2. A^c and B are independent
3. A^c and B^c are independent
4. Any event determined by A alone (A, A^c, S, ∅) is independent of any event determined by B alone

For independent events:
- P(A ∪ B) = P(A) + P(B) - P(A)P(B) = 1 - P(A^c)P(B^c)
- If P(A) > 0 and P(B) > 0, then A and B cannot be mutually exclusive

Applications:
- Repeated trials (coin flips, die rolls)
- Component reliability in systems
- Statistical sampling
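
As a reliability illustration (with made-up component probabilities, assumed independent), independence lets us multiply: a series system needs every component to work, while a parallel system needs at least one:

```python
# P(component i works), assumed independent of the others.
p = [0.95, 0.90, 0.99]

series = 1.0          # series system: all components must work
parallel_fail = 1.0   # parallel system fails only if ALL components fail
for pi in p:
    series *= pi                 # P(A₁ ∩ A₂ ∩ A₃) = P(A₁)P(A₂)P(A₃)
    parallel_fail *= (1 - pi)    # P(A₁^c ∩ A₂^c ∩ A₃^c)

print(f"series:   {series:.5f}")             # ≈ 0.84645
print(f"parallel: {1 - parallel_fail:.5f}")  # ≈ 0.99995
```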

📐 Essential Formula Reference

Key formulas from this module that you'll use throughout stochastic processes

Basic Probability
P(∅) = 0
P(A^c) = 1 - P(A)
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Conditional Probability
P(B|A) = P(A ∩ B) / P(A)
P(A ∩ B) = P(A) × P(B|A)
P(A) = Σᵢ P(A|Bᵢ)P(Bᵢ)
Bayes' Theorem
P(Bⱼ|A) = P(A|Bⱼ)P(Bⱼ) / Σᵢ P(A|Bᵢ)P(Bᵢ)
Posterior = (Likelihood × Prior) / Evidence
Independence
P(A ∩ B) = P(A) × P(B)
P(B|A) = P(B) (when A and B are independent)

🚀 Ready for Next Steps?

Now that you've mastered these probability theory prerequisites, you're ready to explore random variables and mathematical expectation as you advance toward stochastic processes!

Practice Problems

Reinforce your understanding with interactive practice problems.


Use Calculators

Apply concepts with interactive probability calculators.


Next Module

Continue with Random Variables & Distributions.
