
Exact Inference

Master precise probability computation methods that involve no approximation. Learn variable elimination and belief propagation, and understand when exact inference is feasible.

Module 5 of 7
Intermediate to Advanced
120-150 min

Inference Goals

In probabilistic graphical models, inference refers to computing probabilities of interest from the joint distribution. The two main types are:

Marginal Probability

P(x_E) = \sum_{x_F} P(x_E, x_F)

Compute the probability of the variable set E by summing over all possible values of the remaining variables F.

Conditional Probability

P(x_F \mid x_E) = \frac{P(x_E, x_F)}{\sum_{x_F} P(x_E, x_F)}

Compute the probability of F given the evidence E using Bayes' rule.
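To make both operations concrete, here is a minimal brute-force sketch. The three binary variables and the random joint table are illustrative assumptions, not part of the material above:

```python
import numpy as np

# Joint table P(A, B, C) over three binary variables; the entries are
# random placeholders, normalized so the whole table sums to 1.
rng = np.random.default_rng(0)
joint = rng.random((2, 2, 2))
joint /= joint.sum()

# Marginal P(A): sum out the other variables (the set F = {B, C}).
p_a = joint.sum(axis=(1, 2))

# Conditional P(B, C | A=1): restrict the joint to the evidence A=1,
# then renormalize by the evidence probability P(A=1).
slice_a1 = joint[1]                 # P(A=1, B, C), shape (2, 2)
p_bc_given_a1 = slice_a1 / slice_a1.sum()

print(p_a.sum())                    # 1.0
print(p_bc_given_a1.sum())          # 1.0
```

This enumeration touches every entry of the joint table, which is exactly the cost that variable elimination (next) is designed to avoid.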

Variable Elimination

Variable elimination is a general exact inference algorithm that eliminates variables one at a time, integrating them out (for continuous variables) or summing them out (for discrete variables).

Algorithm Steps

  1. Choose an elimination order: select an order in which to eliminate the variables x_1, x_2, \ldots, x_k (those not in the query set E).
  2. For each variable x_i:
    • Collect all factors (probability tables) involving x_i
    • Multiply these factors together
    • Sum out (eliminate) x_i from the product
    • The result is a new factor that no longer involves x_i
  3. Final result: after all variables have been eliminated, the remaining factors give the marginal probability P(x_E); a worked example and a code sketch follow.

Example: Computing P(x_3)

Given the joint distribution P(x_1, x_2, x_3, x_4) = P(x_1) P(x_2 \mid x_1) P(x_3 \mid x_2) P(x_4 \mid x_3):

Step 1: Eliminate x_1

\sum_{x_1} P(x_1) P(x_2 \mid x_1) = P(x_2)

Step 2: Eliminate x_2

\sum_{x_2} P(x_2) P(x_3 \mid x_2) = P(x_3)

Step 3: Eliminate x_4

\sum_{x_4} P(x_3) P(x_4 \mid x_3) = P(x_3)

Result: P(x_3), the marginal probability of x_3.
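The same chain computation as a short NumPy sketch. The state count S and the CPT values are arbitrary assumptions; the result is checked against brute-force marginalization of the full joint:

```python
import numpy as np

rng = np.random.default_rng(1)
S = 3  # states per variable (arbitrary choice)

# Chain CPTs: p1[a] = P(x1=a), t12[a, b] = P(x2=b | x1=a), and so on.
p1  = rng.dirichlet(np.ones(S))
t12 = rng.dirichlet(np.ones(S), size=S)
t23 = rng.dirichlet(np.ones(S), size=S)
t34 = rng.dirichlet(np.ones(S), size=S)

p2 = p1 @ t12                 # Step 1: sum_{x1} P(x1) P(x2|x1) = P(x2)
p3 = p2 @ t23                 # Step 2: sum_{x2} P(x2) P(x3|x2) = P(x3)
p3 = p3 * t34.sum(axis=1)     # Step 3: sum_{x4} P(x4|x3) = 1, so P(x3) is unchanged

# Brute-force check: build the full joint table and marginalize directly.
joint = np.einsum('a,ab,bc,cd->abcd', p1, t12, t23, t34)
assert np.allclose(p3, joint.sum(axis=(0, 1, 3)))
print(p3)
```

Each step manipulates a table of at most S × S entries, instead of the S⁴-entry joint.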

Complexity

Time complexity depends on the elimination order: the optimal order minimizes the size of the largest intermediate factor (finding such an order is NP-hard in general). For tree structures, variable elimination is efficient. For graphs with cycles, the complexity can be exponential in the size of the largest factor created.
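A small illustration of why the order matters, assuming a star-shaped graph over binary variables with hub x0 and leaves x1..x4 (the tables are random placeholders): eliminating the hub first couples all of its neighbors into one large factor, while eliminating the leaves first keeps every intermediate factor tiny.

```python
import numpy as np

rng = np.random.default_rng(0)
S = 2  # binary variables

# Star graph: P(x0) and P(xi | x0) for i = 1..4.
p0 = rng.dirichlet(np.ones(S))                                 # shape (S,)
cpts = [rng.dirichlet(np.ones(S), size=S) for _ in range(4)]   # each (S, S), rows = x0

# Bad order: eliminating the hub x0 first yields one factor over (x1..x4).
big = np.einsum('a,ab,ac,ad,ae->bcde', p0, *cpts)
print(big.shape, big.size)    # (2, 2, 2, 2) 16 entries -- grows as S**4

# Good order: eliminating the leaves first never produces a factor
# larger than S entries, since sum_{xi} P(xi | x0) = 1.
f = p0.copy()
for cpt in cpts:
    f = f * cpt.sum(axis=1)
print(f.shape, f.size)        # (2,) 2 entries
```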

Belief Propagation (Message Passing)

Belief Propagation (BP), also called the sum-product algorithm, is an efficient exact inference method for tree-structured graphs (acyclic graphs). It works by passing messages between nodes.

Message Passing Formula

Message from node ii to node jj:

m_{i \to j}(x_j) = \sum_{x_i} \psi_{ij}(x_i, x_j) \phi_i(x_i) \prod_{k \in n(i) \setminus j} m_{k \to i}(x_i)

Where:

  • \psi_{ij}(x_i, x_j) = potential function on edge (i, j)
  • \phi_i(x_i) = local potential at node i
  • n(i) \setminus j = neighbors of i except j
  • Messages are passed from the leaves to the root, then from the root back to the leaves

Marginal Probability Formula

After message passing, marginal probability of node ii:

P(x_i) \propto \phi_i(x_i) \prod_{j \in n(i)} m_{j \to i}(x_i)

Normalize to ensure probabilities sum to 1.

Algorithm Steps

  1. Initialize: set all messages to 1 (or a uniform distribution).
  2. Upward pass (leaves → root):
    • Start from the leaf nodes (nodes with only one neighbor)
    • Each leaf sends a message to its neighbor
    • Continue until the root has received messages from all of its children
  3. Downward pass (root → leaves):
    • The root sends messages to all of its children
    • Each node sends messages to its children after receiving the message from its parent
    • Continue until every leaf has received its message
  4. Compute marginals: for each node, combine the incoming messages with the local potential to get the marginal probability (see the sketch below).
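A minimal sum-product sketch on a four-node tree, verified against brute-force enumeration. The tree shape and the random potential tables are illustrative assumptions; the recursion computes each message on demand, which implicitly carries out the leaf-to-root and root-to-leaf schedule above:

```python
import numpy as np
from functools import lru_cache

rng = np.random.default_rng(2)
S = 2  # states per node

# A small tree: edges 0-1, 0-2, 2-3.
edges = [(0, 1), (0, 2), (2, 3)]
neighbors = {i: [] for i in range(4)}
for a, b in edges:
    neighbors[a].append(b)
    neighbors[b].append(a)

# Random positive potentials, purely for illustration.
phi = {i: rng.random(S) + 0.1 for i in range(4)}        # node potentials phi_i
psi = {e: rng.random((S, S)) + 0.1 for e in edges}      # edge potentials psi_ij

def edge_pot(i, j):
    # psi table with axis 0 indexed by x_i and axis 1 by x_j
    return psi[(i, j)] if (i, j) in psi else psi[(j, i)].T

@lru_cache(maxsize=None)
def message(i, j):
    # m_{i->j}(x_j) = sum_{x_i} psi_ij(x_i, x_j) phi_i(x_i) prod_{k in n(i)\j} m_{k->i}(x_i)
    prod = phi[i].copy()
    for k in neighbors[i]:
        if k != j:
            prod *= message(k, i)
    return edge_pot(i, j).T @ prod   # sums over x_i

def marginal(i):
    b = phi[i].copy()
    for j in neighbors[i]:
        b *= message(j, i)
    return b / b.sum()               # normalize to sum to 1

# Brute-force check: enumerate the joint over all S**4 assignments.
joint = np.zeros((S,) * 4)
for idx in np.ndindex(*joint.shape):
    p = np.prod([phi[i][idx[i]] for i in range(4)])
    for a, b in edges:
        p *= psi[(a, b)][idx[a], idx[b]]
    joint[idx] = p
joint /= joint.sum()

for i in range(4):
    others = tuple(a for a in range(4) if a != i)
    assert np.allclose(marginal(i), joint.sum(axis=others))
print(marginal(2))
```

One message per directed edge suffices, which is where the O(N · |S|²) cost quoted below comes from.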

Applicable Scenarios

Tree Structures (Acyclic Graphs)

Exact inference is most efficient for tree-structured graphs (no cycles):

  • Belief propagation: Guaranteed to converge and give exact results
  • Time complexity: O(N \cdot |S|^2), where N is the number of nodes and |S| is the state-space size
  • Examples: Chains (HMM), trees (hierarchical models), polytrees

Graphs with Cycles

For graphs with cycles, exact inference generally becomes intractable:

  • Variable elimination: Complexity can be exponential
  • Belief propagation: May not converge or give incorrect results
  • Solution: Use approximate inference methods (see next module)

Small Graphs

For small graphs (few variables, small state spaces), exact inference remains feasible:

  • Brute force: Enumerate all possible assignments
  • Variable elimination: Efficient even with cycles if the graph is small
  • Examples: Medical diagnosis networks with <10 variables

Advantages and Limitations

Advantages

  • Exact results: No approximation error
  • Efficient for trees: Polynomial time complexity
  • Interpretable: Clear algorithmic steps
  • Deterministic: Same input always gives same output

Limitations

  • Only for trees: Efficient exact inference requires an acyclic graph structure
  • Intractable for cycles: Exponential complexity
  • Large state spaces: Can be slow for many states
  • Memory intensive: May require storing large factors