Bayesian Networks and Probabilistic Graphical Models
1. Bayesian Networks (Directed Models)
Joint Probability Factorization:
P(X₁, X₂, ..., Xₙ) = Π P(Xᵢ | parents(Xᵢ))
Variable Types:
Observed: User inputs and sensor measurements (Uₜ, Zₜ)
Latent/Hidden: States and landmarks (Xₜ, L)
Example Factorization:
P(uₜ, l, xₜ, xₜ₊₁, zₜ, zₜ₊₁) = P(uₜ)P(l)P(xₜ|uₜ)P(xₜ₊₁|xₜ)P(zₜ|xₜ,l)P(zₜ₊₁|l,xₜ₊₁)
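The factorization idea can be sketched numerically on a smaller chain A → B → C (the structure and all CPT numbers below are made up for illustration):

```python
# Joint as a product of local conditionals on the chain A -> B -> C.
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}  # key: (b, a)
P_C_given_B = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.5, (1, 1): 0.5}  # key: (c, b)

def joint(a, b, c):
    """P(a, b, c) = P(a) * P(b | a) * P(c | b)."""
    return P_A[a] * P_B_given_A[(b, a)] * P_C_given_B[(c, b)]

# A valid factorization sums to 1 over all assignments.
total = sum(joint(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
```

The same pattern scales to the six-variable SLAM-style example above: one lookup per factor, multiplied together.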
2. Conditional Independence and D-Separation
Blocking Rules:
Chain (A → B → C): Blocked if B is observed.
Fork (A ← B → C): Blocked if B is observed.
Collider (A → B ← C): Blocked if neither B nor any descendant of B is observed; observing B (or one of its descendants) opens the path.
Graphoid Axioms:
Decomposition: (X ⟂ Y, W | Z) ⇒ (X ⟂ Y | Z)
Weak Union: (X ⟂ Y, W | Z) ⇒ (X ⟂ Y | Z, W)
Contraction: (X ⟂ Y | Z) ∧ (X ⟂ W | Z, Y) ⇒ (X ⟂ Y, W | Z)
Intersection: (X ⟂ Y | Z, W) ∧ (X ⟂ W | Z, Y) ⇒ (X ⟂ Y, W | Z)
3. Inference Algorithms
Variable Elimination:
Sum out variables in order.
Store intermediate factors.
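The two steps above can be sketched on the same made-up chain A → B → C, computing P(C) by summing out A first and storing the intermediate factor:

```python
# Variable elimination on A -> B -> C (made-up CPTs).
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}  # key: (b, a)
P_C_given_B = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.5, (1, 1): 0.5}  # key: (c, b)

# Step 1: sum out A, storing the intermediate factor tau1(b) = sum_a P(a) P(b|a).
tau1 = {b: sum(P_A[a] * P_B_given_A[(b, a)] for a in (0, 1)) for b in (0, 1)}

# Step 2: sum out B, producing the marginal P(c) = sum_b tau1(b) P(c|b).
P_C = {c: sum(tau1[b] * P_C_given_B[(c, b)] for b in (0, 1)) for c in (0, 1)}
```

The intermediate factor tau1 is exactly what gets cached and reused instead of re-expanding the full joint.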
Belief (Sum-Product) Propagation:
Used for tree-structured graphs.
Messages: μ_{f→x}(x) = Σ_{X_f \ x} f(X_f) Π_{y ∈ ne(f) \ x} μ_{y→f}(y), summing over all variables of factor f except x.
Max-Product Algorithm:
Find the Most Probable Explanation (MPE): argmax P(X | evidence).
Replace sums with max and track the argmax.
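The "max instead of sum, track the argmax" recipe can be sketched on the same made-up chain A → B → C, reading the MPE off the back-pointers:

```python
# Max-product on A -> B -> C (made-up CPTs): replace sums with max,
# keep back-pointers, then decode the MPE backwards.
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}  # key: (b, a)
P_C_given_B = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.5, (1, 1): 0.5}  # key: (c, b)

m1, back_a = {}, {}
for b in (0, 1):                       # m1(b) = max_a P(a) P(b|a)
    scores = {a: P_A[a] * P_B_given_A[(b, a)] for a in (0, 1)}
    back_a[b] = max(scores, key=scores.get)
    m1[b] = scores[back_a[b]]

m2, back_b = {}, {}
for c in (0, 1):                       # m2(c) = max_b m1(b) P(c|b)
    scores = {b: m1[b] * P_C_given_B[(c, b)] for b in (0, 1)}
    back_b[c] = max(scores, key=scores.get)
    m2[c] = scores[back_b[c]]

c_star = max(m2, key=m2.get)           # MPE, read off backwards
b_star = back_b[c_star]
a_star = back_a[b_star]
```

On a chain this is exactly the Viterbi algorithm; on general trees the same replacement applies to every message.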
Conditioning via Marginalization:
P(X₂ | X₄) = P(X₂, X₄) / Σ_{X₂} P(X₂, X₄)
4. Junction Tree Algorithm
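A numeric sketch of that normalization, using a made-up joint table over two binary variables:

```python
# Condition by normalizing a slice of the joint (made-up numbers).
joint = {(0, 0): 0.30, (0, 1): 0.10,   # key: (x2, x4)
         (1, 0): 0.20, (1, 1): 0.40}

x4 = 1
norm = sum(joint[(x2, x4)] for x2 in (0, 1))            # = P(X4 = 1)
P_x2_given_x4 = {x2: joint[(x2, x4)] / norm for x2 in (0, 1)}
```

The denominator is just the joint summed over the query variable, so the conditional always renormalizes to 1.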
Steps:
Moralize: Marry all parents (drop directions and connect parents).
Triangulate: Add chord edges so that every cycle of length four or more has a chord (an edge between two non-adjacent nodes on the cycle).
Form Clusters: Create clusters based on the elimination order.
Build Junction Tree: Connect clusters via separators.
Elimination Order Impact:
Good order: Results in small cliques and efficient inference.
Bad order: Results in large cliques and exponential complexity.
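The moralization step can be sketched directly; the parent sets below reuse the X₁–X₅ structure from the factor-graph example in Section 6:

```python
from itertools import combinations

# Moralize: connect ("marry") the parents of each node, then drop directions.
parents = {"X1": [], "X2": [], "X3": ["X1", "X2"],
           "X4": ["X3"], "X5": ["X2"]}

moral = {v: set() for v in parents}
for child, ps in parents.items():
    for p in ps:                       # undirected parent-child edges
        moral[child].add(p)
        moral[p].add(child)
    for a, b in combinations(ps, 2):   # marry co-parents of the same child
        moral[a].add(b)
        moral[b].add(a)
```

Here X₁ and X₂ get married because they share the child X₃; all other edges just lose their direction.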
5. Treewidth
Definition:
Treewidth = min_over_orders(max_clique_size) - 1
Finding Treewidth:
Try different elimination orders.
Find the smallest possible maximum clique.
Subtract 1 from the result.
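The three steps above can be sketched as a brute-force search over elimination orders (exponential, so only viable for toy graphs):

```python
from itertools import permutations

def treewidth(adj):
    """Brute-force treewidth: simulate elimination under every order.
    adj maps node -> set of neighbors (undirected)."""
    nodes = list(adj)
    best = len(nodes)                          # trivial upper bound
    for order in permutations(nodes):
        g = {v: set(adj[v]) for v in nodes}
        width = 0
        for v in order:
            nbrs = g[v]
            width = max(width, len(nbrs))      # clique size minus 1
            for a in nbrs:
                g[a] |= nbrs - {a}             # fill-in: connect v's neighbors
                g[a].discard(v)
            del g[v]
        best = min(best, width)                # keep the best order's width
    return best

# A 4-cycle has treewidth 2 (one fill-in edge needed); a chain has treewidth 1.
cycle4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
chain3 = {1: {2}, 2: {1, 3}, 3: {2}}
```

Each eliminated node forms a clique with its current neighbors, so the "subtract 1" falls out of tracking the neighbor count rather than the clique size.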
6. Factor Graphs
Conversion from Bayesian Network:
Create a factor node for each Conditional Probability Table (CPT).
Connect variables to their respective factors.
Example:
P(X₁)P(X₂)P(X₃|X₁,X₂)P(X₄|X₃)P(X₅|X₂) → Factors: f₁(X₁), f₂(X₂), f₁₂₃(X₁,X₂,X₃), f₃₄(X₃,X₄), f₂₅(X₂,X₅)
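The conversion is purely structural, so it can be sketched as building the bipartite adjacency from each CPT's scope (factor names below follow the example):

```python
# One factor node per CPT, connected to the variables in its scope.
cpts = {"f1": ["X1"], "f2": ["X2"], "f123": ["X1", "X2", "X3"],
        "f34": ["X3", "X4"], "f25": ["X2", "X5"]}

# Bipartite adjacency: variable -> the factors that mention it.
var_to_factors = {}
for f, scope in cpts.items():
    for v in scope:
        var_to_factors.setdefault(v, []).append(f)
```

X₂ ends up connected to three factor nodes, which is exactly where its incoming messages would multiply during sum-product.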
7. Parameter Learning (MLE)
Gaussian MLE:
μ̂ = (1/N) Σ Xᵢ
σ̂² = (1/N) Σ (Xᵢ - μ̂)²
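A quick sketch of both estimators on synthetic data (true parameters μ = 2.0, σ = 1.5 are chosen arbitrarily):

```python
import numpy as np

# Draw samples from a known Gaussian, then recover mu and sigma^2 by MLE.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)

mu_hat = x.mean()                           # (1/N) sum x_i
sigma2_hat = ((x - mu_hat) ** 2).mean()     # MLE variance: divides by N, not N-1
```

Note the MLE variance divides by N and is therefore slightly biased; the unbiased estimator divides by N − 1.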
Linear Gaussian:
X₃ ~ N(w₀ + w₁X₁ + w₂X₂, σ²)
Solve the normal equations A𝐰 = 𝐛, where Aᵢⱼ = Σ XᵢXⱼ and bᵢ = Σ X₃Xᵢ.
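The normal equations can be sketched on synthetic data; the true weights (w₀, w₁, w₂) = (1.0, 2.0, −0.5) and noise scale 0.3 below are arbitrary choices:

```python
import numpy as np

# Fit X3 ~ N(w0 + w1*X1 + w2*X2, sigma^2) via the normal equations.
rng = np.random.default_rng(1)
N = 5_000
x1, x2 = rng.normal(size=N), rng.normal(size=N)
x3 = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.3, size=N)

# Design matrix with a column of ones so the intercept w0 is handled uniformly.
Phi = np.column_stack([np.ones(N), x1, x2])
A = Phi.T @ Phi              # A_ij = sum_n phi_i(n) phi_j(n)
b = Phi.T @ x3               # b_i  = sum_n x3(n) phi_i(n)
w = np.linalg.solve(A, b)
sigma2 = np.mean((x3 - Phi @ w) ** 2)   # MLE of the noise variance
```

The recovered weights land close to (1.0, 2.0, −0.5) and sigma2 close to 0.3² = 0.09.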
8. Hidden Markov Models (HMMs)
Joint Probability:
P(X, Y) = P(X₁) Π P(Xₜ | Xₜ₋₁) Π P(Yₜ | Xₜ)
With Additional Potentials:
P(X, Y, Z) = P(X₁) Π P(Xₜ | Xₜ₋₁) Π P(Yₜ | Xₜ) Π ψ(Xₜ, Zₜ)
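The basic HMM joint (without the extra ψ potentials) can be evaluated directly for a given state/observation sequence; the 2-state matrices below are made up:

```python
import numpy as np

# HMM joint P(x_{1:T}, y_{1:T}) = P(x1) prod P(x_t|x_{t-1}) prod P(y_t|x_t).
pi = np.array([0.6, 0.4])                  # initial distribution P(X1)
A  = np.array([[0.7, 0.3], [0.2, 0.8]])    # A[i, j] = P(X_t = j | X_{t-1} = i)
B  = np.array([[0.9, 0.1], [0.3, 0.7]])    # B[i, k] = P(Y_t = k | X_t = i)

def hmm_joint(x, y):
    """Joint probability of one hidden path x and one observation sequence y."""
    p = pi[x[0]] * B[x[0], y[0]]
    for t in range(1, len(x)):
        p *= A[x[t - 1], x[t]] * B[x[t], y[t]]
    return p
```

Extra ψ(Xₜ, Zₜ) potentials would simply multiply in one more table lookup per step (with a global normalizer if the ψ are unnormalized).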
9. Key Probability Rules
Conditional Probability:
P(A | B) = P(A, B) / P(B)
Chain Rule:
P(A, B, C) = P(A) P(B | A) P(C | A, B)
Marginalization:
P(A) = Σ_{B, C} P(A, B, C)
10. Common Patterns and Tricks
V-Structure (Collider):
Observing the collider (or one of its descendants) makes its parents dependent.
X → Z ← Y: X ⟂ Y but X ⟂̸ Y | Z.
Explaining Away:
Occurs when there are multiple causes for one effect.
Observing the effect makes the causes dependent on each other.
D-separation Quick Check:
Identify all paths between variables.
Check if any path is unblocked.
If all paths are blocked, the variables are independent.
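The per-path blocking test can be sketched as classifying each interior node on the path as chain/fork/collider; the graph below is the made-up collider example X → Z ← Y with descendant W:

```python
# D-separation check for ONE undirected path, given an observed set.
parents = {"X": [], "Y": [], "Z": ["X", "Y"], "W": ["Z"]}

def descendants(v):
    """All nodes reachable from v along directed edges."""
    out, stack = set(), [v]
    while stack:
        u = stack.pop()
        for child, ps in parents.items():
            if u in ps and child not in out:
                out.add(child)
                stack.append(child)
    return out

def path_blocked(path, observed):
    """True if this path is blocked by `observed` under the d-separation rules."""
    for i in range(1, len(path) - 1):
        a, b, c = path[i - 1], path[i], path[i + 1]
        collider = a in parents[b] and c in parents[b]   # a -> b <- c
        if collider:
            if b not in observed and not (descendants(b) & observed):
                return True        # unobserved collider closes the path
        elif b in observed:
            return True            # observed chain/fork node blocks
    return False
```

For X → Z ← Y the path is blocked marginally, and opened by observing Z or its descendant W; for full d-separation, every path between the two variables must be checked this way.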
Factorization Simplification:
Use conditional independence to reduce expressions:
P(A | B, C) = P(A | B) if A ⟂ C | B
