Bayesian Network

Note

Here I use \(\perp\!\!\!\perp\) as an independence symbol.

A Bayesian network(BN) is a directed acyclic graph (DAG) in which nodes represent random variables, whose joint distribution is as follows,

\[p(x_1, ..., x_D) = \prod_{i=1}^D p\big(x_i| pa(x_i)\big)\]

where \(pa(x_i)\) represents the parents of \(x_i\).

alternate text

< An example of a bayesian network. Source: Aalto course CS-E4820: Advanced probabilistic methods >

BNs are used in ML, because they are

  • a concise way to represent and communicate the structure and assumptions of a model.
  • a compact representation of the joint distribution. => efficient!

Independence in Bayesian networks

Example 1

alternate text
Independent? D connection
\(A \perp\!\!\!\perp B\)
\(A \perp\!\!\!\perp B | C\) Separated
\(A \perp\!\!\!\perp B | E\) Separated
\(D \not\!\perp\!\!\!\perp E | C\) Connected

Example 2

alternate text

< Conditional Independence >

alternate text

< Marginal Independence >

Here’s my real life example about marginal independence. Say you won a $1M lottery(C). You can either buy a $1M house(A) or buy a $1M Ferrari(B). If you’ve bought a Ferrari, you haven’t bought a house for sure.


Collider

A cllider (v-structure, head-to-head meeting) has two incoming arrows along a chosen path.

alternate text

< Source: Aalto course CS-E4820: Advanced probabilistic methods >


D-Connection & D-Separation

  • A path between variables A and B is blocked by a set of variables \(\mathcal{C}\), if
    • there is a collider in the path such that neither the collider nor any of its descendasnts is in the conditioning set \(\mathcal{C}\).
    • there is a non-collider in the path that is in the conditioning set \(\mathcal{C}\).
  • Sets of variables \(\mathcal{A}\) and \(\mathcal{B}\) are d-separated by \(\mathcal{C}\) if all paths between \(\mathcal{A}\) and \(\mathcal{B}\) are blocked by \(\mathcal{C}\).
    • d-separation implies \(A \perp\!\!\!\perp B | C\)
    • \(\mathcal{X}\) and \(\mathcal{Y}\) are d-separated by \(\mathcal{Z}\) in \(G\) iff they are not d-connected by \(\mathcal{Z}\) in \(G\).

Bottom line

  • Non-collider in the conditioning set \(\Rightarrow\) Blocked \(\Rightarrow\) d-separated \(\Rightarrow\) conditionally independent BUT unconditionally dependent
  • Collider or its descendants in the conditioning set \(\Rightarrow\) Not blocked \(\Rightarrow\) d-connected \(\Rightarrow\) conditionally dependent BUT unconditionally independent

Examples

alternate text

< b d-separates a from e. {b,d} d-connect a from e. >

alternate text

< c and e are (unconditionally) d-connected. b d-connects a and e >

alternate text

< t and f are d-connected by g >

alternate text

< b and f are d-separated by u >

Markov equivalence

Two graphs are Markov equivalent if they

  • entail(need) the same conditional independencies
  • equivalently have the same d-separations
alternate text

< A markov equivalent example >

Graph

alternate text
  • Parent: pa(D) = {A,C}
  • Children: ch(D) = E
  • Family: A node itself and its parents.
    • fa(E) = {B,D,E,F}
  • Markov blanket: A node itself, its parents, children and the parents of its children.
    • MB(B) = {A,B,C,D,E,F}