Probabilistic Graphical Model is a probabilistic model for which a graph denotes the conditional dependence structure between random variables. They are commonly used in probability theory, statics--particularly Bayesian statics--and machine learning.
Preliminaries
Factors
Factor is a fundamental building block for defining distributions in high-dimensional spaces. Factor product defined as below $$ \phi(a_1, b_1) \phi(b_1, c_1) = \phi(a_1, b_1, c_1) $$
Reasoning Patterns
- Causal Reasoning
- Evidential Reasoning
- Intercausal Reasoning
Independence
For random variables $$$ X $$$, $$$ Y $$$, $$$ P \models X \perp Y $$$ if:
- $$$ P(X,Y) = P(X)P(Y) $$$
- $$$ P(X \mid Y) = P(X) $$$
- $$$ P(Y \mid X) = P(Y) $$$
For random variables $$$ X $$$, $$$ Y $$$, $$$ Z $$$, $$$ P \models (X \perp Y \mid Z)$$$ if:
- $$$ P(X, Y \mid Z) = P(X \mid Z)P(Y \mid Z)$$$
- $$$ P(X \mid Y, Z) = P(X \mid Z) $$$
- $$$ P(Y \mid X, Z) = P(Y \mid Z) $$$
- $$$ P(X, Y, Z) \propto \phi(X, Z) \phi(Y, Z)$$$
Bayesian Network
Bayesian Network is a directed acyclic graph(DAG)
Nodes represent the random variables $$$X_1$$$, $$$X_2$$$,…,$$$X_n$$$, each node $$$X_i$$$ represents a CPD $$$P(X_i \mid Par_G(X_i))$$$, the joint distribution represented by this graph is $$ P(X_1, X_2, …, X_n) = \prod_in P(X_i \mid Par_G(X_i)) $$
Naive Bayes is a bayesian network with very strong independence assumptions that every pair of features $$$X_i$$$ and $$$X_j$$$ are conditionally independent given class. that is $$ P(X_i \perp X_j \mid C) $$
Naive Bayes can be classified into Bernoulli Naive Bayes and Multinomial Naive Bayes according to the distribution over features.
Dynamic Bayesian Networks are a compact representation for encoding structured distributions over arbitrarily long temporal trajectories, they make assumptions:
- Markov assumption
- Time invariance
Two equivalent views of Bayesian Network structure:
- Factorization: G allows P to be represented
- I-map: Independencies encoded by G hold in P
If P factorizes over a graph G, we can read from the graph independences that must hold in P (an independency map)
Markov Network
Pairwise Markov Network is an undirected graph whose nodes represent the random variables $$$X_1$$$, $$$X_2$$$, …, $$$X_n$$$ and each edge $$$X_i - X_j$$$ is associated with a factor(potential) $$$ \phi_{ij}(X_i - X_j) $$$.
Two equivalent(for positive distributions) views of graph structure:
- Factorization: H allows P to be represented
- I-map: Independencies encoded by H hold in P
If P factorizes over a graph H, we can read from the graph independencies that must hold in P(an independency map)