A probabilistic graphical model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. Graphical models are commonly used in probability theory, statistics (particularly Bayesian statistics), and machine learning.
Preliminaries
Factors
A factor is a fundamental building block for defining distributions in high-dimensional spaces. The factor product combines two factors into a factor over the union of their scopes, for example \(\phi_1(a_1, b_1) \cdot \phi_2(b_1, c_1) = \psi(a_1, b_1, c_1)\).
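Below is a minimal sketch of a factor product in Python, assuming binary variables and a dict-based factor representation; the variable names and potential values are made up for illustration.

```python
# A minimal sketch of a factor product, assuming binary variables and factors
# stored as dicts from assignment tuples to values (hypothetical representation,
# not tied to any particular library).
from itertools import product

def factor_product(phi1, vars1, phi2, vars2):
    """Multiply two factors; the result is a factor over the union of their scopes."""
    out_vars = vars1 + [v for v in vars2 if v not in vars1]
    result = {}
    for assignment in product([0, 1], repeat=len(out_vars)):
        a = dict(zip(out_vars, assignment))
        v1 = phi1[tuple(a[v] for v in vars1)]
        v2 = phi2[tuple(a[v] for v in vars2)]
        result[assignment] = v1 * v2
    return out_vars, result

# phi1(A, B) * phi2(B, C) gives a factor over (A, B, C)
phi_ab = {(0, 0): 0.5, (0, 1): 0.8, (1, 0): 0.1, (1, 1): 0.3}
phi_bc = {(0, 0): 0.2, (0, 1): 0.7, (1, 0): 0.9, (1, 1): 0.4}
scope, phi_abc = factor_product(phi_ab, ["A", "B"], phi_bc, ["B", "C"])
print(scope, phi_abc[(1, 1, 0)])  # phi1(a1, b1) * phi2(b1, c0) = 0.3 * 0.9
```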
Reasoning Patterns
- Causal Reasoning
- Evidential Reasoning
- Intercausal Reasoning
Independence
For random variables \(X\), \(Y\), \(P \models X \perp Y\) if:
- \(P(X, Y) = P(X)P(Y)\)
- \(P(X \mid Y) = P(X)\)
- \(P(Y \mid X) = P(Y)\)
For random variables \(X\), \(Y\), \(Z\), \(P \models (X \perp Y \mid Z)\) if any of the following equivalent conditions holds (a numerical check is sketched after this list):
- \(P(X, Y \mid Z) = P(X \mid Z)P(Y \mid Z)\)
- \(P(X \mid Y, Z) = P(X \mid Z)\)
- \(P(Y \mid X, Z) = P(Y \mid Z)\)
- \(P(X, Y, Z) \propto \phi_1(X, Z)\,\phi_2(Y, Z)\)
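The sketch below checks \((X \perp Y \mid Z)\) numerically for a small discrete joint distribution; the distribution values are made up for illustration, and the joint is deliberately constructed so that the check succeeds.

```python
# Check P(X, Y | Z) = P(X | Z) P(Y | Z) for all assignments of a binary joint,
# using the equivalent form P(x, y, z) * P(z) == P(x, z) * P(y, z).
from itertools import product

def cond_indep(joint, tol=1e-9):
    for x, y, z in product([0, 1], repeat=3):
        p_z = sum(joint[(a, b, z)] for a, b in product([0, 1], repeat=2))
        p_xz = sum(joint[(x, b, z)] for b in [0, 1])
        p_yz = sum(joint[(a, y, z)] for a in [0, 1])
        if abs(joint[(x, y, z)] * p_z - p_xz * p_yz) > tol:
            return False
    return True

# Build a joint where X and Y are independent given Z, so the check passes.
p_z = {0: 0.6, 1: 0.4}
p_x_given_z = {0: 0.3, 1: 0.9}   # P(X=1 | Z=z)
p_y_given_z = {0: 0.5, 1: 0.2}   # P(Y=1 | Z=z)
joint = {}
for x, y, z in product([0, 1], repeat=3):
    px = p_x_given_z[z] if x else 1 - p_x_given_z[z]
    py = p_y_given_z[z] if y else 1 - p_y_given_z[z]
    joint[(x, y, z)] = p_z[z] * px * py
print(cond_indep(joint))  # True
```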
Bayesian Network
A Bayesian network is a directed acyclic graph (DAG) whose nodes represent the random variables \(X_1, X_2, \ldots, X_n\). Each node \(X_i\) is associated with a CPD \(P(X_i \mid \mathrm{Par}_G(X_i))\), and the joint distribution represented by the graph is \(P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Par}_G(X_i))\).
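As a small illustration of this chain-rule factorization, the sketch below evaluates the joint of a hypothetical three-node network D → G ← I; the network structure and CPD numbers are assumptions made for the example, not taken from the text.

```python
# A minimal sketch of the chain-rule factorization for a tiny Bayesian network
# D -> G <- I; the CPD numbers below are illustrative only.
p_d = {0: 0.6, 1: 0.4}
p_i = {0: 0.7, 1: 0.3}
# P(G = 1 | D, I): G's CPD conditions on both of its parents.
p_g1_given_di = {(0, 0): 0.3, (0, 1): 0.9, (1, 0): 0.05, (1, 1): 0.5}

def joint(d, i, g):
    """P(D, I, G) = P(D) * P(I) * P(G | D, I), the product over each node's parents."""
    pg1 = p_g1_given_di[(d, i)]
    return p_d[d] * p_i[i] * (pg1 if g else 1 - pg1)

print(joint(1, 1, 1))  # 0.4 * 0.3 * 0.5 = 0.06
```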
Naive Bayes is a Bayesian network with a very strong independence assumption: every pair of features \(X_i\) and \(X_j\) is conditionally independent given the class, that is, \((X_i \perp X_j \mid C)\).
Naive Bayes models are divided into Bernoulli Naive Bayes and Multinomial Naive Bayes according to the distribution assumed over the features.
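A minimal Bernoulli Naive Bayes sketch is given below: classification multiplies the class prior by \(P(x_i \mid C)\) for each binary feature, relying on the conditional independence assumption above. The class names, feature probabilities, and prior are hypothetical.

```python
# Bernoulli Naive Bayes sketch with illustrative parameters.
prior = {"spam": 0.4, "ham": 0.6}
# P(feature = 1 | class) for three hypothetical binary features.
p_feat = {
    "spam": [0.8, 0.6, 0.1],
    "ham":  [0.1, 0.3, 0.7],
}

def classify(x):
    """Return the class maximizing P(class) * prod_i P(x_i | class)."""
    scores = {}
    for c in prior:
        score = prior[c]
        for xi, p in zip(x, p_feat[c]):
            score *= p if xi else 1 - p
        scores[c] = score
    return max(scores, key=scores.get)

print(classify([1, 1, 0]))  # "spam": 0.4*0.8*0.6*0.9 beats 0.6*0.1*0.3*0.3
```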
Dynamic Bayesian networks are a compact representation for encoding structured distributions over arbitrarily long temporal trajectories. They rely on two assumptions (see the sketch after this list):
- Markov assumption
- Time invariance
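The sketch below shows both assumptions on the simplest possible case, a binary state chain: \(X_t\) depends only on \(X_{t-1}\) (Markov assumption) and the same transition CPD is reused at every step (time invariance). The initial distribution and transition numbers are assumptions made for the example.

```python
# Unroll a binary state chain using a single shared transition CPD.
p_x0 = {0: 0.5, 1: 0.5}
p_next = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.7}  # P(X_t | X_{t-1})

def marginal(t):
    """P(X_t), computed by applying the shared transition model t times."""
    p = dict(p_x0)
    for _ in range(t):
        p = {x: sum(p[prev] * p_next[(prev, x)] for prev in (0, 1)) for x in (0, 1)}
    return p

print(marginal(3))  # distribution over X_3 after three transitions
```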
Two equivalent views of Bayesian Network structure:
- Factorization: G allows P to be represented
- I-map: Independencies encoded by G hold in P
If P factorizes over a graph G, we can read from the graph independencies that must hold in P (an independency map).
Markov Network
A pairwise Markov network is an undirected graph whose nodes represent the random variables \(X_1, X_2, \ldots, X_n\) and in which each edge \(X_i - X_j\) is associated with a factor (potential) \(\phi_{ij}(X_i, X_j)\).
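Below is a minimal sketch of a pairwise Markov network over the chain A − B − C: the unnormalized measure is the product of the edge potentials, and the partition function Z normalizes it into a distribution. The potential values are illustrative.

```python
# Unnormalized measure and partition function for a tiny pairwise Markov network.
from itertools import product

phi_ab = {(0, 0): 30.0, (0, 1): 5.0, (1, 0): 1.0, (1, 1): 10.0}
phi_bc = {(0, 0): 100.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 100.0}

def unnormalized(a, b, c):
    """Tilde-P(a, b, c) = phi_AB(a, b) * phi_BC(b, c)."""
    return phi_ab[(a, b)] * phi_bc[(b, c)]

# The partition function Z sums the unnormalized measure over all assignments.
Z = sum(unnormalized(a, b, c) for a, b, c in product([0, 1], repeat=3))
print(unnormalized(0, 0, 0) / Z)  # P(A=0, B=0, C=0)
```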
Two equivalent (for positive distributions) views of graph structure:
- Factorization: H allows P to be represented
- I-map: Independencies encoded by H hold in P
If P factorizes over a graph H, we can read from the graph independencies that must hold in P (an independency map).