Probabilistic Graphical Models

2013-04-26 | Tags Machine Learning

Probabilistic Graphical Model is a probabilistic model for which a graph denotes the conditional dependence structure between random variables. They are commonly used in probability theory, statics–particularly Bayesian statics–and machine learning.

Preliminaries

Factors

Factor is a fundamental building block for defining distributions in high-dimensional spaces. Factor product defined as below $\\phi(a_1, b_1) \\phi(b_1, c_1) = \\phi(a_1, b_1, c_1)$

Reasoning Patterns

Causal Reasoning
Evidential Reasoning
Intercausal Reasoning

Independence

For random variables $$ X$$, $$ Y$$, $$ P \\models X \\perp Y$$ if:

$$ P(X,Y) = P(X)P(Y)$$
$$ P(X \\mid Y) = P(X)$$
$$ P(Y \\mid X) = P(Y)$$

For random variables $$ X$$, $$ Y$$, $$ Z$$, $$ P \\models (X \\perp Y \\mid Z)$$ if:

$$ P(X, Y \\mid Z) = P(X \\mid Z)P(Y \\mid Z)$$
$$ P(X \\mid Y, Z) = P(X \\mid Z)$$
$$ P(Y \\mid X, Z) = P(Y \\mid Z)$$
$$ P(X, Y, Z) \\propto \\phi(X, Z) \\phi(Y, Z)$$

Bayesian Network

Bayesian Network is a directed acyclic graph(DAG)

Bayesian Network

Nodes represent the random variables $$X_1$$, $$X_2$$,…,$$X_n$$, each node $$X_i$$ represents a CPD $$P(X_i \\mid Par_G(X_i))$$, the joint distribution represented by this graph is $P(X_1, X_2, …, X_n) = \prod_i^n P(X_i \\mid Par_G(X_i))$

Naive Bayes is a bayesian network with very strong independence assumptions that every pair of features $$X_i$$ and $$X_j$$ are conditionally independent given class. that is $P(X_i \\perp X_j \\mid C)$

Naive Bayesian Network

Naive Bayes can be classified into Bernoulli Naive Bayes and Multinomial Naive Bayes according to the distribution over features.

Dynamic Bayesian Networks are a compact representation for encoding structured distributions over arbitrarily long temporal trajectories, they make assumptions:

Markov assumption
Time invariance

Two equivalent views of Bayesian Network structure:

Factorization: G allows P to be represented
I-map: Independencies encoded by G hold in P

If P factorizes over a graph G, we can read from the graph independences that must hold in P (an independency map)

Markov Network

Pairwise Markov Network is an undirected graph whose nodes represent the random variables $$X_1$$, $$X_2$$, …, $$X_n$$ and each edge $$X_i - X_j$$ is associated with a factor(potential) $$ \\phi_{ij}(X_i - X_j)$$.

Markov Network

Two equivalent(for positive distributions) views of graph structure:

Factorization: H allows P to be represented
I-map: Independencies encoded by H hold in P

If P factorizes over a graph H, we can read from the graph independencies that must hold in P(an independency map)

Previous Next