# Probabilistic Graphical Models

A probabilistic graphical model (PGM) is a probabilistic model in which a graph encodes the conditional dependence structure between random variables. PGMs are commonly used in probability theory, statistics (particularly Bayesian statistics), and machine learning.

## Preliminaries

### Factors

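A factor is a fundamental building block for defining distributions in high-dimensional spaces: it maps each joint assignment of the variables in its scope to a real value. The factor product multiplies two factors entry by entry wherever their shared variables agree, producing a factor over the union of the two scopes:

$$\phi(a_1, b_1) \phi(b_1, c_1) = \phi(a_1, b_1, c_1)$$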
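The factor product can be sketched in a few lines of Python. This is only an illustration: the table representation (a dict keyed by assignment tuples), the helper name `factor_product`, and all numeric values are assumptions made for the example, not part of any library.

```python
from itertools import product


def factor_product(scope1, table1, scope2, table2, domains):
    """Multiply two factors stored as {assignment tuple: value} tables."""
    # The result's scope is the union of both scopes, in first-seen order.
    scope = list(scope1) + [v for v in scope2 if v not in scope1]
    table = {}
    for assignment in product(*(domains[v] for v in scope)):
        row = dict(zip(scope, assignment))
        # Look up each input factor on the part of the assignment it covers.
        key1 = tuple(row[v] for v in scope1)
        key2 = tuple(row[v] for v in scope2)
        table[assignment] = table1[key1] * table2[key2]
    return scope, table


# Hypothetical binary variables and made-up factor values.
domains = {"A": [0, 1], "B": [0, 1], "C": [0, 1]}
phi_ab = {(0, 0): 0.5, (0, 1): 0.8, (1, 0): 0.1, (1, 1): 0.3}
phi_bc = {(0, 0): 0.5, (0, 1): 0.7, (1, 0): 0.1, (1, 1): 0.2}

scope, phi_abc = factor_product(["A", "B"], phi_ab, ["B", "C"], phi_bc, domains)
```

Each entry of `phi_abc` is the product of the two input entries that agree on `B`, so the result has one row per joint assignment of `A`, `B`, and `C`.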

### Reasoning Patterns

• Causal Reasoning
• Evidential Reasoning
• Intercausal Reasoning

### Independence

For random variables $$X$$, $$Y$$, $$P \models X \perp Y$$ if:

• $$P(X,Y) = P(X)P(Y)$$
• $$P(X \mid Y) = P(X)$$
• $$P(Y \mid X) = P(Y)$$
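As a small numerical illustration of the first condition, the sketch below checks $$P(X,Y) = P(X)P(Y)$$ on a joint table. The function name `is_independent`, the dict representation, and both example tables are made up for this example.

```python
from itertools import product


def is_independent(joint, tol=1e-9):
    """joint maps (x, y) to P(x, y); test P(x, y) == P(x) * P(y) everywhere."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    # Marginals obtained by summing out the other variable.
    p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
    p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}
    return all(
        abs(joint[(x, y)] - p_x[x] * p_y[y]) <= tol
        for x, y in product(xs, ys)
    )


# Built as an outer product of marginals (0.4, 0.6) and (0.3, 0.7).
independent_joint = {(0, 0): 0.12, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.42}
# Mass concentrated on x == y, so X and Y are clearly dependent.
dependent_joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
```

The tolerance `tol` is there only because floating-point sums rarely match exactly.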

For random variables $$X$$, $$Y$$, $$Z$$, $$P \models (X \perp Y \mid Z)$$ if:

• $$P(X, Y \mid Z) = P(X \mid Z)P(Y \mid Z)$$
• $$P(X \mid Y, Z) = P(X \mid Z)$$
• $$P(Y \mid X, Z) = P(Y \mid Z)$$
• $$P(X, Y, Z) \propto \phi(X, Z) \phi(Y, Z)$$
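To see how the factorized condition relates to the first one, the sketch below builds a joint distribution proportional to $$\phi(X, Z) \phi(Y, Z)$$ and then checks $$P(X, Y \mid Z) = P(X \mid Z)P(Y \mid Z)$$ numerically. The factor values and the helper name `cond_independent` are assumptions chosen for illustration.

```python
from itertools import product

vals = [0, 1]

# Made-up positive factors over (X, Z) and (Y, Z).
phi_xz = {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 3.0, (1, 1): 2.0}
phi_yz = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 5.0, (1, 1): 3.0}

# Joint distribution proportional to the product of the two factors.
unnorm = {
    (x, y, z): phi_xz[(x, z)] * phi_yz[(y, z)]
    for x, y, z in product(vals, repeat=3)
}
total = sum(unnorm.values())
joint = {k: v / total for k, v in unnorm.items()}


def cond_independent(joint, tol=1e-9):
    """Check P(x, y | z) == P(x | z) * P(y | z) for every x, y, z."""
    for z in vals:
        p_z = sum(joint[(x, y, z)] for x in vals for y in vals)
        for x, y in product(vals, vals):
            p_xy_z = joint[(x, y, z)] / p_z
            p_x_z = sum(joint[(x, yy, z)] for yy in vals) / p_z
            p_y_z = sum(joint[(xx, y, z)] for xx in vals) / p_z
            if abs(p_xy_z - p_x_z * p_y_z) > tol:
                return False
    return True
```

Calling `cond_independent(joint)` on this table should report that the conditional independence holds, as the factorized form predicts.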

## Bayesian Network

A Bayesian network is a directed acyclic graph (DAG).
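As a rough sketch of how such a graph defines a joint distribution, the example below attaches one conditional probability table to each node and multiplies them along the graph. The network (Rain → WetGrass ← Sprinkler), the variable names, and every probability are hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical DAG: each variable lists its parents.
parents = {"Rain": [], "Sprinkler": [], "WetGrass": ["Rain", "Sprinkler"]}

# cpds[var][parent values][value] = P(var = value | parents = parent values)
# All numbers are made up.
cpds = {
    "Rain": {(): {0: 0.8, 1: 0.2}},
    "Sprinkler": {(): {0: 0.6, 1: 0.4}},
    "WetGrass": {
        (0, 0): {0: 0.99, 1: 0.01},
        (0, 1): {0: 0.2, 1: 0.8},
        (1, 0): {0: 0.1, 1: 0.9},
        (1, 1): {0: 0.05, 1: 0.95},
    },
}


def joint_probability(assignment):
    """Multiply each node's CPD entry given its parents' values."""
    p = 1.0
    for var, pars in parents.items():
        parent_values = tuple(assignment[q] for q in pars)
        p *= cpds[var][parent_values][assignment[var]]
    return p


variables = list(parents)
# Summing the product over all assignments should give 1 (up to rounding).
total = sum(
    joint_probability(dict(zip(variables, values)))
    for values in product([0, 1], repeat=len(variables))
)
```

Because every table row sums to one, the product of the tables is itself a normalized distribution, which is why `total` comes out as 1 without any explicit normalization.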

Nodes represent the random variables $$X_1$$, $$X_2$$, …, $$X_n$$. Each node $$X_i$$ is associated with a conditional probability distribution (CPD) $$P(X_i \mid Par_G(X_i))$$, and the joint distribution represented by the graph is

$$P(X_1, X_2, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid Par_G(X_i))$$

Naive Bayes is a Bayesian network with a very strong independence assumption: every pair of features $$X_i$$ and $$X_j$$ ($$i \neq j$$) is conditionally independent given the class $$C$$, that is,

$$P \models (X_i \perp X_j \mid C)$$

Naive Bayes can be classified into Bernoulli Naive Bayes and Multinomial Naive Bayes according to the distribution over features.

Dynamic Bayesian Networks are a compact representation for encoding structured distributions over arbitrarily long temporal trajectories. They make two assumptions:

• Markov assumption
• Time invariance

Two equivalent views of Bayesian Network structure:

• Factorization: G allows P to be represented
• I-map: Independencies encoded by G hold in P

If P factorizes over a graph G, we can read from the graph independencies that must hold in P (an independency map).

## Markov Network

A pairwise Markov network is an undirected graph whose nodes represent the random variables $$X_1$$, $$X_2$$, …, $$X_n$$ and in which each edge $$X_i - X_j$$ is associated with a factor (potential) $$\phi_{ij}(X_i, X_j)$$.
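A minimal sketch of a pairwise Markov network follows. It assumes the standard construction in which the joint distribution is the product of the edge potentials divided by a normalizing constant (the partition function, called `Z` here). The chain A - B - C and all potential values are made up for illustration.

```python
from itertools import product

variables = ["A", "B", "C"]

# One potential per edge of the hypothetical chain A - B - C.
# Values are made up; larger entries favour neighbours that agree.
potentials = {
    ("A", "B"): {(0, 0): 10.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 10.0},
    ("B", "C"): {(0, 0): 5.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 5.0},
}


def unnormalized(assignment):
    """Product of all edge potentials evaluated at the assignment."""
    score = 1.0
    for (u, v), table in potentials.items():
        score *= table[(assignment[u], assignment[v])]
    return score


assignments = [dict(zip(variables, values)) for values in product([0, 1], repeat=3)]

# Partition function: sum of the unnormalized measure over all assignments.
Z = sum(unnormalized(a) for a in assignments)


def probability(assignment):
    """Normalized probability of a full assignment."""
    return unnormalized(assignment) / Z
```

Unlike CPDs, potentials are not probabilities and need not sum to one, which is why the explicit division by `Z` is required.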

Two equivalent (for positive distributions) views of graph structure:

• Factorization: H allows P to be represented
• I-map: Independencies encoded by H hold in P

If P factorizes over a graph H, we can read from the graph independencies that must hold in P (an independency map).
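One common way to read independencies from an undirected graph is separation: $$X$$ and $$Y$$ are separated given $$Z$$ if every path between them passes through a node in $$Z$$. The sketch below assumes this criterion and checks it with a breadth-first search. The function name `separated` and the example chain are hypothetical, and the query nodes are assumed not to be in the conditioning set.

```python
from collections import deque


def separated(edges, x, y, given):
    """True if every path from x to y in the undirected graph hits `given`."""
    blocked = set(given)
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Breadth-first search from x that never enters a blocked node.
    frontier, seen = deque([x]), {x}
    while frontier:
        node = frontier.popleft()
        if node == y:
            return False  # found a path that avoids the conditioning set
        for nxt in neighbors.get(node, ()):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                frontier.append(nxt)
    return True


# Hypothetical chain A - B - C - D.
edges = [("A", "B"), ("B", "C"), ("C", "D")]
```

On this chain, `separated(edges, "A", "C", {"B"})` holds because the only path from A to C runs through B, whereas with an empty conditioning set the two nodes are connected.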