# Probabilistic Graphical Models

| Tags Machine Learning

Probabilistic Graphical Model is a probabilistic model for which a graph denotes the conditional dependence structure between random variables. They are commonly used in probability theory, statics–particularly Bayesian statics–and machine learning.

## Preliminaries

### Factors

Factor is a fundamental building block for defining distributions in high-dimensional spaces. Factor product defined as below $$\\phi(a_1, b_1) \\phi(b_1, c_1) = \\phi(a_1, b_1, c_1)$$

### Reasoning Patterns

• Causal Reasoning
• Evidential Reasoning
• Intercausal Reasoning

### Independence

For random variables $$X$$$, $$Y$$$, $$P \\models X \\perp Y$$$if: • $$P(X,Y) = P(X)P(Y)$$$
• $$P(X \\mid Y) = P(X)$$$• $$P(Y \\mid X) = P(Y)$$$

For random variables $$X$$$, $$Y$$$, $$Z$$$, $$P \\models (X \\perp Y \\mid Z)$$$ if:

• $$P(X, Y \\mid Z) = P(X \\mid Z)P(Y \\mid Z)$$$• $$P(X \\mid Y, Z) = P(X \\mid Z)$$$
• $$P(Y \\mid X, Z) = P(Y \\mid Z)$$$• $$P(X, Y, Z) \\propto \\phi(X, Z) \\phi(Y, Z)$$$

## Bayesian Network

Bayesian Network is a directed acyclic graph(DAG) Nodes represent the random variables $$X_1$$$, $$X_2$$$,…,$$X_n$$$, each node $$X_i$$$ represents a CPD $$P(X_i \\mid Par_G(X_i))$$$, the joint distribution represented by this graph is $$P(X_1, X_2, …, X_n) = \prod_i^n P(X_i \\mid Par_G(X_i))$$ Naive Bayes is a bayesian network with very strong independence assumptions that every pair of features $$X_i$$$ and $$X_j$$$are conditionally independent given class. that is $$P(X_i \\perp X_j \\mid C)$$ Naive Bayes can be classified into Bernoulli Naive Bayes and Multinomial Naive Bayes according to the distribution over features. Dynamic Bayesian Networks are a compact representation for encoding structured distributions over arbitrarily long temporal trajectories, they make assumptions: • Markov assumption • Time invariance Two equivalent views of Bayesian Network structure: • Factorization: G allows P to be represented • I-map: Independencies encoded by G hold in P If P factorizes over a graph G, we can read from the graph independences that must hold in P (an independency map) ## Markov Network Pairwise Markov Network is an undirected graph whose nodes represent the random variables $$X_1$$$, $$X_2$$$, …, $$X_n$$$ and each edge $$X_i - X_j$$$is associated with a factor(potential) $$\\phi_{ij}(X_i - X_j)$$$. Two equivalent(for positive distributions) views of graph structure:

• Factorization: H allows P to be represented
• I-map: Independencies encoded by H hold in P

If P factorizes over a graph H, we can read from the graph independencies that must hold in P(an independency map)