Multinomial Naïve Bayes: A Generative Story

Generative story
π = distribution over L labels
for label k = 1 to L: θ_k = distribution over J feature values
for item i = 1 to N:
  z_i ~ Cat(π)
  for each feature j: x_{i,j} ~ Cat(θ_{z_i})

Maximize the log-likelihood
ℒ = Σ_i Σ_j log θ_{z_i, x_{i,j}} + Σ_i log π_{z_i}
s.t. Σ_j θ_{k,j} = 1 ∀k, θ_{k,j} ≥ 0, Σ_k π_k = 1, π_k ≥ 0

via Lagrange multipliers (the ≥ 0 constraints not shown):
ℒ = Σ_i Σ_j log θ_{z_i, x_{i,j}} + Σ_i log π_{z_i} − λ(Σ_k π_k − 1) − Σ_k μ_k (Σ_j θ_{k,j} − 1)
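Setting the derivatives of this Lagrangian to zero yields the count-based estimates on the next slide; a brief sketch of the π step (the θ step is analogous):

```latex
\frac{\partial \mathcal{L}}{\partial \pi_k}
  = \frac{\#\{i : z_i = k\}}{\pi_k} - \lambda = 0
  \;\Rightarrow\; \pi_k = \frac{\#\{i : z_i = k\}}{\lambda},
\qquad
\sum_k \pi_k = 1 \;\Rightarrow\; \lambda = N
\;\Rightarrow\; \pi_k = \frac{\#\{i : z_i = k\}}{N}.
```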
Multinomial Naïve Bayes: Learning

Calculate class priors
For each class k:
  items_k = all items with label k
  π_k = |items_k| / # items

Calculate feature generation terms
For each class k:
  obs_k = single object containing all items labeled k
  for each feature j:
    n_{k,j} = # of occurrences of j in obs_k
    θ_{k,j} = n_{k,j} / Σ_{j'} n_{k,j'}
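A minimal sketch of these counting steps in Python; the `items`/`labels` input format and the optional add-one smoothing are illustrative assumptions, not part of the slide.

```python
from collections import Counter

def train_multinomial_nb(items, labels, smoothing=0.0):
    """items: list of lists of feature values (e.g., word tokens);
    labels: list of class labels, one per item.
    Returns class priors pi[k] and per-class feature distributions theta[k][j]."""
    n_items = len(items)
    classes = set(labels)
    vocab = {j for item in items for j in item}
    pi, theta = {}, {}

    for k in classes:
        # class prior: fraction of items labeled k
        items_k = [item for item, y in zip(items, labels) if y == k]
        pi[k] = len(items_k) / n_items

        # obs_k: pool all items labeled k into one bag of feature counts
        counts = Counter(j for item in items_k for j in item)
        total = sum(counts.values()) + smoothing * len(vocab)
        theta[k] = {j: (counts[j] + smoothing) / total for j in vocab}

    return pi, theta

# usage: pi, theta = train_multinomial_nb([["w1", "w2"], ["w2", "w3"]], ["A", "B"])
```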
Brill and Banko (2001): with enough data, the classifier may not matter. Adapted from Jurafsky & Martin (draft)
Summary: Naïve Bayes is Not So Naïve, but Not Without Issue

Pro: Very fast, low storage requirements | Con: Why not model the posterior in one go (e.g., use conditional maxent)?
Pro: Robust to irrelevant features | Con: Are the features really uncorrelated?
Pro: Very good in domains with many equally important features | Con: Are plain counts always appropriate?
Pro: Optimal if the independence assumptions hold | Con: Are there "better" (automated, more principled) ways of handling missing/noisy data?
Pro: Dependable baseline for text classification (but often not the best)

Adapted from Jurafsky & Martin (draft)
Outline Directed Graphical Models NaΓ―ve Bayes Undirected Graphical Models Factor Graphs Ising Model Message Passing: Graphical Model Inference
Undirected Graphical Models

An undirected graph G = (V, E) that represents a probability distribution over random variables X_1, …, X_n

Joint probability factorizes based on cliques in the graph

Common name: Markov Random Fields

Undirected graphs can have an alternative formulation as Factor Graphs
Markov Random Fields: Undirected Graphs

clique: subset of nodes, where nodes are pairwise connected
maximal clique: a clique that cannot add a node and remain a clique

p(x_1, x_2, x_3, …, x_n) = (1/Z) ∏_C ψ_C(x_C)

Z: global normalization
C: ranges over the maximal cliques
x_C: the variables that are part of clique C
ψ_C: potential function (not necessarily a probability!)

Q: What restrictions should we place on the potentials ψ_C?
A: ψ_C ≥ 0 (or ψ_C > 0)
Terminology: Potential Functions

p(x_1, x_2, x_3, …, x_n) = (1/Z) ∏_C ψ_C(x_C)

ψ_C(x_C) = exp(−E(x_C))    (Boltzmann distribution)

E(x_C): energy function for clique C
(get the total energy of a configuration by summing the individual clique energy functions)
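A small Python sketch of how clique energies combine into potentials and a joint distribution; the three-node triangle graph and the energy values are made-up examples, not from the slides.

```python
import math
from itertools import product

# made-up pairwise energies on a triangle graph over binary variables x, y, z
def E_xy(x, y): return 0.0 if x == y else 1.0   # prefer x and y to agree
def E_yz(y, z): return 0.0 if y == z else 1.0
def E_xz(x, z): return 0.5 * x * z              # arbitrary example term

def unnormalized_p(x, y, z):
    # total energy = sum of clique energies; potential = exp(-energy)
    total_energy = E_xy(x, y) + E_yz(y, z) + E_xz(x, z)
    return math.exp(-total_energy)

# global normalizer Z: sum the unnormalized potentials over all configurations
Z = sum(unnormalized_p(x, y, z) for x, y, z in product([0, 1], repeat=3))
p = {cfg: unnormalized_p(*cfg) / Z for cfg in product([0, 1], repeat=3)}
```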
Ambiguity in Undirected Model Notation

The same undirected graph (a triangle over X, Y, Z) is consistent with either factorization:

p(x, y, z) ∝ ψ(x, y, z)
p(x, y, z) ∝ ψ_1(x, y) ψ_2(y, z) ψ_3(x, z)
Outline Directed Graphical Models NaΓ―ve Bayes Undirected Graphical Models Factor Graphs Ising Model Message Passing: Graphical Model Inference
MRFs as Factor Graphs

Undirected graphs: G = (V, E) that represents p(X_1, …, X_n)

Factor graph of p: bipartite graph of evidence nodes X, factor nodes F, and edges T
Evidence nodes X are the random variables
Factor nodes F take values associated with the potential functions
Edges show what variables are used in which factors
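One way to make the bipartite structure concrete is a small data structure like the sketch below; the class and field names are illustrative assumptions, not a reference implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

@dataclass
class Factor:
    variables: Tuple[str, ...]        # which variable nodes this factor touches
    table: Callable[..., float]       # potential function over those variables

@dataclass
class FactorGraph:
    variables: Dict[str, List[int]] = field(default_factory=dict)  # name -> domain
    factors: List[Factor] = field(default_factory=list)

    def add_variable(self, name, domain):
        self.variables[name] = list(domain)

    def add_factor(self, variables, table):
        # the edge set is implicit: a factor is connected to every variable it names
        self.factors.append(Factor(tuple(variables), table))

# the triangle from the previous slide, written with three pairwise factors
g = FactorGraph()
for v in ("X", "Y", "Z"):
    g.add_variable(v, [0, 1])
g.add_factor(("X", "Y"), lambda x, y: 2.0 if x == y else 1.0)
g.add_factor(("Y", "Z"), lambda y, z: 2.0 if y == z else 1.0)
g.add_factor(("X", "Z"), lambda x, z: 1.0)
```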
Different Factor Graph Notation for the Same Graph

[Figure: the same X–Y–Z graph drawn with a single three-way factor vs. with separate pairwise factors]
Directed vs. Undirected Models: Moralization

Directed graph: x_1, x_2, x_3 are parents of x_4
p(x_1, …, x_4) = p(x_1) p(x_2) p(x_3) p(x_4 | x_1, x_2, x_3)

Moralization: parents of a node in the directed graph must be connected in the undirected graph
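A short sketch of the moralization step (the `parents` dict format is an assumption for illustration): connect all co-parents, then drop edge directions.

```python
from itertools import combinations

def moralize(parents):
    """parents: dict mapping each node to the list of its parents in the directed graph.
    Returns the undirected edge set of the moral graph."""
    edges = set()
    for child, pars in parents.items():
        # keep every parent-child edge, now undirected
        for p in pars:
            edges.add(frozenset((p, child)))
        # "marry" the parents: co-parents must be connected in the undirected graph
        for p1, p2 in combinations(pars, 2):
            edges.add(frozenset((p1, p2)))
    return edges

# the example from the slide: x1, x2, x3 are all parents of x4
print(moralize({"x4": ["x1", "x2", "x3"], "x1": [], "x2": [], "x3": []}))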
Example: Linear Chain

States z_1, z_2, z_3, z_4 with observations w_1, w_2, w_3, w_4

Directed (e.g., hidden Markov model [HMM]; generative)
Directed (e.g., maximum entropy Markov model [MEMM]; conditional)
Undirected, also drawn as a factor graph (e.g., conditional random field [CRF])
Example: Linear Chain Conditional Random Field

Widely used in applications like part-of-speech tagging
  Noun-Mod   Noun    Verb   Noun
  President  Obama   told   Congress …

and named entity recognition
  Person     Person  Other  Org.
  President  Obama   told   Congress …
Linear Chain CRFs for Part of Speech Tagging

A linear chain CRF is a conditional probabilistic model of the sequence of tags z_1, z_2, …, z_N conditioned on the entire input sequence x_{1:N}:

p(z_1, z_2, …, z_N | x_{1:N})
Linear Chain CRFs for Part of Speech Tagging

Factor graph: a pairwise factor f_i connects tags z_i and z_{i+1}; a unary factor g_i attaches to tag z_i

p(z_1, z_2, …, z_N | x_{1:N}) ∝ ∏_{i=1}^{N} exp( f_i(z_i, z_{i+1}) + g_i(z_i) )
Linear Chain CRFs for Part of Speech Tagging

f_i: inter-tag features (can depend on any/all input words x_{1:N})
g_i: solo tag features (can depend on any/all input words x_{1:N})

Feature design, just like in maxent models!

Example:
f_{j, N→V}(z_j, z_{j+1}) = 1 if z_j == N & z_{j+1} == V, else 0
f_{j, told, N→V}(z_j, z_{j+1}) = 1 if z_j == N & z_{j+1} == V & x_j == told, else 0
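A minimal sketch, assuming binary indicator features like those above, of how the unnormalized score of one tag sequence could be computed; the particular feature functions and weights are illustrative, not from the slides.

```python
import math

# illustrative indicator features; each takes (tags, words, i) and returns 0 or 1
def f_noun_to_verb(tags, words, i):
    return 1.0 if tags[i] == "N" and tags[i + 1] == "V" else 0.0

def f_told_noun_to_verb(tags, words, i):
    return 1.0 if tags[i] == "N" and tags[i + 1] == "V" and words[i] == "told" else 0.0

def g_capitalized_noun(tags, words, i):
    return 1.0 if tags[i] == "N" and words[i][0].isupper() else 0.0

# (feature, weight) pairs -- in a real CRF the weights would be learned
TRANSITION_FEATURES = [(f_noun_to_verb, 1.2), (f_told_noun_to_verb, 0.8)]
SOLO_FEATURES = [(g_capitalized_noun, 0.5)]

def unnormalized_score(tags, words):
    """exp of the summed, weighted feature scores; dividing by the sum of this
    quantity over all tag sequences (the normalizer) gives p(z | x)."""
    total = 0.0
    for i in range(len(tags)):
        total += sum(w * g(tags, words, i) for g, w in SOLO_FEATURES)
        if i + 1 < len(tags):
            total += sum(w * f(tags, words, i) for f, w in TRANSITION_FEATURES)
    return math.exp(total)

print(unnormalized_score(["N", "N", "V", "N"], ["President", "Obama", "told", "Congress"]))
```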
Outline Directed Graphical Models NaΓ―ve Bayes Undirected Graphical Models Factor Graphs Ising Model Message Passing: Graphical Model Inference
Example: Ising Model

Image denoising (Bishop, 2006; Fig 8.30)
y_i: observed (noisy) pixel/state, w/ 10% noise
x_i: original pixel/state
[Figure: the original image, the noisy version, and two denoised solutions]

Q: What are the cliques?

E(x, y) = h Σ_i x_i − β Σ_{i,j} x_i x_j − η Σ_i x_i y_i
  h term: allow for a bias
  β term: neighboring pixels should be similar
  η term: x_i and y_i should be correlated

Q: Why subtract the β and η terms?
A: Better states → lower energy (higher potential), since ψ_C(x_C) = exp(−E(x_C))
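A small numpy sketch of this energy for ±1 images, with a greedy ICM-style update as a usage example; the parameter values (h, β, η) and the use of ICM here are assumptions for illustration.

```python
import numpy as np

def ising_energy(x, y, h=0.0, beta=1.0, eta=2.0):
    """E(x, y) = h*sum(x_i) - beta*sum over neighboring pairs x_i*x_j - eta*sum(x_i*y_i),
    for x, y in {-1, +1}^(H x W)."""
    pair_term = np.sum(x[:, :-1] * x[:, 1:]) + np.sum(x[:-1, :] * x[1:, :])  # right + down neighbors
    return h * np.sum(x) - beta * pair_term - eta * np.sum(x * y)

def icm_denoise(y, h=0.0, beta=1.0, eta=2.0, sweeps=5):
    """Iterated conditional modes: flip each pixel only if the flip lowers the energy."""
    x = y.copy()
    for _ in range(sweeps):
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                before = ising_energy(x, y, h, beta, eta)
                x[i, j] *= -1
                if ising_energy(x, y, h, beta, eta) > before:  # flip made things worse, undo
                    x[i, j] *= -1
    return x

# usage: a random +/-1 image stands in for the noisy observation
noisy = np.where(np.random.rand(8, 8) < 0.9, 1, -1)
denoised = icm_denoise(noisy)
```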
Markov Random Fields with Factor Graph Notation

y_i: observed (noisy) pixel/state
x_i: original pixel/state

Variable nodes: the x_i and y_i
Factor nodes are added according to maximal cliques (unary and binary factors in this model)

Factor graphs are bipartite
Outline Directed Graphical Models NaΓ―ve Bayes Undirected Graphical Models Factor Graphs Ising Model Message Passing: Graphical Model Inference
Two Problems for Undirected Models

p(x_1, x_2, x_3, …, x_n) = (1/Z) ∏_C ψ_C(x_C)

Finding the normalizer
Z = Σ_x ∏_m f_m(x_m)

Computing the marginals: sum over all variable combinations, with the x_n coordinate fixed
Z_n(w) = Σ_{x : x_n = w} ∏_m f_m(x_m)

Example: 3 variables, fix the 2nd dimension
Z_2(w) = Σ_{x_1} Σ_{x_3} ∏_m f_m(x = (x_1, w, x_3))

Q: Why are these difficult?
A: Many different combinations (the sums range over exponentially many configurations)
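A brute-force sketch of both quantities for a tiny factor graph (the factors are illustrative); it makes the difficulty concrete, since the loops visit every configuration.

```python
from itertools import product

# illustrative factors over three binary variables (x1, x2, x3)
factors = [
    (("x1", "x2"), lambda a, b: 2.0 if a == b else 1.0),
    (("x2", "x3"), lambda a, b: 2.0 if a == b else 1.0),
]
variables = ["x1", "x2", "x3"]
domain = [0, 1]

def unnormalized(assignment):
    score = 1.0
    for vars_m, f_m in factors:
        score *= f_m(*(assignment[v] for v in vars_m))
    return score

# normalizer: sum over all |domain|^n configurations -- exponential in n
Z = sum(unnormalized(dict(zip(variables, cfg))) for cfg in product(domain, repeat=3))

# marginal of one variable: fix it to w, sum over everything else, divide by Z
def marginal(var, w):
    total = sum(unnormalized(dict(zip(variables, cfg)))
                for cfg in product(domain, repeat=3)
                if dict(zip(variables, cfg))[var] == w)
    return total / Z

print(Z, marginal("x2", 1))
```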
Message Passing: Count the Soldiers If you are the front soldier in the line, say the number βoneβ to the soldier behind you. If you are the rearmost soldier in the line, say the number βoneβ to the soldier in front of you. If a soldier ahead of or behind you says a number to you, add one to it, and say the new number to the soldier on the other side ITILA, Ch 16
Sum-Product Algorithm Main idea: message passing An exact inference algorithm for tree-like graphs Belief propagation (forward-backward for HMMs) is a special case
Sum-Product

Definition of the marginal:
p(x_n = w) = Σ_{x : x_n = w} p(x_1, x_2, …, x_n, …, x_N)

Main idea: use the bipartite nature of the factor graph to efficiently compute the marginals; the factor nodes can act as filters

Alternative marginal computation, via messages μ passed between factor and variable nodes:
p(x_n = w) ∝ ∏_{m ∈ M(n)} μ_{f_m → x_n}(w)
Sum-Product

From variables to factors:
μ_{x_n → f_m}(x_n) = ∏_{m' ∈ M(n) \ m} μ_{f_{m'} → x_n}(x_n)
M(n): set of factors in which variable n participates
(default value of 1 if the product is empty)

From factors to variables:
μ_{f_m → x_n}(x_n) = Σ_{x_m \ x_n} f_m(x_m) ∏_{n' ∈ N(m) \ n} μ_{x_{n'} → f_m}(x_{n'})
N(m): set of variables that the m-th factor depends on
(sum over configurations of the variables for the m-th factor, with variable n fixed)
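A compact sketch of these two updates on a small chain-structured factor graph; the particular variables, factors, and leaf-to-target message schedule are illustrative assumptions.

```python
from itertools import product

# chain factor graph over binary variables: x1 -- f12 -- x2 -- f23 -- x3
domain = [0, 1]
factors = {
    "f12": (("x1", "x2"), lambda a, b: 2.0 if a == b else 1.0),
    "f23": (("x2", "x3"), lambda a, b: 3.0 if a == b else 1.0),
}

def factor_to_var_message(fname, target, incoming):
    """mu_{f -> x}(x): sum the factor over its other variables, weighting each term
    by the product of the incoming variable-to-factor messages."""
    vars_m, f_m = factors[fname]
    others = [v for v in vars_m if v != target]
    msg = {}
    for w in domain:
        total = 0.0
        for cfg in product(domain, repeat=len(others)):
            assignment = dict(zip(others, cfg))
            assignment[target] = w
            term = f_m(*(assignment[v] for v in vars_m))
            for v in others:
                term *= incoming.get((v, fname), {u: 1.0 for u in domain})[assignment[v]]
            total += term
        msg[w] = total
    return msg

# leaf variables x1 and x3 send the all-ones message (empty product) to their factors
var_to_factor = {("x1", "f12"): {w: 1.0 for w in domain},
                 ("x3", "f23"): {w: 1.0 for w in domain}}

# both factors send messages into x2; their product gives the unnormalized marginal
m1 = factor_to_var_message("f12", "x2", var_to_factor)
m2 = factor_to_var_message("f23", "x2", var_to_factor)
unnorm = {w: m1[w] * m2[w] for w in domain}
Z = sum(unnorm.values())
print({w: unnorm[w] / Z for w in domain})   # marginal p(x2)
```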