From Elimination to Belief Propagation

School of Computer Science
The Belief Propagation (Sum-Product) Algorithm
Probabilistic Graphical Models (10-708)
Lecture 5, Sep 31, 2007
Eric Xing
Reading: J-Chap 4

[Figure: signaling network with nodes Receptor A (X1), Receptor B (X2), Kinase C (X3), Kinase D (X4), Kinase E (X5), TF F (X6), Gene G (X7), Gene H (X8)]

From Elimination to Belief Propagation
- Recall that the induced dependency during marginalization is captured in elimination cliques:
  - Summation <-> elimination
  - Intermediate term <-> elimination clique
- Can this lead to a generic inference algorithm?

[Figure: graph on nodes A to H and the elimination cliques it induces]

Tree GMs
- Directed tree: all nodes except the root have exactly one parent.
- Undirected tree: there is a unique path between any pair of nodes.
- Poly-tree: nodes can have multiple parents. (We will come back to this later.)

Equivalence of directed and undirected trees
- Any undirected tree can be converted to a directed tree by choosing a root node and directing all edges away from it.
- A directed tree and the corresponding undirected tree make the same conditional independence assertions.
- Parameterizations are essentially the same:
  - Undirected tree: [equation not preserved in the extracted text]
  - Directed tree: [equation not preserved in the extracted text]
  - Equivalence: [equation not preserved in the extracted text]
  - Evidence: ?
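The three equations on this slide are missing from the extracted text. One standard way to write the correspondence is sketched below; the symbols (r for the chosen root, V and E for the node and edge sets, ψ for potentials, Z for the normalizer) are assumptions made here for illustration and are not recovered from the slide itself.

```latex
\begin{align*}
\text{Undirected tree:}\quad
  & p(x) = \frac{1}{Z} \prod_{i \in V} \psi(x_i) \prod_{(i,j) \in E} \psi(x_i, x_j) \\
\text{Directed tree (edges point from parent } i \text{ to child } j\text{):}\quad
  & p(x) = p(x_r) \prod_{(i,j) \in E} p(x_j \mid x_i) \\
\text{Equivalence:}\quad
  & \psi(x_r) = p(x_r), \quad \psi(x_i, x_j) = p(x_j \mid x_i), \quad
    \psi(x_i) = 1 \ (i \neq r), \quad Z = 1
\end{align*}
```

Under this choice of potentials the two products agree term by term, which is the sense in which the parameterizations are "essentially the same".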

From elimination to message passing
- Recall the ELIMINATION algorithm:
  - Choose an ordering Z in which the query node f is the final node.
  - Place all potentials on an active list.
  - Eliminate node i by removing all potentials containing i and taking the sum of their product over x_i.
  - Place the resultant factor back on the list.
- For a TREE graph:
  - Choose the query node f as the root of the tree.
  - View the tree as a directed tree with edges pointing away from f, towards the leaves.
  - Use an elimination ordering based on depth-first traversal.
  - Elimination of each node can be considered as message passing (or Belief Propagation) directly along tree branches, rather than on some transformed graph.
  - Thus, we can use the tree itself as a data structure to do general inference!

The elimination algorithm

Procedure Initialize(G, Z)
  1. Let Z_1, ..., Z_k be an ordering of Z such that Z_i ≺ Z_j iff i < j
  2. Initialize F with the full set of factors

Procedure Evidence(E)
  1. for each i ∈ I_E: F = F ∪ δ(E_i, e_i)

Procedure Sum-Product-Variable-Elimination(F, Z, ≺)
  1. for i = 1, ..., k: F ← Sum-Product-Eliminate-Var(F, Z_i)
  2. φ* ← ∏_{φ ∈ F} φ
  3. return φ*
  4. Normalization(φ*)

Procedure Sum-Product-Eliminate-Var(F, Z)    // F: set of factors, Z: variable to be eliminated
  1. F′ ← {φ ∈ F : Z ∈ Scope[φ]}
  2. F″ ← F − F′
  3. ψ ← ∏_{φ ∈ F′} φ
  4. τ ← ∑_Z ψ
  5. return F″ ∪ {τ}

Procedure Normalization(φ*)
  1. P(X | E) = φ*(X) / ∑_x φ*(X)
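The elimination procedures above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the factor representation (a `(scope, table)` pair with the table keyed by assignment tuples), the helper names, and the three-node chain A -> B -> C with its numbers are all assumptions made for the example. Evidence is not handled here; following the Evidence procedure, it could be added as extra indicator factors.

```python
from itertools import product

def multiply(f, g, card):
    """Pointwise product of two factors; a factor is (scope, table)."""
    (fs, ft), (gs, gt) = f, g
    scope = tuple(dict.fromkeys(fs + gs))  # ordered union of the two scopes
    table = {}
    for vals in product(*(range(card[v]) for v in scope)):
        a = dict(zip(scope, vals))
        table[vals] = ft[tuple(a[v] for v in fs)] * gt[tuple(a[v] for v in gs)]
    return scope, table

def sum_out(f, var):
    """Marginalize var out of factor f."""
    scope, table = f
    keep = tuple(v for v in scope if v != var)
    out = {}
    for vals, p in table.items():
        key = tuple(x for v, x in zip(scope, vals) if v != var)
        out[key] = out.get(key, 0.0) + p
    return keep, out

def eliminate_var(factors, var, card):
    """Sum-Product-Eliminate-Var: multiply the factors mentioning var, then sum var out."""
    used = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    if not used:
        return rest
    psi = used[0]
    for f in used[1:]:
        psi = multiply(psi, f, card)
    return rest + [sum_out(psi, var)]

def variable_elimination(factors, order, card):
    """Eliminate variables in the given order, then multiply and normalize what is left."""
    for var in order:
        factors = eliminate_var(factors, var, card)
    phi = factors[0]
    for f in factors[1:]:
        phi = multiply(phi, f, card)
    z = sum(phi[1].values())
    return phi[0], {k: v / z for k, v in phi[1].items()}

# Hypothetical chain A -> B -> C over binary variables.
card = {"A": 2, "B": 2, "C": 2}
pA = (("A",), {(0,): 0.6, (1,): 0.4})
pBA = (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
pCB = (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5})

# The query node C comes last, so only A and B are eliminated.
scope, marginal = variable_elimination([pA, pBA, pCB], ["A", "B"], card)
print(scope, marginal)
```

Each call to `eliminate_var` produces one intermediate factor τ, which is exactly the quantity the following slides reinterpret as a message.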

Message passing for trees
- Let m_ji(x_i) denote the factor resulting from eliminating the variables from below up to i, which is a function of x_i. [equation not preserved in the extracted text]
- This is reminiscent of a message sent from j to i.
- m_ji(x_i) represents a "belief" about x_i from x_j!

[Figure: tree rooted at f with nodes i, j, k, l]

Elimination on trees is equivalent to message passing along tree branches!

[Figure: the same tree, with messages flowing along its branches towards f]
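The message equation on this slide is missing from the extracted text. A standard form of the sum-product message on a pairwise tree is given below as a sketch, writing m_{ji}(x_i) for the message that j sends to i (the indexing used for m_21(x_1) on the later slides). The potentials ψ and the neighbor set N(j) are notation assumed here, not recovered from the slide.

```latex
\begin{align*}
m_{ji}(x_i) &= \sum_{x_j} \Big( \psi(x_j)\, \psi(x_i, x_j)
               \prod_{k \in N(j) \setminus i} m_{kj}(x_j) \Big) \\
p(x_f) &\propto \psi(x_f) \prod_{e \in N(f)} m_{ef}(x_f)
\end{align*}
```

For the pictured tree, where j has the two children k and l, the product in the first line is m_{kj}(x_j) m_{lj}(x_j).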

The message passing protocol
- A node can send a message to a neighbor when (and only when) it has received messages from all its other neighbors.
- Computing node marginals, naïve approach: consider each node as the root and execute the message passing algorithm.

[Figure: tree with edges X1-X2, X2-X3, X2-X4. Computing P(X1) uses the messages m32(x2), m42(x2) and m21(x1).]

[Figure: the same tree. Computing P(X2) uses the messages m12(x2), m32(x2) and m42(x2).]

The message passing protocol (continued)

[Figure: tree with edges X1-X2, X2-X3, X2-X4. Computing P(X3) uses the messages m12(x2), m42(x2) and m23(x3).]

Computing node marginals
- Naïve approach:
  - Complexity: NC
  - N is the number of nodes
  - C is the complexity of a complete message passing
- Alternative dynamic programming approach:
  - 2-pass algorithm (next slide)
  - Complexity: 2C!

The message passing protocol: a two-pass algorithm

[Figure: tree with edges X1-X2, X2-X3, X2-X4 carrying all six messages: m21(X1), m12(X2), m32(X2), m42(X2), m23(X3), m24(X4)]

Belief Propagation (SP-algorithm): sequential implementation

[Pseudocode figure not preserved in the extracted text]
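Since the sequential pseudocode did not survive extraction, here is a hedged sketch of a two-pass schedule on the four-node tree from the figure: a collect pass sends messages from the leaves to a chosen root, and a distribute pass sends them back out. The function names, the list-based potentials, and the numeric values are illustrative assumptions, and the brute-force routine is included only to check the result.

```python
from itertools import product

def two_pass_bp(nodes, edges, node_pot, edge_pot, root):
    """Sum-product on an undirected tree: one collect pass, one distribute pass."""
    nbrs = {i: [] for i in nodes}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def psi(i, j, xi, xj):
        # Each edge potential is stored once, keyed in the orientation given in `edges`.
        return edge_pot[(i, j)][xi][xj] if (i, j) in edge_pot else edge_pot[(j, i)][xj][xi]

    msg = {}  # msg[(j, i)][x_i] is the message sent from j to i

    def send(j, i):
        # Called only once j has heard from all of its neighbors other than i.
        vals = []
        for xi in range(len(node_pot[i])):
            total = 0.0
            for xj in range(len(node_pot[j])):
                term = node_pot[j][xj] * psi(i, j, xi, xj)
                for k in nbrs[j]:
                    if k != i:
                        term *= msg[(k, j)][xj]
                total += term
            vals.append(total)
        msg[(j, i)] = vals

    def collect(i, parent):
        for k in nbrs[i]:
            if k != parent:
                collect(k, i)
        if parent is not None:
            send(i, parent)

    def distribute(i, parent):
        for k in nbrs[i]:
            if k != parent:
                send(i, k)
                distribute(k, i)

    collect(root, None)
    distribute(root, None)

    marginals = {}
    for i in nodes:
        b = list(node_pot[i])
        for k in nbrs[i]:
            b = [bx * m for bx, m in zip(b, msg[(k, i)])]
        z = sum(b)
        marginals[i] = [bx / z for bx in b]
    return marginals, msg

def brute_force(nodes, edges, node_pot, edge_pot):
    """Exact marginals by enumerating every joint configuration (for checking only)."""
    totals = {i: [0.0] * len(node_pot[i]) for i in nodes}
    for xs in product(*(range(len(node_pot[i])) for i in nodes)):
        x = dict(zip(nodes, xs))
        p = 1.0
        for i in nodes:
            p *= node_pot[i][x[i]]
        for i, j in edges:
            p *= edge_pot[(i, j)][x[i]][x[j]]
        for i in nodes:
            totals[i][x[i]] += p
    return {i: [t / sum(ts) for t in ts] for i, ts in totals.items()}

# Hypothetical potentials on the tree X1-X2, X2-X3, X2-X4 (binary variables).
nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (2, 4)]
node_pot = {1: [1.0, 2.0], 2: [1.0, 1.0], 3: [3.0, 1.0], 4: [1.0, 1.0]}
agree = [[2.0, 1.0], [1.0, 2.0]]
edge_pot = {e: agree for e in edges}

marginals, msg = two_pass_bp(nodes, edges, node_pot, edge_pot, root=1)
print(marginals)
```

The `msg` dictionary ends up with one entry per direction of each edge, so every node marginal is read off from the same six messages instead of rerunning elimination once per node.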

Belief Propagation (SP-algorithm): parallel synchronous implementation
- For a node of degree d, whenever messages have arrived on any subset of d-1 of its edges, compute the message for the remaining edge and send!
- A pair of messages is computed for each edge, one for each direction.
- All incoming messages are eventually computed for each node.

Correctness of BP on a tree
- Corollary: the synchronous implementation is "non-blocking".
- Theorem: the message passing protocol guarantees obtaining all marginals in the tree.
- What about non-trees?
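The "non-blocking" claim can be illustrated by simulating only the schedule, without any numerics. The sketch below is an assumption-laden illustration (the function name and the round-based bookkeeping are made up for this example): in each synchronous round, every directed message whose d-1 prerequisites have arrived is sent.

```python
def synchronous_schedule(edges):
    """Return, round by round, the directed messages (sender, receiver) that can be sent."""
    nbrs = {}
    for i, j in edges:
        nbrs.setdefault(i, set()).add(j)
        nbrs.setdefault(j, set()).add(i)

    sent = set()   # directed edges (j, i) whose message has already been computed
    rounds = []
    while len(sent) < 2 * len(edges):
        # j may send to i once it has heard from all of its neighbors other than i.
        ready = [(j, i) for j in nbrs for i in nbrs[j]
                 if (j, i) not in sent
                 and all((k, j) in sent for k in nbrs[j] if k != i)]
        if not ready:
            # On a graph with a cycle the protocol can block; on a tree it cannot.
            raise ValueError("no message can be sent: the graph is not a tree")
        rounds.append(sorted(ready))
        sent.update(ready)   # all messages of a round are delivered together
    return rounds

# The four-node tree used on the earlier slides.
rounds = synchronous_schedule([(1, 2), (2, 3), (2, 4)])
for r, batch in enumerate(rounds, start=1):
    print(r, batch)
```

On this tree the leaves send first and the hub X2 answers in the second round, so both directions of every edge are covered after two rounds.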

Another view of SP: Factor Graph

Example 1
[Figure: directed graph on X1, ..., X5 and its factor graph with factors f_a, f_b, f_c, f_d, f_e]
P(X1) P(X2) P(X3 | X1, X2) P(X5 | X1, X3) P(X4 | X2, X3) = f_a(X1) f_b(X2) f_c(X3, X1, X2) f_d(X5, X1, X3) f_e(X4, X2, X3)

Factor Graphs

Example 2
[Figure: undirected triangle on X1, X2, X3 and its factor graph with pairwise factors f_a, f_b, f_c]
ψ(x1, x2, x3) = f_a(x1, x2) f_b(x2, x3) f_c(x3, x1)

Example 3
[Figure: the same undirected triangle and its factor graph with a single factor f_a]
ψ(x1, x2, x3) = f_a(x1, x2, x3)

Factor Tree
- A factor graph is a factor tree if the undirected graph obtained by ignoring the distinction between variable nodes and factor nodes is an undirected tree.

[Figure: undirected graph on X1, X2, X3 and its factor graph with a single factor f_a, where ψ(x1, x2, x3) = f_a(x1, x2, x3)]

Message Passing on a Factor Tree
- Two kinds of messages:
  1. ν: from variables to factors
  2. µ: from factors to variables

[Figure: messages between a variable node x_i and a factor node f_s, with neighboring factors f_1, f_3 and neighboring variables x_j, x_k]
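The update equations for the two kinds of messages are missing from the extracted text. Their usual sum-product form is sketched below; N(i) denotes the factors adjacent to variable i and N(s) the variables adjacent to factor s, which is notation assumed here to match the marginal formula P(x_i) ∝ ∏ µ_si(x_i) on the next slide.

```latex
\begin{align*}
\nu_{is}(x_i) &= \prod_{t \in N(i) \setminus s} \mu_{ti}(x_i) \\
\mu_{si}(x_i) &= \sum_{x_{N(s) \setminus i}} \Big( f_s\big(x_{N(s)}\big)
                 \prod_{j \in N(s) \setminus i} \nu_{js}(x_j) \Big)
\end{align*}
```

A variable multiplies what it has heard from its other factors; a factor multiplies its table by what it has heard from its other variables and sums those variables out.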

Message Passing on a Factor Tree, cont'd
- Message passing protocol: a node can send a message to a neighboring node only when it has received messages from all its other neighbors.
- Marginal probability of nodes:
  P(x_i) ∝ ∏_{s ∈ N(i)} µ_si(x_i) ∝ ν_is(x_i) µ_si(x_i)

[Figure: messages between a variable node x_i and a factor node f_s, with neighboring factors f_1, f_3 and neighboring variables x_j, x_k]

BP on a Factor Tree

[Figure: factor tree with variables X1, X2, X3 and factors f_a(X1), f_b(X2), f_c(X3), f_d(X1, X2), f_e(X2, X3), annotated with the messages ν_1a, µ_a1, ν_2b, µ_b2, ν_3c, µ_c3, ν_1d, µ_d1, ν_2d, µ_d2, ν_2e, µ_e2, ν_3e, µ_e3]
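As a rough sketch of how the ν and µ messages combine into marginals, the code below runs sum-product on a factor tree with the same shape as the figure (unary factors f_a, f_b, f_c and pairwise factors f_d, f_e). The function name, the dictionary-based factor tables, and all numeric values are assumptions for illustration. Messages are computed recursively and cached, which terminates only because the graph is a factor tree.

```python
from functools import lru_cache
from itertools import product

def factor_tree_marginals(factors, card):
    """factors: name -> (scope tuple, table keyed by assignment tuples); card: variable -> size."""
    var_nbrs = {v: [] for v in card}
    for name, (scope, _) in factors.items():
        for v in scope:
            var_nbrs[v].append(name)

    @lru_cache(maxsize=None)
    def nu(i, s):
        """Variable-to-factor message nu_is(x_i): product of mu from the other factors."""
        out = [1.0] * card[i]
        for t in var_nbrs[i]:
            if t != s:
                out = [a * b for a, b in zip(out, mu(t, i))]
        return tuple(out)

    @lru_cache(maxsize=None)
    def mu(s, i):
        """Factor-to-variable message mu_si(x_i): sum out the factor's other variables."""
        scope, table = factors[s]
        incoming = {j: nu(j, s) for j in scope if j != i}
        out = [0.0] * card[i]
        for vals in product(*(range(card[v]) for v in scope)):
            term = table[vals]
            for j, xj in zip(scope, vals):
                if j != i:
                    term *= incoming[j][xj]
            out[vals[scope.index(i)]] += term
        return tuple(out)

    marginals = {}
    for i in card:
        b = [1.0] * card[i]
        for s in var_nbrs[i]:
            b = [a * m for a, m in zip(b, mu(s, i))]
        z = sum(b)
        marginals[i] = [a / z for a in b]
    return marginals

# Hypothetical tables on the factor tree X1 - f_d - X2 - f_e - X3 with unary factors.
card = {"X1": 2, "X2": 2, "X3": 2}
pair = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}
factors = {
    "fa": (("X1",), {(0,): 1.0, (1,): 2.0}),
    "fb": (("X2",), {(0,): 1.0, (1,): 1.0}),
    "fc": (("X3",), {(0,): 1.0, (1,): 1.0}),
    "fd": (("X1", "X2"), pair),
    "fe": (("X2", "X3"), pair),
}

marg = factor_tree_marginals(factors, card)
print(marg)
```

Working the same example by hand gives P(X1) = (1/3, 2/3), P(X2) = (4/9, 5/9) and P(X3) = (13/27, 14/27), which is a convenient check on the message definitions.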

Why factor graph?
- Tree-like graphs to factor trees

[Figure: tree-like undirected graph on X1, ..., X6 and the corresponding factor tree]

- Poly-trees to factor trees

[Figure: poly-tree on X1, ..., X5 and the corresponding factor tree]

Why factor graph?
- Because a factor graph turns tree-like graphs into factor trees,
- and trees are a data structure that guarantees correctness of BP!

[Figure: the tree-like graph on X1, ..., X6 and the poly-tree on X1, ..., X5, each with its factor tree]

Max-product algorithm: computing MAP probabilities

[Equations not preserved in the extracted text]
[Figure: tree rooted at f with nodes i, j, k, l]

Max-product algorithm: computing MAP configurations using a final bookkeeping backward pass

[Equations not preserved in the extracted text]
[Figure: tree rooted at f with nodes i, j, k, l]

Summary
- The Sum-Product algorithm computes singleton marginal probabilities on:
  - Trees
  - Tree-like graphs
  - Poly-trees
- Maximum a posteriori configurations can be computed by replacing sum with max in the sum-product algorithm.
  - Extra bookkeeping is required.
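The "replace sum with max, plus bookkeeping" recipe from the summary can be sketched as follows on a pairwise tree. This is an illustrative sketch under assumed names and made-up potentials, not the slide's own pseudocode: the upward pass records, for each parent state, which child state achieved the maximum, and the backward pass reads those records off starting from the root.

```python
def max_product_map(nodes, edges, node_pot, edge_pot, root):
    """MAP configuration on an undirected tree via max-product with a backward pass."""
    nbrs = {i: [] for i in nodes}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def psi(i, j, xi, xj):
        # Each edge potential is stored once, keyed in the orientation given in `edges`.
        return edge_pot[(i, j)][xi][xj] if (i, j) in edge_pot else edge_pot[(j, i)][xj][xi]

    back = {}  # back[(j, parent)][x_parent] = best state of j given the parent's state

    def collect(j, parent):
        """Upward pass: return the max-message from j to its parent (or the root's scores)."""
        score = list(node_pot[j])
        for k in nbrs[j]:
            if k != parent:
                m = collect(k, j)
                score = [s * mk for s, mk in zip(score, m)]
        if parent is None:
            return score
        msg, arg = [], []
        for xp in range(len(node_pot[parent])):
            cands = [score[xj] * psi(parent, j, xp, xj) for xj in range(len(score))]
            best = max(range(len(cands)), key=cands.__getitem__)
            msg.append(cands[best])
            arg.append(best)      # the bookkeeping needed for the backward pass
        back[(j, parent)] = arg
        return msg

    root_score = collect(root, None)
    assignment = {root: max(range(len(root_score)), key=root_score.__getitem__)}

    def backtrack(i, parent):
        """Backward pass: fix each child to the state recorded for its parent's choice."""
        for k in nbrs[i]:
            if k != parent:
                assignment[k] = back[(k, i)][assignment[i]]
                backtrack(k, i)

    backtrack(root, None)
    return assignment, max(root_score)

# Hypothetical potentials on the tree X1-X2, X2-X3, X2-X4 (binary variables).
nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (2, 4)]
node_pot = {1: [1.0, 5.0], 2: [1.0, 1.0], 3: [2.0, 1.0], 4: [4.0, 1.0]}
agree = [[3.0, 1.0], [1.0, 3.0]]
edge_pot = {e: agree for e in edges}

assignment, value = max_product_map(nodes, edges, node_pot, edge_pot, root=1)
print(assignment, value)
```

The returned value is the unnormalized score of the best configuration; dividing by the partition function would turn it into the MAP probability. Taking the argmax of each node's max-marginal independently can pick inconsistent states when there are ties, which is why the backward pass is used.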

Inference on general GM
- Now, what if the GM is not a tree-like graph?
  - Can we still directly run the message-passing protocol along its edges?
  - For non-trees, we do not have the guarantee that message passing will be consistent!
- Then what?
  - Construct a graph data structure from P that has a tree structure, and run message passing on it!
  - This is the junction tree algorithm.

Elimination Clique
- Recall that the induced dependency during marginalization is captured in elimination cliques:
  - Summation <-> elimination
  - Intermediate term <-> elimination clique
- Can this lead to a generic inference algorithm?

[Figure: graph on nodes A to H and the elimination cliques it induces]

A Clique Tree

[Figure: clique tree built from the elimination cliques, with the messages m_b, m_c, m_d, m_e, m_f, m_g, m_h passed between cliques]

m_e(a, c, d) = ∑_e p(e | c, d) m_g(e) m_f(a, e)

From Elimination to Message Passing
- Elimination ≡ message passing on a clique tree

[Figure: the sequence of graphs produced by eliminating H, G, F, E, ... one node at a time, shown as equivalent to the clique tree with messages m_b, ..., m_h; the example message is m_e(a, c, d) = ∑_e p(e | c, d) m_g(e) m_f(a, e)]

- Messages can be reused.
