

  1. Graphical Models Léon Bottou COS 424 – 4/15/2010

  2. Introduction
People like drawings better than equations.
– A graphical model is a diagram representing certain aspects of the algebraic structure of a probabilistic model.
Purposes
– Visualize the structure of a model.
– Investigate conditional independence properties.
– Some computations are more easily expressed on a graph than written as equations with complicated subscripts.

  3. Summary
I. Directed graphical models
II. Undirected graphical models
III. Inference in graphical models
More
– David Blei runs a complete course on graphical models.

  4. I. Directed graphical models
“Bayesian Networks” (Pearl 1988)

  5. A pattern for independence assumptions
Probability distribution: P(x_1, x_2, x_3, x_4).
Bayesian chain theorem:
P(x_1, x_2, x_3, x_4) = P(x_1) P(x_2 | x_1) P(x_3 | x_1, x_2) P(x_4 | x_1, x_2, x_3)
Independence assumptions (dropping some conditioning variables):
P(x_1, x_2, x_3, x_4) = P(x_1) P(x_2 | x_1) P(x_3 | x_1) P(x_4 | x_1, x_2)
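
To make the saving concrete, here is a minimal Python sketch assuming hypothetical binary variables and hand-picked conditional tables (the numbers are illustrative, not from the slides). It builds the joint from the reduced factorization and checks that it normalizes:

```python
import itertools

# Illustrative conditional tables for binary variables (assumed values).
P1 = [0.6, 0.4]                      # P1[x1]         = P(x1)
P2 = [[0.7, 0.3], [0.2, 0.8]]        # P2[x1][x2]     = P(x2 | x1)
P3 = [[0.5, 0.5], [0.9, 0.1]]        # P3[x1][x3]     = P(x3 | x1)
P4 = [[[0.1, 0.9], [0.4, 0.6]],      # P4[x1][x2][x4] = P(x4 | x1, x2)
      [[0.3, 0.7], [0.5, 0.5]]]

def joint(x1, x2, x3, x4):
    """Joint probability under the reduced factorization."""
    return P1[x1] * P2[x1][x2] * P3[x1][x3] * P4[x1][x2][x4]

# Sanity check: the factored joint still sums to one over all 2^4 states.
total = sum(joint(*x) for x in itertools.product([0, 1], repeat=4))
assert abs(total - 1.0) < 1e-12

# Free parameters: the full joint needs 2^4 - 1 = 15, while this
# factorization needs 1 + 2 + 2 + 4 = 9.
```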

  6. Graphical representation
Bayesian chain theorem:
P(x_1, x_2, x_3, x_4) = P(x_1) P(x_2 | x_1) P(x_3 | x_1, x_2) P(x_4 | x_1, x_2, x_3)
Directed acyclic graph
[Figure: fully connected DAG on x_1 … x_4, one arrow from each variable to every later one]
Arrows do not represent causality!

  7. Graphical representation
Independence assumptions:
P(x_1, x_2, x_3, x_4) = P(x_1) P(x_2 | x_1) P(x_3 | x_1, x_2) P(x_4 | x_1, x_2, x_3)
                      = P(x_1) P(x_2 | x_1) P(x_3 | x_1) P(x_4 | x_1, x_2)
[Figure: the same DAG without the edges x_2 → x_3 and x_3 → x_4]
Missing links represent independence assumptions.

  8. A more complicated example
P(x_1) P(x_2) P(x_3) P(x_4 | x_1, x_2) P(x_5 | x_1, x_2, x_3) P(x_6 | x_4) P(x_7 | x_4, x_5)
[Figure: DAG on x_1 … x_7 with edges x_1, x_2 → x_4; x_1, x_2, x_3 → x_5; x_4 → x_6; x_4, x_5 → x_7]
Parametrization
The graph says nothing about the parametric form of the probabilities.
– Discrete distributions
– Continuous distributions

  9. Discrete distributions
Input x = (x_1, x_2, …, x_d) ∈ {0, 1}^d. Class y ∈ {A_1, …, A_k}.
General generative model:
P(x, y) = P(y) P(x | y)
– k parameters for P(y)
– k 2^d parameters for P(x | y)
Naïve Bayes model:
P(x, y) = P(y) P(x_1 | y) … P(x_d | y)
– k parameters for P(y)
– k d parameters for P(x | y)
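
A minimal sketch of the naïve Bayes model on binary features; the toy dataset and the Laplace smoothing are assumptions added for illustration:

```python
import numpy as np

# Toy binary data (assumed): X is n x d, y holds class indices 0 .. k-1.
X = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 0]])
y = np.array([0, 0, 1, 1])
k, d = 2, X.shape[1]

# k parameters for P(y) and k*d parameters for P(x_j = 1 | y).
prior = np.array([(y == c).mean() for c in range(k)])
cond = np.array([(X[y == c].sum(axis=0) + 1) / ((y == c).sum() + 2)
                 for c in range(k)])                 # Laplace smoothing

def predict(x):
    # argmax_y P(y) * prod_j P(x_j | y), computed in log space for stability.
    logp = np.log(prior) + np.log(cond) @ x + np.log(1 - cond) @ (1 - x)
    return int(np.argmax(logp))

print(predict(np.array([1, 0, 1])))   # -> 0 on this toy data
```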

  10. Discrete distributions
Naïve Bayes model:
P(x, y) = P(y) P(x_1 | y) … P(x_d | y)
ŷ(x) = argmax_y P(x, y)
– k parameters for P(y)
– k d parameters for P(x | y)
Fails when the x_i are correlated!
Linear discriminant model:
P(x, y) = P(x) P(y | x)
ŷ(x) = argmax_y P(x, y) = argmax_y P(y | x)
– k (d + 1) parameters for P(y | x)
– 2^d unused parameters for P(x)
Works when the x_i are correlated!
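
For contrast, a minimal linear discriminant sketch: softmax regression fit by gradient ascent on the conditional likelihood P(y | x). The toy data, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

X = np.array([[1., 0., 1.], [1., 1., 1.], [0., 0., 1.], [0., 1., 0.]])
y = np.array([0, 0, 1, 1])
k, d = 2, X.shape[1]

# k(d + 1) parameters for P(y | x): one weight vector plus bias per class.
W = np.zeros((k, d + 1))
Xb = np.hstack([X, np.ones((len(X), 1))])        # append a bias input

for _ in range(500):                             # plain gradient ascent
    logits = Xb @ W.T
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)            # softmax gives P(y | x)
    W += 0.5 * (np.eye(k)[y] - p).T @ Xb / len(X)  # log-likelihood gradient

print((Xb @ W.T).argmax(axis=1))                 # -> [0 0 1 1]
```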

  11. Continuous distributions
Linear regression
– Input x = (x_1, x_2, …, x_d) ∈ R^d.
– Output y ∈ R.
P(x, y) = P(y | x) P(x)
P(y | x) ∝ exp( −(y − w⊤x)² / (2σ²) )
No need to model P(x).
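
Maximizing this conditional likelihood in w is exactly least squares; a minimal numpy sketch, where the synthetic data and noise level are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)   # Gaussian noise, sigma = 0.1

# Maximizing prod_i P(y_i | x_i) under the Gaussian model = least squares.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_hat)                                # close to w_true
```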

  12. Bayesian regression
Consider a dataset D = {(x_1, y_1), …, (x_n, y_n)}.
P(D, w) = P(w) P(D | w) = P(w) ∏_{i=1}^{n} P(y_i | x_i, w) P(x_i)
[Figure: plate diagram with node w pointing into a plate that contains x_i → y_i, repeated n times]
Plates represent repeated subgraphs. Although the parameter w is explicit, other details about the distributions are not.
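
With a Gaussian prior on w and the Gaussian likelihood from the previous slide, the posterior over w has a closed form; a minimal sketch, noting that the prior precision α, the noise level σ, and the conjugate-Gaussian setup itself are standard assumptions rather than anything fixed by the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
sigma, alpha = 0.1, 1.0                  # noise std and prior precision
y = X @ w_true + sigma * rng.normal(size=100)

# P(w) = N(0, alpha^-1 I) with Gaussian P(y | x, w) gives a Gaussian posterior:
cov = np.linalg.inv(alpha * np.eye(3) + X.T @ X / sigma**2)
mean = cov @ X.T @ y / sigma**2
print(mean)                              # posterior mean, close to w_true
```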

  13. Hidden Markov Models
P(x_1 … x_T, s_1 … s_T) = P(s_1) P(x_1 | s_1) P(s_2 | s_1) P(x_2 | s_2) … P(s_T | s_{T−1}) P(x_T | s_T)
[Figure: chain of hidden states s_1 → s_2 → … → s_T, each state s_t emitting an observation x_t]
What is the relation between this graph and that graph?
[Second figure]
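
A minimal forward-recursion sketch that evaluates P(x_1 … x_T) under exactly this factorization; the two-state transition and emission tables are assumed, illustrative values:

```python
import numpy as np

pi = np.array([0.6, 0.4])       # pi[i]   = P(s_1 = i)
A = np.array([[0.7, 0.3],       # A[i, j] = P(s_t = j | s_{t-1} = i)
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],       # B[i, o] = P(x_t = o | s_t = i)
              [0.3, 0.7]])
obs = [0, 0, 1]                 # an observed sequence x_1 ... x_T

# Forward recursion: alpha[i] = P(x_1 .. x_t, s_t = i).
alpha = pi * B[:, obs[0]]
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]

print(alpha.sum())              # = P(x_1 ... x_T), summing out s_T
```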

  14. Conditional independence patterns (1)
Tail-to-tail: a ← c → b
P(a, b, c) = P(a | c) P(b | c) P(c)
Marginally:
P(a, b) = Σ_c P(a | c) P(b | c) P(c) ≠ P(a) P(b) in general, so a ⊥̸⊥ b | ∅.
Conditioned on c:
P(a, b | c) = P(a, b, c) / P(c) = P(a | c) P(b | c), so a ⊥⊥ b | c.

  15. Conditional independence patterns (2)
Head-to-tail: a → c → b
P(a, b, c) = P(a) P(c | a) P(b | c)
Marginally:
P(a, b) = Σ_c P(a) P(c | a) P(b | c) = P(a) P(b | a) ≠ P(a) P(b) in general, so a ⊥̸⊥ b | ∅.
Conditioned on c:
P(a, b | c) = P(a, b, c) / P(c) = P(a, c) P(b | c) / P(c) = P(a | c) P(b | c), so a ⊥⊥ b | c.

  16. Conditional independence patterns (3)
Head-to-head: a → c ← b
P(a, b, c) = P(a) P(b) P(c | a, b)
Marginally:
P(a, b) = Σ_c P(a) P(b) P(c | a, b) = P(a) P(b), so a ⊥⊥ b | ∅.
Conditioned on c:
P(a, b | c) ≠ P(a | c) P(b | c) in general, so a ⊥̸⊥ b | c.
Example:
c = “the house is shaking”
a = “there is an earthquake”
b = “a truck hits the house”
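
A quick numeric check of this “explaining away” effect on the example above; all probability values are assumed, illustrative numbers:

```python
# P(a), P(b), P(c = 1 | a, b) for earthquake (a), truck (b), shaking (c).
pa, pb = 0.01, 0.01                           # assumed priors
pc = {(0, 0): 0.001, (0, 1): 0.9,
      (1, 0): 0.9,   (1, 1): 0.99}

def joint(a, b, c):
    p = (pa if a else 1 - pa) * (pb if b else 1 - pb)
    return p * (pc[a, b] if c else 1 - pc[a, b])

def p_earthquake(c, b=None):
    """P(a = 1 | c [, b]) by brute-force summation."""
    bs = [0, 1] if b is None else [b]
    num = sum(joint(1, bb, c) for bb in bs)
    den = sum(joint(aa, bb, c) for aa in (0, 1) for bb in bs)
    return num / den

print(p_earthquake(c=1))         # ~0.48: shaking makes an earthquake plausible
print(p_earthquake(c=1, b=1))    # ~0.01: the truck explains the shaking away
```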

  17. D-separation
Problem
– Consider three disjoint sets of nodes: A, B, C.
– When do we have A ⊥⊥ B | C?
Definition
A and B are d-separated by C if all paths from a ∈ A to b ∈ B
– contain a head-to-tail or tail-to-tail node c ∈ C, or
– contain a head-to-head node c such that neither c nor any of its descendants belongs to C.
Theorem
A and B are d-separated by C ⇐⇒ A ⊥⊥ B | C.
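
D-separation can also be tested mechanically: keep only the ancestors of A ∪ B ∪ C, moralize (marry parents, drop arrow directions), delete C, and check whether A can still reach B. Below is a minimal sketch of that standard procedure, with DAGs represented as hypothetical parent dictionaries:

```python
from itertools import combinations

def d_separated(parents, A, B, C):
    """parents: dict mapping node -> set of its parents (a DAG)."""
    # 1. Ancestral subgraph of A | B | C.
    keep, stack = set(), list(A | B | C)
    while stack:
        n = stack.pop()
        if n not in keep:
            keep.add(n)
            stack.extend(parents.get(n, ()))
    # 2. Moralize: link each node to its parents, and its parents together.
    adj = {n: set() for n in keep}
    for n in keep:
        ps = parents.get(n, set()) & keep
        for p in ps:
            adj[n].add(p); adj[p].add(n)
        for p, q in combinations(ps, 2):
            adj[p].add(q); adj[q].add(p)
    # 3. Delete C, then test whether A still reaches B.
    seen, stack = set(), [a for a in A if a not in C]
    while stack:
        n = stack.pop()
        if n in B:
            return False                     # a live path remains
        if n not in seen:
            seen.add(n)
            stack.extend(m for m in adj[n] if m not in C)
    return True

dag = {'c': {'a', 'b'}}                       # head-to-head: a -> c <- b
print(d_separated(dag, {'a'}, {'b'}, set()))  # True:  a and b independent
print(d_separated(dag, {'a'}, {'b'}, {'c'}))  # False: conditioning couples them
```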

  18. II. Undirected graphical models
“Markov Random Fields”

  19. Another independence assumption pattern
Boltzmann distribution:
P(x) = (1/Z) exp(−E(x))   with   Z = Σ_x exp(−E(x))
– The function E(x) is called the energy function.
– The quantity Z is called the partition function.
Markov Random Field
– Let {x_C} be a family of subsets of the variables x.
– The distribution P(x) is a Markov Random Field with cliques {x_C} if there are functions E_C(x_C) such that E(x) = Σ_C E_C(x_C).
Equivalently, P(x) = (1/Z) ∏_C Ψ_C(x_C) with Ψ_C(x_C) = exp(−E_C(x_C)) > 0.

  20. Graphical representation
P(x_1, x_2, x_3, x_4, x_5) = (1/Z) Ψ_1(x_1, x_2) Ψ_2(x_2, x_3) Ψ_3(x_3, x_4, x_5)
[Figure: undirected graph with edges x_1–x_2, x_2–x_3, and the triangle on x_3, x_4, x_5]
– Completely connect the nodes belonging to each x_C.
– Each subset x_C forms a clique of the graph.
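
A minimal sketch instantiating exactly this factorization for binary variables and computing Z by brute-force enumeration; the potential values are assumed, illustrative choices:

```python
import itertools
import math

# Assumed positive potentials over binary variables.
def psi1(x1, x2): return math.exp(0.5 if x1 == x2 else -0.5)
def psi2(x2, x3): return math.exp(0.5 if x2 == x3 else -0.5)
def psi3(x3, x4, x5): return math.exp(x3 + x4 - x5)

def unnorm(x):
    x1, x2, x3, x4, x5 = x
    return psi1(x1, x2) * psi2(x2, x3) * psi3(x3, x4, x5)

# Partition function: sum over all 2^5 configurations (feasible only
# because the model is tiny; computing Z is the hard part in general).
Z = sum(unnorm(x) for x in itertools.product([0, 1], repeat=5))

def P(x):
    return unnorm(x) / Z

print(P((1, 1, 1, 1, 0)))   # a valid probability; all P(x) sum to 1
```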

  21. Markov Blanket
Definition
– The Markov blanket of a variable x_i is the minimal subset B_i of the variables x such that P(x_i | x \ {x_i}) = P(x_i | B_i).
Example
P(x_3 | x_1, x_2, x_4, x_5)
  = Ψ_1(x_1, x_2) Ψ_2(x_2, x_3) Ψ_3(x_3, x_4, x_5) / Σ_{x′_3} Ψ_1(x_1, x_2) Ψ_2(x_2, x′_3) Ψ_3(x′_3, x_4, x_5)
  = Ψ_2(x_2, x_3) Ψ_3(x_3, x_4, x_5) / Σ_{x′_3} Ψ_2(x_2, x′_3) Ψ_3(x′_3, x_4, x_5)
  = P(x_3 | x_2, x_4, x_5)

  22. Graph and Markov blanket
The Markov blanket of an MRF variable is the set of its neighbors.
P(x_3 | x_1, x_2, x_4, x_5) = P(x_3 | x_2, x_4, x_5)
[Figure: the same undirected graph; the neighbors x_2, x_4, x_5 of x_3 form its blanket]
Consequence
– Consider three disjoint sets of nodes: A, B, C.
A ⊥⊥ B | C ⇐⇒ any path between a ∈ A and b ∈ B passes through a node c ∈ C.
Conversely (Hammersley-Clifford theorem)
– Any strictly positive distribution that satisfies such properties with respect to an undirected graph is a Markov Random Field.

  23. Directed vs. undirected graphs
Consider a directed graph:
P(x) = P(x_1) P(x_2) P(x_3 | x_1, x_2) P(x_4 | x_2)
     = Ψ_1(x_1) Ψ_2(x_2) Ψ_3(x_1, x_2, x_3) Ψ_4(x_2, x_4)   (Z = 1)
[Figure: the DAG and its undirected counterpart, where the parents x_1 and x_2 of x_3 are joined]
Every directed factorization can thus be written as an MRF. The opposite inclusion is not true: the undirected graph must marry the parents of x_3 with a moralization link, and that extra link no longer expresses the marginal independence of x_1 and x_2.
Directed and undirected graphs represent different sets of distributions. Neither set is included in the other one.

  24. Example: image denoising
Noise model: randomly flipping a small proportion of the pixels.
Image model: pixel distribution given its four neighbors.
[Figure: a binary image and its noisy version with flipped pixels]
Inference problem
– Given the observed noisy pixels, reconstruct the true pixel distributions.
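
A minimal sketch of this setup as a grid MRF with Ising-style potentials, cleaned up with iterated conditional modes (ICM). The energy coefficients, noise level, synthetic image, and the choice of ICM are all illustrative assumptions; the slides do not fix an inference method:

```python
import numpy as np

rng = np.random.default_rng(0)
true = np.zeros((32, 32), dtype=int)
true[8:24, 8:24] = 1                        # ground truth: a white square
noisy = np.where(rng.random(true.shape) < 0.1, 1 - true, true)  # flip 10%

beta, eta = 1.0, 1.5                        # neighbor and data couplings

def local_energy(x, i, j, v):
    """Energy of setting pixel (i, j) to value v, given the rest of x."""
    e = -eta * (v == noisy[i, j])           # stay close to the observation
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < x.shape[0] and 0 <= nj < x.shape[1]:
            e -= beta * (v == x[ni, nj])    # agree with the four neighbors
    return e

x = noisy.copy()
for _ in range(5):                          # ICM: greedy coordinate descent
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] = min((0, 1), key=lambda v: local_energy(x, i, j, v))

print((noisy != true).mean(), (x != true).mean())   # error before vs. after
```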
