Graphical Models: Monte Carlo Inference
Siamak Ravanbakhsh
Winter 2018
Learning objectives
- the relationship between sampling and inference
- sampling from univariate distributions
- Monte Carlo sampling in graphical models
Monte Carlo inference
calculating marginals:
$p(x_1 = \bar{x}_1) = \sum_{x_2, \dots, x_n} p(\bar{x}_1, x_2, \dots, x_n)$
approximate it by sampling $X^{(l)} \sim p(x)$:
$p(x_1 = \bar{x}_1) \approx \frac{1}{L} \sum_l \mathbb{I}(X_1^{(l)} = \bar{x}_1)$
inference in an exponential family $p_\theta(x) = \exp(\langle \theta, \psi(x) \rangle - A(\theta))$ is about finding the mean parameters $\mu = \mathbb{E}_{p_\theta}[\psi(x)]$
using L samples (particles): $\mu \approx \frac{1}{L} \sum_l \psi(X^{(l)})$
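A minimal sketch of the estimator above: draw samples from a joint, then estimate a marginal as the fraction of samples hitting the event. The two-variable table here is a made-up toy distribution, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy joint over two binary variables (hypothetical numbers):
# rows index x1, columns index x2.
joint = np.array([[0.3, 0.2],
                  [0.1, 0.4]])

# Draw L joint samples by sampling flat table indices with the
# table entries as probabilities.
L = 100_000
flat = rng.choice(4, size=L, p=joint.ravel())
x1 = flat // 2  # recover x1 from the flat index

# Monte Carlo estimate of p(x1 = 1) vs. the exact sum over x2.
estimate = np.mean(x1 == 1)
exact = joint[1].sum()  # 0.5
```

With 100,000 particles the Monte Carlo estimate lands within a couple of standard errors of the exact marginal.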
Sampling from a categorical dist.
assume access to a pseudo-random number generator for $X \sim U(0,1)$
given $p(X = d) = p_d$ for $1 \le d \le D$: partition $[0,1]$ into intervals of lengths $p_1, p_2, \dots, p_D$
generate $X \sim U(0,1)$ and see which interval it falls into
use binary search on the cumulative sums: $O(\log D)$
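The interval-lookup idea above can be sketched with the standard library's `bisect`; the probability vector is an arbitrary example. (In real code the CDF would be built once, not per draw.)

```python
import bisect
import itertools
import random

def sample_categorical(probs, rng):
    """Draw index d with probability probs[d]: inverse CDF + binary search."""
    cdf = list(itertools.accumulate(probs))  # cumulative sums p_1, p_1+p_2, ...
    u = rng.random()                         # U(0,1)
    return bisect.bisect_left(cdf, u)        # O(log D) interval lookup

probs = [0.1, 0.2, 0.3, 0.4]
rng = random.Random(0)
counts = [0, 0, 0, 0]
for _ in range(100_000):
    counts[sample_categorical(probs, rng)] += 1
# empirical frequencies should track probs: roughly 10k, 20k, 30k, 40k
```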
Transforming probability densities
given a random variable $X \sim p_X$, what is the prob. density of $Y = \phi(X)$?
$Y \sim p_Y(y) = p_X(\phi^{-1}(y)) \left| \frac{d\phi^{-1}(y)}{dy} \right|$
$\phi^{-1}(y)$ is the corresponding x; the derivative accounts for how $\phi$ changes the volume around each point y
(bonus) in the multivariate case: determinant of the Jacobian matrix
image: wikipedia
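A small worked instance of the change-of-variables formula, with an assumed transform (not from the slides): take $X \sim U(0,1)$ and $\phi(x) = x^2$.

```latex
% Assumed example: X ~ U(0,1), Y = phi(X) = X^2, so phi^{-1}(y) = sqrt(y).
p_Y(y) = p_X(\sqrt{y}) \left| \frac{d\sqrt{y}}{dy} \right|
       = 1 \cdot \frac{1}{2\sqrt{y}}, \qquad 0 < y \le 1
% sanity check: \int_0^1 \frac{1}{2\sqrt{y}}\, dy = \left[\sqrt{y}\right]_0^1 = 1
```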
Inverse transform sampling
let $p_X$ be uniform: $p_X = U(0,1)$
given a density $p_Y$, let $F_Y(y) = P(Y < y)$ be its CDF
transform X using $\phi(X) = F_Y^{-1}(X)$
what is the density of $Y = \phi(X)$?
$Y \sim p_X(\phi^{-1}(y)) \left| \frac{d\phi^{-1}(y)}{dy} \right| = p_X(F_Y(y)) \left| \frac{dF_Y(y)}{dy} \right| = p_Y(y)$
since $p_X = U(0,1)$ is constant and $\frac{dF_Y(y)}{dy} = p_Y(y)$
images: work.thaslwanter.at, Murphy's book
Inverse transform sampling: example
exponential distribution: $p_Y(y) = \lambda e^{-\lambda y}$
CDF: $F_Y(y) = 1 - e^{-\lambda y}$
calculate the inverse CDF: $F_Y^{-1}(x) = -\frac{1}{\lambda} \ln(1 - x)$
image: wikipedia
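The exponential example above in a few lines of code; the rate $\lambda = 2$ is an arbitrary choice for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0  # rate parameter (arbitrary choice for the demo)

# Inverse transform: push U(0,1) samples through the inverse CDF
# F^{-1}(x) = -ln(1 - x) / lambda to get Exponential(lambda) samples.
u = rng.uniform(size=200_000)
y = -np.log(1.0 - u) / lam

mean = y.mean()  # should be close to E[Y] = 1/lambda = 0.5
```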
Sampling in graphical models
for Bayes-nets: ancestral sampling
find a topological ordering (how?) e.g., D,I,G,S,L or I,S,D,G,L
sample each variable by conditioning on its parents, e.g., $G \sim P(g \mid I, D)$
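A sketch of ancestral sampling on a fragment of the student network (D, I, G); the CPT numbers below are made up for illustration, not the ones from the course example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical CPTs: D (difficulty) and I (intelligence) are binary roots;
# G (grade) takes 3 values and depends on both parents.
p_D = np.array([0.6, 0.4])
p_I = np.array([0.7, 0.3])
p_G_given_DI = {
    (0, 0): [0.3, 0.4, 0.3],
    (0, 1): [0.05, 0.25, 0.7],
    (1, 0): [0.7, 0.25, 0.05],
    (1, 1): [0.5, 0.3, 0.2],
}

def ancestral_sample():
    """Sample (D, I, G) in topological order, each conditioned on its parents."""
    d = rng.choice(2, p=p_D)
    i = rng.choice(2, p=p_I)
    g = rng.choice(3, p=p_G_given_DI[(d, i)])
    return d, i, g

samples = [ancestral_sample() for _ in range(50_000)]
# sanity check: the Monte Carlo marginal p(D = 1) should match the prior 0.4
p_d1 = np.mean([s[0] == 1 for s in samples])
```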
Introducing evidence
what if we have evidence? E.g., how to sample from the posterior $p(D, I, S, L \mid G = g^0)$?
rejection sampling:
find a topological ordering
sample by conditioning on parents
only keep samples compatible with the evidence $(G = g^0)$
wasteful if the evidence has a low probability
Rejection sampling: general form
to sample from $p(x) = \frac{1}{Z} \tilde{p}(x)$
use a proposal distribution $q(x)$ such that $Mq(x) > \tilde{p}(x)$ everywhere
sample $X \sim q(x)$ and accept the sample with probability $\frac{\tilde{p}(X)}{Mq(X)}$
what is the probability of acceptance? $\int_x \frac{\tilde{p}(x)}{Mq(x)} q(x)\, dx = \frac{Z}{M}$
for high-dimensional dists. $\frac{Z}{M}$ becomes small: rejection sampling becomes wasteful
image: Murphy's book
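A concrete instance of the general scheme, with target and proposal chosen for the demo: unnormalized target $\tilde{p}(x) = x(1-x)$ on $[0,1]$ (a Beta(2,2) with $Z = 1/6$ dropped) and a uniform proposal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalized target (demo choice): p~(x) = x(1-x) on [0,1],
# i.e. Beta(2,2) with its normalizer Z = 1/6 dropped.
def p_tilde(x):
    return x * (1.0 - x)

# Proposal q = U(0,1), so q(x) = 1; M = 1/4 works since max p~ = 1/4.
M = 0.25
L = 100_000
x = rng.uniform(size=L)               # X ~ q
u = rng.uniform(size=L)
accepted = x[u < p_tilde(x) / M]      # accept w.p. p~(x) / (M q(x))

rate = accepted.size / L              # should approach Z/M = (1/6)/(1/4) = 2/3
mean = accepted.mean()                # Beta(2,2) has mean 1/2
```

Note how the acceptance rate empirically matches the $Z/M$ formula from the slide.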
Likelihood weighting
what if we have evidence? E.g., how to sample from the posterior $p(D, I, S, L \mid G = g^0)$?
find a topological ordering
assign a weight to each particle: $w^{(l)} \leftarrow 1$
sample by conditioning on parents
when sampling an observed variable, set it to its observed value $G = g^0$ and update the sample's weight:
$w^{(l)} \leftarrow w^{(l)} \times p(G = g^0 \mid D = d^{(l)}, I = i^{(l)})$ (conditioning on the current assignments to its parents)
using weighted particles for inference:
$p(S = s^0 \mid G = g^0) \approx \frac{\sum_l w^{(l)} \mathbb{I}(S^{(l)} = s^0)}{\sum_l w^{(l)}}$
this is a special case of importance sampling
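The procedure above, sketched on a tiny D, I, G fragment with made-up CPTs (the numbers are illustrative assumptions): clamp the observed G, weight each particle by the likelihood of the evidence given the sampled parents, and compare against the exact posterior from enumeration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical CPTs: D, I binary roots; G in {0, 1, 2} depends on both.
p_D = np.array([0.6, 0.4])
p_I = np.array([0.7, 0.3])
p_G = {(0, 0): [0.3, 0.4, 0.3], (0, 1): [0.05, 0.25, 0.7],
       (1, 0): [0.7, 0.25, 0.05], (1, 1): [0.5, 0.3, 0.2]}

g_obs = 2  # evidence: G is observed to be 2

# Likelihood weighting: sample the unobserved variables in topological
# order, clamp G, and weight each particle by p(G = g_obs | d, i).
L = 50_000
num = den = 0.0
for _ in range(L):
    d = rng.choice(2, p=p_D)
    i = rng.choice(2, p=p_I)
    w = p_G[(d, i)][g_obs]     # weight: likelihood of the evidence
    num += w * (i == 1)
    den += w
estimate = num / den           # weighted estimate of p(I = 1 | G = 2)

# Exact posterior by enumeration, for comparison.
joint = {(d, i): p_D[d] * p_I[i] * p_G[(d, i)][g_obs]
         for d in (0, 1) for i in (0, 1)}
exact = (joint[(0, 1)] + joint[(1, 1)]) / sum(joint.values())
```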
Unnormalized importance sampling
Objective: Monte Carlo estimate of $\mathbb{E}_p[f(x)]$
p is difficult to sample from (yet easy to evaluate)
use a proposal distribution q such that $p(x) > 0 \Rightarrow q(x) > 0$
since $\mathbb{E}_p[f(x)] = \int_x p(x) f(x)\, dx = \int_x q(x) \frac{p(x)}{q(x)} f(x)\, dx = \mathbb{E}_q\!\left[\frac{p(x)}{q(x)} f(x)\right]$
sample $X^{(l)} \sim q(x)$ and assign an importance sampling weight $w(X^{(l)}) = \frac{p(X^{(l)})}{q(X^{(l)})}$
$\mathbb{E}_p[f(x)] \approx \frac{1}{L} \sum_l w(X^{(l)}) f(X^{(l)})$ is an unbiased estimator
it can be more efficient than sampling from p itself! (why?)
image: Bishop's book
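A numerical sketch of the unbiased estimator, with target and proposal chosen for the demo: estimate $\mathbb{E}_p[x^2] = 1$ under $p = N(0,1)$ using samples from the wider proposal $q = N(0, 2^2)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target p = N(0, 1) and proposal q = N(0, 2^2); both are demo choices.
def p_pdf(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

def q_pdf(x):
    return np.exp(-0.5 * (x / 2.0) ** 2) / (2.0 * np.sqrt(2 * np.pi))

L = 200_000
x = rng.normal(0.0, 2.0, size=L)   # X ~ q
w = p_pdf(x) / q_pdf(x)            # importance weights p(x)/q(x)

# Unbiased importance sampling estimate of E_p[x^2] = 1.
estimate = np.mean(w * x**2)
```

A heavier-tailed proposal keeps the weights bounded; the reverse choice (narrow q, wide p) can blow up the weight variance.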
Normalized importance sampling
what if we can evaluate $\tilde{p}$, i.e. p up to a constant? $p(x) = \frac{1}{Z} \tilde{p}(x)$
Examples:
posterior in directed models: $p(x \mid E = e) = \frac{1}{p(e)} p(x, e)$
prior in undirected models: $p(x) = \frac{1}{Z} \prod_I \phi_I(x_I)$
define $w(x) = \frac{\tilde{p}(x)}{q(x)}$; then $\mathbb{E}_q[w(x)] = \int_x q(x) \frac{\tilde{p}(x)}{q(x)}\, dx = Z$
since $\mathbb{E}_p[f(x)] = \int_x p(x) f(x)\, dx = \frac{1}{Z} \int_x q(x) w(x) f(x)\, dx = \frac{\mathbb{E}_q[w(x) f(x)]}{\mathbb{E}_q[w(x)]}$
sample $X^{(l)} \sim q(x)$ and assign an importance sampling weight $w(X^{(l)}) = \frac{\tilde{p}(X^{(l)})}{q(X^{(l)})}$
$\mathbb{E}_p[f(x)] \approx \frac{\sum_l w(X^{(l)}) f(X^{(l)})}{\sum_l w(X^{(l)})}$ is a biased estimator (e.g., consider L = 1)
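The self-normalized estimator in code, with demo choices: an unnormalized target $\tilde{p}(x) = e^{-(x-1)^2/2}$ (a $N(1,1)$ with its normalizer $Z = \sqrt{2\pi}$ deliberately dropped) and proposal $q = N(0, 2^2)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalized target: N(1, 1) with Z = sqrt(2*pi) dropped (demo choice).
def p_tilde(x):
    return np.exp(-0.5 * (x - 1.0) ** 2)

# Proposal q = N(0, 2^2) (demo choice).
def q_pdf(x):
    return np.exp(-0.5 * (x / 2.0) ** 2) / (2.0 * np.sqrt(2 * np.pi))

L = 200_000
x = rng.normal(0.0, 2.0, size=L)   # X ~ q
w = p_tilde(x) / q_pdf(x)          # unnormalized weights p~(x)/q(x)

# Self-normalized estimate of E_p[x] = 1: dividing by sum(w) cancels
# the unknown Z, at the cost of a (vanishing) bias.
estimate = np.sum(w * x) / np.sum(w)
```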
Revisiting likelihood weighting
likelihood weighting:
$p(S = s^0 \mid G = g^1, I = i^2) \approx \frac{\sum_l w^{(l)} \mathbb{I}(S^{(l)} = s^0)}{\sum_l w^{(l)}}$
equivalent to normalized importance sampling, with the proposal q given by the Bayes-net with its evidence nodes clamped