CS 4100: Artificial Intelligence
Bayes’ Nets: Sampling

Jan-Willem van de Meent, Northeastern University
[These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]

Bayes’ Net Representation

• A directed, acyclic graph, one node per random variable
• A conditional probability table (CPT) for each node
  • A collection of distributions over X, one for each possible assignment to parent variables
• Bayes’ nets implicitly encode joint distributions
  • As a product of local conditional distributions
  • To see what probability a BN gives to a full assignment, multiply all the relevant conditionals together: P(x_1, x_2, …, x_n) = ∏_i P(x_i | parents(X_i)) (a short code sketch appears at the end of this recap)

Variable Elimination

• Interleave joining and marginalizing
• d^k entries computed for a factor over k variables with domain sizes d
• Ordering of elimination of hidden variables can affect size of factors generated
• Worst case: running time exponential in the size of the Bayes’ net

Approximate Inference: Sampling
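Before turning to sampling, here is a minimal Python sketch of the product rule from the representation recap above, using the Cloudy/Sprinkler/Rain/WetGrass CPTs that the Prior Sampling slides below introduce. The dictionary-based CPT layout and the function name `joint` are illustrative choices, not something the slides prescribe.

```python
# Sketch: score a full assignment as the product of local conditionals.
# CPTs match the Cloudy/Sprinkler/Rain/WetGrass example used later in
# these slides; the dict encoding is an illustrative choice.
P_C = {'+c': 0.5, '-c': 0.5}
P_S = {('+c', '+s'): 0.1, ('+c', '-s'): 0.9,
       ('-c', '+s'): 0.5, ('-c', '-s'): 0.5}
P_R = {('+c', '+r'): 0.8, ('+c', '-r'): 0.2,
       ('-c', '+r'): 0.2, ('-c', '-r'): 0.8}
P_W = {('+s', '+r', '+w'): 0.99, ('+s', '+r', '-w'): 0.01,
       ('+s', '-r', '+w'): 0.90, ('+s', '-r', '-w'): 0.10,
       ('-s', '+r', '+w'): 0.90, ('-s', '+r', '-w'): 0.10,
       ('-s', '-r', '+w'): 0.01, ('-s', '-r', '-w'): 0.99}

def joint(c, s, r, w):
    """P(c, s, r, w) = P(c) * P(s | c) * P(r | c) * P(w | s, r)."""
    return P_C[c] * P_S[(c, s)] * P_R[(c, r)] * P_W[(s, r, w)]

print(joint('+c', '-s', '+r', '+w'))  # 0.5 * 0.9 * 0.8 * 0.99 = 0.3564
```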
Sampling

• Sampling is a lot like repeated simulation
• Why sample?
  • Inference: getting a sample is faster than computing the right answer (e.g. with variable elimination)
  • Predicting the weather, basketball games, …
  • Reinforcement learning: can approximate (Q-)values even when you don’t know the transition function
• Basic idea
  • Draw N samples from a sampling distribution S
  • Compute an approximate posterior probability
  • Show this converges to the true probability P

Sampling from a Given Distribution

• Step 1: Get a sample u from the uniform distribution over [0, 1)
  • E.g. random() in Python
• Step 2: Convert this sample u into an outcome for the given distribution by associating each target outcome with a sub-interval of [0, 1) whose size equals the probability of that outcome (a short code sketch follows the overview list below)

  C      P(C)   sub-interval
  red    0.6    [0.0, 0.6)
  green  0.1    [0.6, 0.7)
  blue   0.3    [0.7, 1.0)

• Example: if random() returns u = 0.83, then our sample is C = blue
• E.g., after sampling 8 times: …

Sampling in Bayes’ Nets

• Prior Sampling
• Rejection Sampling
• Likelihood Weighting
• Gibbs Sampling
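A minimal sketch of the two-step recipe above, assuming the distribution is given as a dict mapping outcomes to probabilities. It relies on Python 3.7+ preserving dict insertion order, so the sub-intervals line up with the table; the function name `sample_discrete` is illustrative.

```python
import random

def sample_discrete(dist):
    """Draw one outcome from a {outcome: probability} dict by carving
    [0, 1) into sub-intervals whose sizes are the probabilities."""
    u = random.random()      # Step 1: u from the uniform distribution on [0, 1)
    cumulative = 0.0
    for outcome, p in dist.items():
        cumulative += p      # Step 2: find the sub-interval containing u
        if u < cumulative:
            return outcome
    return outcome           # guard against floating-point round-off

# red -> [0, 0.6), green -> [0.6, 0.7), blue -> [0.7, 1.0);
# so u = 0.83 would give C = blue.
print(sample_discrete({'red': 0.6, 'green': 0.1, 'blue': 0.3}))
```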
Prior Sampling

• For i = 1, 2, …, n
  • Sample x_i from P(X_i | Parents(X_i))
• Return (x_1, x_2, …, x_n)

Example network: Cloudy → Sprinkler, Cloudy → Rain, Sprinkler → WetGrass, Rain → WetGrass

  P(C):         +c 0.5    -c 0.5
  P(S | C):     +c: +s 0.1, -s 0.9     -c: +s 0.5, -s 0.5
  P(R | C):     +c: +r 0.8, -r 0.2     -c: +r 0.2, -r 0.8
  P(W | S, R):  +s,+r: +w 0.99, -w 0.01
                +s,-r: +w 0.90, -w 0.10
                -s,+r: +w 0.90, -w 0.10
                -s,-r: +w 0.01, -w 0.99

  Samples: +c, -s, +r, +w
           -c, +s, -r, +w
           …

• This process generates samples with probability S_PS(x_1, …, x_n) = ∏_i P(x_i | Parents(X_i)) = P(x_1, …, x_n), i.e. the BN’s joint probability
• Let N_PS(x_1, …, x_n) be the number of samples of an event
• Then lim_{N→∞} N_PS(x_1, …, x_n) / N = S_PS(x_1, …, x_n) = P(x_1, …, x_n)
• i.e., the sampling procedure is consistent* (*different from a consistent heuristic, or arc consistency)

Example

• We’ll draw a batch of samples from the BN:
  +c, -s, +r, +w
  +c, +s, +r, +w
  -c, +s, +r, -w
  +c, -s, +r, +w
  -c, -s, -r, +w
• If we want to know P(W)
  • Count outcomes: <+w: 4, -w: 1>
  • Normalize to get P(W) ≈ <+w: 0.8, -w: 0.2>
  • Estimate will get closer to the true distribution with more samples
  • Can estimate anything else, too
  • What about P(C | +w)? P(C | +r, +w)? P(C | -r, -w)?
  • Fast: can use fewer samples if less time (what’s the drawback?)
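A minimal sketch of prior sampling for this network, hard-coding the CPTs above and estimating P(W) by counting and normalizing, as in the example. The variable names and CPT encoding (probability of the “+” outcome given the parents) are illustrative choices.

```python
import random
from collections import Counter

# CPTs from the slides (probability of the "+" outcome given the parents).
P_C = 0.5
P_S = {'+c': 0.1, '-c': 0.5}
P_R = {'+c': 0.8, '-c': 0.2}
P_W = {('+s', '+r'): 0.99, ('+s', '-r'): 0.90,
       ('-s', '+r'): 0.90, ('-s', '-r'): 0.01}

def prior_sample():
    """Sample each variable from P(X_i | Parents(X_i)) in topological order:
    Cloudy, then Sprinkler and Rain, then WetGrass."""
    c = '+c' if random.random() < P_C else '-c'
    s = '+s' if random.random() < P_S[c] else '-s'
    r = '+r' if random.random() < P_R[c] else '-r'
    w = '+w' if random.random() < P_W[(s, r)] else '-w'
    return (c, s, r, w)

# Estimate P(W) by counting outcomes and normalizing, as in the example.
N = 10_000
counts = Counter(sample[3] for sample in (prior_sample() for _ in range(N)))
print({w: n / N for w, n in counts.items()})
```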
Rejection Sampling

• Let’s say we want P(C)
  • No point keeping all samples around
  • Just tally counts of C as we go
• Let’s say we want P(C | +s)
  • Same idea: tally C outcomes, but ignore (reject) samples which don’t have S = +s
  • This is called rejection sampling
  • It is also consistent for conditional probabilities (i.e., correct in the limit of large N)

  Samples: +c, -s, +r, +w   (rejected: S = -s)
           +c, +s, +r, +w
           -c, +s, +r, -w
           +c, -s, +r, +w   (rejected: S = -s)
           -c, -s, -r, +w   (rejected: S = -s)

Rejection Sampling (algorithm)

• Input: evidence assignments
• For i = 1, 2, …, n
  • Sample x_i from P(X_i | Parents(X_i))
  • If x_i not consistent with evidence
    • Reject: return – no sample is generated in this cycle
• Return (x_1, x_2, …, x_n)
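A hedged sketch of rejection sampling for P(C | +s) on the same network. For simplicity it draws a full prior sample and discards inconsistent ones afterwards, which is equivalent in distribution to rejecting mid-cycle as in the algorithm above, just a bit more wasteful. Function names are illustrative.

```python
import random
from collections import Counter

# CPTs (probability of the "+" outcome given the parents).
P_C = 0.5
P_S = {'+c': 0.1, '-c': 0.5}
P_R = {'+c': 0.8, '-c': 0.2}
P_W = {('+s', '+r'): 0.99, ('+s', '-r'): 0.90,
       ('-s', '+r'): 0.90, ('-s', '-r'): 0.01}

def prior_sample():
    """One full sample, drawn in topological order."""
    c = '+c' if random.random() < P_C else '-c'
    s = '+s' if random.random() < P_S[c] else '-s'
    r = '+r' if random.random() < P_R[c] else '-r'
    w = '+w' if random.random() < P_W[(s, r)] else '-w'
    return {'C': c, 'S': s, 'R': r, 'W': w}

def estimate_rejection(query_var, evidence, n=100_000):
    """Tally query_var only over samples consistent with the evidence."""
    counts = Counter()
    for _ in range(n):
        sample = prior_sample()
        if all(sample[v] == val for v, val in evidence.items()):
            counts[sample[query_var]] += 1
    total = sum(counts.values())
    return {val: c / total for val, c in counts.items()}

# Approximates P(C | +s); the exact answer is <+c: 1/6, -c: 5/6>,
# since P(+c, +s) = 0.5 * 0.1 and P(-c, +s) = 0.5 * 0.5.
print(estimate_rejection('C', {'S': '+s'}))
```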
Likelihood Weighting

• Problem with rejection sampling:
  • If evidence is unlikely, rejects lots of samples
  • Evidence not exploited as you sample
  • Consider P(Shape | blue): with rejection sampling, draws like (pyramid, green), (pyramid, red), (cube, red), (sphere, green) are all thrown away; only the blue draws (pyramid, blue), (sphere, blue), (cube, blue), (sphere, blue) are kept
• Idea: fix evidence and sample the rest
  • Problem: the sample distribution is not consistent!
  • Solution: assign a weight w according to the probability of the evidence given its parents
• Intuition: assign higher w to “good” samples (i.e. samples with high probability for the evidence)

Likelihood Weighting Example: P(C, R | +s, +w)

• Same network and CPTs as in the Prior Sampling example above; S and W are clamped to the evidence, C and R are sampled
• Sample: +c, +s, +r, +w with weight w = P(+s | +c) · P(+w | +s, +r) = 0.1 × 0.99 ≈ 0.099

Likelihood Weighting (algorithm)

• Input: evidence assignment
• w = 1.0
• for i = 1, 2, …, n
  • if X_i is an evidence variable
    • X_i = x_i (from evidence)
    • w = w · P(x_i | Parents(X_i))
  • else
    • Sample x_i from P(X_i | Parents(X_i))
• Return (x_1, x_2, …, x_n), w

• Sampling distribution if z sampled and e fixed evidence: S_WS(z, e) = ∏_i P(z_i | Parents(Z_i))
• Now, samples have weights: w(z, e) = ∏_i P(e_i | Parents(E_i))
• Together, the weighted sampling distribution is consistent: S_WS(z, e) · w(z, e) = ∏_i P(z_i | Parents(Z_i)) · ∏_i P(e_i | Parents(E_i)) = P(z, e)
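A minimal sketch of likelihood weighting for this network, following the algorithm above: evidence variables are clamped and their conditional probabilities multiplied into the weight; everything else is sampled. The helper names (`set_var`, `estimate_lw`) and the dict CPT encoding are illustrative assumptions.

```python
import random
from collections import defaultdict

# CPTs (probability of the "+" outcome given the parents).
P_C = 0.5
P_S = {'+c': 0.1, '-c': 0.5}
P_R = {'+c': 0.8, '-c': 0.2}
P_W = {('+s', '+r'): 0.99, ('+s', '-r'): 0.90,
       ('-s', '+r'): 0.90, ('-s', '-r'): 0.01}

def set_var(name, p_plus, plus, minus, evidence, sample, w):
    """Clamp an evidence variable (multiplying its conditional probability
    into w) or sample it from its conditional distribution."""
    if name in evidence:
        sample[name] = evidence[name]
        return w * (p_plus if evidence[name] == plus else 1.0 - p_plus)
    sample[name] = plus if random.random() < p_plus else minus
    return w

def weighted_sample(evidence):
    """One likelihood-weighted sample: fix evidence, sample the rest."""
    sample, w = {}, 1.0
    w = set_var('C', P_C, '+c', '-c', evidence, sample, w)
    w = set_var('S', P_S[sample['C']], '+s', '-s', evidence, sample, w)
    w = set_var('R', P_R[sample['C']], '+r', '-r', evidence, sample, w)
    w = set_var('W', P_W[(sample['S'], sample['R'])], '+w', '-w',
                evidence, sample, w)
    return sample, w

def estimate_lw(query_var, evidence, n=100_000):
    """Weighted tally of query_var values, normalized at the end."""
    totals = defaultdict(float)
    for _ in range(n):
        sample, w = weighted_sample(evidence)
        totals[sample[query_var]] += w
    z = sum(totals.values())
    return {val: t / z for val, t in totals.items()}

# Approximates P(C | +s, +w) via the consistent weighted estimator above.
print(estimate_lw('C', {'S': '+s', 'W': '+w'}))
```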