
Algorithms for Probabilistic and Deterministic Graphical Models



  1. Algorithms for Probabilistic and Deterministic Graphical Models, Class 1. Rina Dechter. Dechter, Morgan & Claypool book (Dechter 1 book): Chapters 1-2. class1 828X-2018

  2. Textbooks

  3. Outline (class page)
     • Introduction: Constraint and probabilistic graphical models.
     • Constraint networks: Graphs, modeling, inference
     • Inference in constraints: Adaptive consistency, constraint propagation, arc-consistency
     • Graph properties: induced-width, tree-width, chordal graphs, hypertrees, join-trees
     • Bayesian and Markov networks: Representing independencies by graphs
     • Building Bayesian networks.
     • Inference in probabilistic models: Bucket-elimination (summation and optimization), tree-decompositions, join-tree/junction-tree algorithm
     • Search in CSPs: Backtracking, pruning by constraint propagation, backjumping and learning
     • Search in graphical models: AND/OR search spaces for likelihood and optimization queries
     • Approximate bounded inference: weighted mini-bucket, belief propagation, generalized belief propagation
     • Approximation by sampling: MCMC schemes, Gibbs sampling, importance sampling
     • Causal inference with causal graphs.

  4. Course Requirements/Textbook
     • Homework: there will be 5-6 problem sets, worth 50% of the final grade.
     • A term project: paper presentation, a programming project (20%).
     • Final (30%).
     • Books:
       o “Reasoning with Probabilistic and Deterministic Graphical Models”, R. Dechter, Morgan & Claypool, 2013. https://www.morganclaypool.com/doi/abs/10.2200/S00529ED1V01Y201308AIM023
       o “Modeling and Reasoning with Bayesian Networks”, A. Darwiche, Cambridge University Press, 2009.
       o “Constraint Processing”, R. Dechter, Morgan Kaufmann, 2003.

  5. AI Renaissance
     • Deep learning – Fast predictions – “Instinctive”. Tools: Tensorflow, PyTorch, …
     • Probabilistic models – Slow reasoning – “Logical / deliberative”. Tools: Graphical Models, Probabilistic programming, Markov Logic, …

  6. Outline of classes
     • Part 1: Introduction and Inference [Figure: a graph and its join tree with clusters ABC, BDEF, DGF, EFH, FHK, HJ, KLM]
     • Part 2: Search [Figure: context-minimal AND/OR search graph over binary variables A-F]
     • Part 3: Variational Methods and Monte-Carlo Sampling

  7. RoadMap: Introduction and Inference
     • Basics of graphical models – Queries – Examples, applications, and tasks – Algorithms overview
     • Inference algorithms, exact – Bucket elimination for trees – Bucket elimination – Jointree clustering – Elimination orders (for constraints first)
     • Approximate elimination – Decomposition bounds – Mini-bucket & weighted mini-bucket – Belief propagation
     • Summary and Part 2
     [Figures: small graphs over A-E; a graph and its join tree with clusters ABC, BDEF, DGF, EFH, FHK, HJ, KLM]

  8. RoadMap: Introduction and Inference
     • Basics of graphical models – Queries – Examples, applications, and tasks – Algorithms overview
     • Inference algorithms, exact – Bucket elimination for trees – Bucket elimination – Jointree clustering – Elimination orders
     • Approximate elimination – Decomposition bounds – Mini-bucket & weighted mini-bucket – Belief propagation
     • Summary and Class 2
     [Figures: small graphs over A-E; a graph and its join tree with clusters ABC, BDEF, DGF, EFH, FHK, HJ, KLM]

  9. Probabilistic Graphical models
     • Describe structure in large problems – Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence

  10. Probabilistic Graphical models
      • Describe structure in large problems – Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence
      • Examples & Tasks
        – Maximization (MAP): compute the most probable configuration
        – Protein structure prediction: predicting the 3D structure from given sequences
        – PDB: protein design (backbone) algorithms enumerate a combinatorial number of candidate structures to compute the Global Minimum Energy Conformation (GMEC). [Yanover & Weiss 2002] [Bruce R. Donald et al. 2016]

  11. Probabilistic Graphical models
      • Describe structure in large problems – Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence
      • Examples & Tasks
        – Summation & marginalization: marginals and the “partition function”
        – Image segmentation and classification: from an observation y, compute the marginals p(x_i | y), e.g., [Plath et al. 2009]
      [Figure: two observation/marginals image pairs with region labels sky, cow, plane, grass]

  12. Graphical models
      • Describe structure in large problems – Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence
      • Examples & Tasks
        – Mixed inference (marginal MAP, MEU, …)
        – Influence diagrams & optimal decision-making (the “oil wildcatter” problem), e.g., [Raiffa 1968; Shachter 1986]
      [Figure: influence diagram with nodes Test, Drill, Oil sale policy, Test result, Oil produced, Seismic structure, Oil underground, Market information, Test cost, Drill cost, Sales cost, Oil sales]

  13. In more detail…

  14. Constraint Networks
      Example: map coloring
      • Variables - countries (A, B, C, etc.)
      • Values - colors (red, green, blue)
      • Constraints: A ≠ B, A ≠ D, D ≠ E, etc.
      Allowed (A, B) pairs in the slide's relation table: (red, green), (red, yellow), (green, red), (green, yellow), (yellow, green), (yellow, red)
      [Figure: map of regions A-G and the corresponding constraint graph over nodes A-G]
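
As a rough illustration of the map-coloring network above, here is a minimal backtracking sketch in Python. It is an assumption-laden toy: only the three constraints actually named on the slide (A ≠ B, A ≠ D, D ≠ E) are encoded, the variable list is cut down to the four variables they mention, and the colors are the three that appear in the relation table. The names `VARIABLES`, `NOT_EQUAL`, `consistent`, and `backtrack` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical instance: only the constraints named on the slide are included.
VARIABLES: List[str] = ["A", "B", "D", "E"]
COLORS: List[str] = ["red", "green", "yellow"]
NOT_EQUAL: List[Tuple[str, str]] = [("A", "B"), ("A", "D"), ("D", "E")]


def consistent(assignment: Dict[str, str]) -> bool:
    """True if no not-equal constraint is violated among assigned variables."""
    return all(
        assignment[x] != assignment[y]
        for x, y in NOT_EQUAL
        if x in assignment and y in assignment
    )


def backtrack(assignment: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Depth-first search extending a partial coloring one variable at a time."""
    if len(assignment) == len(VARIABLES):
        return dict(assignment)
    var = next(v for v in VARIABLES if v not in assignment)
    for color in COLORS:
        assignment[var] = color
        if consistent(assignment):
            solution = backtrack(assignment)
            if solution is not None:
                return solution
        del assignment[var]  # undo and try the next color
    return None


solution = backtrack({})
# The relation for the single constraint A != B, as a set of allowed pairs.
allowed_ab = {(a, b) for a in COLORS for b in COLORS if a != b}
print(solution)
```

The set `allowed_ab` enumerates the same six pairs as the relation table on the slide; a full instance would add the remaining edges of the constraint graph.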

  15. Propositional Reasoning
      Example: party problem
      • If Alex goes, then Becky goes: A → B
      • If Chris goes, then Alex goes: C → A
      • Question: Is it possible that Chris goes to the party but Becky does not?
      Is the propositional theory φ = {A → B, C → A, ¬B, C} satisfiable?
      [Figure: graph over nodes A, B, C]
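
The satisfiability question for the party problem is small enough to settle by enumerating all eight truth assignments. The sketch below does just that; it is a brute-force check, not one of the inference algorithms covered later in the course, and the helper names `implies` and `theory` are hypothetical.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication p -> q."""
    return (not p) or q


def theory(a: bool, b: bool, c: bool) -> bool:
    """Conjunction of the clauses A -> B, C -> A, not B, C."""
    return implies(a, b) and implies(c, a) and (not b) and c


# Collect every assignment (A, B, C) that satisfies all clauses.
models = [
    (a, b, c)
    for a, b, c in product([False, True], repeat=3)
    if theory(a, b, c)
]
satisfiable = len(models) > 0
print(satisfiable)
```

No assignment survives: C forces A, A forces B, and that contradicts ¬B, so under these two rules Chris cannot go without Becky.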

  16. Radio Link Frequency Assignment Problem (Cabon et al., Constraints 1999) (Koster et al., 4OR 2003)
      • CELAR SCEN-07r: n=162, d=44, m=764, optimum=343592
      • CELAR SCEN-06: n=100, d=44, m=350, optimum=3389
      Dechter, Flairs-2018

  17. Bayesian Networks (Pearl 1988)
      An early example from medical diagnosis. BN = (G, Θ)
      • Nodes and CPDs: Smoking P(S); lung Cancer P(C|S); Bronchitis P(B|S); X-ray P(X|C,S); Dyspnoea P(D|C,B)
      • CPD P(D|C,B), rows given as (C, B): two entries: (0, 0): 0.1, 0.9; (0, 1): 0.7, 0.3; (1, 0): 0.8, 0.2; (1, 1): 0.9, 0.1
      • Combination: product. P(S, C, B, X, D) = P(S) · P(C|S) · P(B|S) · P(X|C,S) · P(D|C,B)
      • Marginalization: sum/max
      • Posterior marginals, probability of evidence, MPE: P(D=0) = Σ_{S,C,B,X} P(S) · P(C|S) · P(B|S) · P(X|C,S) · P(D|C,B)
      • MAP(P) = max_{S,C,B,X} P(S) · P(C|S) · P(B|S) · P(X|C,S) · P(D|C,B)
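
To make the sum and max queries on this network concrete, here is a small enumeration sketch. Only the P(D|C,B) entries come from the slide, and it is an assumption that its two columns are D=0 and D=1 in that order. Every other CPT value below is a made-up placeholder chosen only so that the code runs; none of them is from the course material.

```python
from itertools import product

# Hypothetical placeholder CPTs (NOT from the slides).
P_S = {0: 0.7, 1: 0.3}
P_C_GIVEN_S = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.8, 1: 0.2}}  # [s][c]
P_B_GIVEN_S = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.6, 1: 0.4}}    # [s][b]
P_X_GIVEN_CS = {                                             # [(c, s)][x]
    (0, 0): {0: 0.9, 1: 0.1}, (0, 1): {0: 0.85, 1: 0.15},
    (1, 0): {0: 0.2, 1: 0.8}, (1, 1): {0: 0.1, 1: 0.9},
}
# Entries from the slide's CPD table; the D=0 / D=1 column order is assumed.
P_D_GIVEN_CB = {                                             # [(c, b)][d]
    (0, 0): {0: 0.1, 1: 0.9}, (0, 1): {0: 0.7, 1: 0.3},
    (1, 0): {0: 0.8, 1: 0.2}, (1, 1): {0: 0.9, 1: 0.1},
}


def joint(s: int, c: int, b: int, x: int, d: int) -> float:
    """P(S, C, B, X, D) as the product of the five CPDs."""
    return (P_S[s] * P_C_GIVEN_S[s][c] * P_B_GIVEN_S[s][b]
            * P_X_GIVEN_CS[(c, s)][x] * P_D_GIVEN_CB[(c, b)][d])


hidden = list(product((0, 1), repeat=4))  # all (s, c, b, x) configurations

# Sum query: P(D=0) by summing the joint over S, C, B, X.
p_d0 = sum(joint(s, c, b, x, 0) for s, c, b, x in hidden)
p_d1 = sum(joint(s, c, b, x, 1) for s, c, b, x in hidden)

# Max query: the best configuration of S, C, B, X with D fixed to 0.
map_config = max(hidden, key=lambda t: joint(*t, 0))
map_value = joint(*map_config, 0)
print(p_d0, map_config, map_value)
```

Enumeration touches all 2^4 hidden configurations; algorithms such as bucket elimination aim to avoid that by exploiting the factorization.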

  18. Alarm network [Beinlich et al., 1989]
      • Bayes nets: compact representation of large joint distributions
      • The “alarm” network: 37 variables, 509 parameters (rather than 2^37 ≈ 10^11!)
      [Figure: network over the nodes MINVOLSET, KINKEDTUBE, PULMEMBOLUS, INTUBATION, VENTMACH, DISCONNECT, PAP, SHUNT, VENTLUNG, VENITUBE, PRESS, MINOVL, FIO2, VENTALV, PVSAT, ANAPHYLAXIS, ARTCO2, EXPCO2, SAO2, TPR, INSUFFANESTH, HYPOVOLEMIA, LVFAILURE, CATECHOL, LVEDVOLUME, STROEVOLUME, HR, ERRCAUTER, HISTORY, ERRBLOWOUTPUT, CO, CVP, PCWP, HREKG, HRSAT, HRBP, BP]
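
The compactness claim can be checked with simple counting. The sketch below computes the size of a full joint table over binary variables and, for comparison, the number of free CPT parameters in a factored network. Because the alarm network's parent sets are not listed here, the factored count is shown on the five-variable smoking network from the Bayesian Networks slide instead. The function names are hypothetical, and treating all variables as binary is an assumption.

```python
from typing import Dict, List


def full_joint_entries(num_binary_vars: int) -> int:
    """Number of entries in an explicit joint table over binary variables."""
    return 2 ** num_binary_vars


def free_parameters(parents: Dict[str, List[str]]) -> int:
    """Free parameters of a binary Bayesian network: one per parent configuration."""
    return sum(2 ** len(pa) for pa in parents.values())


# Structure of the smoking network: S -> C, S -> B, (C, S) -> X, (C, B) -> D.
smoking_net = {"S": [], "C": ["S"], "B": ["S"], "X": ["C", "S"], "D": ["C", "B"]}

print(full_joint_entries(37))        # size of a joint table over 37 binary variables
print(free_parameters(smoking_net))  # factored representation of the 5-variable net
print(full_joint_entries(5))         # explicit joint over the same 5 variables
```

Even at five variables the factored form needs fewer numbers than the explicit table, and the gap grows exponentially with the number of variables.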

  19. Dechter, Flairs-2018

  20. Probabilistic reasoning (directed)
      Party example: the weather effect
      • Alex is-likely-to-go in bad weather: P(A|W=bad) = .9
      • Chris rarely-goes in bad weather: P(C|W=bad) = .1
      • Becky is indifferent but unpredictable: P(B|W=bad) = .5
      Questions:
      • Given bad weather, which group of individuals is most likely to show up at the party?
      • What is the probability that Chris goes to the party but Becky does not?
      CPT P(A|W), rows given as (W, A): P(A|W): (good, 0): .01; (good, 1): .99; (bad, 0): .1; (bad, 1): .9
      P(W,A,C,B) = P(B|W) · P(C|W) · P(A|W) · P(W)
      P(A,C,B|W=bad) = 0.9 · 0.1 · 0.5
      [Figure: network with W as the parent of A, B, and C, annotated with P(W), P(A|W), P(B|W), P(C|W)]
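
Both questions can be answered by direct enumeration once the weather is fixed to bad, since A, C, and B are then independent. The sketch below uses only the three bad-weather probabilities from the slide. It answers the second question conditionally on bad weather, because P(W) and the good-weather entries for C and B are not given. The names `P_GOES_BAD` and `prob_given_bad` are hypothetical.

```python
from itertools import product

# P(person goes | W = bad), as stated on the slide.
P_GOES_BAD = {"A": 0.9, "C": 0.1, "B": 0.5}


def prob_given_bad(a: int, c: int, b: int) -> float:
    """P(A=a, C=c, B=b | W=bad) as a product of the three conditionals."""
    p = 1.0
    for name, goes in (("A", a), ("C", c), ("B", b)):
        p *= P_GOES_BAD[name] if goes else 1.0 - P_GOES_BAD[name]
    return p


# Everyone attends: 0.9 * 0.1 * 0.5.
p_all_go = prob_given_bad(1, 1, 1)

# Chris goes but Becky does not (Alex summed out), given bad weather.
p_chris_not_becky = sum(prob_given_bad(a, 1, 0) for a in (0, 1))

# Most likely group(s) given bad weather; ties are all reported.
table = {acb: prob_given_bad(*acb) for acb in product((0, 1), repeat=3)}
best = max(table.values())
most_likely = sorted(acb for acb, p in table.items() if abs(p - best) < 1e-12)
print(p_all_go, p_chris_not_becky, most_likely)
```

Tuples are ordered (A, C, B). Because Becky's probability is .5, the two groups "Alex alone" and "Alex and Becky" tie for most likely.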

  21. Mixed Probabilistic and Deterministic networks
      • Alex is-likely-to-go in bad weather
      • Chris rarely-goes in bad weather
      • Becky is indifferent but unpredictable
      PN (probabilistic network): W is the parent of A, B, and C, with P(W), P(A|W), P(B|W), P(C|W)
      CN (constraint network): A → B, C → A
      Query: Is it likely that Chris goes to the party if Becky does not but the weather is bad?
      P(C, ¬B | w = bad, A → B, C → A)
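
One way to read the mixed query is to condition the bad-weather distribution on the two constraints holding, that is, to renormalize over the assignments that satisfy A → B and C → A. The sketch below takes that reading as an assumption and reuses the bad-weather probabilities .9, .1, and .5 from the party example. The helper names `weight` and `satisfies` are hypothetical.

```python
from itertools import product

# P(person goes | W = bad), taken from the party example.
P_GOES_BAD = {"A": 0.9, "B": 0.5, "C": 0.1}


def weight(a: bool, b: bool, c: bool) -> float:
    """P(A=a, B=b, C=c | W=bad) before the constraints are applied."""
    p = 1.0
    for name, goes in (("A", a), ("B", b), ("C", c)):
        p *= P_GOES_BAD[name] if goes else 1.0 - P_GOES_BAD[name]
    return p


def satisfies(a: bool, b: bool, c: bool) -> bool:
    """Constraint network: A -> B and C -> A."""
    return ((not a) or b) and ((not c) or a)


worlds = list(product((False, True), repeat=3))

# Probability mass of the assignments consistent with the constraints.
z = sum(weight(*w) for w in worlds if satisfies(*w))

# Mass of consistent assignments in which Chris goes and Becky does not.
numerator = sum(
    weight(a, b, c) for a, b, c in worlds if satisfies(a, b, c) and c and not b
)
answer = numerator / z
print(z, answer)
```

Under this reading the answer is zero: no assignment with C true and B false satisfies both rules, which matches the unsatisfiability of the purely propositional version of the party problem.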

  22. Graphical models (cost networks)
      A graphical model consists of:
      -- variables (we’ll assume discrete)
      -- domains
      -- functions or “factors”
      and a combination operator.
      The combination operator defines an overall function from the individual factors, e.g., “+”.
      Notation:
      • Discrete Xi: values are called states
      • Tuple or configuration: states taken by a set of variables
      • Scope of f: set of variables that are arguments to a factor f; factors are often indexed by their scope
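
The definition above maps naturally onto a small data structure: each factor is a scope plus a function, and the combination operator turns the factors into one global function. The sketch below uses “+” as the combination operator. The variables, domains, and cost values are hypothetical placeholders, since the slide's own example did not survive extraction.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# A factor is (scope, function); the function takes one argument per scope variable.
Factor = Tuple[Tuple[str, ...], Callable[..., float]]

# Hypothetical variables, domains, and cost factors for illustration only.
DOMAINS: Dict[str, List[int]] = {"A": [0, 1], "B": [0, 1], "C": [0, 1]}
FACTORS: List[Factor] = [
    (("A", "B"), lambda a, b: 1.0 if a != b else 4.0),
    (("B", "C"), lambda b, c: 2.0 if b == c else 0.5),
]


def total_cost(config: Dict[str, int]) -> float:
    """Combine the factors with '+': evaluate each on its scope and add."""
    return sum(f(*(config[v] for v in scope)) for scope, f in FACTORS)


# Enumerate every configuration (one state per variable).
names = list(DOMAINS)
configs = [
    dict(zip(names, states))
    for states in product(*(DOMAINS[n] for n in names))
]
best = min(configs, key=total_cost)
print(best, total_cost(best))
```

Swapping `sum` for a product would give the multiplicative combination used for Bayesian networks, with the same scopes and the same enumeration loop.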
