
Bayesian Networks Inference with Probabilistic Graphical Models



  1. Bayesian Networks Inference with Probabilistic Graphical Models
      Byoung-Tak Zhang, Biointelligence Lab, Seoul National University
      4190.408 Artificial Intelligence (2016-Spring)

  2. Machine Learning?
      • Learning system: a system that autonomously improves its performance (P) by automatically forming a model (M) from experiential data (D) obtained through interaction with its environment (E)
      • Self-improving systems (AI perspective)
      • Knowledge discovery (data-mining perspective)
      • Data-driven software design (software-engineering perspective)
      • Automatic programming (computer-engineering perspective)

  3. Machine Learning as Automatic Programming
      • Traditional programming: Data + Program → Computer → Output
      • Machine learning: Data + Output → Computer → Program

  4. Machine Learning (ML): Three Tasks
      • Supervised learning
        – Estimate an unknown mapping from known input-target pairs
        – Learn $f_w$ from a training set $D = \{(x, y)\}$ such that $f_w(x) \approx y = f(x)$
        – Classification: $y$ is discrete; regression: $y$ is continuous
      • Unsupervised learning
        – Only input values are provided
        – Learn $f_w$ from $D = \{(x)\}$ such that $f_w(x) \approx x$
        – Density estimation and compression; clustering and dimension reduction
      • Sequential (reinforcement) learning
        – No targets; rewards (critiques) are provided "sequentially"
        – Learn a heuristic function $f_w(s_t, a_t, r_t)$ from $D_t = \{(s_t, a_t, r_t) \mid t = 1, 2, \ldots\}$
        – With respect to the future, not just the past
        – Sequential decision-making; action selection and policy learning
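To ground the supervised case, here is a minimal regression sketch (my own illustration, not from the slides): fit a linear $f_w$ by least squares so that $f_w(x) \approx y$. The data-generating coefficients (3.0 and 0.5) are arbitrary assumptions.

```python
# Minimal supervised-learning sketch: fit f_w(x) = w1*x + w0 to noisy targets.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)              # inputs
y = 3.0 * x + 0.5 + rng.normal(0, 0.1, 100)   # targets from an assumed mapping

# Least-squares fit (regression: y is continuous).
X = np.stack([x, np.ones_like(x)], axis=1)
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w)  # approximately [3.0, 0.5]
```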

  5. Machine Learning Models
      • Supervised learning: Neural Nets, Decision Trees, K-Nearest Neighbors, Support Vector Machines
      • Unsupervised learning: Self-Organizing Maps, Clustering Algorithms, Manifold Learning, Evolutionary Learning
      • Probabilistic graphical models: Bayesian Networks, Markov Networks, Hidden Markov Models, Hypernetworks
      • Dynamic systems: Kalman Filters, Sequential Monte Carlo / Particle Filters, Reinforcement Learning

  6. Outline
      • Bayesian inference: Monte Carlo, importance sampling, MCMC
      • Probabilistic graphical models: Bayesian networks, Markov random fields
      • Hypernetworks: architecture and algorithms, application examples
      • Discussion

  7. Bayes Theorem
      $P(h \mid D) = \dfrac{P(D \mid h)\, P(h)}{P(D)}$: the posterior probability of hypothesis $h$ given data $D$ equals the likelihood times the prior, normalized by the evidence $P(D)$.
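As a hedged numeric illustration of the theorem (not from the slides), consider a rare-disease test; all numbers below are assumed for the example.

```python
# Bayes' theorem on an assumed rare-disease test:
# prior P(h) = 0.01, sensitivity P(D|h) = 0.95, false-positive rate 0.05.
p_h = 0.01
p_d_given_h = 0.95
p_d_given_not_h = 0.05

p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)  # evidence P(D)
posterior = p_d_given_h * p_h / p_d                    # P(h|D)
print(posterior)  # ~0.161: even after a positive test, the disease is unlikely
```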

  8. MAP vs. ML
      • What is the most probable hypothesis given the data?
      • From Bayes theorem: $P(h \mid D) \propto P(D \mid h)\, P(h)$
      • MAP (maximum a posteriori): $h_{MAP} = \arg\max_h P(D \mid h)\, P(h)$
      • ML (maximum likelihood): $h_{ML} = \arg\max_h P(D \mid h)$
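A small sketch contrasting the two estimates on an assumed coin-flip example (the counts and the Beta(2,2) prior are illustrative choices, not from the slides):

```python
# ML vs. MAP estimate of a coin's head probability: 7 heads in 10 flips,
# with a Beta(2,2) prior for the MAP case.
heads, flips = 7, 10
a, b = 2, 2  # Beta prior pseudo-counts

ml = heads / flips                             # argmax_t P(D|t)
map_ = (heads + a - 1) / (flips + a + b - 2)   # argmax_t P(D|t) P(t)
print(ml, map_)  # 0.7 vs 0.667: the prior pulls the MAP estimate toward 0.5
```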

  9. Bayesian Inference

  10. Prof. Schrater’s Lecture Notes (Univ. of Minnesota)

  11. (figure slide; no recoverable text)

  12. Monte Carlo (MC) Approximation
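The slide's own derivation is not recoverable, so here is a generic Monte Carlo sketch: approximate $E[f(X)] = \int f(x)\,p(x)\,dx$ by the empirical average $\frac{1}{N}\sum_i f(x^{(i)})$ over samples $x^{(i)} \sim p$. The choice of $f(x) = x^2$ and $p = N(0,1)$ is an assumption for the example.

```python
# Monte Carlo approximation of E[X^2] for X ~ N(0,1); the true value is 1.
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, size=100_000)  # draw x^(i) ~ p(x)
estimate = np.mean(samples**2)                # (1/N) * sum_i f(x^(i))
print(estimate)  # close to 1.0; the error shrinks like 1/sqrt(N)
```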

  13. Markov Chain Monte Carlo (MCMC)
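A minimal Metropolis-Hastings sketch (a standard example assumed here, not taken from the slides). The key property of MCMC is that only the unnormalized target density is needed:

```python
# Metropolis-Hastings with a symmetric random-walk proposal, targeting
# p(x) proportional to exp(-x^2/2), i.e. N(0,1) up to normalization.
import numpy as np

def log_target(x):
    """Log of the unnormalized target density."""
    return -0.5 * x * x

rng = np.random.default_rng(0)
x, chain = 0.0, []
for _ in range(50_000):
    prop = x + rng.normal(0.0, 1.0)  # propose a move
    # Metropolis rule: accept with probability min(1, p(prop) / p(x)).
    if np.log(rng.uniform()) < log_target(prop) - log_target(x):
        x = prop
    chain.append(x)

burned = chain[5_000:]  # discard burn-in before summarizing
print(np.mean(burned), np.var(burned))  # ~0 and ~1 for the N(0,1) target
```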

  14. MC with Importance Sampling
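A sketch of importance sampling under assumed target and proposal distributions: draw from an easy proposal $q$ and reweight each sample by $w = p(x)/q(x)$, so that $E_p[f(X)] \approx \frac{1}{N}\sum_i w(x^{(i)}) f(x^{(i)})$.

```python
# Importance sampling: estimate E_p[X^2] for p = N(0,1) using samples
# from the proposal q = N(1,2), which is easy to sample and covers p.
import numpy as np

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
xs = rng.normal(1.0, 2.0, size=100_000)                 # x^(i) ~ q
w = gauss_pdf(xs, 0.0, 1.0) / gauss_pdf(xs, 1.0, 2.0)   # weights p/q
print(np.mean(w * xs**2))                               # E_p[X^2] ~ 1
```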

  15. Graphical Models
      • Graphical models (GM)
        – Directed GMs: Bayesian networks (DBNs, mixture networks, decision models, simple trees, HMMs; variants such as factorial HMMs, BMMs, segment models, Kalman models, mixed-memory Markov models, PCA, LDA)
        – Undirected GMs: Markov random fields / Markov networks (Gibbs/Boltzmann distributions)
        – Other semantics: causal models, chain graphs, dependency networks

  16. BAYESIAN NETWORKS

  17. Bayesian Networks
      • A Bayesian network is a DAG (directed acyclic graph)
      • Expresses dependence relations between variables
      • Can incorporate prior knowledge about the data (parameters)
      • Factorization: $P(X) = \prod_{i=1}^{n} P(X_i \mid pa_i)$, where $pa_i$ denotes the parents of $X_i$
      • Example over nodes A, B, C, D, E: $P(A,B,C,D,E) = P(A)\,P(B \mid A)\,P(C \mid B)\,P(D \mid A,B)\,P(E \mid B,C,D)$
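To show the factorization at work, here is a sketch with made-up CPT numbers; the slide fixes only the graph structure, so every probability below is an assumption:

```python
# P(A,B,C,D,E) = P(A) P(B|A) P(C|B) P(D|A,B) P(E|B,C,D) with binary variables.
from itertools import product

p_a = 0.3                                    # P(A=1)
p_b = {1: 0.9, 0: 0.2}                       # P(B=1 | A=a)
p_c = {1: 0.8, 0: 0.1}                       # P(C=1 | B=b)
p_d = {(1, 1): 0.95, (1, 0): 0.5,
       (0, 1): 0.4, (0, 0): 0.05}            # P(D=1 | A=a, B=b)
p_e = {(b, c, d): 0.6 for b in (0, 1)
       for c in (0, 1) for d in (0, 1)}      # P(E=1 | B,C,D), flat here

def bern(p, v):
    """P(V=v) for a binary variable with P(V=1) = p."""
    return p if v == 1 else 1 - p

def joint(a, b, c, d, e):
    return (bern(p_a, a) * bern(p_b[a], b) * bern(p_c[b], c)
            * bern(p_d[(a, b)], d) * bern(p_e[(b, c, d)], e))

# Sanity check: the 32 joint entries sum to 1.
print(sum(joint(*v) for v in product((0, 1), repeat=5)))  # 1.0
```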

  18. Representing Probability Distributions
      • A probability distribution assigns a probability to each combination of attribute values
      • Example: hospital patients described by
        – Background: age, gender, history of diseases, …
        – Symptoms: fever, blood pressure, headache, …
        – Diseases: pneumonia, heart attack, …
      • Naïve representations (such as tables) run into trouble
        – 20 binary attributes already require more than $2^{20} \approx 10^6$ parameters
        – Real applications usually involve hundreds of attributes
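The parameter-count contrast can be checked directly; the "at most k parents" bound below is an assumption introduced for the illustration, not from the slide:

```python
# Full joint table over n binary attributes vs. a factored Bayesian network
# in which each node has at most k parents (so each binary CPT has 2^k rows).
n, k = 20, 3
full_table = 2**n - 1       # free parameters of the full joint: > 10^6
factored = n * (2**k)       # roughly n * 2^k parameters for the network
print(full_table, factored)  # 1048575 vs 160
```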

  19. Bayesian Networks - Key Idea
      • Exploit regularities!
      • Utilize conditional independence
      • Graphical representation of conditional independence and of "causal" dependencies

  20. Bayesian Networks
      1. A finite, directed acyclic graph
      2. Nodes: (discrete) random variables
      3. Edges: direct influences
      4. Associated with each node: a table representing a conditional probability distribution (CPD), quantifying the effect the parents have on the node
      (Illustrated with a five-node example: E and B are parents of A, which is a parent of J and M.)

  21. Bayesian Networks (example)
      • Priors: $P(X_1) = (0.2, 0.8)$, $P(X_2) = (0.6, 0.4)$
      • CPD for $X_3$ given $(X_1, X_2)$:
        – $X_1$ = true,  $X_2$ = 1: (0.2, 0.8)
        – $X_1$ = true,  $X_2$ = 2: (0.5, 0.5)
        – $X_1$ = false, $X_2$ = 1: (0.23, 0.77)
        – $X_1$ = false, $X_2$ = 2: (0.53, 0.47)
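Using exactly these numbers, the marginal $P(X_3) = \sum_{x_1, x_2} P(x_1)\,P(x_2)\,P(X_3 \mid x_1, x_2)$ falls out of the factorization by summing over $X_1$ and $X_2$:

```python
# Marginalize the three-node network of this slide to obtain P(X3).
P_X1 = {True: 0.2, False: 0.8}
P_X2 = {1: 0.6, 2: 0.4}
P_X3 = {(True, 1): (0.2, 0.8), (True, 2): (0.5, 0.5),
        (False, 1): (0.23, 0.77), (False, 2): (0.53, 0.47)}

marginal = [0.0, 0.0]
for x1, p1 in P_X1.items():
    for x2, p2 in P_X2.items():
        for i, p3 in enumerate(P_X3[(x1, x2)]):
            marginal[i] += p1 * p2 * p3
print(marginal)  # [0.344, 0.656]: a distribution over X3's two values
```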

  22. Example: Use a DAG to Model Causality
      Nodes: Train Strike, Martin Oversleep, Norman Oversleep, Martin Late, Norman Late, Norman Untidy, Boss Failure-in-Love, Project Delay, Office Dirty, Boss Angry. The arrows run from root causes (the strike, oversleeping, failure in love) through intermediate effects (lateness, untidiness, project delay, dirty office) to the final effect, Boss Angry.

  23. Example: Attach Prior Probabilities to All Root Nodes
      • Train Strike:          P(T) = 0.1,  P(F) = 0.9
      • Norman Oversleep:      P(T) = 0.2,  P(F) = 0.8
      • Martin Oversleep:      P(T) = 0.01, P(F) = 0.99
      • Boss Failure-in-Love:  P(T) = 0.01, P(F) = 0.99

  24. Example: Attach Conditional Probabilities to Non-Root Nodes (each column sums to 1)
      The recoverable tables are:
      • Norman Untidy given Norman Oversleep:
        – Oversleep = T: P(untidy) = 0.6, P(not untidy) = 0.4
        – Oversleep = F: P(untidy) = 0.2, P(not untidy) = 0.8
      • Martin Late given Train Strike and Martin Oversleep:
        – Strike = T, Oversleep = T: P(late) = 0.95, P(not late) = 0.05
        – Strike = T, Oversleep = F: P(late) = 0.8,  P(not late) = 0.2
        – Strike = F, Oversleep = T: P(late) = 0.7,  P(not late) = 0.3
        – Strike = F, Oversleep = F: P(late) = 0.05, P(not late) = 0.95

  25. Example: Attach Conditional Probabilities to Non-Root Nodes (each column sums to 1)
      Boss Angry given Boss Failure-in-Love, Project Delay, and Office Dirty; columns ordered (Failure-in-Love, Delay, Dirty) = (T,T,T), (T,T,F), (T,F,T), (T,F,F), (F,T,T), (F,T,F), (F,F,T), (F,F,F):
      – very:   0.98  0.85  0.6   0.5   0.3   0.2   0     0.01
      – mid:    0.02  0.15  0.3   0.25  0.5   0.5   0.2   0.02
      – little: 0     0     0.1   0.25  0.2   0.3   0.7   0.07
      – no:     0     0     0     0     0     0     0.1   0.9

  26. Inference
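The slide's inference content did not survive extraction, so as a stand-in here is inference by enumeration on the small network of slide 21: compute $P(X_1 \mid X_3 = v) \propto \sum_{x_2} P(x_1)\,P(x_2)\,P(X_3 = v \mid x_1, x_2)$.

```python
# Inference by enumeration, reusing the CPTs of the X1-X2-X3 example.
P_X1 = {True: 0.2, False: 0.8}
P_X2 = {1: 0.6, 2: 0.4}
P_X3 = {(True, 1): (0.2, 0.8), (True, 2): (0.5, 0.5),
        (False, 1): (0.23, 0.77), (False, 2): (0.53, 0.47)}

def posterior_x1(v):
    """P(X1 | X3 = v): sum out X2, then normalize by the evidence."""
    scores = {x1: sum(p1 * p2 * P_X3[(x1, x2)][v] for x2, p2 in P_X2.items())
              for x1, p1 in P_X1.items()}
    z = sum(scores.values())  # evidence P(X3 = v)
    return {x1: s / z for x1, s in scores.items()}

print(posterior_x1(0))  # posterior over X1 after observing X3's first value
```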

  27. MARKOV RANDOM FIELDS (MARKOV NETWORKS)

  28. Graphical Models
      • Directed graph (e.g., Bayesian network)
      • Undirected graph (e.g., Markov random field)

  29. Bayesian Image Analysis
      The original image passes through a noisy transmission/degradation process to give the degraded (observed) image. Bayes' theorem inverts this:
      $\Pr(\text{Original} \mid \text{Degraded}) = \dfrac{\Pr(\text{Degraded} \mid \text{Original}) \, \Pr(\text{Original})}{\Pr(\text{Degraded})}$
      with the a posteriori probability on the left, the degradation process (likelihood) and the a priori probability in the numerator, and the marginal likelihood in the denominator.

  30. Image Analysis
      • We can represent both the observed image (X) and the true image (Y) as Markov random fields
        – X: observed image
        – Y: true image
      • Then invoke the Bayesian framework to find P(Y|X)

  31. Details
      • Remember $P(Y \mid X) = \dfrac{P(X \mid Y)\,P(Y)}{P(X)} \propto P(X \mid Y)\,P(Y)$
      • $P(X \mid Y)$ is the data model
      • $P(Y)$ models the label interaction
      • Next we need to compute the prior $P(Y = y)$ and the likelihood $P(X \mid Y)$
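A minimal sketch of this pipeline under stated assumptions: an Ising-style prior $P(Y)$ over $\pm 1$ labels, a Gaussian data model $P(X \mid Y)$, and greedy iterated conditional modes (ICM) in place of full posterior sampling. The parameters beta and sigma, and the toy image, are illustrative choices, not from the slides.

```python
# MAP-style denoising: maximize P(Y|X) ~ P(X|Y) P(Y) one pixel at a time.
import numpy as np

def icm_denoise(x, beta=2.0, sigma=0.5, sweeps=5):
    h, w = x.shape
    y = np.where(x > 0, 1, -1)  # initialize labels from the observation
    for _ in range(sweeps):
        for i in range(h):
            for j in range(w):
                # Sum of 4-neighbor labels: the Ising interaction term.
                nb = sum(y[a, b]
                         for a, b in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                         if 0 <= a < h and 0 <= b < w)
                # Pick the label minimizing data energy minus prior agreement.
                y[i, j] = min((-1, 1), key=lambda s:
                              (x[i, j] - s)**2 / (2 * sigma**2) - beta * s * nb)
    return y

rng = np.random.default_rng(0)
truth = np.ones((16, 16)); truth[:, :8] = -1     # two-region "true image" Y
noisy = truth + rng.normal(0, 0.8, truth.shape)  # degraded observation X
print(np.mean(icm_denoise(noisy) == truth))      # fraction of pixels recovered
```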
