Bayesian Networks – Representation
Machine Learning – 10-701/15-781
Carlos Guestrin
Carnegie Mellon University
March 16th, 2005
Handwriting recognition: character recognition, e.g., with kernel SVMs (figure: example handwritten characters)
Webpage classification: company home page vs. personal home page vs. university home page vs. …
Handwriting recognition (cont.)
Webpage classification (cont.)
Today – Bayesian networks
• One of the most exciting advancements in statistical AI in the last 10–15 years
• Generalizes naïve Bayes and logistic regression classifiers
• Compact representation for exponentially-large probability distributions
• Exploits conditional independencies
Causal structure
• Suppose we know the following:
  • The flu causes sinus inflammation
  • Allergies cause sinus inflammation
  • Sinus inflammation causes a runny nose
  • Sinus inflammation causes headaches
• How are these connected?
Possible queries
• Inference
• Most probable explanation
• Active data collection
(figure: BN with Flu, Allergy → Sinus → Nose, Headache)
Car starts BN
• 18 binary attributes
• Inference: P(BatteryAge | Starts = f)
  • The full joint has 2^18 terms – why is inference so fast?
• Not impressed?
  • HailFinder BN – more than 3^54 = 58149737003040059690390169 terms
Factored joint distribution – Preview
(figure: BN with Flu, Allergy → Sinus → Nose, Headache)
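The factorization the slide previews can be written out explicitly; the following reconstructs the standard factored form for this five-variable network (the decomposition itself is justified a few slides ahead):

```latex
% Factored joint for the Flu/Allergy/Sinus/Nose/Headache network
\[
P(F, A, S, N, H) \;=\; P(F)\, P(A)\, P(S \mid F, A)\, P(N \mid S)\, P(H \mid S)
\]
```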
Number of parameters
(figure: BN with Flu, Allergy → Sinus → Nose, Headache)
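A worked count for this network, assuming all five variables are binary (as in the lecture's example): the full joint table needs 2^5 − 1 free parameters, while the factored form needs far fewer:

```latex
% Assuming all five variables are binary:
% full joint table: 2^5 - 1 = 31 free parameters
% factored form:    P(F): 1,  P(A): 1,  P(S|F,A): 4,  P(N|S): 2,  P(H|S): 2
\[
1 + 1 + 4 + 2 + 2 \;=\; 10
\qquad\text{vs.}\qquad
2^5 - 1 \;=\; 31
\]
```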
Key: Independence assumptions
Knowing Sinus separates the other variables from each other
(figure: BN with Flu, Allergy → Sinus → Nose, Headache)
(Marginal) independence
• Flu and Allergy are (marginally) independent
• More generally: (see the definition below)
(tables on slide: probabilities over Flu = t/f and Allergy = t/f)
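The definition the slide appeals to, written out in equation form (the standard definition, not specific to this deck):

```latex
% Marginal independence of X and Y
\[
X \perp Y
\quad\Longleftrightarrow\quad
P(X = x,\, Y = y) \;=\; P(X = x)\, P(Y = y) \quad \text{for all } x, y
\]
% here: P(\mathrm{Flu}, \mathrm{Allergy}) = P(\mathrm{Flu})\, P(\mathrm{Allergy})
```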
Conditional independence
• Flu and Headache are not (marginally) independent
• Flu and Headache are independent given Sinus infection
• More generally: (see the definition below)
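The standard definition of conditional independence, stated here for reference, together with the equivalent "knowing Y adds nothing once Z is known" reading:

```latex
% Conditional independence of X and Y given Z (two equivalent statements)
\[
(X \perp Y \mid Z)
\;\Longleftrightarrow\;
P(X, Y \mid Z) = P(X \mid Z)\, P(Y \mid Z)
\;\Longleftrightarrow\;
P(X \mid Y, Z) = P(X \mid Z)
\]
% here: P(\mathrm{Flu} \mid \mathrm{Headache}, \mathrm{Sinus}) = P(\mathrm{Flu} \mid \mathrm{Sinus})
```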
The independence assumption
Local Markov Assumption: a variable X is independent of its non-descendants given its parents
(figure: BN with Flu, Allergy → Sinus → Nose, Headache)
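The same assumption stated symbolically (a standard formulation of the statement on the slide):

```latex
% Local Markov assumption, for every variable X_i in the graph
\[
X_i \;\perp\; \mathrm{NonDescendants}(X_i) \;\mid\; \mathrm{Pa}_{X_i}
\]
% e.g., Nose is independent of {Flu, Allergy, Headache} given its parent Sinus
```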
Explaining away
Local Markov Assumption (recap): a variable X is independent of its non-descendants given its parents
• Flu and Allergy are marginally independent, but once Sinus is observed they become dependent: if a sinus infection is explained by the flu, an allergy becomes less likely, and vice versa
(figure: BN with Flu, Allergy → Sinus → Nose, Headache)
Naïve Bayes revisited
Local Markov Assumption: a variable X is independent of its non-descendants given its parents
• Naïve Bayes is the BN in which the class variable is the only parent of every feature, so the features are conditionally independent given the class
What about probabilities? Conditional probability tables (CPTs)
(figure: BN over Flu, Allergy, Sinus, Nose, Headache with CPTs)
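As a concrete illustration, a minimal Python sketch of one CPT, P(Sinus | Flu, Allergy); the numbers are made up for illustration and are not from the lecture:

```python
# Hypothetical CPT for P(Sinus = True | Flu, Allergy); numbers are illustrative only.
# Keys are (flu, allergy) parent assignments; values are P(Sinus = True | parents).
cpt_sinus = {
    (True,  True):  0.90,
    (True,  False): 0.80,
    (False, True):  0.70,
    (False, False): 0.05,
}

def p_sinus(sinus: bool, flu: bool, allergy: bool) -> float:
    """Look up P(Sinus = sinus | Flu = flu, Allergy = allergy)."""
    p_true = cpt_sinus[(flu, allergy)]
    return p_true if sinus else 1.0 - p_true

print(p_sinus(True, flu=True, allergy=False))    # 0.8
print(p_sinus(False, flu=False, allergy=False))  # 0.95
```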
Joint distribution
Why can we decompose? Markov assumption!
(figure: BN over Flu, Allergy, Sinus, Nose, Headache with the factored joint)
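A short derivation of the decomposition, reconstructing the standard argument: apply the chain rule under the ordering F, A, S, H, N, then drop non-descendants using the local Markov assumption:

```latex
\begin{align*}
P(F, A, S, H, N)
  &= P(F)\, P(A \mid F)\, P(S \mid F, A)\, P(H \mid F, A, S)\, P(N \mid F, A, S, H)
     && \text{chain rule}\\
  &= P(F)\, P(A)\, P(S \mid F, A)\, P(H \mid S)\, P(N \mid S)
     && \text{local Markov assumption}
\end{align*}
% used: A \perp F;\quad H \perp \{F, A\} \mid S;\quad N \perp \{F, A, H\} \mid S
```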
Real Bayesian network applications
• Diagnosis of lymph node disease
• Speech recognition
• Microsoft Office and Windows (http://www.research.microsoft.com/research/dtg/)
• Studying the human genome
• Robot mapping
• Robots that identify meteorites to study
• Modeling fMRI data
• Anomaly detection
• Fault diagnosis
• Modeling sensor network data
A general Bayes net
• Set of random variables
• Directed acyclic graph
  • Encodes independence assumptions
• CPTs
• Joint distribution: P(X1, …, Xn) = ∏i P(Xi | PaXi) (see the worked example below)
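A minimal Python sketch of this factorization for the five-variable flu network; the graph structure follows the lecture's example, but every probability below is invented purely for illustration:

```python
# Joint probability via the BN factorization
# P(F, A, S, N, H) = P(F) P(A) P(S|F,A) P(N|S) P(H|S).
# Structure follows the lecture's example; all CPT numbers are made up.
from itertools import product

p_flu = 0.10                      # P(Flu = True)
p_allergy = 0.20                  # P(Allergy = True)
p_sinus = {                       # P(Sinus = True | Flu, Allergy)
    (True, True): 0.90, (True, False): 0.80,
    (False, True): 0.70, (False, False): 0.05,
}
p_nose = {True: 0.85, False: 0.10}      # P(Nose = True | Sinus)
p_headache = {True: 0.60, False: 0.05}  # P(Headache = True | Sinus)

def bernoulli(p_true: float, value: bool) -> float:
    return p_true if value else 1.0 - p_true

def joint(f: bool, a: bool, s: bool, n: bool, h: bool) -> float:
    """P(F=f, A=a, S=s, N=n, H=h) as a product of local CPT entries."""
    return (bernoulli(p_flu, f)
            * bernoulli(p_allergy, a)
            * bernoulli(p_sinus[(f, a)], s)
            * bernoulli(p_nose[s], n)
            * bernoulli(p_headache[s], h))

# Sanity check: the joint should sum to 1 over all 2^5 assignments.
total = sum(joint(*assignment) for assignment in product([True, False], repeat=5))
print(total)  # 1.0 (up to floating-point error)
```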
Another example
• Variables:
  • B – Burglar
  • E – Earthquake
  • A – Burglar alarm
  • N – Neighbor calls
  • R – Radio report
• Both burglars and earthquakes can set off the alarm
• If the alarm sounds, a neighbor may call
• An earthquake may be announced on the radio
Another example – Building the BN
• B – Burglar
• E – Earthquake
• A – Burglar alarm
• N – Neighbor calls
• R – Radio report
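Reading the causal statements on the previous slide as edges (B → A, E → A, A → N, E → R) gives the structure below; this reconstruction is sketched here because the slide's figure did not survive extraction:

```latex
% Parents implied by the causal statements:
% Pa(B) = {},  Pa(E) = {},  Pa(A) = {B, E},  Pa(N) = {A},  Pa(R) = {E}
\[
P(B, E, A, N, R) \;=\; P(B)\, P(E)\, P(A \mid B, E)\, P(N \mid A)\, P(R \mid E)
\]
```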
Defining a BN
• Given a set of variables and conditional independence assumptions
• Choose an ordering on variables, e.g., X1, …, Xn
• For i = 1 to n
  • Add Xi to the network
  • Define the parents of Xi, PaXi, in the graph as the minimal subset of {X1, …, Xi−1} such that the local Markov assumption holds – Xi independent of the rest of {X1, …, Xi−1} given the parents PaXi
  • Define/learn the CPT – P(Xi | PaXi)
(a sketch of this procedure follows below)
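A sketch of that loop in Python. The independence test `is_independent` is a hypothetical oracle (in practice it would come from domain knowledge or statistical tests), so this only illustrates the control flow of the procedure, not a complete algorithm:

```python
from itertools import combinations

def build_parent_sets(ordering, is_independent):
    """For each variable (in the given ordering), pick a minimal parent set among
    its predecessors such that X_i is independent of the remaining predecessors
    given those parents.

    is_independent(x, others, given) -> bool is a hypothetical oracle.
    """
    parents = {}
    for i, x in enumerate(ordering):
        predecessors = ordering[:i]
        # Try smaller candidate parent sets first, so the chosen set is minimal.
        for size in range(len(predecessors) + 1):
            found = None
            for candidate in combinations(predecessors, size):
                rest = [v for v in predecessors if v not in candidate]
                # An empty `rest` is trivially independent of x.
                if not rest or is_independent(x, rest, given=list(candidate)):
                    found = list(candidate)
                    break
            if found is not None:
                parents[x] = found
                break
    return parents
```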
How many parameters in a BN?
• Discrete variables X1, …, Xn
• Graph
  • Defines the parents of Xi, PaXi
• CPTs – P(Xi | PaXi) (see the count below)
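The count the slide is driving at (a standard result): each CPT contributes one free parameter per non-redundant value of Xi for every joint assignment of its parents, so

```latex
\[
\#\text{params} \;=\; \sum_{i=1}^{n} \bigl(|\mathrm{Val}(X_i)| - 1\bigr)
\prod_{X_j \in \mathrm{Pa}_{X_i}} |\mathrm{Val}(X_j)|
\]
% e.g., n binary variables with at most k parents each: at most n * 2^k parameters,
% versus 2^n - 1 for the full joint table
```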
Defining a BN (cont.)
• Given a set of variables and conditional independence assumptions
  • Caveat: we may not know the conditional independence assumptions, or even the variables
• Choose an ordering on variables, e.g., X1, …, Xn
  • Caveat: there are good orderings and bad ones – a bad ordering may need more parents per variable, so more parameters must be learned
• For i = 1 to n
  • Add Xi to the network
  • Define the parents of Xi, PaXi, in the graph as the minimal subset of {X1, …, Xi−1} such that the local Markov assumption holds – Xi independent of the rest of {X1, …, Xi−1} given the parents PaXi
  • Define/learn the CPT – P(Xi | PaXi)
    • How???
Learning the CPTs
For each discrete variable Xi, estimate its CPT P(Xi | PaXi) from the data x(1), …, x(m) (a maximum-likelihood counting sketch follows below)
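A minimal maximum-likelihood sketch, assuming fully observed data: each CPT entry is estimated by counting how often each (parent assignment, value) pair occurs. The variable names and the data format are hypothetical:

```python
from collections import Counter

def learn_cpt(data, child, parents):
    """Maximum-likelihood CPT estimate P(child | parents) from fully observed data.

    data: list of dicts mapping variable name -> observed value, e.g.
          [{"Flu": True, "Allergy": False, "Sinus": True}, ...]
    Returns {(parent_assignment, child_value): probability}.
    """
    joint_counts = Counter()
    parent_counts = Counter()
    for record in data:
        pa = tuple(record[p] for p in parents)
        joint_counts[(pa, record[child])] += 1
        parent_counts[pa] += 1

    return {
        (pa, value): count / parent_counts[pa]
        for (pa, value), count in joint_counts.items()
    }

# Tiny made-up example:
data = [
    {"Flu": True, "Allergy": False, "Sinus": True},
    {"Flu": True, "Allergy": False, "Sinus": True},
    {"Flu": False, "Allergy": False, "Sinus": False},
]
print(learn_cpt(data, child="Sinus", parents=["Flu", "Allergy"]))
# {((True, False), True): 1.0, ((False, False), False): 1.0}
```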
Learning Bayes nets: the problem splits into four settings – known vs. unknown structure, crossed with fully observable vs. missing data
Queries in Bayes nets
• Given a BN, find:
  • Probability of X given some evidence, P(X | e)
  • Most probable explanation, max over x1, …, xn of P(x1, …, xn | e)
  • Most informative query
• Learn more about these next class
What you need to know
• Bayesian networks
  • A compact representation for large probability distributions
  • Not an algorithm
• Semantics of a BN
  • Conditional independence assumptions
• Representation
  • Variables
  • Graph
  • CPTs
• Why BNs are useful
• Learning CPTs from fully observable data
• Play with the applet!!! ☺
Acknowledgements
• JavaBayes applet
  • http://www.pmr.poli.usp.br/ltd/Software/javabayes/Home/index.html