Artificial Intelligence: Methods and Applications Lecture 9: Probabilistic reasoning (Bayesian Networks) Juan Carlos Nieves Sánchez December 02, 2014
Outline • Motivation • Syntax • Semantics • Parameterized distributions Bayesian Networks 3
General Inference Procedure • Let X be the query variable, • Let E be the set of evidence variables, • Let e be the observed values for them, • Let Y be the remaining unobserved variables. Then the query P(X | e) can be evaluated as: P(X | e) = α P(X, e) = α Σ_y P(X, e, y), where the summation is over all possible values y (i.e., all possible combinations of values of the unobserved variables Y)
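The enumeration above can be sketched in a few lines of Python. This is a minimal illustration, assuming the full joint distribution is stored as a dict from assignments to probabilities; the network and numbers are the classic illustrative Burglary/Earthquake/Alarm values, not real data.

```python
from itertools import product

# Build a full joint over (Burglary, Earthquake, Alarm) for illustration.
p_b, p_e = 0.01, 0.02
p_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
joint = {}
for b, e, a in product([True, False], repeat=3):
    pa = p_a[(b, e)] if a else 1 - p_a[(b, e)]
    joint[(b, e, a)] = (p_b if b else 1 - p_b) * (p_e if e else 1 - p_e) * pa

def query(joint, x_index, e_index, e_value):
    """P(X | E=e): sum the joint over the unobserved variables, then normalize."""
    dist = {True: 0.0, False: 0.0}
    for assignment, p in joint.items():
        if assignment[e_index] == e_value:
            dist[assignment[x_index]] += p   # summing out the hidden variables
    total = sum(dist.values())               # this is the alpha normalization
    return {v: p / total for v, p in dist.items()}

posterior = query(joint, x_index=0, e_index=2, e_value=True)  # P(Burglary | Alarm=true)
print(posterior)
```

Note that the normalization constant α falls out automatically: we sum the unnormalized values and divide at the end.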
Some observations This approach to inference does not scale well. For a domain described by n variables, where d is the largest arity: 1. Worst-case time complexity O(d^n). 2. Space complexity O(d^n) to store the joint distribution. For these reasons, the full joint distribution in tabular form is not a practical tool for building reasoning systems. How do you avoid the exponential space and time complexity of inference based on full joint probability distributions? Let us take advantage of independence and Bayes' Rule
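A quick back-of-the-envelope comparison makes the O(d^n) blow-up concrete. The bound n · 2^k for a network of binary variables with at most k parents per node is a standard counting argument; the concrete n and k values below are illustrative assumptions.

```python
# Table size for the full joint over n binary variables: 2**n entries.
def table_size(n):
    return 2 ** n

# A Bayesian network whose nodes each have at most k (binary) parents
# needs at most n * 2**k CPT rows instead.
def bn_size(n, k):
    return n * 2 ** k

for n in (10, 20, 30):
    print(n, table_size(n), bn_size(n, 3))  # exponential vs. linear growth
```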
Independence
Independence How to think about conditional independence: if knowing D tells me everything about B, I don't gain anything by also knowing C. Conditional independence assertions can allow probabilistic systems to scale up; moreover, they are much more commonly available than absolute independence assertions.
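Conditional independence can be checked numerically from a joint distribution. The sketch below, with made-up numbers, builds a joint over (D, B, C) in which B and C are generated independently given D, and verifies the defining identity P(b, c, d) · P(d) = P(b, d) · P(c, d).

```python
from itertools import product

def marginal(joint, indices, values):
    """Sum of joint probabilities consistent with the given partial assignment."""
    return sum(p for a, p in joint.items()
               if all(a[i] == v for i, v in zip(indices, values)))

def is_cond_indep(joint, i, j, k, tol=1e-9):
    """True iff X_i is conditionally independent of X_j given X_k, checked via
    P(xi, xj, xk) * P(xk) == P(xi, xk) * P(xj, xk) for all value combinations."""
    for xi, xj, xk in product([True, False], repeat=3):
        lhs = marginal(joint, (i, j, k), (xi, xj, xk)) * marginal(joint, (k,), (xk,))
        rhs = marginal(joint, (i, k), (xi, xk)) * marginal(joint, (j, k), (xj, xk))
        if abs(lhs - rhs) > tol:
            return False
    return True

# Illustrative joint: P(d, b, c) = P(d) P(b|d) P(c|d), so B ⟂ C | D holds.
p_d = 0.3
p_b = {True: 0.8, False: 0.1}
p_c = {True: 0.7, False: 0.2}
joint = {}
for d, b, c in product([True, False], repeat=3):
    joint[(d, b, c)] = ((p_d if d else 1 - p_d)
                        * (p_b[d] if b else 1 - p_b[d])
                        * (p_c[d] if c else 1 - p_c[d]))

print(is_cond_indep(joint, i=1, j=2, k=0))  # B ⟂ C | D
```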
Bayesian Networks Independence and conditional independence relationships among variables can greatly reduce the number of probabilities that need to be specified in order to define the full joint distribution. • A Bayesian network is a data structure that can represent the dependencies among variables. • Bayesian networks can represent essentially any full joint probability distribution, and in many cases can do so very concisely. • Bayesian networks have been one of the most important contributions to the field of AI. • They provide a way to represent knowledge in an uncertain domain and a way to reason about this knowledge. • Many applications: medicine, factories, etc.
Bayesian Network A Bayesian network is made up of two parts: 1. A directed acyclic graph 2. A set of parameters (Figure: network with Burglary and Earthquake as parents of Alarm)
A directed acyclic graph (Figure: Burglary and Earthquake are parents of Alarm) • The nodes are random variables (which can be discrete or continuous). • Arrows connect pairs of nodes (X is a parent of Y if there is an arrow from node X to node Y). • Intuitively, an arrow from node X to node Y means X has a direct influence on Y (we can say X has a causal effect on Y). • Easy for a domain expert to determine these relationships.
A set of parameters (Figure: Burglary and Earthquake are parents of Alarm) • Each node X_i has a conditional probability distribution P(X_i | Parents(X_i)) that quantifies the effect of the parents on the node. • The set of parameters are the probabilities in these conditional probability distributions. • As we have discrete random variables, we have conditional probability tables (CPTs).
Observations on the set of parameters • The conditional probability distribution for Alarm stores the probability distribution for Alarm given the values of Burglary and Earthquake. • For a given combination of values of the parents (B and E in this example), the entries for P(A = true | B, E) and P(A = false | B, E) must add up to 1. • For instance, P(A = true | B = false, E = false) + P(A = false | B = false, E = false) = 1.
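A CPT can be stored row-by-row, keyed by the parent values, and the sum-to-1 constraint checked mechanically. The numbers below are the classic illustrative Alarm values from Russell & Norvig.

```python
# CPT for Alarm, keyed by parent values (Burglary, Earthquake).
# Each row is the conditional distribution P(A | B, E).
cpt_alarm = {
    (True, True):   {"true": 0.95,  "false": 0.05},
    (True, False):  {"true": 0.94,  "false": 0.06},
    (False, True):  {"true": 0.29,  "false": 0.71},
    (False, False): {"true": 0.001, "false": 0.999},
}

def validate_cpt(cpt, tol=1e-9):
    """Every conditional distribution (row) of a CPT must sum to 1."""
    return all(abs(sum(row.values()) - 1.0) < tol for row in cpt.values())

print(validate_cpt(cpt_alarm))  # True
```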
Independence and Bayesian Networks What does the absence or presence of arrows in a Bayesian network mean? (Figure: Weather; Cavity with children Catch and Toothache) • Weather is independent of the other variables. • Toothache and Catch are conditionally independent given Cavity (this is represented by the fact that there is no link between Toothache and Catch and by the fact that they have Cavity as a parent).
Semantics of Bayesian Networks Two ways to view Bayesian networks: 1. A representation of a joint probability distribution. 2. An encoding of a collection of conditional independence statements.
Representation of the Full Joint Distribution Write the full joint distribution and factorize it with the chain rule: P(x_1, …, x_n) = P(x_n | x_{n-1}, …, x_1) P(x_{n-1} | x_{n-2}, …, x_1) ⋯ P(x_1) = ∏_{i=1}^{n} P(x_i | x_{i-1}, …, x_1)
Global Semantics Global semantics defines the full joint distribution as the product of the local conditional distributions. In other words, as a Bayesian network structure implies that the value of a particular node is conditional only on the values of its parent nodes, this reduces to P(x_1, …, x_n) = ∏_{i=1}^{n} P(x_i | parents(X_i)), in which parents(X_i) denotes the values of Parents(X_i) that appear in x_1, …, x_n.
Global Semantics Example (Burglary/Earthquake/Alarm network): P(a, ¬b, ¬e) = P(a | ¬b, ¬e) P(¬b) P(¬e)
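The global-semantics product can be sketched directly in code: the probability of a full assignment is the product of each node's CPT entry given its parents. The network and numbers are the classic Burglary/Earthquake/Alarm fragment with illustrative values.

```python
# Network as {node: tuple of parents}; CPTs give P(node=true | parent values).
parents = {"B": (), "E": (), "A": ("B", "E")}
cpt = {
    "B": {(): 0.001},
    "E": {(): 0.002},
    "A": {(True, True): 0.95, (True, False): 0.94,
          (False, True): 0.29, (False, False): 0.001},
}

def prob(var, value, assignment):
    """P(var = value | parents(var)), read off the CPT."""
    key = tuple(assignment[p] for p in parents[var])
    p_true = cpt[var][key]
    return p_true if value else 1 - p_true

def joint_prob(assignment):
    """Global semantics: P(x_1, ..., x_n) = prod_i P(x_i | parents(X_i))."""
    result = 1.0
    for var, value in assignment.items():
        result *= prob(var, value, assignment)
    return result

# P(a, ¬b, ¬e) = P(a | ¬b, ¬e) P(¬b) P(¬e) = 0.001 * 0.999 * 0.998
print(joint_prob({"B": False, "E": False, "A": True}))
```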
Local Semantics We can look at the actual graph structure and determine conditional independence relationships. Local semantics: a node X is conditionally independent of its non-descendants (Z_1j, …, Z_nj), given its parents (U_1, …, U_m). Theorem: local semantics hold if and only if global semantics hold.
Conditional Independence: Markov blanket A node X is conditionally independent of all other nodes in the network, given its parents U_1, …, U_m, children Y_1, …, Y_n, and children's parents Z_1j, …, Z_nj; that is, given its Markov blanket: parents + children + children's parents.
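The Markov blanket can be read off the graph mechanically: a small sketch, using a hypothetical five-node DAG given as a dict from each node to its tuple of parents.

```python
# Hypothetical DAG: U -> X -> Y <- Z, Y -> W.
parents = {"U": (), "Z": (), "X": ("U",), "Y": ("X", "Z"), "W": ("Y",)}

def markov_blanket(node, parents):
    """Parents + children + children's other parents of `node`."""
    children = [n for n, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child])  # children's co-parents
    blanket.discard(node)               # the node itself is not in its blanket
    return blanket

print(markov_blanket("X", parents))  # parent U, child Y, co-parent Z
```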
Pearl’s Network Construction Algorithm We need a method such that a series of locally testable assertions of conditional independence guarantees the required global semantics: 1. Choose an ordering of the variables X_1, …, X_n. 2. For i = 1 to n: add X_i to the network and select as its parents a minimal set of nodes from X_1, …, X_{i-1} such that P(X_i | Parents(X_i)) = P(X_i | X_{i-1}, …, X_1).
Example: Lung Cancer Diagnosis A patient has been suffering from shortness of breath (called dyspnoea) and visits the doctor, worried that he has lung cancer. The doctor knows that other diseases, such as tuberculosis and bronchitis, are possible causes, as well as lung cancer. She also knows that other relevant information includes whether or not the patient is a smoker (increasing the chances of cancer and bronchitis) and what sort of air pollution he has been exposed to. A positive X-ray would indicate either TB or lung cancer.
Lung cancer example: nodes and values
Lung cancer example: CPTs Do the CPTs express all possible combinations of values?
Reasoning with Bayesian Networks • Basic task for any probabilistic inference system: Compute the posterior probability distribution for a set of query variables, given new information about some evidence variables. • Also called conditioning or belief updating or inference.
Most Usual Queries Let X = E ∪ Y ∪ Z, where E are the evidence variables, e are the observed values, Y are the variables of interest, and Z are the rest of the variables.
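The standard posterior query P(Y | E=e) can be sketched by enumeration directly on the network: sum the product of CPT entries over the remaining variables Z, then normalize. The network and numbers are the same illustrative Burglary/Earthquake/Alarm fragment.

```python
from itertools import product

parents = {"B": (), "E": (), "A": ("B", "E")}
cpt = {"B": {(): 0.001}, "E": {(): 0.002},
       "A": {(True, True): 0.95, (True, False): 0.94,
             (False, True): 0.29, (False, False): 0.001}}

def joint_prob(assignment):
    """Product of CPT entries: P(x_1, ..., x_n) = prod_i P(x_i | parents(X_i))."""
    p = 1.0
    for var, val in assignment.items():
        key = tuple(assignment[q] for q in parents[var])
        p_true = cpt[var][key]
        p *= p_true if val else 1 - p_true
    return p

def posterior(query_var, evidence):
    """P(query_var | evidence): sum out the hidden variables Z, then normalize."""
    hidden = [v for v in parents if v != query_var and v not in evidence]
    dist = {}
    for qv in (True, False):
        total = 0.0
        for vals in product((True, False), repeat=len(hidden)):
            assignment = {query_var: qv, **evidence, **dict(zip(hidden, vals))}
            total += joint_prob(assignment)
        dist[qv] = total
    norm = sum(dist.values())
    return {v: p / norm for v, p in dist.items()}

result = posterior("B", {"A": True})  # diagnostic query P(Burglary | Alarm=true)
print(result)
```

Here "B" is the query variable Y, the alarm observation is the evidence e, and Earthquake is the remaining variable Z that gets summed out.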
Types of reasoning How do you express these reasoning types (e.g., diagnostic, predictive, intercausal, and combined) in terms of conditional probabilities?
Some tools for using Bayesian Networks in real systems • BayesiaLab: http://www.bayesia.com/ • GeNIe: https://dslpitt.org/genie/ • Hugin: http://www.hugin.com • Netica: http://www.norsys.com/ Why not download one of these tools and play a little?
Sources of this Lecture • S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach. Third Edition. • K. B. Korb, A. E. Nicholson, Bayesian Artificial Intelligence, Second Edition, 2010.