Intro to AI: Lecture 8
Bayesian Networks
Volker Sorge

Outline: Introduction · A Bayesian Network · Inference in Bayesian Networks
Specifying Probability Distributions

◮ Specifying a probability for every atomic event is impractical.
◮ We have already seen that it can be easier to specify probability distributions by using (conditional) independence.
◮ Bayesian (Belief) Networks allow us
    ◮ to specify any distribution,
    ◮ to specify such distributions concisely, and in a natural way, if there is (conditional) independence.
Idea of a Bayesian Network

◮ Fix a set of random variables {X_1, ..., X_n}.
◮ If every variable takes k values, we have to specify on the order of k^n probabilities to obtain the complete joint distribution.
◮ A Bayesian Network avoids this by representing only the direct influences between random variables and restricting the probability distributions that need to be specified to those direct influences.
Setting Up a Bayesian Network

◮ Every random variable X_1, ..., X_n is a node in the network.
◮ Influences are given by directed edges between nodes.
◮ Each node holds the conditional probability distribution of its variable given its parent nodes (see the sketch below).
◮ If we do this naively, we can still end up specifying close to k^n probabilities.
◮ If we exploit conditional independence, we can reduce the complexity to the order of k·n.

Every node of a Bayesian Network is conditionally independent of its non-descendants given its parents.
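To make the setup concrete, here is a minimal sketch of one way to represent such a node in Python. This is not from the lecture; the class name `Node` and its fields are illustrative. Each node stores its parent names and a conditional probability table (CPT) mapping parent-value tuples to P(node = T | parents).

```python
from typing import Dict, List, Tuple

class Node:
    """One Boolean random variable in the network, with its direct influences."""

    def __init__(self, name: str, parents: List[str],
                 cpt: Dict[Tuple[bool, ...], float]):
        self.name = name
        self.parents = parents   # names of parent nodes (incoming edges)
        self.cpt = cpt           # parent-value tuple -> P(name = T | parents)

    def p(self, value: bool, parent_values: Tuple[bool, ...]) -> float:
        """P(name = value | parents = parent_values), read off the CPT."""
        p_true = self.cpt[parent_values]
        return p_true if value else 1.0 - p_true
```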
Example: Bayesian Net

Nodes: A = Oversleeps, B = Pershore closed, C = Volker late,
       D = Mark late, E = Committee cancelled
Edges: A → C, B → C, B → D, C → E, D → E

P(A) = .6                P(B) = .2

  A  B | P(C)        B | P(D)        C  D | P(E)
  T  T |  .9         T |  .3         T  T |  .9
  T  F |  .7         F |  .4         T  F |  .4
  F  T |  .8                         F  T |  .5
  F  F |  .2                         F  F |  .3
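One possible encoding of this example network, reusing the hypothetical `Node` class sketched above. Counting parameters also illustrates the saving claimed on the previous slide: the five CPTs hold 1 + 1 + 4 + 2 + 4 = 12 probabilities, whereas an explicit joint distribution over five Boolean variables would need 2^5 − 1 = 31.

```python
A = Node("Oversleeps",          [],         {(): 0.6})
B = Node("Pershore closed",     [],         {(): 0.2})
C = Node("Volker late",         ["A", "B"], {(True, True): 0.9, (True, False): 0.7,
                                             (False, True): 0.8, (False, False): 0.2})
D = Node("Mark late",           ["B"],      {(True,): 0.3, (False,): 0.4})
E = Node("Committee cancelled", ["C", "D"], {(True, True): 0.9, (True, False): 0.4,
                                             (False, True): 0.5, (False, False): 0.3})

network = {"A": A, "B": B, "C": C, "D": D, "E": E}

# Example lookup: P(Volker late | overslept, Pershore open) = 0.7
print(C.p(True, (True, False)))
```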
Probabilistic Inference: Goal

◮ Compute the probability distribution for some event given some evidence.
◮ More formally:
    ◮ Let Q be a set of query variables,
    ◮ let E be a set of evidence variables,
    ◮ compute P(Q | E).
◮ Here evidence means that we know the exact event for the variables in E.
◮ E.g., we know that Volker has overslept; how likely is it that the committee will be cancelled?
Types of Inference

Diagnostic Inferences: from effects to causes.
    How likely is a cause, given some observed effect?
Causal Inferences: from causes to effects.
    How likely is it that some observed events will cause some other event?
Intercausal Inferences: between causes of a common effect.
    If we know one cause of an event, how likely is it that some other cause is also present? This is also sometimes called "explaining away".
Mixed Inferences: combining two or more of the above.
Inferences in Bayesian Nets

[Diagram: schematic overview of the four inference types, showing the position of the query nodes Q relative to the observed evidence nodes E in the network for diagnostic, causal, intercausal, and mixed queries.]
Inference Examples

Diagnostic:   𝐏(B | E)    = ⟨.21, .79⟩
Causal:       𝐏(E | A)    = ⟨.533, .467⟩
Intercausal:  𝐏(C | D)    = ⟨.557, .443⟩
Mixed:        𝐏(C | B, E) = ⟨.904, .096⟩

Computed with http://aispace.org/bayes/.
Notation

◮ P stands for a simple probability.
◮ 𝐏 (boldface P) stands for a probability distribution (i.e., a set of probabilities).
◮ P(A | B) denotes the probability of A under the condition B.
◮ 𝐏(A | B) denotes the probability distribution for A under the condition B.
◮ 𝐏(A, B) is the not yet normalised distribution for A under the condition B; that is, α 𝐏(A, B) = 𝐏(A | B) (illustrated below).
◮ Finally, lower-case letters stand for variables that have to be summed out (sometimes called nuisance variables).
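A small sketch of what the normalisation constant α does in practice. The numeric inputs are intermediate values I computed from the example network (they are not given on the slides): the two unnormalised entries of 𝐏(B, E=T).

```python
def normalize(unnormalized):
    """Rescale a list of unnormalised probabilities so they sum to 1."""
    alpha = 1.0 / sum(unnormalized)
    return [alpha * p for p in unnormalized]

# Unnormalised P(B, E=T) works out to roughly <.10468, .392> in the
# example network; normalising yields the diagnostic distribution P(B | E=T):
print(normalize([0.10468, 0.392]))   # -> approximately [0.21, 0.79]
```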
Example Computation

◮ 𝐏(B | E = T) = α 𝐏(B, E = T).
◮ We compute 𝐏(B, E) by summing out the remaining variables A, C, D.
◮ We write a, c, d for the respective events.
◮ This means we have to compute Σ_a Σ_c Σ_d 𝐏(B, E, a, c, d), which is the (not normalised) distribution of B under the assumption that E = T, while summing out a, c, d.
◮ A simple example of summing out: Σ_a 𝐏(B, a) = 𝐏(B, A) + 𝐏(B, ¬A).
The great advantage of a Bayesian network is that we can effectively use the conditional probabilities it provides to express the term 𝐏(B, E, a, c, d) as follows:

    𝐏(B, E) = Σ_a Σ_c Σ_d 𝐏(B, E, a, c, d)
            = Σ_a Σ_c Σ_d 𝐏(B) P(a) P(d | B) P(c | B, a) P(E | c, d)

We observe that all the probability distributions on the right-hand side are indeed fully given in the network. The summing out works as follows:

    𝐏(B, E) = Σ_a Σ_c Σ_d 𝐏(B) P(a) P(d | B) P(c | B, a) P(E | c, d)
            = 𝐏(B) Σ_a P(a) Σ_c Σ_d P(d | B) P(c | B, a) P(E | c, d)
            = 𝐏(B) P(A) Σ_c Σ_d P(d | B) P(c | B, A) P(E | c, d)
              + 𝐏(B) P(¬A) Σ_c Σ_d P(d | B) P(c | B, ¬A) P(E | c, d)
            = 𝐏(B) P(A) Σ_d P(d | B) (P(C | B, A) P(E | C, d) + P(¬C | B, A) P(E | ¬C, d))
              + ...

and so on, expanding the sums over c and d in the same way for the ¬A branch.
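The whole summing-out scheme can also be run mechanically. Below is a self-contained sketch of inference by enumeration over the example network (the function and variable names are my own, not from the lecture); it reproduces the diagnostic example 𝐏(B | E) = ⟨.21, .79⟩ from the earlier slide.

```python
import itertools

# The example network: parent lists and CPTs mapping parent values
# to P(variable = T | parents).
parents = {"A": (), "B": (), "C": ("A", "B"), "D": ("B",), "E": ("C", "D")}
cpt = {
    "A": {(): 0.6},
    "B": {(): 0.2},
    "C": {(True, True): 0.9, (True, False): 0.7,
          (False, True): 0.8, (False, False): 0.2},
    "D": {(True,): 0.3, (False,): 0.4},
    "E": {(True, True): 0.9, (True, False): 0.4,
          (False, True): 0.5, (False, False): 0.3},
}

def prob(var, value, assignment):
    """P(var = value | parents(var)), read off the CPT."""
    p_true = cpt[var][tuple(assignment[p] for p in parents[var])]
    return p_true if value else 1.0 - p_true

def joint(assignment):
    """P(a, b, c, d, e) as the product of the network's CPT entries."""
    result = 1.0
    for var, value in assignment.items():
        result *= prob(var, value, assignment)
    return result

def query(q, evidence):
    """Return the distribution P(q | evidence) by enumerating all events."""
    dist = []
    for q_value in (True, False):
        total = 0.0
        hidden = [v for v in parents if v != q and v not in evidence]
        for values in itertools.product((True, False), repeat=len(hidden)):
            assignment = dict(zip(hidden, values))
            assignment.update(evidence)
            assignment[q] = q_value
            total += joint(assignment)       # sum out the hidden variables
        dist.append(total)
    alpha = 1.0 / sum(dist)                  # normalise
    return [alpha * p for p in dist]

print(query("B", {"E": True}))   # -> approximately [0.21, 0.79]
```

Note that plain enumeration is exponential in the number of hidden variables; the factored expression derived above shows how pulling factors out of the sums (variable elimination) would share subcomputations instead.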
Questions

◮ In the above Bayesian Network, give examples of the following concepts:
    ◮ independent events,
    ◮ conditionally independent events, and
    ◮ dependent events.
◮ Consider again the example network. Compute the mixed inference with respect to the observed evidence that no one overslept and the committee was cancelled (i.e., A = F and E = T). How likely is it that Volker was late? (A mechanical check is sketched below.)
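One way to check an answer to the second question, assuming the network definition and the `query()` sketch from the previous slide have been run ("no one overslept" corresponds to A = F):

```python
# Assumes cpt, parents and query() from the enumeration sketch above.
print(query("C", {"A": False, "E": True}))   # -> the distribution <P(C), P(not C)>
```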