
343H: Honors AI, Lecture 15: Bayes Nets Independence (3/18/2014)



  1. 343H: Honors AI, Lecture 15: Bayes Nets Independence, 3/18/2014. Kristen Grauman, UT Austin. Slides courtesy of Dan Klein, UC Berkeley.

  2. Probability recap
   Conditional probability: P(x | y) = P(x, y) / P(y)
   Product rule: P(x, y) = P(x | y) P(y)
   Chain rule: P(x_1, ..., x_n) = ∏_i P(x_i | x_1, ..., x_{i-1})
   X, Y independent if and only if: ∀x, y: P(x, y) = P(x) P(y)
   X and Y are conditionally independent given Z if and only if: ∀x, y, z: P(x, y | z) = P(x | z) P(y | z)
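The two independence definitions above can be checked numerically. Below is a minimal sketch using a hypothetical three-variable toy distribution (all numbers invented for illustration) in which X and Y are conditionally independent given Z by construction, yet marginally dependent:

```python
from itertools import product

# Hypothetical toy distribution: Z ~ Bernoulli(0.5),
# X and Y each depend on Z but not on each other.
pZ = {0: 0.5, 1: 0.5}
pX_given_Z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # pX_given_Z[z][x]
pY_given_Z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.4, 1: 0.6}}  # pY_given_Z[z][y]

# Joint P(x, y, z) built by construction as P(z) P(x|z) P(y|z)
joint = {(x, y, z): pZ[z] * pX_given_Z[z][x] * pY_given_Z[z][y]
         for x, y, z in product([0, 1], repeat=3)}

def marg(keep):
    """Marginalize the joint onto the variable indices in `keep` (0=X, 1=Y, 2=Z)."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

pXY, pX, pY = marg([0, 1]), marg([0]), marg([1])

# Conditional independence given Z: P(x, y | z) = P(x | z) P(y | z) holds everywhere.
for x, y, z in product([0, 1], repeat=3):
    assert abs(joint[(x, y, z)] / pZ[z]
               - pX_given_Z[z][x] * pY_given_Z[z][y]) < 1e-12

# ...but marginal independence fails: P(x, y) != P(x) P(y) somewhere.
print(any(abs(pXY[(x, y)] - pX[(x,)] * pY[(y,)]) > 1e-6
          for x, y in product([0, 1], repeat=2)))  # True
```

This also previews the lecture's main point: conditional independence and marginal independence are separate properties of a distribution.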

  3. Bayes’ Nets
   A Bayes’ net is an efficient encoding of a probabilistic model of a domain
   Questions we can ask:
   Inference: given a fixed BN, what is P(X | e)?
   Representation: given a BN graph, what kinds of distributions can it encode?
   Modeling: what BN is most appropriate for a given domain?

  4. Example: Alarm Network
  Burglary (B) and Earthquake (E) are parents of Alarm (A); Alarm is the parent of JohnCalls (J) and MaryCalls (M).

  P(B):  +b 0.001, -b 0.999
  P(E):  +e 0.002, -e 0.998

  P(A | B, E):
  +b +e:  +a 0.95,  -a 0.05
  +b -e:  +a 0.94,  -a 0.06
  -b +e:  +a 0.29,  -a 0.71
  -b -e:  +a 0.001, -a 0.999

  P(J | A):  +a: +j 0.9, -j 0.1;  -a: +j 0.05, -j 0.95
  P(M | A):  +a: +m 0.7, -m 0.3;  -a: +m 0.01, -m 0.99

  5. Bayes’ Net Semantics
   A directed, acyclic graph, one node per random variable
   A conditional probability table (CPT) for each node: a collection of distributions over X, one for each combination of the parents’ values, P(X | a_1, ..., a_n)
   Bayes’ nets implicitly encode joint distributions as a product of local conditional distributions

  6. Recall: Probabilities in BNs
   Why are we guaranteed that setting P(x_1, ..., x_n) = ∏_i P(x_i | parents(X_i)) results in a proper distribution?
   Chain rule (valid for all distributions): P(x_1, ..., x_n) = ∏_i P(x_i | x_1, ..., x_{i-1})
   Due to assumed conditional independences: P(x_i | x_1, ..., x_{i-1}) = P(x_i | parents(X_i))
   Consequence: P(x_1, ..., x_n) = ∏_i P(x_i | parents(X_i))

  7. Example: Alarm Network
  P(+b, -e, +a, -j, +m)
   = P(+b) P(-e) P(+a | +b, -e) P(-j | +a) P(+m | +a)
   = 0.001 × 0.998 × 0.94 × 0.1 × 0.7
  (CPTs as in the Alarm Network on slide 4.)
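The product above can be reproduced in code. A sketch that encodes the alarm network's CPTs as plain dicts (True/False stand for the +/- values; the encoding is one illustrative choice, not the slides' notation):

```python
# CPTs of the alarm network, taken from the slide.
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {  # P(A=True | B, E); P(A=False | b, e) is the complement
    (True, True): 0.95, (True, False): 0.94,
    (False, True): 0.29, (False, False): 0.001,
}
P_J = {True: 0.9, False: 0.05}   # P(J=True | A)
P_M = {True: 0.7, False: 0.01}   # P(M=True | A)

def joint(b, e, a, j, m):
    """P(b, e, a, j, m) as the product of the local conditionals."""
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

# The slide's query: P(+b, -e, +a, -j, +m) = 0.001 * 0.998 * 0.94 * 0.1 * 0.7
print(joint(True, False, True, False, True))
```

Summing `joint` over all 32 assignments gives 1, which is the "proper distribution" guarantee from the previous slide.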

  8. Size of a Bayes’ Net
   How big is a joint distribution over N Boolean variables? 2^N
   How big is an N-node net if nodes have up to k parents? O(N × 2^(k+1))
   Both give you the power to calculate any joint probability
   BNs: huge space savings!
   Also easier to elicit local CPTs
   Also turns out to be faster to answer queries (coming)

  9. Bayes’ Nets
   Representation
   Conditional independences
   Probabilistic inference
   Learning Bayes’ nets from data

  10. Conditional Independence
   X and Y are independent if: ∀x, y: P(x, y) = P(x) P(y)
   X and Y are conditionally independent given Z if: ∀x, y, z: P(x, y | z) = P(x | z) P(y | z)
   (Conditional) independence is a property of a distribution
   Example:

  11. Bayes Nets: Assumptions
   Assumptions we are required to make to define the Bayes net when given the graph: P(x_i | x_1, ..., x_{i-1}) = P(x_i | parents(X_i))
   Beyond the above (“chain-rule → Bayes net”) conditional independence assumptions:
   Often have many more conditional independences
   They can be read off the graph
   Important for modeling: understand assumptions made when choosing a Bayes net graph

  12. Example
  Graph: X → Y → Z → W (a chain)
   Conditional independence assumptions directly from simplifications in the chain rule: P(x, y, z, w) = P(x) P(y | x) P(z | y) P(w | z)
   Additional implied conditional independence assumptions?

  13. Independence in a BN
   Important question about a BN: are two nodes independent given certain evidence?
   If yes, can prove using algebra (tedious in general)
   If no, can prove with a counterexample
   Example: X → Y → Z  (X: low pressure, Y: rain, Z: traffic)
   Question: are X and Z necessarily independent?
   Answer: no. Example: low pressure causes rain, which causes traffic. X can influence Z; Z can influence X (via Y)

  14. D-separation: Outline
   D-separation: a condition/algorithm for answering such queries
   Study independence properties for triples
   Analyze complex cases in terms of member triples: reduce the big question to one of the base cases

  15. Causal Chains (1 of 3 structures)
   This configuration is a “causal chain”: X → Y → Z
  (X: low pressure, Y: rain, Z: traffic)
   Is X independent of Z given Y? Yes!
   Evidence along the chain “blocks” the influence
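The chain's claim (X ⊥ Z | Y) can be checked numerically. A sketch with invented CPT numbers for a pressure → rain → traffic chain; the check is that P(z | x, y) does not depend on x:

```python
from itertools import product

# Hypothetical CPTs for a chain X -> Y -> Z (pressure -> rain -> traffic).
pX = {True: 0.3, False: 0.7}
pY_given_X = {True: 0.8, False: 0.1}   # P(Y=True | x)
pZ_given_Y = {True: 0.9, False: 0.2}   # P(Z=True | y)

def joint(x, y, z):
    py = pY_given_X[x] if y else 1 - pY_given_X[x]
    pz = pZ_given_Y[y] if z else 1 - pZ_given_Y[y]
    return pX[x] * py * pz

# Check X ⊥ Z | Y: P(z | x, y) is the same for both values of x.
for y, z in product([True, False], repeat=2):
    p_z_given_xy = {}
    for x in [True, False]:
        pxy = sum(joint(x, y, zz) for zz in [True, False])
        p_z_given_xy[x] = joint(x, y, z) / pxy
    assert abs(p_z_given_xy[True] - p_z_given_xy[False]) < 1e-12
print("X ⊥ Z | Y verified for this chain")
```

The identity holds for any chain parameterization, since P(z | x, y) algebraically reduces to P(z | y).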

  16. Common Cause (2 of 3 structures)
   Another basic configuration: two effects of the same cause: X ← Y → Z
  (Y: project due, X: Piazza busy, Z: lab full)
   Are X and Z independent? No (in general)
   Are X and Z independent given Y? Yes!
   Observing the cause blocks influence between effects.

  17. Common Effect (3 of 3 structures)
   Last configuration: two causes of one effect (v-structure): X → Y ← Z
  (X: raining, Z: ballgame, Y: traffic)
   Are X and Z independent? Yes: the ballgame and the rain cause traffic, but they are not correlated
   Are X and Z independent given Y? No: seeing traffic puts the rain and the ballgame in competition as explanations
   This is backwards from the other cases
   Observing an effect activates influence between possible causes.
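The "explaining away" effect can be demonstrated numerically. A sketch with invented numbers for the rain → traffic ← ballgame v-structure: the parents are independent by construction, but become dependent once traffic is observed:

```python
from itertools import product

# Hypothetical numbers for the v-structure Rain -> Traffic <- Ballgame.
pR = {True: 0.1, False: 0.9}         # rain
pG = {True: 0.2, False: 0.8}         # ballgame
pT = {  # P(Traffic=True | rain, game)
    (True, True): 0.95, (True, False): 0.8,
    (False, True): 0.6, (False, False): 0.1,
}

def joint(r, g, t):
    pt = pT[(r, g)] if t else 1 - pT[(r, g)]
    return pR[r] * pG[g] * pt

# Marginally, R and G are independent by construction:
for r, g in product([True, False], repeat=2):
    prg = sum(joint(r, g, t) for t in [True, False])
    assert abs(prg - pR[r] * pG[g]) < 1e-12

# But conditioned on seeing traffic, they become dependent ("explaining away"):
pT_true = sum(joint(r, g, True) for r, g in product([True, False], repeat=2))
pR_given_T = sum(joint(True, g, True) for g in [True, False]) / pT_true
pR_given_TG = joint(True, True, True) / sum(joint(r, True, True)
                                            for r in [True, False])
# With these numbers, also learning of the ballgame lowers P(rain | traffic).
print(pR_given_T, pR_given_TG)
```

Here P(rain | traffic) ≈ 0.316 drops to P(rain | traffic, ballgame) ≈ 0.150: the ballgame "explains away" the traffic.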

  18. The General Case
   General question: in a given BN, are two variables independent (given evidence)?
   Solution: analyze the graph
   Any complex example can be analyzed using these three canonical cases

  19. Reachability
  (Graph over nodes L, R, B, D, T)
   Recipe: shade evidence nodes, look for paths in the resulting graph
   Attempt 1: if two nodes are connected by an undirected path blocked by a shaded node, they are conditionally independent
   Almost works, but not quite
   Where does it break?
   Answer: the v-structure at T doesn’t count as a link in a path unless “active”

  20. Active / Inactive Paths
   Question: are X and Y conditionally independent given evidence vars {Z}?
   Yes, if X and Y are “separated” by Z
   Consider all undirected paths from X to Y
   No active paths = independence!
   A path is active if each triple is active:
   Causal chain A → B → C where B is unobserved (either direction)
   Common cause A ← B → C where B is unobserved
   Common effect (aka v-structure) A → B ← C where B or one of its descendants is observed
   All it takes to block a path is a single inactive segment

  21. Reachability
  (Graph over nodes L, R, B, D, T, plus a traffic report node)
   Recipe: shade evidence nodes, look for paths in the resulting graph

  22. D-Separation
   Given query: X_i ⊥ X_j | {Z_1, ..., Z_k}?
   For all (undirected!) paths between X_i and X_j:
   Check whether the path is active
   If any path is active, return “independence not guaranteed”
   Otherwise (i.e., if all paths are inactive), independence is guaranteed: return X_i ⊥ X_j | {Z_1, ..., Z_k}
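The procedure above can be sketched in code. A minimal path-enumeration implementation, suitable only for small graphs (the graph representation, node -> set of parents, and all names are illustrative choices, not from the slides):

```python
def descendants(graph, node):
    """All descendants of `node` (children, grandchildren, ...)."""
    children = {n for n, parents in graph.items() if node in parents}
    out = set(children)
    for c in children:
        out |= descendants(graph, c)
    return out

def triple_active(graph, a, b, c, observed):
    """Is the consecutive path segment a - b - c active given `observed`?"""
    if a in graph[b] and c in graph[b]:  # v-structure: a -> b <- c
        return b in observed or bool(descendants(graph, b) & observed)
    # Causal chain (either direction) or common cause: active iff b unobserved.
    return b not in observed

def d_separated(graph, x, y, observed):
    """True iff every undirected path from x to y is blocked given `observed`."""
    # Build undirected adjacency from the parent sets.
    neighbors = {n: set(ps) for n, ps in graph.items()}
    for n, ps in graph.items():
        for p in ps:
            neighbors[p].add(n)

    def paths(node, visited):
        if node == y:
            yield [node]
            return
        for nxt in neighbors[node] - visited:
            for rest in paths(nxt, visited | {nxt}):
                yield [node] + rest

    for path in paths(x, {x}):
        if all(triple_active(graph, path[i], path[i + 1], path[i + 2], observed)
               for i in range(len(path) - 2)):
            return False  # found an active path
    return True

# The alarm network, as node -> set of parents.
alarm = {"B": set(), "E": set(), "A": {"B", "E"}, "J": {"A"}, "M": {"A"}}
print(d_separated(alarm, "B", "E", set()))   # True: the v-structure at A blocks
print(d_separated(alarm, "B", "E", {"A"}))   # False: observing A activates it
print(d_separated(alarm, "J", "M", {"A"}))   # True: the common cause A is observed
```

Enumerating all simple paths is exponential in the worst case; the Bayes-ball reachability algorithm gives the same answers in linear time, but the version above matches the slide's recipe directly.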

  23. Example 1
  (Graph over R, B, T, T'; d-separation queries with answers, from the slide)

  24. Example 2
  (Graph over L, R, B, D, T, T'; d-separation queries with answers, from the slide)

  25. Example 3
   Variables:
   R: raining
   T: traffic
   D: roof drips
   S: I’m sad
   Questions: (d-separation queries over these variables, answered on the slide)

  26. Structure Implications
   Given a Bayes net structure, can run d-separation to build a complete list of conditional independences that are necessarily true, of the form X_i ⊥ X_j | {Z_1, ..., Z_k}
   This list determines the set of probability distributions that can be represented by this BN

  27. Computing all independences

  28. Topology Limits Distributions
  (The slide shows the possible three-node graphs over X, Y, Z)
   Given some graph topology G, only certain joint distributions can be encoded
   The graph structure guarantees certain (conditional) independences
   (There might be more independence)
   Adding arcs increases the set of distributions, but has several costs
   Full conditioning can encode any distribution

  29. Summary
   Bayes nets compactly encode joint distributions
   Guaranteed independencies of distributions can be deduced from BN graph structure
   D-separation gives precise conditional independence guarantees from the graph alone
   A Bayes’ net’s joint distribution may have further (conditional) independence that is not detectable until you inspect its specific distribution

  30. Bayes’ Nets
   Representation
   Conditional independences
   Probabilistic inference
   Learning Bayes’ nets from data
