  1. CS 188: Artificial Intelligence
     Probability
     Pieter Abbeel – UC Berkeley (many slides adapted from Dan Klein)

     Our Status in CS 188
     § We're done with Part I: Search and Planning!
     § Part II: Probabilistic Reasoning
       § Diagnosis
       § Tracking objects
       § Speech recognition
       § Robot mapping
       § Genetics
       § Error-correcting codes
       § ... lots more!
     § Part III: Machine Learning

  2. Part II: Probabilistic Reasoning
     § Probability
     § Distributions over LARGE numbers of random variables
       § Representation
       § Independence
     § Inference
       § Variable elimination
       § Sampling
     § Hidden Markov models

     Probability
     § Probability
       § Random variables
       § Joint and marginal distributions
       § Conditional distributions
       § Inference by enumeration
       § Product rule, chain rule, Bayes' rule
       § Independence
     § You'll need all this stuff A LOT for the next few weeks, so make sure you go over it now and know it inside out! Over the next few weeks we will learn how to make these computations work efficiently for LARGE numbers of random variables.

  3. Inference in Ghostbusters
     § A ghost is in the grid somewhere
     § Sensor readings tell how close a square is to the ghost
       § On the ghost: red
       § 1 or 2 away: orange
       § 3 or 4 away: yellow
       § 5+ away: green
     § Sensors are noisy, but we know P(Color | Distance), e.g. for Distance = 3:

       P(red | 3)   P(orange | 3)   P(yellow | 3)   P(green | 3)
       0.05         0.15            0.5             0.3

     Uncertainty
     § General situation:
       § Evidence: the agent knows certain things about the state of the world (e.g., sensor readings or symptoms)
       § Hidden variables: the agent needs to reason about other aspects (e.g., where an object is or what disease is present)
       § Model: the agent knows something about how the known variables relate to the unknown variables
     § Probabilistic reasoning gives us a framework for managing our beliefs and knowledge
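
     A minimal sketch of how such a noisy sensor model could be stored in Python: one distribution over colors per distance value. The dictionary name is illustrative, and only the Distance = 3 row comes from the slide; each row must itself be a valid distribution.

     ```python
     # Conditional table P(Color | Distance), keyed by distance.
     # Only the Distance = 3 row is given on the slide.
     P_color_given_distance = {
         3: {"red": 0.05, "orange": 0.15, "yellow": 0.5, "green": 0.3},
         # ... one row per other distance value
     }

     # Each row is a distribution over colors, so it must sum to one.
     assert abs(sum(P_color_given_distance[3].values()) - 1.0) < 1e-9
     ```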

  4. Random Variables
     § A random variable is some aspect of the world about which we (may) have uncertainty
       § R = Is it raining?
       § D = How long will it take to drive to work?
       § L = Where am I?
     § We denote random variables with capital letters
     § Like variables in a CSP, random variables have domains
       § R in {true, false} (sometimes written as {+r, ¬r})
       § D in [0, ∞)
       § L in possible locations, maybe {(0,0), (0,1), ...}

     Probability Distributions
     § Unobserved random variables have distributions, e.g.:

       T      P          W       P
       warm   0.5        sun     0.6
       cold   0.5        rain    0.1
                         fog     0.3
                         meteor  0.0

     § A distribution is a TABLE of probabilities of values
     § A probability (lower-case value) is a single number, e.g. P(W = rain) = 0.1
     § Must have: P(x) ≥ 0 for every value x, and Σ_x P(X = x) = 1
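
     A minimal sketch of these two tables in Python, with the validity check from the last bullet (nonnegative entries that sum to one). The dict names and helper are illustrative.

     ```python
     # The slide's distributions P(T) and P(W) as value -> probability tables.
     P_T = {"warm": 0.5, "cold": 0.5}
     P_W = {"sun": 0.6, "rain": 0.1, "fog": 0.3, "meteor": 0.0}

     def is_valid_distribution(dist, tol=1e-9):
         """Every probability must be >= 0 and the entries must sum to 1."""
         return all(p >= 0 for p in dist.values()) and abs(sum(dist.values()) - 1.0) < tol

     assert is_valid_distribution(P_T) and is_valid_distribution(P_W)
     print(P_W["rain"])  # a single probability: P(W = rain) = 0.1
     ```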

  5. Joint Distributions
     § A joint distribution over a set of random variables specifies a real number for each assignment (or outcome), e.g. P(T, W):

       T      W      P
       hot    sun    0.4
       hot    rain   0.1
       cold   sun    0.2
       cold   rain   0.3

     § Size of distribution if n variables with domain sizes d? (d^n entries)
     § Must obey: every entry is ≥ 0 and the entries sum to 1
     § For all but the smallest distributions, impractical to write out

     Probabilistic Models
     § A probabilistic model is a joint distribution over a set of random variables
     § Probabilistic models:
       § (Random) variables with domains; assignments are called outcomes
       § Joint distributions: say whether assignments (outcomes) are likely
       § Normalized: sum to 1.0
       § Ideally: only certain variables directly interact
     § Constraint satisfaction problems:
       § Variables with domains
       § Constraints: state whether assignments are possible
       § Ideally: only certain variables directly interact

     Distribution over T, W:        Constraint over T, W:
       T     W     P                  T     W     P
       hot   sun   0.4                hot   sun   T
       hot   rain  0.1                hot   rain  F
       cold  sun   0.2                cold  sun   F
       cold  rain  0.3                cold  rain  T
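
     A minimal sketch of the joint P(T, W) as a Python dict keyed by outcomes (names illustrative). With n variables of domain size d the table needs d^n entries, which is why writing the full joint out quickly becomes impractical.

     ```python
     # The slide's joint distribution P(T, W), keyed by (t, w) outcomes.
     P_TW = {
         ("hot", "sun"): 0.4,
         ("hot", "rain"): 0.1,
         ("cold", "sun"): 0.2,
         ("cold", "rain"): 0.3,
     }

     # Same constraints as any distribution: nonnegative entries summing to one.
     assert all(p >= 0 for p in P_TW.values())
     assert abs(sum(P_TW.values()) - 1.0) < 1e-9
     ```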

  6. Events
     § An event is a set E of outcomes
     § From a joint distribution, we can calculate the probability of any event
       § Probability that it's hot AND sunny?
       § Probability that it's hot?
       § Probability that it's hot OR sunny?
     § Typically, the events we care about are partial assignments, like P(T = hot)

       T      W      P
       hot    sun    0.4
       hot    rain   0.1
       cold   sun    0.2
       cold   rain   0.3

     Marginal Distributions
     § Marginal distributions are sub-tables which eliminate variables
     § Marginalization (summing out): combine collapsed rows by adding

       T     W     P                T     P            W     P
       hot   sun   0.4      →       hot   0.5          sun   0.6
       hot   rain  0.1              cold  0.5          rain  0.4
       cold  sun   0.2
       cold  rain  0.3
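
     A minimal sketch of both ideas in Python: an event's probability is the sum of the joint entries it contains, and marginalization sums out the other variable. Names are illustrative.

     ```python
     P_TW = {
         ("hot", "sun"): 0.4,
         ("hot", "rain"): 0.1,
         ("cold", "sun"): 0.2,
         ("cold", "rain"): 0.3,
     }

     # Events from the slide: sum the matching outcomes.
     p_hot_and_sun = P_TW[("hot", "sun")]                                        # 0.4
     p_hot = sum(p for (t, w), p in P_TW.items() if t == "hot")                  # 0.5
     p_hot_or_sun = sum(p for (t, w), p in P_TW.items() if t == "hot" or w == "sun")  # 0.7

     # Marginalization (summing out): collapse rows that agree on the kept variable.
     P_T, P_W = {}, {}
     for (t, w), p in P_TW.items():
         P_T[t] = P_T.get(t, 0.0) + p
         P_W[w] = P_W.get(w, 0.0) + p
     print(P_T)  # {'hot': 0.5, 'cold': 0.5}
     print(P_W)  # {'sun': 0.6, 'rain': 0.4} (up to float rounding)
     ```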

  7. Conditional Probabilities
     § A simple relation between joint and conditional probabilities: P(a | b) = P(a, b) / P(b)
     § In fact, this is taken as the definition of a conditional probability

       T      W      P
       hot    sun    0.4
       hot    rain   0.1
       cold   sun    0.2
       cold   rain   0.3

     Conditional Distributions
     § Conditional distributions are probability distributions over some variables given fixed values of others

     Joint distribution P(T, W):      Conditional distributions:
       T     W     P                    P(W | T = hot):   sun 0.8   rain 0.2
       hot   sun   0.4                  P(W | T = cold):  sun 0.4   rain 0.6
       hot   rain  0.1
       cold  sun   0.2
       cold  rain  0.3
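
     A minimal sketch of applying the definition P(a | b) = P(a, b) / P(b) to the joint above to recover the slide's conditional tables. The helper name is illustrative.

     ```python
     P_TW = {
         ("hot", "sun"): 0.4,
         ("hot", "rain"): 0.1,
         ("cold", "sun"): 0.2,
         ("cold", "rain"): 0.3,
     }

     def conditional_W_given_T(joint, t_value):
         """P(W | T = t_value): divide matching joint entries by P(T = t_value)."""
         p_t = sum(p for (t, w), p in joint.items() if t == t_value)
         return {w: p / p_t for (t, w), p in joint.items() if t == t_value}

     print(conditional_W_given_T(P_TW, "hot"))   # {'sun': 0.8, 'rain': 0.2}
     print(conditional_W_given_T(P_TW, "cold"))  # {'sun': 0.4, 'rain': 0.6} (up to float rounding)
     ```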

  8. Normalization Trick
     § A trick to get a whole conditional distribution at once:
       § Select the joint probabilities matching the evidence
       § Normalize the selection (make it sum to one)

       Joint P(T, W):       Select (W = rain):       Normalize:
       T     W     P        T     W     P            T     P
       hot   sun   0.4      hot   rain  0.1          hot   0.25
       hot   rain  0.1      cold  rain  0.3          cold  0.75
       cold  sun   0.2
       cold  rain  0.3

     § Why does this work? The sum of the selection is P(evidence)! (P(rain), here)

     Probabilistic Inference
     § Probabilistic inference: compute a desired probability from other known probabilities (e.g. a conditional from the joint)
     § We generally compute conditional probabilities
       § P(on time | no reported accidents) = 0.90
       § These represent the agent's beliefs given the evidence
     § Probabilities change with new evidence:
       § P(on time | no accidents, 5 a.m.) = 0.95
       § P(on time | no accidents, 5 a.m., raining) = 0.80
       § Observing new evidence causes beliefs to be updated
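
     A minimal sketch of the select-then-normalize trick in Python, reproducing the slide's numbers; the normalizing constant is exactly P(evidence), here P(W = rain) = 0.4. Names are illustrative.

     ```python
     P_TW = {
         ("hot", "sun"): 0.4,
         ("hot", "rain"): 0.1,
         ("cold", "sun"): 0.2,
         ("cold", "rain"): 0.3,
     }

     # Select: keep only the rows consistent with the evidence W = rain.
     selected = {t: p for (t, w), p in P_TW.items() if w == "rain"}   # {'hot': 0.1, 'cold': 0.3}

     # Normalize: divide by the sum of the selection, which equals P(rain).
     z = sum(selected.values())
     P_T_given_rain = {t: p / z for t, p in selected.items()}
     print(P_T_given_rain)  # {'hot': 0.25, 'cold': 0.75}
     ```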

  9. Inference by Enumeration
     § P(sun)?
     § P(sun | winter)?
     § P(sun | winter, warm)?

       S        T      W      P
       summer   hot    sun    0.30
       summer   hot    rain   0.05
       summer   cold   sun    0.10
       summer   cold   rain   0.05
       winter   hot    sun    0.10
       winter   hot    rain   0.05
       winter   cold   sun    0.15
       winter   cold   rain   0.20

     Inference by Enumeration (general case)
     § All variables split into:
       § Evidence variables: E_1 ... E_k = e_1 ... e_k
       § Query* variable: Q
       § Hidden variables: H_1 ... H_r
     § We want: P(Q | e_1 ... e_k)
     § First, select the entries consistent with the evidence
     § Second, sum out H to get the joint of the query and evidence:
       P(Q, e_1 ... e_k) = Σ_{h_1 ... h_r} P(Q, h_1 ... h_r, e_1 ... e_k)
     § Finally, normalize the remaining entries to conditionalize
     § Obvious problems:
       § Worst-case time complexity O(d^n)
       § Space complexity O(d^n) to store the joint distribution
     * Works fine with multiple query variables, too
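
     A minimal sketch of the select / sum-out / normalize recipe applied to the season-temperature-weather joint above. The enumerate_query helper and variable names are illustrative, not course code; note the table's temperature values are hot/cold, so "warm" corresponds to T = hot.

     ```python
     P_STW = {
         ("summer", "hot",  "sun"):  0.30, ("summer", "hot",  "rain"): 0.05,
         ("summer", "cold", "sun"):  0.10, ("summer", "cold", "rain"): 0.05,
         ("winter", "hot",  "sun"):  0.10, ("winter", "hot",  "rain"): 0.05,
         ("winter", "cold", "sun"):  0.15, ("winter", "cold", "rain"): 0.20,
     }
     VARS = ("S", "T", "W")

     def enumerate_query(joint, query_var, evidence):
         """P(query_var | evidence), with evidence like {'S': 'winter'}."""
         qi = VARS.index(query_var)
         totals = {}
         for outcome, p in joint.items():
             # First: keep only entries consistent with the evidence.
             if all(outcome[VARS.index(v)] == val for v, val in evidence.items()):
                 # Second: sum out all remaining (hidden) variables.
                 totals[outcome[qi]] = totals.get(outcome[qi], 0.0) + p
         # Finally: normalize by P(evidence).
         z = sum(totals.values())
         return {value: p / z for value, p in totals.items()}

     print(enumerate_query(P_STW, "W", {}))                           # P(sun) = 0.65
     print(enumerate_query(P_STW, "W", {"S": "winter"}))              # P(sun | winter) = 0.5
     print(enumerate_query(P_STW, "W", {"S": "winter", "T": "hot"}))  # P(sun | winter, hot) ≈ 0.67
     ```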

  10. Inference by Enumeration, Example 2: Model for Ghostbusters
     § Reminder: the ghost is hidden, sensors are noisy
     § T: Top sensor is red; B: Bottom sensor is red; G: Ghost is in the top
     § Queries:
       § P(+g) = ??
       § P(+g | +t) = ??
       § P(+g | +t, ¬b) = ??
     § Problem: the joint distribution is too large / complex

     Joint distribution P(T, B, G):
       T    B    G    P(T, B, G)
       +t   +b   +g   0.16
       +t   +b   ¬g   0.16
       +t   ¬b   +g   0.24
       +t   ¬b   ¬g   0.04
       ¬t   +b   +g   0.04
       ¬t   +b   ¬g   0.24
       ¬t   ¬b   +g   0.06
       ¬t   ¬b   ¬g   0.06

     The Product Rule
     § Sometimes we have conditional distributions but want the joint: P(x, y) = P(x | y) P(y)
     § Example: from P(W) and P(D | W), build the joint P(D, W):

       P(W):              P(D | W):                 P(D, W):
       W     P            D     W     P             D     W     P
       sun   0.8          wet   sun   0.1           wet   sun   0.08
       rain  0.2          dry   sun   0.9           dry   sun   0.72
                          wet   rain  0.7           wet   rain  0.14
                          dry   rain  0.3           dry   rain  0.06
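
     A minimal sketch of the product rule P(d, w) = P(d | w) P(w), reproducing the slide's joint from the conditional and the marginal (dict names illustrative).

     ```python
     P_W = {"sun": 0.8, "rain": 0.2}                    # marginal P(W)
     P_D_given_W = {                                    # conditional P(D | W)
         ("wet", "sun"): 0.1, ("dry", "sun"): 0.9,
         ("wet", "rain"): 0.7, ("dry", "rain"): 0.3,
     }

     # Product rule: multiply each conditional entry by the matching marginal.
     P_DW = {(d, w): P_D_given_W[(d, w)] * P_W[w] for (d, w) in P_D_given_W}
     print(P_DW)
     # {('wet','sun'): 0.08, ('dry','sun'): 0.72, ('wet','rain'): 0.14, ('dry','rain'): 0.06}
     ```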

  11. The Chain Rule
     § More generally, we can always write any joint distribution as an incremental product of conditional distributions:
       P(x_1, x_2, ..., x_n) = Π_i P(x_i | x_1, ..., x_{i-1})
     § Why is this always true?
     § We can now build a joint distribution by specifying only conditionals!
     § Bayesian networks essentially apply the chain rule plus make conditional independence assumptions.

     Bayes' Rule
     § Two ways to factor a joint distribution over two variables:
       P(x, y) = P(x | y) P(y) = P(y | x) P(x)
     § Dividing, we get:
       P(x | y) = P(y | x) P(x) / P(y)
     § Why is this at all helpful?
       § It lets us build one conditional from its reverse
       § Often one conditional is tricky but the other one is simple
       § It is the foundation of many systems we'll see later (e.g. ASR, MT)
     § In the running for most important AI equation!
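
     As a sketch of building one conditional from its reverse, the snippet below (names illustrative) applies Bayes' rule to the product-rule example: from P(D | W) and P(W) it recovers P(W | D = wet), using total probability for the normalizer P(wet).

     ```python
     P_W = {"sun": 0.8, "rain": 0.2}
     P_D_given_W = {("wet", "sun"): 0.1, ("dry", "sun"): 0.9,
                    ("wet", "rain"): 0.7, ("dry", "rain"): 0.3}

     # Normalizer P(wet) = sum over w of P(wet | w) P(w).
     p_wet = sum(P_D_given_W[("wet", w)] * P_W[w] for w in P_W)   # 0.22

     # Bayes' rule: P(w | wet) = P(wet | w) P(w) / P(wet).
     P_W_given_wet = {w: P_D_given_W[("wet", w)] * P_W[w] / p_wet for w in P_W}
     print(P_W_given_wet)  # {'sun': ~0.36, 'rain': ~0.64}
     ```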

  12. Inference with Bayes' Rule
     § Example: diagnostic probability from causal probability:
       P(cause | effect) = P(effect | cause) P(cause) / P(effect)
     § Example:
       § m is meningitis, s is stiff neck
       § Example givens: P(s | m), P(m), P(s); then P(m | s) = P(s | m) P(m) / P(s)
       § Note: the posterior probability of meningitis is still very small
       § Note: you should still get stiff necks checked out! Why?

     Ghostbusters, Revisited
     § Let's say we have two distributions:
       § Prior distribution over ghost location: P(G)
         § Let's say this is uniform
       § Sensor reading model: P(R | G)
         § Given: we know what our sensors do
         § R = reading color measured at (1,1)
         § E.g. P(R = yellow | G = (1,1)) = 0.1
     § We can calculate the posterior distribution P(G | r) over ghost locations given a reading using Bayes' rule:
       P(g | r) = P(r | g) P(g) / P(r) ∝ P(r | g) P(g)
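
     A minimal sketch of that Bayes-rule update with a uniform prior over a small grid. Everything here is illustrative except P(R = yellow | G = (1,1)) = 0.1 from the slide; the grid size and the other sensor-model values are made-up placeholders just to show the normalization.

     ```python
     # Hypothetical 3x3 grid with a uniform prior over ghost locations.
     GRID = [(x, y) for x in range(3) for y in range(3)]
     P_G = {g: 1.0 / len(GRID) for g in GRID}

     def sensor_model(reading, g):
         """Hypothetical P(reading at (1,1) | ghost at g); only the (1,1) yellow
         entry (0.1) is from the slide, the rest are placeholder values."""
         if reading == "yellow":
             return 0.1 if g == (1, 1) else 0.3
         return 0.5  # placeholder for other colors

     def posterior(reading):
         # P(g | r) ∝ P(r | g) P(g); divide by the total to normalize.
         unnorm = {g: sensor_model(reading, g) * P_G[g] for g in GRID}
         z = sum(unnorm.values())
         return {g: p / z for g, p in unnorm.items()}

     # With these placeholder numbers, a yellow reading lowers the belief that
     # the ghost is at (1,1): posterior ≈ 0.04, down from the prior 1/9 ≈ 0.11.
     print(posterior("yellow")[(1, 1)])
     ```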
