

  1. ECE 4524 Artificial Intelligence and Engineering Applications Lecture 17: Bayesian Inference Reading: AIAMA 13.5 and MacKay book Chapter 28 Today’s Schedule: ◮ Bayes’ Rule and its implications ◮ Causal versus Diagnostic Reasoning ◮ Combining Evidence ◮ Conditional Independence ◮ Examples

  2. Bayes’ Theorem Consider a joint probability P(A, B) with A, B ∈ A. We can factor it using conditionals in one of two ways: P(A, B) = P(A | B) P(B) = P(B | A) P(A). Rearranging gives Bayes’ rule for probabilities: P(A | B) = P(B | A) P(A) / P(B), or equivalently P(B | A) = P(A | B) P(B) / P(A). This same relation holds for PMFs and PDFs.

  3. Bayes’ Theorem (Discrete Case) Given a discrete r.v. X and some data D: p(x | D) = p(D | x) p(x) / Σ_i p(D | x_i) p(x_i), or Posterior = (Likelihood × Prior) / Evidence.
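
A minimal numerical sketch of this update (the function name and toy numbers are illustrative, not from the slides): the posterior is the likelihood times the prior, normalized by the evidence term in the denominator.

    import numpy as np

    def discrete_posterior(prior, likelihood):
        """Bayes' rule for a discrete r.v.: posterior is proportional to likelihood * prior.

        prior[i]      = p(x_i)
        likelihood[i] = p(D | x_i)
        The evidence is the sum that makes the result a proper distribution.
        """
        unnormalized = likelihood * prior
        evidence = unnormalized.sum()              # sum_i p(D | x_i) p(x_i)
        return unnormalized / evidence

    # Example: three hypotheses with a uniform prior
    prior = np.array([1/3, 1/3, 1/3])
    likelihood = np.array([0.9, 0.3, 0.1])         # p(D | x_i) for the observed data
    print(discrete_posterior(prior, likelihood))   # approximately [0.692, 0.231, 0.077]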

  4. Bayes’ Theorem (Continuous Case) Given a continuous random variable X and some data D: f(x | D) = f(D | x) f(x) / ∫ f(D | x) f(x) dx, or again Posterior = (Likelihood × Prior) / Evidence.

  5. Models To specify the likelihood, we need a way to generate the probability of the data, D , given x . This is a forward or generative model that depends on x .
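
To make "forward model" concrete, here is a hedged sketch; the coin setting, function names, and numbers are our own illustration, not from the lecture. The parameter x is the unknown bias of a coin, the data D is the number of heads in n flips, and the likelihood evaluates how probable the observed D is under each candidate x.

    import math
    import random

    def simulate_data(x, n, rng=random.Random(0)):
        """Forward (generative) model: produce data D (number of heads) given x."""
        return sum(rng.random() < x for _ in range(n))

    def likelihood(data, x, n):
        """p(D | x): the probability the forward model assigns to the observed data."""
        return math.comb(n, data) * x**data * (1 - x)**(n - data)

    n = 10
    data = simulate_data(x=0.7, n=n)
    print(data, likelihood(data, 0.7, n), likelihood(data, 0.3, n))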

  6. Causal versus Diagnostic Reasoning Two ways to view Bayes’ rule: P(cause | effect) = P(effect | cause) P(cause) / P(effect). This lets us do diagnostic reasoning (infer a cause from an observed effect) using a causal model for P(effect | cause). P(effect | cause) = P(cause | effect) P(effect) / P(cause). This lets us do causal reasoning (predict an effect) using diagnostic knowledge of P(cause | effect).

  7. Warmup #1 There is a test for a deadly disease you could have. A test outcome of T=0 implies you do not have the disease and T=1 that you do. The test is 95% reliable (meaning it is correct 95% of the time). Given your age and family history you have a 1% prior probability of having the disease. The test comes back positive (T=1). How worried are you and why?
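
A quick check of this warmup, taking "95% reliable" to mean P(T=1 | disease) = P(T=0 | no disease) = 0.95 (that reading is an assumption; the slide does not split the two error rates):

    p_disease = 0.01             # prior from age and family history
    p_pos_given_disease = 0.95   # assumed sensitivity
    p_pos_given_healthy = 0.05   # assumed false-positive rate

    evidence = (p_pos_given_disease * p_disease
                + p_pos_given_healthy * (1 - p_disease))
    posterior = p_pos_given_disease * p_disease / evidence
    print(posterior)             # about 0.16: worrying, but far from certain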

  8. Exercise Suppose a Robot has an acoustic sensor that measures distance to an obstacle every T seconds. The sensor has an associated error represented as a bias and variance from the true distance. Establish a probability model for this problem, making appropriate suggestions for the form of any probability densities.
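
One reasonable answer, offered as a sketch rather than the official solution: model each reading z as the true distance d plus a fixed bias b plus zero-mean Gaussian noise, so z | d ~ N(d + b, sigma^2). The parameter values below are placeholders. With a reading every T seconds, the per-reading likelihoods multiply if the noise is assumed independent across readings.

    import numpy as np

    def sense(d_true, bias=0.05, sigma=0.1, rng=np.random.default_rng(0)):
        """Forward model: one acoustic range reading given the true distance."""
        return d_true + bias + rng.normal(0.0, sigma)

    def likelihood(z, d, bias=0.05, sigma=0.1):
        """f(z | d): Gaussian density of a reading z if the true distance is d."""
        return np.exp(-0.5 * ((z - d - bias) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    z = sense(d_true=2.0)
    print(z, likelihood(z, d=2.0), likelihood(z, d=2.5))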

  9. Another Classic Example (Pearl 1988, MacKay 2003) ◮ Fred lives in Los Angeles and commutes 60 miles to work. Whilst at work, he receives a phone call from his neighbor saying that Fred’s burglar alarm is ringing. What is the probability that there was a burglar in his house today? ◮ While driving home to investigate, Fred hears on the radio that there was a small earthquake that day near his home. ‘Oh’, he says, feeling relieved, ‘it was probably the earthquake that set off the alarm’. What is the probability that there was a burglar in his house?
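
A small numerical sketch of the "explaining away" effect in this story. The probabilities below are invented for illustration (they are not from the lecture or from Pearl/MacKay): burglary and earthquake are rare and independent, and either one can trigger the alarm.

    from itertools import product

    P_B = 0.001            # P(burglary), assumed
    P_E = 0.005            # P(earthquake), assumed
    P_A = {                # P(alarm = 1 | burglary, earthquake), assumed values
        (0, 0): 0.001,
        (0, 1): 0.3,
        (1, 0): 0.9,
        (1, 1): 0.95,
    }

    def joint(b, e, a):
        pb = P_B if b else 1 - P_B
        pe = P_E if e else 1 - P_E
        pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
        return pb * pe * pa

    # P(burglary | alarm): marginalize out the earthquake
    num = sum(joint(1, e, 1) for e in (0, 1))
    den = sum(joint(b, e, 1) for b, e in product((0, 1), repeat=2))
    print("P(B | alarm)             =", num / den)    # about 0.27

    # P(burglary | alarm, earthquake): the earthquake explains the alarm away
    num2 = joint(1, 1, 1)
    den2 = joint(0, 1, 1) + joint(1, 1, 1)
    print("P(B | alarm, earthquake) =", num2 / den2)  # about 0.003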

  10. Combining Evidence ◮ Conditional Independence ◮ Factoring the Joint Probability Recall that the joint probability distribution tells us all we need to know to make inferences. However, ◮ The complexity of Bayesian inference is dominated by the dimensionality of the joint density. ◮ For every additional evidence feature introduced, the data required to estimate the parameters goes up by at least a factor of 10, even for simple N-D Gaussians. ◮ For hundreds of features, most samples from a 100-D Gaussian distribution are not even inside the variance ellipsoid (see the quick check below). ◮ This is even worse for more complex joint distributions.
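
A quick numerical check of the 100-D claim (our own sketch; it reads "variance ellipsoid" as the set {x : x^T Sigma^{-1} x <= 1}, i.e. within one Mahalanobis unit of the mean):

    import numpy as np

    rng = np.random.default_rng(0)
    d, n = 100, 100_000
    x = rng.standard_normal((n, d))          # samples from a standard 100-D Gaussian
    sq_dist = (x ** 2).sum(axis=1)           # x^T Sigma^{-1} x with Sigma = I; chi-squared(d)
    inside = (sq_dist <= 1.0).mean()
    print(f"fraction inside the unit-variance ellipsoid: {inside:.6f}")   # essentially 0
    print(f"typical squared distance from the mean: {sq_dist.mean():.1f}")  # close to d = 100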

  11. Warmup #2 You are given the prior probability of an event A, P(A) = 0.7. There is another event, B, whose outcome we know. Describe briefly what effect this has on our knowledge of event A in three cases: if P(A | B) < P(A), if P(A | B) > P(A), and if P(A | B) = P(A).

  12. The naive Bayes model Let C be a condition (class) and E_i the evidence for that condition (features). The naive Bayes model assumes a factorization of the joint probability: P(C, E_1, E_2, ..., E_N) = P(C) ∏_{i=1}^{N} P(E_i | C), i.e. the evidence features are independent given the condition.
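
A minimal sketch of inference under this factorization (the class names, feature probabilities, and evidence vector are invented for illustration):

    # P(C) and P(E_i = 1 | C) for two classes and three binary evidence features.
    prior = {"c1": 0.3, "c2": 0.7}
    p_feature_given_class = {
        "c1": [0.8, 0.1, 0.6],
        "c2": [0.2, 0.5, 0.4],
    }

    def posterior(evidence):
        """P(C | E_1..E_N) via the naive Bayes factorization, then normalize."""
        scores = {}
        for c, probs in p_feature_given_class.items():
            likelihood = 1.0
            for e_i, p_i in zip(evidence, probs):
                likelihood *= p_i if e_i else (1.0 - p_i)   # P(E_i | C)
            scores[c] = prior[c] * likelihood               # P(C) * prod_i P(E_i | C)
        z = sum(scores.values())                            # evidence term
        return {c: s / z for c, s in scores.items()}

    print(posterior([1, 0, 1]))   # the first class wins despite its smaller prior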

  13. Example from Wumpus World

  14. Next Actions ◮ Reading on Bayesian Networks: AIAMA 14.1-14.3 ◮ There is no warmup. Quiz II is Thursday 3/22. Covers lectures 9-15 (PL and FOL).
