  1. Statistical Filtering and Control for AI and Robotics, Part I: Bayes filtering. Riccardo Muradore

  2. Outline of the Course
     ◮ lesson 1: Introduction to Probabilistic Robotics; Basics of Probability; Bayes filtering [R.M.]
     ◮ lesson 2: Basics of Linear Methods for Regression; Kalman filtering and applications [R.M.]
     ◮ lesson 3: Nonparametric filters; Particle filter [R.M.]
     ◮ lesson 4: Planning and Control: Markov Decision Processes [A.F.]
     ◮ lesson 5: Exploration and information gathering [A.F.]
     ◮ lesson 6: Plan monitoring for robotics; Applications for mobile robots [A.F.]

  3. Outline of this Lesson: Motivation; Basics of probability; Bayes filtering

  4. Motivation

  5. 50 years of robotics

  6. Boston Dynamics

  7. DARPA challenge

  8. Basics of probability

  9. Why Probabilistic Robotics?
     ◮ At the core of probabilistic robotics is the idea of estimating state from sensor data. State estimation addresses the problem of estimating quantities from sensor data that are not directly observable, but that can be inferred.
     ◮ Sensors carry only partial information about those quantities, and their measurements are corrupted by noise. State estimation seeks to recover state variables from the data. Probabilistic state estimation algorithms compute belief distributions over possible world states.
     ◮ In probabilistic robotics, quantities such as sensor measurements, controls, and the states a robot and its environment might assume are all modeled as random variables.
     ◮ Probabilistic inference is the process of calculating the probability laws of random variables that are derived from other random variables, such as those modeling sensor data.

  10. Reference
      This lecture is based on the following book: Sebastian Thrun, Wolfram Burgard and Dieter Fox, "Probabilistic Robotics", MIT Press, 2005. Several pictures from this book have been copied and pasted here.

  11. Discrete random variables
      Let X be a discrete random variable, i.e. X ∈ 𝒳 := {x_1, ..., x_N}, where N is countable.
      p(X = x) = p(x) is the probability that X takes the value x ∈ 𝒳.
      p(·) is called the probability mass function, p(·) ≥ 0.
      Law of total probability: ∑_{x ∈ 𝒳} p(x) = 1
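
A small Python sketch (not part of the slides) of a probability mass function over a finite set; the outcome names and probabilities are made up purely for illustration, and the assertions check non-negativity and the law of total probability.

```python
# Hypothetical PMF over X = {x1, x2, x3}; the numbers are illustrative only.
pmf = {"x1": 0.2, "x2": 0.5, "x3": 0.3}

assert all(p >= 0 for p in pmf.values())        # p(.) >= 0
assert abs(sum(pmf.values()) - 1.0) < 1e-12     # law of total probability: sum_x p(x) = 1

print(pmf["x2"])                                # p(X = x2) = 0.5
```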

  12. Continuous random variables
      Let X be a continuous random variable, i.e. X takes on an uncountably infinite number of possible outcomes (support S).
      P(a < X < b) = ∫_a^b p(x) dx,  (a, b) ⊂ S
      p(·) is called the probability density function (PDF).
      Definition (PDF). The probability density function of a continuous random variable X with support S is an integrable function p(x) such that
      1. p(x) is positive everywhere in the support S: p(x) > 0, ∀ x ∈ S
      2. p(x) satisfies the law of total probability: ∫_S p(x) dx = 1
      3. the probability that X ∈ A, where A ⊆ S, is given by P(X ∈ A) = ∫_A p(x) dx
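
To make the definition concrete, a minimal numeric sketch assuming an exponential density p(x) = exp(-x) on S = [0, ∞) (the density and the interval (a, b) are illustrative choices, not from the slides): it approximates P(a < X < b) with a trapezoidal rule and compares against the closed form.

```python
import numpy as np

# Assumed density for illustration: p(x) = exp(-x) on S = [0, inf)
def p(x):
    return np.exp(-x)

a, b = 0.5, 2.0
x = np.linspace(a, b, 10_001)
# P(a < X < b) = integral of p(x) over (a, b), via a trapezoidal rule
numeric = np.sum(0.5 * (p(x[:-1]) + p(x[1:])) * np.diff(x))
exact = np.exp(-a) - np.exp(-b)       # closed form for this particular density
print(numeric, exact)                 # both ~ 0.471
```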

  13. Notation
      We will often refer to both the probability mass function and the probability density function simply as "probability".

  14. Joint probability
      Let X and Y be two random variables; the joint distribution is p(x, y) = p(X = x and Y = y).
      X and Y are independent if p(x, y) = p(X = x) p(Y = y) = p(x) p(y), i.e. p_{XY}(x, y) = p_X(x) p_Y(y).
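
As a sketch (with made-up numbers), a discrete joint PMF can be stored as a table whose rows index x and columns index y; the check below verifies that this particular table factors as p(x) p(y), i.e. that X and Y are independent.

```python
import numpy as np

# Hypothetical joint PMF p(x, y); rows index x, columns index y.
p_xy = np.array([[0.12, 0.28],
                 [0.18, 0.42]])

p_x = p_xy.sum(axis=1)    # marginal p(x)
p_y = p_xy.sum(axis=0)    # marginal p(y)

# Independence: p(x, y) = p(x) p(y) for every pair
print(np.allclose(p_xy, np.outer(p_x, p_y)))   # True for this table
```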

  15. Conditional probability
      Conditional probability: the probability that X has value x conditioned on the fact that Y has value y, p(x | y) = p(X = x | Y = y).
      If p(y) > 0, the conditional probability of x given y is p(x | y) = p(x, y) / p(y).
      If X and Y are independent, p(x | y) = p(x).
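
Continuing the table sketch (numbers again illustrative), conditioning amounts to dividing the joint by the marginal: each column of p(x, y) / p(y) is a distribution over x for a fixed y.

```python
import numpy as np

# Hypothetical joint PMF; rows index x, columns index y.
p_xy = np.array([[0.10, 0.30],
                 [0.20, 0.40]])

p_y = p_xy.sum(axis=0)            # p(y), assumed strictly positive here
p_x_given_y = p_xy / p_y          # column j holds p(x | y = y_j)

print(p_x_given_y)
print(p_x_given_y.sum(axis=0))    # each column sums to 1
```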

  16. Theorem of total probability
      Discrete random variables: p(x) = ∑_{y ∈ 𝒴} p(x | y) p(y)
      Continuous random variables: p(x) = ∫_{S_y} p(x | y) p(y) dy
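
In the discrete case the theorem is just a matrix-vector product of the conditional table with the marginal; a sketch with made-up tables:

```python
import numpy as np

# p(x | y): rows index x, columns index y (illustrative values)
p_x_given_y = np.array([[0.9, 0.2],
                        [0.1, 0.8]])
p_y = np.array([0.3, 0.7])        # p(y)

p_x = p_x_given_y @ p_y           # p(x) = sum_y p(x | y) p(y)
print(p_x, p_x.sum())             # [0.41 0.59], sums to 1
```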

  17. Bayes rule
      Discrete random variables:
      p(x | y) = p(y | x) p(x) / p(y)  (∗)=  p(y | x) p(x) / ∑_{x' ∈ 𝒳} p(y | x') p(x')  (∗∗)=  η p(y | x) p(x)
      Continuous random variables:
      p(x | y) = p(y | x) p(x) / p(y)  (∗)=  p(y | x) p(x) / ∫_{S_x} p(y | x') p(x') dx'  (∗∗)=  η p(y | x) p(x)
      (∗) by the theorem of total probability; (∗∗) η is the normalization constant
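
A minimal sketch of the discrete Bayes rule (prior and likelihood values are invented): multiply the likelihood by the prior, then normalize; η is recovered as one over the sum, which is exactly the total-probability denominator.

```python
import numpy as np

p_x = np.array([0.5, 0.5])              # prior p(x) over two states (illustrative)
p_y_given_x = np.array([0.6, 0.3])      # likelihood p(y | x) of the observed y

unnormalized = p_y_given_x * p_x        # p(y | x) p(x)
eta = 1.0 / unnormalized.sum()          # eta = 1 / p(y)
posterior = eta * unnormalized          # p(x | y)
print(posterior)                        # [0.666... 0.333...]
```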

  18. Bayes rule's meaning
      Let us focus on the continuous r.v.: p(x | y) = p(y | x) p(x) / p(y)
      ◮ x is the quantity we need to infer from the data y
      ◮ p(x) is the prior probability (or a priori probability), i.e. the knowledge about x we have before using the information in y
      ◮ p(y) is the probability of the measurements y (e.g. how the sensor works)
      ◮ p(x | y) is the posterior probability
      ◮ p(y | x) is the "inverse" probability: it describes how x causes the measurement y

  19. More on Bayes rule
      Remark 1. If y is independent of x (i.e. if y carries no information about x) we end up with
      p(x | y) = p(y | x) p(x) / p(y) = (p(y, x) / p(x)) (p(x) / p(y)) = (p(y) p(x) / p(x)) (p(x) / p(y)) = p(x)
      Remark 2. It is possible to condition the Bayes rule on Z = z:
      p(x | y, z) = p(y | x, z) p(x | z) / p(y | z)

  20. Conditional independence
      Let x and y be two independent r.v.; we know that p(x, y) = p(x) p(y).
      What is the meaning of p(x, y | z) = p(x | z) p(y | z)?
      → x and y are conditionally independent given another r.v. Z = z: the r.v. y carries no information about the r.v. x if z is known.

  21. Conditional independence
      p(x, y | z) = p(x | z) p(y | z) is equivalent to
      p(x | z) = p(x | y, z)
      p(y | z) = p(y | x, z)
      Pay attention! Conditional independence does not imply independence:
      p(x, y | z) = p(x | z) p(y | z)  ⇏  p(x, y) = p(x) p(y)
      Independence does not imply conditional independence:
      p(x, y) = p(x) p(y)  ⇏  p(x, y | z) = p(x | z) p(y | z)
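
A numeric sketch of the warning above, with made-up tables: X and Y are built to be conditionally independent given Z, yet their (unconditional) joint does not factor as p(x) p(y).

```python
import numpy as np

p_z = np.array([0.5, 0.5])
p_x_given_z = np.array([[0.9, 0.1],    # rows index x, columns index z
                        [0.1, 0.9]])
p_y_given_z = np.array([[0.9, 0.1],    # rows index y, columns index z
                        [0.1, 0.9]])

# Conditional independence by construction: p(x, y) = sum_z p(x|z) p(y|z) p(z)
p_xy = np.einsum('xz,yz,z->xy', p_x_given_z, p_y_given_z, p_z)

p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
print(np.allclose(p_xy, np.outer(p_x, p_y)))   # False: X and Y are not independent
```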

  22. Mean and Variance
      Let X be a discrete r.v.; the expectation (or expected value, or mean) is E[X] := ∑_{x ∈ 𝒳} x p(x)
      The conditional mean of X assuming M is given by E[X | M] := ∑_{x ∈ 𝒳} x p(x | M)
      Let X be a continuous r.v.; the expectation (or expected value, or mean) is E[X] := ∫_{S_x} x p(x) dx
      The conditional mean of X assuming M is given by E[X | M] := ∫_{S_x} x p(x | M) dx

  23. Mean and Variance
      If M = {Y = y} then E[X | y] := ∫_{S_x} x p(x | y) dx
      Theorem. Given the r.v. X and a function g(·), the mean of the random variable Y = g(X) is E[Y] = ∫_{S_x} g(x) p(x) dx
      Theorem (Linearity). E[a_1 g_1(X) + ... + a_N g_N(X)] = a_1 E[g_1(X)] + ... + a_N E[g_N(X)]  (in particular, E[aX + b] = a E[X] + b)
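
A short sketch of the discrete expectation and the linearity property, using an invented PMF over numeric outcomes:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])      # outcomes (illustrative)
p = np.array([0.2, 0.5, 0.3])      # their probabilities

E_X = np.sum(x * p)                # E[X] = sum_x x p(x)
E_gX = np.sum((x ** 2) * p)        # E[g(X)] with g(x) = x^2

a, b = 3.0, 1.0
print(np.sum((a * x + b) * p), a * E_X + b)   # linearity: both give 4.3
```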

  24. Mean and Variance
      Let X be a discrete r.v. with mean µ = E[X]; the variance σ² is σ² := E[(X − µ)²] = ∑_{x ∈ 𝒳} (x − µ)² p(x)
      Let X be a continuous r.v. with mean µ = E[X]; the variance σ² is σ² := E[(X − µ)²] = ∫_{S_x} (x − µ)² p(x) dx
      The following relationship holds: σ² = E[(X − µ)²] = E[X²] − (E[X])²
      σ is called the standard deviation.
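
Using the same invented PMF, the two expressions for the variance agree, as the identity states:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])
p = np.array([0.2, 0.5, 0.3])

mu = np.sum(x * p)
var_def = np.sum((x - mu) ** 2 * p)        # E[(X - mu)^2]
var_alt = np.sum(x ** 2 * p) - mu ** 2     # E[X^2] - (E[X])^2
print(var_def, var_alt, np.sqrt(var_def))  # variance twice, then the standard deviation
```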

  25. Covariance
      Let X and Y be two r.v. with mean µ_x = E[X] and µ_y = E[Y], respectively. The covariance of X and Y is by definition the number Σ_xy = E[(X − µ_x)(Y − µ_y)].
      The following relationship holds: Σ_xy = E[(X − µ_x)(Y − µ_y)] = E[XY] − E[X] E[Y]
      The correlation coefficient is the ratio r_xy = Σ_xy / (σ_x σ_y), with |r_xy| ≤ 1.
      Remark. The r.v. X, Y and X − E[X], Y − E[Y] have the same covariance and correlation coefficient.
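
A Monte Carlo sketch (the linear model Y = 0.8 X + 0.6 W is an arbitrary choice for illustration): both covariance expressions agree, and the correlation coefficient stays within [−1, 1].

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.8 * x + 0.6 * rng.normal(size=100_000)     # Y correlated with X by construction

cov = np.mean((x - x.mean()) * (y - y.mean()))   # E[(X - mu_x)(Y - mu_y)]
cov_alt = np.mean(x * y) - x.mean() * y.mean()   # E[XY] - E[X]E[Y]
r_xy = cov / (x.std() * y.std())                 # correlation coefficient
print(cov, cov_alt, r_xy)                        # all approximately 0.8
```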

  26. Uncorrelation and Orthogonality
      Definition. Two r.v. X, Y are uncorrelated if their covariance is zero, i.e. Σ_xy = 0 ⇔ r_xy = 0 ⇔ E[XY] = E[X] E[Y]
      Definition. Two r.v. X, Y are orthogonal (X ⊥ Y) if E[XY] = 0

  27. Uncorrelation and Orthogonality
      Observations:
      ◮ if X and Y are uncorrelated, then X − µ_x and Y − µ_y are orthogonal: X − µ_x ⊥ Y − µ_y
      ◮ if X and Y are uncorrelated and µ_x = 0 and µ_y = 0, then X ⊥ Y
      ◮ if X and Y are independent, then they are uncorrelated (the converse is false)
      ◮ if X and Y are Gaussian and uncorrelated, then they are independent
      ◮ if X and Y are uncorrelated with means µ_x, µ_y and variances σ²_x, σ²_y, then the mean and variance of the r.v. Z = X + Y are µ_z = µ_x + µ_y and σ²_z = σ²_x + σ²_y
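
A sampling sketch of the caveat that uncorrelated does not imply independent, using the assumed toy construction X standard normal and Y = X²: the covariance is approximately zero even though Y is a deterministic function of X. The last two lines also illustrate that variances of uncorrelated variables add.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200_000)
y = x ** 2                                     # fully determined by X, hence not independent

cov = np.mean(x * y) - x.mean() * y.mean()     # ~ E[X^3] = 0: uncorrelated
print(cov)

w = rng.normal(scale=2.0, size=200_000)        # independent of x, hence uncorrelated
print(np.var(x + w), np.var(x) + np.var(w))    # both ~ 5: variances add
```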

  28. Conditional Mean and Variance
      We already introduced the conditional mean of the r.v. X assuming Y = y: µ_{x|y} = E[X | y] = ∫_{S_x} x p(x | y) dx
      We can also define the conditional variance of the r.v. X assuming Y = y: σ²_{x|y} = E[(X − µ_{x|y})² | y] = ∫_{S_x} (x − µ_{x|y})² p(x | y) dx
      Observations:
      ◮ E[g(X, Y) | y] = ∫_{S_x} g(x, y) p(x | y) dx = E[g(X, y) | y]
      ◮ E[E[X | y]] = E[X]  (the outer expectation is taken over y; see the next slide)

  29. Conditional Mean and Variance
      Is there any difference between E[X | y] and E[X | Y]? YES!
      φ(y) = E[X | y] is a function of y, whereas φ(Y) = E[X | Y] is a random variable.
      Observations:
      ◮ E[E[X | Y]] = E[X]
      ◮ E[E[g(X, Y) | Y]] = E[g(X, Y)]
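
A sketch of the distinction on a small invented joint PMF: φ(y) = E[X | y] is computed column by column (one number per y), and averaging φ(Y) over p(y) recovers E[X], i.e. E[E[X | Y]] = E[X].

```python
import numpy as np

x_vals = np.array([0.0, 1.0])          # values taken by X (illustrative)
p_xy = np.array([[0.10, 0.30],         # joint PMF: rows index x, columns index y
                 [0.20, 0.40]])

p_y = p_xy.sum(axis=0)                 # marginal p(y)
p_x_given_y = p_xy / p_y               # p(x | y), one column per y
phi_y = x_vals @ p_x_given_y           # phi(y) = E[X | y], a number for each y

print(phi_y)                           # the function y -> E[X | y]
print(phi_y @ p_y, x_vals @ p_xy.sum(axis=1))   # E[E[X|Y]] and E[X]: both 0.6
```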
