

  1. Continuous Probability, RVs, Distributions EECS 126 Fall 2019 September 17, 2019

  2. Agenda ◮ Announcements ◮ Review ◮ Continuous Probability Definitions ◮ Cumulative Distribution Functions ◮ Distributions: Uniform, Exponential, Gaussian ◮ Analogs to Discrete Probability / RVs ◮ Derived Distributions

  3. Announcements ◮ HW3 and Lab 2 are both due Friday (9/20). ◮ Feel free to come to Lab Party with HW questions on Thursday! ◮ HW4 will be optional to give you more time to study. We still recommend reading and attempting the problems. ◮ Midterm 1 is coming up quickly on 9/26! You can find past exams on the Exams page of the website.

  4. Probability Densities In a continuous space, we describe distributions with probability density functions (PDFs) rather than assigned probability values. A valid probability density $f_X(x)$ of a continuous random variable $X$ in $\mathbb{R}$ requires ◮ Non-negativity: $f_X(x) \geq 0$ for all $x \in \mathbb{R}$ ◮ Normalization: $\int_{\mathbb{R}} f_X(x)\,dx = 1$

  5. Continuous Probability Definitions Getting probabilities from densities: ◮ $P(X \in B) = \int_B f_X(x)\,dx$ ◮ $P(X \in [a, b]) = P(a \leq X \leq b) = \int_a^b f_X(x)\,dx$ (Note: $P(X = a) = 0$, so open and closed intervals do not matter here) Figure: Geometric interpretation of the PDF

  6. Questions Suppose we uniformly sample a point in a ball of radius 1. What is the ◮ Probability of picking the origin? ◮ Probability density of picking the origin? ◮ Probability of picking a point on the surface? ◮ Probability of picking a point within a radius of $1/2$?

  7. Answers ◮ Probability of picking the origin? 0. ◮ Probability density of picking the origin? The volume of the ball is $\frac{4}{3}\pi r^3 = \frac{4}{3}\pi$ for $r = 1$, so the uniform density is $\frac{3}{4\pi}$. ◮ Probability of picking a point on the surface? 0. A 2D surface has 0 volume in a 3D object. ◮ Probability of picking a point within a radius of $1/2$? Since we're uniformly picking a point in the ball, we can just look at the ratio of the volumes: $\frac{\frac{4}{3}\pi (1/2)^3}{\frac{4}{3}\pi} = \frac{1}{8}$.
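A quick Monte Carlo sanity check of the $1/8$ answer (a sketch; the seed and sample size are arbitrary choices): rejection-sample points uniformly in the unit ball from the enclosing cube and estimate the probability of landing within radius $1/2$.

```python
import numpy as np

# Sample uniformly in the unit ball by rejection from the cube [-1, 1]^3,
# then estimate P(radius <= 1/2).
rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(1_000_000, 3))
r = np.linalg.norm(pts, axis=1)
r = r[r <= 1]                  # rejection step: keep points inside the ball
print(np.mean(r <= 0.5))       # ~0.125 = 1/8
```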

  8. Cumulative Distribution Functions (CDFs) In both discrete and continuous distributions, the cumulative distribution is defined as $F_X(x) := P(X \leq x)$. However, they are computed slightly differently. $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$ Consequently (by the Fundamental Theorem of Calculus), $f_X(x) = \frac{d}{dx} F_X(x)$

  9. More familiar definitions Expectation: ◮ $E[X] := \int_{\mathbb{R}} x f_X(x)\,dx$ ◮ $E[g(X)] := \int_{\mathbb{R}} g(x) f_X(x)\,dx$ ◮ Linearity of expectation holds due to the linearity of integrals: $E[X + Y] = E[X] + E[Y]$ The variance formula stays the same: $\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2$

  10. Questions Let $R$ be the distance from the origin of a point sampled uniformly from the unit ball. What is the ◮ CDF of $R$? ◮ PDF of $R$? ◮ Expectation of $R$?

  11. Answers Let $R$ be the distance from the origin of a point sampled uniformly from the unit ball. ◮ CDF of $R$? $F_R(r) = \frac{\frac{4}{3}\pi r^3}{\frac{4}{3}\pi} = r^3$. ◮ PDF of $R$? $f_R(r) = \frac{d}{dr} r^3 = 3r^2$. ◮ Expectation of $R$? $E[R] = \int_0^1 r \cdot 3r^2\,dr = \frac{3}{4}$.
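The same rejection-sampling idea gives an empirical check on the expectation (again a sketch with arbitrary seed and sample size): the average distance of uniform samples in the ball should approach $E[R] = 3/4$.

```python
import numpy as np

# Empirical E[R] for points uniform in the unit ball; should be ~3/4.
rng = np.random.default_rng(1)
pts = rng.uniform(-1, 1, size=(1_000_000, 3))
r = np.linalg.norm(pts, axis=1)
print(r[r <= 1].mean())        # ~0.75
```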

  12. Uniform Distribution The density is uniform across a bounded interval $(a, b)$. For $X \sim \mathrm{Unif}(a, b)$: $f_X(x) = \frac{1}{b - a}$ for $a < x < b$, $E[X] = \frac{a + b}{2}$, $\mathrm{Var}(X) = \frac{(b - a)^2}{12}$. This is an easy distribution to work with, and many problems reduce to a uniform distribution!

  13. Uniform Variance Proof $\mathrm{Var}(X) = E[X^2] - E[X]^2$ $E[X] = \int_a^b \frac{x}{b - a}\,dx = \frac{x^2}{2(b - a)} \Big|_a^b = \frac{a + b}{2}$ $E[X^2] = \int_a^b \frac{x^2}{b - a}\,dx = \frac{x^3}{3(b - a)} \Big|_a^b = \frac{b^3 - a^3}{3(b - a)}$ $\mathrm{Var}(X) = \frac{b^3 - a^3}{3(b - a)} - \frac{(a + b)^2}{4} = \frac{(b - a)^2}{12}$
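A numerical spot check of the result (the endpoints $a = 2$, $b = 5$ are made up for illustration):

```python
import numpy as np

# Sample variance of Unif(2, 5) should approach (b - a)^2 / 12 = 0.75.
rng = np.random.default_rng(2)
x = rng.uniform(2, 5, size=1_000_000)
print(x.var(), (5 - 2) ** 2 / 12)   # empirical ~0.75 vs exact 0.75
```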

  14. Exponential Distribution The exponential distribution PDF: $f_X(x) = \lambda e^{-\lambda x}$, $x > 0$ The exponential distribution CDF: $F_X(x) = 1 - e^{-\lambda x}$, $x > 0$ $E[X] = \frac{1}{\lambda}$, $\mathrm{Var}(X) = \frac{1}{\lambda^2}$ Figure: Exponential distribution for varying $\lambda$
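A quick empirical check of the mean and variance formulas ($\lambda = 2$ is an arbitrary choice; note that NumPy parameterizes the exponential by scale $= 1/\lambda$):

```python
import numpy as np

# Expo(lambda = 2) samples should have mean ~1/2 and variance ~1/4.
rng = np.random.default_rng(3)
x = rng.exponential(scale=1 / 2, size=1_000_000)  # scale = 1 / lambda
print(x.mean(), x.var())                          # ~0.5, ~0.25
```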

  15. Memoryless Property The defining characteristic of the exponential is the memoryless property: $P(X > x + a \mid X > x) = P(X > a)$ Think about banging your head on the wall. What distribution does this remind you of?
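The property is easy to verify empirically. Below is a sketch ($\lambda = 1$, $x = 1$, $a = 2$ are arbitrary choices): among samples that survive past $x$, the fraction surviving an additional $a$ should match the unconditional $P(X > a)$.

```python
import numpy as np

# Check P(X > x + a | X > x) ≈ P(X > a) for X ~ Expo(1).
rng = np.random.default_rng(4)
x = rng.exponential(scale=1.0, size=1_000_000)
survivors = x[x > 1.0]
print(np.mean(survivors > 3.0))   # P(X > 1 + 2 | X > 1)
print(np.mean(x > 2.0))           # P(X > 2); both ~ e^{-2} ≈ 0.135
```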

  16. Connection to Geometric One can think of the exponential distribution as the continuous analog to the geometric distribution. Remark: These are the only distributions in discrete and continuous spaces respectively with the memoryless property! Figure: Relating the Exponential dist. to the Geometric dist.

  17. Connection to Geometric cont. Intuition: the geometric distribution approaches the exponential distribution as the number of trials per second approaches infinity. Let $X \sim \mathrm{Geo}(p)$, $Y \sim \mathrm{Expo}(\lambda)$. Recall the CDF of the geometric distribution, $F_X(n) = 1 - (1 - p)^n$. If we let $\delta = \frac{-\ln(1 - p)}{\lambda}$, we have $e^{-\lambda \delta} = 1 - p$, and thus $F_X(n) = F_Y(n\delta)$. If we drive $\delta$ down, we can interpret this as a geometric r.v. running more and more trials per second while keeping the expected time to the first success the same. As $\delta \to 0$, we approach a continuous exponential distribution.
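To see the convergence concretely, here is a sketch using the simpler matching $p = \lambda\delta$ (an approximation of the slide's exact choice; the two agree as $\delta \to 0$): the geometric CDF after $t/\delta$ trials approaches the exponential CDF at $t$.

```python
import numpy as np

# With p = lambda * delta, the geometric CDF after t / delta trials
# converges to the Expo(lambda) CDF at t as delta -> 0.
lam, t = 1.0, 2.0
for delta in [0.5, 0.1, 0.001]:
    p = lam * delta
    n = int(t / delta)                  # number of trials fitting in time t
    print(delta, 1 - (1 - p) ** n)      # -> 1 - e^{-lam * t}
print("limit:", 1 - np.exp(-lam * t))   # ≈ 0.8647
```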

  18. Normal / Gaussian Distribution The Gaussian is seen abundantly in nature (e.g. exam scores). This can be explained by the Central Limit Theorem (CLT), which we will go over later in the course. Gaussian PDF and CDF for mean $\mu$ and variance $\sigma^2$: $f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x - \mu)^2 / 2\sigma^2}$, $F_X(x) = \Phi\!\left(\frac{x - \mu}{\sigma}\right)$, where $\Phi$ is the standard normal CDF (it cannot be expressed in elementary functions).

  19. Properties of the Gaussian ◮ The sum of two independent Gaussians is Gaussian. If $X \sim N(\mu_1, \sigma_1^2)$, $Y \sim N(\mu_2, \sigma_2^2)$, and $Z = X + Y$, then $Z \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$. ◮ The sum of two dependent Gaussians isn't always Gaussian. Consider the following example: let $X \sim N(0, 1)$ and $Y = X$ w.p. $1/2$, $Y = -X$ w.p. $1/2$. Both $X$ and $Y$ are Gaussian, but $X + Y$ is not Gaussian (see the sketch after this list). ◮ A Gaussian multiplied by a constant is Gaussian. If $X \sim N(\mu, \sigma^2)$ and $Y = aX$, then $Y \sim N(a\mu, a^2\sigma^2)$.
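A minimal sketch of the counterexample above: $Y$ is $\pm X$ with equal probability, so $Y$ is standard normal, yet $X + Y$ has a point mass at 0 and cannot be Gaussian.

```python
import numpy as np

# Y = ±X each w.p. 1/2: Y ~ N(0, 1), but X + Y is exactly 0 half the time.
rng = np.random.default_rng(5)
x = rng.normal(size=1_000_000)
y = rng.choice([1, -1], size=x.size) * x
s = x + y
print(np.mean(s == 0))    # ~0.5: a point mass, impossible for a Gaussian
print(y.mean(), y.var())  # ~0, ~1: Y really is standard normal
```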

  20. Scaling to the Standard Gaussian ◮ The properties on the previous slide allow us to convert any Gaussian into the standard Gaussian. ◮ If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma}$ is distributed as $Z \sim N(0, 1)$. ◮ Intuition: $Z$ counts how many standard deviations $X$ is from its mean, as in "I scored 1 SD above the mean on midterm 1."
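A tiny sketch of standardization (the exam numbers $\mu = 70$, $\sigma = 10$ are made up): subtracting the mean and dividing by the standard deviation yields z-scores with mean ~0 and variance ~1.

```python
import numpy as np

# Standardize N(70, 10^2) "exam scores" to N(0, 1) z-scores.
rng = np.random.default_rng(6)
scores = rng.normal(loc=70, scale=10, size=1_000_000)
z = (scores - 70) / 10
print(z.mean(), z.var())   # ~0, ~1
```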

  21. Joint PDFs Just as multiple discrete RVs have a joint PMF, multiple continuous RVs have a joint PDF. ◮ Discrete: $p_{X,Y}(x, y)$ ◮ Continuous: $f_{X,Y}(x, y)$ ◮ Still needs to be non-negative. ◮ Still needs to integrate to 1.

  22. Joint CDFs ◮ Single RV: $F_X(x) = P(X \leq x)$ ◮ Multiple RVs: $F_{X,Y}(x, y) = P(X \leq x, Y \leq y)$ ◮ Single RV: $\frac{d}{dx} F_X(x) = f_X(x)$ ◮ Multiple RVs: $\frac{\partial^2}{\partial x \, \partial y} F_{X,Y}(x, y) = f_{X,Y}(x, y)$

  23. Marginal Probability Density ◮ Discrete: $p_X(x) = \sum_{y \in \mathcal{Y}} p_{X,Y}(x, y)$ ◮ Continuous: $f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$ ◮ $f_X(x)$ is still a density, not a probability.

  24. Conditional Probability Density ◮ Discrete: $p_{X|Y}(x \mid y) = \frac{p_{X,Y}(x, y)}{p_Y(y)}$ ◮ Continuous: $f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}$ ◮ By definition, the Multiplication Rule still holds.

  25. Independence Similar to discrete, there are 3 equivalent definitions. ◮ For all $x$ and $y$: $f_{X,Y}(x, y) = f_X(x) f_Y(y)$ ◮ For all $x$ and $y$: $f_{X|Y}(x \mid y) = f_X(x)$ ◮ For all $x$ and $y$: $f_{Y|X}(y \mid x) = f_Y(y)$

  26. Bayes' Rule ◮ Discrete (simple form): $p_{X|Y}(x \mid y) = \frac{p_{Y|X}(y \mid x) \, p_X(x)}{p_Y(y)}$ ◮ Discrete (extended form): $p_{X|Y}(x \mid y) = \frac{p_{Y|X}(y \mid x) \, p_X(x)}{\sum_{x' \in \mathcal{X}} p_{Y|X}(y \mid x') \, p_X(x')}$ ◮ Continuous (simple form): $f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x) \, f_X(x)}{f_Y(y)}$ ◮ Continuous (extended form): $f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x) \, f_X(x)}{\int_{-\infty}^{\infty} f_{Y|X}(y \mid t) \, f_X(t)\,dt}$
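As a concrete sketch of the continuous form (all numbers assumed: prior $X \sim N(0, 1)$, likelihood $Y \mid X = x \sim N(x, 1)$, observation $y = 1.5$), we can evaluate Bayes' rule on a grid; the posterior mean should match the known closed form $y/2$ for this conjugate pair.

```python
import numpy as np

# Grid evaluation of continuous Bayes' rule.
x = np.linspace(-10, 10, 200_001)
dx = x[1] - x[0]
prior = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)      # f_X: N(0, 1)
y = 1.5
lik = np.exp(-(y - x)**2 / 2) / np.sqrt(2 * np.pi)  # f_{Y|X}(y | x): N(x, 1)
post = prior * lik
post /= (post * dx).sum()      # normalize by the evidence f_Y(y)
print((x * post * dx).sum())   # ~0.75 = y / 2
```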

  27. Conditional Expectation ◮ Discrete: $E[Y \mid X = x] = \sum_{y \in \mathcal{Y}} y \cdot p_{Y|X}(y \mid x)$ ◮ Continuous: $E[Y \mid X = x] = \int_{-\infty}^{\infty} y \cdot f_{Y|X}(y \mid x)\,dy$

  28. Combining Discrete and Continuous RVs ◮ You can also have discrete and continuous RVs defined jointly. ◮ Ex. let $X$ be the outcome of a die roll and $Y \sim \mathrm{Expo}(X)$: $p_X(x) = \frac{1}{6}$ for $x \in \{1, \dots, 6\}$, and $f_{Y|X}(y \mid x) = x e^{-xy}$.
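A short sketch of this example (seed and sample size arbitrary): sample the die, then $Y$ conditional on it. By the tower property, $E[Y] = E[E[Y \mid X]] = E[1/X] \approx 0.408$.

```python
import numpy as np

# X ~ die roll, Y | X = x ~ Expo(x); NumPy's scale parameter is 1/x = mean.
rng = np.random.default_rng(7)
x = rng.integers(1, 7, size=1_000_000)    # die roll in {1, ..., 6}
y = rng.exponential(scale=1.0 / x)        # Expo(x) has mean 1 / x
print(y.mean())                           # ~0.408
print(np.mean(1.0 / np.arange(1, 7)))     # exact E[1/X]
```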

  29. Change of Variables / Derived Distributions ◮ Let $X \sim \mathrm{Unif}[0, 1]$ and $Y = 2X$. Is it then true that $f_Y(y) = P(Y = y) = P(2X = y) = P(X = y/2) = f_X(y/2)$? ◮ No, this won't integrate to 1. ◮ You have to use the CDF: $F_Y(y) = P(Y \leq y) = P(2X \leq y) = P(X \leq y/2) = F_X(y/2)$ ◮ $f_Y(y) = \frac{d}{dy} F_X(y/2) = f_X(y/2) \cdot \frac{1}{2}$
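A sketch confirming the derived density (bin count and sample size arbitrary): the empirical density of $Y = 2X$ is flat at $1/2$ on $[0, 2]$, matching $f_X(y/2) \cdot \frac{1}{2}$.

```python
import numpy as np

# Histogram of Y = 2X for X ~ Unif[0, 1] should be flat at 0.5 on [0, 2].
rng = np.random.default_rng(8)
y = 2 * rng.uniform(size=1_000_000)
hist, _ = np.histogram(y, bins=20, range=(0, 2), density=True)
print(hist.round(2))   # all entries ~0.5
```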

  30. References ◮ D. P. Bertsekas and J. N. Tsitsiklis, Introduction to Probability, 2002.
