Continuous RVs Continued: Independence, Conditioning, Gaussians, CLT
CS 70, Summer 2019
Lecture 25, 8/6/19
Not Too Different From Discrete...
Discrete RVs: X and Y are independent iff for all a, b:
P[X = a, Y = b] = P[X = a] · P[Y = b]
Continuous RVs: X and Y are independent iff for all a ≤ b, c ≤ d:
P[a ≤ X ≤ b, c ≤ Y ≤ d] = P[a ≤ X ≤ b] · P[c ≤ Y ≤ d]
A Note on Independence
For continuous RVs, what is weird about the following?
P[X = a, Y = b] = P[X = a] · P[Y = b]
Both sides are 0, so the equation reads 0 = 0: it tells us nothing.
What we can do: consider an interval of length dx around a and one of length dy around b:
P[X ∈ [a, a + dx], Y ∈ [b, b + dy]] = P[X ∈ [a, a + dx]] · P[Y ∈ [b, b + dy]]
= (f_X(a) dx)(f_Y(b) dy)
Independence, Continued
If X, Y are independent, their joint density is the product of their individual densities:
f_{X,Y}(x, y) = f_X(x) · f_Y(y)
Example: If X, Y are independent exponential RVs with parameter λ:
f_{X,Y}(x, y) = f_X(x) · f_Y(y) = λe^{-λx} · λe^{-λy} = λ² e^{-λ(x+y)}
Example: Max of Two Exponentials
Let X ~ Expo(λ) and Y ~ Expo(µ); X and Y are independent. Compute P[max(X, Y) ≤ t].
P[max(X, Y) ≤ t] = P[X ≤ t, Y ≤ t] = P[X ≤ t] · P[Y ≤ t]   (independence)
= (1 - e^{-λt})(1 - e^{-µt})
Use this to compute E[max(X, Y)].
Tail-sum formula: E[max] = ∫₀^∞ P[max(X, Y) > t] dt
= ∫₀^∞ (e^{-λt} + e^{-µt} - e^{-(λ+µ)t}) dt = 1/λ + 1/µ - 1/(λ + µ)
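A quick Monte Carlo sanity check of this closed form (a minimal sketch, not from the lecture; the values of lam, mu, and n are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    lam, mu, n = 2.0, 3.0, 1_000_000   # arbitrary example parameters

    # Independent exponentials; numpy parametrizes by scale = 1/rate.
    x = rng.exponential(scale=1 / lam, size=n)
    y = rng.exponential(scale=1 / mu, size=n)

    empirical = np.maximum(x, y).mean()
    exact = 1 / lam + 1 / mu - 1 / (lam + mu)   # formula from the slide
    print(empirical, exact)   # should agree to a few decimal places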
Min of n Uniforms
Let X₁, ..., X_n be i.i.d. and uniform over [0, 1]. Recall P[a ≤ X_i ≤ b] = b - a for 0 ≤ a ≤ b ≤ 1.
What is P[min(X₁, ..., X_n) ≥ x]?
P[min(X₁, ..., X_n) ≥ x] = P[X₁ ≥ x, ..., X_n ≥ x] = P[X₁ ≥ x] ⋯ P[X_n ≥ x]   (independence)
= (1 - x)^n   for x ∈ [0, 1]
Use this to compute E[min(X₁, ..., X_n)].
Tail-sum formula: E[min] = ∫₀^1 P[min ≥ x] dx = ∫₀^1 (1 - x)^n dx = 1/(n + 1)
Min of n Uniforms, Continued
What is the CDF of min(X₁, ..., X_n)? Use the tail probability from the previous slide:
F_min(x) = P[min ≤ x] = 1 - P[min ≥ x] = 1 - (1 - x)^n
What is the PDF of min(X₁, ..., X_n)?
f_min(x) = d/dx [1 - (1 - x)^n] = n(1 - x)^{n-1}
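Both formulas are easy to check by simulation (a minimal sketch, not from the lecture; n, trials, and the test point x are arbitrary):

    import numpy as np

    rng = np.random.default_rng(1)
    n, trials = 5, 500_000   # arbitrary example values

    # Each row is one draw of (X_1, ..., X_n); take the row minimum.
    mins = rng.uniform(0.0, 1.0, size=(trials, n)).min(axis=1)

    print(mins.mean(), 1 / (n + 1))              # E[min] = 1/(n+1)
    x = 0.3
    print((mins <= x).mean(), 1 - (1 - x) ** n)  # CDF at x: 1 - (1-x)^n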
Memorylessness of Exponential
We can't talk about independence without talking about conditional probability!
Let X ~ Expo(λ). X is memoryless, i.e. P[X ≥ s + t | X > t] = P[X ≥ s].
LHS = P[X ≥ s + t, X > t] / P[X > t] = P[X ≥ s + t] / P[X > t]   (the event X > t is redundant)
= e^{-λ(s+t)} / e^{-λt} = e^{-λs} = P[X ≥ s]
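Memorylessness can also be seen empirically by comparing a conditional frequency with an unconditional one (a minimal sketch, not from the lecture; lam, s, and t are arbitrary):

    import numpy as np

    rng = np.random.default_rng(2)
    lam, s, t = 1.5, 0.7, 2.0   # arbitrary example values
    x = rng.exponential(scale=1 / lam, size=2_000_000)

    survived = x[x > t]                 # condition on X > t
    print((survived >= s + t).mean())   # P[X >= s + t | X > t]
    print((x >= s).mean())              # P[X >= s]
    print(np.exp(-lam * s))             # both should be close to e^{-lam*s}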
Conditional Density
What happens if we condition on events like X = a? These have 0 probability!
The same story as discrete, except we now need to define a conditional density:
f_{Y|X}(y | x) = f_{X,Y}(x, y) / f_X(x)
(by convention, defined only when f_X(x) ≠ 0)
Think of f_{Y|X}(y | x) dy as
P[Y ∈ [y, y + dy] | X ∈ [x, x + dx]]
Conditional Density, Continued
Given a conditional density f_{Y|X}, compute:
P[Y ≤ y | X = x] = ∫_{-∞}^{y} f_{Y|X}(z | x) dz
If we know P[Y ≤ y | X = x], compute:
P[Y ≤ y] = ∫_{-∞}^{∞} P[Y ≤ y | X = x] f_X(x) dx   (total probability rule: condition on the value of X, as in the discrete case)
Go with your gut! What worked for discrete also works for continuous.
Example: Sum of Two Exponentials
Let X₁, X₂ be i.i.d. Expo(λ) RVs. Let Y = X₁ + X₂.
What is P[Y < y | X₁ = x]?
P[Y < y | X₁ = x] = P[X₂ < y - x] = 1 - e^{-λ(y - x)}   for 0 ≤ x ≤ y
What is P[Y < y]? Condition on the values of X₁ (total probability rule):
P[Y < y] = ∫₀^y P[Y < y | X₁ = x] f_{X₁}(x) dx = ∫₀^y (1 - e^{-λ(y - x)}) λe^{-λx} dx   (Exercise: evaluate this.)
Example: Total Probability Rule
Exercise: What is the CDF of Y?
Exercise: What is the PDF of Y?
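For reference, one way to carry out the exercise, starting from the integral on the previous slide (a worked step, not shown in the lecture). The integrand simplifies because e^{-λ(y-x)} · e^{-λx} = e^{-λy}:
P[Y < y] = ∫₀^y (λe^{-λx} - λe^{-λy}) dx = (1 - e^{-λy}) - λy e^{-λy}
So the CDF is F_Y(y) = 1 - e^{-λy} - λy e^{-λy} for y ≥ 0, and differentiating gives the PDF f_Y(y) = λ² y e^{-λy}.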
Break
If you could immediately gain one new skill, what would it be?
The Normal (Gaussian) Distribution
X is a normal or Gaussian RV if:
f_X(x) = (1/√(2πσ²)) · e^{-(x - µ)² / (2σ²)}
Parameters: µ (its mean; the density is symmetric about it) and σ² (its variance).
Notation: X ~ N(µ, σ²), with E[X] = µ and Var(X) = σ².
Standard Normal: µ = 0, σ² = 1.
Gaussian Tail Bound
Let X ~ N(0, 1). Easy upper bound on P[|X| ≥ α], for α ≥ 1? (Something we've seen before...)
Chebyshev: P[|X| ≥ α] ≤ Var(X)/α² = 1/α²
Gaussian Tail Bound, Continued
Turns out we can do better than Chebyshev.
Idea: write the tail as ∫_α^∞ (1/√(2π)) e^{-x²/2} dx, and note that x/α ≥ 1 on the range of integration:
P[|X| ≥ α] = 2 P[X ≥ α] = 2 ∫_α^∞ (1/√(2π)) e^{-x²/2} dx
≤ 2 ∫_α^∞ (x/α)(1/√(2π)) e^{-x²/2} dx = (2/(α√(2π))) e^{-α²/2}
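To see how much tighter this is, we can compare Chebyshev, the bound above, and the true tail (a minimal sketch, not from the lecture; the threshold alpha = 2 is an arbitrary choice):

    import math

    alpha = 2.0   # arbitrary example threshold

    chebyshev = 1 / alpha ** 2
    gaussian_bound = 2 / (alpha * math.sqrt(2 * math.pi)) * math.exp(-alpha ** 2 / 2)
    exact = math.erfc(alpha / math.sqrt(2))   # true P[|X| >= alpha] for X ~ N(0,1)

    print(chebyshev, gaussian_bound, exact)   # 0.25 vs ~0.054 vs ~0.046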
Shifting and Scaling Gaussians
Let X ~ N(µ, σ²) and Y = (X - µ)/σ. Then:
Y ~ N(0, 1)
Proof (out of scope for the notes): compute P[a ≤ Y ≤ b], using the change of variables x = σy + µ.
Shifting and Scaling Gaussians, Continued
Can also go the other direction: If X ~ N(0, 1), and Y = µ + σX: Y is still Gaussian!
E[Y] = E[µ + σX] = µ + σE[X] = µ
Var(Y) = Var(µ + σX) = Var(σX)   (shifts don't change variance)
= σ² Var(X) = σ²
Sum of Independent Gaussians
Let X, Y be independent standard Gaussians. Let Z = [aX + c] + [bY + d]. Then, Z is also Gaussian! (Proof optional.)
E[Z] = E[aX + c + bY + d] = aE[X] + bE[Y] + c + d = c + d
Var(Z) = Var(aX + bY + c + d) = Var(aX + bY)   (shifts don't change variance)
= Var(aX) + Var(bY)   (independence)
= a² Var(X) + b² Var(Y) = a² + b²
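A simulation makes the claim concrete (a minimal sketch, not from the lecture; the constants a, b, c, d are arbitrary):

    import numpy as np

    rng = np.random.default_rng(3)
    a, b, c, d = 2.0, -1.0, 3.0, 0.5   # arbitrary example constants

    x = rng.standard_normal(1_000_000)
    y = rng.standard_normal(1_000_000)
    z = (a * x + c) + (b * y + d)

    print(z.mean(), c + d)            # mean: c + d = 3.5
    print(z.var(), a ** 2 + b ** 2)   # variance: a^2 + b^2 = 5.0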
Example: Height
Exercise: Consider a family of two parents and twins with the same height. The parents' heights are independently drawn from a N(65, 5) distribution. The twins' height is independent of the parents', and drawn from a N(40, 10) distribution.
Let H be the sum of the heights in the family. Define relevant RVs:
M, D ~ N(65, 5) i.i.d. (the parents), T ~ N(40, 10) (the shared twin height), so H = M + D + 2T.
Example: Height, Continued
E[H] = E[M] + E[D] + 2E[T] = 65 + 65 + 2(40) = 210
Var(H) = Var(M) + Var(D) + Var(2T) = 5 + 5 + 4(10) = 50   (the twins' heights are identical, not independent, so Var(2T) = 4 Var(T))
Sample Mean
We sample a RV X independently n times. X has mean µ, variance σ².
Denote the sample mean by A_n = (X₁ + X₂ + ... + X_n)/n.
E[A_n] = (1/n) E[X₁ + ... + X_n] = (1/n)(E[X₁] + ... + E[X_n]) = (1/n)(nµ) = µ
Var(A_n) = (1/n²) Var(X₁ + ... + X_n) = (1/n²)(Var(X₁) + ... + Var(X_n))   (independence)
= (1/n²)(nσ²) = σ²/n
The Central Limit Theorem (CLT)
Let X₁, X₂, ..., X_n be i.i.d. RVs with mean µ, variance σ². (Assume mean, variance are finite.)
Sample mean, as before: A_n = (X₁ + X₂ + ... + X_n)/n
Recall: E[A_n] = µ and Var(A_n) = σ²/n, so the standard deviation of A_n is σ/√n.
Normalize the sample mean (subtract the mean, divide by the standard deviation):
A'_n = (A_n - µ)/(σ/√n) = √n (A_n - µ)/σ
Then, as n → ∞, the distribution of A'_n approaches N(0, 1). (This makes many computations much easier!)
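Here is the CLT in action for exponential samples (a minimal sketch, not from the lecture; the choice of Expo(1) and the values of n, trials, and z are arbitrary):

    import math
    import numpy as np

    rng = np.random.default_rng(4)
    n, trials = 100, 200_000   # arbitrary sample size and trial count
    mu, sigma = 1.0, 1.0       # Expo(1) has mean 1 and variance 1

    # Each row is one sample of size n; normalize each sample mean.
    a_n = rng.exponential(scale=1.0, size=(trials, n)).mean(axis=1)
    a_norm = math.sqrt(n) * (a_n - mu) / sigma

    z = 1.0
    phi = 0.5 * (1 + math.erf(z / math.sqrt(2)))   # standard normal CDF at z
    print((a_norm <= z).mean(), phi)               # both ~0.84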
Example: Chebyshev vs. CLT
Let X₁, X₂, ... be i.i.d. RVs with E[X_i] = 1 and Var(X_i) = 1/2. Let A_n = (X₁ + X₂ + ... + X_n)/n.
E[A_n] = 1   (the expectation of a single sample)
Var(A_n) = σ²/n = 1/(2n)   (σ² is the variance of a single sample)
Normalize to get A'_n:
A'_n = (A_n - 1)/√(1/(2n)) = √(2n) (A_n - 1)
E[A'_n] = 0 and Var(A'_n) = 1.
Example: Chebyshev vs. CLT, Continued
Upper bound P[A'_n ≥ 2] for any n. (We don't know if A'_n is non-negative or symmetric.)
Chebyshev: P[A'_n ≥ 2] ≤ P[|A'_n| ≥ 2] ≤ Var(A'_n)/2² = 1/4
If we take n → ∞, upper bound on P[A'_n ≥ 2]? The distribution of A'_n approaches N(0, 1), so:
(1) Gaussian tail bound: P[A'_n ≥ 2] ≤ const · e^{-2²/2} = const · e^{-2}
(2) 68-95-99.7 rule: P[A'_n ≥ 2] ≈ (1 - 0.95)/2 = 2.5%, far smaller than Chebyshev's 1/4.
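A simulation comparing the two answers (a minimal sketch, not from the lecture; the slide doesn't fix a distribution, so this assumes X_i ~ Gamma(2, 1/2), one distribution with mean 1 and variance 1/2; n and trials are arbitrary):

    import numpy as np

    rng = np.random.default_rng(5)
    n, trials = 400, 100_000

    # Gamma(shape=2, scale=1/2) has mean 1 and variance 1/2, as in the example.
    a_n = rng.gamma(shape=2.0, scale=0.5, size=(trials, n)).mean(axis=1)
    a_norm = np.sqrt(2 * n) * (a_n - 1.0)   # normalized sample mean

    print((a_norm >= 2).mean())   # ~0.02-0.025, close to the CLT's 2.5%;
                                  # Chebyshev's bound of 1/4 is valid but loose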
Summary
▶ Independence and conditioning also generalize from the discrete RV case.
▶ The Gaussian is a very important continuous RV. It has several nice properties, including the fact that adding independent Gaussians gets you another Gaussian.
▶ The CLT tells us that if we take a sample average of a RV and normalize it, the distribution of this average will approach a standard normal.