Indicators: Now With Pair-wise Flavor!
• Recall I_i is the indicator variable for event A_i:
    I_i = 1 if A_i occurs, 0 otherwise
• Let X = # of events that occur: X = Σ_{i=1}^n I_i
    E[X] = Σ_{i=1}^n E[I_i] = Σ_{i=1}^n P(A_i)
• Now consider a pair of events A_i and A_j occurring
    I_i I_j = 1 if both A_i and A_j occur, 0 otherwise
    Number of pairs of events that occur: Σ_{i<j} I_i I_j = C(X, 2)

From Event Pairs to Variance
• Expected number of pairs of events:
    E[Σ_{i<j} I_i I_j] = Σ_{i<j} E[I_i I_j] = Σ_{i<j} P(A_i A_j)
    E[C(X, 2)] = E[X(X − 1)/2] = (E[X²] − E[X])/2 = Σ_{i<j} P(A_i A_j)
    ⇒ E[X²] = 2 Σ_{i<j} P(A_i A_j) + E[X]
• Recall: Var(X) = E[X²] − (E[X])²
    Var(X) = 2 Σ_{i<j} P(A_i A_j) + E[X] − (E[X])²
           = 2 Σ_{i<j} P(A_i A_j) + Σ_{i=1}^n P(A_i) − (Σ_{i=1}^n P(A_i))²

Let's Try It With the Binomial
• X ~ Bin(n, p)
    Each trial: X_i ~ Ber(p), so E[X_i] = p
    Let event A_i = trial i is a success (i.e., X_i = 1)
    E[X] = Σ_{i=1}^n P(A_i) = np
    E[C(X, 2)] = Σ_{i<j} E[X_i X_j] = Σ_{i<j} P(A_i A_j) = Σ_{i<j} p² = C(n, 2) p²
    E[X(X − 1)] = E[X²] − E[X] = 2 C(n, 2) p² = n(n − 1) p²
    Var(X) = E[X²] − (E[X])² = (E[X²] − E[X]) + E[X] − (E[X])²
           = n(n − 1) p² + np − (np)² = n²p² − np² + np − n²p² = np(1 − p)

Computer Cluster Utilization
• Computer cluster with N servers
    Requests independently go to server i with probability p_i
    Let event A_i = server i receives no requests
    X = # of events A_1, A_2, …, A_N that occur
    Y = # servers that receive ≥ 1 request = N − X
• E[Y] after first n requests?
    Since requests are independent: P(A_i) = (1 − p_i)^n
    E[X] = Σ_{i=1}^N P(A_i) = Σ_{i=1}^N (1 − p_i)^n
    E[Y] = N − E[X] = N − Σ_{i=1}^N (1 − p_i)^n
    When p_i = 1/N for all i: E[Y] = N − N(1 − 1/N)^n = N[1 − (1 − 1/N)^n]

Computer Cluster Utilization (cont.)
• Same setup: N servers, requests independently go to server i with probability p_i,
  A_i = server i receives no requests, X = # of the A_i that occur, Y = N − X
• Var(Y) after first n requests?  (Var(Y) = Var(N − X) = (−1)² Var(X) = Var(X))
    Independent requests: P(A_i A_j) = (1 − p_i − p_j)^n for i ≠ j
    E[X(X − 1)] = E[X²] − E[X] = 2 Σ_{i<j} P(A_i A_j) = 2 Σ_{i<j} (1 − p_i − p_j)^n
    Var(X) = 2 Σ_{i<j} (1 − p_i − p_j)^n + E[X] − (E[X])²
           = 2 Σ_{i<j} (1 − p_i − p_j)^n + Σ_{i=1}^N (1 − p_i)^n − (Σ_{i=1}^N (1 − p_i)^n)²
           = Var(Y)

Computer Cluster = Coupon Collecting
• This is really another "Coupon Collector" problem
    Each server is a "coupon type"
    A request to a server = collecting a coupon of that type
• Hash table version
    Each server is a bucket in the table
    A request to a server = a string gets hashed to that bucket
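The binomial derivation above can be reproduced numerically: a minimal sketch (the values n = 10 and p = 0.3 are arbitrary) that computes Var(X) from the pairwise-indicator identity and compares it to np(1 − p).

```python
from itertools import combinations

# Arbitrary example parameters for X ~ Bin(n, p).
n, p = 10, 0.3

# For independent trials: P(A_i) = p, and P(A_i A_j) = p^2 for i < j.
e_x = n * p                                                 # E[X] = sum of P(A_i) = np
pair_sum = sum(p * p for _ in combinations(range(n), 2))    # sum_{i<j} P(A_i A_j)

# Pairwise-indicator identity: Var(X) = 2 sum_{i<j} P(A_i A_j) + E[X] - (E[X])^2
var_pairs = 2 * pair_sum + e_x - e_x ** 2

print(var_pairs, n * p * (1 - p))   # both are np(1-p), up to float rounding
```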

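The cluster-utilization result E[Y] = N[1 − (1 − 1/N)^n] can also be sanity-checked by simulation. A minimal Monte Carlo sketch, assuming uniform p_i = 1/N and hypothetical sizes N = 20 servers and n = 50 requests:

```python
import random

# Hypothetical example sizes: 20 servers, 50 requests, averaged over many runs.
N, n, trials = 20, 50, 20000
random.seed(0)

total_busy = 0
for _ in range(trials):
    hit = [False] * N
    for _ in range(n):
        hit[random.randrange(N)] = True   # each request picks a server uniformly
    total_busy += sum(hit)                # Y = # servers with >= 1 request

est = total_busy / trials                 # Monte Carlo estimate of E[Y]
exact = N * (1 - (1 - 1 / N) ** n)        # E[Y] = N(1 - (1 - 1/N)^n)
print(round(est, 2), round(exact, 2))
```

The estimate and the closed form agree to within Monte Carlo error.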
Product of Expectations
• Let X and Y be independent random variables, and g(·) and h(·) real-valued functions:
    E[g(X) h(Y)] = E[g(X)] E[h(Y)]
• Proof:
    E[g(X) h(Y)] = ∫_y ∫_x g(x) h(y) f_{X,Y}(x, y) dx dy
                 = ∫_y ∫_x g(x) h(y) f_X(x) f_Y(y) dx dy      (independence)
                 = ∫_x g(x) f_X(x) dx · ∫_y h(y) f_Y(y) dy
                 = E[g(X)] E[h(Y)]

The Dance of the Covariance
• Say X and Y are arbitrary random variables
• Covariance of X and Y:
    Cov(X, Y) = E[(X − E[X])(Y − E[Y])]
• Equivalently:
    Cov(X, Y) = E[XY − E[X]Y − X E[Y] + E[X]E[Y]]
              = E[XY] − E[X]E[Y] − E[X]E[Y] + E[X]E[Y]
              = E[XY] − E[X]E[Y]
• X and Y independent ⇒ E[XY] = E[X]E[Y] ⇒ Cov(X, Y) = 0
    But Cov(X, Y) = 0 does not imply X and Y independent!

Dependence and Covariance
• X and Y are random variables with joint PMF:

              X = −1   X = 0   X = 1   p_Y(y)
    Y = 0      1/3      0      1/3      2/3
    Y = 1       0      1/3      0       1/3
    p_X(x)     1/3     1/3     1/3       1

    (i.e., X is equally likely to be −1, 0, or 1, and Y = 0 if X ≠ 0, 1 otherwise)
• E[X] = 0, E[Y] = 1/3
    Since XY = 0 always, E[XY] = 0
    Cov(X, Y) = E[XY] − E[X]E[Y] = 0 − 0 = 0
• But X and Y are clearly dependent
    e.g., P(X = 1) = 1/3, but P(X = 1 | Y = 1) = 0

Example of Covariance
• Consider rolling a 6-sided die
    Let indicator variable X = 1 if the roll is 1, 2, 3, or 4
    Let indicator variable Y = 1 if the roll is 3, 4, 5, or 6
• What is Cov(X, Y)?
    E[X] = 2/3 and E[Y] = 2/3
    E[XY] = Σ_x Σ_y x·y·p(x, y) = (0 · 0) + (0 · 1/3) + (0 · 1/3) + (1 · 1/3) = 1/3
    Cov(X, Y) = E[XY] − E[X]E[Y] = 1/3 − 4/9 = −1/9
• Consider: P(X = 1) = 2/3, but P(X = 1 | Y = 1) = 1/2
    Observing Y = 1 makes X = 1 less likely

Another Example of Covariance
• Consider the following data (the slide's scatterplot of Height vs. Weight is omitted):

    Weight   Height   Weight · Height
      64       57          3648
      71       59          4189
      53       49          2597
      67       62          4154
      55       51          2805
      58       50          2900
      77       55          4235
      57       48          2736
      56       42          2352
      51       42          2142
      76       61          4636
      68       57          3876

    E[W] = 62.75, E[H] = 52.75, E[W·H] = 3355.83
    Cov(W, H) = E[W·H] − E[W]E[H] = 3355.83 − (62.75)(52.75) = 45.77

Properties of Covariance
• Say X and Y are arbitrary random variables
    Cov(X, Y) = Cov(Y, X)
    Cov(X, X) = E[X²] − (E[X])² = Var(X)
    Cov(aX + b, Y) = a Cov(X, Y)
• Covariance of sums of random variables
    X_1, X_2, …, X_n and Y_1, Y_2, …, Y_m are random variables:
    Cov(Σ_{i=1}^n X_i, Σ_{j=1}^m Y_j) = Σ_{i=1}^n Σ_{j=1}^m Cov(X_i, Y_j)
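The die-roll covariance above can be verified exactly with rational arithmetic; a short sketch:

```python
from fractions import Fraction

# X = 1 if the roll is in {1,2,3,4}; Y = 1 if the roll is in {3,4,5,6}.
rolls = range(1, 7)
p = Fraction(1, 6)                                 # fair die

e_x  = sum(p for r in rolls if r <= 4)             # E[X] = 2/3
e_y  = sum(p for r in rolls if r >= 3)             # E[Y] = 2/3
e_xy = sum(p for r in rolls if r in (3, 4))        # E[XY] = P(X=1, Y=1) = 1/3

cov = e_xy - e_x * e_y                             # Cov(X, Y) = E[XY] - E[X]E[Y]
print(cov)   # -1/9
```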

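The weight/height numbers can likewise be reproduced from the table's data; a minimal sketch:

```python
# Data copied from the weight/height table above.
weights = [64, 71, 53, 67, 55, 58, 77, 57, 56, 51, 76, 68]
heights = [57, 59, 49, 62, 51, 50, 55, 48, 42, 42, 61, 57]
n = len(weights)

e_w  = sum(weights) / n                                    # E[W] = 62.75
e_h  = sum(heights) / n                                    # E[H] = 52.75
e_wh = sum(w * h for w, h in zip(weights, heights)) / n    # E[W*H] = 3355.83...

cov = e_wh - e_w * e_h      # Cov(W, H) = E[W*H] - E[W]E[H]
print(round(cov, 2))        # 45.77
```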