Indicators: Now With Pair-wise Flavor!
• Recall that $I_i$ is the indicator variable for event $A_i$:
  $I_i = \begin{cases} 1 & \text{if } A_i \text{ occurs} \\ 0 & \text{otherwise} \end{cases}$
• Let X = # of events that occur: $X = \sum_{i=1}^{n} I_i$, so
  $E[X] = \sum_{i=1}^{n} E[I_i] = \sum_{i=1}^{n} P(A_i)$
• Recall: Var(X) = E[X^2] – (E[X])^2
• Now consider a pair of events $A_i$ and $A_j$ both occurring:
  $I_i I_j = 1$ if both $A_i$ and $A_j$ occur, 0 otherwise
  The number of pairs of events that occur is $\sum_{i<j} I_i I_j = \binom{X}{2}$

From Event Pairs to Variance
• Expected number of pairs of events that occur:
  $E\left[\binom{X}{2}\right] = E\left[\sum_{i<j} I_i I_j\right] = \sum_{i<j} E[I_i I_j] = \sum_{i<j} P(A_i A_j)$
• Since $\binom{X}{2} = \frac{X(X-1)}{2}$, this expectation also equals $\frac{1}{2}\left(E[X^2] - E[X]\right)$, so
  $E[X^2] = 2\sum_{i<j} P(A_i A_j) + E[X]$
• Plugging into Var(X) = E[X^2] – (E[X])^2:
  $\mathrm{Var}(X) = 2\sum_{i<j} P(A_i A_j) + E[X] - (E[X])^2 = 2\sum_{i<j} P(A_i A_j) + \sum_{i=1}^{n} P(A_i) - \left(\sum_{i=1}^{n} P(A_i)\right)^2$

Let's Try It With the Binomial
• X ~ Bin(n, p). Each trial $X_i \sim \mathrm{Ber}(p)$; let event $A_i$ = trial i is a success (i.e., $X_i = 1$)
  $E[X] = \sum_{i=1}^{n} P(A_i) = np$
• $E\left[\sum_{i<j} I_i I_j\right] = \sum_{i<j} P(A_i A_j) = \sum_{i<j} p^2 = \binom{n}{2} p^2$,
  so $E[X(X-1)] = E[X^2] - E[X] = n(n-1)p^2$
• $\mathrm{Var}(X) = E[X^2] - (E[X])^2 = \left(E[X^2] - E[X]\right) + E[X] - (E[X])^2$
  $= n(n-1)p^2 + np - (np)^2 = n^2p^2 - np^2 + np - n^2p^2 = np(1-p)$

Computer Cluster Utilization
• Computer cluster with N servers
  Requests independently go to server i with probability $p_i$
  Let event $A_i$ = server i receives no requests
  X = # of events $A_1, A_2, \ldots, A_N$ that occur
  Y = # servers that receive ≥ 1 request = N – X
• E[Y] after the first n requests?
  Since requests are independent: $P(A_i) = (1 - p_i)^n$
  $E[X] = \sum_{i=1}^{N} P(A_i) = \sum_{i=1}^{N} (1 - p_i)^n$
  $E[Y] = N - E[X] = N - \sum_{i=1}^{N} (1 - p_i)^n$
• When $p_i = \frac{1}{N}$ for all i:
  $E[Y] = N - \sum_{i=1}^{N} \left(1 - \tfrac{1}{N}\right)^n = N\left(1 - \left(1 - \tfrac{1}{N}\right)^n\right)$

Computer Cluster Utilization (cont.)
• Same setup: N servers; requests independently go to server i with probability $p_i$; $A_i$ = server i receives no requests; X = # of events $A_1, \ldots, A_N$ that occur; Y = # servers that receive ≥ 1 request = N – X
• Var(Y) after the first n requests? (A simulation check of these formulas appears after these slides.)
  Since Y = N – X: Var(Y) = (–1)^2 Var(X) = Var(X)
  Requests are independent, so $P(A_i A_j) = (1 - p_i - p_j)^n$ for $i \neq j$
  $E[X(X-1)] = E[X^2] - E[X] = 2\sum_{i<j} P(A_i A_j) = 2\sum_{i<j} (1 - p_i - p_j)^n$
  $\mathrm{Var}(X) = 2\sum_{i<j} (1 - p_i - p_j)^n + E[X] - (E[X])^2$, where $E[X] = \sum_{i=1}^{N} (1 - p_i)^n$
  $\mathrm{Var}(Y) = 2\sum_{i<j} (1 - p_i - p_j)^n + \sum_{i=1}^{N} (1 - p_i)^n - \left(\sum_{i=1}^{N} (1 - p_i)^n\right)^2$

Computer Cluster = Coupon Collecting
• This is really another "Coupon Collector" problem
  Each server is a "coupon type"
  Request to a server = collecting a coupon of that type
• Hash table version
  Each server is a bucket in the table
  Request to a server = a string gets hashed to that bucket
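The cluster-utilization (coupon-collector) formulas above are easy to sanity-check by simulation. Below is a minimal Python sketch; the function names, the run count, and the example values N = 10, n = 25 are my own illustrative choices, not from the slides. It compares the empirical mean and variance of Y against the closed-form indicator results.

```python
# Monte Carlo check of the cluster-utilization formulas.
# N servers, n requests; request goes to server i with probability p[i].
# Y = number of servers that receive at least one request.
import random
from itertools import combinations

def simulate_Y(N, n, p, trials=50_000):
    """Empirical mean and variance of Y over many simulated runs."""
    ys = []
    for _ in range(trials):
        hit = set(random.choices(range(N), weights=p, k=n))  # servers that got >= 1 request
        ys.append(len(hit))
    mean = sum(ys) / trials
    var = sum((y - mean) ** 2 for y in ys) / trials
    return mean, var

def exact_Y(N, n, p):
    """Closed-form E[Y] and Var(Y) from the indicator derivation above."""
    ex = sum((1 - pi) ** n for pi in p)                # E[X] = sum_i (1 - p_i)^n
    ey = N - ex                                        # E[Y] = N - E[X]
    pair = sum((1 - p[i] - p[j]) ** n for i, j in combinations(range(N), 2))
    var = 2 * pair + ex - ex ** 2                      # Var(X) = Var(Y)
    return ey, var

N, n = 10, 25
p = [1 / N] * N                                        # equally likely servers
print("simulated:", simulate_Y(N, n, p))
print("exact:    ", exact_Y(N, n, p))
```

With equally likely servers the exact mean reduces to $N(1 - (1 - 1/N)^n)$, so the two printed lines should agree up to Monte Carlo noise.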
Product of Expectations
• Let X and Y be independent random variables, and let g(·) and h(·) be real-valued functions
  $E[g(X)\,h(Y)] = E[g(X)]\,E[h(Y)]$
• Proof:
  $E[g(X)h(Y)] = \int_y \int_x g(x)h(y)\, f_{X,Y}(x,y)\, dx\, dy$
  $= \int_y \int_x g(x)h(y)\, f_X(x)\, f_Y(y)\, dx\, dy$   (X and Y independent)
  $= \int_x g(x) f_X(x)\, dx \int_y h(y) f_Y(y)\, dy = E[g(X)]\,E[h(Y)]$

The Dance of the Covariance
• Say X and Y are arbitrary random variables
• Covariance of X and Y:
  $\mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])]$
• Equivalently:
  $\mathrm{Cov}(X, Y) = E[XY - E[X]\,Y - X\,E[Y] + E[X]E[Y]]$
  $= E[XY] - E[X]E[Y] - E[X]E[Y] + E[X]E[Y] = E[XY] - E[X]E[Y]$
• If X and Y are independent, then E[XY] = E[X]E[Y], so Cov(X, Y) = 0
  But Cov(X, Y) = 0 does not imply X and Y are independent!

Dependence and Covariance
• X and Y are random variables with joint PMF:

              X = –1   X = 0   X = 1   p_Y(y)
   Y = 0       1/3       0      1/3     2/3
   Y = 1        0       1/3      0      1/3
   p_X(x)      1/3      1/3     1/3

  (equivalently, Y = 0 if X ≠ 0 and Y = 1 otherwise)
• E[X] = 0 and E[Y] = 1/3
  Since XY = 0 always, E[XY] = 0
  Cov(X, Y) = E[XY] – E[X]E[Y] = 0 – 0 = 0
• But X and Y are clearly dependent: knowing X completely determines Y

Example of Covariance
• Consider rolling a 6-sided die
  Let indicator variable X = 1 if the roll is 1, 2, 3, or 4
  Let indicator variable Y = 1 if the roll is 3, 4, 5, or 6
• What is Cov(X, Y)?
  E[X] = 2/3 and E[Y] = 2/3
  $E[XY] = \sum_x \sum_y xy\, p(x, y) = (0)(0) + (0)(\tfrac{1}{3}) + (0)(\tfrac{1}{3}) + (1)(\tfrac{1}{3}) = \tfrac{1}{3}$
  (the only nonzero term is X = Y = 1, i.e., the roll is 3 or 4)
  Cov(X, Y) = E[XY] – E[X]E[Y] = 1/3 – 4/9 = –1/9
• Consider: P(X = 1) = 2/3 but P(X = 1 | Y = 1) = 1/2
  Observing Y = 1 makes X = 1 less likely, consistent with the negative covariance

Another Example of Covariance
• Consider the following data (the slide also shows a scatter plot of Height vs. Weight; a short code check appears after these slides):

   Weight  Height  Weight × Height
     64      57        3648
     71      59        4189
     53      49        2597
     67      62        4154
     55      51        2805
     58      50        2900
     77      55        4235
     57      48        2736
     56      42        2352
     51      42        2142
     76      61        4636
     68      57        3876

• E[W] = 62.75, E[H] = 52.75, E[W·H] = 3355.83
  Cov(W, H) = E[W·H] – E[W]E[H] = 3355.83 – (62.75)(52.75) = 45.77

Properties of Covariance
• Say X and Y are arbitrary random variables
  Cov(X, Y) = Cov(Y, X)
  Cov(X, X) = E[X·X] – E[X]E[X] = E[X^2] – (E[X])^2 = Var(X)
  Cov(aX + b, Y) = a·Cov(X, Y)
• Covariance of sums of random variables: if X_1, X_2, …, X_n and Y_1, Y_2, …, Y_m are random variables, then
  $\mathrm{Cov}\left(\sum_{i=1}^{n} X_i,\; \sum_{j=1}^{m} Y_j\right) = \sum_{i=1}^{n} \sum_{j=1}^{m} \mathrm{Cov}(X_i, Y_j)$
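As a quick check of the weight/height slide, here is a short Python sketch (variable names are illustrative) that recomputes Cov(W, H) from the table using the same population-style averages the slide uses (dividing by n, not n – 1).

```python
# Cov(W, H) = E[W*H] - E[W]E[H] for the weight/height data above.
weights = [64, 71, 53, 67, 55, 58, 77, 57, 56, 51, 76, 68]
heights = [57, 59, 49, 62, 51, 50, 55, 48, 42, 42, 61, 57]

n = len(weights)
mean_w = sum(weights) / n                                    # E[W] = 62.75
mean_h = sum(heights) / n                                    # E[H] = 52.75
mean_wh = sum(w * h for w, h in zip(weights, heights)) / n   # E[W*H] = 3355.83

cov = mean_wh - mean_w * mean_h                              # ≈ 45.77
print(round(mean_w, 2), round(mean_h, 2), round(mean_wh, 2), round(cov, 2))
```

One design note: library helpers such as numpy.cov divide by n – 1 (the sample covariance) by default, so they would give a slightly larger value than the 45.77 on the slide; dividing by n matches the slide's E[·]-based calculation.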