ADVANCED ALGORITHMS Lecture 16: hashing (fin), sampling
ANNOUNCEMENTS ➤ HW 3 is due tomorrow! ➤ Send project topics: email utah-algo-ta@googlegroups.com with the subject “Project topic”; one email per group; include names and UIDs
LAST CLASS ➤ Hashing ➤ place n balls into n bins, independently and uniformly at random ➤ expected size of a bin = 1 ➤ number of bins with exactly k balls ≈ n/k! ➤ max size of a bin = O(log n / log log n), with high probability
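To make these bullets concrete, here is a small simulation sketch in Python (the choice n = 100_000, the seed, and the function name balls_into_bins are illustrative, not from the lecture) that estimates the average load, the number of bins with exactly k balls, and the maximum load.

import random
from collections import Counter
from math import factorial, log

def balls_into_bins(n, rng):
    # Throw n balls into n bins, each ball choosing a bin independently and uniformly at random.
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return loads

n = 100_000                      # hypothetical choice of n for the experiment
loads = balls_into_bins(n, random.Random(0))
counts = Counter(loads)          # counts[k] = number of bins with exactly k balls

print("average load:", sum(loads) / n)        # exactly 1, since n balls go into n bins
print("max load    :", max(loads), "  log n / log log n =", round(log(n) / log(log(n)), 1))
for k in range(4):
    # the slide's heuristic (accurate up to constant factors): about n/k! bins get exactly k balls
    print(f"bins with {k} balls: {counts.get(k, 0)}   n/{k}! = {n // factorial(k)}")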
MAIN IDEAS ➤ Random variables as sums of “simple” random variables ➤ Linearity of expectation ➤ Markov’s inequality (usually not tight): random variables do not deviate too much from their expectations ➤ Union bound
Theorem (Markov). Let X be a non-negative random variable. For any t > 0, Pr[ X > t ⋅ 𝔼[X] ] ≤ 1/t.
Theorem (Union bound). Suppose E_1, E_2, …, E_n are n events in a probability space. Then Pr[ E_1 ∪ E_2 ∪ … ∪ E_n ] ≤ Pr[ E_1 ] + Pr[ E_2 ] + … + Pr[ E_n ].
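One way to see why Markov’s inequality holds, using only that X ≥ 0: the event { X > t ⋅ 𝔼[X] } contributes at least t ⋅ 𝔼[X] ⋅ Pr[ X > t ⋅ 𝔼[X] ] to the expectation, and the rest contributes at least 0, so 𝔼[X] ≥ t ⋅ 𝔼[X] ⋅ Pr[ X > t ⋅ 𝔼[X] ]. Dividing by t ⋅ 𝔼[X] (assuming 𝔼[X] > 0; the bound is trivial otherwise) gives Pr[ X > t ⋅ 𝔼[X] ] ≤ 1/t.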
THOUGHTS ➤ When hashing n balls into n bins, the outcome is not as “uniform” as one might like: the max load of a bin is about log n / log log n ➤ Many empty bins (HW) ➤ What happens if there are more balls? If we hash m balls with m ≫ n log n, the load balancing is much better ➤ “Power of two choices” (Broder et al. 91): the max load drops to O(log log n)
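A small simulation sketch in Python comparing the two rules (the parameter n, the seed, and the function names are my own illustrative choices): each ball either picks one bin uniformly at random, or picks two bins uniformly at random and goes into the lighter one.

import random
from math import log

def max_load_one_choice(n, rng):
    # Each ball goes into one uniformly random bin.
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return max(loads)

def max_load_two_choices(n, rng):
    # Each ball picks two uniformly random bins and goes into the currently lighter one.
    loads = [0] * n
    for _ in range(n):
        i, j = rng.randrange(n), rng.randrange(n)
        loads[i if loads[i] <= loads[j] else j] += 1
    return max(loads)

n = 100_000
rng = random.Random(1)
print("one choice  max load:", max_load_one_choice(n, rng), "  log n / log log n ≈", round(log(n) / log(log(n)), 1))
print("two choices max load:", max_load_two_choices(n, rng), "  log log n ≈", round(log(log(n)), 1))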
ESTIMATION Question: suppose each person votes R or B. Can we predict the winner without counting all votes? Answer: ask a sample of the people who they will vote for, and output the winner in the sample. What we want: the winner in the sample is the winner in the full population.
THINGS THAT MATTER (OR OUGHT TO) ➤ the sampling should be truly uniform ➤ everyone should answer truthfully ➤ the number of samples ➤ the margin: how close the true votes are ➤ the confidence in our prediction
ANALYZING SAMPLING Each person has a choice of 0 or 1. The entire population has N people: N_0 vote 0 and N_1 vote 1. Natural formalism: ➤ Choose n people uniformly at random. ➤ Let X_i (0/1) be the vote of the i’th sampled person, so Pr[ X_i = 0 ] = N_0/N. ➤ Let n_0 and n_1 be the number of sampled people voting 0 and 1. ➤ Predicted winner: 0 if n_0 > n/2, and 1 otherwise.
Write n_0 = Σ_i 1[X_i = 0], a sum over the n samples of 0/1 indicators. What is the expectation of each indicator? It is Pr[ X_i = 0 ] = N_0/N, so by linearity 𝔼[n_0] = n ⋅ N_0/N (and likewise 𝔼[n_1] = n ⋅ N_1/N). So n_0/n, the fraction of votes for 0 in the sample, should tend to N_0/N, the fraction in the population. Estimation error: | n_0/n − N_0/N |.
We just argued these expressions for 𝔼[n_0] and 𝔼[n_1]. If 1 is the true winner, is 1 also the winner in the sample? ➤ What if N_0 = 0.4N and N_1 = 0.6N? Then 𝔼[n_0] = 0.4n, and our prediction is right as long as n_0 < n/2, i.e., as long as |n_0 − 𝔼[n_0]| < 0.1n. ➤ What if N_0 = 0.49N and N_1 = 0.51N? Then our prediction is right as long as |n_0 − 𝔼[n_0]| < 0.01n. Goal: if we take n samples, what is the probability that |n_0 − 𝔼[n_0]| > 0.01n?
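A simulation sketch of the 0.49N vs 0.51N scenario in Python (the sample size n = 10_000, the trial count, and the with-replacement model, where each sampled vote is an independent draw, are illustrative assumptions, not from the lecture):

import random

def sample_n0(n, frac0, rng):
    # Number of sampled voters who vote 0, modeling each of the n samples as an
    # independent draw that is a 0-vote with probability frac0 (i.e., sampling with replacement).
    return sum(rng.random() < frac0 for _ in range(n))

frac0 = 0.49            # N_0 = 0.49 N, so the true winner is 1
n = 10_000              # number of samples
trials = 1_000
rng = random.Random(0)

bad = sum(abs(sample_n0(n, frac0, rng) - frac0 * n) > 0.01 * n for _ in range(trials))
print("fraction of trials with |n0 - E[n0]| > 0.01n:", bad / trials)

With these numbers, 0.01n equals two standard deviations of n_0, so the bad event occurs in roughly 5% of trials; the rest of the lecture is about proving such statements.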
ANALYZING SAMPLING Natural formalism: ➤ Choose n people uniformly at random. ➤ Let X_i (0/1) be the vote of the i’th sampled person ➤ Error in estimation: |empirical mean − true expectation| ➤ “Confidence” Ideal guarantee: |empirical mean − true expectation| < 0.001 with probability 0.999
MARKOV? We want Pr[ n_0 ≥ 𝔼[n_0] + 0.01n ] to be small. Applying Markov directly: Pr[ n_0 ≥ 𝔼[n_0] + 0.01n ] ≤ 𝔼[n_0] / (𝔼[n_0] + 0.01n), which for 𝔼[n_0] = 0.49n is a failure probability bound of about 0.98. The bound does not improve as n grows; even n = 10⁶ samples gives the same 98%. Markov alone is far too weak here.
CAN WE USE THE “NUMBER OF SAMPLES”? Variance of a random variable X: Var[X] = 𝔼[ (X − 𝔼[X])² ]. If X = X_1 + X_2 + … + X_n and the X_i are independent, then Var[X] = Var[X_1] + Var[X_2] + … + Var[X_n].
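For the 0/1 indicator variables in our sampling setup the variance is easy to compute: if X is 0/1 with Pr[ X = 1 ] = p, then X² = X, so Var[X] = 𝔼[X²] − 𝔼[X]² = p − p² = p(1 − p) ≤ 1/4.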
CHEBYCHEV’S INEQUALITY If a random variable has low variance, then Markov can be improved.
Theorem. Let X be a random variable with variance Var[X]. Then for any t > 0, Pr[ |X − 𝔼[X]| ≥ t ] ≤ Var[X] / t².
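Chebyshev follows by applying Markov to the non-negative random variable Y = (X − 𝔼[X])², in the equivalent form Pr[ Y ≥ a ] ≤ 𝔼[Y]/a: Pr[ |X − 𝔼[X]| ≥ t ] = Pr[ Y ≥ t² ] ≤ 𝔼[Y] / t² = Var[X] / t².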
BACK TO SAMPLING We wanted Pr[ |n_0 − 𝔼[n_0]| ≥ 0.01n ] to be small. Idea: compute Var[n_0] and apply Chebyshev.
VARIANCE OF AVERAGE
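One way to fill in this computation, assuming the n samples are independent (exact for sampling with replacement; sampling without replacement only reduces the variance): write p = N_0/N, so n_0 is a sum of n independent 0/1 indicators each with variance p(1 − p), giving Var[n_0] = n ⋅ p(1 − p) ≤ n/4. The average therefore has Var[ n_0 / n ] = Var[n_0] / n² ≤ 1/(4n), which shrinks as the number of samples grows.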
BOUND VIA CHEBYCHEV
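Combining Chebyshev with the variance bound above: Pr[ |n_0 − 𝔼[n_0]| ≥ 0.01n ] ≤ Var[n_0] / (0.01n)² ≤ (n/4) / (10⁻⁴ n²) = 2500/n. The failure probability falls off like 1/n: to push it below 0.001, this bound asks for n ≥ 2.5 × 10⁶ samples.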
WHAT IF WE TAKE HIGHER POWERS? 𝔼[ (X − 𝔼[X])⁴ ] ≤ … “Moment methods” ➤ Usually get improved bounds
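For example, applying Markov to (X − 𝔼[X])⁴ gives Pr[ |X − 𝔼[X]| ≥ t ] ≤ 𝔼[ (X − 𝔼[X])⁴ ] / t⁴; for a sum of n independent bounded variables the fourth central moment grows only like n², so the failure probability at deviation 0.01n improves from order 1/n to order 1/n².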
CHERNOFF BOUND
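One standard form of the bound (there are several variants with slightly different constants; this one is stated for concreteness): if X = X_1 + … + X_n is a sum of independent random variables taking values in [0, 1] and μ = 𝔼[X], then for any 0 < δ < 1, Pr[ |X − μ| ≥ δ ⋅ μ ] ≤ 2 exp( −δ² μ / 3 ). In the polling example, μ = 0.49n and δ ⋅ μ = 0.01n gives a failure probability of at most 2 exp( −n / 14700 ), which decays exponentially in n: roughly 1.1 × 10⁵ samples already push it below 0.001, versus 2.5 × 10⁶ from Chebyshev.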
INTERPRETING THE CHERNOFF BOUND Useful heuristic: ➤ Sums of independent random variables don’t deviate from their expectation by much more than the standard deviation (the square root of the variance)
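A quick numerical illustration of the heuristic in Python (the values of n and the number of trials are arbitrary choices): for a sum of n independent fair coin flips, the standard deviation is √n / 2, and deviations beyond a few standard deviations are already rare.

import random
from math import sqrt

n = 10_000
trials = 2_000
rng = random.Random(0)
mean, sd = n / 2, sqrt(n) / 2     # mean and standard deviation of the sum of n fair coin flips

# For each trial, measure how many standard deviations the sum is away from its mean.
devs = [abs(sum(rng.random() < 0.5 for _ in range(n)) - mean) / sd for _ in range(trials)]
for c in (1, 2, 3):
    print(f"fraction of trials more than {c} standard deviations away: {sum(d > c for d in devs) / trials:.3f}")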
MCDIARMID’S INEQUALITY
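For reference, the bounded-differences form of the inequality: let X_1, …, X_n be independent, and let f be a function of n arguments such that changing the i’th argument alone changes the value of f by at most c_i. Then for any t > 0, Pr[ |f(X_1, …, X_n) − 𝔼[ f(X_1, …, X_n) ]| ≥ t ] ≤ 2 exp( −2t² / (c_1² + … + c_n²) ). Taking f to be the sum and each c_i = 1 recovers a Hoeffding-type bound for sums of independent variables in [0, 1]; the point is that the same concentration holds for any function that depends only a little on each individual coordinate.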
ESTIMATING THE SUM OF NUMBERS
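A minimal sketch of one natural approach to this problem in Python (the estimator, the made-up input array, and the sample size k below are illustrative assumptions, not necessarily the scheme developed in lecture): sample k indices uniformly at random, average the sampled values, and scale by n. The estimate is unbiased, and by the same Chebyshev/Chernoff reasoning as above it concentrates when the numbers lie in a bounded range.

import random

def estimate_sum(a, k, rng):
    # Unbiased estimate of sum(a): average of k uniformly sampled entries, scaled by len(a).
    n = len(a)
    return n * sum(a[rng.randrange(n)] for _ in range(k)) / k

rng = random.Random(0)
a = [rng.random() for _ in range(1_000_000)]       # hypothetical input: a million numbers in [0, 1]
print("true sum:", round(sum(a), 1))
print("estimate:", round(estimate_sum(a, 10_000, rng), 1))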