CS 498ABD: Algorithms for Big Data
Probabilistic Counting and Morris Counter
Lecture 04, September 3, 2020
Chandra (UIUC), Fall 2020
Streaming model
The input consists of m objects/items/tokens e_1, e_2, ..., e_m that are seen one by one by the algorithm. The algorithm has “limited” memory, say for B tokens where B < m (often B ≪ m), and hence cannot store all of the input. Want to compute interesting functions over the input.
Counting problem
Simplest streaming question: how many events in the stream?
Obvious: a counter that increments on seeing each new item. Requires ⌈log n⌉ = Θ(log n) bits to be able to count up to n events. (We will use n for the length of the stream for this lecture.)
Question: can we do better? Not deterministically. Yes, with randomization.
“Counting large numbers of events in small registers” by Robert Morris (Bell Labs), Communications of the ACM (CACM), 1978.
Probabilistic Counting
Algorithm ProbabilisticCounting:
    X ← 0
    While (a new event arrives)
        Toss a biased coin that is heads with probability 1/2^X
        If (coin turns up heads) X ← X + 1
    endWhile
    Output 2^X - 1 as the estimate for the length of the stream.

Intuition: X keeps track of log n in a probabilistic sense. Hence requires O(log log n) bits.

Theorem: Let Y = 2^X. Then E[Y] - 1 = n, the number of events seen.
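A minimal Python sketch of this counter (the class name and interface are my own, not from the slides; the slide specifies only the pseudocode above):

    import random

    class MorrisCounter:
        # Stores X, which tracks log2 of the count, rather than the count itself.
        def __init__(self):
            self.x = 0  # needs only O(log log n) bits in expectation

        def increment(self):
            # Flip a coin that is heads with probability 1/2^X; on heads, increment X.
            if random.random() < 2.0 ** (-self.x):
                self.x += 1

        def estimate(self):
            # Output 2^X - 1 as the estimate of the number of events seen so far.
            return 2 ** self.x - 1

For example, after feeding it a stream of 100,000 events via repeated increment() calls, estimate() equals n in expectation, though a single run can be off by a constant factor (see the variance analysis below).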
log n vs log log n
Morris’s motivation: Had 8-bit registers. Can count only up to 2^8 = 256 events using a deterministic counter. Had many counters for keeping track of different events, and using 16 bits (2 registers) was infeasible.
If only log log n bits are needed, then one can count up to 2^(2^8) = 2^256 events! In practice there is overhead due to error control etc. Morris reports counting up to 130,000 events using 8 bits while controlling error. See the 2-page paper for more details.
Analysis of Expectation
Induction on n. For i ≥ 0, let X_i be the counter value after i events. Let Y_i = 2^{X_i}. Both are random variables.
Base case: n = 0, 1 is easy to check: X_i and Y_i - 1 are deterministically equal to 0 and 1 respectively. (For n = 0: X = 0 and Y = 2^0 = 1; for n = 1: X = 1 and Y = 2^1 = 2.)
Analysis of Expectation
E[Y_n] = E[2^{X_n}] = Σ_{j ≥ 0} 2^j · Pr[X_n = j]
       = Σ_{j ≥ 0} 2^j · ( Pr[X_{n-1} = j] · (1 - 1/2^j) + Pr[X_{n-1} = j - 1] · 1/2^{j-1} )
       = Σ_{j ≥ 0} 2^j · Pr[X_{n-1} = j] + Σ_{j ≥ 0} ( 2 Pr[X_{n-1} = j - 1] - Pr[X_{n-1} = j] )
       = E[Y_{n-1}] + 1   (the second sum equals 2 · 1 - 1 = 1 since the probabilities sum to 1; apply induction to the first)
       = n + 1.
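As a sanity check on this calculation, one can estimate E[Y_n] empirically; a small simulation sketch (the parameters are illustrative, not from the slides):

    import random

    def y_value(n):
        # Run the probabilistic counter on a stream of n events and return Y = 2^X.
        x = 0
        for _ in range(n):
            if random.random() < 2.0 ** (-x):
                x += 1
        return 2 ** x

    n, trials = 1000, 50000
    avg = sum(y_value(n) for _ in range(trials)) / trials
    print(avg)  # concentrates around E[Y_n] = n + 1 = 1001 as trials grows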
Jensen’s Inequality
Definition: A real-valued function f : R → R is convex if f((a + b)/2) ≤ (f(a) + f(b))/2 for all a, b. Equivalently, f(λa + (1 - λ)b) ≤ λ f(a) + (1 - λ) f(b) for all λ ∈ [0, 1].
Theorem (Jensen’s inequality): Let Z be a random variable with E[Z] < ∞. If f is convex then f(E[Z]) ≤ E[f(Z)].
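A small numeric illustration (my own, not on the slide): take f(z) = 2^z, which is convex, and Z uniform on {0, 2}. Then f(E[Z]) = 2^1 = 2 while E[f(Z)] = (2^0 + 2^2)/2 = 2.5, so f(E[Z]) ≤ E[f(Z)] as the theorem promises.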
Implication for counter size
We have Y_n = 2^{X_n}. The function f(z) = 2^z is convex. Hence, by Jensen’s inequality, 2^{E[X_n]} ≤ E[2^{X_n}] = E[Y_n] = n + 1, which implies E[X_n] ≤ log(n + 1). Hence the expected number of bits in the counter is at most ⌈log log(n + 1)⌉.
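For a concrete (illustrative) number not on the slide: with n = 10^9 events, log(n + 1) ≈ 30, so the counter value is about 30 in expectation and roughly ⌈log log(n + 1)⌉ = 5 bits suffice, versus about 30 bits for an exact deterministic counter.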
Variance calculation
Question: Is the random variable Y_n well behaved even though its expectation is right? What is its variance? Is it concentrated around its expectation?
Lemma: E[Y_n^2] = (3/2) n^2 + (3/2) n + 1, and hence Var[Y_n] = E[Y_n^2] - (E[Y_n])^2 = (3/2) n^2 + (3/2) n + 1 - (n + 1)^2 = n(n - 1)/2.
Variance analysis
Analyze E[Y_n^2] via induction on n. Base cases: n = 0, 1 are easy to verify since Y_n is deterministic.
E[Y_n^2] = E[2^{2 X_n}] = Σ_{j ≥ 0} 2^{2j} · Pr[X_n = j]
         = Σ_{j ≥ 0} 2^{2j} · ( Pr[X_{n-1} = j] · (1 - 1/2^j) + Pr[X_{n-1} = j - 1] · 1/2^{j-1} )
         = Σ_{j ≥ 0} 2^{2j} · Pr[X_{n-1} = j] + Σ_{j ≥ 0} ( -2^j Pr[X_{n-1} = j] + 4 · 2^{j-1} Pr[X_{n-1} = j - 1] )
         = E[Y_{n-1}^2] + 3 E[Y_{n-1}]
         = (3/2)(n - 1)^2 + (3/2)(n - 1) + 1 + 3n   (by induction, and since E[Y_{n-1}] = n)
         = (3/2) n^2 + (3/2) n + 1.
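A quick check of the lemma at n = 2 (a worked example, not on the slides): after the first event X = 1 with certainty; the second event increments X with probability 1/2, so Y_2 = 4 with probability 1/2 and Y_2 = 2 with probability 1/2. Then E[Y_2] = 3 = n + 1, E[Y_2^2] = (16 + 4)/2 = 10 = (3/2)·4 + (3/2)·2 + 1, and Var[Y_2] = 10 - 9 = 1 = n(n - 1)/2.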
Error analysis via Chebyshev inequality
The estimate Y_n - 1 has expectation n and variance Var[Y_n] = n(n - 1)/2, so its standard deviation is √(n(n - 1)/2) ≤ n.
Applying Chebyshev’s inequality: Pr[ |Y_n - 1 - n| ≥ tn ] ≤ 1/(2t^2).
Hence a constant-factor approximation with constant probability (for instance, t = 2 gives error at most 2n with probability at least 7/8).
Question: Want the estimate to be tighter. For any given ε > 0, want the estimate to have error at most εn with, say, constant probability, or with probability at least 1 - δ for a given δ > 0.
Part II: Improving Estimators
Probabilistic Estimation
Setting: want to compute some real-valued function f of a given input I.
Probabilistic estimator: a randomized algorithm that, given I, outputs a random answer X such that E[X] ≈ f(I). The estimator is exact if E[X] = f(I) for all inputs I.
Additive approximation: |E[X] - f(I)| ≤ ε.
Multiplicative approximation: (1 - ε) f(I) ≤ E[X] ≤ (1 + ε) f(I).
Question: The estimator only gives the right expectation. A bound on Var[X] allows Chebyshev; sometimes Chernoff applies. How do we improve the estimator?
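One standard route (the likely direction of this part, though the details are not in this excerpt) is to run k independent copies of the estimator and average them, which divides the variance by k; a minimal sketch using the Morris counter as the base estimator, with illustrative parameters:

    import random

    def morris_estimate(n):
        # One independent copy of the probabilistic counter on a stream of n events.
        x = 0
        for _ in range(n):
            if random.random() < 2.0 ** (-x):
                x += 1
        return 2 ** x - 1

    def averaged_estimate(n, k):
        # Averaging k independent copies keeps the expectation at n but shrinks the
        # variance to n(n-1)/(2k); by Chebyshev, k on the order of 1/(2 eps^2 delta)
        # copies give error at most eps*n with probability at least 1 - delta.
        return sum(morris_estimate(n) for _ in range(k)) / k

    print(averaged_estimate(100000, 200))  # typically much closer to n than a single copy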