Streaming space complexity of nearly all functions of one variable

Vladimir Braverman, Stephen Chestnut, David P. Woodruff, Lin F. Yang

January 7, 2016
A stream of m = 7 items from [n] = [4]: 4, 2, 3, 2, 4, 2, 2

The frequency vector f = (f_1, f_2, f_3, f_4) starts at (0, 0, 0, 0); each arrival increments one coordinate, and the running value of Σ_i f_i² evolves as:

  arrival   f              Σ_i f_i²
  (start)   (0, 0, 0, 0)    0
  4         (0, 0, 0, 1)    1
  2         (0, 1, 0, 1)    2
  3         (0, 1, 1, 1)    3
  2         (0, 2, 1, 1)    6
  4         (0, 2, 1, 2)    9
  2         (0, 3, 1, 2)   14
  2         (0, 4, 1, 2)   21

How much storage for a streaming (1 ± ε)-approximation to Σ_i f_i²?
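For intuition on the question, here is a minimal sketch (ours, not from the slides) of the classic AMS "tug-of-war" estimator for F2 = Σ_i f_i², which answers it positively for g(x) = x²; the parameter choices (groups, group_size, seed) are illustrative assumptions:

```python
import random

def f2_estimate(stream, n, groups=9, group_size=20, seed=0):
    """Median-of-means AMS ("tug-of-war") estimator for F2 = sum_i f_i^2.

    Each counter z adds a fixed random +/-1 sign per item value, so
    E[z^2] = F2.  Averaging group_size counters controls the variance,
    and the median over groups boosts the success probability.
    """
    rng = random.Random(seed)
    reps = groups * group_size
    # One +/-1 sign per (repetition, item value).  True randomness for
    # simplicity; 4-wise independent hashing suffices in theory.
    signs = [[rng.choice((-1, 1)) for _ in range(n + 1)] for _ in range(reps)]
    z = [0] * reps
    for item in stream:                      # a single pass over the stream
        for r in range(reps):
            z[r] += signs[r][item]
    means = [sum(z[g * group_size + k] ** 2 for k in range(group_size)) / group_size
             for g in range(groups)]
    return sorted(means)[groups // 2]        # median of the group means

stream = [4, 2, 3, 2, 4, 2, 2]               # the example stream; exact F2 = 21
print(f2_estimate(stream, n=4))
```

In theory O(ε⁻² log nm) bits suffice; the demo keeps the signs in memory only to stay short.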
Classify g : Z_{≥0} → R

Is there a streaming (1 ± ε)-approximation for Σ_i g(f_i) using only poly(ε⁻¹ log nm) bits?

Standing assumptions: ε = Ω(1/polylog(n)), m = poly(n), g(0) = 0, and g(x) > 0 for all x > 0.

Previous works:
  g(x) = 1(x ≠ 0): [FM85], [KNW10]
  g(x) = x^p: [F85], [AMS96], [IW05], [I06]
  g(x) = x log x: [CDM06], [CCM07], [HNO08]
  monotonic g: [BO10], [BC15]
Recursive Subsampling [Indyk & Woodruff 2005]

An α-heavy hitter is any item i* such that g(f_{i*}) ≥ α Σ_i g(f_i).

Theorem (Braverman & Ostrovsky 2010). (ε²/log³ n)-heavy hitters ⇒ (1 ± ε)-approximation to Σ_i g(f_i).

Heavy hitters by CountSketch [Charikar, Chen & Farach-Colton 2002]:
  find i* such that f_{i*}² ≥ α Σ_i f_i², and estimate f_{i*};
  O(α⁻¹ log² n) bits.
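A minimal illustrative CountSketch in Python (our sketch, with assumed parameters d = 5, w = 16; real implementations use 2-wise independent hash families, as approximated here):

```python
import random

class CountSketch:
    """Minimal CountSketch [Charikar, Chen & Farach-Colton 2002].

    d rows of w counters; each row hashes an item to a bucket and to a
    +/-1 sign.  The median over rows of sign * counter estimates f_i to
    within about sqrt(F2 / w), enough to recover alpha-heavy hitters.
    """
    PRIME = (1 << 61) - 1  # modulus for the 2-wise independent hashes

    def __init__(self, d=5, w=16, seed=0):
        rng = random.Random(seed)
        self.d, self.w = d, w
        self.table = [[0] * w for _ in range(d)]
        draw = lambda: (rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
        self.bucket_hash = [draw() for _ in range(d)]
        self.sign_hash = [draw() for _ in range(d)]

    def _bucket(self, r, item):
        a, b = self.bucket_hash[r]
        return ((a * item + b) % self.PRIME) % self.w

    def _sign(self, r, item):
        a, b = self.sign_hash[r]
        return 1 if ((a * item + b) % self.PRIME) % 2 == 0 else -1

    def update(self, item, count=1):
        for r in range(self.d):
            self.table[r][self._bucket(r, item)] += self._sign(r, item) * count

    def estimate(self, item):
        rows = sorted(self._sign(r, item) * self.table[r][self._bucket(r, item)]
                      for r in range(self.d))
        return rows[self.d // 2]             # median over the d rows

cs = CountSketch()
for item in [4, 2, 3, 2, 4, 2, 2]:           # the example stream
    cs.update(item)
print(cs.estimate(2))                        # true frequency f_2 = 4
```

Space is O(dw) counters, matching the O(α⁻¹ log² n)-bit bound when w = O(α⁻¹) and counters take O(log n) bits.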
Three properties are sufficient and almost necessary for Õ(1) bits.
Slow-jumping: g(y) ≲ (y/x)² g(x) whenever x ≤ y.
  YES: g(x) = x² log x
  NO:  g(x) = x³
Slow-dropping: g(y) ≳ g(x) whenever x ≤ y.
  YES: g(x) = Θ(log x)
  NO:  g(x) = Θ(1/x)
Predictable: whenever 0 < y − x ≪ x, either g(y) = (1 ± ε) g(x) or g(y − x) ≳ g(x).
  YES: g(x) = (2 + sin x) · 1(x > 0)
  NO:  g(x) = (2 + sin x) x²
Three properties are sufficient and almost necessary for Õ(1) bits:
  slow-jumping: g(y) ≲ (y/x)² g(x),
  slow-dropping: g(y) ≳ g(x), and
  predictable: whenever 0 < y − x ≪ x, g(y) = (1 ± ε) g(x) or g(y − x) ≳ g(x).

Lower bounds when a property fails:
  g(x)             lower bound    property that fails
  x³               Ω(n^{1/3})     slow-jumping
  1/x              Ω(n)           slow-dropping
  (2 + sin x) x²   Ω(n)           predictability
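The first two properties can be sanity-checked numerically on the slides' examples. A small illustrative checker (ours): the finite range and the slack constant C are arbitrary assumptions standing in for the "≲"/"≳" constants in the real definitions, which quantify over all x ≤ y:

```python
import math

def slow_jumping(g, C=10.0, xs=range(1, 200)):
    """g(y) <~ (y/x)^2 g(x): check g(y) <= C*(y/x)**2 * g(x) for x <= y."""
    return all(g(y) <= C * (y / x) ** 2 * g(x)
               for x in xs for y in xs if x <= y)

def slow_dropping(g, C=10.0, xs=range(1, 200)):
    """g(y) >~ g(x): check g(y) >= g(x) / C for all x <= y in the range."""
    return all(g(y) >= g(x) / C
               for x in xs for y in xs if x <= y)

print(slow_jumping(lambda x: x ** 2 * math.log(x + 1)))  # x^2 log x: passes
print(slow_jumping(lambda x: x ** 3))                    # x^3: fails
print(slow_dropping(lambda x: math.log(x + 1)))          # log x: passes
print(slow_dropping(lambda x: 1 / x))                    # 1/x: fails
```

The failures mirror the table: x³ outgrows the (y/x)² envelope, and 1/x drops by more than any constant factor.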
Almost necessary? Consider g(x) = 2^{−i(x)}, where i(x) = max{j ∈ N : 2^j divides x}, so g takes the values 1, 1/2, 1/4, … .
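For intuition, this g is easy to compute: i(x) is the exponent of the largest power of 2 dividing x, i.e. the index of the lowest set bit. An illustrative sketch (function names are ours):

```python
def i_of(x):
    """i(x) = max{ j : 2^j divides x } for x >= 1 (index of the lowest set bit)."""
    return (x & -x).bit_length() - 1

def g(x):
    """The slide's candidate counterexample g(x) = 2^(-i(x)), with g(0) = 0."""
    return 2.0 ** (-i_of(x)) if x > 0 else 0.0

print([g(x) for x in range(1, 9)])  # [1.0, 0.5, 1.0, 0.25, 1.0, 0.5, 1.0, 0.125]
```

The trick `x & -x` isolates the lowest set bit of x in two's complement, so its bit length minus one is exactly i(x).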