Lower Bounds for Quantile Estimation in Random-Order and Multi-Pass Streams
Sudipto Guha (UPenn), Andrew McGregor (UCSD)

Data Stream Model
• Stream: $m$ elements from a universe of size $n$: 3,5,3,7,5,4,8,5,3,7,5,4,8,6,3,2,6,4,7,3,4, ...
  e.g., IP packets, search engine queries, data read from external memory, device, sensor readings...
• Data-Stream Model:
  - No control over the ordering of elements
  - Limited working memory $S$
  - Limited time to process each element
  [Morris ’78] [Munro, Paterson ’78] [Flajolet, Martin ’85] [Alon, Matias, Szegedy ’96] [Henzinger, Raghavan, Rajagopalan ’98] [Feigenbaum, Kannan, Strauss, Viswanathan ’99]
• Previous work: quantiles, frequency moments, histograms, clustering, entropy, graph problems...

Stream Order?
• Almost all prior research considers the adversarial-order model (AOM).
• What about the random-order model (ROM)?
  - Form of average case analysis
  - Stream of independent samples
  - Uncorrelated fields in a database...
• Previous Work:
  - Frequent elements [Demaine, Lopez-Ortiz, Munro ’02]
  - Entropy & Distances [Guha, McGregor, Venkatasubramanian ’06]
  - Histograms [Guha, McGregor ’07]
  - Quantiles... [Munro, Paterson ’78], [Guha, McGregor ’06]

Quantile Estimation
• Given a set of $m$ elements, a $t$-approx median is any element of rank $m/2 \pm t$.
• Previous Work:
  - AOM: $\epsilon m$-approx in $O(\epsilon^{-1} \lg \epsilon m)$ space [Greenwald, Khanna ’01], [Shrivastava, Buragohain, Agrawal, Suri ’04], [Cormode, Korn, Muthukrishnan, Srivastava ’06]
  - ROM: 1-pass exact selection in $O(m^{1/2})$ space [Munro, Paterson ’78]
  - ROM: 1-pass $m^{1/2+\epsilon}$-approx in $O(2^{1/\epsilon} \operatorname{polylog} m)$ space
  - ROM: $O(\lg \lg m)$-pass selection in $O(\operatorname{polylog} m)$ space [Guha, McGregor ’06]
• Main Questions:
  - Are these ROM results possible in the AOM model?
  - Can these ROM results be improved?
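To make the definition of a $t$-approx median concrete, here is a minimal Python sketch. It assumes distinct elements and takes rank(x) to be the number of elements less than or equal to x; that convention, and the helper names `rank` and `is_t_approx_median`, are illustrative choices and not from the slides.

```python
def rank(values, x):
    """Rank of x: number of elements <= x (assumed convention, distinct values)."""
    return sum(1 for v in values if v <= x)


def is_t_approx_median(values, x, t):
    """True if x occurs in values and its rank lies in m/2 ± t."""
    m = len(values)
    return x in values and abs(rank(values, x) - m / 2) <= t


values = [10, 3, 7, 1, 9, 4, 8, 2, 6, 5]   # m = 10, so m/2 = 5
print(is_t_approx_median(values, 5, 0))    # rank 5: exact median under this convention
print(is_t_approx_median(values, 7, 1))    # rank 7: off by 2, not a 1-approx median
print(is_t_approx_median(values, 7, 2))    # rank 7: a 2-approx median
```

This checker needs the whole input in memory; it only illustrates the target, not a small-space algorithm.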

Results
• Thm: For a stream in random order:
  a) 1-pass, $O(\operatorname{polylg} m)$-space, $\tilde{O}(m^{1/2})$-approx
  b) $O(\lg \lg m)$-pass, $O(\operatorname{polylg} m)$-space exact selection
• Thm: For a stream in adversarial order:
  a) 1-pass, $\tilde{O}(m^{1/2})$-approx requires $\Omega(m^{1/2})$ space
  b) $O(\operatorname{polylg} m)$-space exact requires $\Omega(\lg m)$ passes
• Bonus Thm: For a stream in random order, a single pass, $t$-approx requires $\Omega(m^{1/2} t^{-3/2})$ space.
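As a worked reading of the bonus theorem, one can substitute $t = m^{\delta}$ into the bound. The parametrization by $\delta$ is introduced here for illustration only and does not appear on the slides.

```latex
\Omega\!\left(m^{1/2}\, t^{-3/2}\right)\Big|_{t = m^{\delta}}
  = \Omega\!\left(m^{1/2 - 3\delta/2}\right)
  = \Omega\!\left(m^{(1 - 3\delta)/2}\right)
```

For $\delta = 0$ (constant $t$) this gives $\Omega(m^{1/2})$, the same order as the $O(m^{1/2})$ one-pass exact-selection bound quoted under Previous Work. For $\delta \ge 1/3$ the exponent is non-positive and the bound says nothing beyond constant space.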

1: Algorithm (Random)
2: Lower-Bound (Random)
3: Lower-Bound (Adversarial)

Algorithm
1) Maintain bounds $[a,b]$ for the median and an element $c$ in $[a,b]$
2) Split the stream into segments: $S_1, E_1, S_2, E_2, \ldots, S_p, E_p$
3) For $i \in [p]$:
   - Sample $c \in S_i \cap [a,b]$
   - Estimate rank($c$) from $E_i$
   - Update $[a,b]$
[Figure: value against stream position, with segments $S_1, E_1, S_2, E_2, S_3, E_3$ and the levels $a$, $b$, $c$ marked]
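The three steps above can be sketched in Python as follows. This is a simplified illustration, not the authors' exact procedure: equal-length phases, taking the first in-range element of $S_i$ as the sample, the simple "move a or b to c" update, and returning the candidate whose estimated rank is closest to $m/2$ are all assumptions made here. The function name `approx_median_one_pass` and its parameters are hypothetical.

```python
import random


def approx_median_one_pass(stream, m, s_len, e_len):
    """Single pass over `stream` (length m), in phases S_i followed by E_i.

    Stores only a constant number of values: bounds a and b, candidate c,
    one counter, and the best candidate seen so far.
    """
    a, b = float("-inf"), float("inf")   # current bounds for the median
    c, below = None, 0                   # candidate and count of E_i elements below it
    best, best_err = None, float("inf")
    phase_len = s_len + e_len
    for pos, x in enumerate(stream):
        offset = pos % phase_len
        if offset == 0:                  # a new phase starts
            c, below = None, 0
        if offset < s_len:
            # S_i: in a random-order stream, the first element inside (a, b)
            # serves as a random sample from that interval.
            if c is None and a < x < b:
                c = x
        elif c is not None:
            # E_i: count elements below c to estimate rank(c).
            if x < c:
                below += 1
            if offset == phase_len - 1:
                est_rank = m * below / e_len
                if est_rank < m / 2:
                    a = c                # median estimated to lie above c
                else:
                    b = c                # median estimated to lie at or below c
                err = abs(est_rank - m / 2)
                if err < best_err:
                    best, best_err = c, err
    return best


rng = random.Random(0)
m = 10_000
stream = list(range(1, m + 1))           # distinct values, so rank(x) == x
rng.shuffle(stream)
result = approx_median_one_pass(stream, m, s_len=100, e_len=900)
print(result, abs(result - m // 2))
```

With these settings the pass runs 10 phases of 1,000 elements each. Phases in which no element of $S_i$ falls inside the current bounds are skipped, and a trailing incomplete phase is ignored.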

Analysis
• Let $t = O(m^{1/2} \lg^2 m)$
• Lemma: For $|E_i| = \Omega(m/\lg m)$, the error of the estimate of rank($c$) is $\pm t$ w.h.p.
• Lemma: For $|S_i| = \Omega(t)$, if $\operatorname{rank}(b) - \operatorname{rank}(a) = \Omega(t)$, then there exists $c$ in $S_i \cap [a,b]$ w.h.p.
• Lemma: Expect $\operatorname{rank}(b) - \operatorname{rank}(a)$ to halve in each phase, hence $p = O(\lg m)$ w.h.p.
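The first lemma can be checked numerically with a small Python sketch. It assumes all hidden constants equal 1, a stream holding the distinct values 1..m, and a fixed test element `c`; none of these choices come from the slides, and `estimate_rank` is a hypothetical helper.

```python
import math
import random


def estimate_rank(segment, c, m):
    """Scale the fraction of segment elements below c up to the full stream."""
    return m * sum(1 for x in segment if x < c) / len(segment)


m = 2 ** 18
lg_m = int(math.log2(m))                 # 18
t = math.isqrt(m) * lg_m ** 2            # m^(1/2) lg^2 m with constant 1
e_len = m // lg_m                        # |E_i| = m / lg m with constant 1

rng = random.Random(1)
# A contiguous segment of a randomly ordered stream of 1..m is a uniform
# random subset of the values, so it can be drawn directly.
segment = rng.sample(range(1, m + 1), e_len)

c = 100_000
true_rank = c - 1                        # number of stream elements below c
error = abs(estimate_rank(segment, c, m) - true_rank)
print(t, e_len, round(error))
```

At this size $t$ is already a large fraction of $m$, so the bound is loose; the statement is asymptotic and becomes meaningful only for much larger $m$.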
