Communication and Memory Efficient Testing of Discrete Distributions
Themis Gouleakis (USC → MPI)
July 21, 2019
Joint work with: Ilias Diakonikolas (USC), Daniel Kane (UCSD), and Sankeerth Rao (UCSD)
MOTIVATION
◮ Datasets growing → too many samples needed!
◮ Insufficient memory!
◮ Design low-memory algorithms!
◮ Can we do property testing distributedly?
Is the lottery fair?
◮ We can learn the distribution: $\Omega(n)$ samples.
◮ With centralized sampling and unbounded memory, we can test (uniform vs. $\varepsilon$-far) with $\Theta(\sqrt{n}/\varepsilon^2)$ samples.
◮ What if memory is constrained, or centralized sampling is unavailable?
DEFINITION AND (CENTRALIZED) PRIOR WORK

Uniformity testing problem: Given samples from a probability distribution $p$, distinguish $p = U_n$ from $\|p - U_n\|_1 > \varepsilon$ with success probability at least $2/3$.

◮ Sample complexity: $\Theta\!\left(\frac{\sqrt{n}}{\varepsilon^2}\right)$ [Goldreich, Ron 00], [Batu, Fischer, Fortnow, Kumar, Rubinfeld, White 01], [Paninski 08], [Chan, Diakonikolas, Valiant, Valiant 14], [Diakonikolas, G., Peebles, Price 17]
PRIOR / RELATED WORK

Distributed learning
◮ Parameter estimation: [ZDJW13], [GMN14], [BGMNW16], [JLY16], [HOW18]
◮ Non-parametric: [DGLNOS17], [HMOW18]

Distributed testing
◮ Single sample per machine with sublogarithmic-size messages: [Acharya, Canonne, Tyagi 18]
◮ Two-party setting: [Andoni, Malkin, Nosatzki 18]
◮ LOCAL and CONGEST models: [Fischer, Meir, Oshman 18]
CENTRALIZED COLLISION-BASED ALGORITHM [GOLDREICH, RON 00], [BATU, FISCHER, FORTNOW, KUMAR, RUBINFELD, WHITE 01]

Problem: Given a distribution $p$ over $[n]$, distinguish $p = U_n$ from $\|p - U_n\|_1 \ge \varepsilon$.

◮ $m$ samples.
◮ Node labels: i.i.d. samples from $p$.
◮ Edges: $\{i, j\} \in E$ iff $L(i) = L(j)$.
◮ Define statistic $Z = \#\text{edges} \Rightarrow \mathbb{E}[Z] = \binom{m}{2} \cdot \|p\|_2^2$.
◮ Minimized for $p = U_n$.
◮ Idea: Draw enough samples and compare $Z$ to some threshold.
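As a quick illustration, here is a minimal Python sketch of this collision tester. The function name and the threshold constant $1 + \varepsilon^2/2$ are our choices for illustration; the cited papers tune the constants to obtain the stated guarantees.

```python
import random
from collections import Counter

def collision_tester(samples, n, eps):
    # Z = number of colliding pairs among the m samples, computed from
    # label multiplicities: Z = sum_i C(c_i, 2).  E[Z] = C(m,2) * ||p||_2^2,
    # which is minimized at C(m,2)/n when p = U_n.
    m = len(samples)
    counts = Counter(samples)
    Z = sum(c * (c - 1) // 2 for c in counts.values())
    # Illustrative acceptance threshold around the uniform-case mean; the
    # cited papers tune this constant to get the stated sample complexity.
    return Z <= (1 + eps**2 / 2) * (m * (m - 1) / 2) / n

# Example: a truly uniform sample stream should be accepted with high
# probability once m is on the order of sqrt(n)/eps^2.
n = 10_000
samples = [random.randrange(n) for _ in range(4_000)]
print(collision_tester(samples, n, eps=0.5))
```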
GENERIC BIPARTITE TESTING ALGORITHM: $\ell$ SAMPLES PER MACHINE

Problem: Given a distribution $p$ over $[n]$, distinguish $p = U_n$ from $\|p - U_n\|_1 \ge \varepsilon$.

◮ $\ell$ samples per machine.
◮ Node labels: i.i.d. samples from $p$.
◮ Edges: $\{i, j\} \in E$ iff $(i \in S_1) \wedge (j \in S_2) \wedge (L(i) = L(j))$.
◮ Define statistic $Z = \#\text{edges} \Rightarrow \mathbb{E}[Z] = |S_1| \cdot |S_2| \cdot \|p\|_2^2$.
◮ Minimized for $p = U_n$.
◮ Remark: Suboptimal sample complexity, but can lead to optimal communication complexity in certain cases.
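A minimal sketch of the cross-collision statistic, assuming $S_1$ and $S_2$ are given as lists of sample labels; the product of per-label multiplicities computes exactly the number of cross edges.

```python
from collections import Counter

def cross_collisions(S1, S2):
    # Z = number of pairs (i, j), i in S1, j in S2, with L(i) = L(j).
    # For each label, the number of such pairs is the product of its
    # multiplicities, so E[Z] = |S1| * |S2| * ||p||_2^2 as on the slide.
    c1, c2 = Counter(S1), Counter(S2)
    return sum(mult * c2[label] for label, mult in c1.items())
```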
COMMUNICATION MODEL

◮ Unbounded number of players.
◮ Players can broadcast on the blackboard.
◮ The referee asks questions to players and receives replies.
◮ Goal: Minimize the total number of bits of communication.
A COMMUNICATION-EFFICIENT ALGORITHM

◮ Idea: Statistic $Z$ = sum of degrees on one side.
◮ Only the opposite side needs to reveal its samples exactly.
◮ Broadcast samples: $\ell \cdot |S_1| = \frac{\sqrt{n/\ell}}{\varepsilon^2 \sqrt{\log n}}$.
◮ These alone are not enough for testing.
◮ And the samples on the right?
◮ Only the degrees $d_k$ are sent to the referee.
◮ $O(1)$ bits/message w.l.o.g.
◮ Communication complexity: $O\!\left(\frac{\sqrt{n/\ell}\,\sqrt{\log n}}{\varepsilon^2}\right)$ bits.
◮ Matching lower bound of $\Omega\!\left(\frac{\sqrt{n/\ell}\,\sqrt{\log n}}{\varepsilon^2}\right)$ bits for small $\ell$.
◮ Better than the naive $O\!\left(\frac{\sqrt{n}\,\log n}{\varepsilon^2}\right)$ bits.
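A hedged Python sketch of how the referee could assemble $Z$ in this protocol; `blackboard_samples` and `right_machines` are our own illustrative names, not notation from the paper.

```python
from collections import Counter

def referee_statistic(blackboard_samples, right_machines):
    # The left side's samples are broadcast exactly (log n bits each) and
    # live on the blackboard as the multiset S1.
    s1 = Counter(blackboard_samples)
    # Each right machine k sends only its degree d_k: how many of its own
    # samples hit S1, counted with multiplicity (O(1) bits/message w.l.o.g.).
    degrees = [sum(s1[x] for x in samples_k) for samples_k in right_machines]
    # Summing the degrees recovers Z = # cross edges, the same statistic
    # as in the bipartite tester.
    return sum(degrees)
```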
COMMUNICATION-EFFICIENT IMPLEMENTATION: TWO ALGORITHMS

Case I: $\ell = \tilde{O}(n^{1/3}/\varepsilon^{4/3})$ samples/machine
◮ Use cross collisions (bipartite graph).
◮ Communication complexity: $O\!\left(\frac{\sqrt{n/\ell}\,\sqrt{\log n}}{\varepsilon^2}\right)$ bits.

Case II: $\ell = \tilde{\Omega}(n^{1/3}/\varepsilon^{4/3})$ samples/machine
◮ Each machine sends its number of local collisions to the referee.
◮ The referee computes the total sum $Z$ of the collisions.
◮ Each machine's count has expectation $\binom{\ell}{2} \cdot \|p\|_2^2$.
◮ Threshold: $(1 + \varepsilon^2)\,\mathbb{E}_{p = U_n}[Z]$.
◮ Communication complexity: $O\!\left(\frac{n \log n}{\ell^2 \varepsilon^4}\right)$ bits.
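A sketch of Case II, under the assumption that the referee simply adds up the reported counts; `total_local_collisions` and `uniform_case_mean` are our names.

```python
from collections import Counter
from math import comb

def total_local_collisions(machines):
    # Each machine reports its local collision count; the referee sums them.
    return sum(
        sum(comb(c, 2) for c in Counter(samples).values())
        for samples in machines
    )

def uniform_case_mean(num_machines, l, n):
    # Expectation of Z under p = U_n: each machine contributes C(l,2)/n.
    return num_machines * comb(l, 2) / n

# Accept iff Z <= (1 + eps**2) * uniform_case_mean(...), per the slide.
```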
MEMORY-EFFICIENT IMPLEMENTATION IN THE ONE-PASS STREAMING MODEL

Model: One-pass streaming algorithm: the samples arrive in a stream and the algorithm can access each of them only once.
Memory constraint: at most $m$ bits, for some $m \ge \log n/\varepsilon^6$.

◮ Use $N_1 = \frac{m}{2\log n}$ samples to get the multiset of labels $S_1$.
◮ Use collision information from $N_2 = \Theta\!\left(\frac{n \log n}{m \varepsilon^4}\right)$ further samples (i.e., the multiset of labels $S_2$).

Remarks:
◮ We can store the running sums $\sum_{k=1}^{r} d_k$, $1 \le r \le N_2$, in a single pass.
◮ For $m = \Omega(\sqrt{n}\log n/\varepsilon^2)$, we simply run the classical collision-based tester on the first $O(\sqrt{n}/\varepsilon^2)$ samples.
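A one-pass sketch combining both phases; the constants inside $N_1$ and $N_2$ and the final threshold are illustrative, not the paper's tuned values.

```python
from collections import Counter
from math import ceil, log2

def one_pass_tester(stream, n, eps, m_bits):
    log_n = max(1, ceil(log2(n)))
    N1 = m_bits // (2 * log_n)                # S1 fits in about m/2 bits
    N2 = ceil(n * log_n / (m_bits * eps**4))  # Theta(.), constant set to 1
    s1 = Counter()
    running_sum = 0                           # sum_{k <= r} d_k, r <= N2
    for r, x in enumerate(stream):
        if r < N1:
            s1[x] += 1                        # phase 1: store the label
        elif r < N1 + N2:
            running_sum += s1[x]              # phase 2: keep only the sum
        else:
            break
    # Illustrative threshold around the uniform-case mean N1 * N2 / n.
    return running_sum <= (1 + eps**2 / 2) * N1 * N2 / n
```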
SUMMARY OF RESULTS

Sample Complexity Bounds with Memory Constraints

| Property   | Upper Bound | Lower Bound 1 | Lower Bound 2 |
| Uniformity | $O\!\left(\frac{n\log n}{m\varepsilon^4}\right)$ | $\Omega\!\left(\frac{n\log n}{m\varepsilon^4}\right)$ | $\Omega\!\left(\frac{n}{m\varepsilon^2}\right)$ |
| Conditions | $n^{0.9} \gg m \gg \log(n)/\varepsilon^2$ | $m = \tilde{\Omega}(n^{0.34}\varepsilon^{8/3} + n^{0.1}\varepsilon^4)$ | Unconditional |
| Closeness  | $O\!\left(\frac{n\sqrt{\log n}}{\sqrt{m}\,\varepsilon^2}\right)$ | – | – |
| Conditions | $\tilde{\Theta}(\min(n, n^{2/3}/\varepsilon^{4/3})) \gg m \gg \log(n)$ | – | – |

Communication Complexity Bounds

| Property   | UB 1 | UB 2 | LB 1 | LB 2 | LB 3 |
| Uniformity | $O\!\left(\frac{\sqrt{n\log(n)/\ell}}{\varepsilon^2}\right)$ | $O\!\left(\frac{n\log(n)}{\ell^2\varepsilon^4}\right)$ | $\Omega\!\left(\frac{\sqrt{n\log(n)/\ell}}{\varepsilon^2}\right)$ | $\Omega\!\left(\frac{\sqrt{n/\ell}}{\varepsilon}\right)$ | $\Omega\!\left(\frac{n}{\ell^2\varepsilon^2\log n}\right)$ |
| Conditions | $\ell = \tilde{O}\!\left(\frac{n^{1/3}}{\varepsilon^{4/3}}\right)$ | $\ell = \tilde{\Omega}\!\left(\frac{n^{1/3}}{\varepsilon^{4/3}}\right)$ | $n^{0.3} \gg \ell \gg \varepsilon^{-4}$ | $\ell \ll O(\sqrt{n}/\varepsilon^2)$ | $n^{0.9} \gg \ell \gg \Omega\!\left(\sqrt{\varepsilon^8 n/\log n}\right)$ |
| Closeness  | $O\!\left(\frac{n^{2/3}\log^{1/3}(n)}{\ell^{2/3}\varepsilon^{4/3}}\right)$ | – | – | – | – |
| Conditions | $n\varepsilon^4/\log(n) \gg \ell$ | – | – | – | – |
LOWER BOUNDS (ONE PASS)
$k$ samples, $m$ bits of memory, $\ell$ samples per machine

1. Memory:
◮ $k \cdot m = \Omega\!\left(\frac{n}{\varepsilon^2}\right)$
◮ Under technical assumptions: $k \cdot m = \Omega\!\left(\frac{n\log n}{\varepsilon^4}\right)$

Reduction (low communication ⇒ low memory):
◮ samples/machine: $\ell$
◮ bits of communication: $t$
◮ Store the samples of the next player only ⇒ $(t + \ell\log n)$-bit memory.

2. Communication ($\ell = O\!\left(\frac{n^{1/3}}{\varepsilon^{4/3}(\log n)^{1/3}}\right)$), one pass:
◮ $\Omega\!\left(\frac{\sqrt{n/\ell}}{\varepsilon}\right)$ bits.
◮ Under assumptions: $\Omega\!\left(\frac{\sqrt{n\log n/\ell}}{\varepsilon^2}\right)$ bits.

3. Communication ($\ell = \Omega\!\left(\frac{n^{1/3}}{\varepsilon^{4/3}(\log n)^{1/3}}\right)$), one pass:
◮ $\Omega\!\left(\frac{n}{\ell^2\varepsilon^2\log n}\right)$ bits.
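The reduction can be phrased as a one-pass simulation; the sketch below assumes a hypothetical `next_message` callback that computes each player's broadcast from the transcript so far and that player's own samples.

```python
def simulate_with_low_memory(next_message, players):
    # One-pass simulation of a t-bit blackboard protocol: at any moment we
    # hold the transcript so far (at most t bits) plus the current player's
    # l samples (l * log n bits), i.e. (t + l*log n)-bit memory in total.
    transcript = ""                      # bits broadcast so far
    for samples in players:              # stream arrives player by player
        transcript += next_message(transcript, samples)
        # this player's samples are discarded before the next ones arrive
    return transcript
```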
SUMMARY / OPEN PROBLEMS

◮ We described a bipartite collision-based algorithm for uniformity testing.
◮ We then applied it to the memory-constrained and distributed settings.
◮ We showed matching lower bounds in certain parameter regimes.
◮ An asymptotically optimal algorithm becomes (provably) suboptimal as $\ell$ grows.

Open Problems:
◮ Do the lower bounds still hold if multiple passes are allowed?
◮ Is there an algorithm with a better communication–sample complexity trade-off?