
Optimal Lower Bounds for Distributed and Streaming Spanning Forest Computation


  1. Optimal Lower Bounds for Distributed and Streaming Spanning Forest Computation
     Huacheng Yu, Harvard University
     Oct 18, 2018
     Joint work with Jelani Nelson

  6. Warm-up
     Consider the following dynamic problem:
     • edges are inserted into an initially empty graph $G$ on $n$ vertices
     • must output a spanning forest when queried
     Goal: minimize space
     Space complexity: $\Theta(n \log n)$ bits
     • maintain list of edges in the spanning forest: $O(n \log n)$
     • when the final graph is a tree itself, have to output the whole graph: $\Omega(n \log n)$
     What if we allow edge deletions?
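
The insertion-only upper bound from the warm-up can be sketched as follows. This is a minimal illustration, not code from the talk: the class name `InsertOnlySpanningForest` and its methods are hypothetical, and a union-find structure is assumed as the way to decide whether a new edge joins two components. Both the parent array and the stored forest take $O(n \log n)$ bits.

```python
class InsertOnlySpanningForest:
    """Toy insertion-only structure (hypothetical name, for illustration only)."""

    def __init__(self, n):
        # parent[x] is the union-find parent of vertex x: n entries of log n bits.
        self.parent = list(range(n))
        # At most n - 1 forest edges are ever stored, each taking 2 log n bits.
        self.forest = []

    def _find(self, x):
        # Find the component representative, halving paths along the way.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def insert(self, u, v):
        # Keep the edge only if it connects two different components.
        ru, rv = self._find(u), self._find(v)
        if ru != rv:
            self.parent[ru] = rv
            self.forest.append((u, v))

    def query(self):
        # The stored edges always form a spanning forest of the current graph.
        return list(self.forest)
```

An edge that closes a cycle is discarded on arrival. That is safe only because edges are never deleted, which is why this approach breaks down in the fully dynamic setting.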

  8. Fully dynamic spanning forest
     Maintain a dynamic graph on $n$ vertices, supporting
     • edge insertions,
     • edge deletions, and
     • spanning forest queries
     Goal: minimize space
     Theorem (Ahn, Guha, McGregor’12) ... solvable using $O(n \log^3 n)$ bits of space with error probability $1/\mathrm{poly}(n)$.
     Only two more log factors!

  10. Fully dynamic spanning forest
      Maintain a dynamic graph on $n$ vertices, supporting
      • edge insertions,
      • edge deletions, and
      • spanning forest queries
      Goal: minimize space
      Theorem (Ahn, Guha, McGregor’12) ... solvable using $O(n \log(n/\delta) \log^2 n)$ bits of space with error probability $\delta$.
      Only two more log factors! Why two more?

  12. Main result I
      Theorem (This paper) Any data structure for fully dynamic spanning forest with error probability $\delta$ must use $\Omega(n \log(n/\delta) \log^2 n)$ bits of memory, for any $2^{-n^{0.99}} < \delta < 0.99$.
      $\Rightarrow$ $\Omega(n \log^3 n)$ bits of space when $\delta$ is a constant $\Rightarrow$ need exactly two more log factors!
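
As a worked check of how the bound specializes, the two ends of the allowed range for $\delta$ can be plugged in. This only substitutes values into the theorem as stated on the slide and is not a derivation from the paper.

```latex
\begin{align*}
\delta = \Theta(1):\quad
  & \log(n/\delta) = \log n + \log(1/\delta) = \Theta(\log n)
  && \Rightarrow\ \Omega(n \log^3 n) \text{ bits}, \\
\delta = 1/n^{c}:\quad
  & \log(n/\delta) = (c+1)\log n = \Theta(\log n)
  && \Rightarrow\ \Omega(n \log^3 n) \text{ bits}, \\
\delta = 2^{-n^{0.99}}:\quad
  & \log(n/\delta) = \log n + n^{0.99} = \Theta(n^{0.99})
  && \Rightarrow\ \Omega(n^{1.99} \log^2 n) \text{ bits}.
\end{align*}
```

In each case the lower bound matches the $O(n \log(n/\delta) \log^2 n)$ upper bound of Ahn, Guha, and McGregor up to constant factors.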

  18. Simultaneous communication
      The [Ahn, Guha, McGregor’12] solution also solves the following $n$-player communication problem.
      A (fixed) graph on $n$ vertices is given to $n$ players with shared randomness:
      • each player only sees one vertex and its neighborhood
      • each player sends a message to a referee
      • referee outputs a spanning forest with probability $1 - \delta$
      Goal: minimize communication (compute a global function given small “sketches” of “local information”)

  21. AGM sketch for simultaneous communication
      A graph on $n$ vertices is given to $n$ players with shared randomness:
      • each player only sees one vertex and its neighborhood
      • each player sends a message to a referee
      • referee outputs a spanning forest with probability $1 - \delta$
      Goal: minimize communication
      Theorem (AGM’12) ... solvable using (worst-case) $O(\log(n/\delta) \log^2 n)$ bits of communication per player with error probability $\delta$.
      Trivial: $\Omega(\log n)$ since the referee has to learn $\Omega(n \log n)$ bits
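
The trivial bound can be made concrete with a counting argument. The version below is a sketch for the zero-error case and uses Cayley's formula, which the slides do not mention. With constant error probability the same order of magnitude is expected to hold, but that step is not worked out here.

```latex
\begin{align*}
\#\{\text{labeled trees on } n \text{ vertices}\} &= n^{\,n-2}
  && \text{(Cayley's formula)} \\
\text{total bits the referee must receive} &\ge \log_2 n^{\,n-2} = (n-2)\log_2 n \\
\text{average bits per player} &\ge \frac{(n-2)\log_2 n}{n}
  = \Bigl(1 - \tfrac{2}{n}\Bigr)\log_2 n = \Omega(\log n).
\end{align*}
```

When the input graph is itself a tree, the only valid output is the tree, so the $n$ messages together must distinguish all $n^{n-2}$ possible inputs.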

  24. Main result II
      Theorem (This paper) Any simultaneous communication protocol for spanning forest with error probability 0.99 must use $\Omega(\log^3 n)$ bits of communication on average.
      Exactly two more log factors are needed than the trivial information-theoretic lower bound.
      Open: higher lower bounds when error probability $\delta$ is lower?

  28. Graph sketching for spanning forest
      [AGM’12] designed a (randomized) linear sketch $S : \mathbb{N}^{n^2} \to \mathbb{N}^{O(n \log^2 n)}$ such that
      • $S$ is a linear mapping with poly-bounded coefficients
      • $S(G)$ is a concatenation of $S_1(G), S_2(G), \ldots, S_n(G)$; each $S_i(G)$ has $O(\log^2 n)$ dimensions, and it is computed from the neighborhood of vertex $i$
      • $S(G)$ determines a spanning forest with probability $1 - 1/n^c$

  29. Streaming algorithm
      Store $S(G)$ in memory:
      • update: $S(G \pm (u, v)) = S(G) \pm S((u, v))$
      • at end of stream: $S(G)$ determines a spanning forest w.h.p.
      Uses $O(n \log^3 n)$ bits of space
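
The update rule relies only on linearity, which a toy example can show. The sketch below is not the AGM construction: a small shared random integer matrix stands in for the actual per-vertex sketches, and no spanning-forest recovery is attempted. The class name `ToyLinearSketch`, the `rows` parameter, and the signed-incidence encoding (+1 at the smaller endpoint, -1 at the larger) are assumptions made for illustration.

```python
import random
from itertools import combinations


class ToyLinearSketch:
    """Per-vertex linear sketches of signed edge-incidence vectors (toy only)."""

    def __init__(self, n, rows=4, seed=0):
        rng = random.Random(seed)  # the seed plays the role of shared randomness
        # One coordinate per unordered vertex pair (u, v) with u < v.
        self.index = {pair: k for k, pair in enumerate(combinations(range(n), 2))}
        # Shared random matrix with small integer coefficients.
        self.matrix = [[rng.randint(-3, 3) for _ in self.index] for _ in range(rows)]
        # sketch[i] is the matrix applied to vertex i's incidence vector.
        self.sketch = [[0] * rows for _ in range(n)]

    def _apply(self, u, v, sign):
        u, v = min(u, v), max(u, v)  # assumes u != v
        col = self.index[(u, v)]
        for r, row in enumerate(self.matrix):
            self.sketch[u][r] += sign * row[col]  # +1 entry at the smaller endpoint
            self.sketch[v][r] -= sign * row[col]  # -1 entry at the larger endpoint

    def insert(self, u, v):
        self._apply(u, v, +1)

    def delete(self, u, v):
        self._apply(u, v, -1)

    def combined(self, vertices):
        # Adding the sketches of a vertex set cancels every edge inside the set,
        # leaving a sketch of the edges that leave the set.
        rows = len(self.matrix)
        return [sum(self.sketch[v][r] for v in vertices) for r in range(rows)]
```

Because every update is an addition or subtraction, inserting and then deleting an edge leaves no trace in the sketch. In this toy encoding, the combined sketch of a connected component with no outgoing edges is the zero vector.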

  30. Communication protocol
      Given graph $G$:
      • Player $i$ computes $S_i(G)$, and sends it to referee
      • referee concatenates all $S_i(G)$, obtains $S(G)$
      • referee outputs a spanning forest w.h.p.
      Uses $O(\log^3 n)$ bits of communication per player

  31. Simultaneous communication complexity of spanning forest

  32. Recall...
      An $n$-vertex graph is given to $n$ players with shared randomness:
      • each player only sees one vertex and its neighborhood
      • each player sends a message to a referee
      • referee outputs a spanning forest with probability $1 - \delta$
      Goal: prove an average player must send $\Omega(\log^3 n)$ bits for constant $\delta$

  33. Recall...
      An $n$-vertex graph is given to $n$ players with shared randomness:
      • each player only sees one vertex and its neighborhood
      • each player sends a message to a referee
      • referee outputs a spanning forest with probability $1 - \delta$
      Goal: prove some player must send $\Omega(\log^3 n)$ bits for $\delta = 1/n^c$
