Graph Streaming and Sketching: Lecture 19, Nov 5, 2020 (Chandra)


  1. CS 498ABD: Algorithms for Big Data. Graph Streaming and Sketching. Lecture 19, Nov 5, 2020. Chandra (UIUC), CS498ABD, Fall 2020.

  3. Graphs. $G = (V, E)$ is an undirected graph with $n = |V|$ and $m = |E|$. Edges $e_1, e_2, \dots, e_m$ are seen as a stream; $n$ is known. Questions: What graph problems can be solved with small space? Can we handle edge deletions?

  4. Semi-streaming Model. Lower bounds show that we require $\Omega(n)$ memory. Assume we have $\Theta(n \,\mathrm{polylog}(n))$ memory: about $\mathrm{polylog}(n)$ per vertex of the graph. Can solve several interesting problems. Essentially reduce dense graphs to sparse graphs.

  5. Connectivity. Is $G$ connected? Output a spanning tree if it is. Output an MST of $G$ in the weighted case. Is $G$ $k$-edge-connected?

  6. Basic Connectivity. Maintain a spanning forest: need only $O(n)$ edges. When edge $e_i = (u, v)$ arrives: if $u$ and $v$ are in different components, add $e_i$ to the spanning forest; otherwise discard $e_i$. At the end of the stream $e_1, e_2, \dots, e_m$ we want to know if $G$ is connected.
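
The one-pass rule on this slide can be sketched as follows. This is a minimal illustration, assuming vertices are labelled `0..n-1` and the stream is an iterable of pairs; the `UnionFind` class and the function name are hypothetical helpers, not something the lecture prescribes.

```python
class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the sets of a and b; return False if they were already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


def streaming_spanning_forest(n, edge_stream):
    """One pass over the stream, storing at most n - 1 edges."""
    uf = UnionFind(n)
    forest = []
    for u, v in edge_stream:
        if uf.union(u, v):  # endpoints were in different components
            forest.append((u, v))
        # otherwise the edge would close a cycle and is discarded
    return forest


stream = [(0, 1), (1, 2), (0, 2), (3, 4), (2, 3)]
forest = streaming_spanning_forest(5, stream)
is_connected = len(forest) == 5 - 1  # a spanning tree has exactly n - 1 edges
```

The graph is connected exactly when the forest ends up with `n - 1` edges, so the connectivity question is answered from the stored forest alone.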

  12. MST. Maintain a spanning forest: need only $O(n)$ edges. When edge $e_i = (u, v)$ arrives: if $u$ and $v$ are in different components, add $e_i$ to the spanning forest. What if $u$ and $v$ are in the same connected component? Check the cycle formed by adding $e_i$ and discard the heaviest edge in the cycle. Exercise: Prove that the algorithm outputs an MST if $G$ is connected. Note: we did not focus on the time to process each edge in the stream. Can use data structures to implement in $O(\log n)$ time per operation.
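
A minimal sketch of the cycle rule, assuming vertices `0..n-1` and stream tokens `(u, v, weight)`. It finds the cycle with a plain DFS over the stored forest, so each edge costs $O(n)$ time; the $O(\log n)$ bound mentioned on the slide needs a dynamic-tree data structure that is not shown here. The function names are hypothetical.

```python
def tree_path(adj, src, dst):
    """Edges (a, b, w) on the forest path from src to dst, or None if disconnected."""
    prev = {src: None}
    stack = [src]
    while stack:
        node = stack.pop()
        if node == dst:
            break
        for nxt, w in adj[node].items():
            if nxt not in prev:
                prev[nxt] = (node, w)
                stack.append(nxt)
    if dst not in prev:
        return None
    path = []
    node = dst
    while prev[node] is not None:
        parent, w = prev[node]
        path.append((parent, node, w))
        node = parent
    return path


def streaming_mst(n, weighted_stream):
    """One pass; the stored forest never holds more than n - 1 edges."""
    adj = {v: {} for v in range(n)}  # forest as adjacency maps: adj[u][v] = weight
    for u, v, w in weighted_stream:
        if u == v:
            continue  # self-loops never belong to an MST
        path = tree_path(adj, u, v)
        if path is not None:
            a, b, heaviest = max(path, key=lambda e: e[2])
            if w >= heaviest:
                continue  # the arriving edge is a heaviest edge on its cycle
            del adj[a][b]  # drop the heaviest forest edge on the cycle
            del adj[b][a]
        adj[u][v] = w
        adj[v][u] = w
    return sorted((a, b, w) for a in adj for b, w in adj[a].items() if a < b)


stream = [(0, 1, 4), (1, 2, 3), (0, 2, 1), (2, 3, 5), (1, 3, 2)]
mst = streaming_mst(4, stream)
total_weight = sum(w for _, _, w in mst)
```

On this stream the edge `(0, 1, 4)` is evicted when `(0, 2, 1)` arrives and `(2, 3, 5)` is evicted when `(1, 3, 2)` arrives.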

  16. $k$-edge-connectivity. Definition: A graph $G = (V, E)$ is $k$-edge-connected if deleting any $k-1$ edges still leaves a connected graph. Definition: Given a graph $G = (V, E)$ and $S \subset V$, $\delta(S)$ is the set of edges with exactly one end point in $S$. Lemma: A graph $G$ is $k$-edge-connected iff $|\delta(S)| \ge k$ for every nonempty proper subset $S \subset V$.
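
The cut characterization can be checked directly on tiny graphs. The sketch below is a brute-force illustration only (exponential in $n$), assuming $n \ge 2$, vertices `0..n-1`, and an edge list of pairs; the function names are hypothetical.

```python
from itertools import combinations


def min_cut_size(n, edges):
    """Smallest |delta(S)| over nonempty proper subsets S, by enumeration."""
    best = None
    # delta(S) equals delta(V - S), so subsets of size up to n // 2 suffice
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            s = set(subset)
            crossing = sum(1 for u, v in edges if (u in s) != (v in s))
            if best is None or crossing < best:
                best = crossing
    return best


def is_k_edge_connected(n, edges, k):
    """Apply the lemma: every cut must contain at least k edges."""
    return min_cut_size(n, edges) >= k


cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
path4 = [(0, 1), (1, 2), (2, 3)]
```

A 4-cycle is 2-edge-connected but not 3-edge-connected, while a path is only 1-edge-connected.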

  21. Sparse certificates for $k$-edge connectivity. Observation: If $G$ is $k$-edge-connected then $m \ge kn/2$. Why? Question: Suppose $G$ is an edge-minimal $k$-edge-connected graph on $n$ nodes. What is an upper bound on the number of edges? Theorem: An edge-minimal $k$-edge-connected graph on $n$ nodes has at most $k(n-1)$ edges. Theorem: Given a graph $G$, finding the smallest 2-edge-connected spanning subgraph is NP-Hard.
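
One way to answer the "Why?" in the observation is a degree count. This is a short derivation assuming $n \ge 2$, applying the cut characterization of $k$-edge connectivity to the singleton sets $S = \{v\}$:

```latex
\begin{align*}
  \deg(v) &= |\delta(\{v\})| \ge k && \text{for every } v \in V, \\
  2m &= \sum_{v \in V} \deg(v) \ge kn, \\
  m &\ge \frac{kn}{2}.
\end{align*}
```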

  22. Sparse certificates for $k$-edge connectivity. Theorem: An edge-minimal $k$-edge-connected graph on $n$ nodes has at most $k(n-1)$ edges. Constructive proof via algorithm. For $i = 1$ to $k$ do: let $F_i$ be a spanning forest in $(V, E \setminus \bigcup_{j=1}^{i-1} F_j)$. Output $H = (V, F_1 \cup F_2 \cup \dots \cup F_k)$.
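
The forest-peeling algorithm can be sketched offline as follows. This is an illustration assuming a simple graph on vertices `0..n-1` with each edge listed once; `spanning_forest` and `sparse_certificate` are hypothetical names, and the greedy union-find forest is just one convenient way to pick each $F_i$.

```python
from itertools import combinations


def spanning_forest(n, edges):
    """Greedy spanning forest of (V, edges) using a simple union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))
    return forest


def sparse_certificate(n, edges, k):
    """Return [F_1, ..., F_k]; F_i is a spanning forest of E minus earlier forests."""
    remaining = list(edges)
    forests = []
    for _ in range(k):
        f = spanning_forest(n, remaining)
        forests.append(f)
        chosen = set(f)
        remaining = [e for e in remaining if e not in chosen]
    return forests


k5_edges = list(combinations(range(5), 2))  # complete graph on 5 nodes
forests = sparse_certificate(5, k5_edges, 2)
h_edges = [e for f in forests for e in f]
```

Each forest has at most $n - 1$ edges, so the union has at most $k(n-1)$; here $H$ keeps 7 of the 10 edges of $K_5$.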

  23. Claim: $G$ is 1-edge-connected iff $F_1$ is 1-edge-connected. Claim: $G$ is 2-edge-connected iff $F_1 \cup F_2$ is 2-edge-connected. Claim: $G$ is $h$-edge-connected iff $F_1 \cup F_2 \cup \dots \cup F_h$ is $h$-edge-connected.

  24. Sparse certificates for $k$-edge connectivity. With $H = (V, F_1 \cup F_2 \cup \dots \cup F_k)$ as output by the algorithm above, it is easy to see that $H$ has at most $k(n-1)$ edges. Lemma: $H$ is $k$-edge-connected if $G$ is.

  25. Streaming setting. For $i = 1$ to $k$ do: let $F_i$ be a spanning forest in $(V, E \setminus \bigcup_{j=1}^{i-1} F_j)$. Output $H = (V, F_1 \cup F_2 \cup \dots \cup F_k)$. The algorithm can be implemented in the streaming setting. How? Maintain $F_1, \dots, F_k$.
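
One common way to maintain the $k$ forests in a single pass is a cascade: each arriving edge is offered to $F_1$, then $F_2$, and so on, and is stored in the first forest where its endpoints are still in different components. The sketch below assumes that approach (the slide itself only says to maintain the forests), with vertices `0..n-1` and hypothetical function names.

```python
from itertools import combinations


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def streaming_certificate(n, edge_stream, k):
    """One pass; stores at most k * (n - 1) edges plus k union-find arrays."""
    parents = [list(range(n)) for _ in range(k)]
    forests = [[] for _ in range(k)]
    for u, v in edge_stream:
        for i in range(k):
            ru, rv = find(parents[i], u), find(parents[i], v)
            if ru != rv:
                parents[i][ru] = rv
                forests[i].append((u, v))
                break  # the edge now belongs to F_{i+1} (1-based) only
        # an edge rejected by all k forests is discarded
    return forests


k5_stream = combinations(range(5), 2)  # edges of the complete graph, as a stream
forests = streaming_certificate(5, k5_stream, 2)
```

An edge reaches forest $i$ only after every earlier forest rejected it, so each $F_i$ ends up as a spanning forest of the edges not used by $F_1, \dots, F_{i-1}$.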

  27. $k$-node-connectivity. Definition: A graph $G = (V, E)$ is $k$-node-connected (or $k$-vertex-connected) if deleting any $k-1$ nodes leaves a connected graph. Theorem: An edge-minimal $k$-node-connected graph on $n$ nodes has $O(kn)$ edges. The above theorem is much more tricky than for the edge case. See [Zelke] for references and a streaming algorithm.

  28. Part I: Graph sketching for connectivity

  30. Graph sketching. We saw previously that linear sketching on vectors $x$ allows for several powerful applications, including the ability to handle deletions. Graph streaming with deletions: each token in the stream is of the form $(e, \Delta)$ where $e$ is an edge and $\Delta \in \{-1, 1\}$. Want to maintain a sketch/data structure of size $O(n \,\mathrm{polylog}(n))$ such that one can answer basic questions. Example: connectivity queries.

  31. Linear sketching recap. Vector $x \in \mathbb{R}^n$ that is updated one coordinate at a time. Pick a sketch matrix $M_r \in \mathbb{R}^{k \times n}$ and maintain the sketch $M_r x$ of dimension $k$. The sketch matrix $M_r$ depends on a random string $r$ and is implicitly defined and not explicitly stored. Assumption is that $M_r \mathbf{1}_i$ for the vector $\mathbf{1}_i$ (which has 1 in the $i$'th coordinate and 0 in all other entries) can be computed efficiently from $r$. When $x$ is updated to $x + \alpha \mathbf{1}_i$ we update the sketch by $\alpha M_r \mathbf{1}_i$. Answer queries by postprocessing $M_r x$.
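
The update rule can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration: the matrix is a random sign matrix, its columns are derived from the seed with SHA-256 as a stand-in for a proper hash family, and `column` and `LinearSketch` are hypothetical names. The point is only that $M_r$ is never stored and that updates are linear.

```python
import hashlib


def column(seed, i, k):
    """Column M_r 1_i of an implicit k x n sign matrix, recomputed from the seed."""
    if k > 256:
        raise ValueError("this toy construction supports k <= 256")
    digest = hashlib.sha256(f"{seed}:{i}".encode()).digest()  # 256 pseudo-random bits
    return [1 if (digest[j // 8] >> (j % 8)) & 1 else -1 for j in range(k)]


class LinearSketch:
    """Maintains y = M_r x under coordinate updates, without storing M_r or x."""

    def __init__(self, seed, k):
        self.seed = seed
        self.k = k
        self.y = [0] * k  # sketch of the all-zero vector

    def update(self, i, alpha):
        """Apply x <- x + alpha * 1_i by adding alpha * M_r 1_i to the sketch."""
        col = column(self.seed, i, self.k)
        for j in range(self.k):
            self.y[j] += alpha * col[j]


a = LinearSketch("r", 16)
a.update(3, 1)
a.update(7, 2)
a.update(3, -1)  # deletion cancels the earlier insertion of coordinate 3

b = LinearSketch("r", 16)
b.update(7, 2)
```

Because the sketch is linear, `a` and `b` hold identical sketches: an insertion followed by a matching deletion leaves no trace.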

  32. $\ell_0$ sampling in turnstile model. $\|x\|_0$ is the number of non-zero coordinates (distinct elements). $\ell_0$-sampling: output a non-zero coordinate of $x$ near uniformly. Can be done with an $O(\log^2 n)$-sized sketch. Note: allow positive and negative entries in $x$.

  33. Sketching for graphs. Consider the vector $f \in \mathbb{R}^{\binom{n}{2}}$ where $f_i \in \{0, 1\}$ indicates whether edge $i$ in the complete graph on $n$ nodes is in the graph or not. Sketching $f$ is not adequate for most graph applications. We need information about edges incident to each vertex. For node $v$ let $f_v \in \mathbb{R}^{\binom{n}{2}}$ be a vector that only considers edges incident to $v$ in the complete graph. Essentially the row of $v$ in the adjacency matrix.
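
A small sketch of the two vectors as 0/1 indicators, following the description on this slide. It assumes vertices `0..n-1` and orders the $\binom{n}{2}$ coordinates lexicographically by pair; the ordering and the function names are hypothetical choices.

```python
from itertools import combinations


def edge_index(n):
    """Map each pair (a, b) with a < b to a coordinate in 0 .. C(n, 2) - 1."""
    return {pair: idx for idx, pair in enumerate(combinations(range(n), 2))}


def graph_vector(n, edges):
    """f: indicator of which edges of the complete graph are present."""
    index = edge_index(n)
    f = [0] * len(index)
    for u, v in edges:
        f[index[(min(u, v), max(u, v))]] = 1
    return f


def node_vector(n, edges, v):
    """f_v: same coordinates as f, but only edges incident to v are marked."""
    index = edge_index(n)
    fv = [0] * len(index)
    for a, b in edges:
        if v in (a, b):
            fv[index[(min(a, b), max(a, b))]] = 1
    return fv


edges = [(0, 1), (1, 2), (2, 3)]  # a path on 4 nodes
f = graph_vector(4, edges)
f1 = node_vector(4, edges, 1)
```

With this 0/1 convention every present edge is marked in exactly two node vectors, so summing all $f_v$ gives $2f$.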
