Towards a Theory of Parameterized Streaming Algorithms
Graham Cormode, Rajesh Chitnis
Parameterized Streaming Algorithms
We increasingly have to deal with huge graphs:
• Facebook graph: $10^9$ nodes
• Brain graph: $10^9$ nodes
• Google Maps in USA: $10^8$ intersection nodes
• Web graph: $2^{32}$ nodes
• It is inconvenient or impossible to store the whole input for random access
• "Solved" problems become hard under different models of data access
• E.g. external memory, MapReduce, streaming…
• The paradigm of streaming algorithms is one attempt to deal with Big Data
• The streaming model (for graphs) is as follows:
• The vertex set $V = \{1, 2, \ldots, n\}$ is fixed and known in advance
• The edges arrive one by one (in arbitrary order)
• For each edge arrival, we need to make a (fast) decision about what information to store
• We cannot (do not want to) store all the edges
• We allow unbounded computation at the end of the stream
• Which graph problems can we solve efficiently in this model?
• The naïve algorithm for any graph problem uses $O(n^2)$ bits by storing the whole adjacency matrix
[Figure: example graph on vertices 1 to 5]
• Recall that the naïve algorithm for any graph problem uses $O(n^2)$ bits
• Bad news: many graph problems have a lower bound of $\Omega(n^2)$ space in the streaming model
• E.g. does the given graph have any triangle?
• Communication complexity is typically used to show lower bounds for streaming algorithms
• INDEX problem: Alice has a string $X \in \{0,1\}^N$, Bob has an index $i \in [N]$, and they want to find the $i$-th bit of $X$
• Lower bound of $\Omega(N)$ if Alice can send only one message to Bob, even with randomization
• Communication complexity reductions: show that a streaming algorithm would solve INDEX
[Figure: one-way communication from Alice to Bob; example with $X = 10010110$ and $i = 5$]
• Sketch of a simple INDEX reduction for triangle detection:
• Let $N = r^2$
• Alice adds edges between $Y$ and $Z$ according to her string $X$
• Then she sends her data structure to Bob
• Bob has an index $I \in [N]$ corresponding to some $(j, \ell) \in [r] \times [r]$
• Bob adds a new vertex $s$ and the edges $(s, y_j)$ and $(s, z_\ell)$
• The resulting graph has a triangle iff the edge $(y_j, z_\ell)$ is present, i.e., the $I$-th bit of $X$ is 1
[Figure: vertex sets $Y = \{y_1, \ldots, y_r\}$ and $Z = \{z_1, \ldots, z_r\}$, with Bob's vertex $s$ joined to $y_j$ and $z_\ell$]
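The reduction above can be checked on a toy instance. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, indices are 0-based, index $I$ is assumed to map to $(j, \ell)$ via `divmod(I, r)`, and an exact triangle test stands in for the streaming algorithm whose memory Alice would send to Bob.

```python
def alice_edges(x, r):
    """Alice: bit I of x (0-based, I = j*r + l) becomes the edge (y_j, z_l)."""
    return [(("y", I // r), ("z", I % r)) for I in range(r * r) if x[I] == 1]


def bob_edges(I, r):
    """Bob: add a new vertex s joined to y_j and z_l, where (j, l) = divmod(I, r)."""
    j, l = divmod(I, r)
    s = ("s", 0)
    return [(s, ("y", j)), (s, ("z", l))]


def has_triangle(edges):
    """Exact triangle test: some edge whose endpoints share a neighbour."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return any(adj[u] & adj[v] for u, v in edges)


def index_via_triangle(x, I, r):
    """Recover bit I of x from the triangle answer on the combined graph."""
    return int(has_triangle(alice_edges(x, r) + bob_edges(I, r)))
```

For example, with `r = 3` and any 9-bit list `x`, `index_via_triangle(x, I, 3)` agrees with `x[I]` for every `I`, because the only possible triangle is the one through $s$.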
• Bad news: many graph problems require $\Omega(n^2)$ space in the streaming model
• How can we cope with this (space) intractability?
• Fine-grained understanding via parameterized analysis
[Figure labels: BIG Time, BIG Data]
• Feigenbaum et al. [ICALP '04]: finding (the size of) a minimum vertex cover (VC) needs $\Omega(n^2)$ space
• But how much space does $k$-VC need?
• We design a streaming algorithm using $O(k \cdot \log n)$ bits (with $2^k$ passes over the input)
• Essentially, the standard branching FPT algorithm in the streaming model…
• Streaming algorithm for $k$-VC with $O(k \cdot \log n)$ bits and $2^k$ passes
• Consider all $2^k$ binary strings from $\{0,1\}^k$, one in each pass
• The binary search tree has $2^k$ leaves
• Each pass corresponds to a root → leaf path in the tree
• 0 for left branch, and 1 for right branch
• The algorithm only stores the current binary string and the corresponding VC
• Storage is $O(k \cdot \log n)$ bits
• Optimal if you also want to output a VC!
[Figure: branching tree. The root $G$ branches on the edge $e = x_1 y_1$ into $G - x_1$ and $G - y_1$; these branch on $e = x_2 y_2$ and $e = x_3 y_3$ respectively, giving the leaves $G - x_1 - x_2$, $G - x_1 - y_2$, $G - y_1 - x_3$, $G - y_1 - y_3$]
Streaming implementation of the FPT algorithm via iterative compression: a $(k \cdot 2^k)$-pass streaming algorithm for $k$-VC which uses $O(k \cdot \log n)$ bits
Reducing the number of passes: Chitnis et al. [SODA '15] designed a 1-pass streaming algorithm for $k$-VC using $O(k^2 \cdot \log n)$ bits
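The multi-pass branching idea above can be sketched in Python as follows. This is an illustrative reading of the slide, not the authors' implementation: the name `k_vertex_cover_multipass` is hypothetical, and the stream is modelled as a zero-argument callable (an assumption) that returns a fresh iterator over the edges for each pass.

```python
from itertools import product


def k_vertex_cover_multipass(edge_stream, k):
    """Try every binary string of length k, one pass per string.

    Only the current string and the partial cover (at most k vertices)
    are kept between edge arrivals.
    Returns a vertex cover of size <= k as a set, or None if none exists.
    """
    for bits in product((0, 1), repeat=k):
        cover = []          # vertices chosen so far in this pass
        used = 0            # how many bits of the string have been consumed
        ok = True
        for u, v in edge_stream():
            if u in cover or v in cover:
                continue    # edge already covered, nothing to store
            if used == k:
                ok = False  # budget exhausted: this root-to-leaf path fails
                break
            # bit 0 takes the first endpoint (left branch), bit 1 the second
            cover.append(u if bits[used] == 0 else v)
            used += 1
        if ok:
            return set(cover)
    return None
```

A call such as `k_vertex_cover_multipass(lambda: iter(edges), 2)` makes at most $2^2 = 4$ passes over `edges`. If some cover of size at most $k$ exists, the string that always picks an endpoint from that cover succeeds, which is why trying all strings suffices.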
Towards a general theory of (space) parameterized streaming algorithms
• FPS (Fixed-Parameter Streaming): $f(k) \cdot \log n$
• SubPS (sublinear dependence on input $n$): $f(k) \cdot n^{1-\epsilon} \cdot \log n$
• LinPS (linear dependence on input $n$): $f(k) \cdot n \cdot \log n$
• BrutePS (naïvely storing the whole graph): $O(n^2)$
Goal: develop algorithms and lower bounds to categorize graph problems in this hierarchy
We study all problems, not just NP-hard ones!
Problems placed in the figure:
• $k$-Vertex-Cover, $k$-MaxMatching
• 1.5-approx. for MaxMatching on trees
• $k$-Path, $k$-FVS, $k$-Treewidth
• $k$-Girth, $k$-Clique, $k$-Dominating-Set
Towards a general theory of (space) parameterized streaming algorithms
The picture is a bit more complicated: any entry in this landscape is really a 6-tuple
[Problem, Parameter, Approximation Ratio, Type of Stream, Type of Algorithm, # of passes]
• Type of Stream: insertion-only or insertion-deletion
• Type of Algorithm: deterministic or randomized
Tight problems for the class LinPS via simple upper bounds
• $k$-Path: if $|E| \ge k \cdot n$ then there is a $k$-path
• $k$-FVS: if there is an FVS of size $k$ then $|E| \le k \cdot n$
• $k$-Treewidth: if the treewidth is $\le k$ then $|E| \le k \cdot n$
• Store all edges until we see $k \cdot n$ edges
• Hence this needs $O(k \cdot n \cdot \log n)$ bits
• These problems need $\Omega(n \cdot \log n)$ space (for constant $k$)
• Hence, they are not in SubPS
• This rules out any algorithm using space $f(k) \cdot o(n \cdot \log n)$ for any function $f$
[Figure: the hierarchy, with $k$-Path, $k$-FVS and $k$-Treewidth placed in LinPS]
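The "store edges up to a threshold" upper bound above might be sketched as below. This is a hedged illustration only: `buffer_edges` and its `limit` parameter are hypothetical names, and the threshold is left as a parameter because the slide states it only as roughly $k \cdot n$ edges, without fixing constants.

```python
def buffer_edges(edge_stream, limit):
    """Store edges until more than `limit` of them have arrived.

    Returns (stored_edges, overflow). If overflow is True, the stream had
    more than `limit` edges and the answer follows from the edge count alone
    (e.g. a k-path must exist, or no small FVS / treewidth is possible).
    If overflow is False, the whole graph is in memory and can be solved
    offline; each stored edge costs O(log n) bits.
    """
    stored = []
    for edge in edge_stream:
        if len(stored) == limit:
            return stored, True   # one edge too many: stop storing
        stored.append(edge)
    return stored, False
```

With `limit = k * n` the buffer never holds more than $k \cdot n$ edges, matching the $O(k \cdot n \cdot \log n)$ bits on the slide.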
$\Omega(n \cdot \log n)$ bit lower bound for $k$-Path with $k = 6$
• Hardness reduction: a "small" space streaming algorithm for 6-Path ⇒ a 1-way communication protocol for PERMUTATION of "small" cost
• PERMUTATION problem: Alice has a permutation $\delta: [N] \to [N]$ encoded as a bit-string of length $N \cdot \log N$. Bob has an index $I \in [N \cdot \log N]$ and wants to find the $I$-th bit of $\delta$
• Sun and Woodruff [APPROX '15]: this needs $\Omega(N \cdot \log N)$ bits of one-way communication
• Alice adds edges between $Y$ and $Z$ according to the permutation $\delta$
• For each $i \in [N]$ she adds an edge from $y_i$ to $z_{\delta(i)}$
• Bob's index $I \in [N \cdot \log N]$ maps to the $\ell$-th bit of $\delta(j)$ for some $j, \ell$
• Bob adds a new vertex $s$, and the edge $s - y_j$
• Let $S_\ell = \{ z_{\delta(r)} : \text{the } \ell\text{-th bit of } \delta(r) \text{ is one} \}$
• Bob adds a new vertex $t$, and edges from $t$ to each vertex of $S_\ell$
• The resulting graph has a 6-path iff $z_{\delta(j)} \in S_\ell$, i.e., the $I$-th bit of $\delta$ is 1
[Figure: vertex sets $Y = \{y_1, \ldots, y_N\}$ and $Z = \{z_{\delta(1)}, \ldots, z_{\delta(N)}\}$ joined by the matching $y_i - z_{\delta(i)}$, with Bob's vertices $s$ and $t$]
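The PERMUTATION reduction above can be exercised on a small instance. The Python sketch below is illustrative and rests on assumptions not stated on the slide: all names are hypothetical, values are 0-based, $N$ is a power of two with $N \ge 4$ (so every $S_\ell$ has at least two vertices), a "6-path" is taken to mean a simple path on 6 vertices, and a brute-force search replaces the streaming algorithm.

```python
def perm_alice_edges(delta):
    """Alice: edge (y_i, z_delta[i]) for every i (0-based permutation list)."""
    return [(("y", i), ("z", delta[i])) for i in range(len(delta))]


def perm_bob_edges(j, l, n_items):
    """Bob: s joined to y_j; t joined to every z_m whose l-th bit is 1."""
    s, t = ("s", 0), ("t", 0)
    edges = [(s, ("y", j))]
    edges += [(t, ("z", m)) for m in range(n_items) if (m >> l) & 1]
    return edges


def has_path_on(edges, length):
    """Brute-force test for a simple path on `length` vertices."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def extend(v, visited):
        if len(visited) == length:
            return True
        return any(extend(w, visited | {w}) for w in adj[v] if w not in visited)

    return any(extend(v, {v}) for v in adj)


def permutation_bit_via_path(delta, j, l):
    """Recover the l-th bit of delta[j] from the 6-vertex-path answer."""
    edges = perm_alice_edges(delta) + perm_bob_edges(j, l, len(delta))
    return int(has_path_on(edges, 6))
```

When the bit is 1, the path $s, y_j, z_{\delta(j)}, t, z_m, y_{\delta^{-1}(m)}$ has 6 vertices; when it is 0, the component of $s$ has only 3 vertices and the longest path through $t$ has 5.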
Tight problems for the class BrutePS
How do we show that a problem does not belong to the smaller class LinPS?
• Show an $\Omega(n^2)$ bits lower bound for constant $k$
• This rules out any algorithm using space $f(k) \cdot o(n^2)$
• The next slide gives the proof for 3-Girth…
Note that $k$-Girth is polynomial-time solvable, but hard in terms of space!
[Figure: the hierarchy, with $k$-Girth, $k$-Clique and $k$-Dominating-Set placed in BrutePS]
$\Omega(n^2)$ bits lower bound for checking if the girth of a graph is $\le 3$
The INDEX problem requires $\Omega(N)$ bits of one-way communication from Alice to Bob
Alice has a string $X \in \{0,1\}^N$. Bob has an index $I \in [N]$ and wants to find the $I$-th bit of $X$
• Same setup as previously:
• Let $N = r^2$ and fix a bijection $\phi: [N] \to [r] \times [r]$
• Alice adds edges between $Y$ and $Z$ according to the string $X$
• Then she sends her data structure to Bob
• Bob's index $I \in [N]$ corresponds to some $(j, \ell) \in [r] \times [r]$
• Bob adds a new vertex $s$ and the edges $(s, y_j)$ and $(s, z_\ell)$
• The lower bound of $\Omega(N)$ translates to $\Omega(n^2)$ for 3-girth on graphs with $n$ vertices
The resulting graph has a triangle iff the edge $(y_j, z_\ell)$ is present, i.e., the $I$-th bit of $X$ is 1
[Figure: vertex sets $Y = \{y_1, \ldots, y_r\}$ and $Z = \{z_1, \ldots, z_r\}$, with Bob's vertex $s$ joined to $y_j$ and $z_\ell$]
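The translation from $\Omega(N)$ to $\Omega(n^2)$ can be spelled out as a short calculation, assuming (as the figure suggests) that $|Y| = |Z| = r$ and that $s$ is the only extra vertex:

```latex
n = |Y| + |Z| + |\{s\}| = 2r + 1
\quad\Longrightarrow\quad
N = r^2 = \left(\frac{n-1}{2}\right)^2 = \Theta(n^2),
\qquad\text{so}\qquad
\Omega(N) = \Omega(n^2).
```

Since the graph is bipartite between $Y$ and $Z$ apart from $s$, having girth at most 3 is the same as containing the triangle $s, y_j, z_\ell$.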