First Idea: Sketches

[Diagram: the sketch is the matrix-vector product t = Zf, where Z is a k × n matrix, f = (f_1, ..., f_n), and t = (t_1, ..., t_k).]

• Algorithm uses a (random) projection matrix Z such that the relevant properties of f can be estimated from the sketch Zf.
• Easy to Update: on seeing "i", add the i-th column of Z to the sketch (sketched in Python below).
• Store Matrix Implicitly: need to be able to efficiently generate any entry of Z from a "small" random seed.
• Gives an Õ(k)-space algorithm, with seed and precision assumptions.
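A minimal Python sketch of the update rule, under the assumption of a toy ±1 projection matrix whose columns are regenerated on demand from a seed (the helper names sketch_column and sketch_stream are illustrative, not from the slides). It checks that streaming updates give the same vector as computing Zf directly, which is exactly the linearity the sketch relies on.

```python
import numpy as np

def sketch_column(seed, k, i):
    """Regenerate column i of a k x n +/-1 matrix Z from a small seed.
    (Illustrative only: a real sketch would use a k-wise independent
    hash family rather than reseeding a generator per column.)"""
    rng = np.random.default_rng((seed, i))
    return rng.choice([-1, 1], size=k)

def sketch_stream(stream, k=50, seed=0):
    """Maintain the sketch Zf: on seeing item i, add column i of Z."""
    t = np.zeros(k)
    for i in stream:
        t += sketch_column(seed, k, i)
    return t

# The sketch is linear, so it equals Z @ f computed in one shot.
stream = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
n = 10
f = np.bincount(stream, minlength=n)
Z = np.column_stack([sketch_column(0, 50, i) for i in range(n)])
assert np.allclose(sketch_stream(stream), Z @ f)
```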
Algorithm for Estimating F_2

Consider a row z of the projection matrix. Let the entries of z be uniform in {-1, +1}, chosen with 4-wise independence, and let t = z·f. The square of each entry of the sketch is concentrated around F_2:

Expectation: E[t^2] = Σ_{i,j} E[z_i z_j] f_i f_j = F_2
Variance: Var[t^2] ≤ Σ_{i,j,k,l} E[z_i z_j z_k z_l] f_i f_j f_k f_l < 6 F_2^2

By Chebyshev, averaging k = O(ε^{-2}) squared entries gives a (1±ε)-estimate of F_2 with constant probability; the standard median trick over O(log δ^{-1}) repetitions boosts this to probability 1-δ.
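A hedged Python sketch of this estimator, assuming 4-wise independence is obtained from a random degree-3 polynomial over a large prime field and the ±1 sign is taken from the hash value's low bit (a close approximation, with bias on the order of 1/P). The function names are illustrative; k counters are averaged as on the slide.

```python
import random
from collections import Counter

P = 2**61 - 1  # a Mersenne prime; item ids are assumed to lie in [0, P)

def four_wise_sign(coeffs, i):
    """Approximately 4-wise independent +/-1 sign for item i: evaluate a
    random degree-3 polynomial over GF(P) and take the low bit."""
    a, b, c, d = coeffs
    h = (((a * i + b) * i + c) * i + d) % P
    return 1 if h & 1 else -1

def f2_estimate(stream, k):
    """AMS estimator for F2: k rows z_r with 4-wise independent +/-1 entries,
    counters t_r = z_r . f, output the average of the squared counters."""
    seeds = [tuple(random.randrange(P) for _ in range(4)) for _ in range(k)]
    t = [0] * k
    for i in stream:                   # on seeing i, add z_r[i] to counter r
        for r in range(k):
            t[r] += four_wise_sign(seeds[r], i)
    return sum(x * x for x in t) / k

stream = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
exact = sum(c * c for c in Counter(stream).values())
print(f2_estimate(stream, k=400), "vs exact F2 =", exact)
```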
Second Idea: Sampling

• Let's sample from S = [a_1, a_2, a_3, ..., a_m] where each a_i ∈ [n].
• Distribution Sampling: return i with probability f_i/m.
• Universe Sampling: return (i, f_i) where i ∈_R [n].
• AMS Sampling: return (i, r) with i chosen with probability f_i/m and r ∈_R [f_i].
  • Sample a_j for j ∈_R [m], let i = a_j, and compute r = |{j' ≥ j : a_j' = a_j}|.
  • Useful for estimating Σ_i g(f_i) because E[m(g(r) - g(r-1))] = Σ_i g(f_i) (see the sketch after this slide).
• L_p Sampling: return i with probability f_i^p/F_p.
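A small Python sketch of AMS sampling as described above, assuming the uniformly random position j is maintained with reservoir sampling in a single pass; the function names and the number of samples are illustrative choices, not from the slides.

```python
import random
from collections import Counter

def ams_sample(stream):
    """Return (i, r, m): i = a_j for a uniformly random position j (kept by a
    reservoir), and r = |{j' >= j : a_j' = a_j}|."""
    item, r, m = None, 0, 0
    for m, a in enumerate(stream, start=1):
        if random.random() < 1.0 / m:   # position m becomes the sample w.p. 1/m
            item, r = a, 1
        elif a == item:
            r += 1
    return item, r, m

def estimate_sum_g(stream, g, samples=2000):
    """Estimate sum_i g(f_i) via E[m * (g(r) - g(r-1))] = sum_i g(f_i)."""
    total = 0.0
    for _ in range(samples):
        _, r, m = ams_sample(stream)
        total += m * (g(r) - g(r - 1))
    return total / samples

stream = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
g = lambda x: x * x                      # g(f) = f^2 recovers F_2
exact = sum(g(c) for c in Counter(stream).values())
print(estimate_sum_g(stream, g), "vs exact", exact)
```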
L_0 Sampling

Suppose we know F_0. Pick a hash function h: [n] → [F_0].

Algorithm: Maintain values c and id, initially 0.
  For each j in the stream: if h(j) = 1, then c ← c+1 and id ← id+j.
  Return id/c if all elements hashing to 1 were the same.

Claim: This happens with constant probability.
Claim: We need to check that all elements hashing to 1 were the same.

Run O(log n) copies, guessing F_0 = 2^i. At least one instantiation works with constant probability. The algorithm is a sketch and works with deletions!
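A toy Python version of the L_0 sampler, running one level per guess F_0 = 2^i over (item, delta) updates so deletions are supported. It assumes a simple illustrative hash and substitutes an id^2 = id2·c test (valid under non-negative net frequencies) for the consistency check the slide alludes to; a real sketch would use a pairwise independent hash family and a proper fingerprint.

```python
import random

def make_hash(buckets, seed):
    """Toy hash h: [n] -> [buckets] (a real sketch would use a pairwise
    independent family; this is only for illustration)."""
    rnd = random.Random(seed)
    a, b = rnd.randrange(1, 2**31 - 1), rnd.randrange(2**31 - 1)
    return lambda j: ((a * j + b) % (2**31 - 1)) % buckets + 1

def l0_sample(updates, n, seed=0):
    """One level per guess F0 = 2^i; a level keeps c = sum of deltas hashing
    to bucket 1, id = sum of j*delta, and id2 = sum of j^2*delta."""
    levels = [{"h": make_hash(2**i, seed * 1000003 + i), "c": 0, "id": 0, "id2": 0}
              for i in range(n.bit_length() + 1)]
    for j, delta in updates:
        for lv in levels:
            if lv["h"](j) == 1:
                lv["c"] += delta
                lv["id"] += j * delta
                lv["id2"] += j * j * delta
    for lv in reversed(levels):      # most buckets (fewest collisions) first
        # Accept a level if the counters are consistent with exactly one
        # surviving item: id^2 == id2 * c holds iff all contributions came
        # from a single id (assuming non-negative net frequencies).
        if lv["c"] > 0 and lv["id"] % lv["c"] == 0 and lv["id"] ** 2 == lv["id2"] * lv["c"]:
            return lv["id"] // lv["c"]
    return None

updates = [(3, 1), (7, 1), (3, -1), (5, 1)]   # item 3 is inserted then deleted
print(l0_sample(updates, n=8))                # returns 5 or 7 (rarely None)
```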
Third Idea: Lower Bounds

• Many space lower bounds in the data stream model use reductions from communication complexity.
• Example: Alice and Bob have x, y ∈ {0,1}^n and Bob wants to check DISJOINTNESS, i.e., is there an i with x_i = y_i = 1?
• Thm: Any 1/3-error protocol for DISJOINTNESS requires Ω(n) bits of communication.
• Corollary: Any 1/3-error stream algorithm that checks whether a graph is triangle-free needs Ω(n^2) bits of memory.
Lower Bound for Triangle Detection

Alice and Bob have X, Y ∈ {0,1}^{n×n}. For Bob to check whether X_ij = Y_ij = 1 for some i, j requires Ω(n^2) communication.

Let A be an s-space algorithm that checks for triangles. Consider a 3-layer graph on (U, V, W) with |U| = |V| = |W| = n.
  Alice runs A on E_1 = {u_i w_i : 1 ≤ i ≤ n} and E_2 = {u_i v_j : X_ij = 1}.
  She sends the memory state to Bob, who continues running A on E_3 = {v_j w_i : Y_ij = 1}.

The resulting graph contains a triangle {u_i, v_j, w_i} iff X_ij = Y_ij = 1, so the output of A resolves the matrix question and hence s = Ω(n^2).
Useful Communication Results

• Indexing:
  • Alice has x ∈ {0,1}^n, Bob has i ∈ [n]. Bob wants to learn x_i.
  • One-way communication requires Ω(n) bits, even if Bob also knows the first i-1 bits of x.
• Gap-Hamming:
  • Alice and Bob have x, y ∈ {0,1}^n. Distinguish Δ(x,y) < n/2 - √n from Δ(x,y) > n/2 + √n.
  • Requires Ω(n) communication.
• Multi-Party Disjointness:
  • t players have x_1, x_2, ..., x_t ∈ {0,1}^n. Distinguish the case x_1i = x_2i = ... = x_ti = 1 for some i from the case where all the vectors are orthogonal.
  • Requires Ω(n/t) communication.
Bonus! The Fourth Idea

• Algorithmic tools will only get you so far; sometimes you need to come up with neat ad hoc solutions.
• Graph Distances: Given a stream of edges, approximate the shortest-path distance between any two nodes.
• k-Center: Given a stream of points, find a set of centers that minimizes the maximum distance from a point to its nearest center.
Approximate Distances

The edges define the shortest-path graph metric d_G. An α-spanner of G = (V, E) is a subgraph H = (V, E′) such that for all u, v: d_G(u,v) ≤ d_H(u,v) ≤ α·d_G(u,v).

Algorithm: Let E′ be initially empty. Add (u,v) to E′ if d_H(u,v) > 2t-1.

Analysis: Each distance increases by at most a factor of 2t-1, and |E′| = O(n^{1+1/t}) because every cycle in H has length > 2t (a graph with girth > 2t has O(n^{1+1/t}) edges).
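A minimal Python sketch of this spanner construction, assuming unweighted edges and a bounded-depth BFS to test whether d_H(u,v) > 2t-1; the function names are illustrative.

```python
from collections import deque

def bfs_dist(adj, u, v, limit):
    """Hop distance from u to v in the current spanner H, exploring at most
    'limit' BFS levels; returns limit + 1 if v is not reached."""
    if u == v:
        return 0
    seen, frontier, d = {u}, deque([u]), 0
    while frontier and d < limit:
        d += 1
        for _ in range(len(frontier)):   # expand exactly one BFS level
            x = frontier.popleft()
            for y in adj.get(x, ()):
                if y == v:
                    return d
                if y not in seen:
                    seen.add(y)
                    frontier.append(y)
    return limit + 1

def streaming_spanner(edge_stream, t):
    """Build a (2t-1)-spanner: keep edge (u, v) only if u and v are currently
    more than 2t-1 apart in H."""
    adj = {}
    for u, v in edge_stream:
        if bfs_dist(adj, u, v, 2 * t - 1) > 2 * t - 1:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return sorted((u, v) for u in adj for v in adj[u] if u < v)

edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3), (2, 4)]
print(streaming_spanner(edges, t=2))   # 3-spanner: keeps only the path 1-2-3-4
```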
k-Center Clustering

2-approximation in O(k) space if you already know OPT.
(2+ε)-approximation in O(k ε^{-1} log Δ) space if 1 ≤ OPT ≤ Δ.

Better Algorithm, O(k ε^{-1} log ε^{-1}) space: Instantiate the basic algorithm with guesses 1, (1+ε), (1+ε)^2, ..., 2ε^{-1}.

If guess r stops working at the (j+1)-th point: let q_1, ..., q_k be the centers chosen so far. Then p_1, ..., p_j are all within 2r of some q_i, so the optimum for {q_1, ..., q_k, p_{j+1}, ..., p_n} is at most OPT + 2r.
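A hedged Python sketch of the guessing scheme, implementing the basic (2+ε)-approximation with guesses 1, (1+ε), (1+ε)^2, ... ≤ Δ run in parallel over a single pass (not the improved restart-based algorithm above). It assumes Euclidean points, a known upper bound Δ, and illustrative function names: a guess opens a new center whenever a point is more than 2r from all current centers, and dies once it needs more than k centers.

```python
import math

def kcenter_stream(points, k, eps, delta, dist):
    """One pass over the stream, one greedy instance per guess r = (1+eps)^i
    with 1 <= r <= delta; return the centers of the smallest surviving guess."""
    guesses, r = [], 1.0
    while r <= delta * (1 + eps):
        guesses.append({"r": r, "centers": [], "alive": True})
        r *= 1 + eps
    for p in points:                      # single pass over the stream
        for g in guesses:
            if not g["alive"]:
                continue
            if all(dist(p, c) > 2 * g["r"] for c in g["centers"]):
                g["centers"].append(p)
                if len(g["centers"]) > k:
                    g["alive"] = False    # guess r was too small; discard it
    for g in guesses:                     # guesses are in increasing order of r
        if g["alive"]:
            return g["r"], g["centers"]
    return None

pts = [(0, 0), (10, 0), (0, 10), (1, 1), (9, 1), (1, 9), (5, 5)]
print(kcenter_stream(pts, k=3, eps=0.1, delta=20, dist=math.dist))
```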