Streaming Algorithms for Bin Packing and Vector Scheduling
Graham Cormode and Pavel Veselý
University of Warwick
WAOA 2019, Munich
Powered by BeamerikZ
First WAOA talk containing “streaming” . . .
. . . but not the first one on “data streams”
Pavel Veselý, Streaming Algs. for Bin Packing and Vector Scheduling, 1 / 13
Overview
Connecting Big Data Algorithms & Combinatorial Optimization
This talk’s focus: streaming algorithms for packing and scheduling
Streaming Model of Computation
• One pass over data w/ limited memory
Streaming Algorithm
• receives data in a stream, item by item
• uses memory sublinear in N = stream length
• at the end, computes approximate answer
Note: cannot output a packing / schedule ⇒ estimate optimal cost (+ output template of a solution)
Challenges:
• N very large
• Stream ordered arbitrarily
• No random access to data
Streaming ≠ online: no need to make online decisions about the solution
Trade-off: space vs. accuracy of the estimate
How to summarize the input?
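To make the model concrete, here is a minimal sketch of a one-pass, constant-memory summary for bin packing. It is an illustration only, not the algorithm from the talk: the class name TotalSizeSummary and its methods are hypothetical. It relies on two standard facts: OPT is at least the total item size rounded up, and Next Fit uses fewer than 2 · (total size) + 1 bins. Sizes are assumed to be exact rationals, so that rounding up is not affected by floating-point error.

```python
import math
from fractions import Fraction


class TotalSizeSummary:
    """Hypothetical one-pass summary: stores only the item count and total size."""

    def __init__(self):
        self.count = 0
        self.total = Fraction(0)

    def update(self, size):
        """Process the next item of the stream."""
        size = Fraction(size)
        if not 0 <= size <= 1:
            raise ValueError("item sizes must lie in [0, 1]")
        self.count += 1
        self.total += size

    def bounds(self):
        """Return (lower, upper) with lower <= OPT <= upper."""
        if self.count == 0:
            return 0, 0
        # Every bin holds total size at most 1, so OPT >= ceil(total).
        lower = max(1, math.ceil(self.total))
        # Next Fit opens fewer than 2 * total + 1 bins, so OPT <= ceil(2 * total).
        upper = max(lower, math.ceil(2 * self.total))
        return lower, upper


summary = TotalSizeSummary()
for item in [Fraction(1, 2), Fraction(1, 2), Fraction(3, 4), Fraction(1, 4)]:
    summary.update(item)
print(summary.bounds())
```

The summary never stores the items themselves, which is why it can only estimate the optimal cost (here within a factor of about 2) rather than output a packing. Better accuracy requires a richer summary, which is the space vs. accuracy trade-off named on the slide.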
Streaming Algorithms known for . . .
• most frequent items,
• # of distinct items,
• approximate median = .5-quantile,
• or any φ-quantile for φ ∈ [0, 1],
  • = φ · N-th largest item,
• approx. cumulative distribution function, cdf_A(x) = |{a ∈ A | a ≤ x}| / N
  [Figure: plot of the cdf with values from 0 to 1 and an error margin ε]
• some graph problems,
• submodular maximization,
• . . .
What about other basic problems in combinatorial optimization?
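The quantile and cdf notions in the list above can be pinned down with a short offline sketch. This is not a streaming algorithm: it stores all items and only spells out the definitions. The function names are hypothetical, and the rounding convention (the ⌈φ · N⌉-th largest item, with an allowed rank error of ε · N) is an assumption; sources differ on such details.

```python
import math
from fractions import Fraction


def cdf(items, x):
    """Fraction of items that are at most x."""
    return sum(1 for a in items if a <= x) / len(items)


def phi_quantile(items, phi):
    """Exact phi-quantile: the ceil(phi * N)-th largest item (at least the 1st)."""
    ordered = sorted(items, reverse=True)
    k = max(1, math.ceil(phi * len(ordered)))
    return ordered[k - 1]


def is_eps_approx_quantile(items, phi, eps, candidate):
    """True if some rank of candidate (in decreasing order) is within eps * N of phi * N."""
    n = len(items)
    first_rank = sum(1 for a in items if a > candidate) + 1
    last_rank = sum(1 for a in items if a >= candidate)
    if last_rank < first_rank:
        return False  # candidate does not occur among the items
    lo, hi = (phi - eps) * n, (phi + eps) * n
    return first_rank <= hi and last_rank >= lo


items = list(range(1, 11))
half, tenth = Fraction(1, 2), Fraction(1, 10)
print(phi_quantile(items, half))
print(is_eps_approx_quantile(items, half, tenth, 7))
print(cdf(items, 5))
```

A streaming quantile summary aims to answer such queries within the ε rank error while keeping far fewer than N items in memory.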
Our Results: Streaming Algorithms for . . .
Bin Packing:
• Input: items of size in [0, 1]
• Goal: pack into min. number of bins of capacity 1
• Offline: OPT + O(log OPT) bins in poly-time [Hoberg, Rothvoss ’17]
Streaming Algorithm
(1 + ε)-approximation in space Õ(1/ε)
Essentially best possible
Makespan Scheduling
• Input: jobs with processing time
• Goal: assign jobs to machines to minimize makespan = maximum load over all machines
• (1 + ε)-approximation (rounding & DP)
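The rounding step mentioned for makespan scheduling can be sketched as follows. This shows one common rounding scheme, geometric rounding of processing times down to powers of 1 + ε, and is not necessarily the scheme used in the talk; the dynamic program is omitted. The names size_class and summarize_jobs are hypothetical. Rounding down changes each processing time by a factor below 1 + ε, so the optimal makespan of the rounded instance is within that factor of the original optimum. Note that the number of counters depends on the ratio between the largest and smallest processing time.

```python
import math
from collections import Counter


def size_class(p, eps):
    """Largest integer k with (1 + eps)**k <= p, for p > 0 and eps > 0."""
    if p <= 0 or eps <= 0:
        raise ValueError("p and eps must be positive")
    base = 1.0 + eps
    k = math.floor(math.log(p) / math.log(base))
    # Correct possible floating-point error near class boundaries.
    while base ** (k + 1) <= p:
        k += 1
    while base ** k > p:
        k -= 1
    return k


def summarize_jobs(stream, eps):
    """One pass over the jobs: count how many fall into each size class."""
    counts = Counter()
    for p in stream:
        counts[size_class(p, eps)] += 1
    return counts


summary = summarize_jobs([1, 2, 3, 4, 5, 8], eps=1.0)
for k in sorted(summary):
    print(k, summary[k])
```

The summary stores one counter per non-empty size class instead of one entry per job, so an offline method can later be run on the rounded instance.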
Vector Scheduling:
• Input: jobs characterized by d-dim. vectors
  • e.g.: processing time, memory or bandwidth requirements, etc.
• Goal: assign jobs to m identical machines to minimize makespan = maximum load over all machines and dimensions
Streaming Algorithm
2-approximation in space Õ(d² · m³)
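As a small illustration of the vector scheduling objective, the sketch below computes a simple lower bound on the optimal makespan in one pass using O(d) memory. It is not the 2-approximation from the talk: the function name makespan_lower_bound is hypothetical, loads are assumed non-negative, and for d > 1 this bound is not claimed to be within any constant factor of the optimum. It uses two facts that hold for every schedule: some machine carries at least a 1/m share of the total load in each dimension, and every job must be placed somewhere in full.

```python
def makespan_lower_bound(stream, m, d):
    """Lower bound on the optimal makespan of d-dimensional jobs on m machines."""
    if m <= 0 or d <= 0:
        raise ValueError("m and d must be positive")
    totals = [0] * d   # total load per dimension
    largest = 0        # largest single coordinate seen so far
    for job in stream:
        if len(job) != d:
            raise ValueError("each job must have exactly d coordinates")
        for j, load in enumerate(job):
            totals[j] += load
            largest = max(largest, load)
    # Average load per machine in the heaviest dimension, or the largest coordinate.
    return max(largest, max(totals) / m)


print(makespan_lower_bound([(3, 1), (1, 1)], m=2, d=2))
print(makespan_lower_bound([(2, 2)] * 4, m=2, d=2))
```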