Scheduling Bags of Non-identical Tasks
Henri Casanova and Matthieu Gallet and Frédéric Vivien
November 13, 2009
The Problem
◮ A master-worker platform
◮ Several bag-of-tasks applications (each application is a collection of similar tasks)
◮ Objective: maximizing the throughput
◮ Bad news: a bag is made of similar but not identical tasks
Presentation outline
Offline Case: Identical Tasks
Offline Case: Tasks With Different Characteristics
Online Case: Tasks With Different Characteristics
Simulations
Offline Case: Identical Tasks
Notation
◮ A master $P_0$ with an output bandwidth of $bw_0$
◮ $n$ workers: $P_1, \dots, P_n$
◮ Processor $P_i$ has
  ◮ a speed of $s_i$
  ◮ an input bandwidth of $bw_i$
◮ $m$ bag-of-tasks applications
◮ Tasks of bag $k$ have
  ◮ a volume of computation of $V_{\mathrm{comp}}^{(k)}$
  ◮ a volume of communication of $V_{\mathrm{comm}}^{(k)}$
◮ Communication model: bounded multi-port with linear communication times
Constraints
1. Cumulative throughput of $T_k$:
\[ \rho^{(k)} = \sum_{1 \le i \le n} \rho_i^{(k)} \]
2. Throughput of $T_k$ proportional to its priority:
\[ \frac{\rho^{(k)}}{\pi_k} = \frac{\rho^{(1)}}{\pi_1} \]
Objective
Maximize $\rho^{(1)}$
Constraints (continued)
3. Constraint on the computation capabilities of worker $P_i$:
\[ \sum_{1 \le k \le m} \rho_i^{(k)} \frac{V_{\mathrm{comp}}^{(k)}}{s_i} \le 1 \]
4. Constraint on the communication capabilities of worker $P_i$:
\[ \sum_{1 \le k \le m} \rho_i^{(k)} \frac{V_{\mathrm{comm}}^{(k)}}{bw_i} \le 1 \]
5. Constraint on the communication capabilities of the master:
\[ \sum_{1 \le i \le n} \sum_{1 \le k \le m} \rho_i^{(k)} \frac{V_{\mathrm{comm}}^{(k)}}{bw_0} \le 1 \]
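Complete Linear Program
Maximize $\rho^{(1)}$ under the constraints
\[
\begin{cases}
\forall k \in [1,m], & \rho^{(k)} = \sum_{1 \le i \le n} \rho_i^{(k)} \\
\forall k \in [1,m], & \dfrac{\rho^{(k)}}{\pi_k} = \dfrac{\rho^{(1)}}{\pi_1} \\
\forall i \in [1,n], & \sum_{1 \le k \le m} \rho_i^{(k)} \dfrac{V_{\mathrm{comp}}^{(k)}}{s_i} \le 1 \\
\forall i \in [1,n], & \sum_{1 \le k \le m} \rho_i^{(k)} \dfrac{V_{\mathrm{comm}}^{(k)}}{bw_i} \le 1 \\
 & \sum_{1 \le i \le n} \sum_{1 \le k \le m} \rho_i^{(k)} \dfrac{V_{\mathrm{comm}}^{(k)}}{bw_0} \le 1
\end{cases}
\]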
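The five constraints can be checked mechanically for any candidate allocation. The sketch below is only an illustration: the function name `check_allocation`, the layout `rho[i][k]`, the tolerance, and all platform values are assumptions, not part of the slides. It verifies feasibility and does not solve the linear program.

```python
def check_allocation(rho, v_comp, v_comm, s, bw, bw0, priority, tol=1e-9):
    """Check a candidate allocation against the five constraints.

    rho[i][k] is the (hypothetical) number of tasks of bag k that worker i
    processes per time unit. Returns (feasible, per-bag throughputs).
    """
    n, m = len(s), len(v_comp)
    # Constraint 1: the throughput of bag k is the sum of worker contributions.
    total = [sum(rho[i][k] for i in range(n)) for k in range(m)]
    # Constraint 2: throughputs are proportional to priorities.
    fair = all(abs(total[k] / priority[k] - total[0] / priority[0]) <= tol
               for k in range(m))
    # Constraint 3: computation load of each worker.
    cpu_ok = all(sum(rho[i][k] * v_comp[k] for k in range(m)) / s[i] <= 1 + tol
                 for i in range(n))
    # Constraint 4: load of each worker's incoming link.
    link_ok = all(sum(rho[i][k] * v_comm[k] for k in range(m)) / bw[i] <= 1 + tol
                  for i in range(n))
    # Constraint 5: load of the master's outgoing link.
    master_ok = sum(rho[i][k] * v_comm[k]
                    for i in range(n) for k in range(m)) / bw0 <= 1 + tol
    return fair and cpu_ok and link_ok and master_ok, total


# Illustrative platform: 2 workers, 2 bags (all values are made up).
s, bw, bw0 = [4.0, 2.0], [2.0, 2.0], 3.0
v_comp, v_comm, priority = [2.0, 1.0], [1.0, 1.0], [1.0, 1.0]
rho = [[1.0, 1.0], [0.5, 0.5]]
feasible, throughput = check_allocation(rho, v_comp, v_comm, s, bw, bw0, priority)

# Doubling every rate overloads the first worker's link, so it must be rejected.
doubled = [[2 * x for x in row] for row in rho]
feasible_doubled, _ = check_allocation(doubled, v_comp, v_comm, s, bw, bw0, priority)
```

In this made-up instance the first worker's link and the master's link are both saturated, so the allocation sits on the boundary of the feasible region.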
Offline Case: Tasks With Different Characteristics
Notation
◮ A master $P_0$ with an output bandwidth of $bw_0$
◮ $n$ workers: $P_1, \dots, P_n$
◮ Processor $P_i$ has
  ◮ a speed of $s_i$
  ◮ an input bandwidth of $bw_i$
◮ $m$ bag-of-tasks applications
◮ Tasks of bag $k$ have
  ◮ a communication volume given by the random variable $X_{\mathrm{comm}}^{(k)}$: the $u$-th instance has a communication volume of $X_{\mathrm{comm}}^{(k)}(u)$, with $\mathrm{min}_{\mathrm{comm}}^{(k)} \le X_{\mathrm{comm}}^{(k)}(u) \le \mathrm{max}_{\mathrm{comm}}^{(k)}$
  ◮ a computation volume given by the random variable $X_{\mathrm{comp}}^{(k)}$: the $u$-th instance has a computation volume of $X_{\mathrm{comp}}^{(k)}(u)$, with $\mathrm{min}_{\mathrm{comp}}^{(k)} \le X_{\mathrm{comp}}^{(k)}(u) \le \mathrm{max}_{\mathrm{comp}}^{(k)}$
◮ Communication model: bounded multi-port with linear communication times
An ε-approximation scheme
Underlying principle: split each application into several virtual applications, within which any two instances differ only slightly in terms of communication and computation volumes.
[Figure: instances of $T_1$, $T_2$, and $T_3$ plotted by computation volume and communication volume.]
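Formal splitting
\[ \gamma_q^{(k)} = (1+\varepsilon)^q \, \mathrm{min}_{\mathrm{comp}}^{(k)}, \quad \text{with } 0 \le q \le Q^{(k)} = 1 + \frac{\ln\left(\mathrm{max}_{\mathrm{comp}}^{(k)} / \mathrm{min}_{\mathrm{comp}}^{(k)}\right)}{\ln(1+\varepsilon)} \]
\[ \delta_r^{(k)} = (1+\varepsilon)^r \, \mathrm{min}_{\mathrm{comm}}^{(k)}, \quad \text{with } 0 \le r \le R^{(k)} = 1 + \frac{\ln\left(\mathrm{max}_{\mathrm{comm}}^{(k)} / \mathrm{min}_{\mathrm{comm}}^{(k)}\right)}{\ln(1+\varepsilon)} \]
Instance $u$ of $T_k$ belongs to
\[ I_{q,r}^{(k)} = \left[\gamma_q^{(k)}, \gamma_{q+1}^{(k)}\right) \times \left[\delta_r^{(k)}, \delta_{r+1}^{(k)}\right) \]
if
◮ $\gamma_q^{(k)} \le X_{\mathrm{comp}}^{(k)}(u) < \gamma_{q+1}^{(k)}$ and
◮ $\delta_r^{(k)} \le X_{\mathrm{comm}}^{(k)}(u) < \delta_{r+1}^{(k)}$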
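The geometric splitting can be sketched in a few lines. This is a minimal illustration for one dimension (computation volumes); the helper names `class_bounds` and `class_index`, the use of `floor` to make $Q^{(k)}$ an integer, and the sample values are assumptions introduced here.

```python
import math
from bisect import bisect_right


def class_bounds(vmin, vmax, eps):
    """Boundaries vmin * (1 + eps)**q for q = 0..Q.

    Assumption: Q = 1 + ln(vmax / vmin) / ln(1 + eps) is rounded down
    to an integer with floor().
    """
    big_q = 1 + math.floor(math.log(vmax / vmin) / math.log(1 + eps))
    return [vmin * (1 + eps) ** q for q in range(big_q + 1)]


def class_index(x, bounds):
    """Index q such that bounds[q] <= x < bounds[q + 1]."""
    if not bounds[0] <= x < bounds[-1]:
        raise ValueError("volume outside the covered range")
    return bisect_right(bounds, x) - 1


# Made-up example: volumes between 1 and 10, eps = 0.5.
comp_bounds = class_bounds(vmin=1.0, vmax=10.0, eps=0.5)
q = class_index(4.0, comp_bounds)  # 4.0 falls in [3.375, 5.0625)
```

Two instances in the same class differ by a factor of at most $1+\varepsilon$, which is what the later over-approximation relies on. The same helpers apply unchanged to communication volumes.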
Virtual applications
◮ Instances of $T_k$ in $I_{q,r}^{(k)}$ define the virtual application $T_{k,q,r}$
◮ $p_{q,r}^{(k)}$: probability that an instance of $T_k$ belongs to virtual application $T_{k,q,r}$:
\[ p_{q,r}^{(k)} = P\left(\gamma_q^{(k)} \le X_{\mathrm{comp}}^{(k)} < \gamma_{q+1}^{(k)} \,;\, \delta_r^{(k)} \le X_{\mathrm{comm}}^{(k)} < \delta_{r+1}^{(k)}\right) \]
\[ \forall k, \quad \sum_{q,r} p_{q,r}^{(k)} = 1 \]
◮ $\rho_{i,q,r}^{(k)}$: contribution of processor $P_i$ to the throughput of virtual application $T_{k,q,r}$
◮ The throughput of virtual application $T_{k,q,r}$ is related to the throughput of $T_k$:
\[ \forall k, \ \forall q < Q^{(k)}, \ \forall r < R^{(k)}, \quad \sum_{1 \le i \le n} \rho_{i,q,r}^{(k)} = p_{q,r}^{(k)} \rho^{(k)} \]
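When the distribution is only available through observed instances, the class probabilities can be approximated by empirical frequencies. The slides treat $p_{q,r}^{(k)}$ as a known probability, so estimating it from a sample is an assumption of this sketch, as are the function name and the data.

```python
from bisect import bisect_right
from collections import Counter


def empirical_class_probabilities(instances, comp_bounds, comm_bounds):
    """Fraction of instances falling into each class (q, r).

    instances is a list of (computation volume, communication volume) pairs;
    the bounds are the increasing class boundaries of each dimension.
    """
    counts = Counter()
    for comp, comm in instances:
        q = bisect_right(comp_bounds, comp) - 1  # comp_bounds[q] <= comp < comp_bounds[q+1]
        r = bisect_right(comm_bounds, comm) - 1  # comm_bounds[r] <= comm < comm_bounds[r+1]
        counts[(q, r)] += 1
    return {cls: c / len(instances) for cls, c in counts.items()}


# Made-up boundaries (eps = 1) and instances of a single bag.
comp_bounds = [1.0, 2.0, 4.0, 8.0]
comm_bounds = [1.0, 2.0, 4.0]
instances = [(1.5, 1.2), (1.7, 1.9), (3.0, 1.1), (5.0, 2.5)]
p = empirical_class_probabilities(instances, comp_bounds, comm_bounds)
```

By construction the frequencies sum to 1, matching the normalization condition on $p_{q,r}^{(k)}$.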
Transposing the constraints
◮ The throughput of $T_k$ is still proportional to its priority:
\[ \forall k \in [1,m], \quad \frac{\rho^{(k)}}{\pi_k} = \frac{\rho^{(1)}}{\pi_1} \]
◮ Constraint on the computation capabilities of worker $P_i$
Problem: we do not know the execution time of instances.
Solution: we (conservatively) over-approximate them, using the upper bound $\gamma_{q+1}^{(k)}$ of each class:
\[ \forall i \in [1,n], \quad \sum_{k=1}^{m} \sum_{q < Q^{(k)}} \sum_{r < R^{(k)}} \rho_{i,q,r}^{(k)} \frac{\gamma_{q+1}^{(k)}}{s_i} \le 1 \]
Transposing the constraints (cont.)
◮ Constraint on the communication capabilities of worker $P_i$:
\[ \forall i \in [1,n], \quad \sum_{k=1}^{m} \sum_{q < Q^{(k)}} \sum_{r < R^{(k)}} \rho_{i,q,r}^{(k)} \frac{\delta_{r+1}^{(k)}}{bw_i} \le 1 \]
◮ Constraint on the communication capabilities of the master:
\[ \sum_{i=1}^{n} \sum_{k=1}^{m} \sum_{q < Q^{(k)}} \sum_{r < R^{(k)}} \rho_{i,q,r}^{(k)} \frac{\delta_{r+1}^{(k)}}{bw_0} \le 1 \]
New linear program
Maximize $\rho = \rho^{(1)}$ under the constraints
\[
\begin{cases}
\forall k \in [1,m], \ \forall q < Q^{(k)}, \ \forall r < R^{(k)}, & \sum_{i=1}^{n} \rho_{i,q,r}^{(k)} = p_{q,r}^{(k)} \rho^{(k)} \\
\forall k \in [1,m], & \dfrac{\rho^{(k)}}{\pi_k} = \dfrac{\rho^{(1)}}{\pi_1} \\
\forall i \in [1,n], & \sum_{k=1}^{m} \sum_{q < Q^{(k)}} \sum_{r < R^{(k)}} \rho_{i,q,r}^{(k)} \dfrac{\gamma_{q+1}^{(k)}}{s_i} \le 1 \\
\forall i \in [1,n], & \sum_{k=1}^{m} \sum_{q < Q^{(k)}} \sum_{r < R^{(k)}} \rho_{i,q,r}^{(k)} \dfrac{\delta_{r+1}^{(k)}}{bw_i} \le 1 \\
 & \sum_{i=1}^{n} \sum_{k=1}^{m} \sum_{q < Q^{(k)}} \sum_{r < R^{(k)}} \rho_{i,q,r}^{(k)} \dfrac{\delta_{r+1}^{(k)}}{bw_0} \le 1
\end{cases}
\]
Performance
Theorem. An optimal solution of the linear program describes a solution with a throughput $\rho$ larger than $\rho^* / (1+\varepsilon)$, where $\rho^*$ is the optimal throughput.
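One way to see why a factor of $1+\varepsilon$ is plausible is sketched below. This is an informal argument reconstructed from the class boundaries, not the authors' proof. The notation $\rho_{i,q,r}^{*(k)}$ (the rate at which worker $P_i$ processes class-$(q,r)$ instances of $T_k$ in an optimal schedule) is introduced here for illustration.

```latex
% Every instance of class (q, r) has a computation volume of at least
% \gamma_q^{(k)} = \gamma_{q+1}^{(k)} / (1+\varepsilon), so the true
% computation load of an optimal schedule on worker P_i satisfies
\[
  \sum_{k,q,r} \rho_{i,q,r}^{*(k)} \frac{\gamma_q^{(k)}}{s_i} \le 1 .
\]
% Scaling all optimal rates by 1/(1+\varepsilon) therefore satisfies the
% over-approximated computation constraint of the linear program:
\[
  \sum_{k,q,r} \frac{\rho_{i,q,r}^{*(k)}}{1+\varepsilon}
      \cdot \frac{\gamma_{q+1}^{(k)}}{s_i}
  = \sum_{k,q,r} \rho_{i,q,r}^{*(k)} \frac{\gamma_q^{(k)}}{s_i}
  \le 1 .
\]
% The same scaling handles both communication constraints with
% \delta_{r+1}^{(k)} = (1+\varepsilon)\,\delta_r^{(k)}, so the scaled
% solution is feasible and has throughput \rho^* / (1+\varepsilon).
```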
Online Case: Tasks With Different Characteristics
Aim
◮ Non-clairvoyant about computation volumes
◮ Communication volumes can be supposed to be known
◮ Underlying distributions are unknown
Is there any hope?
Case with dominant computations
Theorem. The On-Demand policy is asymptotically optimal when
◮ Computations are always dominant:
\[ \forall i \in [1,n], \quad \min_{k,u} \frac{X_{\mathrm{comp}}^{(k)}(u)}{s_i} \ge \max_{k',u'} \frac{X_{\mathrm{comm}}^{(k')}(u')}{bw_i} \]
◮ The master's bandwidth is not constraining:
\[ bw_0 \ge \sum_{i=1}^{n} bw_i \]
◮ Each worker has a limited number of buffers ($\in [2, n_{\mathrm{buffers}}]$)
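The first two hypotheses of the theorem are simple to test on a concrete platform. The sketch below assumes the global bounds on volumes are known; the function names and the numbers are illustrative only.

```python
def computations_dominate(min_comp, max_comm, s, bw):
    """True if, on every worker, the shortest computation takes at least
    as long as the longest communication.

    min_comp: smallest computation volume over all bags and instances.
    max_comm: largest communication volume over all bags and instances.
    """
    return all(min_comp / s_i >= max_comm / bw_i for s_i, bw_i in zip(s, bw))


def master_not_constraining(bw0, bw):
    """True if the master can feed all workers at full bandwidth at once."""
    return bw0 >= sum(bw)


# Made-up platform with two workers.
s, bw = [2.0, 5.0], [1.0, 4.0]
dominant = computations_dominate(min_comp=10.0, max_comm=2.0, s=s, bw=bw)
master_ok = master_not_constraining(bw0=5.0, bw=bw)
master_too_slow = master_not_constraining(bw0=4.0, bw=bw)
```

Here the slowest case is the second worker, where the shortest computation takes 2 time units against 0.5 for the longest communication, so the condition holds.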
Principle of the proof (1/3)
Notation
◮ $\Gamma$: worst computation time
◮ $\Delta$: worst communication time
◮ $R_i$: computation volume allocated to worker $P_i$
◮ $T_i$: completion time of worker $P_i$
We consider the scheduling of $N$ tasks.
Principle of the proof (2/3)
◮ $t$: time at which the first worker completes its work
◮ Makespan:
\[ \mathrm{Makespan} = \max_i T_i \le t + (b+1)\Gamma \]
\[ \mathrm{Makespan} - (b+1)\Gamma \le t \qquad (1) \]
◮ Dominating computations:
\[ t \le T_i \le \Delta + \frac{R_i}{s_i} \qquad (2) \]
◮ Combining Equations 1 and 2:
\[ \mathrm{Makespan} - (b+1)\Gamma \le \Delta + \frac{R_i}{s_i} \]
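Principle of the proof (3/3)
◮ Combining Equations 1 and 2:
\[ \mathrm{Makespan} - (b+1)\Gamma \le \Delta + \frac{R_i}{s_i} \]
◮ By summation:
\[ \Big(\sum_i s_i\Big)\big(\mathrm{Makespan} - (b+1)\Gamma\big) \le \Big(\sum_i s_i\Big)\Delta + \sum_i R_i \]
◮ Trivial bound:
\[ \frac{\sum_i R_i}{\sum_i s_i} \le \mathrm{Makespan}_{\mathrm{opt}} \]
◮ Asymptotic optimality:
\[ \mathrm{Makespan}_{\mathrm{opt}} - (b+1)\Gamma \le \mathrm{Makespan} - (b+1)\Gamma \le \Delta + \mathrm{Makespan}_{\mathrm{opt}} \]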
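The last chain of inequalities can be turned into a bound on the ratio to the optimum. The step below is a short worked rearrangement, assuming that $b$, $\Gamma$, and $\Delta$ do not grow with the number of tasks $N$. The slides do not define $b$ explicitly, so treating it as a constant is an assumption.

```latex
% Rearranging the right-hand inequality:
\[
  \mathrm{Makespan} \le \mathrm{Makespan}_{\mathrm{opt}} + \Delta + (b+1)\Gamma .
\]
% Dividing by Makespan_opt, and using Makespan_opt <= Makespan:
\[
  1 \le \frac{\mathrm{Makespan}}{\mathrm{Makespan}_{\mathrm{opt}}}
    \le 1 + \frac{\Delta + (b+1)\Gamma}{\mathrm{Makespan}_{\mathrm{opt}}} .
\]
% Makespan_opt >= (\sum_i R_i) / (\sum_i s_i) grows with N because the
% computation volumes are bounded below, while the numerator stays
% constant, so the ratio tends to 1 as N goes to infinity.
```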
Case with dominant computations (extension)
Theorem. The On-Demand policy is asymptotically optimal when
◮ Processor $P_i$ is always granted at least a fraction $\alpha_i$ of its input bandwidth when it requests data
◮ Computations are always dominant:
\[ \forall i \in [1,n], \quad \min_{k,u} \frac{X_{\mathrm{comp}}^{(k)}(u)}{s_i} \ge \max_{k',u'} \frac{X_{\mathrm{comm}}^{(k')}(u')}{\alpha_i \, bw_i} \]
◮ Each worker has a limited number of buffers ($\in [2, n_{\mathrm{buffers}}]$)
Case with infinite buffers
Theorem. On-Demand has no constant competitive ratio.
◮ 1 application with $N$ tasks, unitary communication and computation volumes, and a master's bandwidth that is not constraining
◮ $bw_1 = 2N$; $bw_2 = \dots = bw_n = 1$
◮ $s_1 = \frac{1}{2(n-1)N}$; $s_2 = \dots = s_n = 1$
◮ Possible schedule: ignore worker $P_1$:
\[ \mathrm{Makespan}_{\mathrm{opt}} \le \left\lceil \frac{N}{n-1} \right\rceil + 1 \]
◮ Solution of On-Demand: 1 task each for $P_2, \dots, P_n$, and $N-(n-1)$ tasks for $P_1$:
\[ \mathrm{Makespan}_{\text{On-Demand}} \ge \frac{N-(n-1)}{s_1} \ge N \times \mathrm{Makespan}_{\mathrm{opt}} \quad (\text{for } N \ge 4n) \]
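The gap in this counterexample can be checked numerically. The sketch below assumes one particular reading of the slide's parameters: $P_1$ has a fast link ($bw_1 = 2N$) but a very slow processor ($s_1 = 1/(2(n-1)N)$), so that processing a task takes $1/s_1$ time units. That reading, the function name, and the sample values are assumptions.

```python
from fractions import Fraction


def counterexample_bounds(N, n):
    """Return (upper bound on the optimal makespan,
               lower bound on the On-Demand makespan)."""
    s1 = Fraction(1, 2 * (n - 1) * N)  # assumed speed of the slow worker P_1
    # Ignoring P_1: n - 1 unit-speed workers share N unit tasks, plus one
    # time unit for the first communication. -(-a // b) is ceil(a / b).
    opt_upper = -(-N // (n - 1)) + 1
    # On-Demand hands N - (n - 1) unit tasks to P_1, which computes at speed s1.
    on_demand_lower = (N - (n - 1)) / s1
    return opt_upper, on_demand_lower


# Made-up instance satisfying N >= 4n.
N, n = 40, 5
opt_upper, on_demand_lower = counterexample_bounds(N, n)
gap_holds = on_demand_lower >= N * opt_upper
```

Under this reading the ratio between the two makespans grows at least linearly with $N$, which is why no constant competitive ratio can hold.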
Case with dominant communications
Theorem. The On-Demand policy is asymptotically optimal when
◮ Communications are always dominant:
\[ \forall i \in [1,n], \quad \max_{k,u} \frac{X_{\mathrm{comp}}^{(k)}(u)}{s_i} \le \min_{k',u'} \frac{X_{\mathrm{comm}}^{(k')}(u')}{bw_i} \]
◮ Each worker has a limited number of buffers ($\in [2, n_{\mathrm{buffers}}]$)