
A Genetic Algorithm with Communication Costs to Schedule Workflows on a SOA-Grid



  1. A Genetic Algorithm with Communication Costs to Schedule Workflows on a SOA-Grid
     Laurent PHILIPPE
     Co-authors: Lamiel Toch and Jean-Marc Nicod
     Laboratoire d’Informatique de Franche-Comté, Université de Franche-Comté, Besançon
     HETEROPAR - August 2011
     Laurent PHILIPPE, Genetic Algorithm to Schedule Workflows, 1 / 28

  2. Workflow applications
     - Combine several applications or application modules
     - Precedence constraints (files)
     - Application domains: Astronomy, Bioinformatics, Chemistry, Climate Modeling, Computer Science, Image Processing, etc.
     Batch processing
     - Collection of workflows

  3. SOA Grids
     - Provide access to applications
     - Execution on clusters
     - Simple access for scientists
     - Tools: DIET or NINF-G

  4. Contents: 1. Context, 2. GA Scheduling, 3. Simulation, 4. General DAGs, 5. Identical Intrees

  5. Framework model
     Applicative framework
     - Collection $B = \{J_j,\ 1 \le j \le N\}$ of $N$ workflows to schedule
     - Workflow $J_j$ is represented by a DAG $J_j = (\mathcal{T}_j, D_j)$
     - $\mathcal{T}_j = \{T^j_1, \dots, T^j_{n_j}\}$: the tasks
     - $D_j$: the precedence constraints
     - $F^j_{k,i}$ is the file sent between $T^j_k$ and $T^j_i$ when $(T^j_k, T^j_i) \in D_j$
     - $\mathcal{T} = \bigcup_{j=1}^{N} \mathcal{T}_j = \{T^j_{i_j},\ 1 \le i_j \le n_j \text{ and } 1 \le j \le N\}$: the set of tasks to schedule
     - Typed tasks: $t(i, j)$ is the type of task $T^j_i$

  6. Framework model - 2
     Target platform
     - Platform $PF$: $n$ machines modeled by an undirected graph $PF = (P, L)$
     - The vertices in $P = \{p_1, \dots, p_n\}$ represent the machines
     - The edges of $L$ are the communication links
     - Each link $(p_i, p_j)$ has a bandwidth $bw(p_i, p_j)$
     - $\tau$: set of available task types
     - Each machine $p_i$ is able to perform a subset of $\tau$
     - If $t \in \tau$ is available on machine $p_i$, $w(t, p_i)$ is the time to perform a task of type $t$ on $p_i$
     - $a(i, j)$ is the machine to which $T^j_i$ is assigned

  7. Framework model - 3
     Communication model
     - One-port model
     - One data transmission per communication link
     - One reception and one transmission per node
     - $R(p_k, p_i) = \{(p_j, p_{j'}) \in L\}$ is a route from $p_k$ to $p_i$
     Problem definition
     - Static scheduling
     - Makespan optimization for the collection of workflows

  8. Related works
     Workflow scheduling
     - Makespan optimization: NP-hard problem
     - List-based heuristics: HEFT, Critical Path, etc.
     - Difficult in heterogeneous contexts
     - Advanced algorithms
     GA for scheduling
     - GAs give good results on complex systems
     - But still a heuristic: distance to optimal?
     Steady state: flow optimization
     - Identical intrees
     - Optimal results

  9. Steady-state Scheduling
     [Figure: tasks A, B, C and D scheduled repeatedly over Period 1, Period 2, ..., Period N]

  10. Contents: 1. Context, 2. GA Scheduling, 3. Simulation, 4. General DAGs, 5. Identical Intrees

  11. GA without communication costs
     Classical GA for workflows:
     - gene = task
     - chromosome: one row per processor
     - phenotype = schedule
     - fitness = 1 / makespan
     - population, generation, crossover, mutation ...
     [Figure: example chromosomes placing tasks T0 to T4 on processors P0, P1 and P2]
     Does not take communication into account
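The encoding on this slide can be illustrated with a minimal Python sketch. It assumes a chromosome is a list of rows (one per processor), each row being the ordered list of tasks placed on that processor; the names `fitness_no_comm`, `preds` and `duration`, the task graph and the durations are all hypothetical and are not taken from the slides.

```python
def fitness_no_comm(chromosome, preds, duration):
    """Decode a chromosome into a schedule and return 1 / makespan.

    chromosome: list of rows, one per processor; each row is an ordered list of tasks.
    preds:      dict task -> list of predecessor tasks (precedence constraints).
    duration:   dict task -> list of execution times, indexed by processor.
    Communication times are ignored, as on the slide.
    """
    finish = {}                              # task -> completion time
    proc_free = [0.0] * len(chromosome)      # next idle time of each processor
    pos = [0] * len(chromosome)              # next gene to decode in each row
    remaining = sum(len(row) for row in chromosome)
    while remaining:
        progressed = False
        for p, row in enumerate(chromosome):
            # Decode genes of this row while their predecessors are finished.
            while pos[p] < len(row) and all(q in finish for q in preds.get(row[pos[p]], ())):
                task = row[pos[p]]
                ready = max((finish[q] for q in preds.get(task, ())), default=0.0)
                start = max(ready, proc_free[p])
                finish[task] = start + duration[task][p]
                proc_free[p] = finish[task]
                pos[p] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            raise ValueError("row order violates the precedence constraints")
    return 1.0 / max(finish.values())


# Hypothetical example: 5 tasks, 3 processors, every task takes 2 time units.
preds = {"T1": ["T0"], "T2": ["T0"], "T3": ["T1"], "T4": ["T2", "T3"]}
duration = {t: [2.0, 2.0, 2.0] for t in ["T0", "T1", "T2", "T3", "T4"]}
chromosome = [["T0", "T3", "T4"], ["T1"], ["T2"]]
fitness = fitness_no_comm(chromosome, preds, duration)
```

In this example the decoded schedule finishes at time 8, so the fitness is 1/8. A real GA would apply crossover and mutation to such rows and keep the fittest individuals.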

  12. With Communication Costs
     Communications in the chromosome
     - Communication task
     - One row per communication link
     - Dependencies to the source and target node -> inconsistent communications
     - Poor efficiency
     Evaluation function
     - Communications depend upon task placement
     - Fitness evaluation with communication costs
     - Used solution

  13. Algorithm: fitness of a chromosome
     Data:
       $\mathcal{T}_{ToSched}$: remaining tasks
       $C(T^j_i)$: completion time of $T^j_i$
       $\sigma(T^j_i)$: start time of $T^j_i$ on $p_{a(i,j)}$
       $\delta(p_u)$: next time $p_u$ is idle
       $w(t, p_i)$: the time to perform a task of type $t$ on $p_i$
       $CT(F^j_{k,i})$: the communication time to send $F^j_{k,i}$ along route $R(p_{a(k,j)}, p_{a(i,j)})$
     $\mathcal{T}_{ToSched} \leftarrow \mathcal{T}$
     while $\mathcal{T}_{ToSched} \neq \emptyset$ do
       choose a free task $T^j_i \in \mathcal{T}_{ToSched}$ (EFT heuristic)
       $\mathcal{T}_{pred} \leftarrow \{T^j_k \mid (T^j_k, T^j_i) \in D_j\}$
       $\sigma(T^j_i) \leftarrow 0$
       foreach task $T^j_k \in \mathcal{T}_{pred}$ do
         $\sigma(T^j_i) \leftarrow \max(\sigma(T^j_i),\ C(T^j_k) + CT(F^j_{k,i}))$
       $\sigma(T^j_i) \leftarrow \max(\delta(p_{a(i,j)}),\ \sigma(T^j_i))$
       $C(T^j_i) \leftarrow \sigma(T^j_i) + w(t(i,j), p_{a(i,j)})$
       $\delta(p_{a(i,j)}) \leftarrow C(T^j_i)$
       $\mathcal{T}_{ToSched} \leftarrow \mathcal{T}_{ToSched} \setminus \{T^j_i\}$
     return $fitness(ch) = 1 / C_{max} = 1 / \max_{T^j_i \in \mathcal{T}} C(T^j_i)$
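The fitness algorithm on this slide can be followed step by step in a short Python sketch. It is only an illustration under assumptions: the function name `fitness_with_comm`, its argument layout, the tie-breaking rule among free tasks and the constant communication time in the example are hypothetical, and the communication time CT is passed in as a callable because the slides do not give its formula.

```python
def fitness_with_comm(tasks, preds, assign, task_type, w, comm_time):
    """Return 1 / C_max for a given task placement.

    tasks:     iterable of task identifiers (the set T).
    preds:     dict task -> list of predecessor tasks.
    assign:    dict task -> machine, the a(i, j) of the slides.
    task_type: dict task -> type, the t(i, j) of the slides.
    w:         dict (type, machine) -> execution time.
    comm_time: callable (pred, task) -> CT of the file sent from pred to task.
    """
    to_sched = set(tasks)
    completion = {}          # C(T)
    idle = {}                # delta(p): next time machine p is idle

    def start_and_end(task):
        start = 0.0
        for k in preds.get(task, ()):
            start = max(start, completion[k] + comm_time(k, task))
        start = max(idle.get(assign[task], 0.0), start)
        return start, start + w[(task_type[task], assign[task])]

    while to_sched:
        # A task is free once all its predecessors are completed.
        free = [t for t in to_sched if all(k in completion for k in preds.get(t, ()))]
        if not free:
            raise ValueError("precedence constraints contain a cycle")
        # EFT heuristic: earliest finish time first (ties broken by name, an assumption).
        task = min(free, key=lambda t: (start_and_end(t)[1], str(t)))
        _, end = start_and_end(task)
        completion[task] = end
        idle[assign[task]] = end
        to_sched.remove(task)
    return 1.0 / max(completion.values())


# Hypothetical example: A and B both feed C; B runs on another machine than C.
preds = {"C": ["A", "B"]}
assign = {"A": "p0", "B": "p1", "C": "p0"}
task_type = {"A": "t", "B": "t", "C": "t"}
w = {("t", "p0"): 2.0, ("t", "p1"): 3.0}
comm = lambda k, i: 0.0 if assign[k] == assign[i] else 1.5   # assumed constant CT
fitness = fitness_with_comm(["A", "B", "C"], preds, assign, task_type, w, comm)
```

Here C cannot start before time 4.5 (B finishes at 3 and its file takes 1.5 to arrive), so the makespan is 6.5 and the fitness is 1/6.5. Like the pseudocode, this sketch does not model contention on links.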

  14. Contents: 1. Context, 2. GA Scheduling, 3. Simulation, 4. General DAGs, 5. Identical Intrees

  15. Experimental settings
     Simulations
     - SimGrid-MSG
     - GA = 200 individuals
     Platforms
     - Random platform generation: uniform distribution
     - Platform size: 4 to 10 nodes
     - Homogeneous
     - Heterogeneous
     - CCR: communication to computation ratio

  16. Experimental settings - 2
     Applications
     - Batch sizes from 1 to 10,000
     - Applications: 4 to 12 tasks
     - 1900 simulations of platform/application
     - Heterogeneity:
       - Execution from 1 to 10
       - Communications from 1 to 4

  17. Contents: 1. Context, 2. GA Scheduling, 3. Simulation, 4. General DAGs, 5. Identical Intrees

  18. Communication Model
     - No cost
     - Static
     - 1-route Bellman-Ford
     - 3-route Bellman-Ford
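The slides name Bellman-Ford for route selection but do not say which link weight is used. As a sketch only, the following Python code computes a single route between two machines with the standard Bellman-Ford algorithm, assuming a link cost of 1 / bandwidth; the function name `bellman_ford_route`, this cost choice and the example platform are hypothetical.

```python
def bellman_ford_route(nodes, bw, src, dst):
    """Return a route from src to dst as a list of links (u, v), or None.

    nodes: list of machine names.
    bw:    dict (u, v) -> bandwidth of the undirected link between u and v.
    The link cost 1 / bandwidth is an assumption made for this sketch.
    """
    edges = []
    for (u, v), b in bw.items():
        edges.append((u, v, 1.0 / b))   # undirected link: usable in
        edges.append((v, u, 1.0 / b))   # both directions
    dist = {n: float("inf") for n in nodes}
    prev = {n: None for n in nodes}
    dist[src] = 0.0
    # Relax every edge at most |nodes| - 1 times.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, cost in edges:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                prev[v] = u
                changed = True
        if not changed:
            break
    if dist[dst] == float("inf"):
        return None
    # Walk back from dst to src to rebuild the route.
    route = []
    node = dst
    while node != src:
        route.append((prev[node], node))
        node = prev[node]
    route.reverse()
    return route


# Hypothetical platform with four machines.
nodes = ["p1", "p2", "p3", "p4"]
bw = {("p1", "p2"): 10.0, ("p2", "p4"): 10.0, ("p1", "p3"): 1.0,
      ("p3", "p4"): 100.0, ("p1", "p4"): 2.0}
route = bellman_ford_route(nodes, bw, "p1", "p4")
```

With these bandwidths the two-hop route through `p2` costs 0.2, which beats the direct link (0.5), so `route` is `[("p1", "p2"), ("p2", "p4")]`. How the 3-route variant picks its alternatives is not described in the slides and is not covered here.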

  19. Communication Model - Results
     [Figure: Comparing different algorithms to choose the route. Two panels: percentage of experiments with an RMO above 0.9 and above 0.8 (y-axis, 0 to 50) against the number of jobs (x-axis, 10 to 10000). Curves: LIST, GA no comm, GA static route, GA 1-route Bellman-Ford, GA 3-routes Bellman-Ford.]

  20. GA Improvement (3-Bellman-Ford)
     [Figure: percentage of GA experiments with a makespan improvement of X% relative to LIST (y-axis, 0 to 100) against the number of jobs (x-axis, 10 to 10000), with curves for X = 0%, 10%, 20% and 30%. Panels: a. Improvement for different DAGs; b. Improvement for identical DAGs.]

  21. Contents: 1. Context, 2. GA Scheduling, 3. Simulation, 4. General DAGs, 5. Identical Intrees

  22. Relative Measure to Optimal
     Distance to optimal?
     - Algorithm improves the quality of the results
     - Case of a collection of intrees:
       - Steady-state algorithm gives the optimal flow
       - Lower bound
     Relative Measure to Optimal (RMO)
     - Optimal throughput $\rho$
     - Lower bound $L_0 = \frac{N}{\rho}$, with $N$ the number of intrees
     - $RMO = \frac{L_0}{makespan_r}$, with $makespan_r$ the result of the algorithm
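The RMO definition above reduces to two divisions, shown here as a small Python sketch. The function name `rmo` and the numbers in the example are hypothetical and are not results from the slides.

```python
def rmo(n_intrees, throughput, makespan):
    """Relative Measure to Optimal: RMO = L_0 / makespan, with L_0 = N / rho."""
    if throughput <= 0 or makespan <= 0:
        raise ValueError("throughput and makespan must be positive")
    lower_bound = n_intrees / throughput     # L_0 = N / rho
    return lower_bound / makespan


# Hypothetical values: 1000 intrees, optimal throughput of 2 intrees per time unit,
# and a schedule produced by the algorithm with a makespan of 550 time units.
value = rmo(1000, 2.0, 550.0)
```

Here the lower bound is 500 time units, so the RMO is 500 / 550, about 0.91. Since $L_0$ is a lower bound on the makespan, the closer the RMO is to 1, the closer the schedule is to that bound.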
