
Building a Distributed Genetic Algorithm with the Jini Network Technology

Brian Zorman (Gregory M. Kapfhammer and Robert Roos) • Sixth Annual Jini Community Meeting • Boston • June 17-20, 2002


  1. Building a Distributed Genetic Algorithm with the Jini Network Technology Brian Zorman (Gregory M. Kapfhammer and Robert Roos) Sixth Annual Jini Community Meeting Boston • June 17-20, 2002

  2. Problem Analysis • Genetic Algorithms: – Pros: robust and efficient – Cons: execution cost and Quality of Solution (QoS) • Possible solution: how can we harness the benefits of distributed computing frameworks? • Can we reduce cost of execution and improve quality of solution with a distributed genetic algorithm (DGA)?

  3. Bridging the Gap: Distributed Genetic Algorithms • Genetic Algorithms: 1.) Execution cost 2.) Lack of diversity • Distributed Systems: 1.) Resource Sharing 2.) Concurrency 3.) Scalability 4.) Openness

  4. Exploring Punctuated Equilibrium • The theory of punctuated equilibrium: – An isolated environment can reach a point of stability – The injection of new individuals could cause rapid evolution • Could we design a distributed system to simulate this theory? • How can the Jini network technology and the JavaSpaces object repository help us to build this distributed system?

  5. Designing the Models • Examined two popular models: master-worker and island • Chose a combination of the master-worker and island models – Master-worker: parallel execution and simplicity – Island model (punctuated equilibrium): parallel execution and additional diversity [Diagrams: a Master exchanging parents and evaluated offspring with Workers; islands I1 through I5]
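The island idea on this slide can be illustrated with a small, single-process sketch. Everything in it is an assumption made for illustration: the toy OneMax fitness (count of 1 bits), the ring topology, and the names `evolve_one_generation`, `migrate`, and `run_islands` are hypothetical and do not come from the deck, which routes migrants through JavaSpaces rather than passing them directly between islands.

```python
import random


def evolve_one_generation(population, fitness, rng):
    """One toy GA step: keep the fitter half, refill with mutated copies."""
    ranked = sorted(population, key=fitness, reverse=True)
    half = (len(ranked) + 1) // 2
    survivors = ranked[:half]
    children = []
    for parent in survivors[: len(ranked) - half]:
        child = list(parent)
        i = rng.randrange(len(child))
        child[i] = 1 - child[i]  # flip one randomly chosen bit
        children.append(child)
    return survivors + children


def migrate(islands, fitness, fraction):
    """Ring migration (an assumed topology): each island receives copies of
    the best individuals of its predecessor, replacing its own worst ones."""
    n_migrants = max(1, int(len(islands[0]) * fraction))
    emigrants = [
        sorted(island, key=fitness, reverse=True)[:n_migrants]
        for island in islands
    ]
    for i, island in enumerate(islands):
        incoming = emigrants[(i - 1) % len(islands)]
        island.sort(key=fitness, reverse=True)
        island[-n_migrants:] = [list(ind) for ind in incoming]


def run_islands(num_islands=4, pop_size=10, genome_len=16,
                generations=40, migrate_every=10, fraction=0.3, seed=0):
    """Evolve isolated islands and inject new individuals periodically."""
    rng = random.Random(seed)
    fitness = sum  # OneMax: number of 1 bits in the genome
    islands = [
        [[rng.randint(0, 1) for _ in range(genome_len)]
         for _ in range(pop_size)]
        for _ in range(num_islands)
    ]
    for gen in range(1, generations + 1):
        islands = [evolve_one_generation(isl, fitness, rng) for isl in islands]
        if gen % migrate_every == 0:
            migrate(islands, fitness, fraction)
    return max(fitness(ind) for isl in islands for ind in isl)
```

Between migrations each island evolves in isolation and can stagnate; the periodic injection of outside individuals is the "punctuation". In a distributed version each island would run on its own machine and the direct list exchange would be replaced by writes to and takes from a shared space.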

  6. High Level Architecture: Entities in the “Simple” Model [Diagram: Initial Machine, DistributionSpace, DiversitySpace, and remote machines RM1, RM2, RM3, ..., RMn]

  7. “Simple” Model: Distribution Phase [Diagram: same entities as slide 6]

  8. “Simple” Model: Pre-migration [Diagram: same entities as slide 6]

  9. “Simple” Model: Migration [Diagram: same entities as slide 6]

  10. “Simple” Model: Post-convergence [Diagram: same entities as slide 6]

  11. Simple Model Performance Bottleneck • No explicit synchronization between remote machines • Potentially, every remote machine could try to migrate with the same JavaSpace at the same time! • In effect, each worker must “wait in line” to perform its migration! • While a worker is waiting, it performs no computation! • Designed the “Complex” Distributed System Model (CDSM) in an attempt to reduce this bottleneck
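The "wait in line" effect can be put into rough numbers with a back-of-the-envelope model. This is only a sketch under stated assumptions: every worker requests migration at the same instant, each space serves one worker at a time, every migration takes the same time, and workers are spread round-robin over the available spaces. The function `total_idle_time` and its parameters are hypothetical and are not taken from the deck.

```python
def total_idle_time(num_workers, migration_seconds, num_spaces=1):
    """Worst-case idle time summed over all workers.

    Assumes all workers arrive at once and each space handles one
    migration at a time, so the k-th worker in a queue (0-based)
    waits k * migration_seconds before it can start migrating.
    """
    idle = 0.0
    for space in range(num_spaces):
        # Workers assigned round-robin: space s serves s, s + num_spaces, ...
        queue_len = len(range(space, num_workers, num_spaces))
        idle += migration_seconds * queue_len * (queue_len - 1) / 2
    return idle
```

Under these assumptions, eight workers sharing one space with 2-second migrations lose 56 worker-seconds per migration round, two spaces cut that to 24, and one space per worker removes the queue entirely. That is the intuition behind spreading migration over more than one space.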

  12. High Level Architecture: Entities in the “Complex” Model [Diagram: Initial Machine, DistributionSpace, and remote machines RM1, RM2, ..., RMn, each drawn alongside an MM and an MS (MM1/MS1, MM2/MS2, ..., MMn/MSn; the MigrationMachines and MigrationSpaces of slide 18)]

  13. “Complex” Model: Distribution Phase [Diagram: same entities as slide 12]

  14. “Complex” Model: Pre-migration [Diagram: same entities as slide 12]

  15. “Complex” Model: First Migration Phase [Diagram: same entities as slide 12]

  16. “Complex” Model: Subsequent Migration Phases [Diagram: same entities as slide 12]

  17. “Complex” Model: Post-convergence [Diagram: same entities as slide 12]

  18. “Complex” Model Observations • Maintains the functionality of the “Simple” model • Requires dedicated MigrationMachines and MigrationSpaces • An explicit synchronization mechanism greatly reduces the chance that more than one remote machine migrates with the same JavaSpace at the same time • Multiple MigrationSpaces slightly reduce the overall diversity that any given remote machine has access to; however, this cost is small when compared to the other gains!

  19. Experimental Framework • Goal: analyze the design and performance of the two models, and then compare the best version to a sequential GA (SGA) • Selected an open source GA written in Java that “solves” the Knapsack Problem – The decision version of the Knapsack Problem is NP-complete • Knapsack Problem Statement: Given a set of weights and a knapsack capacity, find the best combination of weights that fits inside the knapsack
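The problem statement above maps naturally onto a bit-string chromosome, where bit i says whether weight i goes into the knapsack. The sketch below is one plausible fitness function, not the one used by the open source GA in the experiments: scoring an overweight selection as 0 is an assumption (other penalty schemes exist), and the name `knapsack_fitness` is hypothetical.

```python
def knapsack_fitness(chromosome, weights, capacity):
    """Total selected weight, or 0 if the selection exceeds the capacity.

    chromosome: sequence of 0/1 flags, one per weight.
    The zero score for overweight selections is an assumed penalty rule.
    """
    if len(chromosome) != len(weights):
        raise ValueError("chromosome and weights must have the same length")
    total = sum(w for bit, w in zip(chromosome, weights) if bit)
    return total if total <= capacity else 0
```

For example, with weights `[10, 20, 30]` and capacity 45, the chromosome `[1, 0, 1]` scores 40 while `[1, 1, 1]` scores 0 because its total of 60 does not fit.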

  20. Testbench Description • 8 testsets of increasing levels of difficulty • Range of weight values: 0 – 5000 • Number of weights: 500 – 1200 • Number of machines – SDSM: {2,4,6,8} • Requires RemoteMachines – CDSM: {2,4,6,8} • Requires RemoteMachines, MigrationMachines, MigrationSpaces • GA parameters: – Termination condition: best solution remains constant after 75 generations – Crossover: at every generation – Mutation: at every generation – Migration: 30% of population every 30 generations, starting at generation 60
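The GA parameters on this slide can be written down as three small predicates. This is a sketch of one reading of the slide, with assumptions flagged: migration is taken to happen at generations 60, 90, 120, and so on, and "remains constant after 75 generations" is taken to mean the best fitness was unchanged across the last 75 generation steps. The function names are hypothetical.

```python
def should_migrate(generation, start=60, interval=30):
    """True at generation 60, 90, 120, ... (assumed reading of the schedule)."""
    return generation >= start and (generation - start) % interval == 0


def migrant_count(population_size, percent=30):
    """Number of individuals to migrate, rounded down, using integer math."""
    return population_size * percent // 100


def has_converged(best_per_generation, patience=75):
    """True once the best fitness has not changed over the last
    `patience` generation steps (patience + 1 identical entries)."""
    if len(best_per_generation) <= patience:
        return False
    recent = best_per_generation[-(patience + 1):]
    return all(value == recent[0] for value in recent)
```

A worker loop would call `has_converged` after each generation to decide when to stop, and `should_migrate` to decide when to exchange `migrant_count(len(population))` individuals.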

  21. Measurements and General Observations • Execution time: The CDSM reduces the execution time of the DGA when compared to the SDSM (the “Simple” model). Generally, overall execution time increases as we add machines to the CDSM. • Computation–to–Communication ratio: The CDSM increases this ratio when compared to the SDSM. The addition of machines to the CDSM reduces this ratio. • Diversity: The potential for a higher quality solution increases as we move from the SGA to the CDSM and then as we add more machines to the CDSM. • Quality of Solution: The QoS for the CDSM is always higher than that of the SGA. Generally, the QoS is higher in the CDSM as we add machines. • Generations–per–Second: The CDSM can compute more Gen/Sec than the SDSM. Generally, adding more machines to the CDSM increases the Gen/Sec.
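The deck does not spell out how these metrics were computed. The sketch below shows the most common definitions as an assumption: the computation-to-communication ratio as time spent evolving divided by time spent in migration, and Gen/Sec as generations completed divided by wall-clock time. Both function names are hypothetical.

```python
def computation_to_communication(compute_seconds, communicate_seconds):
    """Assumed definition: computing time divided by communication time.
    Values above 1 mean more time is spent evolving than migrating."""
    if communicate_seconds <= 0:
        raise ValueError("communicate_seconds must be positive")
    return compute_seconds / communicate_seconds


def generations_per_second(generations, elapsed_seconds):
    """Assumed definition: generations completed per second of wall time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return generations / elapsed_seconds
```

With these definitions, a worker that computes for 30 seconds and communicates for 60 has a ratio of 0.5, and 450 generations in 100 seconds gives 4.5 Gen/Sec.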

  22. SDSM vs. CDSM: Execution time [Chart: series SDSM and CDSM; x-axis ticks 2, 4, 6, 8; y-axis 0 to 2,000,000]

  23. SDSM vs. CDSM: Computation-to-Communication Ratio [Chart: series SDSM and CDSM; x-axis ticks 2, 4, 6, 8; y-axis 0 to 0.9]

  24. SDSM vs. CDSM: Generations/Second [Chart: series SDSM and CDSM; x-axis ticks 2, 4, 6, 8; y-axis 0 to 5]

  25. CDSM vs. SGA: Quality of Solution [Chart: series SGA and CDSM with 2, 4, 6, and 8 machines; x-axis ticks 1 to 8; y-axis 0 to 100]

  26. CDSM vs. SGA: Execution Time [Chart: series SGA and CDSM with 2, 4, 6, and 8 machines; x-axis ticks 1 to 8; y-axis 0 to 700,000]

  27. CDSM vs. SGA: Computation-to-Communication [Chart: series CDSM with 2, 4, 6, and 8 machines; x-axis ticks 1 to 8; y-axis 0 to 1.6]

  28. CDSM vs. SGA: Population Diversity [Chart: series SGA and CDSM with 2, 4, 6, and 8 machines; x-axis ticks 1 to 8; y-axis 0 to 5,000,000]

  29. CDSM vs. SGA: Generations-per-Second [Chart: series SGA and CDSM with 2, 4, 6, and 8 machines; x-axis ticks 1 to 8; y-axis 0 to 6]

  30. Future Possibilities: Distributed GA Framework • Potential advantages of a DGA framework: – Could be integrated into existing Java GA frameworks – Java provides GA portability across operating systems – Jini and JavaSpaces offer openness, scalability, fault tolerance – GA developers could easily distribute their GA just to “see what happens” • A DGA framework would require an approach for automatically and transparently starting and terminating remote workers • Various users should be able to donate their resources; our DGA can make use of “idle time” on various university machines • Potentially, we could develop a simple applet for visibility and learning

  31. Concluding Remarks • Investigated the feasibility of using Jini and JavaSpaces to build a distributed genetic algorithm • Proposed, implemented, and empirically evaluated a simple and a complex distributed system model (SDSM and CDSM) • The SDSM bottleneck was a serious concern that prompted the investigation of a new model that removed JavaSpaces interaction bottlenecks • CDSM outperformed SGA in quality of solution, diversity, and generations per second • SGA outperformed CDSM only in execution time (mostly due to early convergence)
