
Easy Deployment for Jungle Computing (Niels Drost)



  1. Easy Deployment for Jungle Computing
     Niels Drost
     Computer Systems Group, Department of Computer Science
     VU University, Amsterdam, The Netherlands

  2. Requirements
     ● Resource independence
     ● Transparent / easy deployment
     ● Middleware independence & interoperability
     ● Jungle-aware middleware
     ● Jungle-aware communication
     ● Robust connectivity
     ● Globally unique naming
     ● System-support for malleability and fault-tolerance
     ● Transparent parallelism & application-level fault-tolerance
     ● Easy integration with external software
       ● MPI, OpenCL, CUDA, C, C++, scripts, …
     ComplexHPC Spring School 2011

  3. Requirements (same list as slide 2)

  4. Deployment
     ● How to get your application running in the Jungle
     ● For each resource used:
       ● Find resource
       ● Reserve resource
       ● Copy input files (and possibly the application itself)
       ● Configure/compile application
       ● Run application
       ● Copy back output files
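The per-resource steps on slide 4 can be sketched as a simple loop. This is only an illustration: the `Resource` class, its `do` method, and the site names are hypothetical, and a real deployment would call the resource's middleware at each step instead of appending to a log.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for one cluster, grid site, or cloud."""
    name: str
    log: list = field(default_factory=list)

    def do(self, step: str) -> None:
        # A real implementation would call the resource's middleware here;
        # this sketch only records which step was requested.
        self.log.append(step)


# The per-resource steps, in the order the slide lists them.
DEPLOY_STEPS = [
    "find resource",
    "reserve resource",
    "copy input files",
    "configure/compile application",
    "run application",
    "copy back output files",
]


def deploy(resources):
    """Run every deployment step on every resource; return the step count."""
    total = 0
    for resource in resources:
        for step in DEPLOY_STEPS:
            resource.do(step)
            total += 1
    return total


# Hypothetical site names, for illustration only.
sites = [Resource("cluster-a"), Resource("cloud-b")]
steps_run = deploy(sites)
```

Even in this toy form the work grows as steps times resources, which is the burden the later "Problems" slides describe.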

  5. Middleware
     ● Resources invariably use some sort of middleware
       ● Provides remote access to resources
       ● File copy, running applications, etc.
     ● Many different middleware systems are available:
       ● Globus (de facto standard, in 4 flavors)
       ● gLite, NAREGI, UNICORE, Legion
       ● SSH (poor man’s middleware)

  6. Problems (1): Too little Middleware
     ● All resources need to have some middleware
     ● Hard to install
     ● Hard to maintain
     ● Low fault-tolerance
     ● Assumes a very static setup
     A full-fledged middleware on a resource may require an almost full-time maintainer

  7. Problems (2): Too much Middleware
     ● Jungle computing applications use multiple different resources
       ● With different middleware
       ● With wildly different interfaces
       ● Which are too low level
     Using multiple different resources at the same time is nigh impossible using middleware directly

  8. Problems (3): Too much everything
     ● Large number of steps required to deploy an application
     ● Middleware-level interface is too low level for users
     ● Deploying an application requires the user to write another application!
     ● Users want to simply “press a button” to deploy
     Deployment is not very user friendly

  9. Ibis Software Stack (figure; parts labeled 1, 2, and 3)

  10. Zorilla: A P2P Middleware

  11. Current middleware
     ● Hard to install and maintain
     ● Centralized implementation (not very fault-tolerant)
     ● Usually no global functionality
       ● No global file system
       ● No co-allocation (though Koala could also fix this)
       ● Not even possible unless exactly the same middleware is used everywhere

  12. Zorilla
     ● Alternative middleware developed at the VU
     ● Based on Peer-to-Peer (P2P) technology
       ● Little to no configuration
       ● Highly fault-tolerant
       ● Trust issues
     ● Hardly any requirements (JVM)
     ● Easy to install, little to no maintenance
     ● Explicitly supports Jungle computing applications
     ● Plays nice with existing middleware
     ● Prototype

  13. Life of a Job (1)

  14. Life of a Job (2)

  15. Life of a Job (3)

  16. Life of a Job (4)

  17. Zorilla Overview (figure; includes a "Clouds" label)

  18. Zorilla Components (1)
     ● Bootstrap
       ● Initial set of contact points
       ● UDP broadcast or provided by user
     ● Gossip overlay network
       ● Actualized Robust Random Gossip (ARRG)
       ● Withstands firewalls et al.
     ● Clustering
       ● Nearest neighbor list
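To make the idea of a gossip overlay concrete, here is a minimal sketch of a generic peer-exchange round. It is not ARRG itself and not Zorilla's code: the function name, the cache and exchange sizes, and the ring-shaped bootstrap are assumptions chosen for illustration.

```python
import random


def gossip_round(caches, rng, cache_size=4, exchange=2):
    """One round: every node swaps a few known peers with one random peer."""
    for node in sorted(caches):
        peer = rng.choice(sorted(caches[node]))
        sent = rng.sample(sorted(caches[node]), min(exchange, len(caches[node])))
        received = rng.sample(sorted(caches[peer]), min(exchange, len(caches[peer])))
        for src, dst, entries in ((node, peer, sent), (peer, node, received)):
            # Merge the received entries plus the sender, never yourself.
            merged = (caches[dst] | set(entries) | {src}) - {dst}
            # Keep the cache bounded by dropping random surplus entries.
            caches[dst] = set(rng.sample(sorted(merged), min(cache_size, len(merged))))


# Assumed bootstrap: six nodes, each starting with one contact point (a ring).
node_count = 6
caches = {i: {(i + 1) % node_count} for i in range(node_count)}
rng = random.Random(0)  # fixed seed so repeated runs behave the same
for _ in range(10):
    gossip_round(caches, rng)
```

Each node only ever holds a small, bounded list of peers, so no node needs a global view of the network. That is the property that makes this style of overlay attractive for bootstrapping from a few contact points.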

  19. Zorilla Components (2)
     ● Flood scheduling
       ● Incrementally search for resources at more and more distant nodes
     ● Job management
       ● Status (scheduling, running, done, etc.)
       ● File transfers
       ● Malleability / crashes
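The "more and more distant nodes" idea behind flood scheduling can be sketched as a search whose hop radius grows until enough free nodes are found. This is a simplified illustration, not Zorilla's scheduler: the overlay graph, the `free` map, and both function names are hypothetical.

```python
from collections import deque


def nodes_within(neighbors, origin, radius):
    """Return all nodes at most `radius` hops from `origin` (breadth-first)."""
    seen = {origin}
    frontier = deque([(origin, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == radius:
            continue  # do not expand beyond the current radius
        for peer in neighbors.get(node, []):
            if peer not in seen:
                seen.add(peer)
                frontier.append((peer, dist + 1))
    return seen


def flood_schedule(neighbors, free, origin, needed, max_radius=5):
    """Widen the search radius until `needed` free nodes are found."""
    for radius in range(max_radius + 1):
        reachable = nodes_within(neighbors, origin, radius)
        candidates = sorted(n for n in reachable if free.get(n))
        if len(candidates) >= needed:
            return radius, candidates[:needed]
    return None  # not enough free nodes within max_radius hops


# Hypothetical overlay: a is linked to b and c, b to d, d to e.
overlay = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b", "e"], "e": ["d"]}
free = {"a": True, "b": False, "c": True, "d": True, "e": True}
result = flood_schedule(overlay, free, "a", needed=3)
```

Here the request for three nodes is satisfied at radius 2 (nodes a, c, and d), so the more distant node e is never contacted. Small jobs stay local and only large jobs flood far into the overlay.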

  20. Resource Discovery: ARRG

  21. Resource Discovery: Clustering

  22. Resource Discovery: Flood scheduling

  23. Conclusions
     ● Current middleware is hard to install and maintain
     ● …and does not offer the global functionality required by Jungle Computing applications
     ● Zorilla is a lightweight P2P alternative, offering zero maintenance, easy installation, and explicit support for parallel applications

  24. JavaGAT: Middleware independent API

  25. Requirements (same list as slide 2)

  26. Requirements (same list as slide 2)

  27. Typical Grid/Cloud Application (diagram: an Application calling submitJob(...) and File.copy(...))

  28. Typical Grid/Cloud Application (diagram: an Application calling submitJob(...) and File.copy(...), above the available backends: fork, cp, pbs, ftp, condor, gridftp, unicore, scp, globus, http)

  29. Typical Grid/Cloud Application (diagram: the same Application and backends as slide 28, now with question marks between the calls and the backends)
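One way to picture what a middleware-independent API has to do with the question marks on slide 29 is a front end that tries one backend adapter after another. The sketch below is not JavaGAT's actual API: every class and function name is hypothetical, and the adapters only return strings instead of contacting real middleware.

```python
class AdapterError(Exception):
    """Raised when an adapter cannot handle a request."""


class ForkAdapter:
    """Hypothetical adapter that can only start jobs on the local machine."""
    name = "fork"

    def submit_job(self, host, command):
        if host != "localhost":
            raise AdapterError("fork only runs local jobs")
        return f"fork:{command}"


class SshAdapter:
    """Hypothetical adapter standing in for remote execution over SSH."""
    name = "ssh"

    def submit_job(self, host, command):
        return f"ssh:{host}:{command}"


def submit_job(adapters, host, command):
    """Try each adapter in order; return (adapter name, job handle)."""
    errors = []
    for adapter in adapters:
        try:
            return adapter.name, adapter.submit_job(host, command)
        except AdapterError as exc:
            # Remember why this backend failed and fall through to the next.
            errors.append(f"{adapter.name}: {exc}")
    raise AdapterError("; ".join(errors))


adapters = [ForkAdapter(), SshAdapter()]
local_job = submit_job(adapters, "localhost", "ls")
remote_job = submit_job(adapters, "node1", "ls")
```

The application only ever calls `submit_job`. Which backend ends up serving the call is decided per request, so adding or removing a middleware does not change application code.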

  30. Which Middleware do I use?
     ● A lot to choose from
     ● Some may not work on all sites
     ● Most are hard to use
     ● Interfaces change often
     ● Globus? (Obvious choice 3 years ago)
