Computational Science of Computer Systems

  1. Computational Science of Computer Systems
     Méthodologies d’expérimentation pour l’informatique distribuée à large échelle (Experimentation methodologies for large-scale distributed computing)
     Martin Quinson, March 8th, 2013

  2. What is Science anyway?
     Doing Science = Acquiring Knowledge
     Experimental Science
     ◮ Thousand years ago
     ◮ Observations-based
     ◮ Can describe
     ◮ Prediction tedious
     Theoretical Science
     ◮ Last few centuries
     ◮ Equations-based
     ◮ Can understand
     ◮ Prediction long
     Computational Science
     ◮ Nowadays
     ◮ Compute-intensive
     ◮ Can simulate
     ◮ Prediction easier
     "Prediction is very difficult, especially about the future." – Niels Bohr
     CS2 Martin Quinson Computational Science of Computer Systems Introduction SimGrid PDES Formal Open Science Conclusion 2/30

  3. Observation Is Still the Basis of Science
     Space telescope, Large Hadron Collider, Mars Explorer, NMR Spectroscope, Synchrotrons, Turntable, Earthquake vs. Bridge, Climate vs. Ecosystems, Tsunamis
     (who said that science is not fun??)

  6. Computational Science
     Understanding Climate Change through Predictions
     ◮ Model complexity grows
     ◮ This requires large computers
     ◮ Upscale project: 15,000 computing-years in 2012!

  7. Modern Computers are Large and Complex
     Massive Parallelism
     ◮ Cannot miniaturize further (atom limit)
     ◮ Cannot increase frequency (energy limit)
     ◮ Solution: Multiply compute cores!
     ◮ Sequoia, second fastest computer: 1,572,864 cores
     ExaScale Systems, used in Computational Science
     ◮ Systems computing 1 exaflop (10^18 operations per second) are arriving, with billions of cores
     ◮ 10^18 operations: one million million million operations...
     ◮ At humanly doable speed, that requires 10 times the age of the universe
     ◮ Each node: 20 million lines of code (10 × Encyclopedia Britannica)
     Other very large computer systems in the wild
     ◮ Google computers dissipate 300 MW on average (150,000 households, 1/3 of a reactor)
     ◮ Botnets: BredoLab estimated to control 30 million zombie computers
     ◮ In addition, these systems are heterogeneous and dynamic
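The "humanly doable speed" comparison above can be checked with a few lines of arithmetic. The slide does not state which pace it assumes, so this is only a sketch under stated assumptions: the age of the universe is taken as roughly 13.8 billion years, and the function names are illustrative, not taken from the talk.

```python
# Back-of-the-envelope check of the "humanly doable speed" comparison.
# Assumptions (not from the slides): age of the universe ~ 13.8e9 years,
# and a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9
EXA_OPERATIONS = 10**18  # what an exascale machine performs in one second


def years_needed(operations, seconds_per_operation):
    """Years needed to perform `operations` by hand at the given pace."""
    return operations * seconds_per_operation / SECONDS_PER_YEAR


def universe_ages(operations, seconds_per_operation):
    """Same duration, expressed in multiples of the age of the universe."""
    return years_needed(operations, seconds_per_operation) / AGE_OF_UNIVERSE_YEARS


def pace_for(operations, ages):
    """Seconds per operation at which the work lasts `ages` universe ages."""
    return ages * AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR / operations


at_one_per_second = universe_ages(EXA_OPERATIONS, 1.0)
pace_for_ten_ages = pace_for(EXA_OPERATIONS, 10)

print(f"1 operation per second: {at_one_per_second:.2f} universe ages")
print(f"10 universe ages correspond to {pace_for_ten_ages:.2f} s per operation")
```

Under these assumptions, one operation per second already takes a few universe ages, and the slide's factor of 10 corresponds to a pace of a few seconds per operation. Either way, the order of magnitude is the point.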

  12. Computational Science of Computer Systems
      This essential complexity mandates adapted scientific instruments
      Research Field: Methodologies of Experimentation
      ◮ Assessing the performance and correctness of large-scale computer systems
      ◮ Meta-research on producing scientifically sound results
      ◮ Main contribution: SimGrid, a large-scale computer systems simulator
      First title (rejected)
      Simulating Applications for Research in Simulation Applications for Research
      Epistemological Stance
      ◮ Empirically consider large-scale computer systems as natural objects
      ◮ Eminently artificial artifacts, but complexity reaches “natural” levels
      ◮ Other sciences routinely use computers to understand complex systems

  15. Assessing Distributed Applications
      Correctness Study → Formal Methods
      ◮ Tests: Unable to provide definitive answers
      ◮ Model-Checking: Exhaustive and automated exploration of state space
      Performance Study → Experimentation
      ◮ Maths: Often not sufficient to fully understand these systems
      ◮ Experimental Facilities: Real applications on Real platform (in vivo)
      ◮ Emulation: Real applications on Synthetic platforms (in vitro)
      ◮ Simulation: Prototypes of applications on system’s Models (in silico)
      (Figure courtesy of Lucas Nussbaum)
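To illustrate what "exhaustive and automated exploration of state space" means in contrast to testing, here is a minimal toy sketch. It is not SimGrid's model checker, and the two-process system it explores is a hypothetical example introduced only for illustration: each process increments a shared counter non-atomically (read, then write back). A single test run may happen to see the correct result; enumerating every interleaving shows that a lost update is reachable.

```python
from collections import deque

# Toy explicit-state exploration (illustrative only, not SimGrid's checker).
# Two processes each run:  tmp = shared ; shared = tmp + 1
# State layout: (program counters, local temporaries, shared counter)
INITIAL = ((0, 0), (0, 0), 0)


def successors(state):
    """Yield every state reachable by letting one process take one step."""
    pcs, tmps, shared = state
    for p in (0, 1):
        if pcs[p] == 0:    # step 1: read the shared counter into a local
            new_pcs = pcs[:p] + (1,) + pcs[p + 1:]
            new_tmps = tmps[:p] + (shared,) + tmps[p + 1:]
            yield (new_pcs, new_tmps, shared)
        elif pcs[p] == 1:  # step 2: write back the local value plus one
            new_pcs = pcs[:p] + (2,) + pcs[p + 1:]
            yield (new_pcs, tmps, tmps[p] + 1)
        # pcs[p] == 2: this process has terminated, no step to take


def explore(initial):
    """Breadth-first search over all interleavings; returns (seen, finals)."""
    seen = {initial}
    queue = deque([initial])
    finals = set()
    while queue:
        state = queue.popleft()
        next_states = list(successors(state))
        if not next_states:        # no process can move: a final state
            finals.add(state)
        for nxt in next_states:
            if nxt not in seen:    # visit each distinct state only once
                seen.add(nxt)
                queue.append(nxt)
    return seen, finals


seen, finals = explore(INITIAL)
final_counters = sorted({state[2] for state in finals})
# Property to verify: once both processes are done, the counter equals 2.
violations = [state for state in finals if state[2] != 2]

print("reachable final counter values:", final_counters)
print("property violated in", len(violations), "final state(s)")
```

The design choice that matters is the `seen` set: because every distinct state is visited exactly once, the search terminates and its verdict covers all schedules, which is what a finite number of test runs cannot guarantee.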

  16. Simulating Distributed Systems
      Big Idea: Simulation is the fastest path from idea to scientific results
      [Figure: workflow from idea to test, through experimental setup, model and simulation, to scientific results; plot of execution time (s) against number of simulated hosts (10 to 10240) for Default CPU Model, Partial LMM Invalidation, Lazy Action Management and Trace Integration]
      Comfort to the user
      ◮ Get preliminary results from partial implementations
      ◮ Experimental campaign with thousands of runs within the week
      ◮ Test your scientific idea, ignore technical subtleties (for now)
      Challenges for the tools
      ◮ Validity: Get realistic results (controlled experimental bias)
      ◮ Scalability: Fast enough and Big enough
      ◮ Tooling: runner, post-processing
      Scientific practices sometimes unfortunate in this field
      ◮ Experimental settings not detailed enough in literature
      ◮ Many short-lived simulators; few sound and established tools

  17. SimGrid: Versatile Simulator of Distributed Apps
      Scientific Instrument
      ◮ Versatile: Grid, P2P, HPC, Volunteer Computing and others
      ◮ Sound: Validated, Scalable, Usable; Modular; Portable
      ◮ Community-driven: 30 contributors (5 not affiliated), 5 contributed tools, GPL
      Scientific Object
      ◮ Allows comparison of network models on non-trivial applications
      ◮ High-Performance Simulation on realistic workload
      ◮ Full model checker of distributed applications; Emulator under way
      Large Established Project
      ◮ Started in 1998; Collaboration: Loria / Inria Grenoble / CC-IN2P3 / U. Hawaii
      ◮ Impact: 120 publications (110 distinct authors, 5 continents), 4 PhDs
      ◮ Co-leader with A. Legrand (CNRS Grenoble) and F. Suter (CNRS IN2P3)
