Automated tracking of computational experiments using Sumatra

  1. Automated tracking of computational experiments using Sumatra. Andrew Davison, Unité de Neurosciences, Information et Complexité (UNIC), CNRS, Gif sur Yvette, France. Reproducible Research: Tools and Strategies for Scientific Computing, AMP 2011, Vancouver, July 14, 2011. (Photo: Rumah Gadang Minangkabau in West Sumatra by CharlesFred)

  2. This presentation is licenced under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 licence

  3. Reproducibility (Photo: attack of the clone santas by slowburn ♪)

  4. Replicability vs. reproducibility:
      - Reproduction of the original results using the same tools:
          - by the original author on the same machine
          - by someone in the same lab/using a different machine
          - by someone in a different lab
      - Reproduction using different software, but with access to the original code
      - Completely independent reproduction based only on text description, without access to the original code

  6. Replicability

  7. Familiar complaints:
      - “I thought I used the same parameters but I’m getting different results”
      - “I can’t remember which version of the code I used to generate figure 6”
      - “The new student wants to reuse that model I published three years ago but he can’t reproduce the figures”
      - “It worked yesterday”
      - “Why did I do that?”

  8. Why isn’t it easy to reproduce a computational experiment exactly? (Photo: Cute clones by jurvetson)

  9. Why isn’t it easy to reproduce a computational experiment exactly?
      - complexity: dependence on small details, small changes have big effects
      - entropy: computing environment, library versions change over time
      - memory limitations: forgetting, implicit knowledge not passed on

  10. What can we do about it?

  11. What can we do about it?
      - complexity: use/teach good software-engineering practices (loose coupling, testing...)
      - entropy: plan for reproducibility from the start: run in different environments, write tests, record dependencies
      - memory limitations: record everything

  12. (Photo: lab bench by proteinbiochemist)

  13. What do we need to record?
      - the code that was run
      - how it was run (parameter files, input data, command-line options)
      - the platform on which it was run
      - why was it run?
      - what was the outcome? (output data, figures, qualitative interpretation)

  14. Recording the code that was run: store a copy of the executable, or of the source code (including that of any libraries used) as well as the compiler used and the compilation procedure

  15. Recording the code that was run:
      - the version of the interpreter and any options used in compiling it
      - a copy of the simulation script and of any external modules or packages that are imported/included
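As a rough illustration of slide 15, the interpreter details and the versions of imported modules can be collected from inside a Python script. This is a minimal sketch, not Sumatra's own mechanism; the function names are hypothetical, and it assumes modules advertise a `__version__` attribute, which is only a convention.

```python
import platform
import sys


def describe_interpreter():
    """Return basic facts about the running Python interpreter."""
    return {
        "executable": sys.executable,
        "version": platform.python_version(),
        "compiler": platform.python_compiler(),
    }


def describe_imported_modules():
    """Map each imported top-level module to its declared version, if any."""
    versions = {}
    for name, module in list(sys.modules.items()):
        if "." in name or module is None:
            continue  # skip submodules and placeholder entries
        version = getattr(module, "__version__", None)
        versions[name] = str(version) if version is not None else "unknown"
    return versions
```

Calling both functions at the end of a script, once all imports have happened, gives a snapshot that can be stored alongside the results.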

  16. Recording the code that was run: instead of storing a copy of the code, we can store the repository URL and version number
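The idea on slide 16 might look like the following for a Git working copy. This is a sketch under assumptions: Git is used only as an example (the talk's sample project uses Mercurial), the remote is assumed to be named `origin`, and `describe_repository` and its `run` parameter are hypothetical names introduced here so the command runner can be swapped out.

```python
import subprocess


def describe_repository(path, run=subprocess.check_output):
    """Return the remote URL and current revision of the Git working copy at `path`."""

    def git(*args):
        # `git -C <path>` runs the command as if started in <path>.
        return run(["git", "-C", path] + list(args)).decode().strip()

    return {
        "url": git("config", "--get", "remote.origin.url"),
        "version": git("rev-parse", "HEAD"),
        # Any output from `git status --porcelain` means uncommitted changes.
        "modified": bool(git("status", "--porcelain")),
    }
```

With the default runner, a missing `origin` remote makes `git config --get` exit with an error, which surfaces as `subprocess.CalledProcessError`. The `modified` flag matters because a version number only identifies the code if the working copy has no uncommitted changes.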

  17. Recording platform information:
      - processor architecture
      - operating system
      - number of processors
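The three items on slide 17 are all available from the Python standard library. A minimal sketch, with a hypothetical function name and dictionary keys chosen for illustration:

```python
import os
import platform


def describe_platform():
    """Collect the platform facts listed on the slide."""
    return {
        "architecture": platform.machine(),   # e.g. "x86_64"
        "system": platform.system(),          # e.g. "Linux"
        "release": platform.release(),        # operating system release string
        "processor_count": os.cpu_count(),    # may be None if undetermined
    }
```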

  18. Recording all this by hand is tedious and error-prone (Photo: lab notebook by benjaminlansky)

  19. Recording all this by hand is tedious and error-prone: let’s automate it

  20. What should this automated lab notebook look like?

  21. Different researchers, different workflows:
      - command-line
      - GUI
      - batch jobs
      - solo or collaborative
      - any combination of these for different components and phases of the project

  22. Requirements:
      - automate as much as possible, prompt the user for the rest
      - interact with version control systems (Subversion, Git, Mercurial, Bazaar ...)
      - support serial, distributed, batch simulations/analyses
      - link to data generated by the simulation/analysis
      - support all and any (command-line driven) simulation/analysis programs
      - support both local and networked storage of simulation/analysis records

  23. Requirements: be very easy to use, or only the very conscientious will use it (Photo: Kottke's Awesome Lab Notebook by Mouser NerdBot)

  24. Sumatra:
      - a Python package to enable systematic capture of the environment of numerical simulations/analyses
      - can be used directly in your own code or as the basis for interfaces

  25. Current interfaces:
      - a command line interface, smt
      - a web interface, smtweb
      Future: could be integrated into existing GUI-based tools, or into new desktop/web-based GUIs written from scratch

  26. Sumatra (Photo: Sawahs in West Sumatra by CharlesFred)

  27. Sumatra: Simulation Management Tool

  28. Sumatra: Simulation Management Tool ⁁ Computational Experiment

  29. Sumatra: nothing to do with Java (Photo: Sumatra by smysnbrg)

  30. Dependencies:
      - Python bindings for your preferred version control system (pysvn, mercurial, PyGit, bzrlib)
      - Django (only needed for web interface)
      - mpi4py (if running distributed computations)
      - httplib2

  31. Installation: easy_install sumatra

  32. smt
      $ cd myproject
      $ smt init MyProject

  33. $ python default.param
      $ smt configure --simulator=python
      $ smt run default.param
      $ smt run --simulator=python default.param

  34. Flowchart:
      - create new record
      - find dependencies
      - get platform information
      - has the code changed?
          - no: run simulation/analysis
          - yes: code change policy
              - error: raise exception
              - diff: store diff, then run simulation/analysis
      - record time taken
      - find new files
      - add tags
      - save record
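The core of the flowchart on slide 34 (check for code changes, apply the policy, time the run, find new files, add tags) can be sketched in a few lines of Python. This is an illustration only, not Sumatra's implementation: `run_and_record`, `CodeChangedError`, the callables passed in, and the policy names `"error"` and `"diff"` are all hypothetical, and new files are detected simply by comparing directory listings before and after the run.

```python
import os
import time


class CodeChangedError(Exception):
    """Raised when the code has changed and the policy is 'error'."""


def run_and_record(run, data_dir, has_changed, get_diff, policy="error", tags=()):
    """Run `run()` and return a record of what happened."""
    record = {"diff": None, "tags": list(tags)}

    # Has the code changed? If so, apply the code change policy.
    if has_changed():
        if policy == "error":
            raise CodeChangedError("code has uncommitted changes")
        elif policy == "diff":
            record["diff"] = get_diff()
        else:
            raise ValueError("unknown policy: %r" % (policy,))

    # Run the simulation/analysis and record the time taken.
    before = set(os.listdir(data_dir))
    start = time.perf_counter()
    run()
    record["duration"] = time.perf_counter() - start

    # Find new files created in the data directory during the run.
    record["output_data"] = sorted(set(os.listdir(data_dir)) - before)
    return record
```

Passing the change check and the diff as callables keeps the sketch independent of any particular version control system.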

  35. $ smt list
      20110713-174949
      20110713-175111
      $ smt list -l
      --------------------------------------------------
      Label            : 20110713-174949
      Timestamp        : 2011-07-13 17:49:49.235772
      Reason           :
      Outcome          :
      Duration         : 0.0548920631409
      Repository       : MercurialRepository at /path/to/myproject
      Main file        :
      Version          : rf9ab74313efe
      Script arguments : <parameters>
      Executable       : Python (version: 2.6.2) at /usr/bin/python
      Parameters       : seed = 65785
                       : distr = "uniform"
                       : n = 100
      Input_Data       : []
      Launch_Mode      : serial
      Output_Data      : [example2.dat(43a47cb379df2a7008fdeb38c6172278d000fdc4)]
      Tags             :
      . . .
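The Output_Data entry on slide 35 pairs a file name with a 40-character hexadecimal string, which is the length of a SHA-1 digest. Assuming that is what it is (the slides do not say), a content fingerprint of an output file could be computed as below. The function names are hypothetical.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-1 hex digest of the file at `path`, read in chunks."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    return sha1.hexdigest()


def same_outputs(paths_a, paths_b):
    """True if two runs produced files with identical contents, in order."""
    return [file_digest(p) for p in paths_a] == [file_digest(p) for p in paths_b]
```

Storing a digest rather than only a file name lets a later run be checked against the original byte for byte, even if the file has since been moved or overwritten.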

  36. $ smt run --label=haggling --reason="determine whether the gourd is worth 3 or 4 shekels" romans.param

  37. $ smt comment "apparently, it is worth NaN shekels."

  38. $ smt comment 20110713-174949 "Eureka! Nobel prize here we come."

  39. $ smt tag "Figure 6"

  40. $ smt run --reason="test effect of a smaller time constant" default.param tau_m=10.0
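Slide 40 overrides a parameter on the command line with `tau_m=10.0`. The general idea of layering `name=value` overrides on top of a parameter file can be sketched as follows. This is not Sumatra's parser: the function names are hypothetical, and the file format is assumed to be one `name = value` pair per line, as in the Parameters listing on slide 35.

```python
def parse_value(text):
    """Convert text to int or float where possible, else an unquoted string."""
    text = text.strip()
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text.strip('"')


def read_parameters(lines):
    """Parse `name = value` lines, skipping blanks and `#` comments."""
    params = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        params[name.strip()] = parse_value(value)
    return params


def apply_overrides(params, args):
    """Return a copy of `params` updated with `name=value` arguments."""
    updated = dict(params)
    for arg in args:
        name, sep, value = arg.partition("=")
        if not sep:
            raise ValueError("expected name=value, got %r" % (arg,))
        updated[name.strip()] = parse_value(value)
    return updated
```

Returning a copy leaves the parameters read from the file untouched, so both the original values and the values actually used can be recorded.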

  41. $ smt repeat haggling
      The new record exactly matches the original.

