
Worksheets. Percy Liang, UCI Reproducibility Symposium, September 22, 2020



  1. Worksheets. Percy Liang, UCI Reproducibility Symposium, September 22, 2020

  2. The current research process

  6. Problem 1: reproducibility

     |           | Previous method | New method   |
     |-----------|-----------------|--------------|
     | Dataset 1 | 88% accuracy    | 92% accuracy |
     | Dataset 2 | 72% accuracy    | 77% accuracy |
     | Dataset 3 | ?               | ?            |
     | Dataset 4 | ?               | ?            |
     | ...       | ...             | ...          |

  10. Problem 2: efficiency
      Step 1: come up with a good idea
      Step 2: execute on it
      • Obtain data, clean it, convert between formats
      • Try to reproduce results from previous work, email authors
      • Run experiments with different versions, keep track of provenance

  13. Tradeoff? Efficiency vs. reproducibility
      Folk wisdom: reproducibility slows down research.
      Our claim: reproducibility accelerates research (with the right tool).

  14. MLcomp.org (2008)

  17. MLcomp paradigm: dataset, algorithm, accuracy metrics
      Problem: too rigid, doesn't help with the efficiency problem

  18. CodaLab Worksheets (2013-present)

  19. Bundles and worksheets

  20. Bundles
      Bundle: an arbitrary file/directory (code or data or results)
      Example identifier: 0x191aad8fa0ae4741b3123b15a8d59efa

  22. Bundles
      • Uploaded by user (code or data)
      • Derived by running an arbitrary command

  24. Bundles (dependency diagram)
      • cnn.py (0x45d17c): uploaded code, beginning "#!/usr/bin/python", "import numpy as np", ...
      • mnist (0x1ba223): uploaded data containing train.dat and test.dat
      • exp2 (0x2d4192): run bundle that depends on cnn.py (mounted as cnn.py) and mnist (mounted as data, giving data/train.dat and data/test.dat), runs "python cnn.py data/train.dat data/test.dat", and produces stdout, stderr, and stats.json
      • exp ...
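
The bundle diagram above can be sketched as a small data model. This is a minimal illustration, not CodaLab's actual schema: the `Bundle` class, its field names, and the `upload` and `run` helpers are hypothetical, and the identifier format simply imitates the slide's example ("0x" followed by 32 hex digits).

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Bundle:
    """Hypothetical record for one bundle (not CodaLab's real schema)."""
    name: str
    bundle_id: str
    command: Optional[str] = None  # None for bundles uploaded by a user
    dependencies: Tuple[Tuple[str, str], ...] = ()  # (alias, parent id) pairs


def new_id() -> str:
    # Assumption: ids look like the slide's example, "0x" plus 32 hex digits.
    return "0x" + uuid.uuid4().hex


def upload(name: str) -> Bundle:
    """An uploaded bundle has no command and no dependencies."""
    return Bundle(name=name, bundle_id=new_id())


def run(name: str, deps: Dict[str, Bundle], command: str) -> Bundle:
    """A run bundle records its command and the bundles it was derived from."""
    pairs = tuple((alias, parent.bundle_id) for alias, parent in deps.items())
    return Bundle(name=name, bundle_id=new_id(), command=command, dependencies=pairs)


cnn = upload("cnn.py")
mnist = upload("mnist")
exp2 = run(
    "exp2",
    {"cnn.py": cnn, "data": mnist},
    "python cnn.py data/train.dat data/test.dat",
)
```

Because a run stores both its command and the ids of its inputs, the provenance of exp2 can be read back from the record itself rather than from a lab notebook.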

  31. Command-line Interface (CLI)
      Search for existing code and data:
          $ cl search mnist
      Upload new code or data:
          $ cl upload cnn.py
      Run experiments with arbitrary commands:
          $ cl run :cnn.py data:mnist "python cnn.py data/train.dat data/test.dat"
      Look at output of runs:
          $ cl cat exp2/stdout
      Manage runs:
          $ cl kill exp2; cl rm exp2
      Run an entire pipeline with a different dataset or newer version of your code:
          $ cl mimic mnist exp2 cifar -n exp3
      Copy from one CodaLab instance to another:
          $ cl add bundle mnist stanford::pliang-demo main::pliang-demo
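
The idea behind the mimic command (re-running a pipeline with one input swapped) can be sketched in a few lines. This is an illustration of the concept under stated assumptions, not CodaLab's implementation: the in-memory `store`, the `make_bundle` and `mimic` functions, and the "b0", "b1" ids are all hypothetical.

```python
def make_bundle(store, name, deps=None, command=None):
    """Record a bundle in the store; deps maps alias -> parent bundle id."""
    bundle_id = f"b{len(store)}"  # hypothetical ids; bundles are never deleted
    store[bundle_id] = {"name": name, "deps": dict(deps or {}), "command": command}
    return bundle_id


def mimic(store, old_input, new_input, old_output, new_name):
    """Re-create old_output with old_input replaced by new_input.

    Only bundles downstream of old_input are re-created; everything else
    is reused as-is, and the original bundles are left untouched.
    """
    memo = {old_input: new_input}

    def rebuild(bundle_id):
        if bundle_id in memo:
            return memo[bundle_id]
        bundle = store[bundle_id]
        new_deps = {alias: rebuild(dep) for alias, dep in bundle["deps"].items()}
        if new_deps == bundle["deps"]:
            memo[bundle_id] = bundle_id  # does not depend on old_input
        else:
            memo[bundle_id] = make_bundle(
                store, bundle["name"], new_deps, bundle["command"]
            )
        return memo[bundle_id]

    new_output = rebuild(old_output)
    if new_output != old_output:
        store[new_output]["name"] = new_name
    return new_output


store = {}
cnn = make_bundle(store, "cnn.py")
mnist = make_bundle(store, "mnist")
cifar = make_bundle(store, "cifar")
exp2 = make_bundle(
    store, "exp2", {"cnn.py": cnn, "data": mnist},
    "python cnn.py data/train.dat data/test.dat",
)
exp3 = mimic(store, mnist, cifar, exp2, "exp3")
```

Here exp3 carries the same command as exp2 but points its data dependency at cifar, which mirrors the intent of `cl mimic mnist exp2 cifar -n exp3` on the slide.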

  33. Modularity
      Real-world problems require the efforts of the entire community.
      People specialize and contribute in a decentralized way.

  46. Intermediate tasks
      • Old way: use intermediate metrics, rhetoric
      • New way: plug in and see ramifications automatically

  51. Immutability
      Inspiration: Git version control system
      • All programs/datasets/runs are write-once
      • Enable collaboration without chaos
      • Capture the research process in a reproducible way
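
The write-once idea can be illustrated with a tiny store in the spirit of Git's object database. This is only a sketch: the `WriteOnceStore` class is hypothetical, and keying objects by a SHA-1 hash of their content is borrowed from Git as an assumption; the slides do not say how CodaLab assigns bundle identifiers.

```python
import hashlib


class WriteOnceStore:
    """Hypothetical write-once store: objects can be added but never changed."""

    def __init__(self):
        self._objects = {}

    def put(self, content: bytes) -> str:
        # Assumption: key by content hash, as Git does for its objects.
        key = hashlib.sha1(content).hexdigest()
        # setdefault never overwrites, so an existing object stays as it was.
        self._objects.setdefault(key, content)
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


store = WriteOnceStore()
v1 = store.put(b"print('model v1')")
v2 = store.put(b"print('model v2')")  # a new version gets a new key
```

Since a "changed" program is stored as a new object, any run that referenced `v1` still resolves to exactly the bytes it was run with, which is what lets collaborators build on each other's work without chaos.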

  52. Bundles and worksheets

  55. Literacy
      Bundle graphs are about truth; what about interpretation?
      Worksheet: an arbitrary document with embedded bundles, interleaved with text descriptions
      Inspiration: Mathematica notebook, Jupyter notebook

  58. A worksheet
      We now train the classifier with more data.
      Program: SVMlight
      Arguments: -n 2000
      Dataset: thyroid
      Error: 2.6%
      Time: 1 second
      Notice that the error remains the same, suggesting that we've saturated our model.
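
A worksheet of this kind can be thought of as a list of text items and bundle references that is rendered into one document. The sketch below assumes a hypothetical in-memory representation (the `render_worksheet` function, the item tuples, and the "run-svm" key are invented for illustration); the field values are the ones shown on the slide.

```python
def render_worksheet(items, bundles):
    """Render text items and embedded bundle metadata as plain text."""
    lines = []
    for kind, value in items:
        if kind == "text":
            lines.append(value)
        elif kind == "bundle":
            # Embed the referenced bundle's metadata as "Field: value" lines.
            lines.extend(f"{field}: {val}" for field, val in bundles[value].items())
        else:
            raise ValueError(f"unknown worksheet item kind: {kind}")
    return "\n".join(lines)


# Hypothetical bundle metadata, using the values from the slide.
bundles = {
    "run-svm": {
        "Program": "SVMlight",
        "Arguments": "-n 2000",
        "Dataset": "thyroid",
        "Error": "2.6%",
        "Time": "1 second",
    }
}

items = [
    ("text", "We now train the classifier with more data."),
    ("bundle", "run-svm"),
    ("text", "Notice that the error remains the same, "
             "suggesting that we've saturated our model."),
]

document = render_worksheet(items, bundles)
```

The interpretation lives in the text items while the numbers come from the bundle, so the prose can be edited freely without any risk of altering the recorded result.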

  60. [Figure: bundle graph. Bundles shown: nanc-1m.txt (0xc19b66; text beginning "Two New Orleans..."), run-count (0xd4815b; stdout), run1 (0xad3d69; stdout), run2 (0x992ced; stdout); edges labeled "data".]
