Coding the Continuum, Ian Foster - PowerPoint PPT Presentation


  1. Coding the Continuum. Ian Foster

  2. “When the network is as fast as the computer’s internal links, the machine disintegrates across the net into a set of special-purpose appliances.” -- George Gilder, 2001

  3. “When the network is as fast as the computer’s internal links, the machine disintegrates across the net into a set of special-purpose appliances.” -- George Gilder, 2001

  4. “network is as fast as the computer’s internal links”: Communication technologies continue to evolve. Global IP traffic, wired and wireless. Innovation continues in the lab: hollow-core fiber at 99.7% of the speed of light (1.46x faster than conventional fiber), 73.7 terabits per second. 5G is transforming communications. Sources: https://doi.org/10.1007/978-3-319-31903-2_8, doi:10.1038/nphoton.2013.45

  5. We can compute anywhere! Greenest; cheapest; nearest to data.

  6. But are we really free? $\text{Time} = T_{\text{compute}} + 2\,T_{\text{latency}}$. Uphill in all directions.

  7. “When the network is as fast as the computer’s internal links, the machine disintegrates across the net into a set of special-purpose appliances.” -- George Gilder, 2001

  8. “a set of special-purpose appliances”: FPGAs. Source: http://bit.ly/2SDGHzT

  9. Tesla self-driving chip: 2.5 Gpixel/s, 72 Top/s, 72 W

  10. “a set of special-purpose appliances”: “Cloud computing 5x to 10x improved price point [relative to Enterprise]” -- James Hamilton, http://bit.ly/2E78Wi1. Why? • Improved utilization • Economies of scale in operations • More power efficient • Optimized software. Source: LBNL-1005775

  11. Modular data center. Google hyperscale data center, St. Ghislain, Belgium.

  12. Zero-carbon cloud: Reduce energy cost and energy carbon footprint to 0. Andrew Chien, DOI 10.1109/IPDPS.2016.96

  13. The performance landscape becomes peculiar. A program can run on two computers: $C_1$ takes 0.01 seconds; $C_2$ takes 0.005 seconds. Which is faster?

  14. The performance landscape becomes peculiar. A program can run on two computers: $C_1$ takes 0.01 seconds; $C_2$ takes 0.005 seconds. Which is faster? The answer depends on their location. Say $C_1$ is adjacent and $C_2$ is 500 km distant, with a round trip over fiber at roughly 5 μs per km each way: $t(C_1) = T_1 = 0.01$ sec; $t(C_2) = T_2 + 2 \times 500 \times 5 \times 10^{-6} = 0.01$ sec.

  15. The performance landscape becomes peculiar. A program can run on two computers: $C_1$ takes 0.01 seconds; $C_2$ takes 0.005 seconds. Which is faster? The answer depends on their location. Say $C_1$ is adjacent and $C_2$ is 500 km distant, with a round trip over fiber at roughly 5 μs per km each way: $t(C_1) = T_1 = 0.01$ sec; $t(C_2) = T_2 + 2 \times 500 \times 5 \times 10^{-6} = 0.01$ sec. The apparent speed of a computer depends on its location; the apparent location of a computer depends on its speed.
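
The arithmetic on this slide can be checked with a minimal Python sketch. The 5 μs per km one-way latency is read off the slide's own numbers; the function name `apparent_time` is made up for illustration.

```python
# One-way latency per km, taken from the slide's 5 x 10^-6 factor (assumption:
# it stands for signal propagation in optical fiber).
FIBER_SEC_PER_KM = 5e-6


def apparent_time(compute_sec: float, distance_km: float) -> float:
    """Response time seen by the caller: compute time plus round-trip latency."""
    return compute_sec + 2 * distance_km * FIBER_SEC_PER_KM


t_c1 = apparent_time(0.01, 0)      # C1: slower, but adjacent
t_c2 = apparent_time(0.005, 500)   # C2: twice as fast, but 500 km away

print(f"t(C1) = {t_c1:.4f} s, t(C2) = {t_c2:.4f} s")
```

Under these assumptions the two machines look identical to the caller, which is the point of the slide: speed and distance trade off against each other.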

  16. Continuum: A set of elements such that between any two of them there is a third element [dictionary.com]. For example, the computing continuum, spanning IoT/Edge, Fog, and HPC/Cloud:

     | Size    | Nano             | Micro             | Milli           | Server    | Fog               | Campus            | Facility   |
     |---------|------------------|-------------------|-----------------|-----------|-------------------|-------------------|------------|
     | Example | Adafruit Trinket | Particle.io Boron | Array of Things | Linux Box | Co-located Blades | 1000-node cluster | Datacenter |
     | Memory  | 0.5K             | 256K              | 8GB             | 32GB      | 256G              | 32TB              | 16PB       |
     | Network | BLE              | WiFi/LTE          | WiFi/LTE        | 1 GigE    | 10GigE            | 40GigE            | N*100GigE  |
     | Cost    | $5               | $30               | $600            | $3K       | $50K              | $2M               | $1000M     |

     Credit: Pete Beckman, beckman@anl.gov. See PAISE, Friday.

  17. The space-time continuum: “space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality …” -- H. Minkowski, 1908. Space-time diagram: https://en.wikipedia.org/wiki/Spacetime

  18. The spacetime continuum in computational systems. $t(C_1) = T_1 = 0.01$ sec; $t(C_2) = T_2 + 2 \times 500 \times 5 \times 10^{-6} = 0.01$ sec. The behaviors of the two computers are indistinguishable. [Space-time diagram: time from 0 to 10 ms against distance from 0 km ($C_1$, $T_1$) to 500 km ($C_2$, $T_2$).] Misquoting Minkowski: “Henceforth, location for itself, and speed for itself shall completely reduce to a mere shadow, and only some sort of union of the two shall preserve independence.”

  19. A real example: High energy physics trigger analysis. Local: $T_1$ = 2 seconds (2000 msec) on CPU. Remote: $T_2$ = 30 msec on FPGA, plus 10 msec of latency each way over 2000 km: 30 + 10 + 10 = 50 msec. 40x acceleration. [Space-time diagram, not to scale: 0 km (Illinois) to 2000 km (Virginia).] Nhan Tran, Fermilab, et al. arXiv:1904.08986
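
As a rough sketch of how far this example stretches, the following Python snippet reproduces the 50 msec and 40x figures and then asks at what distance the remote FPGA would stop beating the local CPU. It assumes the same 5 μs per km one-way latency implied by "10 msec per 2000 km"; the helper names are illustrative only, and the model ignores bandwidth and queuing.

```python
# One-way latency per km (assumption: 10 msec over 2000 km, as on the slide).
FIBER_SEC_PER_KM = 5e-6


def remote_time(compute_sec: float, distance_km: float) -> float:
    """Compute time plus round-trip latency to a remote device."""
    return compute_sec + 2 * distance_km * FIBER_SEC_PER_KM


def break_even_km(local_sec: float, remote_compute_sec: float) -> float:
    """Distance at which the remote response time equals the local one."""
    return (local_sec - remote_compute_sec) / (2 * FIBER_SEC_PER_KM)


local = 2.0                          # T1: 2 seconds on a local CPU
remote = remote_time(0.030, 2000)    # T2: 30 msec on an FPGA 2000 km away
speedup = local / remote
limit = break_even_km(local, 0.030)

print(f"remote = {remote * 1000:.0f} msec, speedup = {speedup:.0f}x")
print(f"remote stays faster out to about {limit:,.0f} km")
```

In this latency-only model the remote accelerator wins at any terrestrial distance, since the break-even point lies far beyond the Earth's circumference.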

  20. Reasoning about the computing continuum (a) Assumptions. A1: N identical consumers, each of which requests one compute unit per sec, distributed X secs apart. A2: Infinite bandwidth, i.e., only latency. A3: A computer takes T secs to complete a compute unit. A4: A compute center containing Z computers is faster by a factor of $\sqrt{Z}$.

  21. Reasoning about the computing continuum (b) Without response time bounds. Max time is: on N: $\frac{T}{\sqrt{N}} + \sqrt{\frac{N}{2}}\,X$; local: $T$. E.g., N = 100, T = 0.01, X = 0.0001: on N: $\frac{0.01}{\sqrt{100}} + \sqrt{\frac{100}{2}} \times 0.0001 = 0.001 + 0.00071 = 0.00171$ s; local: 0.01 sec.
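
A short Python sketch of this comparison, reading the slide's formula as T/√N + √(N/2)·X (an interpretation that reproduces the slide's 0.001 + 0.00071). The function names are hypothetical.

```python
import math


def max_time_aggregated(n: int, t: float, x: float) -> float:
    """Worst-case time when all N consumers share one center of N computers.

    First term: compute time, sped up by sqrt(N) under assumption A4.
    Second term: latency term sqrt(N/2) * X, as read from the slide.
    """
    return t / math.sqrt(n) + math.sqrt(n / 2) * x


def max_time_local(t: float) -> float:
    """Worst-case time when every consumer computes on its own machine."""
    return t


n, t, x = 100, 0.01, 0.0001
aggregated = max_time_aggregated(n, t, x)
local = max_time_local(t)

print(f"aggregated: {aggregated:.5f} s, local: {local:.5f} s")
```

With the slide's parameters, aggregation is several times faster than computing locally, because the √N speedup outweighs the added latency.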

  22. Reasoning about the computing continuum (c) With response time bound, B. From A1, there are $\pi D^2/X^2$ consumers within distance D of a compute center. We want to know D for which $\frac{T}{\sqrt{\text{size}}} + 2D \le B$. As size is $\pi D^2/X^2$, we want to solve $\frac{T}{\sqrt{\pi D^2/X^2}} + 2D = B$ for D. With B = 0.01, T = 0.001, X = 0.0001 sec: D = 0.004964 sec (~1000 km). Then: Size = $\pi D^2/X^2$ = 7854. Max processing time is $2 \times 0.004964 + 0.001/\sqrt{7854} \approx 0.01$ seconds.
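
The bound can be solved in closed form. Assuming the compute term is T/√size (from A4), multiplying through by D turns the equation into the quadratic 2D² − BD + TX/√π = 0, and the larger root is the largest radius that still meets the bound. The sketch below is one way to do this in Python; `max_radius` is a hypothetical name. A direct solve gives D just under 0.005 sec and a size of roughly 7800, close to the figures quoted on the slide.

```python
import math


def max_radius(b: float, t: float, x: float) -> float:
    """Largest D (in seconds of latency) with T / sqrt(pi D^2 / X^2) + 2D <= B.

    Multiplying by D gives 2 D^2 - B D + T X / sqrt(pi) = 0; the bound holds
    between the two roots, so the larger root is the maximum radius.
    """
    c = t * x / math.sqrt(math.pi)
    disc = b * b - 8 * c
    if disc < 0:
        raise ValueError("no radius satisfies the response time bound")
    return (b + math.sqrt(disc)) / 4


b, t, x = 0.01, 0.001, 0.0001
d = max_radius(b, t, x)
size = math.pi * d ** 2 / x ** 2            # consumers (and computers) served
response = t / math.sqrt(size) + 2 * d      # should sit exactly on the bound

print(f"D = {d:.6f} s, size = {size:.0f}, response = {response:.6f} s")
```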

  23. Reasoning about the computing continuum (d) Discussion. The model emphasizes the importance of aggregation. The model can surely be improved: • Empirical data on scaling of cost and speed with size • Data transfer costs • Empirical data on workloads. Optimal solutions will likely involve compute centers of multiple sizes.

  24. Small and midsize data centers: Server intensity. Source: LBNL-2001025

  25. Coding the continuum. Code: verb. 1) to arrange or enter in a code

  26. Coding the continuum. Code: verb. 1) to arrange or enter in a code 2) to write code for

  27. Coding the continuum. Code: verb. 1) to arrange or enter in a code 2) to write code for. Now that the machine has disintegrated across the net, how do we program it?

  28. Coding the continuum. Code: verb. 1) to arrange or enter in a code 2) to write code for. Now that the machine has disintegrated across the net, how do we program it? Continuum-aware programming model: data fabric, function fabric, trust fabric, cost map.

  29. Coding the continuum: Serial crystallography. doi: 10.1038/nature09750

  30. Coding the continuum: Serial crystallography. For each sample: • Image crystals at ~50 Hz (1 image/20 msec) • Validate each image (6 MB, 5 msec) • After 1000, quality control (6 GB, 1 sec; 1K images/15 sec) • After 26000, full analysis (160 GB, 60 sec; 26K images/7 min) • If good: determine crystal structure (0.2-1 TB, 3000 sec; multiple chips @ 7 min each) • Return crystal structure

  31. Coding the continuum: Serial crystallography. Data and time per stage: 6 MB, 5 msec (1 image/20 msec); 6 GB, 1 sec (1K images/15 sec); 160 GB, 60 sec (26K images/7 min); 0.2-1 TB, 3000 sec (multiple chips @ 7 min each). Latency budgets expressed as distance: 1 msec = 50 km; 200 msec = 10 000 km; 12 000 msec = 600 000 km [moon = 384 000 km]; 600K msec = 30 Mkm [L1 = 1.5M km].
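
The distance figures on this slide all follow a single ratio of 50 km per msec of latency budget. That ratio is inferred here from the slide's own pairs rather than stated on it, so the following Python sketch should be read as an illustration under that assumption; `reach_km` and the constant names are made up.

```python
# Inferred from the slide's pairs (1 msec = 50 km, 200 msec = 10 000 km, ...).
KM_PER_MSEC = 50

MOON_KM = 384_000        # Earth-moon distance quoted on the slide
L1_KM = 1_500_000        # Sun-Earth L1 distance quoted on the slide


def reach_km(budget_msec: float) -> float:
    """How far away a computation may run for a given latency budget."""
    return budget_msec * KM_PER_MSEC


for budget in (1, 200, 12_000, 600_000):
    km = reach_km(budget)
    notes = []
    if km > MOON_KM:
        notes.append("beyond the moon")
    if km > L1_KM:
        notes.append("beyond L1")
    print(f"{budget:>7} msec -> {km:>12,.0f} km  {', '.join(notes)}")
```

The looser the deadline of a processing stage, the wider the set of computers that can serve it.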

  32. Advanced Photon Source to Argonne Leadership Computing Facility: 1 km, 10 μsec RTT

  33. Similar needs arise across modern (AI-enabled) science. [Diagram linking scientific instruments (major user facilities, laboratories, automated labs, sensors), simulation codes, data (experimental data, reference data, computed properties, scientific literature), AI methods (model creation, model training, inference, HPO, UQ, surrogates, active/reinforcement learning), and infrastructure (AI accelerators, compute, runtime system, operating system, compilers, databases, workflow automation, data management, authentication/access).]

  34. Learned Function Accelerators (LFAs)

  35. Coding the continuum: Closed solution. Source: https://read.acloud.guru/aws-greengrass-the-missing-manual-2ac8df2fbdf4

  36. Coding the continuum: Elements of an open solution. Write programs; model registry (DLHub); flows (Automate); cost map (SCRIMP); function fabric (funcX); data fabric (Data services); trust fabric (Auth). Thanks to colleagues, especially: Rachana Ananthakrishnan, Yadu Babuji, Ben Blaiszik, Kyle Chard, Ryan Chard, Zhuozhao Li, Tyler Skluzacek, Steve Tuecke, Anna Woodard, Logan Ward.

  37. Coding the continuum: Elements of an open solution. Write programs (http://parsl-project.org, https://arxiv.org/pdf/1905.02158); model registry (DLHub); flows (Automate); cost map (SCRIMP); function fabric (funcX); data fabric (Data services); trust fabric (Auth).

  38. Coding the continuum: Elements of an open solution. Write programs; model registry (DLHub); flows (Automate); cost map (SCRIMP); function fabric (funcX); data fabric (Data services); trust fabric (Auth). Function fabric: portable code (Python, Docker, Shifter, Singularity); any access (SSH, Globus, cluster or HPC scheduler); any computer (clusters, clouds, HPC, accelerators).
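
To make the "cost map" idea concrete, here is a minimal Python sketch of latency-aware placement: given candidate sites, pick the one with the lowest compute time plus round-trip latency, optionally subject to a response time bound. It is not the API of SCRIMP, funcX, or Parsl; the `Site` class, the `place` function, and the site names and numbers are all hypothetical, and the model counts only latency at an assumed 5 μs per km.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

FIBER_SEC_PER_KM = 5e-6  # assumed one-way latency per km


@dataclass(frozen=True)
class Site:
    """A candidate place to run a function (hypothetical cost-map entry)."""
    name: str
    compute_sec: float   # time to run the function at this site
    distance_km: float   # distance from the data source

    def response_sec(self) -> float:
        """Compute time plus round-trip latency."""
        return self.compute_sec + 2 * self.distance_km * FIBER_SEC_PER_KM


def place(sites: Sequence[Site], bound_sec: Optional[float] = None) -> Optional[Site]:
    """Return the fastest site that meets the bound, or None if none does."""
    candidates = [
        s for s in sites
        if bound_sec is None or s.response_sec() <= bound_sec
    ]
    return min(candidates, key=Site.response_sec) if candidates else None


# Illustrative entries only; the first two echo the trigger-analysis example.
sites = [
    Site("local-cpu", compute_sec=2.0, distance_km=0),
    Site("remote-fpga", compute_sec=0.030, distance_km=2000),
    Site("far-hpc", compute_sec=0.010, distance_km=10_000),
]

best = place(sites)
print(best.name if best else "no site available")
```

A fuller cost map would also weigh data transfer time, monetary cost, and energy, as the earlier slides suggest; this sketch keeps only the speed-versus-distance trade-off.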
