
CS 744: RAY, Shivaram Venkataraman, Fall 2020 (PowerPoint presentation)



  1. CS 744: RAY. Shivaram Venkataraman, Fall 2020

  2. ADMINISTRIVIA. Assignment grades? (handwritten: by next week). Project proposal, aka Introduction (10/16): Introduction, Related Work, Timeline (with eval plan). Midterm: Oct 22.

  3. MACHINE LEARNING: STACK. Data parallelism (PyTorch, training); performance and portability (TVM); pipeline parallelism (PipeDream).

  4. REINFORCEMENT LEARNING

  5. [Figure with handwritten annotations. Legible labels: "Reward", "algo".]

  6. [Figure; handwritten annotations not recoverable.]

  7. RL SETUP. [Diagram with handwritten annotations; the only legible label is "Imitation".]

  8. RL REQUIREMENTS. Simulation: fine-grained computation; each simulation could take milliseconds or hours; simulators may be stateless or stateful. Training: data pre-processing. Serving: very low latency execution. Annotations (partly legible): flexibility rather than a static execution plan; high throughput of tasks; future computation depends on the output of past tasks.

  9. RAY API. Tasks: futures = f.remote(args). Actors: actor = Class.remote(args); futures = actor.method.remote(args). Both: objects = ray.get(futures); ready = ray.wait(futures, k, timeout). Annotations (partly legible): futures can be passed as arguments to tasks; within a task you can spawn other tasks or wait for them.

  10. RAY API (nested tasks). Same calls as on the previous slide: f.remote(args), Class.remote(args), actor.method.remote(args), ray.get(futures), ray.wait(futures, k, timeout). Annotated example, as far as legible: a remote function f(args) loops over i, calls g.remote(i) to spawn a nested task each time, and then calls ray.wait on the resulting futures.
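The task calls on the two RAY API slides can be imitated with Python's standard library. The sketch below is an analogy, not Ray itself: `ThreadPoolExecutor.submit` stands in for `f.remote(args)`, `Future.result()` for `ray.get`, and the helper `wait_k` for `ray.wait(futures, k, timeout)`. The names `POOL`, `square`, `sum_of_squares`, and `wait_k` are hypothetical and chosen only for illustration.

```python
import concurrent.futures as cf

# A shared thread pool plays the role of the cluster (analogy only, not Ray).
POOL = cf.ThreadPoolExecutor(max_workers=8)

def square(x):
    """A stateless task (hypothetical example function)."""
    return x * x

def sum_of_squares(n):
    """A task that spawns nested tasks and collects their results."""
    futures = [POOL.submit(square, i) for i in range(n)]  # like g.remote(i)
    return sum(f.result() for f in futures)               # like ray.get(futures)

def wait_k(futures, k, timeout=None):
    """Hypothetical stand-in for ray.wait: return (ready, rest) once
    k futures have finished or the timeout expires."""
    ready = []
    try:
        for f in cf.as_completed(futures, timeout=timeout):
            ready.append(f)
            if len(ready) >= k:
                break
    except cf.TimeoutError:
        pass  # fewer than k finished in time; return what we have
    rest = [f for f in futures if f not in ready]
    return ready, rest

outer = POOL.submit(sum_of_squares, 4)   # outer task spawns 4 inner tasks
total = outer.result()                   # 0 + 1 + 4 + 9

futs = [POOL.submit(square, i) for i in range(5)]
ready, rest = wait_k(futs, k=2, timeout=5.0)
```

The pool is sized larger than the number of concurrently blocked tasks on purpose: in this thread-based analogy a nested task that waits on its children holds a worker, so too small a pool could deadlock.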

  11. COMPUTATION MODEL. Dotted lines are control edges: a .remote() call spawns a task or creates an actor. Solid lines are data edges, for example the result fetched by get(). Stateful edges: actions on an actor happen sequentially. Annotation: Lineage!
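The point that actions on an actor happen sequentially can be illustrated with a small standard-library sketch. This is not Ray's implementation: it assumes a single worker thread per actor, so method calls run one at a time in submission order while callers still get futures back immediately. `Counter` and `ActorHandle` are hypothetical names.

```python
import concurrent.futures as cf

class Counter:
    """Plain stateful class (hypothetical example)."""
    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value

class ActorHandle:
    """Runs every method call of one object on a single worker thread,
    so calls execute one at a time in submission order."""
    def __init__(self, cls, *args):
        self._executor = cf.ThreadPoolExecutor(max_workers=1)
        # Construct the object on the actor's own thread.
        self._obj = self._executor.submit(cls, *args).result()

    def call(self, method, *args):
        # Returns a future immediately, loosely like actor.method.remote(args).
        return self._executor.submit(getattr(self._obj, method), *args)

    def stop(self):
        self._executor.shutdown()

actor = ActorHandle(Counter)
futures = [actor.call("add", n) for n in (1, 2, 3)]
results = [f.result() for f in futures]   # running totals of the counter
actor.stop()
```

Because each call sees the state left by the previous one, the results depend on call order, which is what distinguishes these stateful edges from the data edges between stateless tasks.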

  12. ARCHITECTURE. [Architecture diagram. Legible annotations: "deterministic", "key", "hash", "tasks", "state", "idea"; the rest is not recoverable.]

  13. Global control store. Object table, Task table, Function table. Annotations (partly legible): sort of a database; externalizes state. Object table: list of all objects and their metadata (locations). Task table: lineage of tasks. Function table: code blocks corresponding to tasks. Shard and replicate to scale more easily, simplify design, and support fault tolerance.

  14. RAY SCHEDULER. Components: local scheduler, global scheduler, global control store. Annotations (partly legible): can the local scheduler take the task from a .remote call? If busy, wait for a timeout and forward to the global scheduler; queue length is used to determine whether a node is busy; locality.
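The scheduler slide describes a bottom-up flow: try the local scheduler first, forward to the global scheduler only when the node is busy. A minimal sketch of that flow follows, under loud assumptions: the threshold `QUEUE_LIMIT`, the `Node` class, and the placement heuristic (prefer nodes holding more of the task's inputs, then shorter queues) are all made up for illustration and are not taken from the lecture or from Ray.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    queue: list = field(default_factory=list)   # tasks waiting on this node
    objects: set = field(default_factory=set)   # ids of objects stored locally

QUEUE_LIMIT = 2  # assumed threshold for "busy"; not a value from the lecture

def global_schedule(nodes, task, inputs):
    """Pick the node holding the most inputs; break ties by shorter queue."""
    best = max(nodes, key=lambda n: (len(n.objects & inputs), -len(n.queue)))
    best.queue.append(task)
    return best

def submit(local, nodes, task, inputs=frozenset()):
    """Bottom-up scheduling: keep the task locally unless the node is busy."""
    if len(local.queue) < QUEUE_LIMIT:
        local.queue.append(task)
        return local
    return global_schedule(nodes, task, inputs)

a = Node("A")
b = Node("B", objects={"x"})
c = Node("C")
nodes = [a, b, c]

placed = [submit(a, nodes, t).name for t in ("t1", "t2")]      # A has room
placed.append(submit(a, nodes, "t3", frozenset({"x"})).name)   # A busy; B holds x
placed.append(submit(a, nodes, "t4").name)                     # A busy; no inputs
```

In this sketch the queue length doubles as the busy signal, matching the slide's note that queue length determines whether a node is busy.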

  15. FAULT TOLERANCE. Tasks: replay execution of tasks using lineage. Actors: periodically checkpoint actors; on failure, restore and replay messages. GCS: replication (annotation partly legible; likely chain replication). Scheduler: stateless! Nothing to recover? Re-spawn or launch a new scheduler.
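Lineage-based recovery of task outputs can be sketched in a few lines. The example below is a toy model, not Ray's code: it assumes tasks are deterministic, keeps lineage in a dictionary named `task_table`, and recomputes a lost object by re-running its task and, recursively, any missing inputs. All names (`task_table`, `object_store`, `run_task`, `get`) are hypothetical.

```python
# Task table: object id -> (function, ids of input objects). This is the lineage.
task_table = {}
# Object store: object id -> value. Entries can be lost when a node fails.
object_store = {}
executions = []  # log of task executions, to make replay visible

def get(obj_id):
    """Return the object, re-executing its task (and inputs) if it is missing."""
    if obj_id not in object_store:
        fn, input_ids = task_table[obj_id]
        args = [get(i) for i in input_ids]   # recover missing inputs first
        executions.append(obj_id)
        object_store[obj_id] = fn(*args)
    return object_store[obj_id]

def run_task(obj_id, fn, *input_ids):
    """Record lineage, then compute and store the output."""
    task_table[obj_id] = (fn, input_ids)
    return get(obj_id)

run_task("a", lambda: 2)
run_task("b", lambda a: a + 3, "a")
run_task("c", lambda b: b * 10, "b")

# Simulate a node failure that loses two objects.
del object_store["b"], object_store["c"]
executions.clear()
value = get("c")   # replays b, then c; a is still available
```

Replay only reproduces the same value because the tasks are deterministic; stateful actors need the checkpoint-and-replay approach noted on the slide instead.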

  16. SUMMARY. Ray: unified system for ML training, serving, simulation. Flexible API with support for stateless tasks and stateful actors. Distributed scheduling, global control store.

  17. DISCUSSION https://forms.gle/PN5FSJB6vVkDjoih8

  18. Consider you are implementing two apps: a deep learning model training and a sorting application. When will you use tasks vs actors and why? Annotations (partly legible): tasks are stateless; actors hold state. Sorting: deterministic operations, load balancing; can it be divided into smaller parts, and does it still have dependencies? Model training: weights are state; dependencies between iterations; data parallel; fine-grained recovery.

  19. [Handwritten notes, mostly illegible. Legible fragments: "replica", "new node", "recovery?".]

  20. NEXT STEPS. Next class: Clipper. Last lecture on ML! Annotations (partly legible): scalability, linear vs super-linear; hardware.
