CS 744: PYWREN
Shivaram Venkataraman, Fall 2020
ADMINISTRIVIA
- Deadline: tonight → moved to Friday
- Project check-ins due Nov 20th
- In-class project presentations Dec 8th and Dec 10th: 5-min talks about your project
- Regrade requests for Midterm 1 open soon
- Project grade breakdown (on Canvas): Intro 5%, Mid-semester check-in 5%, Presentation 10%, Final Report 10%
NEW HARDWARE
Big data systems (computation models, engines, storage) have evolved alongside new hardware:
- Infiniband networks
- Compute accelerators
- Serverless computing
- Non-volatile memory
SERVERLESS COMPUTING
No servers?
MOTIVATION: USABILITY
A data scientist who just wants to run data analysis on AWS, Google, Azure, etc. faces:
- What instance type? What base image? How many instances to spin up? What price, spot or on-demand?
→ This makes the cloud difficult to use.
ABSTRACTION LEVEL?
- Application: e.g., logistic regression, or a SQL query (Snowflake)
- Framework: e.g., Spark computes on a subset of machines (RDDs)
- Hardware: Amazon EC2, CloudLab, a private cluster (VMs / servers)
STATELESS DATA PROCESSING
- In Spark / MapReduce, intermediate state (e.g., shuffle data) lives on local disk.
- In serverless, local storage is ephemeral, so intermediate state needs to be remote: S3, Redis, etc.
"SERVERLESS" COMPUTING
Provided by the cloud provider: you submit a function (a "lambda") to be executed.
- Single-core execution
- Time bound: 300–900 seconds
- 512 MB of storage in /tmp
- Up to 3 GB of RAM
- Languages: Python, Java, Node.js
- Cloud storage / database services for state
PYWREN API
- Language-integrated: write plain Python (e.g., test.py)
- Uses cloudpickle to capture the function and its dependencies, and ships the libraries to the cloud
- map: similar to PySpark's map
- get / result: blocks until done, similar to the Ray API
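To make the API shape concrete, here is a minimal local mock with the same surface (map returning futures, a blocking result). `LocalExecutor` is a hypothetical stand-in: real PyWren cloudpickles the function and runs it on AWS Lambda rather than a local thread pool.

```python
# Minimal local sketch of a PyWren-style executor API (assumption:
# LocalExecutor is illustrative, not part of PyWren itself).
from concurrent.futures import ThreadPoolExecutor

class LocalExecutor:
    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def map(self, fn, data):
        # Each element becomes one independent, stateless invocation,
        # mirroring executor.map(fn, data) in PyWren.
        return [self._pool.submit(fn, x) for x in data]

pwex = LocalExecutor()
futures = pwex.map(lambda x: x * x, range(5))
results = [f.result() for f in futures]   # result() blocks, like PyWren's
print(results)  # [0, 1, 4, 9, 16]
```

The key property the mock preserves: each invocation is independent and stateless, so the provider is free to run them anywhere.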
PYWREN: HOW IT WORKS
S3 acts as a distributed key-value store (put / get).
future = runner.map(fn, data)   # invoke: containers fetch fn & data
future.result()                 # fetch the result variable back to your laptop
Work is split between your laptop and the cloud.
HOW IT WORKS
future = runner.map(fn, data)
1. Serialize fn and data; put on S3
2. Invoke Lambda; each worker pulls its job from S3
3. Worker downloads the Anaconda runtime and runs the Python code
4. Worker pickles the result and sticks it in S3
future.result()
5. Laptop polls S3, unpickles the result, and returns it
(your laptop ↔ the cloud)
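The steps above can be sketched end-to-end with a dict standing in for the S3 bucket. All names here (`FakeS3`, `submit`, `result`) are illustrative, not PyWren internals; real PyWren uses cloudpickle (plain pickle cannot serialize closures) and actually invokes Lambda instead of running the job inline.

```python
# Sketch of the PyWren control flow; a dict stands in for S3.
import pickle
import uuid

FakeS3 = {}   # key -> bytes, stand-in for the S3 bucket

def times10(x):
    return x * 10

def submit(fn, arg):
    job_key = f"job/{uuid.uuid4()}"
    result_key = f"result/{uuid.uuid4()}"
    # 1. serialize fn + data, put on "S3"
    FakeS3[job_key] = pickle.dumps((fn, arg))
    # 2-4. "invoke Lambda": the worker pulls the job, runs it,
    #      and pickles the result back into "S3"
    fn2, arg2 = pickle.loads(FakeS3[job_key])
    FakeS3[result_key] = pickle.dumps(fn2(arg2))
    return result_key

def result(result_key):
    # 5. the laptop polls "S3" for the result key, unpickles, returns
    return pickle.loads(FakeS3[result_key])

key = submit(times10, 7)
print(result(key))  # 70
```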
STATELESS FUNCTIONS: WHY NOW?
What are the trade-offs?
- All data is read over the network → more network I/O needed
- But network bandwidth is now pretty good: comparable to local SSD bandwidth
- The network could still be the bottleneck
SHUFFLE PHASE IN MAP AND REDUCE?
- Benchmark: sort (same as in the MapReduce paper), now run with lambdas
- Input data → map tasks bucket records by key (key1, key2, ...) → reduce tasks → output data
- The shuffle writes many small key/value files; a blob store like S3 is not good for small objects
- Use an in-memory key-value store (e.g., Redis) for the shuffle instead
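A minimal sketch of that shuffle, with a dict of lists standing in for Redis (which would use RPUSH/LRANGE); the task structure and names are illustrative, not the paper's code. Mappers partition records by key into per-reducer buckets in the KV store, and each reducer aggregates its bucket.

```python
# Lambda-style shuffle through an in-memory KV store (dict = Redis stand-in).
from collections import defaultdict

kv = defaultdict(list)   # "shuffle/<bucket>" -> list of (key, value)

def map_task(records, n_reducers):
    for key, value in records:
        bucket = hash(key) % n_reducers       # partition by key
        kv[f"shuffle/{bucket}"].append((key, value))

def reduce_task(bucket):
    out = defaultdict(int)
    for key, value in kv[f"shuffle/{bucket}"]:
        out[key] += value                     # e.g., sum values per key
    return dict(out)

map_task([("a", 1), ("b", 2), ("a", 3)], n_reducers=2)
merged = {}
for b in range(2):
    merged.update(reduce_task(b))
print(merged == {"a": 4, "b": 2})  # True
```

Each `append` is a tiny write, which is exactly why a memory store fits better here than a blob store with per-object overhead.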
PARAMETER SERVERS
- ML model training / prediction (e.g., ad-click prediction): workers do sparse reads of the stored model and compute updates
- Use lambdas to run the "workers"; run the parameter server as a service (on Redis, or on VMs, etc.)
- How do you measure or profile a function's resource requirements? Run the function locally and use a profiler? [Recent work!]
- Fault tolerance: checkpoint (before the time limit) and resume
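A sketch of the worker/parameter-server interaction, with a dict standing in for the KV-store-backed server; the weights, learning rate, and `worker` helper are all illustrative. Each lambda reads only the parameters its sparse gradient touches and pushes back per-key updates.

```python
# "Parameter server as a service" sketch: a dict stands in for Redis.
params = {"w1": 0.5, "w2": -0.2, "w3": 0.0}   # model stored in the KV store

def worker(sparse_grad, lr=0.1):
    # Sparse read: touch only the keys in this gradient,
    # then write the updated values back (one update per key).
    for k, g in sparse_grad.items():
        params[k] = params[k] - lr * g

worker({"w1": 1.0})     # one lambda's sparse update
worker({"w2": -2.0})    # another lambda touches a different key
print(round(params["w1"], 6), round(params["w2"], 6))  # 0.4 0.0
```

Because lambdas are stateless and time-limited, the model itself must live in the external store, and a worker that dies can simply restart from the current parameters.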
WHEN SHOULD WE USE SERVERLESS?
Yes!
- When you need elasticity
- When you don't need local state (actors)
Maybe not?
- Iterative workloads that need fine-grained communication across workers: state from one iteration is reused in the next, which serverless handles poorly
- The lambdas might not all be active at the same time!
SUMMARY
Motivation: Usability of big data analytics
Approach: Language-integrated cloud computing
Features
- Break computation down into stateless functions
- Schedule on serverless containers
- Use external storage for state management
Open questions on scheduling, overheads
DISCUSSION https://forms.gle/PAMDKmwHepmPWDrBA
(Discussion) Why doesn't increasing the scale by K workers give a Kx improvement?
- Hard to know how to choose the number of workers
- Compute is very short compared to I/O
- More workers mainly reduces the read/write time to Redis
Consider you are a cloud provider (e.g., AWS) implementing support for serverless. What could be some of the new challenges in scheduling these workloads? How would you go about addressing them?
- Mapping lambda functions to machines: how do we do this?
- Locality: does one lambda talk to some Redis shard? Can we infer it?
- When do we schedule a new container vs. reuse an existing one?
- Need to find the optimal configuration: use ML?
- Resource requirements are fixed: 1 core, up to 3 GB, 900 s
OPEN QUESTIONS
- Scalable scheduling: low latency with a large number of functions?
- Debugging: correlate events across functions?
- Launch overheads: fraction of time spent in setup (OpenLambda)
- Resource limits: AWS Lambda limit raised to 15 minutes (Oct 2018)
Cold vs. warm starts: a container is kept warm for ~5 minutes after an invocation and can be reused if you run again within that window; an Azure paper studies keep-alive policies.
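A toy simulation of such a keep-alive policy. The 5-minute window and the arrival times are illustrative, and `classify` is a made-up helper, not any provider's actual scheduler: a request reuses the warm container if it arrives within the keep-alive window of the previous one, otherwise it pays a cold start.

```python
# Keep-alive policy sketch: warm reuse within a fixed window.
KEEP_ALIVE = 300.0   # seconds a container stays warm (assumed 5 min)

def classify(arrival_times):
    """Label each invocation 'cold' or 'warm' given arrival times in seconds."""
    warm_until, labels = None, []
    for t in arrival_times:
        if warm_until is not None and t <= warm_until:
            labels.append("warm")          # reuse the warm container
        else:
            labels.append("cold")          # cold start: new container
        warm_until = t + KEEP_ALIVE        # reset the keep-alive timer
    return labels

print(classify([0, 200, 450, 900]))  # ['cold', 'warm', 'warm', 'cold']
```

Longer keep-alive windows trade idle memory on the provider's machines for fewer cold starts, which is exactly the policy question the Azure work explores.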