SEUSS: Skip Redundant Paths to Make Serverless Fast
James Cadden, Thomas Unger, Yara Awad, Han Dong, Orran Krieger, Jonathan Appavoo
Department of Computer Science, Boston University
The Proceedings of EuroSys, 2020
April 29th, 2020
Serverless Computing
1. Function-as-a-Service (FaaS): on-demand execution of a client code snippet (functions)
2. Applications are deployed and scaled automatically
3. Function start time is dominated by deterministic setup & install paths
[Figure: invocation flow. (1) An event arrives at the Platform API; (2) the app is pulled from the application database (functions f1, f2, f3, …, fn; .js, .js, .py, .rb); (3) setup & install in an isolated execution environment (E1); (4) Run! on the compute resources.]
Serverless Computing
4. Functions deploy quickly using a pre-initialized environment!
[Figure: invocation flow. (1) An event arrives at the Platform API; (2) Run! in the isolated execution environment E1 on the compute resources. Application database: functions f1, f2, f3, …, fn (.js, .js, .py, .rb).]
Serverless Computing
5. FaaS platform utility becomes a matter of cache efficiency!
6. The system-level mechanism defines the security, cache density, and responsiveness
[Figure: FaaS API and function database (functions f1, f2, f3, …, fn; .js, .js, .py, .rb) in front of an EXECUTION CACHE of isolated execution environments (P1) on the compute resources.]
FaaS Environment Caching
Node.js “launcher” provides a REST API to import and run an arbitrary JavaScript function (f1 .js)

Cache Primitive                Machine Density   Start Time
P   Linux Process              4200              0.3 s
C   Docker Container           3000              0.5 to 4 s
VM  Container in a MicroVM     450               1 to 7 s

Linux v4.15; Docker v18.09; [Xeon 2.20 GHz; 88GB]
[Figure: the primitives P, C and VM placed on two axes, less secure to more secure and less overhead to more overhead.]
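As a rough reading of the density column, dividing the node's 88 GB of memory by each machine density gives an implied memory budget per cached environment. This is back-of-the-envelope arithmetic only: it assumes density is bound purely by memory, treats 88 GB as 88 × 1024 MB, and ignores memory reserved for the host. The slide does not say how the densities were measured, and the names below are illustrative.

```python
# Assumption: the 88 GB node is treated as 88 * 1024 MB, all available to the cache.
NODE_MEMORY_MB = 88 * 1024

# Machine densities as listed in the table on the slide.
density = {
    "Linux process": 4200,
    "Docker container": 3000,
    "Container in a MicroVM": 450,
}


def implied_budget_mb(instances: int) -> float:
    """Memory available per cached environment if memory is the only limit."""
    return NODE_MEMORY_MB / instances


budgets = {name: round(implied_budget_mb(n), 1) for name, n in density.items()}

for name, mb in budgets.items():
    print(f"{name}: ~{mb} MB per cached environment")
```

Under these assumptions the budget grows by roughly an order of magnitude from a process to a container in a MicroVM, which is the density cost of the stronger isolation primitive.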
Serverless Execution via Unikernel Snapshots
Cadden, et al. “Skip Redundant Paths to Make Serverless Fast”, In The Proceedings of EuroSys ’20
Is there… a method to better enable reuse across the entire memory footprint of the function?
We want to…
1. Shorten the setup time of new function invocations
2. Improve cache density for fast repeat invocations
Serverless Execution via Unikernel Snapshots
What is a unikernel?
In SEUSS, functions are deployed inside of dedicated unikernels
1. Unikernels support strong isolation semantics
2. Enable “black box” capture of the environment’s memory footprint into a snapshot (object)
3. Page-level sharing can be applied ubiquitously across the application and kernel layers
[Figure: a unikernel as a single address space holding Function #1 + language runtime, filesystem, shared libs, TCP/IP, VMM and scheduler on top of a hardware-level interface; unikernels F1, F2, F3, F4 run on a lightweight kernel.]
Environment Snapshots
Function invocation times are dominated by deterministic import & initialization procedures
Snapshots captured at strategic points in time can be used as templates for deploying execution
[Figure: cold start of function f1 {`foo`} (.js) from function code & libraries. Timeline stages: construct environment, initialize runtime, import code, generate bytecode, import arguments, run; the axis separates start time from run time. Start paths: cold 7.67 ms, warm 2.95 ms, hot 0.82 ms. Initialization on boot; snapshot points numbered 0 to 4.]
[Figure: SEUSS operating system with a snapshot cache holding unikernel binaries, runtime snapshots and function snapshots; sizes shown: 7.67 MB, 115 MB, 9 MB, 4 MB, 211 MB, 12 MB. Requests and replies pass through an IO core to cores 1 to 4.]
Environment Snapshots
Immutable snapshot images act as reusable launch points for new function invocations
[Figure: same invocation timeline and memory layout as the previous slide (cold 7.67 ms, warm 2.95 ms, hot 0.82 ms), here labeled "warm start" for function f1 {`foo`} (.js). Annotation: runtime snapshot used for new invocations.]
Environment Snapshots
Function-specific snapshots provide near-immediate execution of function bytecode
[Figure: same invocation timeline and memory layout as the previous slides (cold 7.67 ms, warm 2.95 ms, hot 0.82 ms), here labeled "hot start" for function f1 {`foo`} (.js). Annotation: function snapshot used for repeat invocations.]
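The three start paths on these slides can be sketched as a choice of where on the timeline an invocation resumes. This is a minimal illustration, not the SEUSS implementation: the `SnapshotCache` class and the two resume indices are hypothetical, and the exact capture points (after "initialize runtime" for the runtime snapshot, after "generate bytecode" for the function snapshot) are an assumption read off the slide's timeline.

```python
# Timeline stages as named on the slide.
STAGES = [
    "construct environment",
    "initialize runtime",
    "import code",
    "generate bytecode",
    "import arguments",
    "run",
]

# Assumed capture points: index of the first stage still to run after restoring.
RUNTIME_SNAPSHOT_RESUME = 2   # runtime ready, no function code imported yet
FUNCTION_SNAPSHOT_RESUME = 4  # function bytecode generated, arguments not yet imported


class SnapshotCache:
    """Hypothetical cache that picks the cheapest available start path."""

    def __init__(self):
        self.has_runtime_snapshot = False
        self.function_snapshots = set()

    def invoke(self, fn_name):
        if fn_name in self.function_snapshots:
            kind, resume = "hot", FUNCTION_SNAPSHOT_RESUME
        elif self.has_runtime_snapshot:
            kind, resume = "warm", RUNTIME_SNAPSHOT_RESUME
        else:
            kind, resume = "cold", 0
        executed = STAGES[resume:]
        # Passing through a capture point leaves a snapshot behind for later calls.
        self.has_runtime_snapshot = True
        self.function_snapshots.add(fn_name)
        return kind, executed


cache = SnapshotCache()
first = cache.invoke("foo")   # nothing cached yet
second = cache.invoke("bar")  # new function, runtime snapshot available
third = cache.invoke("foo")   # repeat invocation of a cached function
for kind, stages in (first, second, third):
    print(kind, "->", ", ".join(stages))
```

Each path skips a deterministic prefix of the timeline, which is the "redundant path" the title refers to; only the stages after the restored snapshot are executed again.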
Snapshot Lineages
Page-level sharing & copy-on-write (COW) can be applied to drastically reduce replicated state
[Figure: a runtime snapshot and a function snapshot placed on the invocation timeline (cold 7.67 ms, warm 2.95 ms, hot 0.82 ms). Legend: page refs, written page, pages, registers.]
Snapshot Lineages
Child snapshots contain only a memory ‘diff’ of written pages
Anticipatory optimization enabled by accumulating state within the origin snapshot
[Figure: same snapshot lineage on the invocation timeline as the previous slide (cold 7.67 ms, warm 2.95 ms, hot 0.82 ms). Legend: page refs, written page, pages, registers.]
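The lineage idea can be illustrated with a toy model in which a child snapshot stores only the pages written since its parent and resolves every other page by walking up the lineage. This is a sketch under simplifying assumptions: pages are dictionary entries rather than hardware page-table mappings, registers are omitted, and the `Snapshot` and `Instance` classes are hypothetical names, not SEUSS data structures.

```python
class Snapshot:
    """Immutable snapshot that stores only the pages written since its parent."""

    def __init__(self, parent=None, written=None):
        self.parent = parent
        self.pages = dict(written or {})  # page number -> contents (the 'diff')

    def read(self, page_no):
        # Walk the lineage until some ancestor holds the page.
        snap = self
        while snap is not None:
            if page_no in snap.pages:
                return snap.pages[page_no]
            snap = snap.parent
        raise KeyError(page_no)

    def own_size(self):
        """Number of pages this snapshot stores itself."""
        return len(self.pages)


class Instance:
    """Running environment deployed from a snapshot; writes are copy-on-write."""

    def __init__(self, base):
        self.base = base
        self.dirty = {}  # private copies of written pages only

    def read(self, page_no):
        if page_no in self.dirty:
            return self.dirty[page_no]
        return self.base.read(page_no)

    def write(self, page_no, data):
        self.dirty[page_no] = data  # the base snapshot is never modified

    def capture(self):
        # The child snapshot records just the written pages plus a parent reference.
        return Snapshot(parent=self.base, written=self.dirty)


runtime_snap = Snapshot(written={0: b"kernel", 1: b"runtime"})
inst = Instance(runtime_snap)
inst.write(2, b"foo bytecode")
function_snap = inst.capture()

print(function_snap.own_size())  # pages stored by the child itself
print(function_snap.read(0))     # resolved through the parent
```

In this model, state accumulated in the origin snapshot is stored once and shared by every descendant, so warming the parent shrinks each child's diff.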
SEUSS OS
OS specialized for FaaS compute plane
• Foundation event-driven multi-core kernel (x86_64 native)
• Per-core job scheduler & network (NAT) layer
• In-memory snapshot cache
• Unprivileged unikernel guest:
  • POSIX-ish unikernel (Rumprun)
  • Minimal domain interface (Solo5)
[Figure: software stack. Snapshot capture region (Ring 3): invocation driver.js <*.js>, Node.js, V8 JavaScript engine, Rumprun unikernel, Solo5. Ring 0: SEUSS operating system, EbbRT LibraryOS. Below: virtio TX/RX queues, VCPU, KVM-QEMU.]
FaaS Performance
• Functions run in Docker containers
• 12-core, 88GB nodes
• 3-node OpenWhisk cluster
• custom benchmark tool
[Figure: control plane (API Server, Controller, internal databases) and compute plane; benchmark (single node), marked B.]
FaaS Platform Cache using Docker containers
Sequential invocation requests to an Apache OpenWhisk compute node
f(): x = 1
f(), g(), h(), i(): x = 4
Report the average start time
[Figure: Start Time (ms), 0 to 1400, against No. of functions (1, 4, 16, 64, 256, 1024, 4096); annotations mark the cache limit and the onset of cache thrashing. Setup: Linux v4.15; Xeon 2.20 GHz; 88GB.]
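The thrashing region in this plot matches what a least-recently-used cache does under a repeating sequential request pattern. The following sketch assumes an LRU eviction policy and a strict round-robin sequence over the functions; the slide does not state the platform's actual eviction policy, and `hit_rate` and its parameters are hypothetical names.

```python
from collections import OrderedDict


def hit_rate(num_functions: int, cache_limit: int, rounds: int = 10) -> float:
    """Fraction of invocations that find their environment in an LRU cache.

    Requests cycle through functions 0..num_functions-1, `rounds` times.
    """
    cache = OrderedDict()  # function id -> cached environment (placeholder)
    hits = 0
    total = 0
    for _ in range(rounds):
        for fn in range(num_functions):
            total += 1
            if fn in cache:
                hits += 1
                cache.move_to_end(fn)  # mark as most recently used
            else:
                if len(cache) >= cache_limit:
                    cache.popitem(last=False)  # evict the least recently used
                cache[fn] = True
    return hits / total


fits = hit_rate(num_functions=4, cache_limit=4)       # working set fits
thrashes = hit_rate(num_functions=5, cache_limit=4)   # one function too many
print(fits, thrashes)
```

Under these assumptions, the hit rate falls from near one to zero as soon as the number of functions exceeds the cache limit by one, because each miss evicts exactly the environment that will be requested next. Higher cache density moves that cliff to the right.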
FaaS Platform Cache using Docker containers vs. unikernel snapshots
Sequential invocation requests to an Apache OpenWhisk compute node
f(): x = 1
f(), g(), h(), i(): x = 4
Report the average start time
[Figure: Start Time (ms), 0 to 1400, against No. of functions (1, 4, 16, 64, 256, 1024, 4096).]
FaaS Platform Throughput
• 64 concurrent requests
• NOP (‘hello world’) invocations
[Figure: Invocation Goodput (Req/s) against No. of functions (log scale).]
Resiliency to Traffic Bursts
[Figure: two panels, 32-second Burst Intervals and 16-second Burst Intervals.]
blue/purple: Blocking IO requests to an external HTTP host (~250 ms)
red: 128 concurrent CPU-bound functions (~150 ms)
Final Thoughts
• Unikernel snapshots promote context reuse in a safe, simple, and effective way
• Prototype demonstrates a major advantage for serverless application models
• In the end, high-performance cloud computing will continue to challenge our infrastructure software in new ways
• It will be the operating system (design, mechanisms & techniques) that will address this challenge and enable new workloads
[Figure: unikernels F1, F2A, F2B, … deployed from read-only snapshots snap0, snap1, snap2 on the lightweight kernel (SEUSS OS).]