June 26, 2017 The Paved PaaS to Microservices Yunong Xiao, Principal Software Engineer, Netflix yunong@netflix.com, @yunongx, http://yunong.io
100 million customers in over 190 countries streaming 125 million hrs/day
What is a Platform as a Service (PaaS), Anyway? “Platform as a service (PaaS) … allows customers to develop, run, and manage applications without the complexity of building and maintaining the infrastructure and platform …” – Wikipedia
Our Use Case
Netflix Edge API: Script A, Script B, Script C, Script D, … Script N; Client Library A, Client Library B, Client Library C, … Client Library N; Backend Service A, Backend Service B, Backend Service C, … Backend Service N. 100 1000+
Owned by client teams. Clients: TV, iOS, Android, Windows, Browsers. Standalone Services: Discovery, Playback, Non-member, … Edge API. Backend Services: Backend Service A, Backend Service B, Backend Service C, … Backend Service N. Mostly
Know Your Customer
Goals Velocity Reliability
Our PaaS to Achieving Both 1. Standardized components 2. Preassembled platform 3. Automation and tooling
1. Standardized Components
What’s Inside a Microservice? RPC, Discovery, Registration, Runtime, OS, Configuration, Stream Processing, Metrics, Logging, Tracing, Dashboards, Alerts
Why Standards?
Mélange of RPC
Microservice Interactions µService … µService µService Client µService µService … µService µService
Failure: When not If
Mean Time to Detection (MTTD) Mean Time to Repair (MTTR) “Is it fixed yet?” – Managers everywhere
N Flavors of RPC µService … µService µService Client µService µService … µService µService
One Standard RPC µService … µService µService Client µService µService … µService µService
Benefits of Standardizing: RPC, Discovery, Registration, Runtime, OS, Configuration, Stream Processing, Metrics, Logging, Tracing, Dashboards, Alerts. Consistency, Leverage, Interoperability, Quality, Support
But I'm a Snowflake... Freedom & Responsibility Off-road Innovation Reintegrate New Vacuum Burden
2. Preassembled Platform
Assembly Required OS Discovery Runtime Configuration RPC Registration Logging Dashboards Stream Processing Metrics Tracing Alerts
Getting Out of the Blocks Docs Copy/paste Which versions? Initialization Configuration Missing components Days or weeks
Velocity Reliability Product Innovation Not a single line of business logic!
Assembly Required OS Discovery Runtime Configuration RPC Registration Metrics Logging Dashboards Stream Processing Tracing Alerts
Preassembled Platform: RPC, Discovery, Registration, Runtime, OS, Configuration, Stream Processing, Metrics, Logging, Tracing, Dashboards, Alerts
Out of the Box
Component Management
Insights
Application, system, and runtime metrics & logs
Consistent application, system, and runtime metrics & logs reduce MTTD & MTTR
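As one illustration of how a platform might emit consistent metrics for every handler, the sketch below wraps a function in a decorator that records success, error, and latency samples under one naming scheme. The `instrumented` decorator, the in-memory `METRICS` store, and the metric names are hypothetical and are not taken from the talk.

```python
import time
from collections import defaultdict
from functools import wraps
from typing import Any, Callable, DefaultDict, List

# Hypothetical in-memory stand-in for a real metrics backend.
METRICS: DefaultDict[str, List[float]] = defaultdict(list)


def instrumented(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Record success, error, and latency samples under a uniform prefix."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                result = fn(*args, **kwargs)
            except Exception:
                # Count the failure, then let the caller see the error.
                METRICS[f"{name}.errors"].append(1)
                raise
            finally:
                # Latency is recorded for successes and failures alike.
                METRICS[f"{name}.latency_s"].append(time.perf_counter() - start)
            METRICS[f"{name}.success"].append(1)
            return result

        return wrapper

    return decorator


@instrumented("example.lookup")  # hypothetical handler name
def lookup(key: str) -> str:
    return key.upper()


lookup("abc")
print(sorted(METRICS))
```

Because every handler reports the same three series, one dashboard and one alert definition can cover all of them, which is the property the slide ties to shorter detection and repair times.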
Configures and Initializes Correctly [Slide image: Boeing 747-400 normal procedures checklist]
Versions Updates Compatibility
Important Questions
What’s in and out?
Maintenance vs Convenience. Base Platform: RPC, Discovery, Registration, Metrics, Logging, Tracing, Runtime, OS, Configuration, Stream Processing, Dashboards, Alerts
Solution: Layers & Flavors Base platform Data access Rendering Backend …
How to Ensure Platform Correctness?
Test, Test, Test Unit Integration Functional Every PR Cloud
Dog Food with your own Service
Component Correctness: RPC, Discovery, Registration, Metrics, Logging, Tracing, Runtime, OS, Configuration, Stream Processing, Dashboards, Alerts. Customers. Gate Keeper: Lock down versions, Test components, Updates require PRs
How Locked Down is it?
Tradeoffs Reliability vs Flexibility Consistency Support
Stay on paved path!
Season to Taste Config overrides Startup & shutdown hooks Access to 3rd party libs Swap, disable, or configure components Raw component access
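A minimal sketch of what config overrides and startup and shutdown hooks could look like from a service owner's point of view. The `Platform` class, its method names, and the default values are hypothetical illustrations, not the API described in the talk.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical platform defaults; a real platform would ship its own.
DEFAULT_CONFIG: Dict[str, Any] = {"port": 8080, "log_level": "info"}


class Platform:
    def __init__(self, overrides: Optional[Dict[str, Any]] = None) -> None:
        # Service-level overrides win over platform defaults.
        self.config: Dict[str, Any] = {**DEFAULT_CONFIG, **(overrides or {})}
        self._startup: List[Callable[[], None]] = []
        self._shutdown: List[Callable[[], None]] = []

    def on_startup(self, hook: Callable[[], None]) -> None:
        self._startup.append(hook)

    def on_shutdown(self, hook: Callable[[], None]) -> None:
        self._shutdown.append(hook)

    def run(self, main: Callable[[Dict[str, Any]], None]) -> None:
        for hook in self._startup:
            hook()
        try:
            main(self.config)
        finally:
            # Shutdown hooks run in reverse registration order,
            # even when main raises.
            for hook in reversed(self._shutdown):
                hook()


platform = Platform(overrides={"log_level": "debug"})
events: List[str] = []
platform.on_startup(lambda: events.append("warm cache"))
platform.on_shutdown(lambda: events.append("flush logs"))
platform.run(lambda config: events.append(f"serving on {config['port']}"))
print(events)
```

The service only states what differs from the defaults, so the paved path stays the default and each deviation is explicit.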
Platform Versioning?
API Semantic Versioning 1.2.3 ^1.0.0 ~1.3.0
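The range notation on this slide can be made concrete with a small sketch. It assumes the npm-style reading of the operators, where `^` allows changes that do not modify the left-most non-zero component and `~` allows patch-level changes only. The helper names are hypothetical, and pre-release tags and build metadata are not handled.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def upper_bound(range_spec: str) -> Version:
    """Exclusive upper bound for a caret (^) or tilde (~) range."""
    operator = range_spec[0]
    major, minor, patch = parse(range_spec[1:])
    if operator == "~":
        return (major, minor + 1, 0)
    if operator == "^":
        if major > 0:
            return (major + 1, 0, 0)
        if minor > 0:
            return (0, minor + 1, 0)
        return (0, 0, patch + 1)
    raise ValueError(f"unsupported range: {range_spec}")


def satisfies(version: str, range_spec: str) -> bool:
    """True if version lies in [lower bound, upper bound) of the range."""
    lower = parse(range_spec[1:])
    return lower <= parse(version) < upper_bound(range_spec)


print(satisfies("1.2.3", "^1.0.0"))  # True: any 1.x.y at or above 1.0.0
print(satisfies("1.2.3", "~1.3.0"))  # False: ~1.3.0 admits only 1.3.x
```

Under this reading, a service that depends on `^1.0.0` picks up platform minor and patch releases automatically, while `~1.3.0` limits it to patch releases.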
Use Conventional Changelog
3. Automation and Tooling
Ship a Feature
Steps Development Deployment Operations Testing
Development
CLI for common dev experience Env bootstrap Integrate tooling & services Run local & cloud
Local Development Live reload localhost PROD Attach debugger
Testing
Testing. Preassembled Platform: RPC, Discovery, Registration, Runtime, OS, Configuration, Stream Processing, Metrics, Logging, Tracing, Dashboards, Alerts
Preassembled Platform: RPC, Discovery, Registration, Runtime, OS, Configuration, Stream Processing, Metrics, Logging, Tracing, Dashboards, Alerts. Pre-Prod
Provide First Class Mocks. Preassembled Platform: RPC, Discovery, Registration, Runtime, OS, Configuration, Stream Processing, Metrics, Logging, Tracing, Dashboards, Alerts
Mock Data Generation + RPC
Mock Ownership: Configuration, Stream Processing, Alerts, RPC, Discovery, Registration, Metrics, Logging, Tracing
Platform Testing API • Just like a runtime API, need a testing API • Provide mocks interface for components • Gets platform out of the loop for providing mocks
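The idea of a testing API that mirrors the runtime API can be sketched as follows. The `Platform` registry, the `mock` context manager, and the `RpcClient` and `MockRpcClient` classes are hypothetical names used only for illustration. The assumption is that each component owner ships its own mock and the platform only provides the hook for swapping it in.

```python
from contextlib import contextmanager
from typing import Any, Dict, Iterator


class RpcClient:
    """Stand-in for a real component that would call the network."""

    def call(self, method: str) -> str:
        raise RuntimeError("network access is not available in tests")


class MockRpcClient:
    """Mock shipped by the component owner, returning canned responses."""

    def __init__(self, responses: Dict[str, str]) -> None:
        self._responses = responses

    def call(self, method: str) -> str:
        return self._responses[method]


class Platform:
    def __init__(self) -> None:
        self._components: Dict[str, Any] = {}

    def register(self, name: str, component: Any) -> None:
        self._components[name] = component

    def get(self, name: str) -> Any:
        return self._components[name]

    @contextmanager
    def mock(self, name: str, fake: Any) -> Iterator[Any]:
        """Swap a registered component for a mock, restoring it afterwards."""
        real = self._components[name]  # KeyError if the name is unknown
        self._components[name] = fake
        try:
            yield fake
        finally:
            self._components[name] = real


platform = Platform()
platform.register("rpc", RpcClient())

with platform.mock("rpc", MockRpcClient({"getTitle": "canned title"})):
    print(platform.get("rpc").call("getTitle"))
```

Since the mock travels with the component, the platform team does not have to write or maintain mocks for components it does not own.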
Deployment
“Production is war!”
Experience Differences
Deploy and Manage Services • Pre-configured pipelines for deployment and rollback • Single command deploy to any stack • Integration for automated canary analysis • Pre-configured autoscaling
Operations
Consolidated View
Generated Dashboards & Alerts RPS CPU Latency Memory Error rates
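A rough sketch of generating dashboards and alerts from a service name, using the five signals named on the slide. The function names, the dictionary layout, and the query strings are hypothetical. Thresholds are supplied by the caller because the talk gives no values.

```python
from typing import Dict, List

# The five signals listed on the slide, as hypothetical metric keys.
STANDARD_METRICS = ["rps", "cpu", "latency", "memory", "error_rate"]


def generate_dashboard(service: str) -> Dict[str, object]:
    """Build one panel per standard metric for the given service."""
    return {
        "title": f"{service} overview",
        "panels": [
            {"title": metric, "query": f"{service}.{metric}"}
            for metric in STANDARD_METRICS
        ],
    }


def generate_alerts(service: str, thresholds: Dict[str, float]) -> List[Dict[str, object]]:
    """Build one alert per metric that has a caller-supplied threshold."""
    unknown = set(thresholds) - set(STANDARD_METRICS)
    if unknown:
        raise ValueError(f"unknown metrics: {sorted(unknown)}")
    return [
        {
            "name": f"{service}.{metric}.high",
            "query": f"{service}.{metric}",
            "threshold": thresholds[metric],
        }
        for metric in STANDARD_METRICS
        if metric in thresholds
    ]


dashboard = generate_dashboard("example-service")
# Placeholder threshold chosen only to exercise the function.
alerts = generate_alerts("example-service", {"error_rate": 0.05})
print(len(dashboard["panels"]), [alert["name"] for alert in alerts])
```

Generation of this kind depends on every service emitting the same metric names, which is what the standardized components provide.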
Automated Analytics & Tooling CPU profiling Core dump analysis
Our PaaS to Velocity & Reliability 1. Standardized components 2. Preassembled platform 3. Automation and tooling