Reactive Microservices Using the Netflix Stack on AWS
Diego Pacheco, Principal Software Architect at ilegra.com, @diego_pacheco
www.ilegra.com
NetflixOSS Stack
Why Netflix?
Netflix: Billions of Requests Per Day; 1/3 of US internet bandwidth; ~10k EC2 Instances; Multi-Region; 100s of Microservices; Innovation + Solid Service; SOA, Microservices and DevOps Benchmark
My Problem: Social Product; Social Network; Video; Docs; Apps; Chat; Scalability; Distributed Teams; Could reach some Web Scale
AWS
Cloud Native
Principles: Stateless Services; Ephemeral Instances; Everything fails all the time; Auto Scaling / Down Scaling; Multi AZ and multi Region; No SPOF; Design for Failure (expected); SOA; Microservices; No Central Database; NoSQL; Lightweight Serializable Objects; Latency tolerant protocols; DevOps Enabler; Immutable Infrastructure; Anti-Fragility
Right Set of Assumptions
Microservices
Reactive
Java Drivers vs. REST
Simple View of the Architecture: UI -> Zuul -> Microservice -> Cassandra Cluster
Stack
OSS
Zuul
Karyon: Microbiology - Nucleus
RxNetty: Reactive Extensions + Netty Server; Lower Latency under Heavy Load; Fewer Locks, Fewer Thread Migrations; Consumes Less CPU; Lower Object Allocation Rate
Karyon: CODE
Karyon: Reactive
Eureka and Service Discovery http://microservices.io/patterns/server-side-discovery.html
Eureka: AWS Service Registry for Mid-tier Load Balancing and Failover; REST based; Karyon and Ribbon Integration
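The registry idea on this slide can be sketched in plain Java. This is not Eureka's API: `InMemoryRegistry`, `heartbeat`, `lookup`, and the lease timeout are hypothetical names and parameters chosen for illustration. The sketch only shows that instances register, renew their lease with heartbeats, and drop out of lookups once heartbeats stop.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory service registry; illustrates lease-based discovery only.
final class InMemoryRegistry {
    private final long leaseMillis;
    // service name -> (instance address -> timestamp of last heartbeat, in ms)
    private final Map<String, Map<String, Long>> leases = new ConcurrentHashMap<>();

    InMemoryRegistry(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // Registering and renewing are the same operation: record the latest heartbeat.
    void heartbeat(String service, String address, long nowMillis) {
        leases.computeIfAbsent(service, s -> new ConcurrentHashMap<>()).put(address, nowMillis);
    }

    // Returns, in sorted order, only the instances whose lease is still valid at nowMillis.
    List<String> lookup(String service, long nowMillis) {
        List<String> alive = new ArrayList<>();
        Map<String, Long> instances = leases.getOrDefault(service, Map.of());
        for (Map.Entry<String, Long> entry : instances.entrySet()) {
            if (nowMillis - entry.getValue() <= leaseMillis) {
                alive.add(entry.getKey());
            }
        }
        Collections.sort(alive);
        return alive;
    }
}
```

A client-side load balancer would call `lookup` periodically and rotate over the returned addresses, which is why failover follows naturally from expired leases.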
Eureka
Eureka and Service Discovery
Availability
Hystrix
Ribbon: IPC Library; Client Side Load Balancing; Multi-Protocol (HTTP, TCP, UDP); Caching*; Batching; Reactive
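Client-side load balancing means the caller, not a central proxy, picks the target server. A minimal round-robin sketch follows. It does not use Ribbon's API: `RoundRobinBalancer` and `next` are hypothetical names, and the fixed server list is an assumption (in practice it would come from service discovery).

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical client-side balancer; the server list would normally come from discovery.
final class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = List.copyOf(servers); // defensive, immutable copy
    }

    // Picks the next server in rotation; floorMod keeps the index valid after int overflow.
    String next() {
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```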
Ribbon CODE
RxJava: Reactive Extensions for the JVM; Async/Event-based programming; Observer Pattern; Less than 1 MB; Heavily used by the Netflix OSS Stack
Archaius: Configuration Management Solution; Dynamic and Typed Properties; High Throughput and Thread Safety; Callbacks: Notifications of config changes; JMX Beans; Dynamic Config Sources: File, DB, DynamoDB, ZooKeeper; Based on Apache Commons Configuration
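The "dynamic and typed properties" and "callbacks" bullets can be illustrated with a small plain-Java sketch. It is not the Archaius API: `DynamicValue`, `onChange`, and `update` are hypothetical names, and the sketch assumes some external poller calls `update` when the config source changes.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical typed property holder; not the Archaius API.
final class DynamicValue<T> {
    private volatile T value; // volatile so readers always see the latest value
    private final List<Consumer<T>> listeners = new CopyOnWriteArrayList<>();

    DynamicValue(T initial) {
        this.value = initial;
    }

    T get() {
        return value;
    }

    // Registers a callback that fires whenever the value actually changes.
    void onChange(Consumer<T> listener) {
        listeners.add(listener);
    }

    // Called by whatever polls the config source; unchanged values are ignored.
    void update(T newValue) {
        if (Objects.equals(value, newValue)) {
            return;
        }
        value = newValue;
        for (Consumer<T> listener : listeners) {
            listener.accept(newValue);
        }
    }
}
```

Code that reads `get()` on every request picks up new values without a restart, while callbacks suit components that must rebuild state, such as a connection pool resized after a config change.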
Archaius + Git (diagram): Central Property Files in internal Git; three nodes, each with Microservice, Slave, Side Car and File System
Asgard
Deploys: Bake/Provision; Job; Create; Packer; Launch
Dynomite: Distributed Cache https://github.com/Netflix/dynomite
Dynomite: Implements the Amazon Dynamo design; Similar to Cassandra, Riak and DynamoDB; Strong Consistency (Quorum-like), No Data Loss; Pluggable; Scalable; Redis / Memcached; Multi-Clients with Dyno; Can use most Redis commands; Integrated with Eureka via Prana
Dynomite: Distributed Cache: Isolate Failure (Avoid cascading); Redundancy (No SPOF); Auto-Scaling; Fault Tolerance and Isolation; Recovery; Fallbacks and Degraded Experience; Protect Customer from failures (Don't throw Failures) -> Failures vs. Errors
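"Isolate failure" and "fallbacks and degraded experience" are commonly implemented with a circuit breaker. The sketch below is a simplified illustration, not the Hystrix API: `SimpleBreaker` and its threshold are hypothetical, and it deliberately omits the timed half-open recovery a production breaker would need.

```java
import java.util.function.Supplier;

// Hypothetical breaker, not the Hystrix API: opens after N consecutive failures.
final class SimpleBreaker {
    private final int failureThreshold;
    private int consecutiveFailures = 0;

    SimpleBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    synchronized boolean isOpen() {
        return consecutiveFailures >= failureThreshold;
    }

    // Runs the primary call unless the breaker is open; any failure yields the fallback.
    // Synchronizing the whole call keeps the sketch short but serializes callers.
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // short-circuit: do not touch the failing dependency
        }
        try {
            T result = primary.get();
            consecutiveFailures = 0; // a success closes the failure streak
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            return fallback.get(); // degrade instead of propagating the failure
        }
    }
}
```

The caller always receives a value, for example a cached or default response, so a failing dependency degrades the experience rather than cascading upward.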
Dynomite: Internals
Multi-Region Cluster (diagram): Eureka Servers; Oregon: D1 + Prana, D2 + Prana; N. California: D3 + Prana
Dynomite: CODE
Dynomite Contributions: https://github.com/Netflix/dynomite/pull/200 https://github.com/Netflix/dynomite/pull/207 https://github.com/Netflix/dynomite
Chaos Engineering
Gatling: Stress Testing Tool; Scala DSL; Runs on top of Akka; Simple to use
Chaos Arch (diagram): ELB; Eureka; Zuul (x2); Microservice N1; Microservice N2; Cassandra Cluster
Running…
Chaos Results and Learnings: Retry configuration and timeouts in Ribbon; Right class in Zuul 1.x (the default retries only on SocketException): RequestSpecificRetryHandler (Httpclient Exceptions); zuul.client.ribbon.MaxAutoRetries=1; zuul.client.ribbon.MaxAutoRetriesNextServer=1; zuul.client.ribbon.OkToRetryOnAllOperations=true; Eureka Timeouts; It Works; Everything needs to have redundancy; ASG is your friend :-); Stateless Services FTW
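To make the two retry limits concrete, here is a plain-Java sketch of the retry loop they imply. It assumes MaxAutoRetries bounds the extra attempts on the same server and MaxAutoRetriesNextServer bounds how many additional servers are tried. `RetryingCaller` is a hypothetical class for illustration, not Ribbon's or Zuul's implementation.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical retry loop; illustrates the two retry limits, not Ribbon's implementation.
final class RetryingCaller {
    private final int maxAutoRetries;           // extra attempts on the same server
    private final int maxAutoRetriesNextServer; // extra servers to try after the first

    RetryingCaller(int maxAutoRetries, int maxAutoRetriesNextServer) {
        this.maxAutoRetries = maxAutoRetries;
        this.maxAutoRetriesNextServer = maxAutoRetriesNextServer;
    }

    // Tries each candidate server up to (maxAutoRetries + 1) times before moving on.
    <T> T execute(List<String> servers, Function<String, T> request) {
        RuntimeException last = null;
        int serversToTry = Math.min(servers.size(), maxAutoRetriesNextServer + 1);
        for (int s = 0; s < serversToTry; s++) {
            for (int attempt = 0; attempt <= maxAutoRetries; attempt++) {
                try {
                    return request.apply(servers.get(s));
                } catch (RuntimeException e) {
                    last = e; // remember the failure and keep retrying
                }
            }
        }
        throw last != null ? last : new IllegalStateException("no servers available");
    }
}
```

With both limits set to 1, as on the slide, a request makes at most four attempts: two on the first server and two on the next. Retrying on all operations is only safe when the operations are idempotent.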
Kafka / Storm :: Event System Microservice Producer
Chaos Results and Learnings. Before: Data was not in Elasticsearch; Producers were losing data. After: No Data Loss; It Works. Changes: There was no logging on the Microservice :( (logging was added); Code that publishes events now runs in a try-catch; Retry config in the Kafka producer raised from 0 to 5
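The three changes listed above (logging, a try-catch around publishing, and producer retries) might look roughly like this. `retries` and `bootstrap.servers` are standard Kafka producer configuration keys; `EventSink`, `SafePublisher`, and `ProducerSettings` are hypothetical names, and the real Kafka client is left out so the sketch stays self-contained.

```java
import java.util.Properties;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the component that actually sends to Kafka.
interface EventSink {
    void send(String event) throws Exception;
}

final class ProducerSettings {
    // Builds producer properties; "retries" was 0 before the chaos test and 5 after.
    static Properties withRetries(String bootstrapServers, int retries) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("retries", Integer.toString(retries));
        return props;
    }
}

final class SafePublisher {
    private static final Logger LOG = Logger.getLogger("events");

    // Publishes inside a try-catch and logs failures instead of dropping them silently.
    static boolean publish(EventSink sink, String event) {
        try {
            sink.send(event);
            return true;
        } catch (Exception e) {
            LOG.log(Level.WARNING, "failed to publish event", e);
            return false;
        }
    }
}
```

Returning a boolean lets the caller decide whether to queue the event for a later attempt. The address passed to `withRetries` is whatever broker list the deployment uses.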
Main Challenges
Hacker Mindset
Next Steps: IPC; Spinnaker; Containers; Client-side Aggregation; DevOps 2.0 -> Remediation / Skynet
Pocs https://github.com/diegopacheco/netflixoss-pocs http://diego-pacheco.blogspot.com.br/search/label/netflix?max-results=30
Thank you!