HEROKU: THE PATH TO HIGH SCALE AND AVAILABILITY fabio@heroku.com
QCONSP 2014 FABIO KUNG Tech Lead, Runtime Systems at Heroku
heroku scale web=3 worker=2
alta-escala-disponibilidade.herokuapp.com
millions of (web) applications
one of the largest Linux container (LXC) deployments in the world
> 60k requests per second
> 5G requests per day
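As a rough consistency check between the two throughput figures above, the sketch below assumes the per-second rate is sustained over a full day. That is an assumption made only for illustration; the slides give lower bounds, not a traffic profile.

```python
# Sanity check: does a sustained 60k requests/second match ~5G requests/day?
# Assumes a constant rate over 24 hours, which real traffic will not follow exactly.
REQUESTS_PER_SECOND = 60_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

requests_per_day = REQUESTS_PER_SECOND * SECONDS_PER_DAY
print(f"{requests_per_day:,} requests/day")  # 5,184,000,000
```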
12FACTOR.NET portable modern platforms (cloud) elasticity
two regions in production: us-east and eu-west
multiple Availability Zones
2008
2009
2010
GROWTH facebook/heroku
2011
2013
VIBE
starving-samurai-42.herokuapp.com
https://www.flickr.com/photos/timriley/9361949580
hacker culture flickr/dominicotine
TEAMS AND COMPONENTS
TOTAL OWNERSHIP Dependencies? Autonomy Polyglot Full stack
CORE -> MICROSERVICES no free lunch
IMPLICIT INTERFACES poor, informal documentation manifest-driven APIs evolution, updates, coordinated releases
DISTRIBUTED SYSTEMS retry circuit breaker rate limiting rollback (distributed transactions) state replication cache ...
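Two of the patterns named on this slide, retry and circuit breaker, can be sketched in a few lines. This is a minimal illustration, not Heroku's implementation: the class and function names, the thresholds, and the injectable `clock` and `sleep` parameters are all hypothetical choices made for this example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being attempted."""


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after N consecutive failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before allowing a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # (re)open the circuit
            raise
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def retry(fn, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call fn up to `attempts` times, with exponential backoff between tries."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # do not hammer a dependency the breaker has given up on
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The two compose naturally: wrapping a call as `retry(lambda: breaker.call(fetch))`, where `fetch` stands for any remote call, retries transient errors but stops immediately once the breaker opens.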
HEROKU SCALE WEB=3 WORKER=5
TROUBLESHOOTING asynchronicity distributed tracing visibility!
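One common building block for distributed tracing is a request ID that every component logs and forwards to the services it calls. The sketch below illustrates that idea only; the header name and all function names are assumptions made for this example, not details taken from the talk.

```python
import uuid

REQUEST_ID_HEADER = "X-Request-ID"  # header name is an assumption for this sketch


def ensure_request_id(headers):
    """Return a copy of headers that reuses the incoming request ID or creates one."""
    out = dict(headers)
    if not out.get(REQUEST_ID_HEADER):
        out[REQUEST_ID_HEADER] = str(uuid.uuid4())
    return out


def outgoing_headers(incoming):
    """Headers for a downstream call: forward the same request ID unchanged."""
    return {REQUEST_ID_HEADER: incoming[REQUEST_ID_HEADER]}


def log_line(headers, component, message):
    """Format a log line tagged with the request ID so lines can be correlated."""
    return f"request_id={headers[REQUEST_ID_HEADER]} component={component} msg={message!r}"


# Usage: the edge assigns an ID once; every hop logs it and passes it on.
incoming = ensure_request_id({})
print(log_line(incoming, "router", "request received"))
print(log_line(outgoing_headers(incoming), "api", "handling request"))
```

Searching the aggregated logs for one `request_id` value then shows the path of a single request across asynchronous components.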
TESTS
DEPLOYS
DUPLICATION!
EPHEMERALIZATION Do more with less.
DOGFOODING
TOOLS TEAM
DEVCLOUDS boot your own Heroku @merman boot my cloud
KERNEL PLATFORM
DIREWOLF
POSTGRESQL counterexample: RabbitMQ
ORG ACCOUNTS
MULTIPLE TECHNOLOGIES guidelines service toolkits polyglot product
#OPSLIFE
weekly on-call shifts
ESCALATION PATH whole team in the rotation team manager Incident Commander
TRANSPARENCY status.heroku.com
csquared's Heroku Outage Lights System
OPS TEAM Total ownership?
SRE SITE RELIABILITY ENGINEERS overall reliability capacity planning reviews incident retrospectives tools, dashboards on-call burden
CHANGES update existing instances vs. replace with new instances
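The second option on this slide, replacing instances instead of updating them in place, can be sketched as a rolling replacement. This is only an illustration of the idea; `launch`, `is_healthy`, and `terminate` are hypothetical callbacks standing in for whatever the infrastructure provides, and the slide does not say which option Heroku uses where.

```python
def rolling_replace(instances, launch, is_healthy, terminate):
    """Replace each instance with a freshly launched one, one at a time.

    Stops at the first unhealthy replacement, leaving the remaining old
    instances untouched so the fleet keeps serving.
    """
    fleet = list(instances)
    for i, old in enumerate(fleet):
        new = launch()
        if not is_healthy(new):
            terminate(new)  # discard the bad replacement; `old` keeps serving
            raise RuntimeError(f"replacement for {old!r} failed its health check")
        fleet[i] = new
        terminate(old)
    return fleet
```

Because old instances are never modified, a failed rollout leaves them exactly as they were, which is the main appeal of replacing over updating in place.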
RISK AVERSION simple one-line changes -> catastrophe
RIGOR
"Hackers write Too Much Software. Need to change Process. Heroes mask Too Many Problems. Need to change Teamwork." -- Noah, Engineering Manager
CODE REVIEW async, remote members
DOCUMENTATION
DIAGRAMS
DESIGN
BLOG DRIVEN DEVELOPMENT
CFP big, hard decisions
CHECKLISTS
Example: production checklist
✓ Has ops docs with executable instructions
✓ Has a high-fidelity staging setup with production parity
✓ Requested audit from the security team
✓ Alerts a human if it is down
✓ Simulated failures
✓ Uses structured logging
✓ Enforces SSL access
✓ Creds and rotation procedures are documented
✓ Send a launch email to engineering@
✓ Move to Production on the Engineering Lifecycle board
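For the "Uses structured logging" item, one widely used convention is to emit each log line as key=value pairs so that tools can parse it without regular expressions. The helper below is a minimal sketch of that convention; the function name and the quoting rules are assumptions made for this example, not a format specified in the talk.

```python
def logfmt(**fields):
    """Render fields as space-separated key=value pairs, quoting values when needed."""
    parts = []
    for key, value in fields.items():
        text = str(value)
        if " " in text or '"' in text or "=" in text:
            # Quote values that would otherwise be ambiguous to a parser.
            text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
        parts.append(f"{key}={text}")
    return " ".join(parts)


print(logfmt(at="info", app="myapp", event="deploy", elapsed_ms=182))
print(logfmt(at="error", msg="app is down"))
```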
SUPPORT embedded
BUS FACTOR Total ownership?
BENEVOLENT DICTATORSHIP BDFL
COOPETITION COOPERATIVE COMPETITION
LXC e.g.: DotCloud, container-rfc "We lost the standards game for virtual machine images, but it feels like this community is tight knit enough we might be able to do something for Linux Containers." -- Alex Polvi (coreos.com)
GIT
$ git push heroku master
Counting objects: 1, done.
Writing objects: 100% (1/1), 181 bytes | 0 bytes/s, done.
Total 1 (delta 0), reused 0 (delta 0)
-----> Ruby app detected
-----> Compiling Ruby ...
To git@heroku.com:myapp.git
   91dfe0b..f251ba7  master -> master
e.g.: GitHub
2012
CLOUD e.g.: AWS, AppEngine
PEOPLE "no jerks" policy
CORE -> INDEPENDENT TEAMS
TOTAL OWNERSHIP
FOCUS? SRE product - heroes + coordination
HEROKU IN EUROPE Hurricane Sandy (2012) -> us-east -> us-west
MANAGEMENT
mdz's Scaling Human Systems
SLACK always too busy
WHAT CHANGED? values (Adam Wiggins)
EPHEMERALIZATION Do more with less.
MAKE IT REAL Ideas are cheap.
SHIP IT Nothing is real until it's being used by a real user.
DO IT WITH STYLE Aesthetic matters.
INTUITION-DRIVEN | DATA-DRIVEN Users don't really know what they want.
... PROVE IT WITH DATA bikeshed@heroku.com
DIVIDE AND CONQUER If it's hard, cut scope.
TIMING MATTERS Maybe now isn't the right time.
THROW THINGS AWAY Never be afraid to throw something away and do it again.
https://www.flickr.com/photos/teich/9427507382/
SMALL SHARP TOOLS Composability. The Art of Unix Programming. Also teams. Several small, autonomous, focused teams.
PUT IT IN THE CLOUD Services, not software.