HEROKU: THE PATH TO HIGH SCALE AND AVAILABILITY fabio@heroku.com


  1. HEROKU: THE PATH TO HIGH SCALE AND AVAILABILITY fabio@heroku.com

  2. QCONSP 2014 FABIO KUNG Tech Lead , Runtime Systems at Heroku

  3. heroku scale web=3 worker=2

  4. alta-escala-disponibilidade.herokuapp.com

  5. millions of (web) applications

  6. one of the largest Linux container (LXC) deployments in the world

  7. > 60k requests per second

  8. > 5G requests per day
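
The two traffic figures on slides 7 and 8 agree with each other. A minimal sketch of the arithmetic, assuming the per-second rate is sustained for a full day (the function name is illustrative, not from the talk):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def requests_per_day(requests_per_second: int) -> int:
    """Daily request volume implied by a sustained per-second rate."""
    return requests_per_second * SECONDS_PER_DAY


daily = requests_per_day(60_000)
print(f"{daily:,} requests/day")  # 5,184,000,000, i.e. a little over 5G
```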

  9. 12FACTOR.NET portable modern platforms (cloud) elasticity

  10. two regions in production: us-east and eu-west

  11. multiple Availability Zones

  12. 2008

  13. 2009

  14. 2010

  15. GROWTH facebook/heroku

  16. 2011

  17. 2013

  18. VIBE

  19. starving-samurai-42.herokuapp.com

  20. https://www.flickr.com/photos/timriley/9361949580

  21. hacker culture flickr/dominicotine

  22. TEAMS AND COMPONENTS

  23. TOTAL OWNERSHIP Dependencies? Autonomy Polyglot Full stack

  24. CORE -> MICROSERVICES no free lunch

  25. IMPLICIT INTERFACES poor, informal documentation manifest-driven APIs evolution, updates, coordinated releases

  26. DISTRIBUTED SYSTEMS retry circuit breaker rate limiting rollback (distributed transactions) state replication cache ...
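
Slide 26 only names the patterns. As an illustration of two of them, here is a minimal sketch of a circuit breaker combined with retry and exponential backoff. All class names, function names, and thresholds are hypothetical; the talk does not describe how Heroku implements these.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being attempted."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def retry(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # retrying against an open circuit would defeat the breaker
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A caller would typically wrap the remote call in both, for example `retry(lambda: breaker.call(fetch))`, where `fetch` stands for whatever function talks to the downstream service. The breaker stops a failing dependency from being hammered, while the retry absorbs short transient errors.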

  27. HEROKU SCALE WEB=3 WORKER=5

  28. HEROKU SCALE WEB=3 WORKER=5

  29. TROUBLESHOOTING asynchronicity distributed tracing visibility!

  30. TESTS

  31. DEPLOYS

  32. DUPLICATION!

  33. EPHEMERALIZATION Do more with less.

  34. DOGFOODING

  35. TOOLS TEAM

  36. DEVCLOUDS boot your own Heroku @merman boot my cloud

  37. KERNEL PLATFORM

  38. DIREWOLF

  39. POSTGRESQL counterexample: RabbitMQ

  40. ORG ACCOUNTS

  41. MULTIPLE TECHNOLOGIES guidelines service toolkits polyglot product

  42. #OPSLIFE

  43. weekly on-call shifts

  44. ESCALATION PATH whole team in the rotation team manager Incident Commander

  45. TRANSPARENCY status.heroku.com

  46. csquared's Heroku Outage Lights System

  47. OPS TEAM Total ownership?

  48. SRE SITE RELIABILITY ENGINEERS overall reliability capacity planning reviews incident retrospectives tools, dashboards on-call burden

  49. CHANGES update existing instances vs. replace with new instances

  50. RISK AVERSION simple one-line changes -> catastrophe

  51. RIGOR

  52. "Hackers write Too Much Software. Need to change Process. Heroes mask Too Many Problems. Need to change Teamwork." -- Noah, Engineering Manager

  53. CODE REVIEW async, remote members

  54. DOCUMENTATION

  55. DIAGRAMS

  56. DESIGN

  57. BLOG DRIVEN DEVELOPMENT

  58. CFP big, difficult decisions

  59. CHECKLISTS

  60. Example: production checklist ✓ Has ops docs with executable instructions ✓ Has a high-fidelity staging setup with production parity ✓ Requested audit from the security team ✓ Alerts a human if it is down ✓ Simulated failures ✓ Uses structured logging ✓ Enforces SSL access ✓ Creds and rotation procedures are documented ✓ Send a launch email to engineering@ ✓ Move to Production on the Engineering Lifecycle board
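
The checklist item "Uses structured logging" can be illustrated with a minimal sketch using Python's standard `logging` module. The JSON layout, the `fields` convention, and the logger name are assumptions made for this example and are not taken from the talk.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # `fields` is a hypothetical convention: extra key/value context
        # passed through logging's standard `extra=` argument.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def make_logger(name, stream=None):
    """Build a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger


log = make_logger("checkout")
log.info("request finished", extra={"fields": {"status": 200, "path": "/cart"}})
```

Because every line is a self-describing JSON object, a log aggregator can filter on keys such as `status` without parsing free-form text.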

  61. SUPPORT embedded

  62. BUS FACTOR Total ownership?

  63. BENEVOLENT DICTATORSHIP BDFL

  64. COOPETITION COOPERATIVE COMPETITION

  65. LXC e.g.: DotCloud, container-rfc "We lost the standards game for virtual machine images, but it feels like this community is tight-knit enough we might be able to do something for Linux Containers." -- Alex Polvi (coreos.com)

  66. GIT $ git push heroku master Counting objects: 1, done. Writing objects: 100% (1/1), 181 bytes | 0 bytes/s, done. Total 1 (delta 0), reused 0 (delta 0) -----> Ruby app detected -----> Compiling Ruby ... To git@heroku.com:myapp.git 91dfe0b..f251ba7 master -> master e.g.: GitHub

  67. 2012

  68. CLOUD e.g.: AWS, AppEngine

  69. PEOPLE "no jerks" policy

  70. CORE -> INDEPENDENT TEAMS

  71. TOTAL OWNERSHIP

  72. FOCUS? SRE product - heroes + coordination

  73. HEROKU IN EUROPE Hurricane Sandy (2012) -> us-east -> us-west

  74. MANAGEMENT

  75. mdz's Scaling Human Systems

  76. SLACK always too busy

  77. WHAT CHANGED? values (Adam Wiggins)

  78. EPHEMERALIZATION Do more with less.

  79. MAKE IT REAL Ideas are cheap.

  80. SHIP IT Nothing is real until it's being used by a real user.

  81. DO IT WITH STYLE Aesthetic matters.

  82. INTUITION-DRIVEN | DATA-DRIVEN Users don't really know what they want.

  83. ... PROVE IT WITH DATA bikeshed@heroku.com

  84. DIVIDE AND CONQUER If it's hard, cut scope.

  85. TIMING MATTERS Maybe now isn't the right time.

  86. THROW THINGS AWAY Never be afraid to throw something away and do it again.

  87. https://www.flickr.com/photos/teich/9427507382/

  88. SMALL SHARP TOOLS Composability. The Art of Unix Programming. Also teams. Several small, autonomous, focused teams.

  89. PUT IT IN THE CLOUD Services, not software.
