
Where’s the fire? AKA: My site is down … now what? Kristen Pol


  1. Where’s the fire? AKA: My site is down … now what? Kristen Pol answers@hook42.com

  2. My name is Kristen. Kristen Pol, Hook 42 CTO / Architect. Drupal for 12 years! kristen@hook42.com @kristen_pol

  3. Who are you? Builder? Developer? Themer? PM? All the roles? Drupal newbie? Drupal intermediate? Drupal veteran?

  4. What are some website disasters?
      - Site down
      - Site very slow
      - Files directory deleted
      - Code deleted
      - Database deleted
      - Email not working
      - 3rd party services not working

  5. What are some causes?
      - Increased traffic: legitimate, nefarious
      - CDN/WAF
      - Hosting: router, network, file system, security breach, mail server
      - 3rd party services
      - Application: slow queries, slow crons, hit edge case, insufficient caching, security breach
      - Human error: drop database or tables, remove code or files, delete via UI, …

  6. How can you handle website disasters?
      ✓ Planning
      ✓ Monitoring
      ✓ Diagnostics
      ✓ Support
      ✓ Recovery
      ✓ Prevention

  7. Don’t panic!

  8. PLANNING

  9. What is disaster planning? “A disaster recovery plan (DRP) is a documented process or set of procedures to recover and protect a business IT infrastructure in the event of a disaster.”

  10. Create a process that works for you & your “client”. Example:
      ✓ Check other websites
      ✓ Check status pages
      ✓ Run traceroute
      ✓ Email urgent@example.com
      ✓ Check urgent coverage calendar
      ✓ Ping developer(s) via chat, text, phone
      ✓ Open internal support ticket
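
The first steps of the example process on slide 10 can be partly scripted. The following is a minimal sketch, not something shown in the talk: it assumes that comparing your site against a few unrelated reference sites is a useful first check, and the function names (`is_reachable`, `triage`) are hypothetical.

```python
from typing import Callable, Iterable
from urllib.error import URLError
from urllib.request import Request, urlopen


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        request = Request(url, headers={"User-Agent": "triage-sketch"})
        with urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (URLError, OSError, ValueError):
        # HTTP errors, DNS failures, timeouts, and malformed URLs all count as down.
        return False


def triage(site: str, references: Iterable[str],
           check: Callable[[str], bool] = is_reachable) -> str:
    """Classify an outage by comparing the site with unrelated reference sites."""
    if check(site):
        return "site reachable from here: check alerts, logs, and APM"
    if not any(check(ref) for ref in references):
        return "nothing reachable: suspect your own network before escalating"
    return "only the site is unreachable: check status pages, run traceroute, escalate"
```

The `check` parameter is injectable so the decision logic can be exercised without touching the network.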

  11. Make sure to document and train devs how to…
      ✓ Access all the services
      ✓ Diagnose issues
      ✓ Open support tickets
      ✓ Deploy a hotfix
      ✓ Access backups
      ✓ Recover site from backups
      ✓ Log urgent issues

  12. MONITORING

  13. What is website monitoring? “Website monitoring is the process of testing and verifying that end-users can interact with a website or web application as expected.”

  14. Here are a few popular monitoring tools.

  15. You can configure checks.

  16. You can track uptime.

  17. You can get alerts!
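
Monitoring tools handle checks, uptime, and alerts for you, but the underlying logic is simple. As a rough sketch (the `CheckResult` type, the function names, and the "alert after N consecutive failures" rule are assumptions for illustration, not how any particular tool works):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CheckResult:
    timestamp: float  # seconds since the epoch
    ok: bool          # True if the check passed


def uptime_percent(results: List[CheckResult]) -> float:
    """Share of passed checks as a percentage (100.0 when there is no data)."""
    if not results:
        return 100.0
    passed = sum(1 for r in results if r.ok)
    return 100.0 * passed / len(results)


def should_alert(results: List[CheckResult], threshold: int = 3) -> bool:
    """Alert only when the most recent `threshold` checks all failed."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    if len(results) < threshold:
        return False
    return all(not r.ok for r in results[-threshold:])
```

Requiring several consecutive failures before alerting is one common way to avoid waking someone up for a single dropped request.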

  18. DIAGNOSTICS

  19. What is diagnostics? “Software diagnostics refers to concepts, techniques, and tools that allow for obtaining findings, conclusions, and evaluations about software systems.”

  20. Here are some diagnostic tools.
      - Traceroutes
      - Status pages
      - Logs
      - Application Performance Management (APM) software
      - Drupal modules

  21. Traceroute shows round-trip times between you and destination server. Source: http://www.maxcdn.com/one/assets/post-images/trace.png

  22. Here’s an example of a bad traceroute.
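
To make "bad traceroute" concrete, here is a hedged sketch that parses Unix-style `traceroute` text output and flags hops with timed-out probes (`*`) or a slow worst-case round trip. The 200 ms threshold and the names `Hop`, `parse_traceroute`, and `suspicious_hops` are assumptions for illustration; output from other tools may be formatted differently.

```python
import re
from typing import List, NamedTuple, Optional


class Hop(NamedTuple):
    number: int
    worst_ms: Optional[float]  # None when every probe timed out
    timeouts: int              # number of "*" probes on this hop


HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
RTT = re.compile(r"(\d+(?:\.\d+)?)\s*ms")


def parse_traceroute(output: str) -> List[Hop]:
    """Extract one Hop per numbered line; the header and other lines are skipped."""
    hops = []
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue
        rest = match.group(2)
        times = [float(t) for t in RTT.findall(rest)]
        worst = max(times) if times else None
        hops.append(Hop(int(match.group(1)), worst, rest.split().count("*")))
    return hops


def suspicious_hops(hops: List[Hop], slow_ms: float = 200.0) -> List[int]:
    """Hop numbers with at least one timeout or a worst round trip above slow_ms."""
    return [h.number for h in hops
            if h.timeouts > 0 or (h.worst_ms is not None and h.worst_ms > slow_ms)]
```

Treat flagged hops as hints rather than proof: some routers simply do not answer traceroute probes, so a `* * *` in the middle of an otherwise healthy path is not necessarily a fault.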

  23. Check service status pages. Figure out which ones your site uses!
      - Hosting: Acquia, Pantheon, Platform.sh, Blackmesh, Rackspace, …
      - CDN/WAF: CloudFlare, CloudFront, EdgeCast, Fastly, MaxCDN, …
      - Mail Services: MailGun, Mandrill, SendGrid, …
      - Others: Analytics, Marketing Automation, …

  24. Check service status pages. Many look similar. Some are location-based.

  25. Check the server logs. Server logs are hosting dependent.
      - Acquia: error.log, php-errors.log, drupal-watchdog.log
      - Pantheon: nginx-error.log, php-error.log

  26. Check the Drupal logs. Drupal logs depend on site configuration.
      - Database Logging module (core)
      - File Logger module
      - Logging and Alerts module
      - Off-site logging via RabbitMQ Logs, Monolog, Logstash, etc.
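
When a log is large, a quick summary helps decide where to look first. This sketch assumes lines in the usual PHP error log shape (`[timestamp] PHP Severity:  message`); the exact format depends on your host and PHP configuration, and `summarize_php_errors` is a hypothetical helper name.

```python
import re
from collections import Counter
from typing import Iterable

# Matches e.g. "[12-May-2016 09:15:02 UTC] PHP Warning:  Division by zero in ..."
PHP_ERROR = re.compile(
    r"^\[[^\]]+\]\s+PHP\s+(Fatal error|Parse error|Warning|Notice|Deprecated):\s*(.*)$"
)


def summarize_php_errors(lines: Iterable[str], top: int = 5):
    """Count entries per severity and return the `top` most frequent messages."""
    severities = Counter()
    messages = Counter()
    for line in lines:
        match = PHP_ERROR.match(line.rstrip("\n"))
        if not match:
            continue  # stack-trace continuation or an unrelated line
        severity, message = match.groups()
        severities[severity] += 1
        messages[(severity, message)] += 1
    return severities, messages.most_common(top)
```

A typical call would pass an open file handle, for example `summarize_php_errors(open("php-error.log"))`, with the path adjusted to wherever your host writes the log.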

  27. Here are a few APM tools.

  28. You can analyze the app.

  29. You can analyze the db.

  30. You can analyze the db.

  31. You can analyze the db.

  32. And drill down into code.

  33. And drill down into queries.

  34. Drupal modules to help diagnose issues.
      ✓ Blame
      ✓ Hacked
      ✓ Security Review
      ✓ Logging and Alerts (emaillog)

  35. SUPPORT

  36. What is tech support? “Technical support refers to a plethora of services by which enterprises provide assistance to users of technology products such as mobile phones, televisions, computers, software products or other electronic or mechanical goods.”

  37. Opening a support ticket.
      ✓ First try to make sure it’s not the Drupal site that is the problem
      ✓ Determine where to open ticket(s)
      ✓ Is site down or severely impacted? Open emergency ticket!
      ✓ Be polite
      ✓ Thank them for their help

  38. Give tech support what they need.
      ✓ Detailed explanation of problem
      ✓ Level of impact
      ✓ Traceroute(s)
      ✓ Location(s) (if relevant)
      ✓ Steps to reproduce
      ✓ Diagnostic data when available
      ✓ Actions taken to remedy (if any)
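
One way to make sure nothing on the slide 38 checklist is forgotten under pressure is to keep a ticket template. The sketch below is illustrative only: the `SupportTicket` class and its field names are hypothetical and simply mirror the checklist items.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SupportTicket:
    """Fields mirror the slide 38 checklist; names are illustrative."""
    problem: str
    impact: str
    traceroutes: List[str] = field(default_factory=list)
    locations: List[str] = field(default_factory=list)
    steps_to_reproduce: List[str] = field(default_factory=list)
    diagnostics: List[str] = field(default_factory=list)
    actions_taken: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Build a plain-text ticket body, skipping sections with no content."""
        sections = [
            ("Problem", [self.problem]),
            ("Level of impact", [self.impact]),
            ("Traceroute(s)", self.traceroutes),
            ("Location(s)", self.locations),
            ("Steps to reproduce", self.steps_to_reproduce),
            ("Diagnostic data", self.diagnostics),
            ("Actions taken to remedy", self.actions_taken),
        ]
        lines: List[str] = []
        for title, items in sections:
            if not items:
                continue
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in items)
        return "\n".join(lines)
```

For example, `SupportTicket(problem="Site returns 503", impact="All visitors").render()` yields a body with only the Problem and Level of impact sections.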

  39. RECOVERY

  40. What is disaster recovery? “Disaster recovery involves a set of policies and procedures to enable the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster.”

  41. How do you recover? It depends!

  42. Is it hackers? Block IPs.
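
Before blocking anything you need to know which addresses are responsible. A minimal sketch, assuming the client IP is the first space-separated field of each access log line (as in the common and combined log formats); the `heavy_hitters` name and the default threshold are hypothetical:

```python
import ipaddress
from collections import Counter
from typing import Iterable, List, Tuple


def heavy_hitters(lines: Iterable[str], min_requests: int = 1000) -> List[Tuple[str, int]]:
    """Return (ip, count) pairs at or above min_requests, busiest first."""
    counts = Counter()
    for line in lines:
        first = line.split(" ", 1)[0]
        try:
            ipaddress.ip_address(first)  # accepts IPv4 and IPv6
        except ValueError:
            continue  # malformed line or a non-IP first field
        counts[first] += 1
    return [(ip, n) for ip, n in counts.most_common() if n >= min_requests]
```

If the site sits behind a CDN or proxy, the first field may be the proxy's address rather than the visitor's, so check how your host logs the real client IP before acting on the result.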

  43. Is it hosting, CDN, 3rd party services, or too much good traffic? Open support tickets.

  44. Is it bad code or config? Update and push hotfix.

  45. Is it completely unfixable? Recover from backups!
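
Slides 42 to 45 amount to a small decision table, which can be written down in a runbook or in code so nobody has to improvise during an incident. The enum and function names here are hypothetical; the actions are taken from the slides.

```python
from enum import Enum


class Cause(Enum):
    HACKERS = "hackers"
    UPSTREAM = "hosting, CDN, 3rd party services, or too much good traffic"
    BAD_CODE_OR_CONFIG = "bad code or config"
    UNFIXABLE = "completely unfixable"


# Cause -> first recovery action, as listed on slides 42-45.
RECOVERY_ACTION = {
    Cause.HACKERS: "Block IPs.",
    Cause.UPSTREAM: "Open support tickets.",
    Cause.BAD_CODE_OR_CONFIG: "Update and push hotfix.",
    Cause.UNFIXABLE: "Recover from backups!",
}


def recovery_action(cause: Cause) -> str:
    """Look up the first recovery step for a diagnosed cause."""
    return RECOVERY_ACTION[cause]
```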

  46. PREVENTION

  47. What is prevention? “Measures taken to detect, contain, and forestall events or circumstances which, if left unchecked, could result in a disaster.”

  48. Some prevention tips…
      ✓ Managed hosting (if possible)
      ✓ Check automated daily backups
      ✓ Use code repository
      ✓ Track and tag releases
      ✓ Dev => Test => Live
      ✓ Test & backup before updating live!
      ✓ Monitor APM trends regularly
      ✓ Monitor long-term load time trends regularly
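
"Check automated daily backups" is easy to skip, so it is worth automating. This sketch assumes backups land as files in a local directory, which will not match every host (many expose backups through a dashboard or API instead); the function names and the 24-hour limit are assumptions.

```python
import os
import time
from typing import Optional


def newest_backup_age_hours(directory: str, now: Optional[float] = None) -> Optional[float]:
    """Age in hours of the most recently modified file, or None if there is none."""
    now = time.time() if now is None else now
    with os.scandir(directory) as entries:
        mtimes = [entry.stat().st_mtime for entry in entries if entry.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def backup_is_fresh(directory: str, max_age_hours: float = 24.0,
                    now: Optional[float] = None) -> bool:
    """True when at least one backup file is no older than max_age_hours."""
    age = newest_backup_age_hours(directory, now)
    return age is not None and age <= max_age_hours
```

A fresh file only shows that the backup job ran. Restoring a backup to a test environment from time to time is the stronger check.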

  49. And more tips…
      ✓ Configure caching
      ✓ Spread out cron jobs
      ✓ Reduce number of modules
      ✓ Update core and modules regularly
      ✓ Proactively fix errors in logs
      ✓ Auto-block bad IP addresses
      ✓ Peer review code
      ✓ Limit access
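
"Spread out cron jobs" can be as simple as giving each job a different start minute so they do not all fire at the top of the hour. A small sketch, with hypothetical job names and an assumed hourly schedule:

```python
from typing import Dict, List


def stagger_minutes(jobs: List[str], period_minutes: int = 60) -> Dict[str, int]:
    """Assign each job an evenly spaced start minute within the period."""
    if not jobs:
        return {}
    step = period_minutes / len(jobs)
    return {job: int(i * step) for i, job in enumerate(jobs)}


def crontab_lines(offsets: Dict[str, int]) -> List[str]:
    """Render hourly crontab entries (minute hour day-of-month month day-of-week command)."""
    return [f"{minute} * * * * {command}" for command, minute in offsets.items()]
```

With four jobs this produces start minutes 0, 15, 30, and 45; the commands themselves are whatever your site actually runs.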

  50. Any questions?

  51. THANKS! Have more questions? Email us at: answers@hook42.com

  52. Join us for Sprints: Friday, May 13 at the Convention Center
      - First-Time Sprinter Workshop - 9am-12pm in Room 271-273
      - Mentored Core Sprint - 9am-6pm in Room 275-277
      - General Sprints - 9am-6pm in Room 278-282

  53. So How Was It? Tell Us What You Think. Evaluate this session - https://events.drupal.org/neworleans2016/sessions/wheres-fire-aka-my-site-down-now-what Thanks!
