Lecture 08: When disaster strikes and all else fails


  1. Lecture 08: When disaster strikes and all else fails
     Hands-on Unix system administration DeCal, 2012-10-22

  2. Projects
     ● groups of four people
     ● submit one form per group with proposed project ideas and SSH public keys
     ● we’ll be provisioning VMs and sending out an announcement

  3. Tools of the trade
     Topics: What’s up?, What’s hosing?, What’s in use?, Too much traffic, Too many files, Low-level “files”, Too many terminals, sudo, Other tools

  4. What’s up?
     ● uptime: how long the machine has been running continuously, and what the load average is
       ✦ 1, 5, 15 min average number of processes waiting for CPU (or IO)
     ● w, who: who’s logged in on the machine
       ✦ write: write to a logged-in user
       ✦ wall: write to all logged-in users
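The load average is easier to judge once it is divided by the number of CPUs. The following is a minimal Python sketch of that reading, not something from the lecture: `load_per_cpu`, `describe_load`, and the 1.0-per-CPU threshold are illustrative assumptions, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Normalize a load average by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def describe_load(load_average, cpu_count, busy_threshold=1.0):
    """Label a load average; the 1.0-per-CPU threshold is only a rule of thumb."""
    per_cpu = load_per_cpu(load_average, cpu_count)
    return "overloaded" if per_cpu > busy_threshold else "ok"


if __name__ == "__main__":
    # The same three numbers that uptime prints: 1, 5 and 15 minute averages.
    one, five, fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1
    for label, value in (("1 min", one), ("5 min", five), ("15 min", fifteen)):
        print(f"{label}: {value:.2f} ({describe_load(value, cpus)})")
```

A load of 8 on a 4-CPU machine would be labeled "overloaded" here, while a load of 2 on the same machine would be "ok".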

  5. What’s hosing?
     ● top, htop (Linux), ps (ps aux, ps -elf)
     ● similarly iftop for network interface bandwidth, iotop (Linux) for disk IO

  6. What’s in use?
     “The action can’t be completed... in use” (Windows)
     “The operation can’t be completed... in use” (Mac OS X)
     ● lsof for files
     ● lsof -i for network ports
     ● see also: netstat -pant, fuser
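The "in use" idea applies to network ports as well as files. As a rough Python sketch (an illustration, not the lecture's method), one can probe whether a TCP port is already taken by trying to bind to it. The helper name `port_in_use` is hypothetical, and unlike `lsof -i` this probe cannot say which process holds the port.

```python
import errno
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # some other problem, e.g. no permission for a privileged port
        return False


if __name__ == "__main__":
    # Port 22 is only an example; binding to ports below 1024 usually needs root.
    for candidate in (8080, 8000):
        state = "in use" if port_in_use(candidate) else "free"
        print(f"port {candidate}: {state}")
```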

  7. Too much traffic
     ● netcat: “pipe” over TCP/UDP
     ● wireshark, tshark, tcpdump: packet sniffer/analyzer
     ● nmap: network scanner

  8. Too many files
     ● du, df: directory, filesystem disk space usage
     ● scp (secure copy): transfer files over SSH
     ● rsync (remote sync): intelligently transfer files (often over SSH)
     ● tar (tape archiver): combine files into a tarball
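The difference between directory usage (du) and filesystem usage (df) can be sketched in Python. This is an approximation under stated assumptions: `tree_size` and `filesystem_usage` are hypothetical helper names, and `tree_size` sums file contents only, whereas du by default reports allocated disk blocks and also counts the directories themselves.

```python
import os
import shutil


def tree_size(path):
    """Sum the sizes of regular files under path, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def filesystem_usage(path):
    """Return (total, used, free) bytes for the filesystem that holds path."""
    usage = shutil.disk_usage(path)
    return usage.total, usage.used, usage.free


if __name__ == "__main__":
    here = os.getcwd()
    total, used, free = filesystem_usage(here)
    print(f"files under {here}: {tree_size(here)} bytes")
    print(f"filesystem: {used} of {total} bytes used, {free} free")
```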

  9. Low-level “files”
     ● fdisk, parted (Linux): edit partition table
     ● fsck: check filesystem for errors
     ● dd: copy block devices

  10. Too many terminals
      ● screen, tmux
      ● “metaterminal”
        ✦ access multiple terminal sessions inside a single terminal session
      ● other features: persistence (after logging off), session sharing (between users)

  11. sudo
      ● sudo: switch user do (usually used to give your command root powers)
      (via xkcd.com)

  12. Other tools
      ● ldd (shared library dependencies), truss or strace (trace system calls)
      ● md5sum: file checksum
      ● watch: execute a command repeatedly and show its output
      ● seq: print sequence of numbers
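To make the file checksum idea concrete, here is a small Python sketch that computes the same kind of MD5 hex digest that md5sum prints. The function name `md5_of_file` and the 64 KiB chunk size are illustrative choices, not part of the lecture.

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    for name in sys.argv[1:]:
        print(f"{md5_of_file(name)}  {name}")
```

Reading in chunks keeps memory use constant, so the sketch also works on files larger than RAM. Comparing digests before and after a transfer is a quick way to detect corruption.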

  13. Disasters
      Topics: Software meltdowns, Hardware meltdowns, Criminals on the loose, Escalation of problems, 2003 Northeast blackout

  14. Software meltdowns
      ● system load (uptime command) too damn high
      ● remote access (networking, firewall, SSH) broken

  15. Hardware meltdowns
      ● failed hard drives
      ● failed fans, power supplies, CPU, RAM

  16. Criminals on the loose
      ● crackers will do Bad Things
      ● compromised accounts
      ● looks can be deceiving, uncertain what to trust

  17. Escalation of problems
      ● we like to build systems on top of each other
      ● if one thing fails, it may break other things, causing other things to fail
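The escalation described above can be modeled as a walk over a dependency graph. The sketch below is purely illustrative: the `cascade` function and the component names in `DEPENDENTS` are hypothetical, and it assumes the worst case in which every failure takes down everything built on top of it.

```python
from collections import deque


def cascade(dependents, initial_failure):
    """Return every component that fails once initial_failure goes down.

    dependents maps a component to the components built directly on top of it.
    """
    failed = {initial_failure}
    queue = deque([initial_failure])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in failed:
                failed.add(child)
                queue.append(child)
    return failed


# Hypothetical stack, made up for illustration only.
DEPENDENTS = {
    "power": ["storage", "network"],
    "storage": ["database"],
    "network": ["database", "ssh"],
    "database": ["website"],
}

if __name__ == "__main__":
    for start in ("website", "storage", "power"):
        print(f"{start} fails -> {sorted(cascade(DEPENDENTS, start))}")
```

In this made-up stack a failed website stays contained, while a power failure reaches every other component.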

  18. 2003 Northeast blackout
      August 13, 2003, 9:21pm EDT (via en.wikipedia.org)

  19. 2003 Northeast blackout
      August 14, 2003, 9:03pm EDT (via en.wikipedia.org)

  20. Alleviating the pain
      Topics: Be Prepared, Power management, Out-of-band management, Redundancy, Monitoring, Security, Backups

  21. Be Prepared
      ● Boy Scout motto
      ● Murphy’s Law: “Anything that can go wrong, will go wrong.”
      ● s--- happens

  22. Power management
      ● Uninterruptible Power Supply (UPS)
      ● many UPSes can remotely power cycle servers

  23. Out-of-band management
      ● separate hardware that can be remotely accessed
      ● independent from rest of hardware, dedicated NIC
      ● can access BIOS, power cycle, provide visual display
      ● e.g., IPMI, Dell DRAC, Sun LOM

  24. Redundancy
      ● dual redundant power supplies typical
      ● RAID
      ● failover servers for high availability
      ● spare parts (hard drives!) for swapping

  25. Monitoring
      ● large-scale operations (Google, Facebook) have many failed servers at any point in time; monitoring servers reroute traffic appropriately
      ● monitor syslog
      ● SNMP traps
      ● alarm notification by email, text message
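Monitoring syslog often starts with scanning log lines for worrying keywords and raising an alarm on a match. The sketch below assumes plain-text log lines. The keyword list in `ALERT_KEYWORDS` and the function name `scan_log` are illustrative assumptions, not a recommended rule set, and real deployments would hand the matches to an email or text-message notifier.

```python
import re

# Illustrative keywords only; a real rule set depends on the services being run.
ALERT_KEYWORDS = ("error", "failed", "panic")
ALERT_PATTERN = re.compile("|".join(ALERT_KEYWORDS), re.IGNORECASE)


def scan_log(lines, pattern=ALERT_PATTERN):
    """Return (keyword, line) pairs for every line that matches the pattern."""
    alerts = []
    for line in lines:
        match = pattern.search(line)
        if match:
            alerts.append((match.group(0).lower(), line.rstrip("\n")))
    return alerts


if __name__ == "__main__":
    import sys

    # Usage: python scan.py < some-log-file
    for keyword, line in scan_log(sys.stdin):
        print(f"[{keyword}] {line}")
```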

  26. Security
      ● subscribe to OS security announcements
      ● Intrusion Detection Software (e.g., snort, bro)
      ● be wary of lax permissions
      ● limit root access
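One concrete form of lax permissions is a file that any user on the system can write to. The following Python sketch lists such files under a directory. The function name `world_writable` is hypothetical, and the check covers only the "other" write bit on regular files, not ACLs or directory permissions.

```python
import os
import stat
import sys


def world_writable(root):
    """Return a sorted list of regular files under root that anyone may write to."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            mode = os.lstat(full).st_mode  # lstat: do not follow symlinks
            if stat.S_ISREG(mode) and mode & stat.S_IWOTH:
                found.append(full)
    return sorted(found)


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in world_writable(target):
        print(path)
```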

  27. Backups
      ● user data, system configuration
      ● ideally daily, weekly, monthly rotations
      ● RAID is not a backup
      ● e.g., rsync, cron, rsnapshot
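A daily, weekly, monthly rotation boils down to deciding which old backups to keep. The Python sketch below shows one possible retention rule. The function name `backups_to_keep` and the default counts (7 daily, 4 weekly, 12 monthly) are assumptions for illustration, not values from the lecture or the defaults of any particular tool.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, daily=7, weekly=4, monthly=12):
    """Pick which backup dates a daily/weekly/monthly rotation would retain."""
    ordered = sorted((d for d in backup_dates if d <= today), reverse=True)
    keep = set(ordered[:daily])  # daily tier: the most recent backups

    # Weekly tier: the newest backup in each of the most recent ISO weeks.
    weeks = {}
    for d in ordered:
        weeks.setdefault(tuple(d.isocalendar())[:2], d)
    keep.update(list(weeks.values())[:weekly])

    # Monthly tier: the newest backup in each of the most recent months.
    months = {}
    for d in ordered:
        months.setdefault((d.year, d.month), d)
    keep.update(list(months.values())[:monthly])

    return sorted(keep)


if __name__ == "__main__":
    today = date(2012, 10, 22)
    nightly = [today - timedelta(days=i) for i in range(60)]
    for kept in backups_to_keep(nightly, today):
        print(kept.isoformat())
```

Anything not returned by the function would be a candidate for deletion, which keeps disk usage bounded while still leaving older restore points.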
