
Fine-Grained Fault Tolerance using Device Checkpoints, by Asim Kadav (presentation)

Asim Kadav, with Matthew Renzelmann and Michael M. Swift, University of Wisconsin-Madison. The (old) elephant in the room: device + drivers, OS (majority of kernel code), kernel, 3rd


  1. Outline: Introduction; Fine-grained isolation; Checkpoint-based recovery; Evaluation and Conclusions

  7. Unit of fault tolerance: Driver entry point
     [Diagram labels: network driver with entry points probe, xmit and config; network card; whole driver isolation contrasted with FGFT isolation]
     ★ Provide fault tolerance to specific driver entry points
     ★ Can be applied to untested code or code marked suspicious by static or runtime tools

  13. Transactional support through code generation
      [Diagram labels: netdev; get ringparam; stubs; SFI network driver; network driver; result]
      Range Table:

      | Address    | Access rights |
      |------------|---------------|
      | 0xffffa000 | Read          |
      | 0xffffa008 | Write         |
      | 0xffffa00a | Read          |

      ★ Detects and recovers from:
        ★ Memory errors like invalid pointer accesses
        ★ Structural errors like malformed structures
        ★ Processor exceptions like divide by zero, stack corruption
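
The range table on this slide can be pictured as a lookup that the SFI copy of the driver consults before touching memory. The following is a minimal Python sketch of that idea, not the FGFT implementation: the class name `RangeTable`, the range lengths, the end of the last range, and the rule that "Write" also permits reads are all assumptions made for illustration. Only the three start addresses and their access rights come from the slide.

```python
from bisect import bisect_right

READ, WRITE = "read", "write"


class RangeTable:
    """Toy range table: maps address ranges to permitted access kinds."""

    def __init__(self, entries):
        # entries: (start, end_exclusive, rights) tuples; end addresses are assumed.
        self._entries = sorted(entries, key=lambda e: e[0])
        self._starts = [start for start, _, _ in self._entries]

    def check(self, addr, access):
        """Return True if `access` to `addr` is allowed by some range."""
        i = bisect_right(self._starts, addr) - 1  # last range starting at or before addr
        if i < 0:
            return False
        start, end, rights = self._entries[i]
        return start <= addr < end and access in rights


# Start addresses and rights follow the slide; range ends are hypothetical.
table = RangeTable([
    (0xFFFFA000, 0xFFFFA008, {READ}),
    (0xFFFFA008, 0xFFFFA00A, {READ, WRITE}),  # assumption: Write implies Read
    (0xFFFFA00A, 0xFFFFA010, {READ}),
])

print(table.check(0xFFFFA008, WRITE))  # access inside the writable range
print(table.check(0xFFFFA000, WRITE))  # write to a read-only range
```

An access that fails the check corresponds to the "memory errors like invalid pointer accesses" case above: in this toy model the caller would treat a `False` result as a detected fault and trigger recovery.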

  14. Outline: Introduction; Fine-grained isolation; Checkpoint-based recovery; Conclusion

  21. Checkpointing drivers is hard
      [Diagram labels: checkpoint; network driver; network card]
      ★ Easy to capture memory state
      ★ Device state is not captured
        ★ Device configuration space
        ★ Internal device registers and counters
        ★ Memory buffer addresses used for DMA
        ★ Unique for every device
      Intuition: Operating systems already capture device state during power management

  22. Intuition with power management
      ★ Refactor power management code for device checkpoints
        ★ Correct: Developer captures unique device semantics
        ★ Fast: Avoids probe; latency is critical for applications
      ★ Ask developers to export checkpoint/restore in their drivers

  31. Device checkpoint/restore from PM code
      Power management code as it starts out:

      | Suspend             | Resume                     |
      |---------------------|----------------------------|
      | Save config state   | Restore config state       |
      | Save register state | Restore register state     |
      | Disable device      | Restore or reset DMA state |
      | Save DMA state      | Re-attach/Enable device    |
      | Suspend device      | Device Ready               |

      With the steps Disable device, Suspend device, Re-attach/Enable device and Device Ready removed:

      | Checkpoint          | Restore                    |
      |---------------------|----------------------------|
      | Save config state   | Restore config state       |
      | Save register state | Restore register state     |
      | Save DMA state      | Restore or reset DMA state |

      Suspend/resume code provides device checkpoint functionality
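
The refactoring on this slide can be sketched as follows: the save and restore steps are pulled out of suspend/resume into `checkpoint` and `restore`, and suspend/resume keep only the steps that actually power the device down or bring it back. This is a toy Python model under stated assumptions, not driver code. `ToyDevice`, `ToyDriver`, and the field names and values are hypothetical, and the ordering of the suspend steps is simplified relative to the slide.

```python
import copy


class ToyDevice:
    """Stand-in for device state that is not captured by copying driver memory."""

    def __init__(self):
        self.config = {"irq": 11}        # configuration space (hypothetical values)
        self.registers = {"ctrl": 0x1}   # internal registers and counters
        self.dma = {"rx_ring": 0x1000}   # DMA buffer addresses
        self.enabled = True


class ToyDriver:
    def __init__(self, device):
        self.device = device
        self._saved = None

    def checkpoint(self):
        # Save config, register and DMA state; the device keeps running.
        d = self.device
        self._saved = copy.deepcopy((d.config, d.registers, d.dma))

    def restore(self):
        # Restore config, register and DMA state from the last checkpoint.
        if self._saved is None:
            raise RuntimeError("no checkpoint taken")
        d = self.device
        d.config, d.registers, d.dma = copy.deepcopy(self._saved)

    def suspend(self):
        # Suspend = checkpoint plus the suspend-only steps (simplified ordering).
        self.checkpoint()
        self.device.enabled = False

    def resume(self):
        # Resume = restore plus the resume-only steps.
        self.restore()
        self.device.enabled = True
```

In this model a checkpoint can be taken before any risky operation without disabling the device, which is the property the deck relies on for fast recovery.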

  44. Synergy of isolation and fast checkpoints
      [Diagram labels: C; netdev; xmit; get ringparam; stubs; SFI network driver; network driver; R; err]
      Range Table:

      | Address    | Access rights |
      |------------|---------------|
      | 0xffffa000 | Read          |
      | 0xffffa008 | Write         |
      | 0xffffa00a | Read          |

      FGFT provides transactional execution of driver entry points

  48. How does this give us transactional execution?
      ★ Atomicity: All or nothing execution
        ★ Driver state: Run code in SFI module
        ★ Device state: Explicitly checkpoint/restore state
      ★ Isolation: Serialization to hide incomplete transactions
        ★ Re-use existing device locks to lock driver
        ★ Two phase locking
      ★ Consistency: Only valid (kernel, driver and device) states
        ★ Higher level mechanisms to rollback external actions
        ★ At most once device action guarantee to applications
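
The three properties on this slide can be illustrated with a small Python model of one entry-point call. It is a sketch under stated assumptions rather than the FGFT mechanism itself: a Python exception stands in for a fault detected by SFI or a processor exception, a deep copy stands in for the stub-copied driver state, and a single `threading.Lock` held for the whole call stands in for the re-used device lock. All names (`ToyDriver`, `run_as_transaction`, `set_ringparam`, `buggy_set_ringparam`) are hypothetical.

```python
import copy
import threading


class ToyDriver:
    def __init__(self):
        self.lock = threading.Lock()       # stands in for the existing device lock
        self.state = {"ring_size": 256}    # driver/kernel state, e.g. netdev fields
        self.device = {"ctrl": 0x1}        # device state, captured by checkpoints

    def checkpoint(self):
        return copy.deepcopy(self.device)

    def restore(self, saved):
        self.device = copy.deepcopy(saved)


def run_as_transaction(driver, entry_point, *args):
    """Run one entry point all-or-nothing; return (ok, result)."""
    with driver.lock:                           # isolation: hide incomplete work
        saved_device = driver.checkpoint()      # device half of atomicity
        shadow = copy.deepcopy(driver.state)    # driver half: work on a copy
        try:
            result = entry_point(shadow, driver.device, *args)
        except Exception:
            driver.restore(saved_device)        # roll the device back
            return False, None                  # the shadow copy is simply dropped
        driver.state = shadow                   # commit only on success
        return True, result


def set_ringparam(state, device, size):
    state["ring_size"] = size
    device["ctrl"] = 0x3
    return size


def buggy_set_ringparam(state, device, size):
    state["ring_size"] = size
    device["ctrl"] = 0xBAD
    return size // 0                            # simulated divide-by-zero fault
```

A failed call leaves both the driver state and the device state exactly as they were before the call, so the caller only sees an error return, matching the "err" outcome in the preceding diagram.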

  49. Outline: Introduction; Fine-grained isolation; Checkpoint-based recovery; Evaluation & Conclusions

  51. Evaluation platform
      ★ Criteria:
        ★ Latency of recovery: How fast is it?
        ★ Correctness of recovery: How well does it work?
        ★ Incremental effort: How much work is it?
        ★ Performance: How much does it cost?
      ★ Platform:
        ★ Implemented in Linux 2.6.29
        ★ 2.5 GHz Intel Core 2 Quad core w/ 4 GB DDR2 DRAM
        ★ Six drivers across three classes

      | Driver  | Class | Bus   |
      |---------|-------|-------|
      | 8139too | net   | PCI   |
      | e1000   | net   | PCI   |
      | r8169   | net   | PCI   |
      | pegasus | net   | USB   |
      | ens1371 | sound | PCI   |
      | psmouse | input | serio |

  55. Recovery speedup
      [Bar chart: recovery times from 0ms to 2,000ms for 8139too, e1000, pegasus, r8169, ens1371 and psmouse, comparing restart recovery with FGFT recovery. Restart recovery bar labels (ms): 1800, 1030, 680, 310, 150, 120. FGFT recovery bar labels (ms): 410, 295, 115, 5, 0.07, 0.04. The extracted text does not preserve which driver each label belongs to.]
      FGFT provides significant speedup in driver recovery and improves system availability

  58. Static and dynamic fault injection

      | Driver  | Injected Faults | Native Crashes | FGFT Crashes |
      |---------|-----------------|----------------|--------------|
      | 8139too | 43              | 43             | NONE         |
      | e1000   | 47              | 47             | NONE         |
      | r8169   | 36              | 36             | NONE         |
      | pegasus | 34              | 33             | NONE         |
      | ens1371 | 22              | 21             | NONE         |
      | psmouse | 46              | 46             | NONE         |
      | TOTAL   | 258             | 256            | NONE         |

      FGFT recovers from multiple failures: 1) restores non-class state and 2) does not affect other threads

  60. Programming effort
      Columns: Driver; LOC; Isolation annotations (Driver annotations, Kernel annotations); Recovery additions (LOC Moved, LOC Added)

      | Driver  | LOC   | Driver annotations | Kernel annotations | LOC Moved | LOC Added |
      |---------|-------|--------------------|--------------------|-----------|-----------|
      | 8139too | 1,904 | 15                 | 20                 | 26        | 4         |
      | pegasus | 1,541 | 26                 | 12                 | 22        | 5         |
      | ens1371 | 2,110 | 23                 | 66                 | 16        | 6         |
      | psmouse | 2,448 | 11                 | 19                 | 19        | 6         |

      e1000 (13,973 LOC) lists only three values: 32, 32, 10. r8169 (2,993 LOC) lists only three values: 10, 17, 5. The extracted text does not show which column is empty in these two rows.
      FGFT requires a loadable kernel module (1200 LOC) and 38 lines of kernel changes to trap processor exceptions

  67. Throughput with isolation and recovery
      [Bar chart: throughput as a percentage of the 844 Mbps baseline for the e1000 network card, measured with netperf on Intel quad-core machines. Legend: Native; FGFT-I/O-all; FGFT-off-I/O; FGFT-I/O-1/2. Bars in the order they are added: 100% (CPU 2.4%), 93% (CPU 2.4%), 100% (CPU 3.4%), 96% (CPU 2.9%).]
      FGFT can isolate and recover high bandwidth devices at low overhead without adding kernel subsystems

  69. Summary
      ★ FGFT runs driver code as transactions
        ★ Provides fault tolerance at incremental performance and programmer efforts
      ★ Introduced device checkpoints
        ★ Provides fast and complete recovery semantics
      ★ Fast device checkpoints should be explored in other domains like fast reboot, upgrade etc.

  70. Questions
      Asim Kadav
      ★ http://cs.wisc.edu/~kadav
      ★ kadav@cs.wisc.edu
      ★ Graduating in spring!

  71. Extra slides
      ★ Unlike suspend, devices continue to be accessed after a checkpoint
      ★ Rely on drivers following ACPI specifications for correctness

  72. Latency for device checkpoint/restore

      | Driver  | Class | Bus   | Checkpoint time | Restore time |
      |---------|-------|-------|-----------------|--------------|
      | 8139too | net   | PCI   | 33 μs           | 62 μs        |
      | e1000   | net   | PCI   | 280ms           | 32 μs        |
      | r8169   | net   | PCI   | 26 μs           | 30 μs        |
      | pegasus | net   | USB   | 0 μs            | 4ms          |
      | ens1371 | sound | PCI   | 111ms           | 33 μs        |
      | psmouse | input | serio | 0 μs            | 390ms        |

      Fast checkpoint/restore using suspend/resume

  73. Transforming drivers to run as FGFT
      [Diagram labels: Driver with annotations; Static modifications]
      Example driver code shown on the slide: if (c == 0) { print("Driver init"); }
