Drive-Thru: Fast, Accurate Evaluation of Storage Power Management
Daniel Peek, Jason Flinn
University of Michigan
Power Management Needed
• Battery lifetime is limited
• Storage is an energy hog
• Power management is effective
• Many competing techniques [Helmbold 96], [Weissel 02], [Douglis 94], [Papathanasiou 03]
How can we evaluate possible power management strategies?
Design Goals
Storage power management evaluation should be:
• Fast – Want to explore many policies/parameters/workloads
• Accurate – Both time and energy predictions are important
• Portable – Easy to apply technique to other file systems/OSs
Trace Replay
[Diagram: a prerecorded trace (Write (file1), Stat (file2), Delay 5s) is fed to a trace replay tool.]
Scored against: Fast, Accurate, Portable
Trace Replay Without Idle Time
[Diagram: the same prerecorded trace (Write (file1), Stat (file2), Delay 5s) is fed to a trace replay tool.]
Scored against: Fast, Accurate, Portable
Simulation
[Diagram labels: Prerecorded Trace (Write (file1), Stat (file2), Delay 5s), Trace Replay Tool, Simulator, Estimated Time & Energy, "?"]
Scored against: Fast, Accurate, Portable
Outline
• Motivation
• Design
• Validation
• Case Study
• Related Work
• Conclusion
The Big Idea
• Time Dependent
  – Disk spindown
  – Writeback of dirty blocks in the buffer cache
• Time Independent
  – Mapping reads to disk blocks
  – Satisfying accesses from the buffer cache
Layers of Power Management
Layer          Time-Independent Activity       Time-Dependent Activity
File System    Mapping reads to disk blocks    Send modifications to network server
Buffer Cache   Satisfying accesses             Writeback of dirty blocks
Disk           Accesses                        Spindown
A Hybrid Methodology
The accuracy of trace replay without the cost
• Replay time-independent behavior without idle time
• Simulate time-dependent behavior
[Diagram: prerecorded trace (Write (file1), Stat (file2), Delay 5s) → Drive-Thru replay tool → file system log and device log → base trace → simulator → estimated time & energy.]
Base Traces
[Diagram: a file system log and a device log are merged into a base trace.]
• File System Log: Write (file1), Sync, Stat (file2), Sync, Stat (file2)
• Device Log: Write sector 8, Read sector 4
• Base Trace: Write (file1), Write sector 8, Stat (file2), Read sector 4, Stat (file2)
Merging Replay and Simulation
[Diagram: base trace records (Write File1, Device Write, Stat File2, Device Read, Delay) feed the Drive-Thru simulator, which works with an I/O simulator to produce time & energy.]
• Fast: All idle time is simulated
• Portable: Drive-Thru < 1000 lines
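To illustrate why simulating idle time is fast, here is a minimal Python sketch, not the authors' implementation. It walks a base trace and accounts for delays arithmetically instead of waiting through them. The record layout, the `DiskModel` fields, and every power and timing value are hypothetical assumptions chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class DiskModel:
    # All values below are illustrative assumptions, not measurements.
    active_w: float = 0.6            # power while servicing an access
    idle_w: float = 0.3              # power while spinning but idle
    standby_w: float = 0.05          # power while spun down
    spinup_j: float = 1.0            # energy of one spin-up
    spinup_s: float = 0.5            # duration of one spin-up
    io_s: float = 0.01               # service time of one device access
    spindown_timeout_s: float = 2.0  # idle time before the disk spins down

def simulate(base_trace, disk=None):
    """Return (elapsed_seconds, energy_joules) without sleeping through delays."""
    disk = disk or DiskModel()
    time_s = energy_j = idle_s = 0.0
    spinning = True
    for op, arg in base_trace:
        if op == "delay":
            remaining = arg
            if spinning:
                # Idle at spinning power until the spindown timeout expires.
                spin_idle = min(remaining, max(disk.spindown_timeout_s - idle_s, 0.0))
                energy_j += spin_idle * disk.idle_w
                idle_s += spin_idle
                remaining -= spin_idle
                if remaining > 0:
                    spinning = False
            # Whatever is left of the delay is spent spun down.
            energy_j += remaining * disk.standby_w
            time_s += arg
        elif op == "device":
            if not spinning:
                energy_j += disk.spinup_j
                time_s += disk.spinup_s
                spinning = True
            energy_j += disk.io_s * disk.active_w
            time_s += disk.io_s
            idle_s = 0.0
        # File-system-level records ("fs") carry no device cost in this sketch.
    return time_s, energy_j

trace = [("fs", "write file1"), ("device", "write sector 8"),
         ("delay", 5.0),
         ("fs", "stat file2"), ("device", "read sector 4")]
print(simulate(trace))
```

The 5-second delay costs one loop iteration rather than five seconds of wall-clock time, which is the property the "Fast" bullet relies on.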
Drive-Thru Operations
• Delay [diagram: Drive-Thru, Simulator]
• Coalesce [diagram: Drive-Thru, Simulator]
• Reorder [diagram: Drive-Thru, Simulator]
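The slide names these three operations without defining them. One plausible reading, stated here as an assumption rather than the authors' definition, is that a simulated writeback policy can delay device writes, coalesce repeated writes to the same sector, and reorder writes relative to other accesses. A minimal Python sketch under that reading follows; the event tuple layout and the function `apply_writeback_delay` are hypothetical.

```python
def apply_writeback_delay(events, delay_s):
    """Rewrite a device log under a simulated writeback delay.

    events: iterable of (time_s, kind, sector), kind is "read" or "write".
    Returns a new time-ordered list of events.
    """
    pending = {}  # sector -> scheduled flush time of its dirty block
    out = []
    for t, kind, sector in sorted(events):
        # Retire pending writes whose flush time has already passed.
        for s, flush_t in list(pending.items()):
            if flush_t <= t:
                out.append((flush_t, "write", s))
                del pending[s]
        if kind == "write":
            # Coalesce: a sector that is already dirty needs no second write.
            if sector not in pending:
                pending[sector] = t + delay_s  # Delay: flush later
        else:
            out.append((t, kind, sector))
    out.extend((flush_t, "write", s) for s, flush_t in pending.items())
    return sorted(out)  # Reorder: delayed writes now follow later reads

log = [(0.0, "write", 8), (1.0, "write", 8), (2.0, "read", 4), (40.0, "write", 8)]
print(apply_writeback_delay(log, 30.0))
```

In this example the two early writes to sector 8 become one write at 30 s, which now lands after the read at 2 s.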
Validation Setup
• How accurate is Drive-Thru?
  – Compare to most accurate method, replay with idle times
• Ran six 45-minute trace segments
  – iPAQ 3870
  – Ext2 file system
  – Hitachi 1 GB Microdrive
Validating with ext2
[Figure: bar chart of Energy (J) for the Purcell, Berlioz, Messiaen, and NFS15 traces, comparing Trace Replay with Idle Times against Drive-Thru.]
• Drive-Thru prediction on average within 0.21%
Validating with ext2
[Figure: bar chart of File System Energy (J) for the Purcell, Berlioz, Messiaen, and NFS15 traces, comparing Trace Replay with Idle Times against Drive-Thru.]
• Drive-Thru prediction on average within 3%
Network File System Validation
• How accurate is Drive-Thru for a network file system?
  – Compare to trace replay with idle times
  – Compare over 3 power management modes
• Blue file system [Nightingale 04]
• 802.11b card
802.11b Power Management
802.11b power management modes:
• Continuously Aware Mode (CAM)
  – High performance, high power
• Power-Saving Mode (PSM)
  – Low performance, low power
• Self-Tuning Power Management (STPM) [Anand 03]
  – Adaptively toggles between CAM and PSM
Validating Network Predictions
[Figure: bar chart of File System Energy (J) for the Purcell and NFS15 traces under CAM, PSM, and STPM, comparing Trace Replay with Idle Times against Drive-Thru.]
• Drive-Thru prediction on average within 7%
Case Study: Local Storage
Proposed power management strategies:
• Flush on write [Papathanasiou 02, Weissel 02]
  – When a dirty block is written, write back all dirty blocks
• Flush on spindown [Weissel 02]
  – Before the disk spins down, write back all dirty blocks
• Increase writeback delay [Papathanasiou 03, Weissel 02]
  – Increase opportunities for write aggregation
Ran 6 traces on ext2 with Drive-Thru (8-28 hours each)
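The three strategies can be read as rules for which dirty blocks to write at a given moment. The sketch below is a simplified Python model under that reading, not the ext2 or Drive-Thru code. The function `blocks_to_flush`, its parameters, and the dictionary representation of the buffer cache are hypothetical.

```python
def blocks_to_flush(dirty, now, writeback_delay_s,
                    flush_on_write=False,
                    flush_on_spindown=False,
                    spinning_down=False):
    """Choose which dirty blocks to write back at time `now`.

    dirty: dict mapping block id -> time (s) at which the block became dirty.
    """
    # Baseline: a block is written once it has been dirty for the full delay.
    expired = {b for b, t in dirty.items() if now - t >= writeback_delay_s}
    # Flush on spindown: empty the cache before the disk spins down.
    if flush_on_spindown and spinning_down:
        return set(dirty)
    # Flush on write: if any block must be written, write them all.
    if flush_on_write and expired:
        return set(dirty)
    return expired

cache = {1: 0.0, 2: 25.0}
print(blocks_to_flush(cache, now=30.0, writeback_delay_s=30.0))
print(blocks_to_flush(cache, now=30.0, writeback_delay_s=30.0, flush_on_write=True))
```

Raising `writeback_delay_s` keeps blocks dirty longer, so more of them accumulate before any write happens, which is the write aggregation the third strategy aims for.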
Case Study: Local Storage
[Figure: File System Energy (J) versus Writeback Delay (s), 0 to 60 s, with series Default, Flush on Write, Flush on Spindown, and Flush on Both.]
• Complete evaluation over 40,000x faster than trace replay
The 2-Second Peak
[Figure: the File System Energy (J) versus Writeback Delay (s) plot (series: Default, Flush on Write, Flush on Spindown, Flush on Both), overlaid with a timeline from 0 to 2 s across the Trace Replay Tool, Buffer Cache, and Device layers, marking Read, Write, Spinup, and Spindown events.]
Microbenchmarks: Danger!
[Figure: File System Energy (J) versus Writeback Delay (s), 0 to 60 s, with series Default, Flush on Write, Flush on Spindown, and Flush on Both.]
• Need to evaluate over representative activity
Case Study: Network File System
Proposed power management strategies:
• Flush on write
  – When any data is sent to server, flush all data to server
• Flush on PSM
  – Before network card transitions, flush all data to server
• Increase writeback delay
  – Increase opportunities for bulk transfer
Ran 6 traces on BlueFS with Drive-Thru (8-28 hours)
Case Study: Network Storage
[Figure: File System Energy (J) versus Writeback Delay (s), 0 to 60 s, with series Default, Flush on Write, Flush on PSM, and Flush on Both.]
• Complete evaluation over 13,000x faster than trace replay
Improving BlueFS
[Diagram, built up over three slides; labels: User App, BlueFS, Local Storage, Network, BlueFS Server. Annotations added in turn: "Flush on Spindown" and "Reduced Writeback Delay From 30 to 2 seconds".]
Improving BlueFS
Modified BlueFS and ran 45-minute Purcell trace
• Implemented flush on spindown for disk
  – File system energy reduced 12.4%
  – Interactive delay reduced 11.0%
• Reduced network writeback delay from 30 to 2 seconds
  – Safety improved for 1.8% file system energy cost
Related Work
• Disksim [Bucy 03] / Dempsey [Zedlewski 03]
  – Detailed disk performance and power model
• QualNet [http://www.scalablenetworks.com] / NS2 [http://www.isi.edu/nsnam/ns]
  – Detailed network performance model
• File-Cache-Content Detector [Arpaci-Dusseau 01]
Conclusion
Drive-Thru is:
• Fast: Ran ext2 evaluation 40,000x faster
• Accurate: 3% error for disk file system energy
• Portable: < 1,000 lines of code
Case Study Insights:
• Increasing writeback delay yields only a meager improvement
• Avoid setting the writeback delay equal to the disk spindown delay
• Flush on spindown is effective for disk