
MIDAS: An Execution-Driven Simulator for Active Storage Architectures

Shahrukh R. Tarapore (Lockheed Martin Advanced Technologies Lab); Clinton W. Smullen, IV and Sudhanva Gurumurthi (Department of Computer Science, University of Virginia)


  1. MIDAS: An Execution-Driven Simulator for Active Storage Architectures Shahrukh R. Tarapore Lockheed Martin Advanced Technologies Lab Clinton W. Smullen, IV Sudhanva Gurumurthi Department of Computer Science, University of Virginia

  2. Outline • Growth of unstructured data-processing • Embarrassingly data-parallel • Active storage architectures • Small-scale data-parallel systems • Need a simulation infrastructure 2

  3. Growth of Data 3

  4. Growth of Data • Cost of storage continuously dropping 3

  5. Growth of Data • Cost of storage continuously dropping • Growth of devices producing content 3

  6. Growth of Data • Cost of storage continuously dropping • Growth of devices producing content • Study from IDC on the “digital universe” [Figure; source: IDC] 3

  7. What is the data? 4

  8. What is the data? • Majority of data is unstructured • Images • Audio • Video • Free-form text (books, email) 4

  9. What is the data? • Majority of data is unstructured • Images • Audio • Video • Free-form text (books, email) • Need the ability to process this data 4

  10. What is the difference? • Unstructured data can be given metadata • Labor intensive (difficult to automate) • Imperfect - users want freedom to search • Unstructured search is data-intensive 5

  12. Workloads • Move data • Scan data - nearest neighbor search • Transform data - image edge detection • Developed benchmark suite • Presented at SNAPI ‘07 6

  13. Unstructured-Data Processing Opportunities • Unstructured data processing has heavy I/O • Must scan large amounts of data • Embarrassingly data parallel • Parallel operations, MapReduce, etc. • Multi/manycore increases I/O demands further 7

  14. Unstructured-Data Processing Problems • Emerging workloads are very data parallel • Still moving data from storage to CPU • Processors consume lots of power • Data movement is ‘wasted’ power • Can systems target these workloads? 8

  15. Typical Approaches 9

  16. Typical Approaches Cluster+SAN: 9

  17. Typical Approaches Cluster+SAN: GFS+MapReduce: 9

  18. Storage-centric Computing Can we use disk drive and array controller processors to execute workloads? 10

  20. Active-Storage Architectures • Move computation to disk drives and array controllers • What is the performance? • What is their power consumption? • What are the microarchitectural tradeoffs? 11

  21. MIDAS • Both in-order and out-of-order cores • Hard disk timing model • Interconnect modeling • RPC programming model [Sivathanu02] 12

  22. Programming Model • Two roles: PE requesting computation and PE performing computation • Messages, in time order: AS_COMPUTE_REQUEST, AS_DATA_READY, AS_COMPUTE_DONE 13
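
The message names on the programming-model slide suggest a request/response exchange between two PEs. The following is a minimal sketch of that ordering, not MIDAS code. The sender and receiver of each message, the `trace_request` helper, and all latency values are assumptions made for illustration; only the three message names and their order in time come from the slide.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Msg(Enum):
    # Message names taken from the slide.
    AS_COMPUTE_REQUEST = auto()
    AS_DATA_READY = auto()
    AS_COMPUTE_DONE = auto()


@dataclass
class Event:
    time: int       # simulated time units (hypothetical)
    sender: str
    receiver: str
    msg: Msg


def trace_request(requester="PE0", worker="PE1",
                  link_latency=1, disk_latency=5, compute_latency=3):
    """Return a time-ordered trace for one remote computation.

    Assumed flow: the requester asks the worker to compute, the worker
    is told when its disk data is ready, then it reports completion.
    """
    t = 0
    events = [Event(t, requester, worker, Msg.AS_COMPUTE_REQUEST)]
    t += link_latency + disk_latency       # request travels, disk is read
    events.append(Event(t, worker, worker, Msg.AS_DATA_READY))
    t += compute_latency                   # worker runs the computation
    events.append(Event(t, worker, requester, Msg.AS_COMPUTE_DONE))
    return events


if __name__ == "__main__":
    for e in trace_request():
        print(e.time, e.sender, "->", e.receiver, e.msg.name)
```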

  23. Modeling • Network connects Processing Elements • PE consists of up to one core and disk • SimpleScalar for cores • DiskSim for disks • Finite amount of local memory • Full-duplex, point-to-point network links 14
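
As a rough illustration of the modeling slide, the sketch below represents a PE as a record holding at most one core and at most one disk, joined by full-duplex point-to-point links. All class and function names are hypothetical, the star topology around the host is an assumption, and the 64 MB DPU memory is a placeholder. The host parameters (1.6 GHz, 8-wide, out-of-order, 512 MB) are the ones listed on the experimental-setup slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Core:
    freq_mhz: int
    width: int
    out_of_order: bool = False


@dataclass
class Disk:
    disk_id: int


@dataclass
class ProcessingElement:
    name: str
    core: Optional[Core] = None     # "up to one core"
    disk: Optional[Disk] = None     # "up to one ... disk"
    local_memory_mb: int = 64       # finite local memory; value is a placeholder

    def kind(self) -> str:
        if self.core and self.disk:
            return "complete"
        if self.disk:
            return "disk-only"
        if self.core:
            return "core-only"
        return "empty"


def build_system(n_dpus, dpu_freq_mhz=200, dpu_width=1):
    """Build a host plus n_dpus complete PEs (assumed star topology)."""
    host = ProcessingElement(
        "host", core=Core(1600, 8, out_of_order=True), local_memory_mb=512)
    dpus = [ProcessingElement(f"dpu{i}",
                              core=Core(dpu_freq_mhz, dpu_width),
                              disk=Disk(i))
            for i in range(n_dpus)]
    links = set()
    for d in dpus:
        # Full duplex: one directed link in each direction.
        links.add((host.name, d.name))
        links.add((d.name, host.name))
    return [host] + dpus, links
```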

  24. Space Manager • Standard UNIX syscalls for file I/O • Abstracts use of disks • Simulates FAT-like filesystem • Handles file address translation • Models sequential and random layouts • Also handles swap space 15
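
The address translation and layout bullets above can be sketched with a small FAT-style allocation table. This is an illustrative sketch under stated assumptions, not the Space Manager's implementation: the class name `FatLikeSpaceManager`, the single flat block space, and the 4096-byte block size are all hypothetical. A sequential layout hands out blocks in ascending order, while a random layout hands them out in shuffled order.

```python
import random

END = -1  # marks the last block of a file in the allocation table


class FatLikeSpaceManager:
    def __init__(self, n_blocks, block_size=4096, layout="sequential", seed=0):
        self.block_size = block_size
        order = list(range(n_blocks))
        if layout == "random":
            random.Random(seed).shuffle(order)
        self.free = order   # blocks in the order they will be allocated
        self.fat = {}       # block -> next block of the same file, or END
        self.first = {}     # file name -> first block of the file

    def create(self, name, size_bytes):
        """Allocate enough blocks for size_bytes and chain them together."""
        n = max(1, -(-size_bytes // self.block_size))  # ceiling division
        if n > len(self.free):
            raise OSError("no space left")
        blocks, self.free = self.free[:n], self.free[n:]
        for cur, nxt in zip(blocks, blocks[1:] + [END]):
            self.fat[cur] = nxt
        self.first[name] = blocks[0]

    def translate(self, name, offset):
        """Map a byte offset in a file to (block, offset within block)."""
        block = self.first[name]
        for _ in range(offset // self.block_size):
            block = self.fat[block]
            if block == END:
                raise ValueError("offset past end of file")
        return block, offset % self.block_size
```

With a sequential layout, a 10000-byte file occupies blocks 0 to 2, so `translate("a", 8197)` walks two links of the chain and lands in block 2 at offset 5.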

  25. Complete PE SimpleScalar DiskSim 16

  26. Complete PE Computation request SimpleScalar DiskSim 16

  27. Complete PE Computation request SimpleScalar Disk block request DiskSim 16

  28. Complete PE • Computation request → SimpleScalar • Disk block request → DiskSim • Latency = computation latency + disk access latency 16

  29. Disk-only SimpleScalar DiskSim 17

  30. Disk-only • Disk block request → DiskSim • Latency = disk block request + disk access latency 17
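
The two PE diagrams describe latency as a simple sum. The sketch below writes that out. The function and parameter names are hypothetical, and reading the "disk block request" term as the time needed to deliver the request to the disk is an assumption. Neither function models any overlap between computation and disk access.

```python
def complete_pe_latency(compute_latency, disk_access_latency):
    """Complete PE: core time (SimpleScalar) plus disk time (DiskSim)."""
    return compute_latency + disk_access_latency


def disk_only_latency(request_latency, disk_access_latency):
    """Disk-only PE: there is no core, so only the block request and
    the disk access contribute to the total."""
    return request_latency + disk_access_latency


if __name__ == "__main__":
    # Time units are arbitrary; the inputs are made-up illustration values.
    print(complete_pe_latency(compute_latency=300, disk_access_latency=500))
    print(disk_only_latency(request_latency=20, disk_access_latency=500))
```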

  31. Experimental Setup • Host is 1.6 GHz, 8-wide, out-of-order, with 512 MB of RAM • Vary the number of disks • Vary DPU frequency (200/300/400 MHz) • Vary data layout (sequential/random) • Vary DPU superscalar width (1/2/4 wide) 18
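
The parameter sweep described above can be enumerated as a Cartesian product. This is only a sketch of the design space, not the authors' scripts. The disk counts (2, 4, 8) are taken from the legends on the result slides rather than from this slide, and the dictionary keys are hypothetical names.

```python
from itertools import product

DISK_COUNTS = (2, 4, 8)               # from the result-slide legends
DPU_FREQS_MHZ = (200, 300, 400)
LAYOUTS = ("sequential", "random")
DPU_WIDTHS = (1, 2, 4)


def configurations():
    """Yield one dict per point in the sweep."""
    for disks, freq, layout, width in product(
            DISK_COUNTS, DPU_FREQS_MHZ, LAYOUTS, DPU_WIDTHS):
        yield {"disks": disks, "dpu_freq_mhz": freq,
               "layout": layout, "dpu_width": width}


if __name__ == "__main__":
    configs = list(configurations())
    print(len(configs), "configurations")
    print(configs[0])
```

A full sweep under these assumptions covers 3 × 3 × 2 × 3 = 54 configurations, although the slides report only a subset of them.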

  32. Image Edge Detection [Figure: normalized speedup of active storage, sequential data layout; x-axis: DPU frequency (200/300/400 MHz); y-axis: normalized speedup; series: 2, 4, and 8 disks] 19

  33. Image Edge Detection: Data Layout [Figures: normalized speedup of active storage, one panel for sequential data layout and one for random data layout; x-axis: DPU frequency (200/300/400 MHz); y-axis: normalized speedup; series: 2, 4, and 8 disks] 19

  34. Image Edge Detection 19

  35. Image Edge Detection 20

  36. Image Edge Detection: Superscalar Width [Figure: effect of processor width, 8-disk active storage, sequential data layout; x-axis: processor width (1-way/2-way/4-way); y-axis: normalized speedup; series: 200, 300, and 400 MHz] 20

  37. Active Storage Architecture Host Core Disk Drive Processing at the host 21

  38. Active Storage Architecture Host Core Array Controller Disk Drive Processing at the array controller 21

  39. Active Storage Architecture Host Core Array Controller Disk Processor Disk Drive Processing at the disk drives 21

  40. Disk Processors [Computing Frontiers ‘08] [Figures: effect of disk processor width on an 8-disk system, one panel for image edge detection and one for nearest neighbor search; x-axis: processor width (1-way/2-way/4-way); y-axis: normalized speedup; series: 200, 300, and 400 MHz] 22

  41. Conclusion • Unstructured data processing is growing • Need smaller-scale systems than Google • Shift data-parallel computation to storage • Need the ability to model them 23

  42. Questions?
