MIDAS: An Execution-Driven Simulator for Active Storage Architectures Shahrukh R. Tarapore Lockheed Martin Advanced Technologies Lab Clinton W. Smullen, IV Sudhanva Gurumurthi Department of Computer Science, University of Virginia
Outline • Growth of unstructured data-processing • Embarrassingly data-parallel • Active storage architectures • Small-scale data-parallel systems • Need a simulation infrastructure 2
Growth of Data • Cost of storage continuously dropping • Growth of devices producing content • Study from IDC on “digital universe”: Source: IDC 3
What is the data? • Majority of data is unstructured • Images • Audio • Video • Free-form text (books, email) • Need the ability to process this data 4
What is the difference? • Unstructured data can be given metadata • Labor intensive (difficult to automate) • Imperfect - users want freedom to search • Unstructured search is data-intensive 5
Workloads • Move data • Scan data - nearest neighbor search • Transform data - image edge detection • Developed benchmark suite • Presented at SNAPI ‘07 6
Unstructured-Data Processing Opportunities • Unstructured data processing has heavy I/O • Must scan large amounts of data • Embarrassingly data parallel • Parallel operations, MapReduce, etc. • Multi/manycore increases I/O demands further 7
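As a rough illustration of why a scan workload such as nearest neighbor search is embarrassingly data parallel, the sketch below splits the data into partitions, finds the best candidate in each partition independently, and then reduces the candidates. The function names and the use of squared Euclidean distance are assumptions made for this example; they are not taken from the benchmark suite.

```python
def squared_distance(point, query):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(point, query))


def local_nearest(partition, query):
    """Scan one partition and return its closest point to the query."""
    return min(partition, key=lambda p: squared_distance(p, query))


def nearest_neighbor(partitions, query):
    """Each partition is scanned independently (the parallel step),
    then the per-partition winners are reduced to a single answer."""
    candidates = [local_nearest(part, query) for part in partitions if part]
    return local_nearest(candidates, query)


# Two hypothetical partitions, e.g. one per disk.
partitions = [[(0, 0), (5, 5)], [(1, 1), (9, 9)]]
best = nearest_neighbor(partitions, (1, 2))
```

Only the small per-partition winners cross partition boundaries, which is the property that makes it attractive to run the scan next to the data.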
Unstructured-Data Processing Problems • Emerging workloads are very data parallel • Still moving data from storage to CPU • Processors consume lots of power • Data movement is ‘wasted’ power • Can systems target these workloads? 8
Typical Approaches Cluster+SAN: GFS+MapReduce: 9
Storage-centric Computing Can we use disk drive and array controller processors to execute workloads? 10
Active-Storage Architectures • Move computation to disk drives and array controllers • What is the performance? • What is their power consumption? • What are the microarchitectural tradeoffs? 11
MIDAS • Both in-order and out-of-order cores • Hard disk timing model • Interconnect modeling • RPC programming model [Sivathanu02] 12
Programming Model • [Timeline diagram: a PE requesting computation and a PE performing computation exchange three messages, in time order: AS_COMPUTE_REQUEST, AS_DATA_READY, AS_COMPUTE_DONE] 13
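The message order on this slide can be sketched as a tiny trace. Only the three message names and their order come from the slide; the function `run_exchange`, the way the kernel is passed, and the reading of AS_DATA_READY as "input data is available to the performing PE" are assumptions for illustration, not the MIDAS API.

```python
from enum import Enum, auto


class Msg(Enum):
    AS_COMPUTE_REQUEST = auto()
    AS_DATA_READY = auto()
    AS_COMPUTE_DONE = auto()


# Order as it appears on the slide's timeline.
EXPECTED_ORDER = [Msg.AS_COMPUTE_REQUEST, Msg.AS_DATA_READY, Msg.AS_COMPUTE_DONE]


def run_exchange(kernel, data):
    """Hypothetical single request: return (message trace, kernel result)."""
    trace = [Msg.AS_COMPUTE_REQUEST]   # requesting PE asks for the computation
    trace.append(Msg.AS_DATA_READY)    # assumed meaning: input data is available
    result = kernel(data)              # performing PE runs the kernel
    trace.append(Msg.AS_COMPUTE_DONE)  # completion is reported to the requester
    return trace, result


trace, result = run_exchange(sum, [1, 2, 3])
```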
Modeling • Network connects Processing Elements • PE consists of up to one core and disk • SimpleScalar for cores • DiskSim for disks • Finite amount of local memory • Full-duplex, point-to-point network links 14
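One way to picture the modeled system is as a set of PE records joined by links. The sketch below is not MIDAS code: the class and field names are hypothetical, and the disk-side PE's memory size and in-order core are placeholders. The host parameters follow the Experimental Setup slide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Core:
    freq_mhz: int
    width: int
    out_of_order: bool


@dataclass
class Disk:
    name: str


@dataclass
class ProcessingElement:
    name: str
    local_memory_mb: int          # finite local memory
    core: Optional[Core] = None   # at most one core
    disk: Optional[Disk] = None   # at most one disk


@dataclass
class Network:
    links: set = field(default_factory=set)

    def connect(self, a, b):
        # A full-duplex point-to-point link is stored as two directed edges.
        self.links.add((a.name, b.name))
        self.links.add((b.name, a.name))

    def connected(self, a, b):
        return (a.name, b.name) in self.links


host = ProcessingElement("host", 512, core=Core(1600, 8, True))
# Placeholder values for the disk-side PE (memory size and core type are assumed).
dpu = ProcessingElement("disk0", 64, core=Core(400, 1, False), disk=Disk("d0"))
net = Network()
net.connect(host, dpu)
```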
Space Manager • Standard UNIX syscalls for file I/O • Abstracts use of disks • Simulates a FAT-like filesystem • Handles file address translation • Models sequential and random layouts • Also handles swap space 15
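File address translation under the two layouts can be sketched as a per-file block map. This is a minimal illustration, not the Space Manager's implementation: the 4096-byte block size, the function names, and the seeded shuffle used for the random layout are all assumptions.

```python
import random

BLOCK_SIZE = 4096  # assumed block size in bytes


def build_layout(num_file_blocks, disk_blocks, layout, seed=0):
    """Return a list mapping file block index -> disk block number."""
    if num_file_blocks > disk_blocks:
        raise ValueError("file does not fit on the disk")
    if layout == "sequential":
        # Contiguous placement starting at disk block 0.
        return list(range(num_file_blocks))
    if layout == "random":
        # Distinct disk blocks in a seeded pseudo-random order.
        rng = random.Random(seed)
        return rng.sample(range(disk_blocks), num_file_blocks)
    raise ValueError("unknown layout: " + layout)


def translate(block_map, offset):
    """Map a file byte offset to (disk block number, offset within block)."""
    index, within = divmod(offset, BLOCK_SIZE)
    return block_map[index], within


seq_map = build_layout(4, 100, "sequential")
rnd_map = build_layout(4, 100, "random")
```

With the sequential map, consecutive file blocks land on consecutive disk blocks; with the random map, the same file offsets resolve to scattered blocks, which is what makes the two layouts stress the disk differently.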
Complete PE • [Diagram: a computation request enters SimpleScalar, which sends disk block requests to DiskSim. Latency = computation latency (SimpleScalar) + disk access latency (DiskSim).] 16
Disk-only • [Diagram: with no core in use, the disk block request goes straight to DiskSim. Latency = disk access latency.] 17
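The latency composition on the Complete PE and Disk-only slides amounts to a sum with an optional term. The helper below is a sketch under that reading; the function name, the millisecond unit, and the example values are illustrative and are not MIDAS output.

```python
def pe_latency(disk_latency_ms, compute_latency_ms=None):
    """Latency reported by a PE for one request.

    Complete PE: computation latency plus disk access latency.
    Disk-only PE: disk access latency alone (no compute term).
    """
    if compute_latency_ms is None:
        return disk_latency_ms
    return compute_latency_ms + disk_latency_ms


complete = pe_latency(5.0, compute_latency_ms=2.0)  # illustrative values
disk_only = pe_latency(5.0)
```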
Experimental Setup • Host is 1.6 GHz, 8-wide, out-of-order, with 512 MB of RAM • Vary the number of disks • Vary DPU frequency (200/300/400 MHz) • Vary data layout (sequential/random) • Vary DPU superscalar width (1/2/4 wide) 18
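The varied parameters can be written out as a sweep. The disk counts (2, 4, 8) are taken from the result charts; whether every combination was actually simulated is not stated, so the full cross product below is an assumption, as are the dictionary keys.

```python
from itertools import product

DISK_COUNTS = [2, 4, 8]            # from the result charts
DPU_FREQS_MHZ = [200, 300, 400]
LAYOUTS = ["sequential", "random"]
DPU_WIDTHS = [1, 2, 4]


def sweep():
    """Yield one configuration dict per parameter combination."""
    for disks, freq, layout, width in product(
        DISK_COUNTS, DPU_FREQS_MHZ, LAYOUTS, DPU_WIDTHS
    ):
        yield {
            "disks": disks,
            "dpu_freq_mhz": freq,
            "layout": layout,
            "dpu_width": width,
        }


configs = list(sweep())
```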
Image Edge Detection: Data Layout • [Figure: normalized speedup of active storage vs. DPU frequency (200, 300, 400 MHz) for 2, 4, and 8 disks, sequential data layout; y-axis 0 to 0.8] • [Figure: same axes and series, random data layout; y-axis 0 to 3] 19
Image Edge Detection: Superscalar Width • [Figure: effect of processor width (1-way, 2-way, 4-way) on normalized speedup for 8-disk active storage, sequential data layout, at 200, 300, and 400 MHz; y-axis 0 to 2.5] 20
Active Storage Architecture Host Core Disk Drive Processing at the host 21
Active Storage Architecture Host Core Array Controller Disk Drive Processing at the array controller 21
Active Storage Architecture Host Core Array Controller Disk Processor Disk Drive Processing at the disk drives 21
Disk Processors [Computing Frontiers ‘08] • [Figure: Image Edge Detection, effect of disk processor width (1-way, 2-way, 4-way) on normalized speedup for an 8-disk system at 200, 300, and 400 MHz; y-axis 0 to 3.5] • [Figure: Nearest Neighbor Search, same axes and series; y-axis 0 to 7] 22
Conclusion • Unstructured data processing is growing • Need smaller-scale systems than Google's • Shift data-parallel computation to storage • Need the ability to model these architectures 23
Questions?