Imagine: Media Processing with Streams, by Brucek Khailany et al., and a little bit of “Evaluating the Imagine Stream Architecture,” by Jung Ho Ahn et al. Presented by Dan Amelang
Background ● Digital media processing has become pervasive ● Real-time processing requires large amounts of computation and bandwidth ● Over time, as resources increase, workloads increase to match
Stream Processing ● Good fit for media processing ● Computationally intensive ● Highly parallel and independent data ● High latency tolerance ● Little data reuse ● Simple control ● Communication and parallelism explicit
Architecture Options ● General purpose architecture – Caches optimized for latency and data reuse – Don't provide enough functional units – Large multiported register file inefficient ● ASIC – Efficient and fast – Limited use ● Stream Processor – Trade-off between programmability and efficiency
Streams and Kernels
Imagine
Programming Model ● StreamC for stream and kernel interaction ● KernelC for VLIW kernel code
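The two-level split above can be illustrated with a rough Python analogy. This is not StreamC or KernelC syntax; the function names (`scale_kernel`, `offset_kernel`, `stream_program`) and the list-based `Stream` type are hypothetical and chosen only for illustration. Kernels are small functions applied to every record of a stream, while the stream-level program decides which kernels run and how streams flow between them.

```python
from typing import List

# Assumption: a stream is modelled as a flat list of integer records.
Stream = List[int]


# "Kernel level" (the role KernelC plays): compute applied to every record.
def scale_kernel(inp: Stream, factor: int) -> Stream:
    """Multiply each record of the input stream by a constant."""
    return [x * factor for x in inp]


def offset_kernel(inp: Stream, bias: int) -> Stream:
    """Add a constant to each record of the input stream."""
    return [x + bias for x in inp]


# "Stream level" (the role StreamC plays): wires kernels together by
# passing whole streams from one kernel to the next.
def stream_program(pixels: Stream) -> Stream:
    scaled = scale_kernel(pixels, 2)   # intermediate stream
    return offset_kernel(scaled, 1)    # output stream


if __name__ == "__main__":
    print(stream_program([1, 2, 3, 4]))  # [3, 5, 7, 9]
```

The intermediate stream `scaled` is produced by one kernel and consumed by the next, which is the producer-consumer pattern the stream model makes explicit.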
Memory System ● DRAM <-> SRF (stream register file) transfers controlled by the host ● SRF <-> LRF (local register file) transfers at the request of the kernel ● LRF <-> LRF transfers statically scheduled by the compiler ● Streams are composed of 32-word blocks ● SRF transfers go through stream buffers of 2 blocks
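The block and buffer sizes above can be sketched as a toy model. Only the two constants (32-word blocks, 2-block stream buffers) come from the slide; the `StreamBuffer` class, the `transfer` loop, and the handling of a short final block are assumptions made for illustration and say nothing about how the hardware arbitrates or times transfers.

```python
from collections import deque
from typing import Deque, Iterator, List, Tuple

BLOCK_WORDS = 32    # from the slide: streams are composed of 32-word blocks
BUFFER_BLOCKS = 2   # from the slide: a stream buffer holds 2 blocks


def to_blocks(stream: List[int]) -> Iterator[List[int]]:
    """Split a stream into 32-word blocks (last block may be short: assumption)."""
    for start in range(0, len(stream), BLOCK_WORDS):
        yield stream[start:start + BLOCK_WORDS]


class StreamBuffer:
    """Hypothetical FIFO that holds at most BUFFER_BLOCKS blocks."""

    def __init__(self) -> None:
        self.blocks: Deque[List[int]] = deque()

    def can_accept(self) -> bool:
        return len(self.blocks) < BUFFER_BLOCKS

    def push(self, block: List[int]) -> None:
        if not self.can_accept():
            raise RuntimeError("stream buffer full")
        self.blocks.append(block)

    def pop(self) -> List[int]:
        return self.blocks.popleft()


def transfer(stream: List[int]) -> Tuple[List[int], int]:
    """Move a stream through one buffer; return the words and peak occupancy."""
    buf = StreamBuffer()
    pending = to_blocks(stream)
    delivered: List[int] = []
    peak = 0
    exhausted = False
    while True:
        # Producer side: refill the buffer while there is room.
        while not exhausted and buf.can_accept():
            block = next(pending, None)
            if block is None:
                exhausted = True
            else:
                buf.push(block)
        peak = max(peak, len(buf.blocks))
        if not buf.blocks:
            break
        # Consumer side: drain one block at a time.
        delivered.extend(buf.pop())
    return delivered, peak
```

Calling `transfer(list(range(100)))` in this model delivers all 100 words in order while never holding more than two blocks, so one block can be refilled while the other is drained.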
8 Arithmetic Clusters ● 6 ALUs per cluster
Simulation vs. Prototype ● 500 MHz vs. 200 MHz ● 20 GFLOPS vs. 8 GFLOPS ● Cut bandwidths in half ● Double power consumption ● Halve performance
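The clock and peak-throughput figures above are consistent with each other, as a quick calculation shows. The sketch below assumes that peak GFLOPS scales linearly with clock frequency; the per-cycle figure is derived from the slide's numbers rather than stated on it, and the variable names are illustrative.

```python
# Figures taken from the slide.
SIM_CLOCK_MHZ = 500
SIM_PEAK_GFLOPS = 20
PROTO_CLOCK_MHZ = 200

# Derived: floating-point operations issued per cycle at peak.
# GFLOPS * 1000 gives MFLOPS; dividing by MHz gives operations per cycle.
flops_per_cycle = SIM_PEAK_GFLOPS * 1000 / SIM_CLOCK_MHZ


def peak_gflops(clock_mhz: float) -> float:
    """Peak GFLOPS at a given clock, assuming linear scaling with frequency."""
    return flops_per_cycle * clock_mhz / 1000


if __name__ == "__main__":
    print(flops_per_cycle)               # 40.0
    print(peak_gflops(PROTO_CLOCK_MHZ))  # 8.0
```

At 40 operations per cycle, the 200 MHz prototype reaches the 8 GFLOPS quoted on the slide.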
Kernel Performance Breakdown
Application Performance Breakdown
"Where are they now?" ● Imagine became the basis of a new company, "Stream Processors, Inc." ● Merrimac ● StreamIt ● BrookGPU ● Last month, Bill Dally was appointed VP of Research at NVIDIA