Spiral: Computer Generation of Performance Libraries

Carnegie Mellon
Spiral: Computer Generation of Performance Libraries
José M. F. Moura, Markus Püschel, Franz Franchetti & the Spiral Team
(Title diagram labels: Applications, Platforms, Performance)

What is Spiral?
• Traditionally: high-performance library optimized for a given platform
• Spiral approach: Spiral produces a high-performance library optimized for a given platform
• Comparable performance

Main Idea: Program Generation
• Model: common abstraction = spaces of matching formulas
• Diagram: an architecture space and an algorithm space (symbols ν, p, μ) are each mapped by abstraction into the common model; rewriting defines the space, search picks from it (optimization)
• Architectural parameter: vector length, #processors, …
• Kernel: problem size, algorithm choice

How Spiral Works
Complete automation of the implementation and optimization task.
Stages inside Spiral:
• Problem specification (“DFT 1024” or “DFT”)
• Algorithm Generation, Algorithm Optimization: produces the algorithm
• Implementation, Code Optimization: produces C code
• Compilation, Compiler Optimizations: produces the fast executable
• Search: uses measured performance to control the algorithm and implementation stages
Basic ideas:
• Declarative representation of algorithms
• Rewriting systems to generate and optimize algorithms at a high level of abstraction
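The interplay of rewriting and search can be illustrated with a small Python sketch. It is not Spiral's code: the tuple-based tree encoding and the names `all_trees`, `apply_tree`, and `search` are assumptions made for illustration. It uses the standard Cooley-Tukey factorization as its only rewrite rule, enumerates every breakdown tree for a given size, and picks the one that runs fastest on the current machine.

```python
import cmath
import time

def dft_direct(x):
    """DFT by definition: y[k] = sum_j x[j] * w^(j*k), w = exp(-2*pi*i/n)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def all_trees(n):
    """Enumerate breakdown trees: stop with the direct DFT, or rewrite
    DFT_n into a Cooley-Tukey node for any factorization n = k * m."""
    yield ("base", n)
    for k in range(2, n):
        if n % k == 0:
            for left in all_trees(k):
                for right in all_trees(n // k):
                    yield ("ct", left, right)

def size(tree):
    """Transform size computed by a breakdown tree."""
    return tree[1] if tree[0] == "base" else size(tree[1]) * size(tree[2])

def apply_tree(tree, x):
    """Evaluate DFT_n = (DFT_k (x) I_m) T (I_k (x) DFT_m) L on the list x."""
    if tree[0] == "base":
        return dft_direct(x)
    _, left, right = tree
    k, m = size(left), size(right)
    n = k * m
    # Stride permutation L, then I_k (x) DFT_m: one DFT_m per strided subsequence.
    blocks = [apply_tree(right, x[a::k]) for a in range(k)]
    # Diagonal twiddle matrix T: scale entry (a, b) by w_n^(a*b).
    for a in range(k):
        for b in range(m):
            blocks[a][b] *= cmath.exp(-2j * cmath.pi * a * b / n)
    # DFT_k (x) I_m: one DFT_k across the blocks for every offset b.
    y = [0j] * n
    for b in range(m):
        column = apply_tree(left, [blocks[a][b] for a in range(k)])
        for c in range(k):
            y[c * m + b] = column[c]
    return y

def search(n, trials=3):
    """Exhaustive search: time every tree and return the fastest one."""
    x = [complex(j % 7, -(j % 3)) for j in range(n)]
    best = None
    for tree in all_trees(n):
        start = time.perf_counter()
        for _ in range(trials):
            apply_tree(tree, x)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best[0]:
            best = (elapsed, tree)
    return best[1]

if __name__ == "__main__":
    print(search(16))  # the winning tree depends on the machine
```

Exhaustive enumeration only works for small sizes; it stands in here for whatever search strategy a real generator would use.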

Algorithms: Rules in Domain Specific Language
• Linear Transforms
• Viterbi Decoding: convolutional encoder and Viterbi decoder (example bit streams in the diagram: 010001; 11 10 00 01 10 01 11 00; 11 10 01 01 10 10 11 00)
• Matrix-Matrix Multiplication (diagram: = ×)
• Synthetic Aperture Radar (SAR): preprocessing, matched filtering, interpolation, 2D iFFT
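As an illustration of what a rule for linear transforms can look like, the Cooley-Tukey FFT is commonly written as a breakdown rule in matrix notation. The formulas on the slide itself are not preserved in this transcript, so the rule below is the standard textbook form, given here as an assumption about the kind of rule meant:

```latex
\mathrm{DFT}_{km} \;\to\; \left(\mathrm{DFT}_k \otimes I_m\right)\, T^{km}_m\, \left(I_k \otimes \mathrm{DFT}_m\right)\, L^{km}_k
```

Here $I_m$ is the $m \times m$ identity, $\otimes$ is the Kronecker (tensor) product, $T^{km}_m$ is a diagonal matrix of twiddle factors, and $L^{km}_k$ is the stride permutation that reads its input at stride $k$.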

One Approach for all Types of Parallelism
• Multithreading (Multicore)
• Vector SIMD (SSE, VMX/Altivec, …)
• Message Passing (Clusters, MPP)
• Streaming/multibuffering (Cell)
• Graphics Processors (GPUs)
• Gate-level parallelism (FPGA)
• HW/SW partitioning (CPU + FPGA)

Example: Code Generation for Multicore CPUs
• Hardware abstraction: shared cache with cache lines
• Tensor product: embarrassingly parallel operator (diagram: four copies of A, one each on Processor 0 to Processor 3, map x to y)
• Permutation: problematic; may produce false sharing (diagram: x to y)
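A small Python sketch can make both points concrete. It assumes the tensor product on the slide has the form I_p ⊗ A (one copy of A per processor), and the helper names `tensor_I_A` and `shared_lines` are hypothetical. The first function applies A independently to p contiguous chunks; the second counts which cache lines would be written by more than one processor under a given assignment of output indices.

```python
from concurrent.futures import ThreadPoolExecutor

def matvec(A, v):
    """Dense matrix-vector product on plain lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def tensor_I_A(A, x, p):
    """y = (I_p (x) A) x: A is applied to p contiguous chunks of x.
    The chunks share no data, so each can go to its own worker."""
    m = len(A)
    assert len(x) == p * m
    chunks = [x[i * m:(i + 1) * m] for i in range(p)]
    with ThreadPoolExecutor(max_workers=p) as pool:
        parts = list(pool.map(lambda chunk: matvec(A, chunk), chunks))
    return [value for part in parts for value in part]

def shared_lines(partition, line_len):
    """Cache lines (index // line_len) written by more than one processor.
    partition[i] lists the output indices written by processor i."""
    owner, shared = {}, set()
    for proc, indices in enumerate(partition):
        for line in {i // line_len for i in indices}:
            if line in owner and owner[line] != proc:
                shared.add(line)
            owner.setdefault(line, proc)
    return shared

if __name__ == "__main__":
    A = [[1, 1], [1, -1]]
    print(tensor_I_A(A, [1, 2, 3, 4, 5, 6, 7, 8], p=4))

    # 16 outputs, 4 processors, cache lines of 4 elements (illustrative values).
    contiguous = [range(4 * i, 4 * i + 4) for i in range(4)]  # tensor-product layout
    strided = [range(i, 16, 4) for i in range(4)]             # stride-permuted layout
    print(shared_lines(contiguous, 4))  # no line is shared
    print(shared_lines(strided, 4))     # every line is shared: false sharing risk
```

Python threads are used only to show that the chunks are independent; because of the interpreter lock they do not demonstrate a speedup.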

Spiral: Meta-Tool to Autotuning Libraries
Input (to the Spiral Library Generator):
• Transform: (formula not preserved)
• Algorithms: (formula not preserved)
• Vectorization: 2-way SSE
• Threading: Yes
Output:
• Optimized library (10,000 lines of C++)
• For general input size (not collection of fixed sizes)
• Vectorized
• Multithreaded
• With runtime adaptation mechanism
• Performance competitive with hand-written code
High-Performance Library, “FFTW-like”

Verification and Testing
• Verify algorithms symbolically: = ? (formulas not preserved)
• Check rules through verification of instances: = ? (formulas not preserved)
• Check code empirically: DFT4([0,1,0,0]) = ? and DFT4([0.1,1.77,2.28,-55.3]) = ? DFT4_rnd([0.1,1.77,2.28,-55.3])
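The empirical check can be sketched in a few lines of Python, using the two input vectors from the slide. The unrolled `dft4_unrolled` function is a hand-written stand-in for generated code, and `dft_reference` plays the role of the second implementation; both names are assumptions, as is the reading of DFT4_rnd as an independently derived DFT4.

```python
import cmath

def dft_reference(x):
    """DFT by definition, used as the trusted reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def dft4_unrolled(x):
    """Straight-line DFT of size 4 (stand-in for generated code)."""
    t0 = x[0] + x[2]
    t1 = x[0] - x[2]
    t2 = x[1] + x[3]
    t3 = x[1] - x[3]
    return [t0 + t2, t1 - 1j * t3, t0 - t2, t1 + 1j * t3]

def close(u, v, tol=1e-9):
    """Element-wise comparison up to floating-point tolerance."""
    return len(u) == len(v) and all(abs(a - b) <= tol for a, b in zip(u, v))

if __name__ == "__main__":
    # A basis vector picks out one column of the DFT matrix: 1, -i, -1, i.
    assert close(dft4_unrolled([0, 1, 0, 0]), [1, -1j, -1, 1j])
    # An arbitrary vector is compared against the reference implementation.
    v = [0.1, 1.77, 2.28, -55.3]
    assert close(dft4_unrolled(v), dft_reference(v))
    print("checks passed")
```

Feeding all n basis vectors recovers the full matrix the code computes, which is why a basis-vector test is a stronger check than a single arbitrary input.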

Range: Cell Phone To Supercomputer
• Samsung i9100 Galaxy S II: dual-core ARM at 1.2 GHz with NEON ISA; SIMD vectorization + multi-threading
• BlueGene/P at Argonne National Laboratory: 128k cores (quad-core CPUs) at 850 MHz; SIMD vectorization + multi-threading + MPI
• Plot: Global FFT (1D FFT, HPC Challenge), performance [Gflop/s]; 6.4 Tflop/s on BlueGene/P
G. Almási, B. Dalton, L. L. Hu, F. Franchetti, Y. Liu, A. Sidelnik, T. Spelce, I. G. Tānase, E. Tiotto, Y. Voronenko, X. Xue: 2010 IBM HPC Challenge Class II Submission. Winner of the 2010 HPC Challenge Class II Award (Most Productive System).

More Results: Spiral Outperforms Humans
(Plot panels: FFT on Multicore; SAR; SDR; FFT on FPGA. Axis label: improvement.)

More Information: www.spiral.net, www.spiralgen.com
