Efficient Program Compilation through Machine Learning Techniques



  1. Efficient Program Compilation through Machine Learning Techniques
Gennady Pekhimenko, IBM Canada
Angela Demke Brown, University of Toronto

  2. Motivation
My cool compiler at -O2 runs a pipeline of transformations (Unroll, Inline, Peephole, DCE, ... about 100 optimizations) and produces an executable in a few seconds. But what to do if the executable is slow? Replace -O2 with -O5: compilation now takes 1-10 minutes, but yields a fast executable.

  3. Motivation (2)
Compiling our cool operating system at -O2 takes 1 hour, but the executable is too slow. Recompiling at -O5 takes 20 hours, and we do not have that much time. Why did this happen?

  4. Basic Idea
Do we need all of these ~100 optimizations (Unroll, ...) for every function? Probably not. Compiler writers can typically solve this problem, but how?
1. Describe every function
2. Classify functions based on the description
3. Apply only certain optimizations to each class
Machine learning is good at solving this kind of problem.

  5. Overview  Motivation  System Overview  Experiments and Results  Related Work  Conclusions  Future Work

  6. Initial Experiment 3X difference on average

  7. Initial Experiment (2)
[Chart: SPEC2000 execution time (secs) at -O3 vs. -qhot -O3 across the benchmarks bzip2, applu, crafty, eon, gap, gzip, mcf, vortex, vpr, ammp, art, equake, facerec, fma3d, galgel, lucas, mgrid, sixtrack, swim, wupwise, mesa.]

  8. Our System
Offline — Gather training data: prepare (extract features, modify heuristic values, choose transformations), compile (find hot methods), measure run time, and record the best settings for each feature vector. Learn: train a logistic regression classifier on this data.
Online — Deploy: the classifier runs inside the TPO/XL compiler, and its classification sets heuristic values and parameters.

  9. Data Preparation
Three key elements:
 Feature extraction — total # of insts, loop nest level, # and % of loads, stores, and branches, loop characteristics, float and integer # and %
 Heuristic values modification — the existing XL compiler was missing this functionality; an extension was made to the existing Heuristic Context approach
 Target set of transformations — Unroll, Wandwaving, If-conversion, Unswitching, CSE, Index Splitting, ...
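The feature extraction step above can be sketched as follows. This is a minimal illustration in plain Python, assuming a toy `Function` record; the real features are computed inside the TPO/XL compiler, and the exact feature set here is only the subset named on the slide.

```python
from dataclasses import dataclass

# Hypothetical IR representation for illustration only; each instruction is a
# tuple whose first element is its kind ("load", "store", "branch", ...).
@dataclass
class Function:
    name: str
    insts: list
    max_loop_nest: int

def extract_features(fn: Function) -> list:
    """Build a fixed-length feature vector for one function: total instruction
    count, loop nest depth, and absolute/relative counts of loads, stores,
    and branches."""
    total = len(fn.insts)
    counts = {"load": 0, "store": 0, "branch": 0}
    for kind, *_ in fn.insts:
        if kind in counts:
            counts[kind] += 1
    vec = [total, fn.max_loop_nest]
    for kind in ("load", "store", "branch"):
        vec.append(counts[kind])
        vec.append(counts[kind] / total if total else 0.0)
    return vec
```

Every function then maps to one such vector, which later serves as the classifier input.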

  10. Gather Training Data
 Try to “cut” transformations backwards (from last to first), e.g. Late Unroll, Wandwaving, Inlining
 If run time is not worse than before, the transformation can be skipped
 Otherwise we keep it
 We do this for every hot function of every test
The main benefit is linear complexity.
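The backward "cutting" procedure above can be sketched as a greedy elimination loop. This is an assumed reconstruction, not the paper's code; `measure_runtime` stands in for the real compile-and-time step.

```python
def prune_transformations(transforms, measure_runtime, tolerance=0.0):
    """Greedy backward elimination: walk the transformation list from last to
    first, tentatively drop each one, and keep the drop only if run time does
    not get worse. Cost is linear: one measurement per transformation."""
    active = list(transforms)
    best = measure_runtime(active)
    for t in reversed(transforms):
        trial = [x for x in active if x != t]
        rt = measure_runtime(trial)
        if rt <= best + tolerance:  # not worse: the transformation can be skipped
            active, best = trial, rt
        # otherwise keep t
    return active
```

With n transformations this needs only n + 1 runs, versus 2^n for exhaustive search over subsets.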

  11. Learn with Logistic Regression
Input: function descriptions and the best heuristic values found for them
Candidate learners: Logistic Regression, Neural Networks, Genetic Programming
Output: a classifier that produces heuristic values for the compiler (.hpredict files)
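As a rough sketch of the chosen learner, the snippet below trains one binary logistic-regression predictor (e.g. "apply this transformation to this function?") with plain gradient descent. It is a hand-rolled stand-in, assuming nothing about the paper's actual training infrastructure or .hpredict encoding.

```python
import math

def train_logistic(X, y, lr=0.1, epochs=500):
    """Minimal logistic-regression trainer for one binary decision.
    X: list of feature vectors, y: list of 0/1 labels."""
    n = len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            g = p - yi                       # gradient of the log loss w.r.t. z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z)) >= 0.5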

  12. Deployment
Online phase, for every function:
 Calculate the feature vector
 Compute the prediction
 Use this prediction as the heuristic context
The overhead is negligible.
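The online steps above can be sketched as a single lookup over per-transformation models. The names and the `(weights, bias)` encoding are illustrative assumptions, not the compiler's actual .hpredict format.

```python
import math

def choose_transformations(features, models):
    """Deployment step: given a function's feature vector and a dict of
    per-transformation logistic models (w, b) learned offline, decide which
    transformations to enable for this function."""
    enabled = []
    for name, (w, b) in models.items():
        z = sum(wj * xj for wj, xj in zip(w, features)) + b
        if 1.0 / (1.0 + math.exp(-z)) >= 0.5:  # predicted beneficial
            enabled.append(name)
    return enabled
```

The cost per function is a handful of dot products, which is why the slide can claim negligible overhead relative to the optimizations themselves.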

  13. Overview  Motivation  System Overview  Experiments and Results  Related Work  Conclusions  Future Work

  14. Experiments
Benchmarks: SPEC2000, plus others from IBM customers
Platform: IBM server, 4 x Power5 1.9 GHz, 32 GB RAM, running AIX 5.3

  15. Results: compilation time
[Chart: normalized compilation time for Oracle and Classifier across the SPEC2000 benchmarks; 2x average speedup (GeoMean).]

  16. Results: execution time
[Chart: execution time (secs) for Baseline, Oracle, and Classifier across the SPEC2000 benchmarks.]

  17. New benchmarks: compilation time
[Chart: normalized compilation time with the Classifier on the new benchmarks.]

  18. New benchmarks: execution time
[Chart: execution time (secs) for Baseline and Classifier on apsi, parser, twolf, dmo, and argonne; 4% speedup.]

  19. Overview  Motivation  System Overview  Experiments and Results  Related Work  Conclusions  Future Work

  20. Related Work  Iterative Compilation  Pan and Eigenmann  Agakov, et al.  Single Heuristic Tuning  Calder, et al.  Stephenson, et al.  Multiple Heuristic Tuning  Cavazos, et al.  MILEPOST GCC

  21. Conclusions and Future Work  2x average compile time decrease  Future work  Execution time improvement  -O5 level  Performance Counters for better method description  Other benefits  Heuristic Context Infrastructure  Bug Finding

  22. Thank you  Raul Silvera, Arie Tal, Greg Steffan, Mathew Zaleski  Questions?
