Does Training Input Selection Matter for Feedback-Directed Optimizations? Paul Berube berube@cs.ualberta.ca University of Alberta CDP05, October 17, 2005
Outline • Background and motivation • Aestimo : an FDO evaluation tool • Workload Selection • Results CASCON: October 17, 2005 Paul Berube 2
What Is FDO? Feedback-Directed Optimization: compile -> train (on a training input, producing a profile) -> compile (using the profile) -> evaluate (on an eval input)
Performance Evaluation Space: Static optimization. Dimensions: Programs, Evaluation Inputs
Performance Evaluation Space: FDO. Dimensions: Programs, Evaluation Inputs, Training Inputs
Performance Evaluation Space: SPEC. Programs; Evaluation Inputs: usually 1 Ref input; Training Inputs: only 1 Train input
The Big Question • Does the selection of training inputs matter for feedback-directed optimization? – Different transformation decisions? – Different performance?
Aestimo • An FDO evaluation tool • Automates training and evaluating on a large number of inputs • Isolates individual transformations – Fewer experiment variables – Results vary by transformation • Measures: – Differences in transformation decisions – Performance differences
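One way to picture the "differences in transformation decisions" measure is a set comparison between the decisions logged for two training inputs. The sketch below is only an illustration: the Jaccard distance, the function name `decision_difference`, and the (caller, callee) log entries are all assumptions, not the metric or log format Aestimo actually uses.

```python
def decision_difference(log_a, log_b):
    """Jaccard distance between two sets of transformation decisions.

    0.0 means both training inputs led to identical decisions;
    1.0 means no decision is shared. (Assumed metric, for illustration.)
    """
    a, b = set(log_a), set(log_b)
    union = a | b
    if not union:
        return 0.0  # two empty logs are trivially identical
    return len(a ^ b) / len(union)


# Hypothetical inlining decisions, written as (caller, callee) pairs.
trained_on_text = {("main", "compress"), ("compress", "sort_block")}
trained_on_image = {("main", "compress"), ("compress", "huffman")}

difference = decision_difference(trained_on_text, trained_on_image)
```

Here one decision is shared and two are not, so two of the three distinct decisions differ between the two training inputs.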
An Overview of Aestimo • Compile: Program + Workload -> Binaries and Optimization Logs (one per input) • Execute: Binary × Input, 5 times each • Analyze: Workload Performance, Transformation Differences, FDO vs. Static, Resubstitution
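The execute and analyze stages can be sketched as a train-by-evaluate matrix. This is a minimal illustration under stated assumptions: each FDO binary is identified by the name of its training input, `run` is a hypothetical callable that returns one execution time, and "% faster than static" is assumed to mean the relative reduction in mean run time. None of these names or definitions come from Aestimo itself.

```python
from statistics import mean


def mean_time(run, binary, eval_input, repetitions=5):
    """Average the run time over several executions (5 per the slides)."""
    return mean(run(binary, eval_input) for _ in range(repetitions))


def percent_faster(static_time, fdo_time):
    """Assumed definition: relative time saved versus the static binary."""
    return 100.0 * (static_time - fdo_time) / static_time


def workload_performance(inputs, run, static_binary="static"):
    """Evaluate every FDO binary (one per training input) on every input."""
    results = {}
    for eval_input in inputs:
        static_time = mean_time(run, static_binary, eval_input)
        for train_input in inputs:
            fdo_time = mean_time(run, train_input, eval_input)
            results[(train_input, eval_input)] = percent_faster(static_time, fdo_time)
    return results


def resubstitution(results):
    """Keep only the cases where the training and evaluation input match."""
    return {train: pct for (train, ev), pct in results.items() if train == ev}


def fake_run(binary, eval_input):
    """Stand-in timer with made-up numbers, so the sketch runs anywhere."""
    if binary == "static":
        return 10.0
    return 8.0 if binary == eval_input else 9.5


results = workload_performance(["xml", "jpeg"], fake_run)
diagonal = resubstitution(results)
```

With a real harness, `fake_run` would be replaced by a function that executes the binary and times it; the rest of the bookkeeping stays the same.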
Compilation Process • Static path: Source -> Static Compile -> Static Binary + Optimization Log • FDO path: Source -> Instr. Compile -> Instrumented Binary; Instrumented Binary + Training Input -> Training Run -> Profile; Source + Profile -> FDO Compile -> FDO Binary + Optimization Log • Final step: Optimization Log -> Final Static Compile -> Binary
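The static and FDO paths of the compilation process can be written down as a small pipeline. This is a sketch only: the `Toolchain` class, its field names, and the lambda stand-ins are hypothetical, and no real compiler flags are implied. The final compile step is left out because the slides do not say how it consumes the optimization log.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Toolchain:
    """Hypothetical hooks for the compile and run steps named on the slide."""
    static_compile: Callable  # source -> (binary, optimization_log)
    instr_compile: Callable   # source -> instrumented_binary
    training_run: Callable    # (instrumented_binary, training_input) -> profile
    fdo_compile: Callable     # (source, profile) -> (binary, optimization_log)


def build_static(tools, source):
    """Static path: a single compile with no profile."""
    return tools.static_compile(source)


def build_fdo(tools, source, training_input):
    """FDO path: instrument, train to collect a profile, then recompile."""
    instrumented = tools.instr_compile(source)
    profile = tools.training_run(instrumented, training_input)
    return tools.fdo_compile(source, profile)


# Stand-in toolchain that only builds descriptive strings.
tools = Toolchain(
    static_compile=lambda src: (f"{src}.static", ["static decisions"]),
    instr_compile=lambda src: f"{src}.instr",
    training_run=lambda binary, inp: f"profile({binary},{inp})",
    fdo_compile=lambda src, prof: (f"{src}.fdo", [f"decisions from {prof}"]),
)

static_binary, static_log = build_static(tools, "bzip2")
fdo_binary, fdo_log = build_fdo(tools, "bzip2", "xml")
```

Passing the steps in as callables keeps the sketch independent of any particular compiler; a real harness would wrap actual compiler and benchmark invocations behind the same four hooks.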
Workload Selection • SPEC CINT2000 benchmark inputs – 8 programs, 32 inputs • 84 additional inputs – Contacted benchmark authors – Varied representative inputs – Existing collections – Synthetic input generator
Results • ORC compiler • Inlining and if-conversion • Itanium and Itanium 2 processors
Workload Performance: bzip2 (Inlining, Itanium) • Training Input Selection Matters!
Summary of Contributions • Training input selection does impact optimization decisions and performance • Aestimo: – Automates training and evaluating on a large number of inputs – Isolates individual transformations • A large collection of representative inputs for SPEC CINT2000 programs
Thank You • Questions?
[Figure: Performance: bzip2 trained on xml (Inlining, Itanium). x-axis: Evaluation Input; y-axis: % Faster than Static, ticks from -6 to 12.]
[Figure: Performance: bzip2.combined (Inlining, Itanium). x-axis: Evaluation Input; y-axis: % Faster than Static, ticks from 0 to 18.]