Approximating with Input Level Granularity
Parker Hill, Michael Laurenzano, Mehrzad Samadi, Scott Mahlke, Jason Mars, Lingjia Tang

Computational Model
● Each operation is executed with several inputs
[Figure: Computation]

Sensitivity to Input (16x8 Tiling*)
[Figure: several example inputs, each shown as Input, Gamma Filter output, and Approximation output; the approximation is marked ✓ for some inputs and ⊗ for another]
● Is this an acceptable approximation method?
*Samadi et al., ASPLOS 2014
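
A minimal sketch of why a tiling approximation is input-sensitive. It assumes (the slides do not say this) that a tiling evaluates the gamma filter once per tile, at the tile's top-left pixel, and reuses that value for the whole tile. The function names, the gamma value, and the quality metric (1 minus mean absolute error on a 0..1 scale) are all hypothetical.

```python
def gamma_exact(img, gamma=2.2):
    """Apply the gamma filter to every pixel (values in 0..1)."""
    return [[p ** (1.0 / gamma) for p in row] for row in img]


def gamma_tiled(img, tile_h, tile_w, gamma=2.2):
    """Assumed tiling scheme: evaluate one pixel per tile, reuse it tile-wide."""
    height, width = len(img), len(img[0])
    out = [[0.0] * width for _ in range(height)]
    for y0 in range(0, height, tile_h):
        for x0 in range(0, width, tile_w):
            value = img[y0][x0] ** (1.0 / gamma)  # one evaluation per tile
            for y in range(y0, min(y0 + tile_h, height)):
                for x in range(x0, min(x0 + tile_w, width)):
                    out[y][x] = value
    return out


def quality(exact, approx):
    """Hypothetical metric: 1 - mean absolute error, in 0..1."""
    n = sum(len(row) for row in exact)
    err = sum(abs(e - a)
              for e_row, a_row in zip(exact, approx)
              for e, a in zip(e_row, a_row))
    return 1.0 - err / n


# A flat image survives tiling unchanged; a checkerboard does not.
flat = [[0.5] * 16 for _ in range(16)]
checker = [[float((x + y) % 2) for x in range(16)] for y in range(16)]
q_flat = quality(gamma_exact(flat), gamma_tiled(flat, 8, 16))
q_checker = quality(gamma_exact(checker), gamma_tiled(checker, 8, 16))
```

The same tiling scores very differently on the two synthetic images, which is the effect the ✓ and ⊗ marks illustrate.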

Previous Work
● Use some set of inputs to:
– Determine if an approximation is accurate enough
– Pick the fastest acceptable approximation
● Reuse the approximation for several inputs
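
The static scheme above could be sketched as follows. This is illustrative only: `pick_static` is a hypothetical name, "accurate enough" is assumed to mean that every training input meets the quality target, and the quality values are made up. The speedups are the ones quoted elsewhere in this talk.

```python
def pick_static(candidates, toq):
    """Return the fastest candidate whose quality meets toq on every training input.

    candidates maps name -> (speedup, [quality on each training input]).
    Returns None when no candidate qualifies, meaning the exact code should run.
    """
    ok = [(speedup, name)
          for name, (speedup, quals) in candidates.items()
          if all(q >= toq for q in quals)]
    return max(ok)[1] if ok else None


# Speedups are from the slides; the quality values are hypothetical.
candidates = {
    "16x8": (49.0, [0.97, 0.82]),  # misses the target on the second input
    "8x8": (22.0, [0.98, 0.88]),   # misses the target on the second input
    "4x2": (5.9, [0.99, 0.95]),
}
chosen = pick_static(candidates, toq=0.90)
```

Because one choice is reused for every later input, a single hard training input is enough to force the slowest candidate here.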

Performance vs Accuracy
Tiling    Speedup    Accuracy (two example inputs)
16x8      49x        ✓ ⊗
4x2       5.9x       ✓ ✓

Trade-off with Many Inputs
[Figure: two histograms of the proportion of inputs versus output quality (100% down to 80%), one for the 4x2 tiling approximation (5.9x speedup) and one for 8x8 tiling (22x speedup); TOQ (target output quality) = 90%, with regions labeled Missed Opportunity, Fast + High Quality, and TOQ Violation]
● Conservative approximation → small speedup
● Cannot approximate more aggressively
● We would like to approximate inputs differently
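
To make the three regions of the histograms concrete, the sketch below splits a set of per-input quality scores for one fixed approximation. The quality values are made up, and the 5-point `headroom` margin that separates "missed opportunity" from "fast + high quality" is a hypothetical choice; the slides do not say where that boundary lies.

```python
def summarize(qualities, toq=0.90, headroom=0.05):
    """Split inputs into TOQ violations, near-TOQ hits, and missed opportunities."""
    n = len(qualities)
    violations = sum(q < toq for q in qualities)
    missed = sum(q >= toq + headroom for q in qualities)
    return {
        "toq_violation": violations / n,
        "fast_high_quality": (n - violations - missed) / n,
        "missed_opportunity": missed / n,
    }


# Hypothetical output qualities of one fixed approximation on eight inputs.
result = summarize([0.99, 0.98, 0.96, 0.93, 0.91, 0.91, 0.89, 0.85])
```

A conservative approximation shifts mass toward `missed_opportunity`; a more aggressive one shifts it toward `toq_violation`.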

Dynamic Approximation Challenges
● Must analyze accurately
– Cannot violate TOQ
– Need to pick a fast approximation
● Must analyze quickly
– Analysis time limits potential speedup
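
The cost of slow analysis can be shown with a short calculation. As a simplification, it assumes the per-input analysis runs before the approximate kernel and its time simply adds to the runtime. The timings are hypothetical; only the 49x figure comes from the slides.

```python
def effective_speedup(t_exact, kernel_speedup, t_analysis):
    """Speedup over exact execution once per-input analysis time is included."""
    t_approx = t_exact / kernel_speedup
    return t_exact / (t_analysis + t_approx)


# Hypothetical: a 100 ms exact run, a 49x approximation, 10 ms of analysis.
s = effective_speedup(t_exact=100.0, kernel_speedup=49.0, t_analysis=10.0)
```

With these numbers the 49x kernel delivers roughly 8.3x end to end, and no approximation could exceed `t_exact / t_analysis`, which is 10x here.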

One Possible Dynamic System
1) Provide:
● A set of approximations (tilings: 2x2, 4x2, 16x8, 16x16)
● Input
2) Apply analysis to each pair:
● Performance
● Output quality
3) Select best approximation:
● Meets accuracy constraint
● High performance
4) Apply approximation
[Figure: Input and Approximations feed Analysis, then Selection, producing a Customized Approximation and the Approx. Result]
Example, first input: 2x2 ✓, 4x2 ✓, 16x8 ✓, 16x16 ⊗ → selects 16x8
Example, second input: 2x2 ✓, 4x2 ✓, 16x8 ⊗, 16x16 ⊗ → selects 4x2
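
The four steps on this slide might look like this in code. It is a sketch under stated assumptions, not the system's implementation: `estimate_quality` stands in for whatever analysis is used, candidates are assumed to be ordered fastest first (larger tiles assumed faster), and the quality values in the usage example are made up to match the slide's two outcomes.

```python
def select_approximation(inp, candidates, estimate_quality, toq):
    """Pick the first candidate (fastest first) whose estimated quality meets toq.

    candidates: approximation names ordered fastest to slowest (an assumption).
    estimate_quality(inp, name): hypothetical per-input analysis, returns 0..1.
    Returns None if nothing qualifies, meaning the exact code should run.
    """
    for name in candidates:
        if estimate_quality(inp, name) >= toq:
            return name
    return None


# Assumed ordering: larger tiles are faster.
tilings = ["16x16", "16x8", "4x2", "2x2"]

# Hypothetical analysis results for two inputs.
table = {
    "input_a": {"16x16": 0.84, "16x8": 0.92, "4x2": 0.97, "2x2": 0.99},
    "input_b": {"16x16": 0.70, "16x8": 0.81, "4x2": 0.93, "2x2": 0.98},
}


def lookup(inp, name):
    """Stand-in for the analysis step: read the made-up table."""
    return table[inp][name]


choice_a = select_approximation("input_a", tilings, lookup, toq=0.90)
choice_b = select_approximation("input_b", tilings, lookup, toq=0.90)
```

Scanning from fastest to slowest means the analysis can stop at the first acceptable candidate instead of evaluating every pair.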

Dynamic Oracle Selections
● Optimal choice depends heavily on input
[Figure: bar chart of the proportion of inputs (0% to 15%) for which the dynamic oracle selects each tiling: 16x16, 8x16, 16x32, 8x32, 32x32, 4x16, 8x8, 16x64, 32x16, 32x64, 64x128, 4x8, 4x32, 16x128, 32x128, 4x4, 8x64, 8x4, and 24 others]

Dynamic Oracle Performance
[Figure: histogram of the proportion of inputs (0% to 50%) versus output quality (100% down to 80%) under the dynamic oracle; TOQ = 90%, with regions labeled Missed Opportunity, Fast + High Quality, and TOQ Violation]
● Accuracy near TOQ