Efficient Analysis of Multidimensional Linear Systems for Wordlength Optimization
Gaël Deest, Tomofumi Yuki, Olivier Sentieys, Steven Derrien
This work was funded by the European FP7 project Alma.

Embedded System Design
Many constraints:
• Power efficiency
• Production cost
• Performance / speed
• Time-to-market
• …

Design-Space Exploration (DSE)
[Figure: cost (power or area) plotted against execution time, showing the feasible solutions, the time constraint, and the optimum.]
[Figure: cost (power or area) plotted against accuracy degradation (signal-to-noise ratio), showing the accuracy constraint and the optimum.]
Custom fixed-point formats are used to reduce cost.

Wordlength Optimization Process
• Soft accuracy constraints (e.g., noise power)
• Fast accuracy evaluation is critical for thorough design-space exploration

This Work
State of the art:
Techniques                      Applicability   Depth of DSE
Simulation-based                Excellent       Limited
Current analytical techniques   Limited         Good
Our approach                    Good            Good
Diagnostic: applicability issues for analytical techniques.
Contribution: extended applicability and scalability.

Overview
• Background
• Analytical techniques
• Proposed approach

Fixed-Point Arithmetic
• Scaled integers: $2^{-k} \times$ integer value
• Product of two $n$-bit numbers: $2n$ bits!
• Some bits must be dropped (quantization)
Example (truncation): a word with bit weights $2^{3}\ 2^{2}\ 2^{1}\ 2^{0}\ .\ 2^{-1}\ 2^{-2}\ 2^{-3}\ 2^{-4}\ 2^{-5}\ 2^{-6}$, from which the least significant bits are dropped.
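The product-then-truncate step can be sketched in C as follows. This is only an illustration: the `fixed_t` type and the helper names are hypothetical and do not come from the slides, and the result is assumed to fit in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-point value: represents raw * 2^-frac_bits. */
typedef struct {
    int32_t raw;
    int frac_bits;
} fixed_t;

/* Exact product: it has a.frac_bits + b.frac_bits fractional bits and
   needs as many bits as both operands together. */
static int64_t fixed_mul_full(fixed_t a, fixed_t b) {
    return (int64_t)a.raw * (int64_t)b.raw;
}

/* Truncate the exact product to out_frac fractional bits by dropping the
   least significant bits, i.e. rounding toward minus infinity.
   Assumes the truncated result fits in 32 bits. */
static fixed_t fixed_mul_trunc(fixed_t a, fixed_t b, int out_frac) {
    int drop = a.frac_bits + b.frac_bits - out_frac;
    assert(drop >= 0 && drop < 62);
    int64_t full = fixed_mul_full(a, b);
    int64_t unit = (int64_t)1 << drop;
    int64_t q = full / unit;          /* C division rounds toward zero */
    if (full % unit != 0 && full < 0)
        q -= 1;                       /* adjust to floor for negatives */
    fixed_t r = { (int32_t)q, out_frac };
    return r;
}
```

For instance, 1.75 × 1.25 with two fractional bits each is stored as 7 × 5 = 35 with four fractional bits (2.1875); truncating back to two fractional bits gives 8, i.e. 2.0, an error of −0.1875.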

Quantization Errors
Modeled as noise / a random variable.
Example: truncation to $2^{-n}$ precision: $error \sim U(-2^{-n};\ 0)$
• Assumptions: Widrow hypothesis
• Statistical moments: $\sigma^2 = \frac{2^{-2n}}{12}$, $\mu = -2^{-n-1}$
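As a sanity check on these moments, the sketch below compares the continuous-model formulas with an exhaustive enumeration of the dropped bits. It assumes that the input carries `extra` additional fractional bits and that every dropped bit pattern is equally likely; the function names are illustrative only.

```c
#include <assert.h>

/* 2^-n for n >= 0, computed without libm. */
static double pow2_neg(int n) {
    assert(n >= 0);
    double v = 1.0;
    for (int i = 0; i < n; i++)
        v *= 0.5;
    return v;
}

/* Continuous model: error ~ U(-2^-n, 0). */
static double trunc_noise_mean(int n) {
    return -0.5 * pow2_neg(n);            /* mu = -2^(-n-1) */
}

static double trunc_noise_variance(int n) {
    double q = pow2_neg(n);
    return q * q / 12.0;                  /* sigma^2 = 2^(-2n) / 12 */
}

/* Enumerate all 2^extra patterns of dropped bits (assumed equiprobable)
   and measure the mean and variance of the truncation error. */
static void trunc_noise_measured(int n, int extra, double *mean, double *var) {
    assert(extra > 0 && extra < 30);
    long count = 1L << extra;
    double lsb = pow2_neg(n + extra);
    double sum = 0.0, sum_sq = 0.0;
    for (long j = 0; j < count; j++) {
        double err = -(double)j * lsb;    /* truncation never rounds up */
        sum += err;
        sum_sq += err * err;
    }
    *mean = sum / (double)count;
    *var = sum_sq / (double)count - (*mean) * (*mean);
}
```

With many dropped bits the measured values approach the continuous model; with only a few, the discrete mean is noticeably closer to zero than $-2^{-n-1}$.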

Analytical Techniques
Goal: compute an output noise formula.
Idea: model the propagation of errors to the output.
Representation: Signal Flow Graphs (SFG).

Accuracy Model Construction
• Quantization errors = new inputs
• Compute a transfer function for each error

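The Challenge
How to go from this (C source code) … to this (a signal flow graph)?
    /* N (number of taps) and b (coefficients) are assumed to be defined elsewhere. */
    float xb[N];                      /* delay line */
    float fir(float in) {
        float y = 0;
        xb[0] = in;
        for (int i = 0; i < N; i++)
            y += b[i] * xb[i];        /* accumulate into y, the returned value */
        for (int i = N - 1; i > 0; i--)
            xb[i] = xb[i - 1];        /* shift the delay line */
        return y;
    }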

The Challenge
Current methods:
• Flatten control (completely unroll loops, etc.)
• Heavy use of annotations. Example:
    #pragma DELAY
    float xb[N];
Limitations:
• Scalability issues (large graphs)
• Implicit 1D stream assumption
• Not easily applicable to image processing

Motivating Example: Deriche Filter
Recursive filter.
• Horizontal: a left-to-right pass and a right-to-left pass (figure labels: coefficients $a_1, a_2, a_3, a_4$ and $b_1, b_2$)
• Vertical: similar, along columns (figure labels: $b_1, b_2$)

Motivating Example: Deriche Filter
Issues with the SFG representation:
• Requires the image size to be statically known
• Each pixel is a different input
• Number of transfer functions: $O(N^4)$
• For a 32×32 image: $32^4$ = 1,048,576!
Cannot be handled with current methods.

Intuition of the Technique
• Current tools cannot capture the regularity of multidimensional filters.
Idea:
• Generalize SFGs to multidimensional systems of equations.
• Infer this representation using polyhedral dependence analysis.

Steps of our Method
1. Build an equational representation of the program.
    float xb[N];
    float fir(float x) {
        float y = 0;                      /* S0 */
        xb[0] = x;                        /* S1 */
        for (int i = 0; i < N; i++)
            y += b[i] * xb[i];            /* S2 */
        for (int i = N - 1; i > 0; i--)
            xb[i] = xb[i - 1];            /* S3 */
        return y;
    }
Here $n$ indexes successive calls to fir and $i$ is the loop counter:
$S_0(n) = 0$
$S_1(n) = x(n)$
$S_2(n, i) = \begin{cases} S_0(n) + b(i) \times S_1(n) & i = 0 \\ S_2(n, i-1) + b(i) \times S_3(n-1, i) & i > 0 \end{cases}$
$S_3(n, i) = \begin{cases} S_1(n) & i = 1 \\ S_3(n-1, i-1) & i > 1 \end{cases}$
$y(n) = S_2(n, N-1)$

Equational Representation
Example:
    float tmp = 0;
    for (int i = 0; i < N; i++)
        tmp = arr[i] + tmp;
$S_0() = 0$
$S_1(i) = arr(i) + \begin{cases} S_0() & i = 0 \\ S_1(i-1) & i > 0 \end{cases}$
• Statement ≡ equation
• Keeps track of data dependencies
• Easy to transform / reason about
• Relies on Array Dataflow Analysis (Feautrier, 1991)
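One way to read the equations is as a single-assignment version of the loop, where each statement instance writes its own cell instead of reusing one scalar. The sketch below is an illustration of that reading, not the tool's internal representation; the function names and the `s1` buffer are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: one scalar, overwritten at every iteration. */
static float sum_imperative(const float *arr, size_t n) {
    float tmp = 0;
    for (size_t i = 0; i < n; i++)
        tmp = arr[i] + tmp;
    return tmp;
}

/* Single-assignment form: s1[i] holds the value produced by instance S1(i).
   The data dependence (S0 for i = 0, S1(i-1) otherwise) is now explicit. */
static float sum_equational(const float *arr, size_t n, float *s1) {
    assert(arr != NULL && s1 != NULL);
    const float s0 = 0;                               /* S0() = 0 */
    for (size_t i = 0; i < n; i++)
        s1[i] = arr[i] + (i == 0 ? s0 : s1[i - 1]);   /* S1(i) */
    return n > 0 ? s1[n - 1] : s0;
}
```

Both functions return the same value; the second simply makes visible which earlier instance each value comes from, which is the information array dataflow analysis recovers.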

Example: Simplified Deriche Filter
    /* Horizontal pass (row scan) */
    for (int i = 0; i < N; i++) {
        prev = 0;
        for (int j = 0; j < N; j++) {
            tmp[i][j] = a1 * x[i][j] + b1 * prev;
            prev = tmp[i][j];
        }
    }
    /* Vertical pass (column scan) */
    for (int j = 0; j < N; j++) {
        prev = 0;
        for (int i = 0; i < N; i++) {
            y[i][j] = a2 * tmp[i][j] + b2 * prev;
            prev = y[i][j];
        }
    }

Equation System
After pre-processing:
$S_1(i, j) = a_1\, x(i, j) + b_1\, S_1(i, j-1)$
$S_2(j, i) = a_2\, S_1(i, j) + b_2\, S_2(j, i-1)$
Swapped dimensions in $S_2$ (non-uniform dependences).

Steps of our Method
2. Uniformization
$S_1(i, j) = a_1\, x(i, j) + b_1\, S_1(i, j-1)$
$S_2(i, j) = a_2\, S_1(i, j) + b_2\, S_2(i-1, j)$

Steps of our Method
3. Convolution Detection
Computation pattern: $y(\mathbf{k}) = \sum_{\mathbf{v}} c(\mathbf{v}) \times x(\mathbf{k} - \mathbf{v})$
• Pattern matching.
• Simplifies noise propagation.
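For reference, the computation pattern being matched is an ordinary multidimensional convolution. A minimal 2D instance is sketched below, assuming row-major arrays and treating samples outside the image as zero; both choices, and the function name, are assumptions made for the example.

```c
#include <assert.h>

/* y(i,j) = sum over (u,v) of c(u,v) * x(i-u, j-v).
   c is ch x cw, x and y are h x w, all stored row-major.
   Samples of x with a negative index are treated as zero (assumption). */
static void convolve_2d(const double *c, int ch, int cw,
                        const double *x, int h, int w, double *y) {
    assert(c && x && y && ch > 0 && cw > 0 && h > 0 && w > 0);
    for (int i = 0; i < h; i++) {
        for (int j = 0; j < w; j++) {
            double acc = 0.0;
            for (int u = 0; u < ch && u <= i; u++)
                for (int v = 0; v < cw && v <= j; v++)
                    acc += c[u * cw + v] * x[(i - u) * w + (j - v)];
            y[i * w + j] = acc;
        }
    }
}
```

Feeding a unit impulse at the origin returns the kernel itself in the top-left corner of the output, which is a quick way to check an implementation.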

Convolution Detection
After pre-processing:
$S_1(i, j) = a_1\, x(i, j) + b_1\, S_1(i, j-1)$
$S_2(i, j) = a_2\, S_1(i, j) + b_2\, S_2(i-1, j)$
[Figure: the detected convolution graph, $x \to S_1 \to S_2$, with convolutions by $a_1$ and $a_2$ on the forward path and by $b_1$ and $b_2$ on the feedback paths of $S_1$ and $S_2$.]

Accuracy Model Construction
4. Compute noise propagation for each source: extract the subfilter from the source to the output.
Example: from statement $S_1$ to $S_2$.
[Figure: the convolution graph $x \to S_1 \to S_2$ with coefficients $a_1, b_1, a_2, b_2$.]

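Impulse Response Computation
The impulse response $\mathbf{h}$ of the subfilter determines noise propagation:
$err_{out}(\mathbf{v}) = (err_{in} * \mathbf{h})(\mathbf{v})$
[Figure: the subfilter from $S_1$ to $S_2$, with $err_{in}$ entering at $S_1$ and $err_{out}$ leaving at $S_2$.]
• Easy to compute for non-recursive filters
• Infinite for recursive filters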
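Once an impulse response is known, the moments of the output error follow from a standard result for white noise passed through a linear time-invariant filter: the mean is scaled by $\sum h$ and the variance by $\sum h^2$. These formulas are not spelled out on the slide; the sketch below assumes an uncorrelated noise source (in line with the Widrow hypothesis) and a finite, flattened response, and uses hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Mean and variance of a noise source (hypothetical helper type). */
typedef struct {
    double mean;
    double variance;
} noise_t;

/* Propagate a white noise source through a filter whose impulse response
   has the n coefficients h[0..n-1] (any dimensionality, flattened). */
static noise_t propagate_noise(noise_t in, const double *h, size_t n) {
    assert(h != NULL);
    double sum = 0.0, sum_sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += h[i];
        sum_sq += h[i] * h[i];
    }
    noise_t out = { in.mean * sum, in.variance * sum_sq };
    return out;
}

/* Noise power = variance + squared mean. */
static double noise_power(noise_t e) {
    return e.variance + e.mean * e.mean;
}
```

Under the same independence assumption, the contributions of several sources can be combined by summing their output means and their output variances.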

Non-Recursive Filters
The graph is reduced step by step:
1. $z = x * h_1 + x * h_2$, $y = z * h_3$
2. $z = x * (h_1 + h_2)$, $y = z * h_3$
3. $y = x * h_3 * (h_1 + h_2)$
Result: $\mathbf{h} = h_3 * (h_1 + h_2)$
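The reduction amounts to adding the kernels of parallel branches and convolving the kernels of branches in series. A 1D sketch is given below, assuming $h_1$ and $h_2$ have the same length (at most 64 taps); the function names and the size limit are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Full convolution of two finite kernels: out has na + nb - 1 entries. */
static void conv_full(const double *a, size_t na,
                      const double *b, size_t nb, double *out) {
    assert(na > 0 && nb > 0);
    for (size_t k = 0; k < na + nb - 1; k++)
        out[k] = 0.0;
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            out[i + j] += a[i] * b[j];
}

/* h = h3 * (h1 + h2): parallel branches add, series branches convolve.
   h1 and h2 both have n taps; h receives n + n3 - 1 taps. */
static void compose_response(const double *h1, const double *h2, size_t n,
                             const double *h3, size_t n3, double *h) {
    assert(n > 0 && n <= 64);
    double sum[64];
    for (size_t i = 0; i < n; i++)
        sum[i] = h1[i] + h2[i];
    conv_full(h3, n3, sum, n, h);
}
```

For example, with $h_1 = (1, 0)$, $h_2 = (0, 1)$ and $h_3 = (1, -1)$, the combined response is $(1, 0, -1)$.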

Recursive Filters
$y = x * h_1 + y * h_2$
• Finding $\mathbf{h}$ ≡ solving a multidimensional recurrence
• Hard problem

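Impulse Response Approximation
• Hypothesis: the filter is stable: $\sum_{\mathbf{v}} |\mathbf{h}(\mathbf{v})| < \infty$
• Consequence: $\lim_{r \to \infty} \sum_{\|\mathbf{v}\| > r} |\mathbf{h}(\mathbf{v})| = 0$
• Idea: evaluate the impulse response on a sufficiently large window.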

Back to the Definition
• Impulse response = output of the filter when the input is a unit impulse:
$\delta(\mathbf{v}) = \begin{cases} 1 & \mathbf{v} = \mathbf{0} \\ 0 & \text{otherwise} \end{cases}$
• Feed the filter with an impulse and use the output as the impulse response.
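As an illustration of this idea, the sketch below feeds a unit impulse to the simplified two-pass Deriche recurrences used earlier in the deck and records the response on a finite window. The window size `WIN` and the function name are assumptions; in practice the window would have to be chosen large enough for the truncated tail to be negligible.

```c
#include <assert.h>

#define WIN 64   /* assumed window size */

/* Evaluate, on a WIN x WIN window, the response of
     S1(i,j) = a1 * x(i,j)  + b1 * S1(i,j-1)
     S2(i,j) = a2 * S1(i,j) + b2 * S2(i-1,j)
   to a unit impulse at (0,0). Stores the response in h and returns the
   sum of its squared samples. */
static double deriche_impulse_energy(double a1, double b1,
                                     double a2, double b2,
                                     double h[WIN][WIN]) {
    static double s1[WIN][WIN];
    assert(h != 0);
    for (int i = 0; i < WIN; i++) {
        for (int j = 0; j < WIN; j++) {
            double x = (i == 0 && j == 0) ? 1.0 : 0.0;   /* unit impulse */
            double prev = (j > 0) ? s1[i][j - 1] : 0.0;
            s1[i][j] = a1 * x + b1 * prev;
        }
    }
    double energy = 0.0;
    for (int i = 0; i < WIN; i++) {
        for (int j = 0; j < WIN; j++) {
            double prev = (i > 0) ? h[i - 1][j] : 0.0;
            h[i][j] = a2 * s1[i][j] + b2 * prev;
            energy += h[i][j] * h[i][j];
        }
    }
    return energy;
}
```

For this particular filter the response has the closed form $h(i, j) = a_1 a_2\, b_1^{\,j}\, b_2^{\,i}$, so the windowed energy can be compared with $a_1^2 a_2^2 / ((1 - b_1^2)(1 - b_2^2))$ to judge whether the window is large enough.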

Experimental Results: Model Construction Time
Algorithm                  ID.Fix (s)   Our Tool (s)
IIR8                       23.1         20.5
Sobel (32 × 32)            169.1        9.2
Sobel (64 × 64)            2173.1       9.7
Sobel (128 × 128)          -            9.4
Gaussian blur (32 × 32)    160.1        10.2
Gaussian blur (64 × 64)    2010.9       9.5
Gaussian blur (128 × 128)  -            9.4
Deriche (16 × 16)          -            6.5

Experimental Results: Model Validity
Algorithm   Simulation (dB)   Our Tool (dB)   Error (%)
IIR8        -17.80            -17.84          -0.2
Sobel       11.62             12.04           3.6
Gauss       3.78              3.78            0.1
Deriche     -18.01            -18.06          -2.78

Conclusion
1. Extraction of a compact program representation (generalization of SFGs).
2. Reformulation of analytical techniques on this representation.
3. Wider applicability for analytical accuracy analysis.

Open Issues
• Extension to non-linear, non-time-invariant filters
    • Extensions exist for 1D SFGs
    • Expected to be easily applicable to our model
• Regular, but non-affine programs
    • Example: FFT
• Highly correlated inputs
