PolyMage: Automatic Optimization for Image Processing Pipelines


  1. PolyMage: Automatic Optimization for Image Processing Pipelines
      Ravi Teja Mullapudi, Vinay Vasista, Uday Bondhugula
      CSA, Indian Institute of Science
      June 27, 2016

  2. Table of Contents
      1 Image Processing Pipelines
      2 Language
      3 Compiler
      4 Related Work
      5 Performance Evaluation

  4. Image Processing Pipelines - Data
      Cameras and Internet
      • Instagram: 60 million photos per day. http://instagram.com/press/
      • YouTube: 100 hours of video uploaded every minute. https://www.youtube.com/yt/press/statistics.html
      Astronomy
      • Large Synoptic Survey Telescope (LSST): generates 30 TB of image data every night. http://lsst.org/lsst/google
      Medical Imaging
      • Human Connectome Project: fMRI data for 68 subjects, 1.873 TB. http://www.humanconnectome.org/

  5. Image Processing Pipelines - Computation
      Synthesis, Enhancement and Analysis of Images
      Applications
      • Computational Photography
      • Computer Vision
      • Medical Imaging

  9. Image Processing Pipelines - Challenges
      Need for Speed
      • Real-time processing
      • High resolution
      • Complex algorithms
      Modern Architectures
      • Deep memory hierarchies
      • Parallelism
      • Heterogeneity
      Hand Optimization
      • Requires expertise
      • Tedious and error prone
      • Not portable
      Libraries
      • OpenCV, CImg, MATLAB
      • Limited optimization
      • Architecture support

  10. Domain Specific Languages
      Productivity, Performance and Portability
      • Decouple algorithms from schedules
      • Support common patterns in the domain
      • High performance compilation

  11. Image Processing Pipelines - Computation Patterns
      Point-wise: $f(x, y) = g(x, y)$
      Stencil: $f(x, y) = \sum_{\sigma_x=-1}^{+1} \sum_{\sigma_y=-1}^{+1} g(x + \sigma_x, y + \sigma_y)$
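
The two patterns on this slide can be sketched in plain Python. This is only a reference illustration, not PolyMage code: the helper names `pointwise` and `stencil3x3`, the nested-list image layout, and the choice to leave border pixels at zero are all assumptions made for the example.

```python
def pointwise(g, op):
    """Point-wise pattern: the output at (x, y) depends only on g at (x, y)."""
    return [[op(v) for v in row] for row in g]


def stencil3x3(g):
    """Stencil pattern: sum of g over the 3x3 window centred at (x, y).

    Border pixels are left at 0 (an assumption for this sketch), because
    the window would otherwise read outside the image.
    """
    rows, cols = len(g), len(g[0])
    f = [[0] * cols for _ in range(rows)]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            f[x][y] = sum(g[x + sx][y + sy]
                          for sx in (-1, 0, 1) for sy in (-1, 0, 1))
    return f


g = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
doubled = pointwise(g, lambda v: 2 * v)
summed = stencil3x3(g)
```

Only the centre pixel of the 3x3 example has a complete neighbourhood, which is why stencil stages are usually defined on a smaller domain than their input.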

  12. Image Processing Pipelines - Computation Patterns
      Downsample: $f(x, y) = \sum_{\sigma_x=-1}^{+1} \sum_{\sigma_y=-1}^{+1} g(2x + \sigma_x, 2y + \sigma_y)$
      Upsample: $f(x, y) = \sum_{\sigma_x=-1}^{+1} \sum_{\sigma_y=-1}^{+1} g((x + \sigma_x)/2, (y + \sigma_y)/2)$
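
A minimal Python sketch of the two resampling patterns follows. It assumes that the division in the upsample formula is integer (floor) division and that output points whose window would leave the input are set to zero; the function names and the nested-list layout are hypothetical.

```python
def downsample(g):
    """f(x, y) = sum of g over the 3x3 window centred at (2x, 2y)."""
    rows, cols = len(g) // 2, len(g[0]) // 2
    f = [[0] * cols for _ in range(rows)]
    for x in range(1, rows):        # x = 0 would read g at row index -1
        for y in range(1, cols):
            f[x][y] = sum(g[2 * x + sx][2 * y + sy]
                          for sx in (-1, 0, 1) for sy in (-1, 0, 1))
    return f


def upsample(g):
    """f(x, y) = sum of g((x + sx) // 2, (y + sy) // 2) over a 3x3 window."""
    rows, cols = 2 * len(g), 2 * len(g[0])
    f = [[0] * cols for _ in range(rows)]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            f[x][y] = sum(g[(x + sx) // 2][(y + sy) // 2]
                          for sx in (-1, 0, 1) for sy in (-1, 0, 1))
    return f


down = downsample([[1] * 6 for _ in range(6)])
up = upsample([[1, 2], [3, 4]])
```

The access functions `2x + sx` and `(x + sx) // 2` are what distinguish these patterns from a plain stencil: the consumer and producer live on grids of different resolution.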

  13. Image Processing Pipelines - Computation Patterns
      Histogram: $f(g(x)) \mathrel{+}= 1$
      Time-iterated: $f(t, x, y) = g(f(t - 1, x, y))$
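
As a rough illustration of these two patterns, the sketch below shows a data-dependent update (histogram) and a time-iterated computation in plain Python. The names `histogram` and `time_iterated`, and the reading of `g` as a per-pixel update applied at every time step, are assumptions for the example.

```python
def histogram(g, bins):
    """Histogram pattern: the location written, f[g[x]], depends on the data."""
    f = [0] * bins
    for value in g:
        f[value] += 1
    return f


def time_iterated(f0, g, steps):
    """Time-iterated pattern: step t is computed from step t - 1 only."""
    f = f0
    for _ in range(steps):
        f = [[g(v) for v in row] for row in f]
    return f


counts = histogram([0, 2, 2, 3], bins=4)
grown = time_iterated([[1, 2]], lambda v: 2 * v, steps=3)
```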

  14. PolyMage Framework
      [Figure: compiler phases] DSL Spec, Build stage graph, Static bounds check, Inlining, Polyhedral representation, Default schedule, Alignment, Scaling, Grouping, Schedule transformation, Storage optimization, Code generation

  16. Language Constructs
      Constructs: Parameter, Variable, Image, Interval, Function, Accumulator, Stencil, Condition, Select, Case, Accumulate

          N = Parameter(Int)
          x = Variable()
          I = Image(Float, [N])
          c1 = Condition(x, '>=', 1) & Condition(x, '<=', N-2)
          c2 = Condition(x, '==', 0) | Condition(x, '==', N-1)
          f = Function(varDom = ([x], Interval(0, N-1, 1)), Float)
          f.defn = [ Case(c1, Stencil(I(x), 1.0/3, [[1, 1, 1]])),
                     Case(c2, 0) ]

      $f : [0 .. N-1] \to \mathbb{R}$
      $f(x) = \begin{cases} \sum_{\sigma_x=-1}^{+1} I(x + \sigma_x)/3 & \text{if } 1 \le x \le N-2 \\ 0 & \text{if } x = 0 \lor x = N-1 \end{cases}$
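
To make the piecewise definition concrete, here is a small plain-Python sketch that evaluates a list of (condition, expression) pairs over a domain. It is not the PolyMage implementation: `evaluate_cases` is a hypothetical helper, and taking the first matching case is an assumption (the two conditions on the slide are disjoint, so the order does not matter here).

```python
def evaluate_cases(cases, domain):
    """Evaluate a piecewise function: first case whose condition holds wins."""
    out = {}
    for x in domain:
        for condition, expression in cases:
            if condition(x):
                out[x] = expression(x)
                break
    return out


I = [3.0, 6.0, 9.0, 12.0]
N = len(I)

c1 = lambda x: 1 <= x <= N - 2           # interior points
c2 = lambda x: x == 0 or x == N - 1      # boundary points

cases = [
    (c1, lambda x: (I[x - 1] + I[x] + I[x + 1]) / 3),  # 3-point average
    (c2, lambda x: 0.0),
]

# Interval(0, N-1, 1) is read as the inclusive range 0..N-1.
f = evaluate_cases(cases, range(0, N))
```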

  17. Language Constructs

          R, C = Parameter(Int), Parameter(Int)
          I = Image(UChar, [R, C])
          x, y = Variable(), Variable()
          row, col = Interval(0, R-1, 1), Interval(0, C-1, 1)
          bins = Interval(0, 255, 1)
          hist = Accumulator(redDom = ([x, y], [row, col]),
                             varDom = ([x], bins), Int)
          hist.defn = Accumulate(hist(I(x, y)), 1, Sum)

      $hist : [0 .. 255] \to \mathbb{Z}$
      $hist(p) = |\{ (x, y) : I(x, y) = p \}|$
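
The split between a reduction domain and a variable domain can be sketched generically in plain Python. The `accumulate` helper below is hypothetical and only mirrors the shape of the construct on the slide: it walks every point of the reduction domain and combines a value into the output cell selected by a data-dependent index.

```python
import operator
from itertools import product


def accumulate(red_dom, var_size, index, value, op=operator.add, init=0):
    """Reduce over red_dom into an output of var_size cells.

    red_dom  : list of ranges, one per reduction variable
    index    : maps a reduction point to the output cell it updates
    value    : maps a reduction point to the value combined into that cell
    """
    out = [init] * var_size
    for point in product(*red_dom):
        cell = index(*point)
        out[cell] = op(out[cell], value(*point))
    return out


I = [[0, 255, 7],
     [7, 7, 0]]
R, C = len(I), len(I[0])

hist = accumulate([range(R), range(C)], 256,
                  index=lambda x, y: I[x][y],
                  value=lambda x, y: 1)
```

Every pixel contributes exactly once, so the bins always sum to `R * C`.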

  18. Unsharp Mask
      Stages (diagram): I_in, blurx, blury, sharpen, masked

          R, C = Parameter(Int), Parameter(Int)
          thresh, w = Parameter(Float), Parameter(Float)
          x, y, c = Variable(), Variable(), Variable()
          I = Image(Float, [3, R+4, C+4])

          cr = Interval(0, 2, 1)
          xr, xc = Interval(2, R+1, 1), Interval(0, C+3, 1)
          yr, yc = Interval(2, R+1, 1), Interval(2, C+1, 1)

          blurx = Function(varDom = ([c, x, y], [cr, xr, xc]), Float)
          blurx.defn = [ Stencil(I(c, x, y), 1.0/16,
                                 [[1, 4, 6, 4, 1]]) ]

          blury = Function(varDom = ([c, x, y], [cr, yr, yc]), Float)
          blury.defn = [ Stencil(blurx(c, x, y), 1.0/16,
                                 [[1], [4], [6], [4], [1]]) ]

          sharpen = Function(varDom = ([c, x, y], [cr, yr, yc]), Float)
          sharpen.defn = [ I(c, x, y) * (1 + w) - blury(c, x, y) * w ]

          masked = Function(varDom = ([c, x, y], [cr, yr, yc]), Float)
          diff = Abs((I(c, x, y) - blury(c, x, y)))
          cond = Condition(diff, '<', thresh)
          masked.defn = [ Select(cond, I(c, x, y), sharpen(c, x, y)) ]
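
For comparison, the same pipeline can be written as a naive single-channel reference in plain Python. This is a sketch under stated assumptions, not the code PolyMage generates: it assumes `blurx` blurs along `x` and `blury` along `y` (chosen so that every access stays inside the declared domains), it handles one channel only, and it copies the input into output pixels outside the `masked` domain.

```python
KERNEL = [1, 4, 6, 4, 1]          # 5-tap binomial weights, sum = 16
OFFSETS = range(-2, 3)


def unsharp_mask(img, thresh, w):
    """img: (R+4) x (C+4) single-channel image with a 2-pixel border."""
    rows, cols = len(img), len(img[0])
    R, C = rows - 4, cols - 4

    # blurx: x in [2, R+1], y in [0, C+3]
    blurx = [[0.0] * cols for _ in range(rows)]
    for x in range(2, R + 2):
        for y in range(cols):
            blurx[x][y] = sum(k * img[x + d][y]
                              for k, d in zip(KERNEL, OFFSETS)) / 16

    # blury, sharpen and masked: x in [2, R+1], y in [2, C+1]
    out = [row[:] for row in img]
    for x in range(2, R + 2):
        for y in range(2, C + 2):
            blury = sum(k * blurx[x][y + d]
                        for k, d in zip(KERNEL, OFFSETS)) / 16
            sharpen = img[x][y] * (1 + w) - blury * w
            keep = abs(img[x][y] - blury) < thresh
            out[x][y] = img[x][y] if keep else sharpen
    return out


impulse = [[0.0] * 5 for _ in range(5)]
impulse[2][2] = 256.0
sharpened = unsharp_mask(impulse, thresh=1.0, w=1.0)
untouched = unsharp_mask(impulse, thresh=1000.0, w=1.0)
```

Written this way, `blurx` is materialised as a full intermediate image while the later stages are fused into one loop; choosing such trade-offs automatically is the job of the compiler described in the following slides.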

  19. Harris Corner Detection
      Stages (diagram): I_in, Ix, Iy, Ixx, Iyy, Ixy, Sxx, Syy, Sxy, det, trace, harris

          R, C = Parameter(Int), Parameter(Int)
          I = Image(Float, [R+2, C+2])
          x, y = Variable(), Variable()
          row, col = Interval(0, R+1, 1), Interval(0, C+1, 1)

          c = Condition(x, '>=', 1) & Condition(x, '<=', R) & \
              Condition(y, '>=', 1) & Condition(y, '<=', C)
          cb = Condition(x, '>=', 2) & Condition(x, '<=', R-1) & \
               Condition(y, '>=', 2) & Condition(y, '<=', C-1)

          Iy = Function(varDom = ([x, y], [row, col]), Float)
          Iy.defn = [ Case(c, Stencil(I(x, y), 1.0/12,
                                      [[-1, -2, -1],
                                       [ 0,  0,  0],
                                       [ 1,  2,  1]])) ]

          Ix = Function(varDom = ([x, y], [row, col]), Float)
          Ix.defn = [ Case(c, Stencil(I(x, y), 1.0/12,
                                      [[-1, 0, 1],
                                       [-2, 0, 2],
                                       [-1, 0, 1]])) ]

          Ixx = Function(varDom = ([x, y], [row, col]), Float)
          Ixx.defn = [ Case(c, Ix(x, y) * Ix(x, y)) ]

          Iyy = Function(varDom = ([x, y], [row, col]), Float)
          Iyy.defn = [ Case(c, Iy(x, y) * Iy(x, y)) ]

          Ixy = Function(varDom = ([x, y], [row, col]), Float)
          Ixy.defn = [ Case(c, Ix(x, y) * Iy(x, y)) ]

          Sxx = Function(varDom = ([x, y], [row, col]), Float)
          Syy = Function(varDom = ([x, y], [row, col]), Float)
          Sxy = Function(varDom = ([x, y], [row, col]), Float)
          for pair in [(Sxx, Ixx), (Syy, Iyy), (Sxy, Ixy)]:
              pair[0].defn = [ Case(cb, Stencil(pair[1], 1,
                                                [[1, 1, 1],
                                                 [1, 1, 1],
                                                 [1, 1, 1]])) ]

          det = Function(varDom = ([x, y], [row, col]), Float)
          d = Sxx(x, y) * Syy(x, y) - Sxy(x, y) * Sxy(x, y)
          det.defn = [ Case(cb, d) ]

          trace = Function(varDom = ([x, y], [row, col]), Float)
          trace.defn = [ Case(cb, Sxx(x, y) + Syy(x, y)) ]

          harris = Function(varDom = ([x, y], [row, col]), Float)
          coarsity = det(x, y) - .04 * trace(x, y) * trace(x, y)
          harris.defn = [ Case(cb, coarsity) ]
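
A naive plain-Python reference for this pipeline is sketched below. Several things are assumptions made for the example rather than facts about PolyMage: kernel entry `[i][j]` is taken to weight the pixel at `(x + i - 1, y + j - 1)`, points outside a stage's condition are treated as zero, and the helper names `window` and `harris` are hypothetical.

```python
K_IY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
K_IX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
K_BOX = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]


def window(src, kernel, x, y):
    """Weighted sum of src over the 3x3 window centred at (x, y)."""
    return sum(kernel[i][j] * src[x + i - 1][y + j - 1]
               for i in range(3) for j in range(3))


def harris(img):
    """img: (R+2) x (C+2) image; returns the response on 2..R-1, 2..C-1."""
    rows, cols = len(img), len(img[0])
    R, C = rows - 2, cols - 2

    Ix = [[0.0] * cols for _ in range(rows)]
    Iy = [[0.0] * cols for _ in range(rows)]
    for x in range(1, R + 1):               # condition c
        for y in range(1, C + 1):
            Ix[x][y] = window(img, K_IX, x, y) / 12
            Iy[x][y] = window(img, K_IY, x, y) / 12

    Ixx = [[a * a for a in row] for row in Ix]
    Iyy = [[b * b for b in row] for row in Iy]
    Ixy = [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(Ix, Iy)]

    out = [[0.0] * cols for _ in range(rows)]
    for x in range(2, R):                   # condition cb
        for y in range(2, C):
            sxx = window(Ixx, K_BOX, x, y)
            syy = window(Iyy, K_BOX, x, y)
            sxy = window(Ixy, K_BOX, x, y)
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            out[x][y] = det - 0.04 * trace * trace
    return out


ramp = [[float(y) for y in range(5)] for _ in range(5)]
response = harris(ramp)
```

On a pure intensity ramp only one gradient component is non-zero, so `det` vanishes and the response is negative, which is the expected signature of an edge rather than a corner.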

  20. Pyramid Blending
      [Figure: pipeline graph of the pyramid blending example, with mask input M, downsampling stages (↓x, ↓y), upsampling stages (↑x, ↑y), and nodes labeled L, X and +]

  22. Compiler - Polyhedral Representation

          x = Variable()
          f_in = Image(Float, [18])
          f1 = Function(varDom = ([x], [Interval(0, 17, 1)]), Float)
          f1.defn = [ f_in(x) + 1 ]
          f2 = Function(varDom = ([x], [Interval(1, 16, 1)]), Float)
          f2.defn = [ f1(x-1) + f1(x+1) ]
          f_out = Function(varDom = ([x], [Interval(2, 15, 1)]), Float)
          f_out.defn = [ f2(x-1) * f2(x+1) ]

      Domains [Figure: the points of f1(x), f2(x) and f_out(x) plotted against x, one row per function]
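
The three intervals in this example are exactly the largest domains on which every access stays in bounds. The sketch below derives them from the access offsets in plain Python; `consumer_domain` is a hypothetical helper, and intervals are represented as inclusive `(lo, hi)` pairs.

```python
def consumer_domain(producer_domain, offsets):
    """Largest inclusive interval of x such that x + offset stays inside
    the producer's domain for every offset the consumer reads."""
    lo, hi = producer_domain
    return (lo - min(offsets), hi - max(offsets))


input_domain = (0, 17)                                  # Image(Float, [18])
f1_domain = consumer_domain(input_domain, [0])          # reads f_in(x)
f2_domain = consumer_domain(f1_domain, [-1, +1])        # reads f1(x-1), f1(x+1)
f_out_domain = consumer_domain(f2_domain, [-1, +1])     # reads f2(x-1), f2(x+1)
```

Each stencil stage loses one point on either side, so the domain shrinks from 18 points to 16 and then to 14.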

  23. Compiler - Polyhedral Representation
      (same example as slide 22)
      Dependence vectors [Figure: dependences drawn between the points of f1(x), f2(x) and f_out(x), plotted against x]
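
One way to read the dependence vectors in this figure is as the difference between a consumer point and the producer point it reads, with each function placed on its own row. The plain-Python sketch below computes them from the access offsets; the row numbers (0 for f1, 1 for f2, 2 for f_out) and the helper name are assumptions for the illustration.

```python
# Row assigned to each function, and the producer accesses of each consumer.
level = {"f1": 0, "f2": 1, "f_out": 2}
reads = {
    "f2": [("f1", -1), ("f1", +1)],       # f2(x) reads f1(x-1) and f1(x+1)
    "f_out": [("f2", -1), ("f2", +1)],    # f_out(x) reads f2(x-1) and f2(x+1)
}


def dependence_vectors(consumer):
    """Consumer point (level_c, x) minus producer point (level_p, x + offset)."""
    return [(level[consumer] - level[producer], -offset)
            for producer, offset in reads[consumer]]


f2_deps = dependence_vectors("f2")
f_out_deps = dependence_vectors("f_out")
```

A positive first component means that every producer sits on an earlier row than its consumer, which is what makes row-by-row execution valid.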

  24. Compiler - Polyhedral Representation
      (same example as slide 22)
      Live-outs [Figure: live-out values highlighted among the points of f1(x), f2(x) and f_out(x), plotted against x]
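
The slide does not spell out how live-outs are determined. Under one plausible reading, a stage is live-out of a group of stages if it is a pipeline output or if some consumer lies outside the group. The sketch below encodes that assumption in plain Python; the graph encoding and the `live_outs` helper are hypothetical.

```python
# Stage graph of the running example: stage -> stages that consume it.
consumers = {
    "f1": ["f2"],
    "f2": ["f_out"],
    "f_out": [],
}


def live_outs(graph, group):
    """Stages of `group` whose values are needed outside the group."""
    outs = []
    for stage in sorted(group):
        users = graph[stage]
        if not users or any(user not in group for user in users):
            outs.append(stage)
    return outs


whole_pipeline = live_outs(consumers, {"f1", "f2", "f_out"})
first_two = live_outs(consumers, {"f1", "f2"})
```

Only live-out stages need full-size storage; values of the other stages in a group are consumed inside the group.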

  25. Compiler - Polyhedral Representation
      (same example as slide 22)
      Default schedule
          f_out(x) → (2, x)
          f2(x) → (1, x)
          f1(x) → (0, x)
      [Figure: the points of f1(x), f2(x) and f_out(x) plotted against x]
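
The default schedule assigns each point a two-component timestamp, and executing points in lexicographic timestamp order runs the pipeline stage by stage. The plain-Python sketch below simulates this for the running example; the input values, the dictionary-based storage, and the reading of the f_out definition as a product are assumptions for the illustration.

```python
f_in = list(range(18))

domains = {"f1": range(0, 18), "f2": range(1, 17), "f_out": range(2, 16)}
level = {"f1": 0, "f2": 1, "f_out": 2}

# Timestamp (level, x) for every point, sorted lexicographically.
points = sorted((level[stage], x, stage)
                for stage in domains for x in domains[stage])

values = {stage: {} for stage in domains}
for _, x, stage in points:
    if stage == "f1":
        values[stage][x] = f_in[x] + 1
    elif stage == "f2":
        # A KeyError here would mean a producer had not run yet.
        values[stage][x] = values["f1"][x - 1] + values["f1"][x + 1]
    else:
        values[stage][x] = values["f2"][x - 1] * values["f2"][x + 1]
```

This order is always legal because all dependences point from a lower level to a higher one, but it gives no reuse between stages: every value of f1 is produced before any value of f2 is consumed.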
