pointless fusion for pointwise application
play

Pointless fusion for pointwise application Malcolm Wallace - PowerPoint PPT Presentation

Pointless fusion for pointwise application Malcolm Wallace University of York (soon @ Standard-Chartered Bank) Monday, 15 February 1 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 3.135E-054 4.936E-079 2.600E-012 1.455E-010 7.263E-015


  1. Pointless fusion for pointwise application Malcolm Wallace University of York (soon @ Standard-Chartered Bank) Monday, 15 February 1

  2. 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 3.135E-054 4.936E-079 2.600E-012 1.455E-010 7.263E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 2.410E-054 4.936E-079 2.601E-012 1.456E-010 7.264E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 1.853E-054 4.936E-079 2.601E-012 1.457E-010 7.265E-015 An array of numbers 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 1.425E-054 4.936E-079 2.602E-012 1.458E-010 7.266E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 1.095E-054 4.936E-079 2.602E-012 1.459E-010 7.267E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 8.421E-055 4.936E-079 2.602E-012 1.460E-010 7.268E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 6.474E-055 4.936E-079 2.603E-012 1.461E-010 7.269E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 4.977E-055 4.936E-079 2.603E-012 1.462E-010 7.270E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 3.827E-055 4.936E-079 2.604E-012 1.463E-010 7.271E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 2.942E-055 4.936E-079 2.604E-012 1.464E-010 7.272E-015 A LARGE array of numbers 8.563E+002 2.051E+004 4.180E-004 7.596E-001 5.260E-005 6.898E-002 1.710E-001 8.053E-011 2.686E-013 8.650E-012 8.571E+002 2.050E+004 4.205E-004 7.596E-001 5.546E-005 7.249E-002 1.675E-001 8.104E-011 2.729E-013 8.706E-012 8.581E+002 2.049E+004 4.231E-004 7.596E-001 5.855E-005 7.627E-002 1.637E-001 8.158E-011 2.775E-013 8.764E-012 8.592E+002 2.047E+004 4.258E-004 7.596E-001 6.190E-005 8.037E-002 1.596E-001 8.214E-011 2.823E-013 8.824E-012 8.604E+002 2.046E+004 4.286E-004 7.596E-001 6.555E-005 8.479E-002 1.551E-001 8.274E-011 2.874E-013 8.887E-012 8.619E+002 2.044E+004 4.315E-004 7.596E-001 6.952E-005 8.958E-002 1.504E-001 8.338E-011 2.929E-013 8.952E-012 8.635E+002 2.041E+004 4.346E-004 7.596E-001 7.383E-005 9.474E-002 1.452E-001 8.405E-011 2.988E-013 9.020E-012 95,946,240,000 x 8.653E+002 2.039E+004 4.379E-004 7.596E-001 7.851E-005 1.003E-001 1.396E-001 8.477E-011 3.052E-013 9.090E-012 8.674E+002 2.036E+004 4.413E-004 7.596E-001 8.358E-005 1.063E-001 1.336E-001 8.555E-011 3.120E-013 9.164E-012 8.697E+002 2.032E+004 4.450E-004 7.596E-001 8.905E-005 1.127E-001 1.272E-001 8.639E-011 3.194E-013 9.241E-012 8.723E+002 2.028E+004 4.489E-004 7.596E-001 9.495E-005 1.195E-001 1.204E-001 8.729E-011 3.275E-013 9.323E-012 8.752E+002 2.024E+004 4.531E-004 7.595E-001 1.013E-004 1.267E-001 1.132E-001 8.827E-011 3.364E-013 9.408E-012 8.784E+002 2.019E+004 4.576E-004 7.595E-001 1.080E-004 1.343E-001 1.056E-001 8.934E-011 3.461E-013 9.498E-012 8.819E+002 2.013E+004 4.624E-004 7.595E-001 1.151E-004 1.421E-001 9.775E-002 9.049E-011 3.567E-013 9.593E-012 8.857E+002 2.007E+004 4.675E-004 7.595E-001 1.225E-004 1.502E-001 8.966E-002 9.174E-011 3.684E-013 9.692E-012 8.899E+002 2.000E+004 4.730E-004 7.595E-001 1.303E-004 1.584E-001 8.145E-002 9.312E-011 3.813E-013 9.797E-012 8.945E+002 1.992E+004 4.791E-004 7.595E-001 1.382E-004 1.666E-001 7.324E-002 9.465E-011 3.959E-013 9.908E-012 8.994E+002 1.984E+004 4.856E-004 7.595E-001 1.462E-004 1.747E-001 6.518E-002 9.627E-011 4.116E-013 1.002E-011 9.043E+002 1.976E+004 4.923E-004 7.595E-001 1.542E-004 1.824E-001 5.744E-002 9.796E-011 4.283E-013 1.014E-011 9.092E+002 1.967E+004 4.992E-004 7.595E-001 1.619E-004 1.897E-001 5.012E-002 9.972E-011 4.459E-013 1.026E-011 9.140E+002 1.959E+004 5.063E-004 7.595E-001 1.693E-004 1.965E-001 4.335E-002 1.015E-010 4.644E-013 1.038E-011 9.191E+002 1.950E+004 5.138E-004 7.595E-001 1.765E-004 2.027E-001 3.716E-002 1.035E-010 4.846E-013 1.051E-011 9.240E+002 1.942E+004 5.215E-004 7.595E-001 1.831E-004 2.082E-001 3.164E-002 1.054E-010 5.054E-013 1.064E-011 9.286E+002 1.934E+004 5.290E-004 7.595E-001 1.893E-004 2.130E-001 2.678E-002 1.073E-010 5.264E-013 1.076E-011 9.329E+002 1.927E+004 5.365E-004 7.595E-001 1.948E-004 2.172E-001 2.257E-002 1.092E-010 5.477E-013 1.089E-011 9.368E+002 1.920E+004 5.439E-004 7.595E-001 1.998E-004 2.208E-001 1.895E-002 1.111E-010 5.691E-013 1.101E-011 9.406E+002 1.914E+004 5.514E-004 7.594E-001 2.044E-004 2.239E-001 1.588E-002 1.129E-010 5.911E-013 1.113E-011 9.442E+002 1.908E+004 5.590E-004 7.594E-001 2.086E-004 2.265E-001 1.328E-002 1.148E-010 6.137E-013 1.126E-011 9.475E+002 1.903E+004 5.665E-004 7.594E-001 2.124E-004 2.287E-001 1.110E-002 1.166E-010 6.363E-013 1.138E-011 9.506E+002 1.898E+004 5.738E-004 7.594E-001 2.158E-004 2.305E-001 9.281E-003 1.184E-010 6.590E-013 1.151E-011 9.534E+002 1.893E+004 5.811E-004 7.594E-001 2.189E-004 2.320E-001 7.767E-003 1.202E-010 6.818E-013 1.163E-011 Monday, 15 February 2

  3. Multi-dimensional data 95,946,240,000 = 200 x 248 x 248 x 600 x 13 timesteps Z Y X fields fields: pressure temperature density gas motion (vector) H + ion concentration H 2 concentration ... Monday, 15 February 3

  4. Flat or structured? Flat representation: i=[0..3079] “Virtual” dimensions: z=[0..4] y=[0..21] x=[0..27] Monday, 15 February 4

  5. 2D contours Monday, 15 February 5

  6. 2D continuous Monday, 15 February 6

  7. 3D surfaces Monday, 15 February 7

  8. 3D volume Monday, 15 February 8

  9. 3D vectors Monday, 15 February 9

  10. 2D sketchline vectors Monday, 15 February 10

  11. 2D sketchline vectors Monday, 15 February 11

  12. Animation Monday, 15 February 12

  13. Overlays Monday, 15 February 13

  14. Scatterplot Monday, 15 February 14

  15. Requirements 95,946,240,000 = 200 x 248 x 248 x 600 x 13 timesteps Z Y X fields • Flat ‣ Dimensionally-structured numeric data • Packed ‣ Close-packing, no boxing • Large ‣ Avoid copying, moving, or intermediates • Affordable ‣ Good performance is a goal - minimal traversals - streaming if possible Monday, 15 February 15

  16. Observation Array operations fall into three categories: • structural manipulation • slice, sub-range, downsample • tuple, join, split • transpose, rotate dimensions • pointwise computation • arithmetic • comparison • folds • sums, products Monday, 15 February 16

  17. Structural fusion Monday, 15 February 17

  18. Isosurfacing picture th dataset = isosurface th (mkCell8 (size dataset) (values dataset)) (mkCell8 (size dataset) (locations dataset)) isosurface th samples geom = zipWith (surf_cell th) (stream samples) (stream geom) surf_cell th sample geom = map (surf_geom th sample geom) $ mc_case $ fmap (>th) sample surf_geom th sample geom (v0,v1) = interp (inv_interp th samp_0 samp_1) geom_0 geom_1 where samp_0 = select v0 sample samp_1 = select v1 sample geom_0 = select v0 geom geom_1 = select v1 geom mkCell8 sz v = zipWith8 Cell8 v (drop 1 v) (drop line v) (drop (1+line) v) (drop plane v) (drop (1+plane) v) (drop (plane+line) v) (drop (1+plane+line) v) where line = thd3 size plane = snd3 size Monday, 15 February 18

  19. Isosurfacing picture th dataset = isosurface th (mkCell8 (size dataset) (values dataset)) (mkCell8 (size dataset) (locations dataset)) isosurface th samples geom = zipWith (surf_cell th) (stream samples) (stream geom) surf_cell th sample geom = map (surf_geom th sample geom) $ mc_case $ fmap (>th) sample surf_geom th sample geom (v0,v1) = interp (inv_interp th samp_0 samp_1) geom_0 geom_1 where samp_0 = select v0 sample samp_1 = select v1 sample geom_0 = select v0 geom geom_1 = select v1 geom mkCell8 sz v = zipWith8 Cell8 v (drop 1 v) (drop line v) (drop (1+line) v) (drop plane v) (drop (1+plane) v) (drop (plane+line) v) (drop (1+plane+line) v) where line = thd3 size plane = snd3 size Monday, 15 February 19

  20. Re-structured data 200 x 248 x 248 x 600 x 13 timesteps Z Y X fields sub-range slice downsample slices [160,162..180] [124] [0,4..247] [0,4..599] [2,7] Monday, 15 February 20

  21. Fusion 1: Structural slice i (transpose (subrange j k arr)) intermediate arrays original array result array Monday, 15 February 21

  22. Represent traversal explicitly varies faster 200 x 248 x 248 x 600 x 13 timesteps Z Y X fields sub-range slice downsample slices [160,162..180] [124] [0,4..247] [0,4..599] [2,7] for (t=160; t<=180; t+=2) { z = 124; for (y=0; y<=247; y+=4) { for (x=0; x<=599; x+=4) { for (field `elem` [2,7]) { arr [ t*248*248*600*13 + z*248*600*13 + y*600*13 + x*13 + field ]; } } } } Monday, 15 February 22

  23. Manipulate traversal order slice i (transpose (subrange j k arr)) 200 x 248 x [0,4..248] x [0,4..600] x [2,7] timesteps Z Y X fields for (t=0; t<=199; t+=1) { for (z=0; z<=247; z+=1) { for (y=0; y<=247; y+=4) { for (x=0; x<=599; x+=4) { for (field `elem` [2,7]) { arr [ t*248*248*600*13 + z*248*600*13 + y*600*13 + x*13 + field ]; } } } } } Monday, 15 February 23

  24. Manipulate traversal order slice i (transpose (subrange j k arr)) [j..k] x 248 x [0,4..248] x [0,4..600] x [2,7] timesteps Z Y X fields for (t=j; t<=k; t+=1) { for (z=0; z<=247; z+=1) { for (y=0; y<=247; y+=4) { for (x=0; x<=599; x+=4) { for (field `elem` [2,7]) { arr [ t*248*248*600*13 + z*248*600*13 + y*600*13 + x*13 + field ]; } } } } } Monday, 15 February 24

Recommend


More recommend