Pointless fusion for pointwise application Malcolm Wallace University of York (soon @ Standard-Chartered Bank) Monday, 15 February 1
1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 3.135E-054 4.936E-079 2.600E-012 1.455E-010 7.263E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 2.410E-054 4.936E-079 2.601E-012 1.456E-010 7.264E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 1.853E-054 4.936E-079 2.601E-012 1.457E-010 7.265E-015 An array of numbers 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 1.425E-054 4.936E-079 2.602E-012 1.458E-010 7.266E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 1.095E-054 4.936E-079 2.602E-012 1.459E-010 7.267E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 8.421E-055 4.936E-079 2.602E-012 1.460E-010 7.268E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 6.474E-055 4.936E-079 2.603E-012 1.461E-010 7.269E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 4.977E-055 4.936E-079 2.603E-012 1.462E-010 7.270E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 3.827E-055 4.936E-079 2.604E-012 1.463E-010 7.271E-015 1.000E+003 7.216E+001 7.599E-001 6.937E-005 2.400E-001 2.942E-055 4.936E-079 2.604E-012 1.464E-010 7.272E-015 A LARGE array of numbers 8.563E+002 2.051E+004 4.180E-004 7.596E-001 5.260E-005 6.898E-002 1.710E-001 8.053E-011 2.686E-013 8.650E-012 8.571E+002 2.050E+004 4.205E-004 7.596E-001 5.546E-005 7.249E-002 1.675E-001 8.104E-011 2.729E-013 8.706E-012 8.581E+002 2.049E+004 4.231E-004 7.596E-001 5.855E-005 7.627E-002 1.637E-001 8.158E-011 2.775E-013 8.764E-012 8.592E+002 2.047E+004 4.258E-004 7.596E-001 6.190E-005 8.037E-002 1.596E-001 8.214E-011 2.823E-013 8.824E-012 8.604E+002 2.046E+004 4.286E-004 7.596E-001 6.555E-005 8.479E-002 1.551E-001 8.274E-011 2.874E-013 8.887E-012 8.619E+002 2.044E+004 4.315E-004 7.596E-001 6.952E-005 8.958E-002 1.504E-001 8.338E-011 2.929E-013 8.952E-012 8.635E+002 2.041E+004 4.346E-004 7.596E-001 7.383E-005 9.474E-002 1.452E-001 8.405E-011 2.988E-013 9.020E-012 95,946,240,000 x 8.653E+002 2.039E+004 4.379E-004 7.596E-001 7.851E-005 1.003E-001 1.396E-001 8.477E-011 3.052E-013 9.090E-012 8.674E+002 2.036E+004 4.413E-004 7.596E-001 8.358E-005 1.063E-001 1.336E-001 8.555E-011 3.120E-013 9.164E-012 8.697E+002 2.032E+004 4.450E-004 7.596E-001 8.905E-005 1.127E-001 1.272E-001 8.639E-011 3.194E-013 9.241E-012 8.723E+002 2.028E+004 4.489E-004 7.596E-001 9.495E-005 1.195E-001 1.204E-001 8.729E-011 3.275E-013 9.323E-012 8.752E+002 2.024E+004 4.531E-004 7.595E-001 1.013E-004 1.267E-001 1.132E-001 8.827E-011 3.364E-013 9.408E-012 8.784E+002 2.019E+004 4.576E-004 7.595E-001 1.080E-004 1.343E-001 1.056E-001 8.934E-011 3.461E-013 9.498E-012 8.819E+002 2.013E+004 4.624E-004 7.595E-001 1.151E-004 1.421E-001 9.775E-002 9.049E-011 3.567E-013 9.593E-012 8.857E+002 2.007E+004 4.675E-004 7.595E-001 1.225E-004 1.502E-001 8.966E-002 9.174E-011 3.684E-013 9.692E-012 8.899E+002 2.000E+004 4.730E-004 7.595E-001 1.303E-004 1.584E-001 8.145E-002 9.312E-011 3.813E-013 9.797E-012 8.945E+002 1.992E+004 4.791E-004 7.595E-001 1.382E-004 1.666E-001 7.324E-002 9.465E-011 3.959E-013 9.908E-012 8.994E+002 1.984E+004 4.856E-004 7.595E-001 1.462E-004 1.747E-001 6.518E-002 9.627E-011 4.116E-013 1.002E-011 9.043E+002 1.976E+004 4.923E-004 7.595E-001 1.542E-004 1.824E-001 5.744E-002 9.796E-011 4.283E-013 1.014E-011 9.092E+002 1.967E+004 4.992E-004 7.595E-001 1.619E-004 1.897E-001 5.012E-002 9.972E-011 4.459E-013 1.026E-011 9.140E+002 1.959E+004 5.063E-004 7.595E-001 1.693E-004 1.965E-001 4.335E-002 1.015E-010 4.644E-013 1.038E-011 9.191E+002 1.950E+004 5.138E-004 7.595E-001 1.765E-004 2.027E-001 3.716E-002 1.035E-010 4.846E-013 1.051E-011 9.240E+002 1.942E+004 5.215E-004 7.595E-001 1.831E-004 2.082E-001 3.164E-002 1.054E-010 5.054E-013 1.064E-011 9.286E+002 1.934E+004 5.290E-004 7.595E-001 1.893E-004 2.130E-001 2.678E-002 1.073E-010 5.264E-013 1.076E-011 9.329E+002 1.927E+004 5.365E-004 7.595E-001 1.948E-004 2.172E-001 2.257E-002 1.092E-010 5.477E-013 1.089E-011 9.368E+002 1.920E+004 5.439E-004 7.595E-001 1.998E-004 2.208E-001 1.895E-002 1.111E-010 5.691E-013 1.101E-011 9.406E+002 1.914E+004 5.514E-004 7.594E-001 2.044E-004 2.239E-001 1.588E-002 1.129E-010 5.911E-013 1.113E-011 9.442E+002 1.908E+004 5.590E-004 7.594E-001 2.086E-004 2.265E-001 1.328E-002 1.148E-010 6.137E-013 1.126E-011 9.475E+002 1.903E+004 5.665E-004 7.594E-001 2.124E-004 2.287E-001 1.110E-002 1.166E-010 6.363E-013 1.138E-011 9.506E+002 1.898E+004 5.738E-004 7.594E-001 2.158E-004 2.305E-001 9.281E-003 1.184E-010 6.590E-013 1.151E-011 9.534E+002 1.893E+004 5.811E-004 7.594E-001 2.189E-004 2.320E-001 7.767E-003 1.202E-010 6.818E-013 1.163E-011 Monday, 15 February 2
Multi-dimensional data 95,946,240,000 = 200 x 248 x 248 x 600 x 13 timesteps Z Y X fields fields: pressure temperature density gas motion (vector) H + ion concentration H 2 concentration ... Monday, 15 February 3
Flat or structured? Flat representation: i=[0..3079] “Virtual” dimensions: z=[0..4] y=[0..21] x=[0..27] Monday, 15 February 4
2D contours Monday, 15 February 5
2D continuous Monday, 15 February 6
3D surfaces Monday, 15 February 7
3D volume Monday, 15 February 8
3D vectors Monday, 15 February 9
2D sketchline vectors Monday, 15 February 10
2D sketchline vectors Monday, 15 February 11
Animation Monday, 15 February 12
Overlays Monday, 15 February 13
Scatterplot Monday, 15 February 14
Requirements 95,946,240,000 = 200 x 248 x 248 x 600 x 13 timesteps Z Y X fields • Flat ‣ Dimensionally-structured numeric data • Packed ‣ Close-packing, no boxing • Large ‣ Avoid copying, moving, or intermediates • Affordable ‣ Good performance is a goal - minimal traversals - streaming if possible Monday, 15 February 15
Observation Array operations fall into three categories: • structural manipulation • slice, sub-range, downsample • tuple, join, split • transpose, rotate dimensions • pointwise computation • arithmetic • comparison • folds • sums, products Monday, 15 February 16
Structural fusion Monday, 15 February 17
Isosurfacing picture th dataset = isosurface th (mkCell8 (size dataset) (values dataset)) (mkCell8 (size dataset) (locations dataset)) isosurface th samples geom = zipWith (surf_cell th) (stream samples) (stream geom) surf_cell th sample geom = map (surf_geom th sample geom) $ mc_case $ fmap (>th) sample surf_geom th sample geom (v0,v1) = interp (inv_interp th samp_0 samp_1) geom_0 geom_1 where samp_0 = select v0 sample samp_1 = select v1 sample geom_0 = select v0 geom geom_1 = select v1 geom mkCell8 sz v = zipWith8 Cell8 v (drop 1 v) (drop line v) (drop (1+line) v) (drop plane v) (drop (1+plane) v) (drop (plane+line) v) (drop (1+plane+line) v) where line = thd3 size plane = snd3 size Monday, 15 February 18
Isosurfacing picture th dataset = isosurface th (mkCell8 (size dataset) (values dataset)) (mkCell8 (size dataset) (locations dataset)) isosurface th samples geom = zipWith (surf_cell th) (stream samples) (stream geom) surf_cell th sample geom = map (surf_geom th sample geom) $ mc_case $ fmap (>th) sample surf_geom th sample geom (v0,v1) = interp (inv_interp th samp_0 samp_1) geom_0 geom_1 where samp_0 = select v0 sample samp_1 = select v1 sample geom_0 = select v0 geom geom_1 = select v1 geom mkCell8 sz v = zipWith8 Cell8 v (drop 1 v) (drop line v) (drop (1+line) v) (drop plane v) (drop (1+plane) v) (drop (plane+line) v) (drop (1+plane+line) v) where line = thd3 size plane = snd3 size Monday, 15 February 19
Re-structured data 200 x 248 x 248 x 600 x 13 timesteps Z Y X fields sub-range slice downsample slices [160,162..180] [124] [0,4..247] [0,4..599] [2,7] Monday, 15 February 20
Fusion 1: Structural slice i (transpose (subrange j k arr)) intermediate arrays original array result array Monday, 15 February 21
Represent traversal explicitly varies faster 200 x 248 x 248 x 600 x 13 timesteps Z Y X fields sub-range slice downsample slices [160,162..180] [124] [0,4..247] [0,4..599] [2,7] for (t=160; t<=180; t+=2) { z = 124; for (y=0; y<=247; y+=4) { for (x=0; x<=599; x+=4) { for (field `elem` [2,7]) { arr [ t*248*248*600*13 + z*248*600*13 + y*600*13 + x*13 + field ]; } } } } Monday, 15 February 22
Manipulate traversal order slice i (transpose (subrange j k arr)) 200 x 248 x [0,4..248] x [0,4..600] x [2,7] timesteps Z Y X fields for (t=0; t<=199; t+=1) { for (z=0; z<=247; z+=1) { for (y=0; y<=247; y+=4) { for (x=0; x<=599; x+=4) { for (field `elem` [2,7]) { arr [ t*248*248*600*13 + z*248*600*13 + y*600*13 + x*13 + field ]; } } } } } Monday, 15 February 23
Manipulate traversal order slice i (transpose (subrange j k arr)) [j..k] x 248 x [0,4..248] x [0,4..600] x [2,7] timesteps Z Y X fields for (t=j; t<=k; t+=1) { for (z=0; z<=247; z+=1) { for (y=0; y<=247; y+=4) { for (x=0; x<=599; x+=4) { for (field `elem` [2,7]) { arr [ t*248*248*600*13 + z*248*600*13 + y*600*13 + x*13 + field ]; } } } } } Monday, 15 February 24
Recommend
More recommend