Performance-Influence Models of Multigrid Codes (Alexander Grebhahn, Norbert Siegmund, Sven Apel)




slide-1
SLIDE 1

Alexander Grebhahn

Performance-Influence Models of Multigrid Codes

Alexander Grebhahn, Norbert Siegmund, Sven Apel

University of Passau

ExaStencils @ Dagstuhl April 2015


slide-2
SLIDE 2

Which is the Optimal Configuration for a Given Hardware Platform?

slide-3
SLIDE 3

How to Identify Optimal Configurations?

  • 350 optional binary options lead to more configurations than the estimated number of atoms in the universe
  • Numeric options make things much worse

[Figure: two panels, each labeled "Optimal configuration for"]
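The size claim above can be checked with a few lines of arithmetic. This is a minimal sketch: the figure of about 10^80 atoms is an assumption (a commonly quoted order-of-magnitude estimate, not a number from the slides), and the options are assumed to be independent.

```python
# Number of configurations spanned by 350 independent binary options.
n_options = 350
n_configurations = 2 ** n_options

# Assumption: roughly 10^80 atoms in the observable universe
# (an often-quoted order-of-magnitude estimate, not from the slides).
atoms_in_universe = 10 ** 80

# Order of magnitude of the configuration count (number of digits minus one).
magnitude = len(str(n_configurations)) - 1

print(f"configurations: about 10^{magnitude}")
print("more configurations than atoms:", n_configurations > atoms_in_universe)
```

Constraints between options would shrink the space, but exhaustive measurement stays out of reach by many orders of magnitude.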

slide-4
SLIDE 4

What can we do?

  • Use Machine Learning
    Pros:
      • Automated
      • Many tools, much research
    Cons:
      • Overfitting, underfitting
      • Not tailored to the application domain


slide-6
SLIDE 6

What can we do?

  • Use Machine Learning
    Pros:
      • Automated
      • Many tools, much research
    Cons:
      • Overfitting, underfitting
      • Not tailored to the application domain

  • Use Domain Knowledge
    Pros:
      • Knowledge about asymptotic behavior
      • No measurement overhead
    Cons:
      • Expensive, hard to incorporate
      • Sometimes misleading

[Diagram labels: optimal conf., influence model]

slide-7
SLIDE 7

Performance-Influence Model

  • Simple example: Multigrid Solver
      • With 320 configurations (c ∈ C)
  • Performance-Influence Model (Π)
      • Π : C → ℝ
      • Π(c) = 77 - 4.5 * GS + 24.5 * pre + 32 * GS * pre - 1.3 * pre * AMG + …
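To make the model concrete, here is a small sketch that evaluates the first four terms of Π for every combination of GS and pre. It rests on several assumptions: both options are treated as 0/1 switches, the terms hidden behind "…" are ignored, and a lower value is taken to be better (as it would be for a run time). The function name `influence` is illustrative only.

```python
from itertools import product


def influence(gs: int, pre: int) -> float:
    """First four terms of the example model; the remaining terms are omitted."""
    return 77 - 4.5 * gs + 24.5 * pre + 32 * gs * pre


# Evaluate the partial model for all four combinations of the two options.
table = {(gs, pre): influence(gs, pre) for gs, pre in product((0, 1), repeat=2)}

for (gs, pre), value in sorted(table.items()):
    print(f"GS={gs} pre={pre} -> {value}")

# Assumption: smaller is better, so the minimum is the preferred combination.
best = min(table, key=table.get)
print("best (GS, pre):", best)
```

The interaction term 32 * GS * pre only contributes when both options are selected, which is why selecting both costs more than the sum of their individual influences.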


slide-9
SLIDE 9

Performance-Influence Model

{MGS}

slide-10
SLIDE 10

Performance-Influence Model

{MGS}

  • {MGS}+{pre-smoothing}
  • {MGS}+{post-smoothing}
  • {MGS}+{GS}
  • {MGS}+{Jac}
  • {MGS}+{AMG}
  • {MGS}+{CG}

slide-11
SLIDE 11

Performance-Influence Model

{MGS}

{MGS}+{pre-smoothing}

slide-12
SLIDE 12

Performance-Influence Model

{MGS}+{pre-smoothing}

  • {MGS}+{pre-smoothing}+{pre-smoothing}
  • {MGS}+{pre-smoothing}+{post-smoothing}
  • {MGS}+{pre-smoothing}+{GS}
  • {MGS}+{pre-smoothing}+{Jac}
  • {MGS}+{pre-smoothing}+{AMG}
  • {MGS}+{pre-smoothing}+{CG}
  • {MGS}+{pre-smoothing,pre-smoothing}
  • {MGS}+{pre-smoothing,post-smoothing}
  • {MGS}+{pre-smoothing,GS}
  • {MGS}+{pre-smoothing,Jac}
  • {MGS}+{pre-smoothing,AMG}
  • {MGS}+{pre-smoothing,CG}

slide-13
SLIDE 13

Performance-Influence Model

{MGS}+{pre-smoothing}

{MGS}+{pre-smoothing}+{GS}

slide-14
SLIDE 14

Performance-Influence Model

{MGS}+{pre-smoothing}+{GS}

  • {MGS}+{pre-smoothing}+{GS}+{pre-smoothing}
  • {MGS}+{pre-smoothing}+{GS}+{post-smoothing}
  • {MGS}+{pre-smoothing}+{GS}+{GS}
  • {MGS}+{pre-smoothing}+{GS}+{Jac}
  • {MGS}+{pre-smoothing}+{GS}+{AMG}
  • {MGS}+{pre-smoothing}+{GS}+{CG}
  • {MGS}+{pre-smoothing,pre-smoothing}+{GS}
  • {MGS}+{pre-smoothing,post-smoothing}+{GS}
  • {MGS}+{pre-smoothing,GS}+{GS}
  • {MGS}+{pre-smoothing,Jac}+{GS}
  • {MGS}+{pre-smoothing,AMG}+{GS}
  • {MGS}+{pre-smoothing,CG}+{GS}
  • {MGS}+{pre-smoothing}+{GS,pre-smoothing}
  • {MGS}+{pre-smoothing}+{GS,post-smoothing}
  • {MGS}+{pre-smoothing}+{GS,GS}
  • {MGS}+{pre-smoothing}+{GS,Jac}
  • {MGS}+{pre-smoothing}+{GS,AMG}
  • {MGS}+{pre-smoothing}+{GS,CG}

slide-15
SLIDE 15

Performance-Influence Model

{MGS}+{pre-smoothing}+{GS}+{GS,pre-smoothing}
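The candidate lists on the preceding slides follow a regular pattern, which can be sketched as below. This is an illustration only and not the authors' implementation: `candidates` and `show` are hypothetical helpers that reproduce the enumeration shown on the slides (append a new single-option term, or extend an existing non-root term by one option). How the best candidate is chosen in each round is not stated on the slides and is left out.

```python
OPTIONS = ["pre-smoothing", "post-smoothing", "GS", "Jac", "AMG", "CG"]


def candidates(model, options=OPTIONS):
    """Enumerate successor models of `model`.

    `model` is a list of terms; each term is a tuple of option names and
    model[0] is the root term ("MGS",), which is never extended.
    """
    result = []
    # (a) Append a new term consisting of a single option.
    for option in options:
        result.append(model + [(option,)])
    # (b) Extend each existing non-root term by one more option.
    for i in range(1, len(model)):
        for option in options:
            result.append(model[:i] + [model[i] + (option,)] + model[i + 1:])
    return result


def show(model):
    """Render a model in the slides' notation, e.g. {MGS}+{pre-smoothing}."""
    return "+".join("{" + ",".join(term) + "}" for term in model)


root = [("MGS",)]
step1 = root + [("pre-smoothing",)]
step2 = step1 + [("GS",)]

for model in (root, step1, step2):
    print(show(model), "->", len(candidates(model)), "candidates")
```

A term that repeats an option, such as {pre-smoothing,pre-smoothing}, appears in the slides' lists as well, so the sketch does not filter such terms out.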

slide-16
SLIDE 16

Sampling

[Diagram labels: optimal conf., influence model, Π]


slide-20
SLIDE 20

Binary and Numeric Options

Structured sampling approaches for the different kinds of options

  • Binary options (GS, Jac): (0,0), (0,1), (1,0), (1,1)
  • Numeric options (pre-smoothing, post-smoothing): (0,0), (0,8), (8,8), (8,0)

slide-21
SLIDE 21

Heuristics for Binary Options

  • Random?
      • Unlikely to select a valid configuration
      • Only locally clustered solutions using SAT
  • Heuristics
      • Option-Wise (OW)
      • Negative Option-Wise (nOW)
      • Pair-Wise (PW)

[Figure: example binary options under "optimizations": vectorize, unroll, tileOuterLoop, colorSplitting]
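The three heuristics can be sketched for the four optimization options named on the slide. The definitions used here are assumptions based on how these heuristics are commonly described (one option at a time, all options but one, and every pair of options); the slide itself does not spell them out, and constraints between options are ignored.

```python
from itertools import combinations

# Option names taken from the slide's example.
OPTIONS = ["vectorize", "unroll", "tileOuterLoop", "colorSplitting"]


def option_wise(options):
    """One configuration per option, with only that option selected."""
    return [frozenset([option]) for option in options]


def negative_option_wise(options):
    """One configuration per option, with every option except that one selected."""
    everything = frozenset(options)
    return [everything - {option} for option in options]


def pair_wise(options):
    """One configuration per pair of options, with exactly that pair selected."""
    return [frozenset(pair) for pair in combinations(options, 2)]


for name, sampler in [("OW", option_wise),
                      ("nOW", negative_option_wise),
                      ("PW", pair_wise)]:
    sample = sampler(OPTIONS)
    print(name, len(sample), [sorted(config) for config in sample])
```

Option-Wise and Negative Option-Wise grow linearly with the number of options, while Pair-Wise grows quadratically, which is the source of its higher measurement overhead.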

slide-22
SLIDE 22

Heuristics for Numeric Options (Design of Experiments)

  • Response surface models
      • Identify the influence of independent variables on a parameter
      • Scale to multiple numeric options
  • Central Composite Design (CCD)
  • Plackett-Burman Design (PBD)

[Figure: two plots with axes pre-smoothing and post-smoothing]
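As an illustration of a central composite design, the sketch below generates a face-centered CCD (corner points, axial points on the faces, and one center point). Which CCD variant the authors used is not stated on the slide, so the face-centered form is an assumption, as are the function name `face_centered_ccd` and the range 0 to 6 for both smoothing options. Constraints between options are ignored.

```python
from itertools import product


def face_centered_ccd(ranges):
    """Return the sample points of a face-centered central composite design.

    `ranges` maps an option name to its (low, high) bounds.
    """
    names = list(ranges)
    k = len(names)

    # Coded levels: -1 = low, 0 = center, +1 = high.
    coded = set(product((-1, 1), repeat=k))   # 2^k corner points
    coded.add((0,) * k)                       # center point
    for i in range(k):                        # 2k axial points
        for level in (-1, 1):
            point = [0] * k
            point[i] = level
            coded.add(tuple(point))

    def decode(level, low, high):
        return low + (level + 1) * (high - low) / 2

    return [
        tuple(decode(level, *ranges[name]) for level, name in zip(point, names))
        for point in sorted(coded)
    ]


design = face_centered_ccd({"pre-smoothing": (0, 6), "post-smoothing": (0, 6)})
for point in design:
    print(point)
```

For k numeric options this yields 2^k + 2k + 1 points, that is 9 points for two options and 15 for three.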

slide-23
SLIDE 23

Experiments: Subject Systems

  • Dune MGS: 2 304 configurations; Intel i5-4570 quad core and 32 GB RAM
  • HIPAcc: 13 485 configurations; nVidia Tesla K20 with 5 GB RAM and 2496 cores
  • HSMGP: 3 456 configurations; JuQueen at Jülich

Dune MGS
  • preconditioner: GS, SOR
  • solver: CG, Loop, BicGSTAB, Gradient
  • pre-smoothing: [0,…,6], default 3
  • post-smoothing: [0,…,6], default 3
  • Number of Cells: [50,…,55], default 50
  • Constraint: sum (pre-smoothing, post-smoothing) > 0

HIPAcc
  • API: CUDA, OpenCL
  • Texture Memory: Linear1D, Linear2D, Array2D, Ldg
  • Local Memory
  • Padding: [0,32,…,512]
  • Pixels per Thread: [1,2,3,4], default 1
  • Blocksize: 32x1, 32x2, 32x4, 32x8, 32x16, 32x32, 64x1, 64x2, 64x4, 64x8, 64x16, 128x1, 128x2, 128x4, 128x8, 256x1, 256x2, 256x4, 512x1, 512x2, 1024x1
  • Constraint: (Array2D Padding = 0)
  • Constraints of the form ¬(Local Memory ∧ Blocksize ∧ Pixels per Thread = p):
      • p = 2: 1024x1
      • p = 3: 32x32, 64x16, 128x8, 256x4, 512x2, 1024x1
      • p = 4: 32x32, 64x16, 128x8, 256x4, 512x2, 1024x1

HSMGP
  • coarse grid solver: IP_CG, IP_AMG, RED_AMG
  • smoother: GSAC, GS, Jac, BS, RBGS, RBGSAC
  • pre-smoothing: [0,…,6], default 3
  • post-smoothing: [0,…,6], default 3
  • Number of Cores: [64,256,1024,4096], default 64
  • Constraint: sum (pre-smoothing, post-smoothing) > 0
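The configuration counts for Dune MGS and HSMGP can be reproduced by enumerating the options listed on the slide. The sketch assumes that exactly one alternative is chosen per group (preconditioner, solver, smoother, coarse grid solver) and that [50,…,55] covers every integer in that range; the function name `count_configurations` is illustrative.

```python
from itertools import product


def count_configurations(options):
    """Count value combinations that satisfy pre-smoothing + post-smoothing > 0."""
    total = 0
    for values in product(*options.values()):
        config = dict(zip(options, values))
        if config["pre-smoothing"] + config["post-smoothing"] > 0:
            total += 1
    return total


smoothing_steps = range(0, 7)  # [0,...,6]

dune_mgs = {
    "preconditioner": ["GS", "SOR"],
    "solver": ["CG", "Loop", "BicGSTAB", "Gradient"],
    "pre-smoothing": smoothing_steps,
    "post-smoothing": smoothing_steps,
    "cells": range(50, 56),  # assumption: step size 1
}

hsmgp = {
    "coarse grid solver": ["IP_CG", "IP_AMG", "RED_AMG"],
    "smoother": ["GSAC", "GS", "Jac", "BS", "RBGS", "RBGSAC"],
    "pre-smoothing": smoothing_steps,
    "post-smoothing": smoothing_steps,
    "cores": [64, 256, 1024, 4096],
}

dune_count = count_configurations(dune_mgs)
hsmgp_count = count_configurations(hsmgp)
print("Dune MGS:", dune_count)
print("HSMGP:", hsmgp_count)
```

Under these assumptions the enumeration gives 2 304 configurations for Dune MGS and 3 456 for HSMGP, the same counts as on the slide.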

slide-24
SLIDE 24

Experimental Results

  • Option-Wise is the best trade-off between prediction accuracy and measurement overhead
  • Option-Wise combined with PBD(49,7) has the best accuracy (average error of about 9.1%) relative to its measurement overhead

Each cell shows ē / |C|.

System     Design      Option-Wise    Pair-Wise       Negative Option-Wise
Dune MGS   PBD(9,3)    14.1% / 45     14.9% / 72      15.8% / 45
Dune MGS   PBD(49,7)   11.4% / 245    11.9% / 392     11.6% / 245
Dune MGS   CCD         11.1% / 75     11.9% / 120     10.8% / 75
HIPAcc     PBD(9,3)    14.7% / 240    13.8% / 1221    49.3% / 85
HIPAcc     PBD(49,7)   13.9% / 736    11.1% / 3645    41.4% / 161
HIPAcc     CCD         14.2% / 242    10.5% / 1247    48.2% / 102
HSMGP      PBD(9,3)    2% / 72        2.4% / 162      3.3% / 72
HSMGP      PBD(49,7)   2.1% / 392     1.5% / 882      2.4% / 392
HSMGP      CCD         3.2% / 120     2.7% / 270      3.7% / 120

ē: average prediction error; |C|: number of measurements
PBD: Plackett-Burman Design; CCD: Central Composite Design
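The quoted average error of about 9.1% is consistent with the table if it is read as the unweighted mean of the Option-Wise / PBD(49,7) errors over the three systems. That reading is an assumption; the slide does not say how the figure was computed.

```python
# Option-Wise combined with PBD(49,7): error in percent and number of measurements,
# copied from the results table.
results = {
    "Dune MGS": (11.4, 245),
    "HIPAcc": (13.9, 736),
    "HSMGP": (2.1, 392),
}

# Assumption: the slide's 9.1% is the unweighted mean over the three systems.
mean_error = sum(error for error, _ in results.values()) / len(results)
total_measurements = sum(count for _, count in results.values())

print(f"mean error: {mean_error:.1f}%")
print("total measurements:", total_measurements)
```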

slide-25
SLIDE 25

What about Domain Knowledge?

  • Tailor numeric option sampling to known shape of function
  • Tailor binary option sampling to known interactions
  • Tailor numeric option sampling to known absence of interactions
  • Learn separate models for independent configuration spaces
  • Learn specific functions (do not probe for any function)

[Diagram labels: optimal conf., influence model, Π1 + Π2 + Π3]

slide-26
SLIDE 26

Outlook

  • Energy efficiency
  • Domain-knowledge integration and validation
  • Combined sampling of binary and numeric options

[Figure: binary sample points for GS and Jac, (0,0), (0,1), (1,0), (1,1), shown with "+" next to numeric sample points for pre-smoothing and post-smoothing, (0,0), (0,8), (8,8), (8,0)]

slide-27
SLIDE 27

Outlook

[Figure: side-by-side comparison labeled "vs."]

slide-28
SLIDE 28

Publications

  • Norbert Siegmund, Alexander Grebhahn, Sven Apel, and Christian Kästner. Performance-Influence Models for Highly Configurable Systems. Submitted to ESEC/FSE 2015.
  • Alexander Grebhahn, Sebastian Kuckuk, Christian Schmitt, Harald Köstler, Norbert Siegmund, Sven Apel, Frank Hannig, and Jürgen Teich. Experiments on Optimizing the Performance of Stencil Codes with SPL Conqueror. Parallel Processing Letters, 24(3):Article 1441001, September 2014.
  • Alexander Grebhahn, Norbert Siegmund, Sven Apel, Sebastian Kuckuk, Christian Schmitt, and Harald Köstler. Optimizing Performance of Stencil Code with SPL Conqueror. In HiStencils, January 2014.