Complexity Measures for Parallel Computation
Complexity Measures for Parallel Computation
Problem parameters:
• n  index of problem size
• p  number of processors
Algorithm parameters:
• tp  running time on p processors
• t1  time on 1 processor = sequential time = “work”
• t∞  time on unlimited processors = critical-path length = “span”
• v   total communication volume
Performance measures:
• speedup  s = t1 / tp
• efficiency  e = t1 / (p · tp) = s / p
• (potential) parallelism  pp = t1 / t∞
• computational intensity  q = t1 / v
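The four performance measures follow directly from the algorithm parameters; a minimal sketch in Python (the timing values below are made-up illustrative numbers, not measurements):

```python
def performance_measures(t1, tp, t_inf, v, p):
    """Derive the standard measures from work t1, time tp, span t_inf, volume v."""
    speedup = t1 / tp              # s = t1 / tp
    efficiency = t1 / (p * tp)     # e = t1 / (p * tp) = s / p
    parallelism = t1 / t_inf       # pp = t1 / t_inf
    intensity = t1 / v             # q = t1 / v
    return speedup, efficiency, parallelism, intensity

# Hypothetical run: t1 = 100 s of work, tp = 30 s on p = 4 processors,
# span t_inf = 10 s, communication volume v = 50 words.
s, e, pp, q = performance_measures(100.0, 30.0, 10.0, 50.0, 4)
```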
Several possible models!
• Execution time and parallelism:
  • Work / Span Model
• Total cost of moving data:
  • Communication Volume Model
• Detailed models that try to capture time for moving data:
  • Latency / Bandwidth Model (for message-passing)
  • Cache Memory Model (for hierarchical memory)
• Other detailed models we won’t discuss: LogP, UMH, …
Work / Span Model
• tp = execution time on p processors
• t1 = work
• t∞ = span*
• Work Law: tp ≥ t1 / p
• Span Law: tp ≥ t∞
* Also called critical-path length or computational depth.
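Together, the two laws give a lower bound on tp; a minimal sketch (the numbers are hypothetical):

```python
def tp_lower_bound(t1, t_inf, p):
    """Best achievable tp: the larger of the Work Law bound (t1/p)
    and the Span Law bound (t_inf)."""
    return max(t1 / p, t_inf)

# With t1 = 100 and t_inf = 10, adding processors beyond t1/t_inf = 10
# no longer helps: the span becomes the bottleneck.
assert tp_lower_bound(100, 10, 4) == 25.0    # work-limited
assert tp_lower_bound(100, 10, 100) == 10.0  # span-limited
```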
Series Composition (A then B)
• Work: t1(A ∪ B) = t1(A) + t1(B)
• Span: t∞(A ∪ B) = t∞(A) + t∞(B)
Parallel Composition (A and B simultaneously)
• Work: t1(A ∪ B) = t1(A) + t1(B)
• Span: t∞(A ∪ B) = max{t∞(A), t∞(B)}
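The two composition rules can be sketched as operations on (work, span) pairs; the numbers in the example are hypothetical:

```python
def series(a, b):
    """Compose (work, span) pairs for A followed by B: both add."""
    return (a[0] + b[0], a[1] + b[1])

def parallel(a, b):
    """Compose (work, span) pairs for A alongside B: work adds, span is the max."""
    return (a[0] + b[0], max(a[1], b[1]))

A = (40, 5)  # hypothetical: 40 units of work, span 5
B = (60, 8)
assert series(A, B) == (100, 13)
assert parallel(A, B) == (100, 8)
```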
Speedup
Def. t1/tp = speedup on p processors.
• If t1/tp = Θ(p), we have linear speedup,
• = p, we have perfect linear speedup,
• > p, we have superlinear speedup (which is not possible in this model, because of the Work Law tp ≥ t1/p).
Parallelism Because the Span Law requires t p ≥ t ∞ , the maximum possible speedup is t 1 /t ∞ = (potential) parallelism = the average amount of work per step along the span.
Laws of Parallel Complexity
• Work law: tp ≥ t1 / p
• Span law: tp ≥ t∞
• Amdahl’s law: if a fraction f, between 0 and 1, of the work must be done sequentially, then speedup ≤ 1 / f
• Exercise: prove Amdahl’s law from the span law.
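Amdahl’s law in its full form gives speedup = 1 / (f + (1 − f)/p), which approaches the 1/f bound as p grows; a minimal sketch:

```python
def amdahl_speedup(f, p):
    """Speedup with sequential fraction f of the work on p processors."""
    return 1.0 / (f + (1.0 - f) / p)

# Even with an enormous processor count, f = 0.1 caps speedup below 10.
assert amdahl_speedup(0.1, 10**9) < 10.0
# With no sequential fraction, speedup is perfect: equal to p.
assert amdahl_speedup(0.0, 4) == 4.0
```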
Communication Volume Model
• Network of p processors, each with local memory
• Message-passing
• Communication volume (v): total size (in words) of all messages passed during the computation
• Broadcasting one word costs volume p (actually, p − 1)
• No explicit accounting for communication time
• Thus, it can’t really model parallel efficiency or speedup; for that, we’d use the latency/bandwidth model (see later slide)
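The broadcast cost on the slide can be stated as a one-liner; a minimal sketch:

```python
def broadcast_volume(p, w=1):
    """Volume of broadcasting w words from one processor to the other p - 1:
    the message is counted once per receiving processor."""
    return (p - 1) * w

assert broadcast_volume(8) == 7      # one word to 8 processors
assert broadcast_volume(8, 100) == 700
```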
Detailed complexity measures for data movement I: Latency/Bandwidth Model
Moving data between processors by message-passing.
• Machine parameters:
  • α or tstartup: latency (message startup time, in seconds)
  • β or tdata: inverse bandwidth (in seconds per word)
  • Between nodes of Triton, α ≈ 2.2 × 10⁻⁶ and β ≈ 6.4 × 10⁻⁹
• Time to send & recv or bcast a message of w words: α + w·β
• tcomm: total communication time
• tcomp: total computation time
• Total parallel time: tp = tcomp + tcomm
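The message-time formula can be sketched directly, using the Triton parameters from the slide:

```python
ALPHA = 2.2e-6  # latency (seconds per message), Triton figure from the slide
BETA = 6.4e-9   # inverse bandwidth (seconds per word), Triton figure

def message_time(w, alpha=ALPHA, beta=BETA):
    """Time to send a message of w words: alpha + w * beta."""
    return alpha + w * beta

# Latency dominates short messages; bandwidth dominates long ones.
# The crossover is at w = alpha / beta, about 344 words for these parameters.
```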
Detailed complexity measures for data movement II: Cache Memory Model
Moving data between cache and memory on one processor:
• Assume just two levels in the memory hierarchy, fast and slow
• All data initially in slow memory
• m = number of memory elements (words) moved between fast and slow memory
• tm = time per slow-memory operation
• f = number of arithmetic operations
• tf = time per arithmetic operation, tf << tm
• q = f / m (computational intensity): flops per slow-memory access
• Minimum possible time = f · tf, when all data is in fast memory
• Actual time: f · tf + m · tm = f · tf · (1 + (tm/tf) · (1/q))
• Larger q means time closer to the minimum f · tf
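The two forms of the actual-time formula are algebraically identical; a minimal sketch with hypothetical machine parameters:

```python
def model_time(f, m, tf, tm):
    """Actual time f*tf + m*tm, computed via the intensity form
    f*tf*(1 + (tm/tf)*(1/q)) with q = f/m."""
    q = f / m
    t = f * tf * (1.0 + (tm / tf) / q)
    # Sanity check: the two forms of the formula agree.
    assert abs(t - (f * tf + m * tm)) < 1e-9 * t
    return t

# Hypothetical machine with tm = 100 * tf. At q = 10 the time is 11x the
# minimum f*tf; raising q to 1000 brings it down to 1.1x the minimum.
assert model_time(1000, 100, 1.0, 100.0) == 11000.0
```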