Lecture 10: Parallel Patterns: The What and How of Parallel Programming G63.2011.002/G22.2945.001 · November 9, 2010 Embarrassing Partition Pipelines Reduction Scan
Tentative Plan for Rest of Class
• Today: Parallel Patterns
• Nov 16: Load Balancing
• Nov 23: More performance tricks, tools
• Nov 30: Odds and Ends in GPU Land
• Dec 7: moved to Dec 14 (still ok?)
• Dec 14, 21: Final Project Presentations
  • Will assign presentation date this week.
Anything not on here that you would like covered?
Will post HW3 solution soon (list message). Graded HW3 next week.
Today
"Traditional" parallel programming in a nutshell
Key question:
• Data Dependencies
Outline
• Embarrassingly Parallel
• Partition
• Pipelines
• Reduction
• Scan
Embarrassingly Parallel
$y_i = f_i(x_i)$ where $i \in \{1, \dots, N\}$.
Notation (also for the rest of this lecture):
• $x_i$: inputs
• $y_i$: outputs
• $f_i$: (pure) functions (i.e. no side effects)
When does a function have a "side effect"? In addition to producing a value, it
• modifies non-local state, or
• has an observable interaction with the outside world.
Often: $f_1 = \dots = f_N$. Then:
• Lisp/Python function map
• C++ STL std::transform
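The case $f_1 = \dots = f_N$ can be illustrated with Python's built-in map. This is a minimal sketch; the function f and the input values are made up for illustration. Because f is pure, the order in which the elements are evaluated does not affect the result, which is what makes the pattern parallelizable.

```python
def f(x):
    """A pure function: the result depends only on x and nothing else is modified."""
    return x * x + 1

xs = [1, 2, 3, 4]                      # inputs x_1 .. x_4 (illustrative values)

# y_i = f(x_i), evaluated front to back.
ys = list(map(f, xs))

# Evaluating back to front and restoring the order gives the same outputs,
# because no y_i depends on any other element.
ys_reversed = [f(x) for x in reversed(xs)][::-1]

print(ys)
print(ys == ys_reversed)
```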
Embarrassingly Parallel: Graph Representation
[Figure: each input $x_i$ feeds its own function $f_i$, which produces output $y_i$, for $i = 0, \dots, 8$. There are no edges between different $i$.]
Trivial? Often: no.
Embarrassingly Parallel: Examples
Surprisingly useful:
• Element-wise linear algebra: addition, scalar multiplication (not inner product)
• Image processing: shift, rotate, clip, scale, ...
• Monte Carlo simulation
• (Brute-force) optimization
• Random number generation
• Encryption, compression (after blocking)
• Software compilation
  • make -j8
But: still needs a minimum of coordination. How can that be achieved?
Mother-Child Parallelism (formerly called "Master-Slave")
[Figure: a mother process sends initial data to children 0 through 4 and collects their results.]
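One way to picture the mother-child structure is the sketch below. It uses a thread pool from Python's standard library as a stand-in for separate processes; the names mother and child, the chunking scheme, and the sum-of-squares workload are assumptions made for illustration, not part of the lecture.

```python
from concurrent.futures import ThreadPoolExecutor

def child(chunk):
    """Work done by one child: a partial sum of squares (illustrative workload)."""
    return sum(v * v for v in chunk)

def mother(data, n_children=5):
    # Send initial data: split the input into one strided chunk per child.
    chunks = [data[k::n_children] for k in range(n_children)]
    with ThreadPoolExecutor(max_workers=n_children) as pool:
        # Collect results: one value per child, returned in child order.
        return list(pool.map(child, chunks))

partial_results = mother(list(range(100)))
total = sum(partial_results)
print(partial_results, total)
```

The children never talk to each other; all coordination goes through the mother, which is the "minimum of coordination" the pattern needs.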
Embarrassingly Parallel: Issues
• Process Creation: Dynamic/Static?
  • MPI 2 supports dynamic process creation
• Job Assignment ('Scheduling'): Dynamic/Static?
  • Operations/data light- or heavy-weight?
  • Variable-size data?
• Load Balancing:
  • Here: easy
Can you think of a load balancing recipe?
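One possible recipe, sketched under assumptions: hand each job to whichever worker becomes idle first (a work queue), rather than fixing the assignment up front. The small simulation below compares the two for made-up job durations; the function names and the cost list are hypothetical, and real schedulers also have to pay for communication, which is ignored here.

```python
import heapq

def static_makespan(costs, n_workers):
    """Round-robin assignment fixed up front; returns the busiest worker's total time."""
    loads = [0] * n_workers
    for k, c in enumerate(costs):
        loads[k % n_workers] += c
    return max(loads)

def dynamic_makespan(costs, n_workers):
    """Each job goes to the worker that becomes idle first (work-queue behaviour)."""
    free_at = [0] * n_workers            # min-heap of times at which workers become idle
    heapq.heapify(free_at)
    for c in costs:
        t = heapq.heappop(free_at)       # earliest idle worker takes the next job
        heapq.heappush(free_at, t + c)
    return max(free_at)

costs = [8, 1, 1, 1, 8, 1, 1, 1]         # made-up job durations, two heavy jobs
print(static_makespan(costs, 4))          # both heavy jobs land on worker 0
print(dynamic_makespan(costs, 4))         # heavy jobs end up on different workers
```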
Partition
$y_i = f_i(x_{i-1}, x_i, x_{i+1})$ where $i \in \{1, \dots, N\}$.
Includes straightforward generalizations to dependencies on a larger (but not $O(P)$-sized!) set of neighbor inputs.
Partition: Graph
[Figure: inputs $x_0, \dots, x_6$ and outputs $y_1, \dots, y_5$; each $y_i$ has edges from $x_{i-1}$, $x_i$, and $x_{i+1}$.]
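The neighbor dependency can be sketched as follows. Each partition owns a contiguous range of outputs and needs one extra input on each side (often called a halo or ghost cell). The three-point average used for f and the helper names are assumptions made for illustration; the partitions are processed in a loop here, where a real code would give each one to its own processor.

```python
def stencil(left, mid, right):
    """Illustrative choice of f: a three-point average."""
    return (left + mid + right) / 3.0

def apply_serial(x):
    # Outputs exist only for interior points i = 1 .. len(x) - 2.
    return [stencil(x[i - 1], x[i], x[i + 1]) for i in range(1, len(x) - 1)]

def apply_partitioned(x, n_parts):
    n_out = len(x) - 2
    y = []
    for p in range(n_parts):
        lo = 1 + p * n_out // n_parts          # first output index owned by partition p
        hi = 1 + (p + 1) * n_out // n_parts    # one past the last owned output index
        local = x[lo - 1 : hi + 1]             # owned inputs plus one halo value per side
        y.extend(apply_serial(local))
    return y

x = [0, 1, 4, 9, 16, 25, 36]                   # x_0 .. x_6, as in the graph
print(apply_serial(x))
print(apply_partitioned(x, 2) == apply_serial(x))
```

Only the halo values cross partition boundaries, so communication stays small as long as the computation is mainly local.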
Partition: Examples
• Time-marching (in particular: PDE solvers)
  • (Including finite differences → HW3!)
• Iterative Methods
  • Solve Ax = b (Jacobi, ...)
  • Optimization (all P on single problem)
  • Eigenvalue solvers
• Cellular Automata (Game of Life :-)
Partition: Issues
• Only useful when the computation is mainly local
• Responsibility for updating one datum rests with one processor
• Synchronization, Deadlock, Livelock, ...
• Performance Impact
• Granularity
• Load Balancing: Thorny issue
  • → next lecture
• Regularity of the Partition?
Rendezvous Trick
• Assume an irregular partition.
• Assume problem components $i$, $j$ on unknown partitions $p_i$, $p_j$ need to communicate.
• How can $p_i$ find $p_j$ (and vice versa)?
Communicate via a third party, $p_{f(i,j)}$. For $f$: think 'hash function'.
[Figure: $p_i$ and $p_j$ each send a message ("I'm in $p_i$." and "I'm in $p_j$.") to the rendezvous partition $p_{f(i,j)}$.]
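A small single-process simulation may make the trick concrete. Everything here is a hypothetical sketch: the particular function f, the mailbox dictionary standing in for message passing, and the helpers announce and lookup are illustrative names, not an API from the lecture. The one property that matters is that f is symmetric, so both sides compute the same third party without knowing each other's location.

```python
from collections import defaultdict

N_PARTS = 8

def f(i, j):
    """Hypothetical rendezvous function: symmetric in (i, j), spread over partitions."""
    a, b = sorted((i, j))
    return (a * 1000003 + b) % N_PARTS

# mailboxes[p][key] holds the announcements delivered to partition p for a component pair.
mailboxes = defaultdict(lambda: defaultdict(list))

def announce(component, other, owner):
    """The owner of `component` tells the rendezvous partition where `component` lives."""
    key = tuple(sorted((component, other)))
    mailboxes[f(component, other)][key].append((component, owner))

def lookup(component, other):
    """Ask the rendezvous partition which partition owns `other`."""
    key = tuple(sorted((component, other)))
    for comp, owner in mailboxes[f(component, other)][key]:
        if comp == other:
            return owner
    return None

announce(17, 42, owner=3)     # component 17: "I'm in p_3."
announce(42, 17, owner=5)     # component 42: "I'm in p_5."
print(lookup(17, 42), lookup(42, 17))
```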
Pipelined Computation
$y = f_N(\cdots f_2(f_1(x)) \cdots) = (f_N \circ \cdots \circ f_1)(x)$ where $N$ is fixed.
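The composition can be written down directly. In this minimal sketch the helper compose and the three stage functions are made up for illustration; stages are listed in application order, so $f_1$ comes first.

```python
from functools import reduce

def compose(*stages):
    """Return f_N ∘ ... ∘ f_1 for stages given in application order (f_1 first)."""
    return lambda x: reduce(lambda acc, stage: stage(acc), stages, x)

f1 = lambda x: x + 1
f2 = lambda x: x * 2
f3 = lambda x: x - 3

pipeline = compose(f1, f2, f3)
y = pipeline(5)               # f3(f2(f1(5)))
print(y)
```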
Pipelined Computation: Graph
[Figure: the input $x$ passes through a chain of stages $f_1, f_2, \dots$ in sequence, producing $y$.]
Processor Assignment?
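One possible answer to the processor-assignment question, sketched under assumptions, is to give each stage its own worker and connect neighboring stages with queues, so that different inputs occupy different stages at the same time. Threads stand in for processors here, and the names stage_worker, run_pipeline, and SENTINEL are hypothetical.

```python
import queue
import threading

SENTINEL = object()   # marks the end of the input stream

def stage_worker(func, inbox, outbox):
    """Run one pipeline stage: apply func to each item and pass it downstream."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)      # propagate shutdown to the next stage
            return
        outbox.put(func(item))

def run_pipeline(stages, inputs):
    # One queue in front of each stage, plus one for the final outputs.
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=stage_worker, args=(func, queues[k], queues[k + 1]))
        for k, func in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for x in inputs:
        queues[0].put(x)
    queues[0].put(SENTINEL)
    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results

stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
outputs = run_pipeline(stages, [5, 0, 10])
print(outputs)
```

With one worker per stage and first-in-first-out queues, outputs arrive in input order. The pipeline only pays off when there is a stream of inputs, and its throughput is bounded by the slowest stage.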