Concurrency and Parallelism in ML
John Reppy, University of Chicago
MacQueen Fest — May 12, 2012
History: Personal history
- ML on Unix (Cardelli ML)
- ML + Amber =⇒ Pegasus ML
- Standard ML of New Jersey (Version 0.15 on tape)
- Pegasus ML + SML/NJ =⇒ Concurrent ML
- =⇒ Ph.D.!!!
- =⇒ Department 11261 at Bell Labs
Why ML?: What makes parallelism and concurrency hard?
The sequential core matters!
- The combination of shared mutable state and concurrency leads to data races and non-determinism (see the sketch below).
- Adding synchronization to avoid these problems leads to deadlock.
- Shared memory does not scale well to NUMA and distributed-memory architectures.
- Scaling is hard.
Claim: traditional imperative programming languages are a bad fit for concurrent and parallel programming.
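To make the data-race point concrete, here is a minimal sketch in SML using the CML library; the counter and worker threads are invented for illustration. The read-modify-write on a shared ref cell is not atomic, so increments can be lost depending on how the scheduler interleaves the threads.

    (* A minimal data-race sketch, assuming the CML library that ships
       with SML/NJ; the counter and workers are hypothetical. *)
    val counter = ref 0

    fun bump () =
          let val n = !counter       (* read *)
          in counter := n + 1 end    (* write may clobber a concurrent update *)

    fun worker () = (bump (); bump ())

    val _ = CML.spawn worker
    val _ = CML.spawn worker
    (* The final value of !counter depends on the interleaving:
       it can be anything from 2 to 4. *)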
Why ML?: Alternatives
- Java, C#, etc.
- Haskell
- X10
Why ML?: Standard ML
Claim: what we want is a strict, statically typed, functional language, i.e., Standard ML.
- Strict CBV semantics
- The type system distinguishes between mutable and immutable values.
- Programming style is value-oriented.
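For instance, mutability must be requested explicitly through ref (and array) types, so the type of a value records whether it can be updated. A small illustrative snippet (not from the talk):

    (* Bindings and ordinary data structures are immutable values. *)
    val xs = [1, 2, 3]           (* int list: safe to share freely *)

    (* Mutation is confined to explicit ref cells and arrays, so the
       type system marks exactly where side effects can occur. *)
    val r : int ref = ref 0
    val () = r := !r + 1         (* only ref-typed values can be updated *)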
Why ML?: Challenges
SML does not come without challenges:
- Polymorphism
- Higher-order functions
- Garbage collection
- Exceptions
Parallel ML: The Manticore Project
- The Manticore project is our effort to address the programming needs of commodity applications running on multicore SMP systems.
- No shared memory.
- Preserve determinism where possible.
- Declarative mechanisms for fine-grain parallelism.
Parallel ML: The Manticore Project (continued)
Our initial language is called Parallel ML (PML).
- Sequential core language based on a subset of SML: strict, with no mutable storage.
- A variety of lightweight implicitly-threaded constructs for fine-grain parallelism.
- Explicitly-threaded parallelism based on CML: message passing with first-class synchronization (see the sketch below).
- Prototype implementation with good scaling on 48-way parallel hardware for a range of applications.
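To illustrate what "first-class synchronization" means in CML: synchronous operations are reified as event values that can be composed with choose and wrap before being committed with sync. A minimal sketch using the standard CML combinators; the channels and messages are made up for the example.

    (* A minimal CML sketch: events are first-class values.
       The channel names and messages are hypothetical.
       In SML/NJ this code would run under RunCML.doit. *)
    val reqCh : string CML.chan = CML.channel ()
    val ctlCh : unit CML.chan = CML.channel ()

    (* Build an event that waits for either a request or a shutdown
       signal; no communication happens until we sync on it. *)
    fun serverLoop () =
          CML.sync (CML.choose [
            CML.wrap (CML.recvEvt reqCh,
                      fn msg => (print (msg ^ "\n"); serverLoop ())),
            CML.wrap (CML.recvEvt ctlCh,
                      fn () => print "shutting down\n")
          ])

    val _ = CML.spawn serverLoop
    val () = CML.send (reqCh, "hello")
    val () = CML.send (ctlCh, ())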
Parallel ML: Implicit threading
PML provides several lightweight syntactic forms for introducing parallel computation (sketched after this list):
- Parallel tuples provide basic fork-join parallel computation.
- Nested data-parallel arrays provide fine-grain data-parallel computations over sequences.
- Parallel bindings provide data-flow parallelism with cancellation of unused subcomputations.
- Parallel cases provide non-deterministic speculative parallelism.
These forms are annotations that mark a computation as a good candidate for parallel execution; the details are left to the implementation.
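A sketch of the four forms, written in the notation used in the Manticore papers; the functions f, expensive, quickTest, search1, and search2 are hypothetical.

    datatype tree = LF of int | ND of tree * tree

    (* Parallel tuple: the recursive calls may run in parallel, with
       an implicit join before the addition (fork-join parallelism). *)
    fun treeAdd (LF n) = n
      | treeAdd (ND (l, r)) =
          let val (a, b) = (| treeAdd l, treeAdd r |)
          in a + b end

    (* Parallel array comprehension: fine-grain data parallelism. *)
    fun mapP f xs = [| f x | x in xs |]

    (* Parallel binding: expensive y starts eagerly and is cancelled
       if the branch that demands slow is never taken. *)
    fun tryFast y =
          let pval slow = expensive y
          in if quickTest y then 0 else slow end

    (* Parallel case: run both searches speculatively and return
       whichever answer arrives first (nondeterministic). *)
    fun firstOf q =
          pcase search1 q & search2 q
           of ans & ? => ans
            | ? & ans => ans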
Parallel ML: Challenges revisited
SML does not come without challenges; how we address each:
- Polymorphism: whole-program monomorphisation using MLton's front end
- Higher-order functions: advanced CFA techniques
- Garbage collection: DGL split-heap GC and parallel global GC
- Exceptions: reduced use of arithmetic exceptions
Parallel ML: PML performance
[Figure: speedup over sequential PML versus number of processors (up to 48) for the RayTracer, QuickSort, Black-Scholes, and Barnes-Hut benchmarks, plotted against perfect speedup.]
The future: The need for shared mutable state
- Mutable storage is a very powerful communication mechanism: essentially a broadcast mechanism supported by the memory hardware.
- Sequential algorithms and data structures gain significant (asymptotic) performance benefits from shared memory (e.g., union-find with path compression; see the sketch below).
- Some algorithms seem hard or impossible to parallelize without shared state (e.g., mesh refinement).
- But shared memory makes parallel programming hard, so we want to be cautious in adding it to PML.
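To make the union-find point concrete, here is a standard sequential implementation in SML (not from the talk): path compression mutates parent pointers through ref cells, which is what yields the near-constant amortized bound and is awkward to express without shared mutable state.

    (* Standard union-find with union by rank and path compression,
       using mutable refs. A minimal sequential sketch. *)
    datatype node = Node of {parent : node option ref, rank : int ref}

    fun mkNode () = Node {parent = ref NONE, rank = ref 0}

    (* find follows parent links to the root, then compresses the
       path by re-pointing every visited node directly at the root. *)
    fun find (n as Node {parent, ...}) =
          (case !parent
            of NONE => n
             | SOME p =>
                 let val root = find p
                 in parent := SOME root; root end)

    fun union (a, b) =
          let
            val ra as Node {parent = pa, rank = ka} = find a
            val rb as Node {parent = pb, rank = kb} = find b
          in
            if ra = rb then ()                  (* already in the same set *)
            else if !ka < !kb then pa := SOME rb
            else (pb := SOME ra;
                  if !ka = !kb then ka := !ka + 1 else ())
          end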
The future: The design challenge
- How do we add shared memory while preserving PML's declarative programming model for fine-grain parallelism?
- Some races are okay in an implicitly-threaded setting.
- Deadlock is not okay in an implicitly-threaded setting.
The future: Limits on parallel performance: Amdahl's Law
[Figure: parallel efficiency versus number of processors (1 to 48), with curves for programs that are 80%, 90%, 95%, 99%, and 100% parallel.]
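For reference, the standard statement of Amdahl's Law, where p is the parallel fraction of the work and n is the number of processors (each efficiency curve in the figure fixes a value of p):

    S(n) = 1 / ((1 - p) + p/n)              (speedup)
    E(n) = S(n) / n = 1 / (n(1 - p) + p)    (efficiency)

For any p < 1, efficiency falls toward 0 as n grows, which is why the 80-99% curves decay so quickly while the 100% curve stays flat.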
The future: Speculation
- Amdahl's Law tells us that as the number of cores increases, execution time will be dominated by sequential code.
- Speculation is an important tool for introducing parallelism into otherwise sequential code.
- PML supports both deterministic and nondeterministic speculation.
- For many applications, we can relax determinism and still get a correct answer.
Conclusion: Credits
- Matthew Fluet (RIT)
- Claudio Russo (MSR Cambridge)
- Sven Auhagen, Lars Bergstrom, Mike Rainey, Adam Shaw, and Yingqi Xiao (U. of Chicago graduate students)
- Carsen Berger, Stephen Rosen, and Nora Sandler (U. of Chicago undergraduates)
- Chelsea Bingiel, Nic Ford, Korie Klein, Joshua Knox, Jordan Lewis, and Damon Wang (past U. of Chicago undergraduates)
- National Science Foundation
Conclusion: Questions?
http://manticore.cs.uchicago.edu