Algebraic Multiparty Protocol Programming David Castro-Perez Nobuko Yoshida Imperial College London December 17, 2018
Parallel Programming
◮ Parallel programming is increasingly important: many-core architectures, GPUs, FPGAs, ...
◮ Low-level techniques are error-prone: deadlocks, data races, etc.
◮ High-level techniques constrain programmers to a particular model, or to a fixed set of parallel constructs.
◮ Achieving (predictable) speedups is hard!
◮ Our goal: generate message-passing parallel code from sequential implementations.
  ◮ Not constrained by a fixed set of high-level parallel constructs.
  ◮ Guarantee correctness.
  ◮ Predictability.
Proposal: Algebraic Multiparty Protocol Programming
◮ Algebra of programming for specifying sequential algorithms.
  ◮ Use higher-order combinators.
  ◮ Use their equational theory for program optimisation and parallelisation.
◮ Multiparty session types for message-passing concurrency.
  ◮ We provide an abstraction of the communication protocol of the generated parallel code as a global type.
  ◮ We prove that we do not introduce concurrency errors, using the theory of Multiparty Session Types (MPST).
◮ Key idea: convert the implicit data-flow of the higher-order combinators to explicit communication.
Overview
◮ Algebraic Functional Language (Alg) → Parallel Algebraic Language (ParAlg): role annotations.
◮ ParAlg → Parallel Code: code generation.
◮ ParAlg → MPST: protocol inference.
◮ Parallel Code and MPST: typability.
Algebra of Programming
◮ Mathematical framework that codifies the basic laws of algorithmics [Backus 78, Meertens 86, Bird 89], “Squiggol”.
◮ We define the Algebraic Functional Language (Alg), a point-free functional programming language with a number of categorically-inspired combinators as syntactic constructs: composition, polynomial functors, recursion.
◮ Examples:
  ◮ Function composition and identity:
      e_1 ◦ e_2 = λx. e_1 (e_2 x)
      id = λx. x
      e_1 ◦ (e_2 ◦ e_3) ≡ (e_1 ◦ e_2) ◦ e_3
      id ◦ e ≡ e ◦ id ≡ e
  ◮ Split and projections:
      e_1 △ e_2 = λx. (e_1 x, e_2 x)
      π_i = λ(x_1, x_2). x_i
      π_i ◦ (e_1 △ e_2) ≡ e_i
      (e_1 △ e_2) ◦ e ≡ (e_1 ◦ e) △ (e_2 ◦ e)
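The combinators and laws on this slide can be tried out directly. The following is a minimal Python sketch, not the Alg implementation: the names `compose`, `identity`, `split`, `pi1`, `pi2` and `agree` are chosen here for illustration, and the laws are only checked on a handful of sample inputs rather than proved.

```python
def compose(f, g):
    """e1 ◦ e2 = λx. e1 (e2 x)"""
    return lambda x: f(g(x))


def identity(x):
    """id = λx. x"""
    return x


def split(f, g):
    """e1 △ e2 = λx. (e1 x, e2 x)"""
    return lambda x: (f(x), g(x))


def pi1(pair):
    """π1 = λ(x1, x2). x1"""
    return pair[0]


def pi2(pair):
    """π2 = λ(x1, x2). x2"""
    return pair[1]


def agree(f, g, samples):
    """Check that two functions give equal results on every sample input."""
    return all(f(x) == g(x) for x in samples)


# Three arbitrary example functions (hypothetical, for illustration only).
inc = lambda x: x + 1
dbl = lambda x: x * 2
sqr = lambda x: x * x

SAMPLES = range(-3, 4)

# π_i ◦ (e1 △ e2) ≡ e_i
cancel_1 = agree(compose(pi1, split(inc, dbl)), inc, SAMPLES)
cancel_2 = agree(compose(pi2, split(inc, dbl)), dbl, SAMPLES)

# (e1 △ e2) ◦ e ≡ (e1 ◦ e) △ (e2 ◦ e)
fusion = agree(
    compose(split(inc, dbl), sqr),
    split(compose(inc, sqr), compose(dbl, sqr)),
    SAMPLES,
)
```

Each law is an equation between functions, so a rewrite from left to right never changes the result, only the shape of the program. This is what makes the equational theory usable for optimisation and parallelisation.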
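Example: Cooley-Tukey FFT
◮ Discrete Fourier Transform:

$$X_k = \sum_{n=0}^{N-1} x_n e^{-\frac{2\pi i}{N} nk} = E_k + e^{-\frac{2\pi i}{N} k} O_k$$

$$X_{k+\frac{N}{2}} = E_k - e^{-\frac{2\pi i}{N} k} O_k$$

  where $E_k$ is the dft of the even-indexed part of $x_n$, and $O_k$ is the dft of the odd-indexed part of $x_n$.
◮ Alg expression:

$$\mathit{dft}_n = (\underbrace{\mathit{add}}_{+} \mathbin{\triangle} \underbrace{\mathit{sub}}_{-}) \circ ((\underbrace{\mathit{dft}_{n/2} \circ \pi_1}_{E_k}) \mathbin{\triangle} (\underbrace{\mathit{exp}}_{e^{-\frac{2\pi i}{N} k}} \circ \underbrace{\mathit{dft}_{n/2} \circ \pi_2}_{O_k}))$$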
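Evaluating dft_n

$$\begin{aligned}
X_k &= E_k + e^{-\frac{2\pi i}{N} k} O_k \\
X_{k+\frac{N}{2}} &= E_k - e^{-\frac{2\pi i}{N} k} O_k
\end{aligned}$$

$$\begin{aligned}
& ((\mathit{add} \mathbin{\triangle} \mathit{sub}) \circ ((\mathit{dft}_{n/2} \circ \pi_1) \mathbin{\triangle} (\mathit{exp} \circ \mathit{dft}_{n/2} \circ \pi_2)))(x, y) \\
&= (\mathit{add} \mathbin{\triangle} \mathit{sub})\,(((\mathit{dft}_{n/2} \circ \pi_1) \mathbin{\triangle} (\mathit{exp} \circ \mathit{dft}_{n/2} \circ \pi_2))(x, y)) \\
&= (\mathit{add} \mathbin{\triangle} \mathit{sub})\,((\mathit{dft}_{n/2} \circ \pi_1)(x, y),\ (\mathit{exp} \circ \mathit{dft}_{n/2} \circ \pi_2)(x, y)) \\
&= (\mathit{add} \mathbin{\triangle} \mathit{sub})\,(\mathit{dft}_{n/2}\, x,\ \mathit{exp}\,(\mathit{dft}_{n/2}\, y)) \\
&= (\underbrace{\mathit{add}\,(\mathit{dft}_{n/2}\, x,\ \mathit{exp}\,(\mathit{dft}_{n/2}\, y))}_{X_k},\ \underbrace{\mathit{sub}\,(\mathit{dft}_{n/2}\, x,\ \mathit{exp}\,(\mathit{dft}_{n/2}\, y))}_{X_{k+\frac{N}{2}}})
\end{aligned}$$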
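The evaluation above can be checked numerically. The following is a minimal Python sketch under several assumptions that are not on the slides: vectors are Python lists, `add` and `sub` act pointwise on a pair of vectors, `exp` multiplies element k by the twiddle factor for a size-n transform, the input is deinterleaved into (even-indexed, odd-indexed) parts before each recursive step, and n is a power of two. The helper names (`dft_pair`, `naive_dft`) are hypothetical.

```python
import cmath


def compose(*fs):
    """Right-to-left composition: compose(f, g, h)(x) == f(g(h(x)))."""
    def composed(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return composed


def split(f, g):
    """e1 △ e2: apply both functions to the same input."""
    return lambda x: (f(x), g(x))


def pi1(pair):
    return pair[0]


def pi2(pair):
    return pair[1]


def add(pair):
    """Pointwise sum of a pair of equal-length vectors."""
    return [a + b for a, b in zip(pair[0], pair[1])]


def sub(pair):
    """Pointwise difference of a pair of equal-length vectors."""
    return [a - b for a, b in zip(pair[0], pair[1])]


def exp(n):
    """Multiply element k by e^(-2*pi*i*k/n), the twiddle factor for size n."""
    return lambda v: [cmath.exp(-2j * cmath.pi * k / n) * o for k, o in enumerate(v)]


def dft_pair(n):
    """The slide's expression, applied to a pair (evens, odds) of length n/2 each."""
    half = dft(n // 2)
    return compose(
        split(add, sub),
        split(compose(half, pi1), compose(exp(n), half, pi2)),
    )


def dft(n):
    """Size-n transform on a list; n is assumed to be a power of two."""
    if n == 1:
        return lambda v: list(v)

    def run(v):
        # Deinterleaving is an assumption of this sketch: x is the
        # even-indexed part and y the odd-indexed part of the input.
        low, high = dft_pair(n)((v[0::2], v[1::2]))
        return low + high  # X_0..X_{n/2-1} followed by X_{n/2}..X_{n-1}

    return run


def naive_dft(x):
    """Direct evaluation of the defining sum, used only as a reference."""
    size = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / size) for m in range(size))
        for k in range(size)
    ]
```

Comparing `dft(len(xs))(xs)` against `naive_dft(xs)` on a small input is a quick way to confirm that the first component of the final pair holds X_k and the second holds X_{k+N/2}.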
ParAlg: Alg + role annotations
◮ We call Alg extended with role annotations the Parallel Algebraic Language (ParAlg).
◮ ⊢ e ⇒ p : A → B | C: “Alg expression e synthesises ParAlg expression p, with type A → B and choices C”.
◮ E.g.
  ◮ A = a@r_0 × b@r_1 is the product a × b, where a is at r_0 and b at r_1.
  ◮ p = e_2@r_2 ◦ e_1@r_1 is the composition e_2 ◦ e_1, where e_2 is applied at r_2, and e_1 at r_1.
ParAlg: Inferring Global Types
◮ A global type, in Multiparty Session Types, is a global description of a communication protocol between multiple participants.
◮ Inferring a global type from ParAlg implies representing the implicit dataflow with explicit communication.
◮ C ⊩ p ⇐ A ∼ G: “Expression p with domain A, in a choice context C, behaves as global type G.”

  ParAlg                                            | Global type
  --------------------------------------------------+----------------------------------------------------------------------
  e_0@r_0 ◦ e_1@r_1 : a@r → c@r_0                   | r → r_1 : a. r_1 → r_0 : b. end
  e_0@r_0 △ e_1@r_1 : a@r → b@r_0 × c@r_1           | r → r_0 : a. r → r_1 : a. end
  e_0@r_0 ▽ e_1@r_1 : (a + b)@r → c@r_0 ∪ c@r_1     | r → {r_0, r_1} {inj_1. r → r_0 : a. end, inj_2. r → r_1 : b. end}
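The first two rows of the table can be reproduced with a few lines of code. This is a rough Python sketch of the idea only, not the inference algorithm behind the slides: it covers annotated functions, composition and split, ignores choices (▽ and the context C), and does not handle a composition whose first stage produces a distributed product. The class and function names (`At`, `Comp`, `Split`, `infer`, `show`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class At:
    """e@role, where e : dom -> cod."""
    name: str
    role: str
    dom: str
    cod: str


@dataclass(frozen=True)
class Comp:
    """after ◦ before."""
    after: object
    before: object


@dataclass(frozen=True)
class Split:
    """left △ right."""
    left: object
    right: object


def infer(p, role, ty):
    """Return (messages, outputs) for p applied to a value of type ty at role.

    messages is a list of (sender, receiver, payload type);
    outputs is a list of (role, type) locations of the result components.
    """
    if isinstance(p, At):
        if ty != p.dom:
            raise ValueError(f"{p.name} expects {p.dom}, got {ty}")
        # Data must travel to the role that applies the function.
        msgs = [(role, p.role, ty)] if role != p.role else []
        return msgs, [(p.role, p.cod)]
    if isinstance(p, Comp):
        first, outs = infer(p.before, role, ty)
        if len(outs) != 1:
            raise NotImplementedError("sketch: distributed intermediate values")
        mid_role, mid_ty = outs[0]
        second, final = infer(p.after, mid_role, mid_ty)
        return first + second, final
    if isinstance(p, Split):
        # Both branches consume the same input, so it is sent to each of them.
        left_msgs, left_out = infer(p.left, role, ty)
        right_msgs, right_out = infer(p.right, role, ty)
        return left_msgs + right_msgs, left_out + right_out
    raise TypeError(f"unsupported expression: {p!r}")


def show(messages):
    """Render a message sequence in the notation of the table (ASCII arrows)."""
    steps = [f"{s} -> {d} : {t}" for s, d, t in messages]
    return " . ".join(steps + ["end"])


composition = Comp(At("e0", "r0", "b", "c"), At("e1", "r1", "a", "b"))
splitting = Split(At("e0", "r0", "a", "b"), At("e1", "r1", "a", "c"))

composition_type = show(infer(composition, "r", "a")[0])
splitting_type = show(infer(splitting, "r", "a")[0])
```

The point of the sketch is that every place where data flows implicitly between two differently located functions becomes one explicit message in the inferred protocol.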
ParAlg: Size-2 FFT protocol
(add △ sub) ◦ ((dft_{n/2} ◦ π_1) △ (exp ◦ dft_{n/2} ◦ π_2))
ParAlg: Size-2 FFT protocol
(add@r_0 △ sub@r_1) ◦ ((dft_{n/2}@r_2 ◦ π_1) △ ({exp ◦ dft_{n/2}}@r_3 ◦ π_2))
Global type, assuming that the domain is V@r_4 × V@r_5:
  r_4 → r_2 : V.
  r_5 → r_3 : V.
  r_2 → r_0 : V.
  r_2 → r_1 : V.
  r_3 → r_0 : V.
  r_3 → r_1 : V.
  end
ParAlg: Size-2 FFT protocol
(add@r_0 △ sub@r_1) ◦ ((dft_{n/2}@r_2 ◦ π_1) △ ({exp ◦ dft_{n/2}}@r_3 ◦ π_2))
Global type, assuming that the domain is (V × V)@r_4:
  r_4 → r_2 : V.
  r_4 → r_3 : V.
  r_2 → r_0 : V.
  r_2 → r_1 : V.
  r_3 → r_0 : V.
  r_3 → r_1 : V.
  end
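To see what each participant does in this protocol, the global type can be projected onto a single role. The following is a small Python sketch of projection for choice-free global types only. The list encoding, the name `project`, and the local-type notation (`q!T` for "send T to q", `p?T` for "receive T from p") are assumptions of this example; the slides do not show local types.

```python
# Global type for the domain (V × V)@r4, as a list of (sender, receiver, payload).
FFT_SHARED_INPUT = [
    ("r4", "r2", "V"),
    ("r4", "r3", "V"),
    ("r2", "r0", "V"),
    ("r2", "r1", "V"),
    ("r3", "r0", "V"),
    ("r3", "r1", "V"),
]


def project(global_type, role):
    """Keep only the interactions that involve role, seen from its side."""
    local = []
    for sender, receiver, payload in global_type:
        if role == sender:
            local.append(f"{receiver}!{payload}")
        elif role == receiver:
            local.append(f"{sender}?{payload}")
        # Interactions between other roles are invisible to this role.
    return " . ".join(local + ["end"])


local_types = {
    role: project(FFT_SHARED_INPUT, role)
    for role in ("r0", "r1", "r2", "r3", "r4")
}
```

In this encoding, r2 receives one V from r4 and then sends a V to each of r0 and r1, while r0 only receives: one V from r2 and one from r3.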
Message Passing Monad (I)
◮ We translate ParAlg to the Message Passing Monad (Mp): send r x, recv r a, branch r m_1 m_2, choice x r f_1 f_2.
◮ The translation keeps track of:
  ◮ Location of the data.
  ◮ Branches in the control flow: which roles perform choices, and which roles are affected by which choice.
◮ For each role r in p : A → B, we “project” its behaviour as a monadic action. E.g. for e_0@r_0 ◦ e_1@r_1 : a@r → c@r_0:
    r   ↦ λx. send r_1 x
    r_0 ↦ λ_. recv r_1 b >>= λx. return (e_0 x)
    r_1 ↦ λ_. recv r a >>= λx. send r_0 (e_1 x)
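The three projected behaviours above can be simulated with threads and queues. This is a hedged Python sketch and not the Mp monad itself: the `Network` class, its `send`/`recv` methods, and `run_composition` are hypothetical names, each role runs as a thread, and each ordered pair of roles gets its own FIFO channel.

```python
import queue
import threading
from collections import defaultdict


class Network:
    """One FIFO channel per ordered (sender, receiver) pair of roles."""

    def __init__(self):
        self._channels = defaultdict(queue.Queue)
        self._lock = threading.Lock()

    def _channel(self, src, dst):
        with self._lock:  # guard creation of channels across threads
            return self._channels[(src, dst)]

    def send(self, src, dst, value):
        self._channel(src, dst).put(value)

    def recv(self, src, dst):
        # The timeout only keeps a buggy protocol from hanging the sketch.
        return self._channel(src, dst).get(timeout=5)


def run_composition(e0, e1, x):
    """Run e0@r0 ◦ e1@r1 on an input x located at role r."""
    net = Network()
    result = {}

    def role_r():
        # r ↦ send the input to r1
        net.send("r", "r1", x)

    def role_r1():
        # r1 ↦ receive from r, apply e1, send the result to r0
        value = net.recv("r", "r1")
        net.send("r1", "r0", e1(value))

    def role_r0():
        # r0 ↦ receive from r1, apply e0, keep the final result
        value = net.recv("r1", "r0")
        result["r0"] = e0(value)

    threads = [threading.Thread(target=t) for t in (role_r, role_r1, role_r0)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["r0"]
```

Whatever the scheduling of the three threads, the value left at r0 should equal `e0(e1(x))`, since every receive in one role is matched by exactly one send in another.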
Correctness
Theorem (Protocol Deadlock Freedom). For all e, p, A, B, C, if ⊢ e ⇒ p : A → B | C, then there exists a global type G s.t. C ⊩ p ⇐ A ∼ G, and G is well-formed.
Theorem (Deadlock Freedom of the Generated Code). For all e, p, A, B, C, G, r, if ⊢ e ⇒ p : A → B | C and C ⊩ p ⇐ A ∼ G, then ⟦p⟧ r A : A↾r → Mp (G↾r) (B↾r).
Speedups on a 4-Core Machine: FFT, size = 1048576
[Plot: speedup (0.0 to 3.5) against +RTS -N (1 to 8); series K2, K4, K6, K8.]
Speedups on a 4-Core Machine: FFT, +RTS -N8
[Plot: speedup (0.0 to 3.5) against K (1 to 8); series for sizes 1024, 8192, 32768, 1048576.]
Speedups on a 4-Core Machine: FFT, +RTS -N8
[Plot: speedup (0.0 to 3.5) against size in number of elements (10^3 to 10^6); series K1, K2, K6, K8.]
Speedups on a 4-Core Machine: Mergesort, size = 640000
[Plot: speedup (0.0 to 3.5) against +RTS -N (1 to 8); series K2, K4, K6, K8.]
Speedups on a 4-Core Machine: Mergesort, +RTS -N8
[Plot: speedup (0.0 to 4.0) against K (1 to 8); series for sizes 10000, 40000, 80000, 640000.]
Speedups on a 4-Core Machine: Mergesort, +RTS -N8
[Plot: speedup (0.0 to 3.5) against size in number of elements (10^4 to 10^5); series K1, K2, K6, K8.]