
Verifying Distributed Programs via Canonical Sequentialization



  1. Verifying Distributed Programs via Canonical Sequentialization. Klaus von Gleissenthall. Joint work with Alexander Bakst, Ranjit Jhala and Rami Gökhan Kıcı

  2. Writing distributed programs A bug appears… Issue: random hangs / deadlock in mono

  3. Writing distributed programs … haunts you … Issue: random hangs / deadlock in mono; occurs in about 10% of our runs

  4. Writing distributed programs … then you write some more code… moved to version 4.8.0.483.

  5. Writing distributed programs … and the bug disappears… moved to version 4.8.0.483. yet to reproduce the issue in 4.8.0.483

  6. Writing distributed programs … leaving you hoping it stays gone. moved to version 4.8.0.483. yet to reproduce the issue in 4.8.0.483; should be more confident in a few weeks

  7. A better world: Can we catch all deadlocks during the compile-edit cycle?

  8. A better world: let’s fix it
      coord :: Transaction -> Int -> SymSet ProcessId -> Process ()
      coord transaction n nodes = do
        fold query () nodes
        n_ <- fold countVotes 0 nodes
        if n == n_
          then forEach nodes commit ()
          else forEach nodes abort ()
        forEach nodes expect :: Ack
        where
          query () pid = do { me <- myPid; send pid (pid, transaction) }
          countVotes init nodes = do
            msg <- expect :: Vote
            case msg of
              Accept _ -> return (x + 1)
              Reject -> return x

      acceptor :: Process ()
      acceptor = do
        me <- myPid
        (who, transaction) <- expect :: (ProcessId, Transaction)
        vote <- chooseVote transaction
        send who vote
     Slide annotations: "sent wrong response address", "unmatched receive", "unmatched send"; "check"

  9. A better world: proof. No deadlocks can occur!
      coord :: Transaction -> Int -> SymSet ProcessId -> Process ()
      coord transaction n nodes = do
        fold query () nodes
        n_ <- fold countVotes 0 nodes
        if n == n_
          then forEach nodes commit ()
          else forEach nodes abort ()
        forEach nodes expect :: Ack
        where
          query () pid = do { me <- myPid; send pid (me, transaction) }
          countVotes init nodes = do
            msg <- expect :: Vote
            case msg of
              Accept _ -> return (x + 1)
              Reject -> return x

      acceptor :: Process ()
      acceptor = do
        me <- myPid
        (who, transaction) <- expect :: (ProcessId, Transaction)
        vote <- chooseVote transaction
        send who vote
     Slide annotation: "check"

  10. This talk: Brisk. Proves absence of deadlocks; provides counterexamples; fast enough for interactive use; restricted computation model

  11. Restricted computation model, but expressive enough to implement: - Work Stealing - Map Reduce - Distributed File System

  12. Outline: The Problems; The Key Idea; The Implementation; The Evaluation

  13. The Problems

  14. Example: Two phase commit (2PC). Goal: commit a transaction to all nodes (diagram labels: coordinator, nodes)

  15. Example: Two phase commit (2PC) Phase 1: the coordinator sends data; each node, depending on the value, votes to commit or abort

  16. Example: Two phase commit (2PC) Phase 1: each node, depending on the value, votes to commit or abort (in the diagram, every node votes commit); the coordinator commits if no one voted to abort, aborts otherwise

  17. Example: Two phase commit (2PC) Phase 2: the coordinator sends its decision to commit (or abort); each node commits the transaction

  18. Example: Two phase commit (2PC) Phase 2: each node sends an acknowledgement (ACK); done

  19. How to verify 2PC? Do sends match receives? Does the implementation deadlock?

  20. How to verify 2PC?

  21. How to verify 2PC? Problem: Asynchrony. Messages may travel at different speeds; processes execute at different speeds; races trigger different behaviors
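To make the asynchrony problem concrete, here is a minimal sketch that is not part of the talk. It enumerates the execution orders of processes that each run a fixed sequence of atomic steps. The names `interleavings`, `splits` and `countInterleavings` are made up for illustration, and the model ignores message dependencies, so it only indicates how quickly the number of orders grows.

```haskell
-- Sketch (assumption: each process is just a list of atomic steps and
-- any step of any process may run next).

-- All interleavings of the given step sequences; the order of steps
-- inside each process is preserved.
interleavings :: [[a]] -> [[a]]
interleavings procs
  | all null procs = [[]]
  | otherwise =
      [ x : rest
      | (before, x : xs, after) <- splits procs  -- skip finished processes
      , rest <- interleavings (before ++ [xs] ++ after)
      ]

-- Every way to single out one element of a list, with its context.
splits :: [a] -> [([a], a, [a])]
splits []       = []
splits (y : ys) = ([], y, ys) : [ (y : b, z, a) | (b, z, a) <- splits ys ]

-- Closed form for the same count: a multinomial coefficient over the
-- number of steps of each process.
countInterleavings :: [Integer] -> Integer
countInterleavings ks = factorial (sum ks) `div` product (map factorial ks)
  where
    factorial n = product [1 .. n]
```

For three processes with two steps each, both `length (interleavings ["ab", "cd", "ef"])` and `countInterleavings [2, 2, 2]` give 90, which is why enumerating orders does not scale to an unknown number of nodes.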

  22. How to verify 2PC? Problem: Unbounded processes. We don’t know how many nodes there will be at runtime

  23. How to verify 2PC? Testing? No guarantees. Proofs? High user burden. Model checking…? Infinite number of states

  24. Outline: The Problems; The Key Idea; The Implementation; The Evaluation

  25. Outline: The Problems; The Key Idea; The Implementation; The Evaluation

  26. The Key Idea: Canonical Sequentialization

  27. Canonical Sequentialization: Don’t enumerate execution orders… Reason about a single representative execution

  28. Canonical Sequentialization Example: 2PC. (1) Sends transaction it wants to commit; (2) Send votes; (3) Relay decision; (4) Send acknowledgments
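The four steps above can be read as an ordinary sequential program. The sketch below is an illustration only, not Brisk output. The types `Vote`, `Decision`, `Node` and the function `twoPhaseCommit` are hypothetical names, and node voting is modelled as a pure function of the node id.

```haskell
-- Sketch (assumptions: nodes are identified by Int, the transaction is
-- a String, and messages are replaced by direct assignments).

data Vote = Accept | Reject deriving (Eq, Show)

data Decision = Commit | Abort deriving (Eq, Show)

data Node = Node
  { nodeId   :: Int
  , received :: Maybe String    -- transaction received in step 1
  , decision :: Maybe Decision  -- decision received in step 3
  } deriving (Eq, Show)

-- One representative execution of 2PC: four loops, one after another.
-- Returns the coordinator's decision, the final node states, and the
-- number of acknowledgements collected.
twoPhaseCommit :: (Int -> Vote) -> String -> [Int] -> (Decision, [Node], Int)
twoPhaseCommit chooseVote txn ids = (outcome, nodes3, acks)
  where
    -- 1. The coordinator sends the transaction to every node.
    nodes1 = [ Node i (Just txn) Nothing | i <- ids ]
    -- 2. Every node sends its vote; commit only if nobody rejected.
    votes   = [ chooseVote (nodeId n) | n <- nodes1 ]
    outcome = if all (== Accept) votes then Commit else Abort
    -- 3. The coordinator relays the decision to every node.
    nodes3 = [ n { decision = Just outcome } | n <- nodes1 ]
    -- 4. Every node acknowledges.
    acks = length nodes3
```

Because the program is sequential, properties such as "every node ends with the coordinator's decision" can be checked on it directly, without considering races.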

  29. Canonical Sequentialization: A Trickier Example: Work stealing queue

  30. Work stealing queue: workers perform tasks; the queue assigns work; the coordinator collects results

  31. Work stealing queue: idle workers ask for work; the queue assigns an item; workers compute results and send the result to the coordinator

  32. Sequentialized queue (diagram labels; their order on the slide is not recoverable): for each item; assigns task to arbitrary worker; who computes; sends it to master; arbitrary worker picks; result from set; writes result to result set

  33. How can sequentialization help verify programs?

  34. How can sequentialization help verify programs? Compute its canonical sequentialization; no sequentialization means likely wrong; a sequentialization implies deadlock freedom and the same halting states; use it to prove additional properties on the simpler, sequential program

  35. Outline: The Problems; The Key Idea; The Implementation; The Evaluation

  36. Outline: The Problems; The Key Idea; The Implementation; The Evaluation

  37. The Implementation

  38. The Implementation 1. Restrict Computation Model 2. Sequentialize by Rewriting

  39. 1. Restrict Computation Model: Symmetric Nondeterminism. Races yield equivalent outcomes

  40. Symmetric Nondeterminism Example: Phase 1 of 2PC. The coordinator sends the transaction (data): no race

  41. Symmetric Nondeterminism Example: Phase 1 of 2PC. Send vote (commit): race. Same outcome? The processes are symmetric

  42. Symmetry: Symmetry means invariance under transformation. Figure labels: invariant under rotation (look at from above); not this one

  43. Symmetry In Distributed Systems [Norris and Dill 1996]: Permuting process identifiers yields equivalent halting states

  44. Symmetry Example: Phase 1 of 2PC. Name the processes n1, n2, n3. Permuting n1 and n2 yields equivalent halting states

  45. Symmetric Nondeterminism Example: Phase 1 of 2PC. n1, n2 and n3 each send commit; the coordinator runs (msg,id) <- recv and must choose between picking n1 and n2. Pick n1: (msg,id) = (commit,n1). Did we lose any states?

  46. Symmetric Nondeterminism Example: Phase 1 of 2PC. No! If we pick n2, so that (msg,id) <- recv yields (commit,n2) instead of (commit,n1), we can permute ids to end up in the same state, so the states have the same behavior
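The permutation argument can be checked on a toy model. The sketch below is an assumption-laden illustration, not Brisk's symmetry check. The state type `State` and the helpers `pick`, `swapIds` and `permute` are hypothetical names, and a state records only which node the coordinator heard from and which nodes are still pending.

```haskell
import Data.List (sort)

type Pid = String

-- Global state after the coordinator's first receive.
data State = State
  { heardFrom :: Pid
  , pending   :: [Pid]  -- kept sorted so that equality ignores order
  } deriving (Eq, Show)

-- The coordinator receives the vote of process p first.
pick :: Pid -> [Pid] -> State
pick p senders = State p (sort (filter (/= p) senders))

-- The permutation that exchanges two process identifiers.
swapIds :: Pid -> Pid -> Pid -> Pid
swapIds a b x
  | x == a    = b
  | x == b    = a
  | otherwise = x

-- Apply a renaming of process identifiers to a whole state.
permute :: (Pid -> Pid) -> State -> State
permute f (State h ps) = State (f h) (sort (map f ps))
```

With senders `["n1", "n2", "n3"]`, the expression `permute (swapIds "n1" "n2") (pick "n2" senders) == pick "n1" senders` evaluates to `True`: the state reached by picking n2 is the state reached by picking n1 up to renaming, so exploring only one of them loses nothing.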

  47. How can we use symmetry to sequentialize?

  48. Symmetric Nondeterminism Example: Phase 1 of 2PC. The coordinator sends the transaction (data): no race, so receive directly after sending [Lipton75]

  49. Symmetric Nondeterminism Example: Phase 1 of 2PC

  50. Symmetric Nondeterminism Example: Phase 1 of 2PC. Send vote (commit): race. What now? The processes are symmetric, so the outcomes are equivalent: pick any!

  51. Symmetric Nondeterminism Example: Phase 1 of 2PC

  52. The Implementation 1. Restrict Computation Model 2. Sequentialize by Rewriting

  53. 2. Sequentialize by Rewriting (by example)

  54. 2. Sequentialize by Rewriting Example 1: p, q are in parallel
      p: send q ping; w <- recv q  ||  q: v <- recv p; send p pong
      first rewrite: q: v <- ping ; ( p: w <- recv q  ||  q: send p pong )

  55. 2. Sequentialize by Rewriting Example 1: p, q are in parallel
      q: v <- ping ; ( p: w <- recv q  ||  q: send p pong )
      Sequentialization: q: v <- ping ; p: w <- pong
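The rewriting in Example 1 can be sketched as a small function. This is a deliberately simplified illustration and not Brisk's rewriting system. It handles only loop-free processes, it rewrites only when a send sits directly opposite its matching receive, and all names (`Action`, `Assign`, `step`, `sequentialize`) are hypothetical.

```haskell
import Data.Maybe (listToMaybe)

type Pid = String
type Var = String
type Val = String

data Action
  = Send Pid Val  -- send a value to the named process
  | Recv Pid Var  -- receive from the named process into a variable
  deriving (Eq, Show)

-- A sequential statement: the named process assigns a value to a variable.
data Assign = Assign Pid Var Val deriving (Eq, Show)

-- A parallel composition: each process with its remaining actions.
type Program = [(Pid, [Action])]

-- Replace the remaining actions of one process.
update :: Pid -> [Action] -> Program -> Program
update pid acts = map (\(k, as) -> if k == pid then (pid, acts) else (k, as))

-- One rewrite: a send at the head of p that meets the matching receive
-- at the head of q becomes an assignment in q.
step :: Program -> Maybe (Assign, Program)
step prog = listToMaybe
  [ (Assign q x v, update q restQ (update p restP prog))
  | (p, Send q v : restP) <- prog
  , Just (Recv p' x : restQ) <- [lookup q prog]
  , p' == p
  ]

-- Rewrite until every process is finished; Nothing means the rewriting
-- got stuck, i.e. this sketch found no sequentialization.
sequentialize :: Program -> Maybe [Assign]
sequentialize prog
  | all (null . snd) prog = Just []
  | otherwise = do
      (a, prog') <- step prog
      rest <- sequentialize prog'
      return (a : rest)
```

Applied to the ping-pong program `[("p", [Send "q" "ping", Recv "q" "w"]), ("q", [Recv "p" "v", Send "p" "pong"])]`, the result is `Just [Assign "q" "v" "ping", Assign "p" "w" "pong"]`, matching the slide. Two processes that both start with a receive yield `Nothing`.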

  56. 2. Sequentialize by Rewriting Example 2: p, qs={q1…qn} are in parallel
      p: for q in qs do send q ping; w <- recv q end  ||  ∏ q ∈ qs. q: v <- recv p; send p pong
      labels: loop over processes (the for loop in p); set of symmetric processes (the ∏ over qs)

  57. 2. Sequentialize by Rewriting Example 2: p, qs={q1…qn} are in parallel
      p: for q in qs do send q ping; w <- recv q end  ||  ∏ q ∈ qs. q: v <- recv p; send p pong
      Arbitrary iteration: q: v <- ping ; p: w <- pong
      Generalize: for q in qs do q: v <- ping ; p: w <- pong end

  58. 2. Sequentialize by Rewriting Example 3: two loops
      p: for q in qs do send q ping end; for q in qs do w <- recv qs end  ||  ∏ q ∈ qs. q: v <- recv p; send p pong

  59. 2. Sequentialize by Rewriting Example 3
      p: for q in qs do send q ping end; for q in qs do w <- recv qs end  ||  ∏ q ∈ qs. q: v <- recv p; send p pong
      first loop rewritten: for q in qs do q: v <- ping end

  60. 2. Sequentialize by Rewriting Example 3
      partially sequentialized: for q in qs do q: v <- ping end ; ( p: for q in qs do w <- recv qs end  ||  ∏ q ∈ qs. q: send p pong )
      symmetric (checked); result: for q in qs do q: v <- ping end ; for q in qs do p: w <- pong end

  61. The Implementation 1. Restrict Computation Model 2. Sequentialize by Rewriting

  62. Outline: The Problems; The Key Idea; The Implementation; The Evaluation

  63. Outline: The Problems; The Key Idea; The Implementation; The Evaluation

  64. The Evaluation

  65. The Evaluation: Brisk. Implemented in a Haskell library with communication primitives like send / receive / foreach; computes the canonical sequentialization; provides a counterexample to sequentialization
