debugging highly parallel programs


  1. Debugging Highly-Parallel Programs. João M. Lourenço, José C. Cunha and Vitor Duarte, CITI / Universidade Nova de Lisboa, joao.lourenco@fct.unl.pt

  2. Why do programs have errors? Problem → devise a computational solution → write a computer program → problem solved!

  3. What about parallel programs? Problem → devise a computational solution → write a computer program → problem solved! Added to the diagram: interleaving errors.

  4. All men are equal! What about errors? Error classes, from harder to easier:
     • Non fail-stop Byzantine errors
     • Performance: yields a correct result, although it takes longer than acceptable
     • Interleaving: unwanted side effects caused by non-reentrant code and shared data
     • Synchronization: ordering failures and deadlocks
     • Ordering: violations of precedence or mutual exclusion relations
     • Sequential errors

  5. [Diagram: a parallel program runs on a multicore system. Does the observed behavior match the expected behavior?]

  6. Parallel computations. A parallel program (what?) running on a multicore system (how?) generates parallel computations.

  7. Program histories
     • Local history: $h_i$
       – sequence of events generated by executing the program $p_i$
       – $h_i = e_i^0, e_i^1, \ldots, e_i^f$
       – the $k$-th event in $h_i$ ($e_i^k$) produces the local state $s_i^k$
     • Global history: $H$
       – union of the local histories of the $N$ processes
       – $H = h_1 \cup h_2 \cup \ldots \cup h_N$
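The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` record, its `label` field and the helper names `local_history` and `global_history` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    process: int  # index i of the process p_i that generated the event
    index: int    # position k of the event inside the local history h_i
    label: str    # free-form description (hypothetical, for illustration)


def local_history(process: int, labels: list[str]) -> list[Event]:
    """Build h_i: the ordered sequence of events of process p_i."""
    return [Event(process, k, label) for k, label in enumerate(labels)]


def global_history(histories: list[list[Event]]) -> set[Event]:
    """Build H: the union of all local histories (an unordered set)."""
    return {event for history in histories for event in history}


h1 = local_history(1, ["start", "send m", "stop"])
h2 = local_history(2, ["start", "receive m"])
H = global_history([h1, h2])
print(len(H))  # 5 events in the global history
```

Note that `H` is a plain set: the global history by itself carries no ordering between events of different processes.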

  8. Parallel Computation
     • A parallel computation is a partially ordered set (poset) defined as $C_D = (H, \rightarrow)$
       – $H$ = global history
       – $\rightarrow$ = Lamport's happens-before relation
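One common way to make the happens-before relation testable in code is to stamp events with vector clocks. The slides do not prescribe this technique; the sketch below assumes it, and the function names (`tick`, `merge`, `happens_before`, `concurrent`) are illustrative.

```python
def tick(clock: tuple[int, ...], i: int) -> tuple[int, ...]:
    """Timestamp of a local (or send) event at process i."""
    new = list(clock)
    new[i] += 1
    return tuple(new)


def merge(local: tuple[int, ...], received: tuple[int, ...], i: int) -> tuple[int, ...]:
    """Timestamp of a receive event at process i: componentwise max, then tick."""
    merged = [max(x, y) for x, y in zip(local, received)]
    merged[i] += 1
    return tuple(merged)


def happens_before(a: tuple[int, ...], b: tuple[int, ...]) -> bool:
    """a -> b iff a <= b componentwise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a: tuple[int, ...], b: tuple[int, ...]) -> bool:
    """Two distinct events are concurrent when neither precedes the other."""
    return a != b and not happens_before(a, b) and not happens_before(b, a)


# Two processes: process 0 sends a message, process 1 does local work, then receives it.
send = tick((0, 0), 0)         # (1, 0)
work = tick((0, 0), 1)         # (0, 1)
recv = merge(work, send, 1)    # (1, 2)

print(happens_before(send, recv))  # True: a send precedes its receive
print(concurrent(send, work))      # True: no causal path between them
```

Because the relation is only a partial order, some pairs of events (such as `send` and `work`) are left unordered, which is exactly what makes several runs possible.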

  9. Cut of a parallel computation
     • A cut of a parallel computation is a subset $C$ of its global history $H$ that contains an initial prefix of each of its local histories: $C = \{h_1^x, h_2^y, \ldots, h_n^z\}$
     [Figure: timelines of three processes, with events $e_1^1 \ldots e_1^4$ on $P_1$, $e_2^1 \ldots e_2^5$ on $P_2$ and $e_3^1 \ldots e_3^3$ on $P_3$]

  10. Frontier of a cut & Global state
     • The frontier of a cut is the set of the last states/events in the cut: $F = \{s_1^x, s_2^y, \ldots, s_n^z\}$
     • The frontier of a cut defines a global state
     [Figure: timelines of three processes, with events $e_1^1 \ldots e_1^4$ on $P_1$, $e_2^1 \ldots e_2^5$ on $P_2$ and $e_3^1 \ldots e_3^3$ on $P_3$]

  11. Consistent cut
     • A cut is consistent if, for all events in its frontier, all their past events are also included in the cut
     [Figure: timelines of $P_1$ (events $e_1^1 \ldots e_1^4$), $P_2$ (events $e_2^1 \ldots e_2^5$) and $P_3$ (events $e_3^1 \ldots e_3^3$), showing an inconsistent cut with frontier $F_1$ and a consistent cut with frontier $F_2$]

  12. Consistent cut
     • A global state is consistent if it corresponds to the frontier of a consistent cut
     [Figure: timelines of $P_1$ (events $e_1^1 \ldots e_1^4$), $P_2$ (events $e_2^1 \ldots e_2^5$) and $P_3$ (events $e_3^1 \ldots e_3^3$), showing an inconsistent global state $F_1$ and a consistent global state $F_2$]
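A consistency check for cuts can be sketched as follows, assuming that the only cross-process causality comes from messages. A cut is represented by the number of events it includes from each process (so it is a prefix of every local history by construction), and it is consistent when no included receive event lacks its matching send. The message list, the cuts `f1` and `f2`, and the function names are hypothetical; the process sizes follow the three-process figure.

```python
# A message is ((sender, k), (receiver, k')): event k of the sender is the send,
# event k' of the receiver is the matching receive. Events are numbered from 1.
Message = tuple[tuple[int, int], tuple[int, int]]


def is_consistent(cut: dict[int, int], messages: list[Message]) -> bool:
    """cut[p] = number of events of process p included in the cut."""
    for (sender, send_k), (receiver, recv_k) in messages:
        received = recv_k <= cut[receiver]
        sent = send_k <= cut[sender]
        if received and not sent:
            return False  # effect included without its cause
    return True


def frontier(cut: dict[int, int]) -> list[tuple[int, int]]:
    """Last included event (process, index) of every non-empty prefix."""
    return [(p, k) for p, k in sorted(cut.items()) if k > 0]


# Hypothetical messages: e_1^2 -> e_2^3 and e_2^4 -> e_3^2.
messages = [((1, 2), (2, 3)), ((2, 4), (3, 2))]

f1 = {1: 1, 2: 3, 3: 1}  # includes the receive e_2^3 but not the send e_1^2
f2 = {1: 3, 2: 3, 3: 1}

print(is_consistent(f1, messages))  # False
print(is_consistent(f2, messages))  # True
print(frontier(f2))                 # [(1, 3), (2, 3), (3, 1)]
```

Checking messages alone is enough here because process order is already respected by taking prefixes, and happens-before is the transitive closure of process order and message edges.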

  13. Runs
     [Figure: lattice of global states $\Sigma^{00} \ldots \Sigma^{65}$ for two processes, $P_1$ with events $e_1^1 \ldots e_1^6$ and $P_2$ with events $e_2^1 \ldots e_2^5$; labels: 7 states, 6 states, 30 states]
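The lattice in this figure can be explored programmatically. The sketch below enumerates the consistent global states of a computation and counts its runs, that is, the paths from the initial state to the final state that only cross consistent states. It assumes two processes with 6 and 5 events as in the figure; the single message used in the example is hypothetical (the messages of the original figure are not recoverable), so the counts printed here are not the ones shown on the slide.

```python
from functools import lru_cache
from itertools import product

# A message is ((sender, k), (receiver, k')), processes numbered from 0,
# events numbered from 1.


def consistent(state, messages):
    """state[p] = number of events process p has executed so far."""
    for (sender, send_k), (receiver, recv_k) in messages:
        if recv_k <= state[receiver] and send_k > state[sender]:
            return False  # a receive is included without its send
    return True


def lattice(sizes, messages):
    """All consistent global states of the computation."""
    ranges = (range(n + 1) for n in sizes)
    return [s for s in product(*ranges) if consistent(s, messages)]


def count_runs(sizes, messages):
    """Number of paths from the initial to the final state inside the lattice."""
    final = tuple(sizes)

    @lru_cache(maxsize=None)
    def paths(state):
        if state == final:
            return 1
        total = 0
        for p in range(len(state)):
            if state[p] < sizes[p]:
                nxt = state[:p] + (state[p] + 1,) + state[p + 1:]
                if consistent(nxt, messages):
                    total += paths(nxt)
        return total

    return paths((0,) * len(sizes))


sizes = (6, 5)
print(len(lattice(sizes, [])), count_runs(sizes, []))  # 42 462 (no messages)

# Hypothetical message: first event of process 0 is received as first event of process 1.
msgs = [((0, 1), (1, 1))]
print(len(lattice(sizes, msgs)), count_runs(sizes, msgs))  # 37 252
```

Even one message removes a large share of the runs, yet the number that remains grows quickly with the number of events and processes.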

  14. Observing a parallel program
     • Processes ($P_1$, $P_2$, …, $P_N$) generate internal and interaction events
     • Events / states form local histories; their union is the global history
     • Global history + causal precedence constraints: parallel computation
     • Subset of the global history: cut → consistent cut → frontier of a consistent cut
     • Arbitrary total order: run → consistent run
     • Permutation: observation → consistent observation

  15. Observing a parallel program
     • Processes ($P_1$, $P_2$, …, $P_N$) generate internal and interaction events
     • Events / states form local histories; their union is the global history
     • Global history + causal precedence constraints: parallel computation
     • Subset of the global history: cut → consistent cut → frontier of a consistent cut (program state)
     • Arbitrary total order: run (program execution) → consistent run
     • Permutation: observation (developer perspective) → consistent observation

  16. Observing and debugging
     • Interactive debugging: state-based debugging of remote processes; observation of program states
     • Trace, replay and debugging (to obtain reproducible behavior): deterministic re-execution; repeatable observations
     • Combined testing, steering and debugging (to analyze alternative paths): systematic state exploration; alternative observations
     • Global predicate detection (to evaluate correctness properties): global program properties; observation of consistency

  17. The scaling challenge
     • How to deal with hundreds (or thousands) of threads?
       – Collect, store and gather observations / logs
         • What shall be the detail level?
         • Logs may be huge
       – Combining the logs
       – Reason about global observations
         • Visualize large amounts of information
         • Evaluate global predicates on the program state
         • Evaluate global predicates on the program run
       – Map observation points to the original program
         • Dealing with code-generators
         • Supporting high-level abstractions, DSLs
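To make "evaluate global predicates on the program state" and "on the program run" concrete, here is a brute-force sketch. It assumes the names `possibly` (some consistent global state satisfies the predicate) and `definitely` (every run passes through a satisfying state), which are borrowed from the global predicate detection literature and not from the slides. The local states, messages and predicate are hypothetical. The exhaustive search also illustrates the scaling problem: the state space grows exponentially with the number of processes.

```python
from itertools import product


def consistent(state, messages):
    """No receive ((receiver, k')) is included without its send ((sender, k))."""
    return all(not (rk <= state[rp] and sk > state[sp])
               for (sp, sk), (rp, rk) in messages)


def values_at(local_states, state):
    """Local state values at the frontier described by `state`."""
    return tuple(local_states[p][k] for p, k in enumerate(state))


def possibly(local_states, messages, predicate):
    """True if some consistent global state satisfies the predicate."""
    ranges = (range(len(s)) for s in local_states)
    return any(consistent(st, messages) and predicate(values_at(local_states, st))
               for st in product(*ranges))


def definitely(local_states, messages, predicate):
    """True if every run goes through a state satisfying the predicate."""
    sizes = tuple(len(s) - 1 for s in local_states)
    start = (0,) * len(sizes)
    if predicate(values_at(local_states, start)):
        return True
    # Search for a run that avoids every satisfying state.
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        if state == sizes:
            return False  # reached the final state without the predicate holding
        for p in range(len(state)):
            if state[p] < sizes[p]:
                nxt = state[:p] + (state[p] + 1,) + state[p + 1:]
                if (nxt not in seen and consistent(nxt, messages)
                        and not predicate(values_at(local_states, nxt))):
                    seen.add(nxt)
                    stack.append(nxt)
    return True


# local_states[p][k] = value of a flag on process p after k events (hypothetical).
flags = [[0, 1, 0], [0, 1, 0]]
both_set = lambda v: v[0] == 1 and v[1] == 1

print(possibly(flags, [], both_set))    # True: state (1, 1) is reachable
print(definitely(flags, [], both_set))  # False: some run never sets both flags

# Hypothetical messages forcing each process to wait for the other's first event.
sync = [((1, 1), (0, 2)), ((0, 1), (1, 2))]
print(definitely(flags, sync, both_set))  # True
```

A state predicate only needs one witness in the lattice, whereas a run predicate has to rule out every path that avoids it, which is why the second check is written as a search for a counterexample run.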

  18. The End… Happy debugging!
