Dynamic Purity Analysis for Java Programs


Dynamic Purity Analysis for Java Programs. Haiying Xu, Chris Pickett, Clark Verbrugge ({hxu31,cpicke,clump}@sable.mcgill.ca), Sable Research Group, McGill University, Montréal, Québec, Canada H3A 2A7. PASTE 2007, June 14, 2007.


  1. Dynamic Purity Analysis for Java Programs
     Haiying Xu, Chris Pickett, Clark Verbrugge
     {hxu31,cpicke,clump}@sable.mcgill.ca
     Sable Research Group, McGill University
     Montréal, Québec, Canada H3A 2A7
     PASTE 2007, June 14, 2007

  2. Outline
     1. Introduction and Motivation
     2. Static Purity Analysis
     3. Dynamic Purity Analysis
     4. Experimental Results
     5. Memoization
     6. Conclusion and Future Work

  3. What is Method Purity?
     Roughly, a pure method has no externally visible side effects.
     Different variations on purity are possible:
     - Sălcianu and Rinard: a pure method can create, modify, and return new objects
     - Rountev: similar, but a pure method cannot return a new object

  4. Why is Method Purity Important?
     Artzi, Kiezun, Glasser, Ernst:
     - program comprehension
     - modelling
     - formal verification
     - compiler optimization
     - memoization
     - thread level speculation
     - stack allocation
     - refactoring
     - test input generation
     - regression oracle creation
     - invariant detection
     - specification mining
     - program slicing


  6. Contributions
     In this work, we:
     - Design and implement dynamic purity analysis.
     - Investigate several different purity definitions.
     - Introduce three different dynamic purity metrics.
     - Implement memoization as a purity consumer.


  8. Static Purity Analysis
     Consider the classic functional form of purity.
     A method is strongly pure iff it:
     - Does not read or write the heap or static data
     - Does not perform any synchronization
     - Does not invoke any native method
     - Does not invoke any impure method
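As a rough illustration of the four conditions on this slide, the sketch below contrasts one method that satisfies all of them with methods that each violate one. The class and method names are hypothetical and do not come from the talk; the native-call example assumes that `System.nanoTime` is declared `native`, as it is in OpenJDK's `java.lang.System`.

```java
// Hypothetical examples; none of these names appear in the talk.
class StrongPurityExamples {
    static int counter; // static data shared by all callers

    // Strongly pure: reads and writes only its parameter and locals.
    static int square(int n) {
        return n * n;
    }

    // Not strongly pure: reads and writes static data.
    static int next() {
        counter = counter + 1;
        return counter;
    }

    // Not strongly pure: a synchronized method acquires a monitor.
    static synchronized int lockedSquare(int n) {
        return n * n;
    }

    // Not strongly pure: invokes a native method
    // (assumption: System.nanoTime is declared native in the JDK in use).
    static long now() {
        return System.nanoTime();
    }

    // Not strongly pure: invokes the impure method next().
    static int squareOfNext() {
        return square(next());
    }
}
```

Note that `lockedSquare` computes the same value as `square`; under this definition it is rejected for the synchronization alone.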

  9. Static Purity Analysis Framework

  10. Static Analysis Results

      metric                      comp  db    jack  javac  jess  mpeg  rt
      static method purity        14%   13%   13%   12%    13%   13%   13%
      dynamic method purity       6%    6%    6%    5%     5%    6%    5%
      dynamic invocation purity   ≈0%   2%    10%   10%    6%    16%   3%
      dynamic bytecode purity     ≈0%   2%    1%    ≈0%    ≈0%   2%    ≈0%

      - Static/dynamic method purity: % of reachable/reached methods that are pure
      - Dynamic invocation purity: % of all invocations that are pure
      - Dynamic bytecode purity: % of the bytecode instruction stream contained in a pure method
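The three dynamic metrics defined on this slide can be sketched as simple ratios over a per-method profile. The `MethodProfile` type, its fields, and the idea of a single purity flag per reached method are assumptions made for illustration; the talk does not show how its profiles are stored.

```java
import java.util.List;

// A minimal sketch of the three dynamic metrics, assuming one profile
// record per reached method. All names here are hypothetical.
class PurityMetrics {
    static final class MethodProfile {
        final boolean pure;      // never observed to be impure in this run
        final long invocations;  // number of times the method was invoked
        final long bytecodes;    // bytecode instructions executed inside it

        MethodProfile(boolean pure, long invocations, long bytecodes) {
            this.pure = pure;
            this.invocations = invocations;
            this.bytecodes = bytecodes;
        }
    }

    private static double percent(long part, long whole) {
        return whole == 0 ? 0.0 : 100.0 * part / whole;
    }

    // % of reached methods that are pure
    static double methodPurity(List<MethodProfile> reached) {
        long pure = 0;
        for (MethodProfile m : reached) {
            if (m.pure) pure++;
        }
        return percent(pure, reached.size());
    }

    // % of all invocations that target a pure method
    static double invocationPurity(List<MethodProfile> reached) {
        long pure = 0, all = 0;
        for (MethodProfile m : reached) {
            all += m.invocations;
            if (m.pure) pure += m.invocations;
        }
        return percent(pure, all);
    }

    // % of the executed bytecode stream contained in a pure method
    static double bytecodePurity(List<MethodProfile> reached) {
        long pure = 0, all = 0;
        for (MethodProfile m : reached) {
            all += m.bytecodes;
            if (m.pure) pure += m.bytecodes;
        }
        return percent(pure, all);
    }
}
```

With one pure method invoked 10 times and one impure method invoked 90 times, method purity is 50% but invocation purity is only 10%, which is why the metrics can tell different stories about the same run.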


  12. Motivation for Dynamic Purity Analysis
      Static purity analysis is hard:
      - Implementation is complex
      - Whole-program analysis is expensive
      Dynamic evaluation tells a different story:
      - Static vs. dynamic call graph
      - Choice of metrics
      - Input dependence

  13. Motivation for Dynamic Purity Analysis
      Purity can also depend on method input:

          int x;

          void foo(boolean b) {
              if (b)
                  x = 10;
          }

      If we only ever execute foo(false), foo is pure!

  14. Different Kinds of Dynamic Purity
      Four different kinds of dynamic purity:
      - Strong: the same as strong static purity
        - no heap or static reads or writes
        - no calls to impure methods
      - Moderate:
        - allow object allocation, if the object does not escape
        - allow heap reads and writes to non-escaping objects
        - allow calls to certain impure methods
      - Weak:
        - moderate, but no limitations on heap reads
      - Once-Impure:
        - weak, but no restrictions on the first invocation

  15. Moderate Purity

          class Obj {
              int f;

              public Obj() {
                  f = 10;
              }

              Obj bar() {
                  Obj o = new Obj();
                  return o;
              }

              int foo() { // moderately pure
                  Obj o = bar();
                  return o.f;
              }
              ...

  16. Weak and Once-Impure Purity

              ...
              static int x;

              int baz(Obj o) { // weakly pure
                  return o.f;
              }

              int baf(boolean b) { // once-impure for TF+
                  if (b) {
                      Obj.x = 9 * 6; // write to static field
                  }
                  return 42;
              }
          }
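One way to picture the once-impure category is a per-method tracker that remembers whether an impure action happened on the first invocation or on a later one. The sketch below is a simplification under stated assumptions: the `OnceImpureTracker` class is hypothetical, it collapses every impure action into a single boolean per invocation, and it reports the strictest category that fits rather than the nested definitions from the talk.

```java
// Hypothetical tracker for a single method; not the authors' implementation.
class OnceImpureTracker {
    enum Verdict { UNREACHED, PURE, ONCE_IMPURE, IMPURE }

    private long invocations = 0;
    private boolean firstWasImpure = false;
    private boolean laterWasImpure = false;

    // Call once per completed invocation of the tracked method.
    void recordInvocation(boolean performedImpureAction) {
        if (performedImpureAction) {
            if (invocations == 0) {
                firstWasImpure = true;   // tolerated by once-impure purity
            } else {
                laterWasImpure = true;   // violates once-impure purity
            }
        }
        invocations++;
    }

    Verdict verdict() {
        if (invocations == 0) return Verdict.UNREACHED;
        if (laterWasImpure) return Verdict.IMPURE;
        return firstWasImpure ? Verdict.ONCE_IMPURE : Verdict.PURE;
    }
}
```

For the `baf` example, the call sequence `baf(true), baf(false), baf(false)` matches the `TF+` pattern in the slide comment: only the first invocation writes the static field, so the tracker reports `ONCE_IMPURE`.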


  18. Method Purity
      [Bar chart: method purity (%) per benchmark (comp, db, jack, javac, jess, mpeg, rt); series: strong_s, strong_d, moderate, weak, once_impure]
      - Fairly uniform across benchmarks.
      - Moderate purity does not improve much: it cannot dereference input.

  19. Invocation Purity
      [Bar chart: invocation purity (%) per benchmark (comp, db, jack, javac, jess, mpeg, rt); series: strong_s, strong_d, moderate, weak, once_impure]
      - Unpredictable from method purity.
      - Two different groups appear.

  20. Bytecode Purity
      [Bar chart: bytecode purity (%) per benchmark (comp, db, jack, javac, jess, mpeg, rt); series: strong_s, strong_d, moderate, weak, once_impure]
      - Somewhat predictable from invocation purity.
      - Three different groups appear.

  21. Sources of Impurity

      Method impurity:
      source      comp  db    jack  javac  jess  mpeg  rt
      PUTFIELD    27%   29%   21%   21%    23%   24%   28%
      PUTFIELD+   52%   52%   58%   66%    61%   60%   53%

      Invocation impurity:
      source      comp  db    jack  javac  jess  mpeg  rt
      PUTFIELD    81%   82%   45%   25%    24%   40%   71%
      PUTFIELD+   19%   17%   37%   58%    19%   60%   28%

      Bytecode impurity:
      source      comp  db    jack  javac  jess  mpeg  rt
      PUTFIELD    21%   85%   38%   25%    8%    11%   33%
      PUTFIELD+   79%   13%   48%   66%    45%   89%   66%

      PUTFIELD is the main reason for impurity.


  23. Using Purity for Memoization
      Overview of memoization:
      - Maps method input to output
      - Allows repeat invocations of pure methods to be skipped
      - Once-impure purity is a natural fit
      How can we use memoization?
      - Candidate for optimization
      - Good functional sanity test
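The input-to-output mapping described on this slide can be sketched with a hash table in front of a method. This is a generic illustration rather than the memoization framework from the talk: the `Memoizer` class is hypothetical, it assumes the method takes a single hashable input, and it assumes the method is pure enough that a cached output is always a valid substitute.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical memoizing wrapper around a single-argument method.
class Memoizer<I, O> {
    private final Map<I, O> table = new HashMap<>();
    private final Function<I, O> method;
    private long hits = 0;
    private long misses = 0;

    Memoizer(Function<I, O> method) {
        this.method = method;
    }

    O invoke(I input) {
        if (table.containsKey(input)) {
            hits++;                       // repeat invocation: body is skipped
            return table.get(input);
        }
        misses++;                         // first time for this input: execute
        O output = method.apply(input);
        table.put(input, output);
        return output;
    }

    long hits() { return hits; }
    long misses() { return misses; }
}
```

For example, `new Memoizer<Integer, Integer>(n -> n * n)` executes the lambda once for `invoke(12)` and answers a second `invoke(12)` from the table. Because the first invocation for any input is always executed, a method that is impure only on its first call can plausibly still be memoized, which may be what the slide means by once-impure purity being a natural fit.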

  24. Memoization Framework

  25. Applying Memoization
      Factors influencing memoization decisions:
      - Method size (50 instructions)
      - Input size (100 KB; otherwise potentially the whole heap!)
      - Hashtable warm-up period (1000 cold start misses)
      - Hit ratio (better than 1 in 10)
      - Global memory consumption (1 GB)
      These are fairly generous limits...
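The limits on this slide could be combined into a single decision function along the following lines. This is a sketch under several assumptions: the `MemoPolicy` class and its parameter names are hypothetical, 1 KB is taken as 1024 bytes, the warm-up period is read as "do not judge the hit ratio until more than 1000 misses have occurred", and the hit ratio is computed over all lookups. The method-size factor is left out because the slide does not say whether 50 instructions is a minimum or a maximum.

```java
// Hypothetical policy check; thresholds are the ones listed on the slide,
// their exact interpretation is an assumption.
class MemoPolicy {
    static final long MAX_INPUT_BYTES = 100L * 1024;           // 100 KB
    static final long WARM_UP_MISSES = 1000;                   // cold start misses
    static final double MIN_HIT_RATIO = 0.1;                   // better than 1 in 10
    static final long MAX_GLOBAL_BYTES = 1024L * 1024 * 1024;  // 1 GB

    static boolean keepMemoizing(long inputBytes, long hits, long misses,
                                 long globalBytes) {
        if (inputBytes > MAX_INPUT_BYTES) return false;
        if (globalBytes > MAX_GLOBAL_BYTES) return false;
        if (misses <= WARM_UP_MISSES) return true;  // still warming up
        double hitRatio = (double) hits / (hits + misses);
        return hitRatio > MIN_HIT_RATIO;
    }
}
```

A caller would evaluate `keepMemoizing` per method after each lookup and stop consulting the table once it returns false.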

  26. Execution Times
      [Bar chart: execution time (s) per benchmark (comp, db, jack, javac, jess, mpeg, rt); series: vanilla, online, online+memo, offline+memo]

  27. Memoization Improvements
      Why doesn't memoization achieve speedup?
      - Small number of memoized methods
      - Most memoized methods are short
      - Usually, less than 1% of bytecode is skipped (best case 9%)
      - Implementation limitations
      Potential improvements:
      - Consider purity on a per-input basis
      - Track only those fields read by the method
      - Adaptively turn off memoization if no benefit
      - Allow for cycles in input data structures


  29. Conclusions
      - Static results correlate weakly with dynamic behaviour
      - We considered three different metrics:
        - method purity varies only slightly
        - invocation purity separates benchmarks into two groups
        - bytecode purity separates benchmarks into three groups
      - Consumer applications can impose strong constraints

  30. Future Work
      - Consider purity at different granularities
      - Visualize purity evolution over time
      - Support arbitrary kinds of dynamic purity
      - Memoization improvements
      - Other applications besides memoization (lots!)
        - e.g., speculate past nearly pure methods
