
Efficient and Precise Points-to Analysis: Modeling the Heap by Merging Equivalent Automata

Tian Tan, Yue Li and Jingling Xue, PLDI 2017, June 2017


  1. Efficient and Precise Points-to Analysis: Modeling the Heap by Merging Equivalent Automata. Tian Tan, Yue Li and Jingling Xue. PLDI 2017, June 2017.

  2. A New Points-to Analysis Technique for Object-Oriented Programs

  3. Points-to Analysis: determines “which objects a variable can point to?”


  5. Uses of Points-to Analysis. Clients: security analysis, bug detection, compiler optimization, program verification, program understanding, … Tools: Chord, … Highlighted: Call Graph.


  7. Existing Call Graph Construction. On-the-fly construction (run with points-to analysis): precise, inefficient. 3-object-sensitive points-to analysis: very precise; adopted by, e.g., Chord.


  9. 3-Object-Sensitive Points-to Analysis. Analyze Java programs ◦ Intel Xeon E5 3.70GHz, 128GB of memory ◦ Time budget: 5 hours (18000 secs). Analysis time (sec.): pmd 14469 (4 hours); findbugs unscalable (> 5 hours).


  12. Two Mainstreams of Points-to Analysis Techniques. Model control-flow ◦ Context-sensitivity: call-site-sensitivity (PLDI’04, PLDI’06), object-sensitivity (ISSTA’02, TOSEM’05, SAS’16), type-sensitivity (POPL’11), … Model data-flow ◦ Heap abstraction: allocation-site abstraction, type-based abstraction, …


  14. Heap Abstraction. The infinite-size set of heap objects in dynamic execution is abstracted or partitioned into a finite set of (abstract) heap objects for static analysis.


  17. Allocation-Site Abstraction. One object per allocation site ◦ Adopted by all mainstream points-to analyses. Example: 1: A a1 = new A(); → $o_1^A$. 2: A a2 = new A(); → $o_2^A$. 3: B b = new B(); → $o_3^B$.

  18. Allocation-Site Abstraction. Over-partition for call graph construction. Code: 1: A a1 = new A(); 2: A a2 = new A(); 3: foo(a1); 4: foo(a2); with void foo(Object o) { o.toString(); }. In foo, o points to both $o_1^A$ and $o_2^A$, yet o.toString() resolves to the single target A::toString().

  19. Allocation-Site Abstraction. Over-partition for type-dependent clients ◦ Call graph construction ◦ Devirtualization ◦ May-fail casting ◦ … Code: 1: A a1 = new A(); 2: A a2 = new A(); 3: foo(a1); 4: foo(a2); with void foo(Object o) { o.toString(); A a = (A) o; }. In foo, o points to both $o_1^A$ and $o_2^A$.
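The over-partitioning described on this slide can be sketched in a few lines of Java. This is only an illustration of the four-line example, not the authors' implementation: the names AllocSiteDemo, AbsObj, pointsToOfO and callTargets are hypothetical, and virtual-call resolution is reduced to looking at the type of each abstract object.

```java
import java.util.*;

// Hypothetical illustration of the slide's example, not the authors' code.
final class AllocSiteDemo {
    // An abstract object under the allocation-site abstraction: one per `new` site.
    static final class AbsObj {
        final int site;
        final String type;
        AbsObj(int site, String type) { this.site = site; this.type = type; }
    }

    // Points-to set of parameter `o` of foo after the calls foo(a1) and foo(a2).
    static List<AbsObj> pointsToOfO() {
        AbsObj o1 = new AbsObj(1, "A"); // line 1: A a1 = new A();
        AbsObj o2 = new AbsObj(2, "A"); // line 2: A a2 = new A();
        return List.of(o1, o2);
    }

    // Resolve o.toString(): a type-dependent client only looks at the types.
    static Set<String> callTargets(List<AbsObj> pointsTo) {
        Set<String> targets = new TreeSet<>();
        for (AbsObj obj : pointsTo) {
            targets.add(obj.type + "::toString()");
        }
        return targets;
    }
}
```

Two abstract objects are tracked for o, but the call still has the single target A::toString(), so keeping $o_1^A$ and $o_2^A$ apart gains nothing for this client.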


  21. Type-Based Abstraction. One object per type. Example: A a1 = new A(); and A a2 = new A(); share $o^A$; B b = new B(); → $o^B$.


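  26. Type-Based Abstraction. Precision loss for type-dependent clients. Code: A a1 = new A(); A a2 = new A(); B b = new B(); C c = new C(); a1.f = b; a2.f = c; Object o = a1.f; o.toString(); Abstract objects: $o^A$, $o^B$, $o^C$. Because a1 and a2 share $o^A$, $o^A$.f points to both $o^B$ and $o^C$, so o points to both and o.toString() resolves to B::toString() and C::toString(). C::toString() is a false positive.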
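The false positive on this slide can be reproduced with a small sketch that runs the same stores and load under both heap abstractions. Everything here is an assumption made for illustration (the class TypeBasedDemo, the site names o1 to o4, and the reduction of field f to a map from abstract base object to stored types); it is not how any real points-to framework represents the heap.

```java
import java.util.*;

// Hypothetical illustration of the slide's example, not the authors' code.
final class TypeBasedDemo {
    // Allocation sites and their types: a1 -> o1, a2 -> o2, b -> o3, c -> o4.
    static final Map<String, String> TYPE_OF =
            Map.of("o1", "A", "o2", "A", "o3", "B", "o4", "C");

    // Name of the abstract object that stands for a site under each abstraction.
    static String abstractOf(String site, boolean typeBased) {
        return typeBased ? "o^" + TYPE_OF.get(site) : site;
    }

    // Targets of o.toString() after: a1.f = b; a2.f = c; Object o = a1.f;
    static Set<String> toStringTargets(boolean typeBased) {
        // Field f: abstract base object -> types of the objects stored in it.
        Map<String, Set<String>> fieldF = new HashMap<>();
        String[][] stores = { {"o1", "o3"}, {"o2", "o4"} }; // {base, value}
        for (String[] s : stores) {
            fieldF.computeIfAbsent(abstractOf(s[0], typeBased), k -> new TreeSet<>())
                  .add(TYPE_OF.get(s[1]));
        }
        // Load through a1, whose object is o1 (or o^A when merged by type).
        Set<String> targets = new TreeSet<>();
        for (String type : fieldF.get(abstractOf("o1", typeBased))) {
            targets.add(type + "::toString()");
        }
        return targets;
    }
}
```

With typeBased set to false the only target is B::toString(); with it set to true, C::toString() appears as well, which is the false positive.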

  27. Our Goal: Improve Efficiency, Preserve Precision


  30. MAHJONG: A New Heap Abstraction. Improve efficiency, analysis time (sec.), MAHJONG vs. allocation-site abstraction (adopted by all mainstream points-to analyses): pmd 128 vs. 14469 (4 hours); findbugs 524 vs. unscalable (> 5 hours). Preserve precision, #call graph edges for pmd, MAHJONG vs. allocation-site abstraction: 44016 vs. 44004. How?


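  33. Merging objects alleviates over-partition; blindly merging objects causes precision loss. Example: $o_1^A$.f → $o_3^B$ and $o_2^A$.f → $o_4^C$, so the two A objects reach inconsistent types. Merging them into $o^A$ gives $o^A$.f → $o^B$ and $o^A$.f → $o^C$.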


  35. Type-Consistent Objects. Definition: $O_i^T$ and $O_j^T$ are type-consistent objects if, for every sequence of field names $\bar{f} = f_1.f_2.\ldots.f_n$, $O_i^T.\bar{f}$ and $O_j^T.\bar{f}$ point to objects of the same types. MAHJONG only merges type-consistent objects.


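  38. Type-Consistent Objects. Example: $o_1^T$.f → $o_3^U$, $o_3^U$.h → $o_7^Y$ and $o_9^Y$, $o_1^T$.g → $o_5^X$, $o_5^X$.k → $o_{11}^Y$; $o_2^T$.f → $o_4^U$, $o_4^U$.h → $o_8^Y$, $o_2^T$.g → $o_6^X$, $o_6^X$.k → $o_8^Y$. Types reached from $O_1^T$ / $O_2^T$: (root) T / T; .f U / U; .f.h Y / Y; .g X / X; .g.k Y / Y. ∵ the types agree on every field sequence, ∴ $O_1^T$ and $O_2^T$ are type-consistent objects.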
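The table on this slide can be recomputed mechanically. The sketch below encodes the two object graphs as read from the diagram and compares the types reached along the five field sequences listed. The class name, the helper names and the string encoding of objects are all hypothetical, and "same types" is interpreted here as equal sets of types, which is an assumption about the definition.

```java
import java.util.*;

// Hypothetical illustration of the slide's example, not the authors' code.
final class TypeConsistencyDemo {
    static final Map<String, String> TYPE = new HashMap<>();
    static final Map<String, Map<String, Set<String>>> FIELDS = new HashMap<>();

    // The five field sequences from the slide's table (the first is the empty path).
    static final List<List<String>> SLIDE_PATHS = List.of(
            List.<String>of(), List.of("f"), List.of("f", "h"),
            List.of("g"), List.of("g", "k"));

    static void obj(String name, String type) { TYPE.put(name, type); }

    static void edge(String from, String field, String to) {
        FIELDS.computeIfAbsent(from, k -> new HashMap<>())
              .computeIfAbsent(field, k -> new HashSet<>()).add(to);
    }

    static {
        // Graph rooted at o1.
        obj("o1", "T"); obj("o3", "U"); obj("o5", "X");
        obj("o7", "Y"); obj("o9", "Y"); obj("o11", "Y");
        edge("o1", "f", "o3"); edge("o3", "h", "o7"); edge("o3", "h", "o9");
        edge("o1", "g", "o5"); edge("o5", "k", "o11");
        // Graph rooted at o2.
        obj("o2", "T"); obj("o4", "U"); obj("o6", "X"); obj("o8", "Y");
        edge("o2", "f", "o4"); edge("o4", "h", "o8");
        edge("o2", "g", "o6"); edge("o6", "k", "o8");
    }

    // Types of all objects reached from root by following the field path.
    static Set<String> typesAlong(String root, List<String> path) {
        Set<String> current = Set.of(root);
        for (String field : path) {
            Set<String> next = new HashSet<>();
            for (String o : current) {
                next.addAll(FIELDS.getOrDefault(o, Map.of()).getOrDefault(field, Set.of()));
            }
            current = next;
        }
        Set<String> types = new TreeSet<>();
        for (String o : current) types.add(TYPE.get(o));
        return types;
    }

    // True if a and b reach the same set of types along every given path.
    static boolean agreeOn(String a, String b, List<List<String>> paths) {
        for (List<String> p : paths) {
            if (!typesAlong(a, p).equals(typesAlong(b, p))) return false;
        }
        return true;
    }
}
```

Checking a fixed list of paths is enough for this small acyclic example only. The definition quantifies over every field sequence, and a heap with cycles has unboundedly many of them, so a path-by-path check does not generalize.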

  39. How to Check Type-Consistency?

  40. Our Solution: Sequential Automata. Check type-consistency of objects → test equivalence of automata.

  41. Sequential Automata. 6-tuple $(Q, \Sigma, \delta, q_0, \Gamma, \gamma)$, where: ◦ $Q$ is a set of states ◦ $\Sigma$ is a set of input symbols ◦ $\delta$ is the next-state map: $Q \times \Sigma \to \mathcal{P}(Q)$ ◦ $q_0$ is the initial state ◦ $\Gamma$ is a set of output symbols ◦ $\gamma$ is the output map: $Q \to \Gamma$
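As a data structure, the 6-tuple maps directly onto a small class. The sketch below is one possible encoding, assuming states, input symbols and output symbols are all plain strings; the class name SequentialAutomaton and the method step are hypothetical and not taken from the slides.

```java
import java.util.*;

// Hypothetical encoding of the 6-tuple; comments give the slide's symbols.
final class SequentialAutomaton {
    final Set<String> states;                          // Q
    final Set<String> inputs;                          // Sigma
    final Map<String, Map<String, Set<String>>> delta; // Q x Sigma -> P(Q)
    final String initial;                              // q0
    final Set<String> outputs;                         // Gamma
    final Map<String, String> gamma;                   // Q -> Gamma

    SequentialAutomaton(Set<String> states, Set<String> inputs,
                        Map<String, Map<String, Set<String>>> delta, String initial,
                        Set<String> outputs, Map<String, String> gamma) {
        this.states = states;
        this.inputs = inputs;
        this.delta = delta;
        this.initial = initial;
        this.outputs = outputs;
        this.gamma = gamma;
    }

    // delta(q, a): the (possibly empty) set of successor states.
    Set<String> step(String state, String symbol) {
        return delta.getOrDefault(state, Map.of()).getOrDefault(symbol, Set.of());
    }
}
```

Because the next-state map returns a set of states, the automaton is nondeterministic in general, and a missing entry simply yields the empty set.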

  42. Check type-consistency of objects → test equivalence of automata. How?


  44. Objects → Automata. A set of objects → $Q$: a set of states. A set of field names → $\Sigma$: a set of input symbols. The field points-to map → $\delta$: the next-state map. The object to be checked → $q_0$: the initial state. A set of types → $\Gamma$: a set of output symbols. The object-to-type map → $\gamma$: the output map. Example graph: $o_2^T$.f → $o_4^U$, $o_4^U$.h → $o_8^Y$, $o_2^T$.g → $o_6^X$, $o_6^X$.k → $o_8^Y$. Objects ↔ states: $O_2^T$, $O_4^U$, $O_6^X$, $O_8^Y$.

  45. Objects → Automata. A set of objects → $Q$: a set of states. A set of field names → $\Sigma$: a set of input symbols. The field points-to map → $\delta$: the next-state map. The object to be checked → $q_0$: the initial state. A set of types → $\Gamma$: a set of output symbols. The object-to-type map → $\gamma$: the output map. Example graph: $o_2^T$.f → $o_4^U$, $o_4^U$.h → $o_8^Y$, $o_2^T$.g → $o_6^X$, $o_6^X$.k → $o_8^Y$. Field names ↔ input symbols: f, g, h, k.
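Putting the mapping to work, the sketch below treats the field points-to graph itself as the automaton (objects as states, field names as input symbols, types as outputs) and tests whether two objects, used as initial states, give equivalent automata. The equivalence test is a plain worklist search over pairs of state sets. That is an assumption chosen for brevity, since these slides do not say which equivalence algorithm the authors use, and comparing sets of output types is likewise an interpretation. All names are hypothetical, and the o1 graph is taken from the earlier type-consistency example.

```java
import java.util.*;

// Hypothetical sketch: not the authors' algorithm or code.
final class AutomataEquivalenceDemo {
    final Map<String, String> typeOf = new HashMap<>();                          // gamma
    final Map<String, Map<String, Set<String>>> fieldPointsTo = new HashMap<>(); // delta

    void object(String name, String type) { typeOf.put(name, type); }

    void store(String base, String field, String target) {
        fieldPointsTo.computeIfAbsent(base, k -> new HashMap<>())
                     .computeIfAbsent(field, k -> new HashSet<>()).add(target);
    }

    // delta lifted to sets of states.
    private Set<String> step(Set<String> states, String field) {
        Set<String> next = new HashSet<>();
        for (String s : states) {
            next.addAll(fieldPointsTo.getOrDefault(s, Map.of()).getOrDefault(field, Set.of()));
        }
        return next;
    }

    // Output symbols (types) of a set of states.
    private Set<String> types(Set<String> states) {
        Set<String> out = new TreeSet<>();
        for (String s : states) out.add(typeOf.get(s));
        return out;
    }

    // Do the automata with initial states a and b agree on every input word?
    boolean equivalent(String a, String b) {
        Set<String> alphabet = new TreeSet<>();
        for (Map<String, Set<String>> m : fieldPointsTo.values()) alphabet.addAll(m.keySet());
        Deque<List<Set<String>>> work = new ArrayDeque<>();
        Set<List<Set<String>>> seen = new HashSet<>();
        work.add(List.of(Set.of(a), Set.of(b)));
        while (!work.isEmpty()) {
            List<Set<String>> pair = work.poll();
            if (!seen.add(pair)) continue; // this pair was already explored
            if (!types(pair.get(0)).equals(types(pair.get(1)))) return false;
            for (String f : alphabet) {
                work.add(List.of(step(pair.get(0), f), step(pair.get(1), f)));
            }
        }
        return true;
    }

    // The two object graphs from the slides' example.
    static AutomataEquivalenceDemo slideExample() {
        AutomataEquivalenceDemo g = new AutomataEquivalenceDemo();
        g.object("o1", "T"); g.object("o3", "U"); g.object("o5", "X");
        g.object("o7", "Y"); g.object("o9", "Y"); g.object("o11", "Y");
        g.store("o1", "f", "o3"); g.store("o3", "h", "o7"); g.store("o3", "h", "o9");
        g.store("o1", "g", "o5"); g.store("o5", "k", "o11");
        g.object("o2", "T"); g.object("o4", "U"); g.object("o6", "X"); g.object("o8", "Y");
        g.store("o2", "f", "o4"); g.store("o4", "h", "o8");
        g.store("o2", "g", "o6"); g.store("o6", "k", "o8");
        return g;
    }
}
```

The search terminates even on cyclic heaps because there are only finitely many pairs of state sets, which is the practical advantage of the automata view over enumerating field sequences one by one.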
