
Datalog-based Scalable Semantic Diffing of Concurrent Programs

ASE 2018. Datalog-based Scalable Semantic Diffing of Concurrent Programs. Chungha Sung | Shuvendu K. Lahiri | Constantin Enea | Chao Wang


  1. ASE 2018. Datalog-based Scalable Semantic Diffing of Concurrent Programs. Chungha Sung | Shuvendu K. Lahiri | Constantin Enea | Chao Wang

  2. Concurrent Programs

  3. Evolving Software: becoming better. Fixing bugs or Adding features

  4. Evolving Software: Unexpected Behavior. Fixing bugs or Adding features

  5. Thread 1 Thread 2 lock(a); lock(a); x = 1; x = 0; y = x ; unlock(a); unlock(a);

  6. Thread 1 Thread 2 lock(a); lock(a); x = 1; x = 0; y = x ; unlock(a); unlock(a); New Read-from edge is created!!

  7. Comparison after a change Program Program after a change NO! Is there any unexpected new behavior?

  8. Semantic difference T1 T2 T1 T2 == ? New data-flow edge

  9. Prior work • Bounded Model Checking (BMC) based approach - Need to instrument code with assertions - Interleaving enumeration => expensive

  10. Our approach • Constraint-based scalable program analysis - No code instrumentation needed - No interleaving enumeration - 10x to 1000x faster - Practically accurate

  11. Outline ▪ Motivation ▪ Contribution (Scalable approximate semantic diffing) ▪ Experiments ▪ Conclusion

  12. Overview: Scalable & Practically Accurate! Datalog inference rules for semantic diffing of P1 and P2. Compare the allowed data-flow edges over two programs

  13. Overview (Semantic Diffing framework): LLVM pass → Datalog Facts for P1 and Datalog Facts for P2; Datalog Facts + Datalog Rules + Patch info → Datalog Engine in Z3 (μZ) → Query Differences: Δ12 = P1 \ P2, Δ21 = P2 \ P1

  14. Example: Thread1() { t = 0; x = 1; create(Thread2); lock(a); … assert(x != t); unlock(a); } Thread2() { lock(a); t = x; … x = 2; unlock(a); }

  15. Example: Thread1() { t = 0; x = 1; create(Thread2); lock(a); … assert(x != t); unlock(a); } Thread2() { lock(a); t = x; … x = 2; unlock(a); }

  16. Example: Thread1() { t = 0; x = 1; create(Thread2); lock(a); … assert(x != t); unlock(a); } Thread2() { lock(a); t = x; … x = 2; unlock(a); } State: t=0, x=1

  17. Example: Thread1() { t = 0; x = 1; create(Thread2); lock(a); … assert(x != t); unlock(a); } Thread2() { lock(a); t = x; … x = 2; unlock(a); }

  18. Example: Thread1() { t = 0; x = 1; create(Thread2); lock(a); … assert(x != t); unlock(a); } Thread2() { lock(a); t = x; … x = 2; unlock(a); }

  19. Example: Thread1() { t = 0; x = 1; create(Thread2); lock(a); … assert(x != t); unlock(a); } Thread2() { lock(a); t = x; … x = 2; unlock(a); } State: t=0, x=1. Assertion is not violated

  20. Example: Thread1() { t = 0; x = 1; create(Thread2); lock(a); … assert(x != t); unlock(a); } Thread2() { lock(a); t = x; … x = 2; unlock(a); } State: t=1, x=2. Assertion is not violated

  21. Example after a change: Thread1() { t = 0; x = 1; create(Thread2); lock(a); … assert(x != t); unlock(a); } Thread2() { lock(a); t = x; … x = 2; unlock(a); }

  22. Example after a change: Thread1() { t = 0; x = 1; create(Thread2); lock(a); … assert(x != t); unlock(a); } Thread2() { lock(a); t = x; … x = 2; unlock(a); } Read-from edges are marked at t = x and at assert(x != t). Assertion is violated

  23. Overview (Semantic Diffing framework): LLVM pass → Datalog Facts for P1 and Datalog Facts for P2; Datalog Facts + Datalog Rules + Patch info → Datalog Engine in Z3 (μZ) → Query Differences: Δ12 = P1 \ P2, Δ21 = P2 \ P1

  24. Program Analysis in Datalog [Whaley & Lam, 2004] [Livshits & Lam, 2005]: Evolving concurrent programs → Datalog facts; Datalog facts + Datalog Rules → Datalog Engine → Semantic difference checking between the two programs

  25. What is Datalog? • Declarative language for deductive database [Ullman 1989]. Facts: parent(bill, mary), parent(mary, john). Rules: ancestor(X, Y) ← parent(X, Y); ancestor(X, Y) ← parent(X, Z), ancestor(Z, Y). New relationship: ancestor(bill, john)
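
The derivation on this slide can be sketched in Python as a naive bottom-up fixpoint. This is only an illustration of how the two rules produce the new fact: the function name `derive_ancestors` is hypothetical, and real Datalog engines use more efficient evaluation strategies.

```python
def derive_ancestors(parent):
    """Apply the two ancestor rules repeatedly until no new fact appears."""
    # Rule 1: ancestor(X, Y) <- parent(X, Y)
    ancestor = set(parent)
    while True:
        # Rule 2: ancestor(X, Y) <- parent(X, Z), ancestor(Z, Y)
        derived = {
            (x, y)
            for (x, z) in parent
            for (z2, y) in ancestor
            if z == z2
        }
        if derived <= ancestor:  # fixpoint reached: nothing new was derived
            return ancestor
        ancestor |= derived


# Facts from the slide.
parent = {("bill", "mary"), ("mary", "john")}
ancestor = derive_ancestors(parent)
print(("bill", "john") in ancestor)  # True
```

The loop stops as soon as one round of rule application adds nothing, which is the least fixpoint that Datalog semantics asks for.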

  26. Datalog Translation: Thread1() { t = 0; 1: x = 1; create(Thread2); lock(a); … 2: assert(x != t); unlock(a); } Thread2() { lock(a); 3: t = x; … 4: x = 2; unlock(a); } MustHappenBefore relations: po(s1, s2) -> MustHB(s1, s2); ThreadOrder(s1, t1, s2, t2) -> MustHB(s1, s2). Inferred relations: MustHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4})

  27. Datalog Translation: Thread1() { t = 0; 1: x = 1; create(Thread2); lock(a); … 2: assert(x != t); unlock(a); } Thread2() { lock(a); 3: t = x; … 4: x = 2; unlock(a); } MustHappenBefore relations: po(s1, s2) -> MustHB(s1, s2); ThreadOrder(s1, t1, s2, t2) -> MustHB(s1, s2). Inferred relations: MustHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4})
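
A minimal sketch of the two MustHB rules, written with Python sets instead of a Datalog engine. The fact encodings are assumptions read off the numbered statements: `po` holds the program-order pairs inside each thread, and `thread_order` records that statement 1 precedes `create(Thread2)` and therefore every statement of Thread2.

```python
# Program-order facts between the numbered statements (assumed encoding).
po = {(1, 2), (3, 4)}

# ThreadOrder(s1, t1, s2, t2): s1 in thread t1 runs before t2 is created,
# so it is ordered before s2 in t2 (assumed encoding).
thread_order = {
    (1, "Thread1", 3, "Thread2"),
    (1, "Thread1", 4, "Thread2"),
}


def derive_must_hb(po, thread_order):
    # po(s1, s2) -> MustHB(s1, s2)
    must_hb = set(po)
    # ThreadOrder(s1, t1, s2, t2) -> MustHB(s1, s2)
    must_hb |= {(s1, s2) for (s1, _t1, s2, _t2) in thread_order}
    return must_hb


must_hb = derive_must_hb(po, thread_order)
print(sorted(must_hb))  # [(1, 2), (1, 3), (1, 4), (3, 4)]
```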

  28. Datalog Translation: Thread1() { t = 0; 1: x = 1; create(Thread2); lock(a); … 2: assert(x != t); unlock(a); } Thread2() { lock(a); 3: t = x; … 4: x = 2; unlock(a); } MayHappenBefore relations: MustHB(s1, s2) -> MayHB(s1, s2); Not ThreadOrder(s1, t1, s2, t2) -> MayHB(s2, s1). Inferred relations: MustHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4}); MayHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 2}, {4, 2})

  29. Datalog Translation: Thread1() { t = 0; 1: x = 1; create(Thread2); lock(a); … 2: assert(x != t); unlock(a); } Thread2() { lock(a); 3: t = x; … 4: x = 2; unlock(a); } MayHappenBefore relations: MustHB(s1, s2) -> MayHB(s1, s2); Not ThreadOrder(s1, t1, s2, t2) -> MayHB(s2, s1). Inferred relations: MustHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4}); MayHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 2}, {4, 2})
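
The MayHB set on this slide can be reproduced with a short sketch. It rests on one interpretation that the slide does not spell out: two statements in different threads that are not ordered by MustHB in either direction may happen in both orders. The `thread_of` map and the function name are hypothetical.

```python
# Statement -> owning thread, following the numbering on the slide.
thread_of = {1: "Thread1", 2: "Thread1", 3: "Thread2", 4: "Thread2"}

# MustHB pairs as inferred on the slide.
must_hb = {(1, 2), (3, 4), (1, 3), (1, 4)}


def derive_may_hb(must_hb, thread_of):
    # MustHB(s1, s2) -> MayHB(s1, s2)
    may_hb = set(must_hb)
    for a in thread_of:
        for b in thread_of:
            cross_thread = thread_of[a] != thread_of[b]
            unordered = (a, b) not in must_hb and (b, a) not in must_hb
            # Assumed reading of the second rule: unordered cross-thread
            # statements may run in either order. The loop visits both
            # (a, b) and (b, a), so both directions are added.
            if cross_thread and unordered:
                may_hb.add((a, b))
    return may_hb


may_hb = derive_may_hb(must_hb, thread_of)
print(sorted(may_hb - must_hb))  # [(2, 3), (2, 4), (3, 2), (4, 2)]
```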

  30. Datalog Translation: Thread1() { t = 0; 1: x = 1; create(Thread2); lock(a); … 2: assert(x != t); unlock(a); } Thread2() { lock(a); 3: t = x; … 4: x = 2; unlock(a); } MayReadFrom relations: MayHB(s1, s2) & St(s1) & Ld(s2) -> MayRF(s1, s2). Inferred relations: MustHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4}); MayHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 2}, {4, 2}); MayRF: ({1, 2}, {1, 3}, {3, 2}, {4, 2})
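
The MayRF rule is a filter over MayHB, which a set comprehension can illustrate. The `stores` and `loads` sets are an assumed reading of the example: statements 1 and 4 write `x`, statement 3 writes `t` and reads `x`, and statement 2 reads both `x` and `t`.

```python
# MayHB pairs as inferred on the slide.
may_hb = {(1, 2), (3, 4), (1, 3), (1, 4), (2, 3), (2, 4), (3, 2), (4, 2)}

# St(s): s writes shared memory; Ld(s): s reads shared memory (assumed facts).
stores = {1, 3, 4}
loads = {2, 3}


def derive_may_rf(may_hb, stores, loads):
    # MayHB(s1, s2) & St(s1) & Ld(s2) -> MayRF(s1, s2)
    return {(s1, s2) for (s1, s2) in may_hb if s1 in stores and s2 in loads}


may_rf = derive_may_rf(may_hb, stores, loads)
print(sorted(may_rf))  # [(1, 2), (1, 3), (3, 2), (4, 2)]
```

The pair (4, 3) is absent because statement 3 precedes statement 4 in program order, so it never enters MayHB in the first place.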

  31. Datalog Translation: Thread1() { t = 0; 1: x = 1; create(Thread2); lock(a); … 2: assert(x != t); unlock(a); } Thread2() { lock(a); 3: t = x; … 4: x = 2; unlock(a); } Rank2 relations. Diagram labels: W(x), CS, R(x), PostDom

  32. Datalog Translation: Thread1() { t = 0; 1: x = 1; create(Thread2); lock(a); … 2: assert(x != t); unlock(a); } Thread2() { lock(a); 3: t = x; … 4: x = 2; unlock(a); } Rank2 relations. Diagram labels: W(x), CS, R(x), PostDom; read-from edges RF1, RF2, RF3

  33. Datalog Translation: Thread1() { t = 0; 1: x = 1; create(Thread2); lock(a); … 2: assert(x != t); unlock(a); } Thread2() { lock(a); 3: t = x; … 4: x = 2; unlock(a); } Rank2 relations. Diagram labels: W(x), CS, R(x), PostDom; read-from edges RF1, RF2, RF3. RF1 -> not RF3; RF2 -> not RF1

  34. Datalog Translation: Thread1() { t = 0; 1: x = 1; create(Thread2); lock(a); … 2: assert(x != t); unlock(a); } Thread2() { lock(a); 3: t = x; … 4: x = 2; unlock(a); } Rank2 relations. Diagram labels: W(x), CS, R(x), PostDom; read-from edges RF1, RF2, RF3. RF1 -> not RF3; RF2 -> not RF1. Inferred relations: MustHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4}); MayHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 2}, {4, 2}); MayRF: ({1, 2}, {1, 3}, {3, 2}, {4, 2}); Rank2: ([{1, 2} -> {1, 3}], [{1, 3} -> {4, 2}])

  35. Datalog Translation: Thread1() { t = 0; 1: x = 1; create(Thread2); lock(a); … 2: assert(x != t); unlock(a); } Thread2() { lock(a); 3: t = x; … 4: x = 2; unlock(a); } Rank2 relations. Diagram labels: W(x), CS, R(x), PostDom; read-from edges RF1, RF2, RF3. RF1 -> not RF3; RF2 -> not RF1. Inferred relations: MustHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4}); MayHB: ({1, 2}, {3, 4}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 2}, {4, 2}); MayRF: ({1, 2}, {1, 3}, {3, 2}, {4, 2}); Rank2: ([{1, 2} -> {1, 3}], [{1, 3} -> {4, 2}], [{1, 3} -> {1, 2}])

  36. Overview (Semantic Diffing framework): LLVM pass → Datalog Facts for P1 and Datalog Facts for P2; Datalog Facts + Datalog Rules + Patch info → Datalog Engine in Z3 (μZ) → Query Differences: Δ12 = P1 \ P2, Δ21 = P2 \ P1

  37. Computing differences between MayRF of P1 and MayRF of P2: MayRF(s1, s2, p1) & Not MayRF(s1, s2, p2) -> DiffP1-P2(s1, s2); MayRF(s1, s2, p2) & Not MayRF(s1, s2, p1) -> DiffP2-P1(s1, s2)

  38. Computing differences between MayRF of P1 and MayRF of P2: May be allowed in P1: ([{1, 2} -> {1, 3}], [{1, 3} -> {4, 2}]). May be allowed in P2: ([{1, 2} -> {1, 3}], [{1, 3} -> {4, 2}], [{1, 3} -> {1, 2}])
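
The difference computation amounts to a set difference in each direction. The sketch below applies it to the two Rank2 sets listed on this slide; the tuple encoding of an entry such as [{1, 2} -> {1, 3}] as `((1, 2), (1, 3))` and the function name `semantic_diff` are assumptions made for illustration.

```python
def semantic_diff(edges_p1, edges_p2):
    """Return (allowed only in P1, allowed only in P2)."""
    diff_p1_p2 = edges_p1 - edges_p2  # in P1 and not in P2
    diff_p2_p1 = edges_p2 - edges_p1  # in P2 and not in P1
    return diff_p1_p2, diff_p2_p1


# Rank2 entries from the slide, each encoded as a pair of read-from edges.
rank2_p1 = {((1, 2), (1, 3)), ((1, 3), (4, 2))}
rank2_p2 = {((1, 2), (1, 3)), ((1, 3), (4, 2)), ((1, 3), (1, 2))}

diff_p1_p2, diff_p2_p1 = semantic_diff(rank2_p1, rank2_p2)
print(diff_p1_p2)  # set()
print(diff_p2_p1)  # {((1, 3), (1, 2))}
```

For this example the only difference is the entry that P2 allows and P1 does not, which corresponds to the new behavior introduced by the change.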

  39. Experimental Results 1, the first set: # of apps: 41; LOC: 5,546; Types: Sync, Th.Order, St.Order, Cond; Sources: [Bouajjani et al. SAS 2017], [Yu & Narayanasamy ISCA 2009], [Beyer TACAS 2015], [Bloem et al. FM 2014], [Lu et al. ASPLOS 2008], [Herlihy & Shavit, The Art of Multiprocessor Programming, 2008], [Open source bug reports]

  40. Comparison • Bounded Model Checking based approach

  41. Experimental Results 1, the first set: Execution time of BMC-based approach: > 3 hours; Execution time of our approach (NEW): 15.57 seconds; # of differences our approach found: 402 dataflow edges (All valid)
