Conflict Exceptions: Simplifying Concurrent Language Semantics with Precise Hardware Exceptions for Data-Races
Brandon Lucia, Luis Ceze, Karin Strauss, Shaz Qadeer and Hans-J. Boehm
Data-Races are Trouble
Complicated language specifications. Usually incorrect, and difficult to debug. Negative impact on system reliability.
What If... Fail-Stop Semantics for Data-Races
When a data-race occurs, throw an exception.
Semantics are clear and simple. Better data-race debugging. Safety: races can’t cause problems.
Requirements
High-Performance: always-on detection. Precise detection: no false positives.
Prior Work
Approx. Methods [Savage’97, Zhou’07, Yu’05]: Performance ✓, Precision ✗
Happens-Before [Elmas’07, Flanagan’09]: Performance ✗, Precision ✓
Conflict Exceptions [ISCA ’10]: Performance ✓, Precision ✓
Conflict Exceptions
[Figure: timeline of Thread 1 and Thread 2 divided into Synchronization-Free Regions. Operations shown: Acquire(K), Rd Y, Wr X, Release(K), Rd T, Wr T, Acquire(M), Acquire(L), Rd X, Rd Y, Wr Y, Wr Y, Release(M), Release(L). Annotations: "Conflict!", "Exception Delivered Here", "Undetected Race".]
Precisely detect only races that can affect consistency. Ignoring unimportant races is key to performance.
The Guarantee: Exception-Thrown? There was a data-race. Exception-Free? Sequential Consistency.
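The rule on this slide, that an exception is thrown only when conflicting accesses fall in regions that overlap in time, can be sketched in software. This is a minimal model under stated assumptions, not the hardware mechanism: the names `Region`, `ConflictException`, and `access`, and the addresses "p" and "q", are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """One synchronization-free region (SFR) of a hypothetical thread."""
    thread: int
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)
    active: bool = True


class ConflictException(Exception):
    """Raised before an access that conflicts with a concurrent region."""


def access(region, others, addr, is_write):
    """Record an access, raising first if it conflicts with an active region."""
    for other in others:
        if not other.active or other.thread == region.thread:
            continue
        # A write conflicts with any remote access; a read only with remote writes.
        if addr in other.writes or (is_write and addr in other.reads):
            raise ConflictException(f"thread {region.thread} conflicts on {addr}")
    (region.writes if is_write else region.reads).add(addr)


r1 = Region(thread=1)
r2 = Region(thread=2)
access(r1, [r2], "p", is_write=True)
r1.active = False                       # thread 1's region has ended
access(r2, [r1], "p", is_write=False)   # racy, but the regions do not overlap

r1_next = Region(thread=1)
access(r1_next, [r2], "q", is_write=True)
try:
    access(r2, [r1_next], "q", is_write=False)
    detected = False
except ConflictException:
    detected = True                     # overlapping regions: exception
```

In this sketch the race on "p" goes unreported because the writing region had already ended, while the access to "q" raises before it is recorded, mirroring the "Undetected Race" and "Conflict!" annotations.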
Language Level Benefits
Reordering in SFRs is legal. Exception-free executions are SC. Granularity independence.
[Figure: code fragments illustrating these points, built from Acquire(K)/Release(K) regions containing Rd X, Rd Y, Wr X, and the pair Wr64_Low X, Wr64_Hi X.]
Language Level Benefits
Race semantics are simpler. Programming is the same. Racy programs are well-behaved.
[Figure: code fragments using pthread_lock(K)/pthread_unlock(K), Acq(K), and Acq(L), with accesses Rd Y, Wr X, Rd X, Wr Q, Wr Z, and a conflict marker (!).]
Debugging and Reliability
Concurrent, conflicting SFRs throw exceptions. All races have some exceptional schedule.
[Figure: Acq(K), Rd X, Wr X, Acq(L), Rd X, with a conflict marker (!).]
Exception Handling: Log + Recover. Damage Control: Shut down buggy module.
System Support for Conflict Exceptions
Hardware/Software Interface
New Instructions: BeginRegion and EndRegion. Synchronization Operations are Singleton Regions. Exceptions Thrown Precisely Before Conflicting Instruction.
Example:
Acquire(K)
BeginRegion
Rd Y
Wr X
EndRegion
Release(K)
BeginRegion
Rd T
Wr T
EndRegion
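One way to picture where BeginRegion and EndRegion go is a pass over a per-thread operation trace that brackets every maximal run of non-synchronization operations. The following is a sketch under assumptions: `insert_regions` and `SYNC_OPS` are hypothetical names, operations are plain strings, and synchronization operations are left unwrapped as in the slide's listing (the slide treats each one as its own singleton region).

```python
SYNC_OPS = ("Acquire", "Release")  # assumed set of synchronization operations


def insert_regions(ops):
    """Wrap each maximal run of non-synchronization ops in BeginRegion/EndRegion."""
    out = []
    in_region = False
    for op in ops:
        is_sync = op.startswith(SYNC_OPS)
        if is_sync and in_region:
            out.append("EndRegion")      # a sync op closes the open region
            in_region = False
        elif not is_sync and not in_region:
            out.append("BeginRegion")    # first ordinary access opens a region
            in_region = True
        out.append(op)
    if in_region:
        out.append("EndRegion")          # close a region left open at the end
    return out


trace = ["Acquire(K)", "Rd Y", "Wr X", "Release(K)", "Rd T", "Wr T"]
instrumented = insert_regions(trace)
```

Applied to the slide's operation sequence, the pass places the region markers in the same positions as the listing.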
Access Monitoring
Byte-granular access information is required.
Each N-byte cache line carries a line-level Supplied bit and four N-bit access bit vectors: Local Read, Local Write, Remote Read, Remote Write.
Exception Test: compare appropriate local and remote bits.
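The exception test can be illustrated with integer bitmasks, one bit per byte of the line. This is only one plausible reading of "compare appropriate local and remote bits": a local access is checked against the remote bits before it is recorded. `LineBits`, `byte_mask`, `conflicts`, and `record` are hypothetical names, and the 8-byte accesses below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LineBits:
    """Per-line access bits, one bit per byte (bit i covers byte i)."""
    local_read: int = 0
    local_write: int = 0
    remote_read: int = 0
    remote_write: int = 0


def byte_mask(offset, size):
    """Bitmask covering `size` bytes starting at byte `offset` of the line."""
    return ((1 << size) - 1) << offset


def conflicts(line, mask, is_write):
    """Exception test for a local access touching the bytes in `mask`."""
    if is_write:
        # A local write conflicts with a remote read or write of the same bytes.
        return bool(mask & (line.remote_read | line.remote_write))
    # A local read conflicts only with a remote write of the same bytes.
    return bool(mask & line.remote_write)


def record(line, mask, is_write):
    """Set the local bits for an access that passed the exception test."""
    if is_write:
        line.local_write |= mask
    else:
        line.local_read |= mask


line = LineBits(remote_write=byte_mask(0, 4))       # remote region wrote bytes 0-3
disjoint = conflicts(line, byte_mask(4, 4), False)  # local read of bytes 4-7
overlap = conflicts(line, byte_mask(2, 4), False)   # local read of bytes 2-5
```

Because the masks are byte-granular, two accesses to different bytes of the same line (the `disjoint` case) do not trigger the test, which is the point of requiring byte-granular information.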
Coherence Support
Read: CPU 1 sends a Read Request to CPU 2; alongside the usual coherence actions, the Read Reply carries CPU 2's Local Write Bits, which CPU 1 records as Remote Write Bits.
Write: CPU 1 sends a Write/Invalidate to CPU 2; alongside the usual coherence actions, the Invalidate Ack carries CPU 2's Local Read Bits and Local Write Bits.
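The piggybacking of access bits on coherence messages might be modeled as below. This is a rough sketch, assuming that a Read Reply carries the supplier's local write bits and an Invalidate Ack carries the holder's local read and write bits, and that the receiver ORs them into its remote bits. The dictionary layout and all handler names are hypothetical, and ordinary coherence state is omitted.

```python
def new_line():
    """Access bits for one cache line, as integer bitmasks (one bit per byte)."""
    return {"LR": 0, "LW": 0, "RR": 0, "RW": 0}


def make_read_reply(supplier_line):
    """Supplier side of a read: attach its local write bits to the reply."""
    return {"type": "ReadReply", "write_bits": supplier_line["LW"]}


def apply_read_reply(requester_line, reply):
    """Requester side: remember the supplier's writes as remote write bits."""
    requester_line["RW"] |= reply["write_bits"]


def make_inv_ack(holder_line):
    """Holder side of an invalidation: attach local read and write bits."""
    return {"type": "InvAck",
            "read_bits": holder_line["LR"],
            "write_bits": holder_line["LW"]}


def apply_inv_ack(writer_line, ack):
    """Writer side: remember the holder's accesses as remote bits."""
    writer_line["RR"] |= ack["read_bits"]
    writer_line["RW"] |= ack["write_bits"]


cpu1, cpu2 = new_line(), new_line()
cpu2["LW"] = 0b0001                      # CPU 2 wrote byte 0 in its region
apply_read_reply(cpu1, make_read_reply(cpu2))

cpu2["LR"] = 0b0100                      # CPU 2 also read byte 2
apply_inv_ack(cpu1, make_inv_ack(cpu2))
```

After these exchanges CPU 1 holds enough remote information to run a byte-granular exception test locally, without further messages per access.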
Ending a Region
End-Of-Region Message: for all supplied lines, CPU 1 sends CPU 2 the Address, Local Write Bits, and Local Read Bits; CPU 2 replies with an End-Of-Region Ack.
Ending a Region Clears Remote Bits Specified in EOR Msg.
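The end-of-region step can be sketched as a function that builds the message from supplied lines, clears the matching remote bits in other caches, and then resets local state. This is a simplified model under assumptions: caches are dictionaries keyed by line address, `end_region` is a hypothetical name, the Ack is not modeled, and clearing by mask assumes only one remote CPU contributed those bits.

```python
def new_line():
    """Access bits for one line: integer bitmasks plus a Supplied flag."""
    return {"LR": 0, "LW": 0, "RR": 0, "RW": 0, "supplied": False}


def end_region(local_cache, remote_caches):
    """Build the end-of-region message, apply it remotely, clear local state."""
    # One entry per supplied line: (address, local read bits, local write bits).
    message = [
        (addr, line["LR"], line["LW"])
        for addr, line in local_cache.items()
        if line["supplied"]
    ]
    for cache in remote_caches:
        for addr, read_bits, write_bits in message:
            if addr in cache:
                # Clear only the remote bits named in the message.
                cache[addr]["RR"] &= ~read_bits
                cache[addr]["RW"] &= ~write_bits
    for line in local_cache.values():
        line["LR"] = line["LW"] = 0
        line["supplied"] = False
    return message


cpu1 = {0x40: new_line()}
cpu2 = {0x40: new_line()}
cpu1[0x40].update(LW=0b0001, supplied=True)   # CPU 1 wrote byte 0, line was supplied
cpu2[0x40].update(RW=0b0001, LR=0b0100)       # CPU 2 saw that write, read byte 2

eor = end_region(cpu1, [cpu2])
```

Once the region has ended, CPU 2's own local bits are untouched, but it no longer holds a remote write bit for byte 0, so a later access to that byte by CPU 2 would not raise an exception.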
Putting It Together
[Animated figure: CPU 1's cache and CPU 2's cache each hold a line with bytes A, B, C, D, a Supplied (Sup) bit, and per-byte Local Read (LR), Local Write (LW), Remote Read (RR), and Remote Write (RW) bits.]
CPU 1’s Code / CPU 2’s Code: BeginRegion BeginRegion Wr A Rd C EndRegion BeginRegion Wr C
Messages shown across the animation, in order: Rd Req, Rd Reply, EoR, Invalidate, Inv Ack.