Data Caching, Garbage Collection, and the Java Memory Model Wolfgang Puffitsch wpuffits@mail.tuwien.ac.at JTRES ’09, September 23-25, 2009 1 / 24
Motivation I ◮ Sequential consistency is expensive ◮ Multi-processors often implement relaxed memory models ◮ JMM is a logical choice for a Java processor 2 / 24
Motivation II ◮ JMM specifies memory model for application ◮ JMM is agnostic of run-time system ◮ Minimal communication between application and GC ◮ Asymmetric synchronization 3 / 24
The Java Memory Model ◮ Happens-before relation ◮ Similar to lazy release consistency ◮ Allows various optimizations ◮ Rules out a number of odd behaviors ◮ Causality must be obeyed 4 / 24
Surprising Behavior
    int x = 0;
    Thread T1      Thread T2
    int r1 = x;    int r2 = x;
    x = 1;         x = 2;
Java memory model allows r1==2, r2==1
5 / 24
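The racy program on this slide can be written as a small Java sketch. The class name `SurprisingBehavior` and the helper `runOnce` are illustrative and not from the talk. A single run on a typical JVM will most likely not show r1==2, r2==1; the sketch only makes concrete which program the JMM statement is about. Because each read precedes the write of the same thread in program order, r1 can only be 0 or 2 and r2 can only be 0 or 1.

```java
final class SurprisingBehavior {
    // Plain (non-volatile) shared field: the accesses below form a data race.
    static int x = 0;

    // Runs T1 and T2 once and returns {r1, r2}.
    static int[] runOnce() {
        x = 0;
        final int[] r = new int[2];
        Thread t1 = new Thread(() -> { r[0] = x; x = 1; });
        Thread t2 = new Thread(() -> { r[1] = x; x = 2; });
        t1.start();
        t2.start();
        try {
            // join() makes the threads' writes to r visible to the caller.
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r;
    }
}
```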
Data Cache Implementation I ◮ Implemented for JopCMP ◮ Predictable, low HW cost ◮ Follows idea of lazy release consistency ◮ Invalidate cache on monitorenter and volatile reads ◮ Write-through cache 6 / 24
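The two rules on this slide (write-through, invalidate on acquire) can be modelled in a few lines. This is a single-threaded toy model and not the JopCMP hardware; the class `WriteThroughCache`, its methods, and the use of a map as "main memory" are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of one core's data cache; names are illustrative, not JopCMP's.
final class WriteThroughCache {
    private final Map<Integer, Integer> mainMemory;               // shared by all cores
    private final Map<Integer, Integer> lines = new HashMap<>();  // core-local copies

    WriteThroughCache(Map<Integer, Integer> mainMemory) {
        this.mainMemory = mainMemory;
    }

    // Read: serve from the local cache if present, otherwise fetch from memory.
    int read(int addr) {
        return lines.computeIfAbsent(addr, a -> mainMemory.getOrDefault(a, 0));
    }

    // Write-through: update the local copy and main memory together.
    void write(int addr, int value) {
        lines.put(addr, value);
        mainMemory.put(addr, value);
    }

    // Acquire action (monitorenter or volatile read): drop all local copies.
    void invalidate() {
        lines.clear();
    }
}
```

In this model a second core keeps reading its stale copy until it calls `invalidate()`, after which the next read fetches the value the first core wrote through. No action on the writing core is needed, which matches the "consistency actions are always local" point made on the next slide.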
Data Cache Implementation II ◮ No global store order ◮ Accesses cannot bypass each other locally ◮ Relatively simple memory model ◮ Good predictability ◮ Consistency actions are always local 7 / 24
Moving Objects ◮ Only minimal communication between application and GC ◮ Avoid synchronization overhead for reads ◮ How to force application to see moved objects? ◮ Invalidate cache for each moved object ◮ Stronger memory model ◮ Avoid movement of objects 8 / 24
GC Algorithms – GC Cycle
    void runGC() {
        // initiate new GC cycle
        startCycle();
        // retrieve roots
        gatherRoots();
        // trace the object graph
        traceObjectGraph();
        // clear objects that are still white
        sweepUnusedObjects();
        // optional memory defragmentation
        defragment();
    }
9 / 24
Tricolor Abstraction ◮ White objects have not been visited ◮ Gray objects need to be visited ◮ Black objects have been visited ◮ After tracing, reachable objects are black and white objects are garbage 10 / 24
GC Algorithms – Tracing
    void traceObjectGraph() {
        // while there are still gray objects
        while (!grayObjects.isEmpty()) {
            // get a gray object
            Object obj = grayObjects.removeFirst();
            // iterate over all reference fields
            for (Field f : getRefFields(obj)) {
                Object fieldVal = getField(obj, f);
                // mark referenced objects
                if (color(fieldVal) == white) {
                    markGray(fieldVal);
                }
            }
            markBlack(obj);
        }
    }
11 / 24
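A self-contained version of the tracing loop, as a minimal sketch. The `Node` class with an explicit `refs` list stands in for real heap objects and their reference fields; it and the `trace(roots)` entry point are assumptions for illustration, not part of the talk.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class TricolorTracer {
    enum Color { WHITE, GRAY, BLACK }

    // Hypothetical heap node: a color plus a list of reference fields.
    static final class Node {
        Color color = Color.WHITE;
        final List<Node> refs = new ArrayList<>();
    }

    private final Deque<Node> grayObjects = new ArrayDeque<>();

    // Shade a white object gray and queue it for visiting.
    void markGray(Node n) {
        if (n != null && n.color == Color.WHITE) {
            n.color = Color.GRAY;
            grayObjects.addLast(n);
        }
    }

    // Trace everything reachable from the roots.
    void trace(List<Node> roots) {
        for (Node r : roots) {
            markGray(r);
        }
        while (!grayObjects.isEmpty()) {
            Node obj = grayObjects.removeFirst();
            for (Node fieldVal : obj.refs) {
                markGray(fieldVal);
            }
            obj.color = Color.BLACK;  // all fields visited
        }
    }
}
```

After `trace` returns, every node reachable from the roots is black and every node still white is garbage, as stated on the Tricolor Abstraction slide.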
GC Algorithms – Write Barrier
    void putFieldRef(Object obj, Field f, Object newVal) {
        // snapshot-at-beginning barrier
        Object oldVal = getField(obj, f);
        if (color(oldVal) == white) {
            markGray(oldVal);
        }
        // write new value to field
        putField(obj, f, newVal);
    }
12 / 24
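The barrier can be exercised in isolation with a minimal sketch like the one below. The `Obj` class with a single reference field `f`, the explicit null check, and the class name `SnapshotBarrier` are assumptions added for illustration. The sketch is single-threaded, so it shows only the shading logic and none of the visibility issues discussed on the following slides.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SnapshotBarrier {
    enum Color { WHITE, GRAY, BLACK }

    // Hypothetical object with one reference field, for brevity.
    static final class Obj {
        Color color = Color.WHITE;
        Obj f;
    }

    final Deque<Obj> grayObjects = new ArrayDeque<>();

    void markGray(Obj o) {
        o.color = Color.GRAY;
        grayObjects.addLast(o);
    }

    // Snapshot-at-beginning barrier: shade the overwritten value, then write.
    void putFieldRef(Obj obj, Obj newVal) {
        Obj oldVal = obj.f;
        if (oldVal != null && oldVal.color == Color.WHITE) {
            markGray(oldVal);
        }
        obj.f = newVal;
    }
}
```

Overwriting `x.f` shades the old target, so an object that was reachable in the snapshot stays visible to the tracer even after the last reference to it is removed. The new value is left untouched by this barrier.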
Tracing Requirements The object graph can be traced correctly if ◮ a snapshot-at-beginning write barrier is used, and ◮ new objects are allocated non-white, and ◮ a consensus is established at the beginning of tracing 13 / 24
Tracing – Justification ◮ Objects must either be reachable from snapshot or newly allocated ◮ Differences in object graph views must stem from updates ⇒ write barrier ◮ Concurrent updates must see snapshot ◮ Works for our cache implementation ◮ Not guaranteed in JMM! 14 / 24
Tracing – JMM Counterexample
    x.f == A;
    Thread T1        Thread T2
    Obj o1 = x.f;    Obj o2 = x.f;
    ...              ...
    x.f = B;         x.f = C;
Java memory model allows o1==C, o2==B!
15 / 24
Sliding Consensus ◮ Consensus is established by invalidating all caches ◮ How to do this non-atomically? ◮ Sliding view root scanning ◮ Invalidate cache at root scanning ◮ Assuming double-barrier ◮ Both old and new value are shaded 16 / 24
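The double barrier assumed here differs from the snapshot-at-beginning barrier only in that it shades the new value as well. A minimal single-threaded sketch follows, with the same caveats as before: `DoubleBarrier`, `Obj`, and `shade` are illustrative names, and the cache invalidation at root scanning is not modelled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class DoubleBarrier {
    enum Color { WHITE, GRAY, BLACK }

    // Hypothetical object with one reference field, for brevity.
    static final class Obj {
        Color color = Color.WHITE;
        Obj f;
    }

    final Deque<Obj> grayObjects = new ArrayDeque<>();

    // Shade a white object gray; non-white objects and null are left alone.
    void shade(Obj o) {
        if (o != null && o.color == Color.WHITE) {
            o.color = Color.GRAY;
            grayObjects.addLast(o);
        }
    }

    // Double barrier: shade both the overwritten and the newly written value.
    void putFieldRef(Obj obj, Obj newVal) {
        shade(obj.f);
        shade(newVal);
        obj.f = newVal;
    }
}
```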
Start of GC Cycle – Requirements ◮ Field updates from earlier GC cycles must be visible to write barriers of new GC cycle ◮ Field updates from earlier GC cycles must be visible to root scanning ◮ Field updates from earlier GC cycles must be perceived consistently 17 / 24
Start of GC Cycle – Consequences ◮ Clear separation of GC cycles ◮ Threads that are preempted while executing a write barrier delay start of a GC cycle 18 / 24
Start of GC Cycle – Future work ◮ Costs of implementation choices to be evaluated ◮ Avoid overlap of old and new barriers ◮ Handshake or mutual exclusion ◮ Enforce consistent perception in write-barrier ◮ Bypass cache or cache invalidation 19 / 24
Object Initialization ◮ Threads must see default values ◮ Avoid synchronization between allocation and potential uses ◮ Memory must not have been in use since last GC cycle ◮ Cache invalidation at GC cycle start ⇒ Cache cannot contain stale values ◮ Analogous considerations apply to final values 20 / 24
Internal Data Structures ◮ Inter-thread communication of GC algorithm ◮ Internal data structures can follow own memory model ◮ E.g., bypass cache ◮ Avoids merging application and run-time synchronization ◮ Depends on capabilities of platform 21 / 24
Conclusion I ◮ Cache that is consistent with JMM ◮ Moving of objects needs consistency enforcement ◮ Tracing works if JMM surprising behavior is avoided ◮ Start of GC cycle requires careful design 22 / 24
Conclusion II ◮ Object creation simple in some cases ◮ Run-time system synchronization can be separated from application synchronization 23 / 24
Thank you for your attention! 24 / 24