Strand Persistency
Vaibhav Gogte, William Wang $, Stephan Diestelhorst $, Peter M. Chen, Satish Narayanasamy, Thomas F. Wenisch
NVMW, 03/12/2019

Promise of persistent memory (PM)
• Performance
• Density
• Non-volatility
“Optane DC Persistent Memory will be offered in packages of up to 512GB per stick.” “… expanding memory per CPU socket to as much as 3TB.” (Source: www.extremetech.com)
Byte-addressable, load-store interface to durable storage

Persistent memory system
[Figure: CPU with writeback caches, backed by DRAM and Persistent Memory (PM); a failure is followed by recovery]
Recovery can inspect PM data structures to restore the system to a consistent state

Recovery requires PM access ordering
[Figure: CPU, writeback caches, and PM; St a = x must reach PM before St b = y for recovery]
Consistency model:
    St a = x
    St b = y
Persistency model (Intel x86 primitives):
    St a = x
    CLWB(a)
    SFENCE
    St b = y
    CLWB(b)
Hardware systems provide primitives to express persist order to PM

Hardware imposes overly strict constraints
Ideal DAG:
    St A = 1; CLWB(A)
    St B = 2; CLWB(B)
    St C = 3; CLWB(C)
DAG 1:
    St A = 1; CLWB(A)
    SFENCE
    St B = 2; CLWB(B)
    St C = 3; CLWB(C)
DAG 2:
    St A = 1; CLWB(A)
    St C = 3; CLWB(C)
    SFENCE
    St B = 2; CLWB(B)
[Figure: persist-order graphs over A, B, and C for the ideal DAG, DAG 1, and DAG 2]
Primitives in existing hardware systems overconstrain PM accesses
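The over-constraint can be made concrete with a small model. The sketch below is an illustration, not the talk's tooling: it treats each "St X; CLWB(X)" pair as one persist, assumes that an SFENCE orders every earlier persist before every later one, and assumes the ideal DAG requires only A before B (the flattened slide does not preserve the graph edges). The function name `sfence_edges` is hypothetical.

```python
def sfence_edges(program):
    """Return persist-order edges implied by SFENCE in a simplified epoch model.

    `program` is a list of persist names (each standing for a store plus its
    CLWB) and "SFENCE" markers. Every persist issued before an SFENCE is
    ordered before every persist issued after it.
    """
    edges = set()
    fenced = []    # persists already separated from the current epoch by an SFENCE
    current = []   # persists issued since the last SFENCE
    for op in program:
        if op == "SFENCE":
            fenced.extend(current)
            current = []
        else:
            edges.update((earlier, op) for earlier in fenced)
            current.append(op)
    return edges


ideal = {("A", "B")}  # assumption: only A must persist before B
dag1 = sfence_edges(["A", "SFENCE", "B", "C"])
dag2 = sfence_edges(["A", "C", "SFENCE", "B"])

print("DAG 1 extra edges:", sorted(dag1 - ideal))
print("DAG 2 extra edges:", sorted(dag2 - ideal))
```

Under these assumptions, either placement of the SFENCE adds one ordering edge that the ideal DAG does not need: DAG 1 delays C behind A, and DAG 2 delays B behind C.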

Contributions
• Employ strand persistency [Pelley14]
  – Hardware ISA primitives to specify precise ordering constraints
• Comprises two primitives: PersistBarrier and NewStrand
  – Can encode an arbitrary DAG
• Map language-level persistency models to ISA-level primitives
  – Leverage strand persistency to build persistency models efficiently
Strand persistency improves performance of language persistency models by 21.4% (avg.)

Outline
• Contributions
• Example: Failure atomicity
• Existing hardware primitives
• Strand persistency
• Evaluation

Example: Failure atomicity
Failure atomicity: Which group of stores persist atomically?
    atomic_begin()
    x = 100;        (failure-atomic region)
    y = 200;
    atomic_end()
Failure atomicity limits the state that recovery can observe after failure

Undo-logging for failure atomicity
Init: x = 0; y = 0
    atomic_begin()
    x = 1;          (failure-atomic)
    y = 2;
    atomic_end()
Steps: persistUndoLog (L) → mutateData (M) → persistData (P) → commitLog (C)
Undo logging steps ordered to ensure failure-atomicity
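The four steps can be traced in a toy simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: PM and the log are plain dictionaries, a `Crash` exception stands in for a power failure, and the `committed` flag, `crash_at` parameter, and function names are hypothetical.

```python
class Crash(Exception):
    """Raised to simulate a power failure at a chosen point."""


def atomic_update(pm, log, updates, crash_at=None):
    # L: persist the undo log (old values) before touching the data.
    log["entries"] = {key: pm[key] for key in updates}
    log["committed"] = False
    if crash_at == "after_log":
        raise Crash()
    # M and P: mutate the data in place and persist it.
    for i, (key, value) in enumerate(updates.items()):
        pm[key] = value
        if crash_at == "mid_data" and i == 0:
            raise Crash()
    # C: commit the log, which marks its entries as no longer needed.
    log["committed"] = True


def recover(pm, log):
    # An uncommitted log means the region may be partially applied: roll back.
    if not log.get("committed", True):
        pm.update(log["entries"])
        log["committed"] = True


pm = {"x": 0, "y": 0}
log = {}
try:
    atomic_update(pm, log, {"x": 1, "y": 2}, crash_at="mid_data")
except Crash:
    pass
state_at_crash = dict(pm)        # x was updated, y was not
recover(pm, log)
state_after_recovery = dict(pm)  # both restored from the undo log
```

The rollback only works because the log reaches PM before the data it protects, which is the L-before-M ordering the slide calls out.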

Hardware imposes stricter constraints
Program:
    atomic_begin()
    x = 1;
    y = 2;
    atomic_end()
Ideal ordering (two independent chains):
    Log(L_x, x); CLWB(L_x) → Store(x, 1)
    Log(L_y, y); CLWB(L_y) → Store(y, 2)
SFENCE ordering:
    Log(L_x, x)
    CLWB(L_x)
    SFENCE
    Store(x, 1)
    Log(L_y, y)
    CLWB(L_y)
    SFENCE
    Store(y, 2)

Strand persistency enables persist concurrency
• Provides primitives to express precise persist order
    Persist A
    PersistBarrier      ← Orders persists within a strand
    Persist B
    NewStrand           ← Initiates new stream of persists
    Persist C
[Figure: Strand 0 holds A → B; Strand 1 holds C]
Persists on different strands can be issued concurrently to PM

What if ordering is needed across strands?
• Conflicting accesses establish persist order across strands
    Persist A
    PersistBarrier
    Persist B
    NewStrand
    Persist A
    PersistBarrier
    Persist C
[Figure: Strand 0 holds A → B; Strand 1 holds A → C; the two persists to A establish the inter-strand order]
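The two primitives and the conflict rule can be captured in a small edge-generating model. The sketch below is a simplification for illustration only: it assumes PersistBarrier orders persists within the current strand, NewStrand discards that ordering for later persists, and two persists to the same address are ordered in program order. The function name `strand_edges` and the "A#1" labelling are hypothetical.

```python
from collections import defaultdict


def strand_edges(program):
    """Return direct persist-order edges in a simplified strand persistency model.

    `program` lists addresses to persist plus "PersistBarrier" and "NewStrand"
    markers. Persists are labelled "<addr>#<n>" for the n-th persist to <addr>.
    """
    edges = set()
    count = defaultdict(int)  # persists seen so far per address
    last_to = {}              # address -> label of the latest persist to it
    ordered = []              # persists on this strand before the latest barrier
    pending = []              # persists on this strand since the latest barrier
    for op in program:
        if op == "NewStrand":
            ordered, pending = [], []   # a new strand drops earlier barrier ordering
        elif op == "PersistBarrier":
            ordered, pending = ordered + pending, []
        else:
            count[op] += 1
            label = f"{op}#{count[op]}"
            edges.update((earlier, label) for earlier in ordered)
            if op in last_to:           # conflicting access to the same address
                edges.add((last_to[op], label))
            last_to[op] = label
            pending.append(label)
    return edges


concurrent = strand_edges(["A", "PersistBarrier", "B", "NewStrand", "C"])
conflict = strand_edges(
    ["A", "PersistBarrier", "B", "NewStrand", "A", "PersistBarrier", "C"]
)
print(sorted(concurrent))
print(sorted(conflict))
```

In the first program C carries no ordering edge, so it may persist concurrently with A and B. In the second, the repeated persist to A links the strands, so C is transitively ordered after the first persist to A while remaining unordered with respect to B.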

Logging using strand persistency
Program:
    atomic_begin()
    x = 1;
    y = 2;
    atomic_end()
With strand primitives:
    Log(L_x, x)
    CLWB(L_x)
    PersistBarrier
    Store(x, 1)
    NewStrand
    Log(L_y, y)
    CLWB(L_y)
    PersistBarrier
    Store(y, 2)
[Figure: Strand 0 holds Log(L_x, x), CLWB(L_x) → Store(x, 1); Strand 1 holds Log(L_y, y), CLWB(L_y) → Store(y, 2)]
Need to implement log buffer that can manage concurrent log updates

Log space under strand persistency
[Figure: log buffer with entries Invalid, Log 0, Log 1, Invalid; a persistent head atomically commits logs; a volatile tail allows concurrent log creation]
• Failure exposes log write reorderings
  – Identify valid logs in case of failure
  – Record order of log creation
  – Recovery rolls back partial updates using valid logs
More details in the paper

Language persistency models to ISA primitives
[Figure: Hardware ISA layer]
ISA primitives: PersistBarrier and NewStrand