The Synchronization Power of Coalesced Memory Accesses


  1. The Synchronization Power of Coalesced Memory Accesses. Phuong H. Ha (Univ. of Tromsø, Norway), Philippas Tsigas (Chalmers Univ. of Tech., Sweden), Otto J. Anshus (Univ. of Tromsø, Norway). DISC '08

  2. Problem
     - Memory access mechanisms influence the system's synchronization capability.
     - Conventional wisdom: single-word assignment has consensus number 1 ⇒ stronger synch. primitives (e.g. TAS, FAA, CAS) are added.
     - Can we make single-word assignment stronger? ⇒ Transistors saved from strong synch. primitives can be used to enhance other functionality.
     - Figure: transistor distribution. [These figures are from the NVIDIA CUDA Programming Guide, version 2.0]

  3. What is a memory word?
     - A group of n bytes that can be stored or retrieved in a single, basic operation.
     - n is called the word size (in byte-addressable memory).
     - Words of size n must always start at addresses that are multiples of n (alignment restriction). [Hamacher et al. 2002, Hennessy et al. 2003]

  4. Key idea 1: Size-varying word model (svword)
     - Word size n can be any integer, instead of only powers of 2 as in conventional architectures.
     - Ex: solving 2-process consensus using a 2-byte write and a 3-byte write. Figure (bytes 1 to 6, over time): p's 2-byte write and q's 3-byte write partly overlap. [2,3,4] ⇒ p wrote first ⇒ agree on red.
     - Conventional architectures. Figure (bytes 3 to 8): p's 2-byte write and q's 4-byte write. [4,5,6,7] ⇒ q cannot determine if p has written!
     - Feasibility: NVIDIA CUDA int1, int2, int3, int4.
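The 2-process example on this slide can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the authors' code: the byte layout (p writes bytes 2 and 3, q writes bytes 3, 4 and 5) is inferred from the alignment restriction and the [2,3,4] read in the figure, and the names `svwrite`, `first_writer`, `run` and `conventional` are hypothetical. The marks "p" and "q" stand in for the proposed values.

```python
EMPTY = None

def svwrite(mem, start, size, value):
    """Atomically assign `value` to `size` consecutive bytes starting at `start`."""
    if start % size != 0:
        raise ValueError("alignment restriction violated")
    for addr in range(start, start + size):
        mem[addr] = value

def first_writer(mem):
    """Called by a process after its own write; inspects bytes [2, 3, 4]."""
    p_private, shared, q_private = mem[2], mem[3], mem[4]
    if q_private is EMPTY:      # q has not written yet, so p was first
        return "p"
    if p_private is EMPTY:      # p has not written yet, so q was first
        return "q"
    # Both wrote: the shared byte holds the mark of the later writer.
    return "q" if shared == "p" else "p"

def run(order):
    """Apply the writes in `order`; each writer decides right after writing."""
    mem = {addr: EMPTY for addr in range(1, 7)}
    decisions = []
    for writer in order:
        if writer == "p":
            svwrite(mem, 2, 2, "p")   # assumed layout: bytes 2, 3
        else:
            svwrite(mem, 3, 3, "q")   # assumed layout: bytes 3, 4, 5
        decisions.append(first_writer(mem))
    return decisions

def conventional(order):
    """Power-of-2 sizes only: q's 4-byte write fully covers p's 2-byte write."""
    mem = {addr: EMPTY for addr in range(3, 9)}
    for writer in order:
        if writer == "p":
            svwrite(mem, 4, 2, "p")   # bytes 4, 5
        else:
            svwrite(mem, 4, 4, "q")   # bytes 4, 5, 6, 7
    return [mem[a] for a in (4, 5, 6, 7)]
```

In this sketch both processes reach the same decision in either write order, whereas `conventional("pq")` and `conventional("q")` leave bytes 4 to 7 in the same state, which is why q cannot tell whether p has written.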

  5. Key idea 2: Aligned-inconsecutive word model (aiword)
     - Some of the n bytes of a word may be left untouched in a single-word assignment.
     - Ex: solving 2-process consensus using 4-byte writes. Figure (bytes 3 to 8, over time): p's 4-byte write and q's 4-byte write. [4,5,6] ⇒ p wrote first ⇒ agree on red.
     - Feasibility: NVIDIA CUDA coalesced memory accesses. Figure: threads 0 to 7 of SIMD core 1 and SIMD core 2 access memory units 0 to 15, which are grouped into aiwords.
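The aiword example can be sketched the same way, with a write that touches only a chosen subset of one aligned word. This is an illustrative sketch only: the subsets (p touches bytes 4 and 5, q touches bytes 5 and 6 of the word holding bytes 4 to 7) are an assumption consistent with the [4,5,6] read in the figure, and `aiwrite`, `decide` and `run` are hypothetical names.

```python
WORD_START, WORD_SIZE = 4, 4   # the aligned 4-byte word holds bytes 4..7

# Assumed layout: byte 4 is private to p, byte 6 is private to q, byte 5 is shared.
UNITS = {"p": (4, 5), "q": (5, 6)}
PRIVATE = {"p": 4, "q": 6}
SHARED = 5

def aiwrite(mem, units, value):
    """Atomically assign `value` to a non-empty subset of one aligned word."""
    word = range(WORD_START, WORD_START + WORD_SIZE)
    if not units or any(u not in word for u in units):
        raise ValueError("units must be a non-empty subset of the word")
    for u in units:
        mem[u] = value          # bytes outside `units` are left untouched

def decide(mem, me):
    """Run by `me` after its own aiwrite; returns the process that wrote first."""
    other = "q" if me == "p" else "p"
    if mem[PRIVATE[other]] is None:
        return me               # the other process has not written yet
    # Both wrote: if the shared byte holds my mark, I wrote last.
    return other if mem[SHARED] == me else me

def run(order):
    mem = {b: None for b in range(3, 9)}
    decisions = []
    for proc in order:
        aiwrite(mem, UNITS[proc], proc)
        decisions.append(decide(mem, proc))
    return decisions
```

Both writes have the same size and alignment here; what lets the two processes order themselves is that each write leaves the other process's private byte untouched.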

  6. Our main technical contributions
     - Develop general models for coalesced memory accesses.
     - Prove the exact consensus numbers of these models:
       - size-varying word model (svword)
       - aligned-inconsecutive word model (aiword)
       - the combination of these two models (asvword)

  7. Road-map
     - Size-varying word model (svword)
     - Aligned-inconsecutive word model (aiword)
     - The combination of these two models (asvword)

  8. Size-varying word model (svword)
     - An svword consists of b consecutive memory units, b ∈ [1,B], where B is a constant (b-svword for short).
     - b-svwrite = b-svword assignment.
     - Alignment restriction: svwords of size b must start at addresses that are multiples of b.
     - Ex: 2-svwrite, 3-svwrite and 5-svwrite. Figure: memory units 14 to 20 with a 2-svwrite, a 3-svwrite and a 5-svwrite.
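The alignment restriction and the notion of partial overlap can be checked mechanically. The sketch below assumes B = 5 and places the three svwrites at units 14, 18 and 15, which is a placement inferred from the alignment rule and the unit range 14 to 20; `svword_units` and `partly_overlap` are hypothetical helper names.

```python
def svword_units(start, b, B=5):
    """Return the set of units covered by a b-svword starting at `start`."""
    if not 1 <= b <= B:
        raise ValueError("b must lie in [1, B]")
    if start % b != 0:
        raise ValueError("a b-svword must start at a multiple of b")
    return set(range(start, start + b))

def partly_overlap(x, y):
    """True if x and y share a unit and each also has a unit of its own."""
    return bool(x & y) and bool(x - y) and bool(y - x)

two = svword_units(14, 2)     # units 14, 15
three = svword_units(18, 3)   # units 18, 19, 20
five = svword_units(15, 5)    # units 15, 16, 17, 18, 19
```

Under this placement the 5-svword partly overlaps both the 2-svword and the 3-svword, while the 2-svword and the 3-svword do not touch each other.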

  9. Svword's consensus no. ≥ 3
     - Idea: a 5-svwrite can partly overlap both a 2-svwrite and a 3-svwrite ⇒ can construct (binary) consensus objects for 3 processes.
     - Ex: consensus for 3 processes p_1, p_2, p_3, built from a binary consensus (BC) object for p_1, p_2 and a BC object for p_1, p_2, p_3.
     - Figure (units 14 to 20, over time): p_1's 2-svwrite, p_3's 5-svwrite, p_2's 3-svwrite. [17,18,20] ⇒ p_3's write → p_2's write; [14,15,16] ⇒ p_1's write → p_3's write ⇒ red wrote first ⇒ agree on red.
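One way to read the binary consensus step for 3 processes is sketched below. It assumes p_1 and p_2 have already agreed on a common value (the "team" value) and only need to find out whether one of them wrote before p_3. The unit placement and the decision rule are a reading of the figure, not a transcription of the authors' algorithm, and `WRITES`, `winner` and `run` are hypothetical names.

```python
from itertools import permutations

# Assumed placement: process -> (start unit, size) of its svwrite.
WRITES = {
    "p1": (14, 2),   # units 14, 15
    "p2": (18, 3),   # units 18, 19, 20
    "p3": (15, 5),   # units 15, 16, 17, 18, 19
}

def svwrite(mem, proc):
    """Atomically stamp `proc` on every unit of its svword."""
    start, size = WRITES[proc]
    assert start % size == 0, "alignment restriction"
    for unit in range(start, start + size):
        mem[unit] = proc

def winner(mem):
    """Called by a process after its own svwrite.

    Returns "team" if p1 or p2 wrote before p3, otherwise "p3".
    """
    if mem[16] is None:          # unit 16 is written only by p3
        return "team"
    # p3 overwrote a shared unit that a team member had already written.
    p1_before_p3 = mem[14] is not None and mem[15] == "p3"
    p2_before_p3 = mem[20] is not None and mem[18] == "p3"
    return "team" if (p1_before_p3 or p2_before_p3) else "p3"

def run(order):
    mem = {unit: None for unit in range(14, 21)}
    decisions = []
    for proc in order:
        svwrite(mem, proc)
        decisions.append(winner(mem))
    return decisions

# Every interleaving of the three writes yields one common decision.
for order in permutations(WRITES):
    assert len(set(run(order))) == 1
```

The rule only ever compares p_3 against one team member at a time, which is what the partial overlaps on units 15 and 18 make possible.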

  10. Svword's consensus no. ≤ 3
     - Idea: p's critical assignment must write to p's private unit and must partly overlap q's critical assignment if p's critical value ≠ q's critical value (bivalency argument).
     - A b-svwrite accesses consecutive units ⇒ each b-svwrite can partly overlap at most 2 other b-svwrites.
     - Figure: the svwrites of p_1, p_2, p_3 and p_4 laid out over consecutive units.
     - Svword's consensus number is exactly 3.

  11. Road-map
     - Size-varying word model (svword)
     - Aligned-inconsecutive word model (aiword)
     - The combination of these two models (asvword)

  12. Aligned-inconsecutive word (aiword)
     - Memory is aligned to m-unit words, where m is a constant (m-aiword for short).
     - A read/write operation accesses an arbitrary non-empty subset of the m units of an aiword.
     - m-aiwrite = m-aiword assignment.
     - Alignment restriction: m-aiwords must start at addresses that are multiples of m.
     - Ex: 8-aiwrite. Figure: memory units 0 to 15 form two 8-aiwords, one of which is the target of an 8-aiwrite.

  13. m-aiword's consensus no. ≥ ⌊(m+1)/2⌋
     - Idea: construct a binary consensus object for N = ⌊(m+1)/2⌋ processes in which (N−1) processes propose the same value, then construct a multivalued consensus object for N processes using the binary consensus object.
     - Ex: 9-aiword. Figure: consensus for 5 processes p_0, ..., p_4 using a binary consensus (BC) object for 4+1 processes.
     - Writing schema: units 0, 1, 2, 3 are labelled p_0, p_1, p_2, p_3; units 4, 5, 6, 7, 8 are labelled p_4.
     - [0,4,8] ⇒ p_4 → p_0; [1,5,8] ⇒ p_1 → p_4; [2,6,8] ⇒ p_4 → p_2; [3,7,8] ⇒ p_4 → p_3 ⇒ red wrote first.
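The 4+1 writing schema generalizes to N processes on a (2N−1)-aiword, sketched below. The sketch assumes that team member p_i writes its own unit i and the shared unit (N−1)+i, and that the remaining process writes all shared units plus its own unit 2N−2; for N = 5 this matches the unit triples [0,4,8], [1,5,8], [2,6,8], [3,7,8] on the slide. The names `schema`, `winner` and `run` are hypothetical, and the decision rule is a reading of the figure rather than the authors' algorithm.

```python
from itertools import permutations

def schema(N):
    """Units written by each process in one (2N-1)-aiword starting at unit 0."""
    solo = N - 1
    writes = {i: {i, solo + i} for i in range(solo)}   # own unit, shared unit
    writes[solo] = set(range(solo, 2 * N - 1))         # shared units + own unit
    return writes

def winner(mem, N):
    """Called after a process's aiwrite: did a team member precede the solo process?"""
    solo = N - 1
    if mem[2 * N - 2] is None:       # the solo process has not written yet
        return "team"
    for i in range(solo):
        # p_i wrote, and its shared unit was later overwritten by the solo process.
        if mem[i] is not None and mem[solo + i] == solo:
            return "team"
    return "solo"

def run(order, N):
    writes = schema(N)
    mem = [None] * (2 * N - 1)
    decisions = []
    for proc in order:
        for unit in writes[proc]:    # one atomic aiwrite in the model
            mem[unit] = proc
        decisions.append(winner(mem, N))
    return decisions

# All 120 interleavings for the 9-aiword example agree on a single outcome.
for order in permutations(range(5)):
    assert len(set(run(order, 5))) == 1
```

With N processes the schema uses N own units and N−1 shared units, so it fits in an aiword of m = 2N−1 units, in line with N = ⌊(m+1)/2⌋.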

  14. m-aiword's consensus no. ≤ ⌊(m+1)/2⌋
     - Idea (lemma): p_i's critical assignment must atomically write to p_i's own unit u_i and to shared units u_{i,j}, written only by p_i and p_j, where p_i's critical value cv_i ≠ p_j's critical value cv_j (bivalency argument).
     - ⇒ Solving consensus for 2 subsets S_1 and S_2, where cv_1 ≠ cv_2 and n_1 + n_2 = N, needs to write atomically to m units, where m = N + n_1·n_2 ≥ 2N − 1 ⇒ N ≤ (m+1)/2.
     - m-aiword's consensus number is exactly ⌊(m+1)/2⌋.
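The counting step in this bound is easy to check numerically. The sketch below only restates the arithmetic on the slide (N own units plus n_1·n_2 shared units, minimized over the split n_1 + n_2 = N); the function names are hypothetical.

```python
def units_needed(n1, n2):
    """Own units (one per process) plus one shared unit per cross-subset pair."""
    return (n1 + n2) + n1 * n2

def min_units(N):
    """Fewest units over all splits of N processes into two non-empty subsets."""
    return min(units_needed(n1, N - n1) for n1 in range(1, N))

def aiword_consensus_number(m):
    """The bound from the slide: floor((m + 1) / 2)."""
    return (m + 1) // 2

# The minimum is reached at n1 = 1 (or n2 = 1), giving N + (N - 1) = 2N - 1.
assert all(min_units(N) == 2 * N - 1 for N in range(2, 50))
assert aiword_consensus_number(9) == 5   # the 9-aiword example: 5 processes
```

For m = 9 this gives 5 processes, since 5 processes need at least 9 units while 6 processes would need at least 11.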

  15. Road-map
     - Size-varying word model (svword)
     - Aligned-inconsecutive word model (aiword)
     - The combination of these two models (asvword)

  16. Asvword = aiword + svword
     - An extension of aiword: the aiword's m units are replaced by m svwords of the same size b, b ∈ {1,B} (m.b-asvword for short).
     - m.b-asvwrite = m.b-asvword assignment.
     - m = t.B or B = t.m, t ∈ N*.
     - Alignment restriction: m.b-asvwords must start at addresses that are multiples of (m.b).
     - Ex: m=8, B=2: 8.2-asvword vs. 8.1-asvword. Figure: an 8.1-asvwrite (b=1) on units 0 to 15, which form two 8.1-asvwords; an 8.2-asvwrite (b=2) on an 8.2-asvword with units 0 to 7.

  17. Asvword's consensus no. when m ≤ B
     - Asvword's consensus number is ⌊(m+1)/2⌋, like aiword's.
     - Idea: when B = t.m, t ∈ N*, the combination of m.1-asvwrite and m.B-asvwrite does not provide any additional strength compared to m-aiwrite.
     - Ex: B=m=4. p and q write to u_p, u_q, u_{p,q} using 4.1-asvwrite and 4.4-asvwrite.
     - Figure: p's 4.1-asvwrite (b=1) on a 4.1-asvword containing u_p and u_{p,q}; q's 4.4-asvwrite (b=4) on a 4.4-asvword whose 4-svwords contain u_{p,q} and u_q. q's 4.4-asvwrite must overwrite u_p!
