DISTRIBUTED SYSTEMS [COMP9243]
Lecture 3b: Distributed Shared Memory

Slide 1
➀ DSM
➁ Case study
➂ Design issues
➃ Implementation issues

Slide 2
DISTRIBUTED SHARED MEMORY (DSM)
DSM: shared memory + multicomputer

[Figure: a shared global address space (pages 0-16) whose pages are scattered over the local memories of CPU 1 to CPU 4, connected by a network.]

Properties:
➜ Remote access is expensive compared to local memory access
➜ Individual operations can have very low overhead
➜ Threads cannot distinguish between local and remote access

Slide 3
SHARED ADDRESS SPACE
DSM consists of two components:
➀ Shared address space
➁ Replication and consistency of memory objects

Shared address space:

[Figure: Node 1 and Node 2, connected by a network, both map the shared addresses 0x1000 and 0x2000.]

➜ Shared addresses are valid in all processes

Slide 4
Transparent remote access:

[Figure: an access to address 0x1000 on Node 1 is transparently served by the copy held at Node 2, across the network.]
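The figure on Slide 2 shows each page of the global address space living on some node. A minimal sketch of how a DSM library might statically assign pages to nodes (round robin by page number; the names and the policy here are illustrative, not from the slides):

    #include <stdint.h>

    #define PAGE_SIZE 4096   /* assumed system page size */
    #define NUM_NODES 4      /* assumed number of nodes */

    /* Map a shared address to the node responsible for its page.
     * Round-robin by page number; purely illustrative. */
    static int node_of(uintptr_t addr)
    {
        return (int)((addr / PAGE_SIZE) % NUM_NODES);
    }

With such a static map, any node can compute without communication where to send a request for a given page; TreadMarks' per-page manager assignment (Slide 26) works on the same round-robin principle.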
Slide 5
Why DSM?:
➜ Shared memory model: easiest to program to
➜ Physical shared memory not possible on a multicomputer
➜ DSM emulates shared memory

Benefits of DSM:
➜ Ease of programming (shared memory model)
➜ Eases porting of existing code
➜ Pointer handling
  • Shared pointers refer to shared memory
  • Share complex data (lists, etc.)
  • No marshalling

Slide 6
DSM IMPLEMENTATIONS
Hardware:
➜ Multiprocessor
➜ Examples: MIT Alewife, DASH

OS with hardware support:
➜ SCI network cards (SCI = Scalable Coherent Interconnect)
➜ SCI maps an extended physical address space to remote nodes
➜ The OS maps the shared virtual address space to the SCI range

OS and virtual memory:
➜ Virtual memory (page faults, paging)
➜ Local address space vs large address space

Slide 7
Middleware:
➜ Library:
  • Library routines to create/access shared memory
  • Examples: MPI-2, CRL
➜ Language:
  • Shared memory encapsulated in language constructs
  • Extend the language with annotations
  • Examples: Orca, Linda, JavaSpaces, JavaParty, Jackal

Slide 8
Typical Implementation:
➜ Most often implemented in user space (e.g., TreadMarks, CVM)
➜ User space: what's needed from the kernel?
  • User-level fault handler [e.g., Unix signals]
  • User-level VM page mapping and protection [e.g., mmap() and mprotect()]
  • Message passing layer [e.g., the socket API]
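A minimal sketch of how these three kernel facilities combine in a user-space DSM, assuming Unix signals and mmap()/mprotect(); the network fetch is elided and all names are illustrative:

    #include <signal.h>
    #include <stdint.h>
    #include <sys/mman.h>

    #define DSM_SIZE (1024 * 4096)   /* assumed: 1024 shared pages */
    static void *dsm_base;

    /* User-level fault handler: runs whenever a thread touches a
     * protected shared page. */
    static void dsm_fault(int sig, siginfo_t *si, void *ctx)
    {
        void *page = (void *)((uintptr_t)si->si_addr & ~4095UL);
        /* ...fetch the page contents from its manager via the
         * message passing layer (sockets)... */
        mprotect(page, 4096, PROT_READ | PROT_WRITE);
        /* returning from the handler retries the faulting access */
    }

    int dsm_init(void)
    {
        struct sigaction sa = {0};
        sa.sa_sigaction = dsm_fault;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);      /* user-level fault handler */

        /* The shared region starts inaccessible, so every first
         * access faults into dsm_fault() above. */
        dsm_base = mmap(NULL, DSM_SIZE, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return dsm_base == MAP_FAILED ? -1 : 0;
    }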
Slide 9
Example: two processes sharing memory pages:

[Figure: Node 1 and Node 2, connected by a network, each hold some of the pages of the shared region at 0x1000.]

Slide 10
Occurrence of a read fault:

[Figure: a process on Node 1 reads an address in a page that is resident only on Node 2; the access faults ("Fault!").]

Slide 11
Page migration and replication:

[Figure: the faulted page is copied (replication) or moved (migration) from Node 2 to Node 1 over the network.]

Slide 12
Recovery from read fault:

[Figure: with the page now mapped at Node 1, the faulting instruction is resumed ("Resume").]
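The fault/fetch/resume sequence in Slides 10-12 is exactly what the user-level fault handler sketched after Slide 8 has to implement. A hedged continuation of that sketch for the read-fault case (node_of() is from the earlier sketch; dsm_fetch_page() is a hypothetical helper, not a real API):

    /* Hypothetical: ask 'holder' for the page bytes and copy them
     * into place at 'page' over the network. */
    extern void dsm_fetch_page(int holder, void *page);

    static void handle_read_fault(void *addr)
    {
        void *page = (void *)((uintptr_t)addr & ~4095UL);

        /* 1. Fault: work out whom to ask, make the frame writable. */
        int holder = node_of((uintptr_t)page);
        mprotect(page, 4096, PROT_READ | PROT_WRITE);

        /* 2. Migration/replication: install a copy of the page. */
        dsm_fetch_page(holder, page);

        /* 3. Recovery: downgrade to read-only so a later write faults
         *    again and can be detected separately; returning from the
         *    signal handler resumes the read. */
        mprotect(page, 4096, PROT_READ);
    }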
Slide 13
DSM MODELS
Shared page (coarse-grained):
➜ Traditional model
➜ Ideal page size? ✘ False sharing
➜ Examples: Ivy, TreadMarks

Shared region (fine-grained):
➜ More fine-grained than sharing pages
  ✔ Prevents false sharing
  ✘ Not regular memory access (transparency)
➜ Examples: CRL (C Region Library), MPI-2 one-sided communication, Shasta

Slide 14
Shared variable:
➜ Release- and entry-based consistency
➜ Annotations
  ✔ Fine-grained
  ✘ More complex for the programmer
➜ Examples: Munin, Midway

Shared structure:
➜ Encapsulate shared data
➜ Access only through predefined procedures (e.g., methods)
  ✔ Tightly integrated synchronisation
  ✔ Encapsulates (hides) the consistency model
  ✘ Loses the familiar shared memory model
➜ Examples: Orca (shared object), Linda (tuple space)

Slide 15
Tuple Space:

[Figure: a JavaSpace holding tuple instances. A and B insert copies of their tuples with Write; C issues Read with a template T, and the space looks for a tuple that matches T and returns it (and optionally removes it).]

Slide 16
LINDA EXAMPLE

    main() {
        ...
        eval("function", f());
        eval("function", f());
        ...
        for (i = 0; i < 100; i++)
            out("data", i);
        ...
    }

    f() {
        in("data", ?x);
        y = g(x);
        out("function", x, y);
    }

What's good about this?
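One reading of the example (not spelled out on the slide): eval deposits two active tuples whose function field is still being computed, which in effect forks two worker processes running f(); main then drops 100 "data" tuples into the space. Each worker's in("data", ?x) atomically withdraws one data tuple and binds x, and out("function", x, y) publishes the result as a passive tuple. Producer and consumers never name each other and need not run at the same time: the tuple space decouples them in space and time, and the matching semantics of in and out give synchronisation and mutual exclusion for free.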
Slide 17
REQUIREMENTS OF DSM
Transparency:
➜ Location, migration, replication, concurrency

Reliability:
➜ Computations depend on availability of data

Performance:
➜ Important in high-performance computing
➜ Important for transparency

Scalability:
➜ Important in wide-area settings
➜ Important for large computations

Slide 18
Consistency:
➜ Access to DSM should be consistent
➜ According to a consistency model

Programmability:
➜ Easy to program
➜ Communication transparency

Slide 19
APPLICATIONS OF DSM
➜ Scientific parallel computing
  • Bioinformatics (gene sequence analysis)
  • Simulations (climate modeling, economic modeling)
  • Data processing (physics, astronomy)
➜ Graphics (image processing, rendering)
➜ Data server (distributed FS, Web server)
➜ Data storage

Slide 20
DSM ENVIRONMENTS
➜ Multiprocessor
  • NUMA
➜ Multicomputer
  • Supercomputer
  • Cluster
  • Network of Workstations
  • Wide-area
Slide 21
CASE STUDY
TreadMarks:
➜ 1992, Rice University
➜ Page-based DSM library
➜ C, C++, Java, Fortran
➜ Lazy release consistency model
➜ Heterogeneous environment

Slide 22
DESIGN ISSUES
Granularity
➜ Page-based; page size: minimum system page size
Replication
➜ Lazy release consistency
Scalability
➜ Meant for a cluster or NOW (Network of Workstations)
Synchronisation primitives
➜ Locks (acquire and release), barrier
Heterogeneity
➜ Limited (doesn't address endianness or mismatched word sizes)
Fault tolerance
➜ Research
No security

Slide 23
USING TREADMARKS
Compiling:
➜ Compile
➜ Link with the TreadMarks libraries

Starting a TreadMarks application:

    app -- -h host1 -h host2 -h host3 -h host4

Slide 24
Anatomy of a TreadMarks program:
➜ Starting remote processes

    Tmk_startup(argc, argv);

➜ Allocating and sharing memory

    shared = (struct shared *) Tmk_malloc(sizeof(struct shared));
    Tmk_distribute(&shared, sizeof(shared));

➜ Barriers

    Tmk_barrier(0);

➜ Acquire/release

    Tmk_lock_acquire(0);
    shared->sum += mySum;
    Tmk_lock_release(0);
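Assembled, the fragments above might form a program like the following sketch (a parallel sum; Tmk_proc_id and Tmk_exit appear in the TreadMarks papers, but the exact signatures, the header name, and the struct here are assumptions):

    /* Sketch of a minimal TreadMarks-style program; illustrative only. */
    #include <stdio.h>
    #include "Tmk.h"                 /* TreadMarks header (assumed name) */

    struct shared { int sum; };
    static struct shared *shared;

    int main(int argc, char **argv)
    {
        Tmk_startup(argc, argv);     /* start remote processes */

        if (Tmk_proc_id == 0) {      /* process 0 allocates and shares */
            shared = (struct shared *) Tmk_malloc(sizeof(struct shared));
            shared->sum = 0;
            Tmk_distribute(&shared, sizeof(shared));
        }
        Tmk_barrier(0);              /* everyone waits for the pointer */

        int mySum = Tmk_proc_id + 1; /* stand-in for real local work */

        Tmk_lock_acquire(0);         /* serialise updates to shared->sum */
        shared->sum += mySum;
        Tmk_lock_release(0);

        Tmk_barrier(1);              /* wait until all updates are visible */
        if (Tmk_proc_id == 0)
            printf("sum = %d\n", shared->sum);

        Tmk_exit(0);                 /* assumed shutdown call */
        return 0;
    }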
Slide 25
TREADMARKS IMPLEMENTATION
Data Location:
➜ Know who has diffs because of invalidations
➜ Each page has a statically assigned manager

Modification Detection:
➜ Page fault
➜ If the page is read-only: run the consistency protocol
➜ If the page is not in local memory: get it from the manager

Slide 26
Initialisation:
➜ Processes set up communication channels between themselves
➜ Register a SIGIO handler for communication
➜ Allocate a large block of memory
  • Same (virtual) address on each machine
  • Mark it as non-accessible
  • Assign a manager process for each page, lock, and barrier (round robin)
➜ Register a SEGV handler

Communication:
➜ UDP/IP or AAL3/4 (ATM)
➜ Light-weight, user-level protocols to ensure message delivery
➜ SIGIO used for message receive notification

Slide 27
Consistency Protocol:
➜ Multiple writer
➜ Twins
➜ Reduce false sharing

[Figure: twinning and diffing on process P1, in four panels:
1. A write (x = 1) to a read-only (R) page holding x(0) causes a page fault.
2. After the fault, a twin copy of the page is made and the page becomes read-write (RW).
3. The write is executed: the page holds x(1), the twin still holds x(0).
4. At release or barrier time, the page is compared with its twin, yielding a diff (x: 0 → 1).]

Slide 28
Update Propagation:
➜ Modified pages are invalidated at acquire
➜ A page is updated at access time
➜ Updates are transferred as diffs

Lazy Diffs:
➜ Normally diffs are made at release time
➜ Lazy: make diffs only when they are requested

Memory Management:
➜ Garbage collection of diffs
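A hedged sketch of the twin/diff mechanism in the four panels above; the word-by-word comparison is the idea, but the (offset, value) diff encoding is invented for illustration (TreadMarks' actual run-length encoding differs):

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    #define PAGE_WORDS (4096 / sizeof(uint32_t))

    /* Panel 2: on the first write fault, save an unmodified copy
     * (the twin) before making the page writable. */
    uint32_t *make_twin(const uint32_t *page)
    {
        uint32_t *twin = malloc(4096);
        memcpy(twin, page, 4096);
        return twin;
    }

    /* Panel 4: at release/barrier, compare the page against its twin
     * and record only the words that changed -- this is the diff that
     * gets shipped instead of the whole page. */
    size_t make_diff(const uint32_t *page, const uint32_t *twin,
                     uint32_t *diff_out /* up to 2*PAGE_WORDS entries */)
    {
        size_t n = 0;
        for (size_t i = 0; i < PAGE_WORDS; i++) {
            if (page[i] != twin[i]) {        /* word modified since twin */
                diff_out[n++] = (uint32_t)i; /* offset within the page */
                diff_out[n++] = page[i];     /* new value */
            }
        }
        return n; /* number of uint32_t entries written */
    }

Because only modified words travel, two nodes writing disjoint words of the same page can merge their diffs, which is how the multiple-writer protocol reduces the cost of false sharing.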
Slide 29
READING LIST
➜ Distributed Shared Memory: A Survey of Issues and Algorithms
  An overview of DSM and key issues, as well as older DSM implementations.
➜ TreadMarks: Shared Memory Computing on Networks of Workstations
  An overview of TreadMarks, its design decisions, and its implementation.

Slide 30
HOMEWORK
Do Assignment 1!