



Distributed Shared Memory
Paul Krzyzanowski • Distributed Systems

Motivation
• SMP systems
  – Run parts of a program in parallel
  – Share a single address space
    • Share data in that space
  – Use threads for parallelism
  – Use synchronization primitives to prevent race conditions
• Can we achieve this with multicomputers?
  – All communication and synchronization must be done with messages

DSM
• Goal: allow networked computers to share memory
• How do you make a distributed memory system appear local?
• Physical memory on each node is used to hold pages of the shared virtual address space

Take advantage of the MMU
• The page table entry for a page is valid if the page is held (cached) locally
• An attempt to access a non-local page leads to a page fault
• Page fault handler
  – Invokes the DSM protocol to handle the fault
  – Brings the page in from the remote node
• Operations are transparent to the programmer
  – DSM looks like any other virtual memory system

Simplest design
• Each page of the virtual address space exists on only one machine at a time
  – No caching

Simplest design
• On a page fault:
  – Consult a central server (the directory) to find which machine currently holds the page
  – Request the page from the current owner:
    • Current owner invalidates its PTE
    • Sends the page contents
    • Recipient allocates a frame, reads the page, sets its PTE
    • Informs the directory of the new location
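To make the simplest design concrete, here is a minimal Python sketch of the protocol described above: one central directory records which node currently holds each page, and a fault on a non-local page migrates that page to the faulting node (no caching). The class and method names (CentralDirectory, Node.access, evict) are illustrative assumptions, not part of any real DSM implementation.

    # Minimal simulation of the "simplest design": each page lives on exactly
    # one node at a time; a central directory tracks the current owner.
    # All names here are illustrative; this is not a real DSM implementation.

    class CentralDirectory:
        def __init__(self):
            self.owner = {}                 # page number -> owning node

        def lookup(self, page):
            return self.owner[page]

        def update(self, page, node):       # a node informs the directory of the new location
            self.owner[page] = node

    class Node:
        def __init__(self, name, directory):
            self.name = name
            self.directory = directory
            self.pages = {}                 # locally held pages: page -> contents

        def access(self, page):
            if page not in self.pages:      # "page fault": page is not held locally
                owner = self.directory.lookup(page)
                contents = owner.evict(page)        # owner invalidates its PTE, sends contents
                self.pages[page] = contents         # allocate frame, install page, set PTE
                self.directory.update(page, self)   # inform directory of the new location
            return self.pages[page]

        def evict(self, page):
            return self.pages.pop(page)

    # Example: page 0x0004 starts on node P1; an access by P0 migrates it there.
    directory = CentralDirectory()
    p0, p1 = Node("P0", directory), Node("P1", directory)
    p1.pages[0x0004] = "data"
    directory.update(0x0004, p1)
    print(p0.access(0x0004))                # migrates the page from P1 to P0, prints "data"
    print(directory.lookup(0x0004).name)    # now "P0"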

Problem
• The directory becomes a bottleneck
  – All page query requests must go to this server
• Solution
  – Distributed directory
  – Distribute it among all processors
  – Each node is responsible for a portion of the address space
  – To find the responsible processor: hash(page#) mod #processors

Distributed Directory
  P0:  Page   Location        P1:  Page   Location
       0000   P3                   0001   P3
       0004   P1                   0005   P1
       0008   P1                   0009   P0
       000C   P2                   000D   P2
       …      …                    …      …

  P2:  Page   Location        P3:  Page   Location
       0002   P3                   0003   P3
       0006   P1                   0007   P1
       000A   P0                   000B   P2
       000E   --                   000F   --
       …      …                    …      …

Design Considerations: granularity
• Memory blocks are typically a multiple of a node's page size
  – To integrate with the VM system
• Large pages are good
  – The cost of migration is amortized over many localized accesses
• BUT
  – Large pages increase the chance that multiple objects reside in one page
    • Thrashing
    • False sharing

Design Considerations: replication
• What if we allow copies of shared pages on multiple nodes?
• Replication (caching) reduces the average cost of read operations
  – Simultaneous reads can be executed locally across hosts
• Write operations become more expensive
  – Cached copies need to be invalidated or updated
• Worthwhile if the ratio of reads to writes is high

Replication
• Multiple readers, single writer
  – One host can be granted a read-write copy
  – Or multiple hosts can be granted read-only copies

Replication
• Read operation:
  – If the block is not local
    • Acquire a read-only copy of the block
    • Set access rights to read-only on any writable copy on other nodes
• Write operation:
  – If the block is not local or there is no write permission
    • Revoke write permission from the other writable copy (if one exists)
    • Get a copy of the block from the owner (if needed)
    • Invalidate all copies of the block at other nodes
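The hash-based placement rule from the Problem slide above can be sketched in a few lines. With four processors and small integer page numbers (for which Python's hash() is the identity), it reproduces the partitioning in the Distributed Directory tables: P0 manages pages 0000, 0004, 0008, ...; P1 manages 0001, 0005, ...; and so on. The function name directory_node is made up for illustration.

    # Sketch of distributed-directory placement: each node manages the
    # directory entries for the pages that hash to it.
    NUM_PROCESSORS = 4

    def directory_node(page_number, num_processors=NUM_PROCESSORS):
        """Return the index of the processor responsible for this page's
        directory entry, using hash(page#) mod #processors."""
        return hash(page_number) % num_processors

    # With small non-negative integers, hash() is the identity, so the mapping
    # matches the tables above: P0 gets 0x0000, 0x0004, 0x0008, 0x000C, ...
    for page in (0x0000, 0x0001, 0x0004, 0x0009, 0x000A, 0x000F):
        print(f"page {page:04X} -> directory at P{directory_node(page)}")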

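The multiple-readers/single-writer rules from the Replication slides above can be sketched as a small state machine over copy modes: a read fault downgrades any writer and hands out a read-only copy, while a write fault revokes and invalidates every other copy. The BlockManager class and its method names are illustrative assumptions.

    # Sketch of multiple-readers / single-writer replication for one block.
    # The manager tracks, per block, which nodes hold copies and in what mode.

    class BlockManager:
        def __init__(self):
            self.copies = {}     # block -> {node: "ro" or "rw"}

        def read_fault(self, block, node):
            holders = self.copies.setdefault(block, {})
            for other, mode in holders.items():
                if mode == "rw":
                    holders[other] = "ro"   # downgrade the writer to read-only
            holders[node] = "ro"            # node acquires a read-only copy

        def write_fault(self, block, node):
            holders = self.copies.setdefault(block, {})
            for other in list(holders):
                if other != node:
                    del holders[other]      # revoke/invalidate all other copies
            holders[node] = "rw"            # node now holds the single writable copy

    mgr = BlockManager()
    mgr.read_fault("B", "P0")
    mgr.read_fault("B", "P1")        # two simultaneous readers are fine
    mgr.write_fault("B", "P2")       # the writer invalidates both read-only copies
    print(mgr.copies["B"])           # {'P2': 'rw'}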
Full replication
• Extend the model
  – Multiple hosts have read/write access
  – Need a multiple-readers, multiple-writers protocol
  – Access to shared data must be controlled to maintain consistency

Dealing with replication
• Keep track of copies of the page
  – A directory with a single node per page is not enough
  – Maintain a copyset
    • The set of all systems that requested copies
• Request for a page copy
  – Add the requestor to the copyset
  – Send the page contents
• Request to invalidate a page
  – Issue invalidation requests to all nodes in the copyset and wait for acknowledgements

Consistency Model
• Definition of when modifications to data may be seen at a given processor
• Defines how memory will appear to a programmer
  – Places restrictions on what values can be returned by a read of a memory location

Consistency Model
• Must be well understood
  – Determines how a programmer reasons about the correctness of a program
  – Determines what hardware and compiler optimizations may take place

Sequential Semantics
• Provided by most (uniprocessor) programming languages/systems
• Program order
• "The result of any execution is the same as if the operations of all processors were executed in some sequential order and the operations of each individual processor appear in this sequence in the order specified by the program." – Lamport

Sequential Semantics
• Requirements
  – All memory operations must execute one at a time
  – All operations of a single processor appear to execute in program order
  – Interleaving among processors is OK
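Here is a minimal sketch of the copyset bookkeeping described in "Dealing with replication" above, with the network reduced to direct method calls: every node that requests a copy is added to the page's copyset, and an invalidation visits each member of that set and counts the acknowledgements. All names are illustrative.

    # Sketch of copyset maintenance for a fully replicated page.

    class PageEntry:
        def __init__(self, contents):
            self.contents = contents
            self.copyset = set()            # all nodes that requested copies

    class Directory:
        def __init__(self):
            self.pages = {}                 # page -> PageEntry

        def request_copy(self, page, node):
            entry = self.pages[page]
            entry.copyset.add(node)             # add the requestor to the copyset
            node.cache[page] = entry.contents   # send the page contents
            return entry.contents

        def invalidate(self, page, writer):
            entry = self.pages[page]
            acks = 0
            for node in entry.copyset:      # issue invalidations to every copy holder
                if node is not writer:
                    node.cache.pop(page, None)
                    acks += 1               # ... and wait for each acknowledgement
            entry.copyset = {writer}
            return acks

    class Node:
        def __init__(self, name):
            self.name, self.cache = name, {}

    directory = Directory()
    directory.pages["X"] = PageEntry("v0")
    p0, p1, p2 = Node("P0"), Node("P1"), Node("P2")
    for n in (p0, p1, p2):
        directory.request_copy("X", n)
    print(directory.invalidate("X", p2))    # 2 acknowledgements before P2 may write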

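Lamport's definition above can be made concrete with a tiny checker that enumerates every global order preserving each processor's program order and collects the possible outcomes. The two-processor flag-and-read workload below is a hypothetical example, not from the slides; under sequential consistency it can yield (0,1), (1,0) or (1,1), but never (0,0).

    # Enumerate all sequentially consistent outcomes of two tiny programs.
    # P0:  x = 1;  r0 = y          P1:  y = 1;  r1 = x
    # Each interleaving must keep every processor's own program order (Lamport).
    from itertools import permutations

    P0 = [("w", "x"), ("r", "y", "r0")]
    P1 = [("w", "y"), ("r", "x", "r1")]

    def outcomes():
        results = set()
        ops = [("P0", i) for i in range(len(P0))] + [("P1", i) for i in range(len(P1))]
        for order in permutations(ops):
            # keep only interleavings that preserve each processor's program order
            if [i for p, i in order if p == "P0"] != list(range(len(P0))):
                continue
            if [i for p, i in order if p == "P1"] != list(range(len(P1))):
                continue
            mem, regs = {"x": 0, "y": 0}, {}
            for p, i in order:
                op = P0[i] if p == "P0" else P1[i]
                if op[0] == "w":
                    mem[op[1]] = 1          # write 1 to the named location
                else:
                    regs[op[2]] = mem[op[1]]    # read the location into a register
            results.add((regs["r0"], regs["r1"]))
        return results

    print(sorted(outcomes()))   # [(0, 1), (1, 0), (1, 1)]: (0, 0) is impossible under SC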
Sequential semantics
[Figure: processors P0 through P4 all issuing operations against a single shared memory]

Achieving sequential semantics
• The illusion is efficiently supported in uniprocessor systems
  – Execute operations in program order when they are to the same location or when one controls the execution of another
  – Otherwise, the compiler or hardware can reorder them
• Compiler:
  – Register allocation, code motion, loop transformation, …
• Hardware:
  – Pipelining, multiple issue, …

Achieving sequential consistency
• A processor must ensure that the previous memory operation is complete before proceeding with the next one
  – Program order requirement
• Determining completion of write operations
  – Get an acknowledgement from the memory system
  – If caching is used
    • The write operation must send invalidate or update messages to all cached copies
    • ALL of these messages must be acknowledged

Achieving sequential consistency
• All writes to the same location must be visible in the same order by all processes
  – Write atomicity requirement
• The value of a write will not be returned by a read until all updates/invalidates are acknowledged
  – Hold off on read requests until the write is complete
  – Totally ordered multicast

Improving performance
• Break the rules to achieve better performance
  – The compiler and/or the programmer should know what's going on!
• Relaxing sequential consistency
  – Weak consistency

Relaxed (weak) consistency
• Relax program order between all operations to memory
  – Reads/writes to different memory locations can be reordered
• Consider:
  – An operation in a critical section (on shared data)
  – One process reading/writing
  – Nobody else accessing it until the process leaves the critical section
• No need to propagate writes sequentially, or at all, until the process leaves the critical section
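The critical-section argument above is exactly what a relaxed model exploits: writes made while holding a lock only need to become visible when the process leaves the section. The sketch below buffers writes locally and propagates them in one step on exit; RelaxedNode and its methods are illustrative, not a real DSM API.

    # Sketch: buffer writes made inside a critical section and propagate them
    # only when the process leaves it, rather than one message per write.

    class RelaxedNode:
        def __init__(self, shared):
            self.shared = shared        # "remote" shared memory (a plain dict here)
            self.write_buffer = {}      # writes not yet visible to other nodes
            self.in_critical_section = False

        def enter(self):
            self.in_critical_section = True

        def write(self, var, value):
            # inside the critical section, nobody else may look: just buffer
            assert self.in_critical_section
            self.write_buffer[var] = value

        def leave(self):
            # propagate all buffered writes in a single step on exit
            self.shared.update(self.write_buffer)
            self.write_buffer.clear()
            self.in_critical_section = False

    shared_memory = {"a": 0, "b": 0}
    node = RelaxedNode(shared_memory)
    node.enter()
    node.write("a", 1)
    node.write("b", 2)
    print(shared_memory)    # {'a': 0, 'b': 0}: writes not yet propagated
    node.leave()
    print(shared_memory)    # {'a': 1, 'b': 2}: visible only after leaving the section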

Synchronization variable (barrier)
• An operation for synchronizing memory
• All local writes get propagated
• All remote writes are brought in to the local processor
• Block until memory is synchronized

Consistency guarantee
• Accesses to synchronization variables are sequentially consistent
  – All processes see them in the same order
• No access to a synchronization variable can be performed until all previous writes have completed
• No read or write is permitted until all previous accesses to synchronization variables have been performed
  – Memory is updated

Problems with weak consistency
• Inefficiency
  – Synchronization is ambiguous: did the process just finish its memory accesses, or is it about to start them?
• The system must make sure that
  – All locally-initiated writes have completed
  – All remote writes have been acquired

Can we do better?
• Separate synchronization into two stages:
  1. Acquire access
     – Obtain valid copies of pages
  2. Release access
     – Send invalidations for shared pages that were modified locally to the nodes that have copies

      acquire(R)    // start of critical section
      Do stuff
      release(R)    // end of critical section

• Eager Release Consistency (ERC)

Let's get lazy
• Release requires
  – Sending invalidations to copyset nodes
  – And waiting for all of them to acknowledge
• Delay this process
• On release:
  – Send the invalidation only to the directory
• On acquire:
  – Check with the directory to see whether a new copy is needed
  – Chances are that not every node will need to do an acquire
• Reduces message traffic on releases
• Lazy Release Consistency (LRC)

Finer granularity
• Release consistency
  – Synchronizes all data
  – No relation between lock and data
• Use object granularity instead of page granularity
  – Each variable or group of variables can have a synchronization variable
  – Propagate only the writes performed in those sections
  – Cannot rely on the OS and MMU anymore
    • Need smart compilers
• Entry Consistency
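The difference between ERC and LRC above comes down to when the invalidations travel. In this sketch, assuming a fixed copyset and a directory represented by a set, the eager release sends one invalidation per copy holder at release time, while the lazy release only notifies the directory and lets each later acquirer discover that its copy is stale.

    # Sketch contrasting eager and lazy release consistency for one page.

    copyset = {"P1", "P2", "P3"}      # nodes holding copies of the page
    stale_at_directory = set()        # pages the directory knows were modified

    def eager_release(page):
        # ERC: invalidate every copy holder at release time and wait for acks
        messages = [(node, "invalidate", page) for node in sorted(copyset)]
        acks = len(messages)          # pretend each message is acknowledged
        return messages, acks

    def lazy_release(page):
        # LRC: just tell the directory; no per-node traffic on release
        stale_at_directory.add(page)
        return [("directory", "modified", page)], 1

    def lazy_acquire(page, node):
        # a later acquire checks with the directory and refetches only if stale
        if page in stale_at_directory:
            return f"{node} fetches a fresh copy of {page}"
        return f"{node} keeps its cached copy of {page}"

    print(eager_release("X"))         # 3 invalidation messages at release
    print(lazy_release("X"))          # 1 message to the directory
    print(lazy_acquire("X", "P2"))    # only acquiring nodes pay the cost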

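Entry consistency, mentioned under "Finer granularity" above, ties each synchronization variable to a specific set of shared objects so that acquiring it only brings that data up to date. The sketch below is a loose illustration under that assumption: GuardedLock, master_copy and the variable names are all made up, and the "network" is just a dict.

    # Sketch of entry consistency: each lock guards an explicit set of variables,
    # and acquiring the lock updates only those variables, not the whole memory.

    master_copy = {"balance": 100, "log": [], "stats": {"hits": 7}}

    class GuardedLock:
        def __init__(self, guarded_vars):
            self.guarded_vars = guarded_vars     # variables this lock protects

        def acquire(self, local_memory):
            # propagate only the guarded variables into the local copy
            for var in self.guarded_vars:
                local_memory[var] = master_copy[var]

        def release(self, local_memory):
            # push back only the guarded variables
            for var in self.guarded_vars:
                master_copy[var] = local_memory[var]

    balance_lock = GuardedLock(["balance"])      # lock tied to 'balance' only
    local = {}
    balance_lock.acquire(local)                  # fetches 'balance', leaves 'stats' alone
    local["balance"] += 25
    balance_lock.release(local)
    print(master_copy["balance"])                # 125
    print("stats" in local)                      # False: unrelated data never moved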