Distributed Shared Memory (DSM)
Robert Gasparyan, Angela Gong, Judson Wilson
CS 240, Spring 2015
What is DSM?
• Physically separate memories addressed as one shared address space
• Memory shared on a page-by-page basis
[Figure: four CPUs, each with its own memory, connected by a network and presented as one virtual memory]
Problem
• Make use of multiple machines
• Manage dependencies
[Figure: Problem → threads on four CPUs → Result]
DSM: Simple Interface
[Figure: Thread A and Thread B, each on its own CPU, accessing one shared memory]
Consistency Model
[Figure: Thread A and Thread B, each on its own CPU with local memory, read/write (R/W) a shared region]
Release Consistency
• Critical sections protected by the same lock execute sequentially
• All changes from previously protected regions are guaranteed to be visible
• Saves network traffic because changes need not be synchronized until the lock is released
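The visibility rule above can be illustrated with a minimal single-process sketch. It is not the project's code: `master_region`, `worker_view`, `rc_acquire`, and `rc_release` are hypothetical names, the "network" is a plain `memcpy`, and there is no blocking because only one worker runs at a time.

```c
#include <assert.h>
#include <string.h>

#define REGION_SIZE 4

/* Hypothetical stand-in for the master copy of a lock-protected region. */
typedef struct {
    int data[REGION_SIZE];
    int held;                 /* 1 while some worker holds the lock */
} master_region;

/* Hypothetical per-worker view: a private local copy of the region. */
typedef struct {
    int local[REGION_SIZE];
} worker_view;

/* Acquire: take the lock and pull the latest released contents. */
static void rc_acquire(master_region *m, worker_view *w) {
    assert(!m->held);         /* single-process sketch, so no waiting */
    m->held = 1;
    memcpy(w->local, m->data, sizeof m->data);
}

/* Release: publish the local changes, then give up the lock. */
static void rc_release(master_region *m, worker_view *w) {
    assert(m->held);
    memcpy(m->data, w->local, sizeof m->data);
    m->held = 0;
}
```

If worker A acquires, writes `a.local[0] = 42`, and releases, the master copy only changes at the release; a worker B that acquires afterwards starts with `b.local[0] == 42`. Nothing is transferred between the acquire and the release, which is where the traffic saving comes from.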
Design and Implementation
Deployment of Binaries
[Figure: a server deploys binary A to four machines, each running a daemon]
Future capability:
Master: Locks and Page Info
[Figure: the master holds page info (example label "v: 28, o: 3") and is connected to Workers 0, 1, 2, and 3]
Locks
Lock requests:
• Spawn thread
• Wait on the lock
• Reply after acquired
Similar for unlock
[Figure: a worker exchanging lock messages with the master]
Page Servers: Page Contents
• Background thread that distributes page data upon request
• All workers have a page server
[Figure: four workers, each with its own page server]
Patches
• Implicitly locked memory can be smaller than a page
• Diff-patches are used to merge modifications
[Figure: Acquire → Modify → Release, shown with a diff (- / +)]
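A minimal sketch of how diff-patches allow two workers to modify different parts of the same page. The slides do not describe the patch format, so the byte-wise comparison against a saved original below is an assumption, and `patch_entry`, `make_patch`, and `apply_patch` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 16   /* tiny "page" for illustration only */

/* One changed byte: its offset and new value (hypothetical format). */
typedef struct {
    size_t offset;
    unsigned char value;
} patch_entry;

/* Record every byte that differs between the saved original and the
 * modified page. Returns the number of entries written to out, which
 * must have room for PAGE_BYTES entries. */
static size_t make_patch(const unsigned char *original,
                         const unsigned char *modified,
                         patch_entry *out) {
    size_t n = 0;
    for (size_t i = 0; i < PAGE_BYTES; i++) {
        if (original[i] != modified[i]) {
            out[n].offset = i;
            out[n].value = modified[i];
            n++;
        }
    }
    return n;
}

/* Apply a patch to another copy of the page, touching only changed bytes. */
static void apply_patch(unsigned char *page, const patch_entry *patch, size_t n) {
    for (size_t i = 0; i < n; i++) {
        assert(patch[i].offset < PAGE_BYTES);
        page[patch[i].offset] = patch[i].value;
    }
}
```

If one worker changes byte 1 and another changes byte 9 of the same page, applying both patches to a third copy keeps both changes, whereas copying whole pages would let the second copy overwrite the first.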
Transparent Interface
Fault handlers, lock/unlock:
• Lock starts protecting pages (lazily)
• Fault handler catches the first read/write
  ◦ 1st fault: upgrade to read access → get latest version
  ◦ 2nd fault: upgrade to write access → mark modified
• Unlock pulls versions, merges modified pages, and creates a new version
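The two-step fault upgrade can be sketched with `mmap`, `mprotect`, and a `SIGSEGV` handler. The slides do not name the calls the project used, so this is an assumption, and `protect_region`, `on_fault`, and the flag variables are hypothetical. The sketch assumes Linux, handles one page, and leaves out fetching the latest version. `mprotect` is not on the POSIX list of async-signal-safe functions, so calling it from a handler relies on Linux behavior.

```c
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *region;                   /* one protected page */
static size_t page_size;
static volatile sig_atomic_t has_read_access;   /* set by the 1st fault */
static volatile sig_atomic_t is_modified;       /* set by the 2nd fault */
static volatile sig_atomic_t fault_count;

/* SIGSEGV handler: upgrade the page one step per fault. */
static void on_fault(int sig, siginfo_t *info, void *ctx) {
    (void)sig;
    (void)ctx;
    unsigned char *addr = (unsigned char *)info->si_addr;
    if (addr < region || addr >= region + page_size)
        _exit(1);                               /* unrelated crash: give up */
    fault_count++;
    if (!has_read_access) {
        /* 1st fault: a real system would fetch the latest version here. */
        mprotect(region, page_size, PROT_READ);
        has_read_access = 1;
    } else {
        /* 2nd fault: allow writes and remember the page is modified. */
        mprotect(region, page_size, PROT_READ | PROT_WRITE);
        is_modified = 1;
    }
}

/* Map one inaccessible page and install the handler. Returns 0 on success. */
static int protect_region(void) {
    struct sigaction sa;
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region = mmap(NULL, page_size, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_fault;
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    has_read_access = 0;
    is_modified = 0;
    fault_count = 0;
    return 0;
}
```

After `protect_region()` succeeds, a read of `region[0]` faults once and gains read access; a later write faults a second time and sets `is_modified`. A page that is only read never takes the second fault, so unlock can skip it when merging.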
Transparent Shared Memory
• Across processes
• Across network (DSM)
Benchmarks
Benchmark Setup
• Compare performance of a single machine versus DSM
[Figure: two configurations, Single Machine and Using DSM]
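Matrix Multiplication
$$
\begin{pmatrix} -9 & 7 & -1 \\ 6 & -5 & 2 \\ 6 & -4 & 1 \end{pmatrix}
\begin{pmatrix} 1 & -1 & 3 \\ 2 & -1 & 4 \\ 2 & 2 & 1 \end{pmatrix}
=
\begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix}
$$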
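A minimal sketch of how a matrix product can be divided among workers. The slides do not say how the benchmark partitioned its work; splitting by output rows is an assumption here, and `multiply_rows` is a hypothetical helper.

```c
#include <assert.h>

#define N 3

/* Compute rows [row_begin, row_end) of c = a * b. Each worker would be
 * given a disjoint row range, so no two workers write the same output row. */
static void multiply_rows(int a[N][N], int b[N][N], int c[N][N],
                          int row_begin, int row_end) {
    assert(0 <= row_begin && row_begin <= row_end && row_end <= N);
    for (int i = row_begin; i < row_end; i++) {
        for (int j = 0; j < N; j++) {
            int sum = 0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
    }
}
```

With the slide's two 3×3 input matrices, one worker calling `multiply_rows(a, b, c, 0, 2)` and another calling `multiply_rows(a, b, c, 2, 3)` together fill `c` with 3 on the diagonal and 0 elsewhere. Under DSM the inputs are only read, and each output row is written by exactly one worker.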
Matrix Multiplication
[Plots: y-axis in 1/ms versus Number of Cores; panels: Single Machine (0 to 20 cores) and Using DSM (0 to 10 cores)]
Word Count
Donald Ervin Knuth: The Art of Computer Programming: Generating all Combinations and Partitions
Word Count
[Plots: y-axis in 1/ms versus Number of Cores; panels: Single Machine (0 to 20 cores) and Using DSM (0 to 10 cores)]
Demo!
Conclusion
• We made transparent DSM!
• Focus: correctness first, then scalability
• Tedious:
  ○ C data structures
  ○ Message passing / handling
Questions?
Bonus: Correctness Test
• Invariant: Sum Shared = Sum Private
[Figure: Workers 0, 1, and 2 applying increments (+1, +2) to shared buckets and private buckets]
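A minimal sketch of the invariant check, assuming each increment is added both to a shared bucket and to the worker's private tally. The names (`bucket_test`, `worker_add`, `invariant_holds`) and the bucket counts are hypothetical. It runs in one process without locks, so it shows only the bookkeeping, not the concurrency the real test exercises.

```c
#include <assert.h>

#define NUM_WORKERS 3
#define NUM_BUCKETS 4

typedef struct {
    long shared_buckets[NUM_BUCKETS];   /* would live in DSM, updated under a lock */
    long private_totals[NUM_WORKERS];   /* one local tally per worker */
} bucket_test;

/* One step by one worker: add amount to a shared bucket and mirror it locally. */
static void worker_add(bucket_test *t, int worker, int bucket, long amount) {
    assert(worker >= 0 && worker < NUM_WORKERS);
    assert(bucket >= 0 && bucket < NUM_BUCKETS);
    t->shared_buckets[bucket] += amount;
    t->private_totals[worker] += amount;
}

/* Returns 1 when the sum of shared buckets equals the sum of private tallies. */
static int invariant_holds(const bucket_test *t) {
    long shared_sum = 0, private_sum = 0;
    for (int i = 0; i < NUM_BUCKETS; i++)
        shared_sum += t->shared_buckets[i];
    for (int i = 0; i < NUM_WORKERS; i++)
        private_sum += t->private_totals[i];
    return shared_sum == private_sum;
}
```

A lost or duplicated update to a shared bucket makes the two sums differ, so `invariant_holds` returning 0 at the end of a run points to a consistency bug.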
Bonus: Correctness Test
[Figure labels: Nested Transfer Nested Increment All Counts Serial Transfer 1 Count]