Distributed Shared Memory (DSM) Robert Gasparyan, Angela Gong, Judson Wilson CS 240, Spring 2015
What is DSM?
• Physically separate memory addressed as one shared address space
• Memory shared on a page-by-page basis
[Figure: four CPUs, each with its own memory, connected by a network and presented as one virtual memory]
Problem
• Make use of multiple machines
• Manage dependencies
[Figure: threads running on four CPUs, with open questions ("?") about how they combine into one result]
DSM: Simple Interface
[Figure: Thread A and Thread B, each on its own CPU, accessing one shared memory]
Consistency Model
[Figure: Thread A and Thread B, each on its own CPU with local memory, reading and writing (R/W) a shared region]
Release Consistency
• Critical sections protected by the same lock execute sequentially
• All changes made in earlier critical sections are guaranteed to be visible when the lock is acquired
• Saves network traffic because nodes do not need to synchronize until the lock is released
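The release-consistency bullets can be illustrated with a minimal single-process sketch. It is not the presenters' implementation: the names `Home`, `Node`, `acquire`, `write`, and `release` are hypothetical, the "network" is just a message counter, and mutual exclusion itself is left out so that only the visibility rule is shown.

```python
class Home:
    """Stands in for the authoritative copy of the shared region (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.messages = 0  # counts simulated network messages


class Node:
    """One machine with a local copy of the shared region (hypothetical)."""

    def __init__(self, home):
        self.home = home
        self.local = {}
        self.dirty = set()

    def acquire(self):
        # Pull everything published by earlier critical sections.
        self.local = dict(self.home.data)
        self.dirty.clear()
        self.home.messages += 1

    def write(self, key, value):
        # Purely local: no traffic while inside the critical section.
        self.local[key] = value
        self.dirty.add(key)

    def read(self, key):
        return self.local.get(key)

    def release(self):
        # Publish all buffered writes in one simulated message.
        for key in self.dirty:
            self.home.data[key] = self.local[key]
        self.dirty.clear()
        self.home.messages += 1


home = Home()
a, b = Node(home), Node(home)

a.acquire()
a.write("x", 1)
a.write("y", 2)
stale = b.read("x")  # b has not acquired yet, so it sees no value
a.release()

b.acquire()
fresh = (b.read("x"), b.read("y"))
b.release()
```

In this sketch the two writes by `a` cost no messages of their own; only the acquire and release on each node are counted, which is the traffic saving the slide refers to.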
Design and Implementation
Deployment of Binaries
[Figure: a server deploying binary A to four machines, each running a daemon]
Future capability:
Master: Locks and Page Info
[Figure: the master, holding a page entry labeled "v: 28, o: 3", connected to Workers 0, 1, 2, and 3]
Locks
Lock requests:
• Spawn thread
• Wait on the lock
• Reply after acquired
Similar for unlock
[Figure: a worker exchanging lock messages with the master]
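The lock-request flow (spawn a thread, wait, reply once acquired) can be sketched as follows. This is an illustration under stated assumptions, not the presenters' code: `LockMaster`, `handle_lock_request`, and `handle_unlock_request` are hypothetical names, and a `queue.Queue` stands in for the network reply channel to a worker.

```python
import queue
import threading


class LockMaster:
    """Hypothetical master that serializes access to named locks."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()  # protects the lock table itself

    def _get(self, lock_id):
        with self._guard:
            return self._locks.setdefault(lock_id, threading.Lock())

    def handle_lock_request(self, lock_id, reply):
        # Spawn a thread so a blocked request does not stall the master.
        def serve():
            self._get(lock_id).acquire()    # wait on the lock
            reply.put(("granted", lock_id))  # reply after acquired

        thread = threading.Thread(target=serve, daemon=True)
        thread.start()
        return thread

    def handle_unlock_request(self, lock_id, reply):
        # threading.Lock may be released by a thread other than the acquirer.
        self._get(lock_id).release()
        reply.put(("released", lock_id))


master = LockMaster()
worker0, worker1 = queue.Queue(), queue.Queue()

master.handle_lock_request(7, worker0)
first = worker0.get(timeout=1)

master.handle_lock_request(7, worker1)
blocked = worker1.empty()  # worker0 still holds lock 7, so no grant yet

master.handle_unlock_request(7, worker0)
unlocked = worker0.get(timeout=1)
second = worker1.get(timeout=1)
```

One thread per pending request keeps the sketch short; a real master would more likely keep an explicit wait queue per lock.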
Page Servers: Page Contents
• Background thread that distributes page data upon request
• All workers have a page server
[Figure: four workers, each with its own page server]
Patches
• Implicitly locked memory can be smaller than a page.
• We used diff-patches to merge modifications.
[Figure: Acquire, Modify, Release, with "-" and "+" diff markers]
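One way to picture diff-patch merging is a byte-level diff against a copy of the page taken at acquire time. The sketch below assumes that approach; `make_patch`, `apply_patch`, and the 16-byte `PAGE_SIZE` are hypothetical and chosen only to keep the example small.

```python
PAGE_SIZE = 16  # deliberately tiny, for illustration only


def make_patch(twin, current):
    """Return (offset, data) runs where `current` differs from `twin`."""
    patch, i, n = [], 0, len(twin)
    while i < n:
        if twin[i] != current[i]:
            start = i
            while i < n and twin[i] != current[i]:
                i += 1
            patch.append((start, bytes(current[start:i])))
        else:
            i += 1
    return patch


def apply_patch(page, patch):
    """Overwrite only the byte ranges recorded in the patch."""
    for offset, data in patch:
        page[offset:offset + len(data)] = data


base = bytearray(PAGE_SIZE)          # page contents at acquire time

page_a = bytearray(base)
page_a[0:4] = b"AAAA"                # worker A modifies the first 4 bytes

page_b = bytearray(base)
page_b[8:12] = b"BBBB"               # worker B modifies a different range

patch_a = make_patch(base, page_a)
patch_b = make_patch(base, page_b)

merged = bytearray(base)
apply_patch(merged, patch_a)
apply_patch(merged, patch_b)
```

Because each patch touches only the bytes its worker changed, two workers can modify disjoint parts of the same page without overwriting each other.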
Transparent Interface
Fault handlers, lock/unlock:
• starts (lazy) to protect pages
• fault to catch first read/write
  ◦ 1st: Upgrade to access → get latest version
  ◦ 2nd: Upgrade to → mark modified
• does
  ◦ Pull versions, merge modified pages, create new version
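The two-step fault upgrade can be modeled as a small state machine. This is a simulation only, with no real page protection or signal handling; the access levels `NONE`, `READ`, and `WRITE`, and the names `Page`, `on_fault`, and `protect`, are assumptions made for the sketch rather than identifiers from the slides.

```python
from enum import Enum


class Access(Enum):
    NONE = 0   # page is protected; any access faults
    READ = 1   # reads succeed, writes fault
    WRITE = 2  # reads and writes succeed


class Page:
    def __init__(self):
        self.access = Access.NONE
        self.version = 0
        self.modified = False


def protect(page):
    """Reset a page so that the next access faults again."""
    page.access = Access.NONE
    page.modified = False


def on_fault(page, latest_version):
    """Handle one simulated fault and return the new access level."""
    if page.access is Access.NONE:
        # 1st fault: fetch the latest version and allow reads.
        page.version = latest_version
        page.access = Access.READ
    elif page.access is Access.READ:
        # 2nd fault: a write was attempted, so mark the page modified.
        page.modified = True
        page.access = Access.WRITE
    return page.access


page = Page()
protect(page)
first = on_fault(page, latest_version=28)   # first touch of the page
second = on_fault(page, latest_version=28)  # retried write faults again
```

A write to a fully protected page passes through both steps in this model: the first fault brings the page up to date, and the retried write triggers the second fault that marks it modified.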
Transparent Shared Memory
• Across processes
• Across network (DSM)
Benchmarks
Benchmark Setup
• Compare performance of a single machine versus DSM.
[Figure: two configurations, "Single Machine" and "Using DSM"]
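Matrix Multiplication
$$
\begin{pmatrix} -9 & 7 & -1 \\ 6 & -5 & 2 \\ 6 & -4 & 1 \end{pmatrix}
\begin{pmatrix} 1 & -1 & 3 \\ 2 & -1 & 4 \\ 2 & 2 & 1 \end{pmatrix}
=
\begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix}
$$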
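A small sketch of how such a product can be split across workers, using the matrices from the Matrix Multiplication slide. The row-wise partitioning and the names `row_times_matrix` and `parallel_matmul` are assumptions for illustration; the slides do not say how the benchmark divided the work, and local threads stand in for DSM workers here.

```python
from concurrent.futures import ThreadPoolExecutor

A = [[-9, 7, -1],
     [6, -5, 2],
     [6, -4, 1]]
B = [[1, -1, 3],
     [2, -1, 4],
     [2, 2, 1]]


def row_times_matrix(row, matrix):
    """Compute one output row: row (1 x n) times matrix (n x m)."""
    inner = range(len(matrix))
    cols = range(len(matrix[0]))
    return [sum(row[k] * matrix[k][j] for k in inner) for j in cols]


def parallel_matmul(a, b, workers):
    """Give each row of `a` to a worker; output rows are independent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: row_times_matrix(row, b), a))


C = parallel_matmul(A, B, workers=3)
```

Each output row depends only on one row of `A` and all of `B`, so workers never write to the same result row, which is why this workload partitions cleanly.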
Matrix Multiplication
[Figure: two plots, "Single Machine" and "Using DSM"; y-axis: 1/ms, x-axis: Number of Cores]
Word Count
Donald Ervin Knuth: The Art of Computer Programming: Generating all Combinations and Partitions
Word Count
[Figure: two plots, "Single Machine" and "Using DSM"; y-axis: 1/ms, x-axis: Number of Cores]
Demo!
Conclusion
• We made transparent DSM!
• Focus: Correctness first, then scalability
• Tedious:
  ○ C data structures
  ○ Message passing / handling
Questions?
Bonus: Correctness Test
[Figure: Workers 0, 1, and 2 applying +1 and +2 updates to shared buckets and private buckets]
Invariant: Sum Shared = Sum Private
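The invariant can be checked with a test of roughly this shape. The sketch runs local threads rather than DSM workers, and the details (three workers, four buckets, one global lock, random +1/+2 updates) are assumptions filled in for illustration; only the invariant itself comes from the slide.

```python
import random
import threading

NUM_WORKERS = 3
NUM_BUCKETS = 4
ROUNDS = 1000

shared = [0] * NUM_BUCKETS                      # updated under the lock
shared_lock = threading.Lock()
private = [[0] * NUM_BUCKETS for _ in range(NUM_WORKERS)]  # one row per worker


def worker(worker_id):
    rng = random.Random(worker_id)              # deterministic per worker
    for _ in range(ROUNDS):
        bucket = rng.randrange(NUM_BUCKETS)
        amount = rng.choice((1, 2))
        with shared_lock:                       # critical section
            shared[bucket] += amount
        private[worker_id][bucket] += amount    # no lock needed: private row


threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

sum_shared = sum(shared)
sum_private = sum(sum(row) for row in private)
```

If an update to the shared buckets were lost or applied twice, the two sums would diverge, so the equality check is a cheap detector for broken locking or merging.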
Bonus: Correctness Test
[Figure; surviving labels: "Nested Transfer Nested Increment All Counts Serial Transfer 1 Count"]