WHAT NON-VOLATILE MEMORY MEANS FOR THE FUTURE OF DATABASE SYSTEMS @ANDY_PAVLO
1973
1974
1978
1986
1994
2010
2016
The Future
Non-Volatile Memory
• Persistent storage with byte-addressable operations.
• Fast read/write latencies.
• No difference between random vs. sequential access.
What does NVM mean for DBMSs?
• Thinking of NVM as just a faster SSD is not interesting.
• We want to use NVM as permanent storage for the database, but this has major implications.
– Operating System Support
– Cloud Provider Provisioning
– Database Management System Architectures
Existing Systems | NVM-Only Storage | Hybrid DBMS
Chapter I – Existing Systems
• Investigate how existing systems perform with NVM for write-heavy transaction processing (OLTP) workloads.
• Evaluate two types of DBMS architectures.
– Disk-oriented (MySQL)
– In-Memory (H-Store)
A PROLEGOMENON ON OLTP DATABASE SYSTEMS FOR NON-VOLATILE MEMORY, ADMS@VLDB 2015
[Figure: DISK-ORIENTED (Buffer Pool, Table Heap, Log, Snapshots) vs. IN-MEMORY (Table Heap, Log, Snapshots)]
Intel Labs NVM Emulator
• Instrumented motherboard that slows down access to the memory controller with tunable latencies.
• Special assembly to emulate upcoming Xeon instructions for flushing cache lines.
[Figure labels: STORE, L1 Cache, L2 Cache, PCOMMIT]
Experimental Evaluation
• Compare architectures on Intel Labs NVM emulator.
• Yahoo! Cloud Serving Benchmark:
– 10 million records (~10GB)
– 8x database / memory
– Variable skew
YCSB // Read-Only Workload (2x Latency Relative to DRAM)
[Chart: throughput (TXN/SEC, 0 to 200,000) vs. skew amount (high → low) for H-Store and MySQL]
YCSB // 50% Reads / 50% Writes Workload (2x Latency Relative to DRAM)
[Chart: throughput (TXN/SEC, 0 to 40,000) vs. skew amount (high → low) for H-Store and MySQL; chart annotation: 8x Latency]
LESSONS
1. NVM latency does not have a large impact.
2. Logging is a major performance bottleneck.
3. Legacy DBMSs are not prepared to run on NVM.
What would Larry Ellison do?
Chapter II – NVM-only Storage
• Evaluate storage and recovery methods for a system that only has NVM.
• Testbed DBMS with pluggable storage engines.
• We had to build our own NVM-aware memory allocator.
LET'S TALK ABOUT STORAGE & RECOVERY METHODS FOR NON-VOLATILE MEMORY DATABASE SYSTEMS, SIGMOD 2015
DBMS Architectures
• In-Place: Table Heap, Log + Snapshots
• Copy-on-Write: Table Heap, No Logging
• Log-Structured: No Table Heap, Log-only Storage
In-Place Engine
UPDATE table SET val=ABC WHERE id=123
[Figure: on NVM, (1) a Delta Record goes to the Log, (2) the New Tuple goes to the Table Heap, (3) the New Tuple goes to the Snapshots]
NVM-Optimized Architectures
• Use non-volatile pointers to only record what changed rather than how it changed.
• Be careful about how & when things get flushed from CPU caches to NVM.
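The point about how and when data gets flushed can be illustrated with a minimal sketch. It assumes a file-backed `mmap` region stands in for NVM and that `mmap.flush()` stands in for a cache-line flush plus commit; real NVM code would use CPU flush instructions instead. The record layout, the offsets, and the function names (`persist`, `durable_update`, `read_if_committed`, `demo`) are hypothetical and not taken from the talk.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<Q8s")  # hypothetical tuple layout: 8-byte id, 8-byte value
VALID_OFF = 0                   # one-byte "committed" flag
SLOT_OFF = 64                   # the tuple lives at this offset in the region


def persist(region):
    """Stand-in for a cache-line flush: push dirty pages to the backing file."""
    region.flush()


def durable_update(region, tuple_id, value):
    # Step 1: write the new tuple and make it durable first.
    RECORD.pack_into(region, SLOT_OFF, tuple_id, value)
    persist(region)
    # Step 2: only then publish it by setting the flag, and flush again.
    region[VALID_OFF] = 1
    persist(region)


def read_if_committed(region):
    # A reader (or recovery) trusts the tuple only if the flag is set.
    if region[VALID_OFF] != 1:
        return None
    return RECORD.unpack_from(region, SLOT_OFF)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nvm.img")
        with open(path, "wb") as f:
            f.write(b"\x00" * mmap.PAGESIZE)
        with open(path, "r+b") as f:
            region = mmap.mmap(f.fileno(), mmap.PAGESIZE)
            before = read_if_committed(region)
            durable_update(region, 123, b"ABC".ljust(8, b"\x00"))
            region.close()
        # Re-map the file to mimic a restart.
        with open(path, "r+b") as f:
            region = mmap.mmap(f.fileno(), mmap.PAGESIZE)
            after = read_if_committed(region)
            region.close()
    return before, after
```

The ordering is the part that matters: if the flag could reach persistent storage before the tuple bytes, a crash in between would expose a half-written tuple as committed.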
NVM-Aware In-Place Engine
UPDATE table SET val=ABC WHERE id=123
[Figure: (1) the New Tuple goes to the Table Heap, (2) a Log Record holding a TxnId and a Pointer goes to the Log; the Log holds Tuple Pointers]
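A toy sketch of why a pointer-based log record writes less than a log record that copies the tuple. It only models byte counts: `ToyNVM`, the 8-byte TxnId and pointer widths, and every function name are assumptions made for illustration, not the testbed's actual design.

```python
TXN_ID_SIZE = 8   # assumed width of a transaction id, in bytes
POINTER_SIZE = 8  # assumed width of a non-volatile pointer, in bytes


class ToyNVM:
    """Toy persistent region: a bump allocator that counts bytes stored."""

    def __init__(self):
        self.cells = {}
        self.next_addr = 0
        self.bytes_stored = 0

    def store(self, payload):
        addr = self.next_addr
        self.cells[addr] = payload
        self.next_addr += len(payload)
        self.bytes_stored += len(payload)
        return addr

    def load(self, addr):
        return self.cells[addr]


def traditional_update(nvm, heap, log, txn_id, key, new_tuple):
    # The log record carries a full copy of the new tuple contents...
    record = txn_id.to_bytes(TXN_ID_SIZE, "little") + new_tuple
    log.append(nvm.store(record))
    # ...and the same bytes are written again into the table heap.
    heap[key] = nvm.store(new_tuple)


def nvm_aware_update(nvm, heap, log, txn_id, key, new_tuple):
    # The tuple is written once.
    tuple_addr = nvm.store(new_tuple)
    # The log record holds only the TxnId and a pointer to that tuple.
    record = (txn_id.to_bytes(TXN_ID_SIZE, "little")
              + tuple_addr.to_bytes(POINTER_SIZE, "little"))
    log.append(nvm.store(record))
    heap[key] = tuple_addr


def tuple_from_log(nvm, log_addr):
    # Follow the pointer stored in a log record back to the tuple.
    record = nvm.load(log_addr)
    start = TXN_ID_SIZE
    tuple_addr = int.from_bytes(record[start:start + POINTER_SIZE], "little")
    return nvm.load(tuple_addr)


def demo(tuple_size=1000):
    new_tuple = b"x" * tuple_size
    old_nvm, old_heap, old_log = ToyNVM(), {}, []
    traditional_update(old_nvm, old_heap, old_log, 1, 123, new_tuple)
    new_nvm, new_heap, new_log = ToyNVM(), {}, []
    nvm_aware_update(new_nvm, new_heap, new_log, 1, 123, new_tuple)
    return old_nvm.bytes_stored, new_nvm.bytes_stored
```

In this model the traditional path stores the tuple bytes twice plus the TxnId, while the pointer-based path stores them once plus a fixed-size record, so the gap grows with tuple size.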
Evaluation
• Testbed system using the Intel NVM hardware emulator.
• Yahoo! Cloud Serving Benchmark
– 2 million records + 1 million transactions
– High-skew setting
YCSB // 10% Reads / 90% Writes Workload (2x Latency Relative to DRAM)
[Chart: throughput (TXN/SEC, 0 to 1,200,000) for Traditional vs. NVM-Optimized versions of the In-Place, Copy-on-Write, and Log-Structured engines; chart annotations: ↑63%, ↑50%, ↑122%]
YCSB // 10% Reads / 90% Writes Workload (2x Latency Relative to DRAM)
[Chart: NVM stores (millions, 0 to 400) for Traditional vs. NVM-Optimized versions of the In-Place, Copy-on-Write, and Log-Structured engines; chart annotations: ↓25%, ↓20%, ↓40%]
YCSB // Elapsed time to replay log with varying log sizes (2x Latency Relative to DRAM)
[Chart: recovery time (ms, log scale from 0.01 to 1000) at log sizes 10^3, 10^4, and 10^5 for Traditional vs. NVM-Optimized versions of the In-Place, Copy-on-Write, and Log-Structured engines; chart annotation: No Recovery Needed]
LESSONS
1. Using NVM correctly improves throughput & reduces wear-down.
2. Avoid block-oriented components.
3. NVM-only systems are 15-20 years away.
What would Nikita Kahn do?
Chapter III – Hybrid DBMS
• Design and build a new in-memory DBMS that will be ready for NVM when it becomes available.
• Hybrid Storage + Hybrid Workloads
– DRAM + NVM oriented architecture
– Fast Transactions + Real-time Analytics
Adaptive Storage
UPDATE myTable SET A = 123, B = 456, C = 789 WHERE D = “xxx”
SELECT AVG(B) FROM myTable WHERE C < “yyy”
[Figure: Original Data and Adapted Data, each with columns A B C D, split into Hot and Cold regions]
BRIDGING THE ARCHIPELAGO BETWEEN ROW-STORES AND COLUMN-STORES FOR HYBRID WORKLOADS, SIGMOD 2016
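A rough sketch of the hot/cold idea behind the two queries above, assuming hot data stays row-oriented (cheap for updates that touch a whole tuple) and cold data is converted to a column-oriented layout (cheap for scans over one or two attributes). The helper names and the numeric comparison on `C` are illustrative assumptions, not Peloton's implementation.

```python
from statistics import mean

COLUMNS = ("A", "B", "C", "D")


def to_columnar(rows):
    """Convert row-oriented tuples into one list per attribute."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def update_hot_row(hot_rows, d_value, a, b, c):
    """UPDATE ... SET A, B, C WHERE D = d_value, applied to the row layout."""
    d_idx = COLUMNS.index("D")
    for i, row in enumerate(hot_rows):
        if row[d_idx] == d_value:
            hot_rows[i] = (a, b, c, row[d_idx])


def avg_b_where_c_less_than(hot_rows, cold_columns, bound):
    """SELECT AVG(B) ... WHERE C < bound, scanning both layouts."""
    b_idx, c_idx = COLUMNS.index("B"), COLUMNS.index("C")
    # Hot part: each whole row is visited even though only B and C are needed.
    matches = [row[b_idx] for row in hot_rows if row[c_idx] < bound]
    # Cold part: only the B and C columns are read.
    matches += [b for b, c in zip(cold_columns["B"], cold_columns["C"])
                if c < bound]
    return mean(matches) if matches else None
```

For example, with `hot = [(1, 10, 5, "x"), (2, 20, 50, "y")]` and `cold = to_columnar([(3, 30, 1, "z"), (4, 40, 99, "w")])`, calling `avg_b_where_c_less_than(hot, cold, 10)` averages the `B` values of the two tuples whose `C` is below 10.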
LESSONS
Peloton: The Self-Driving Database Management System
• NVM Ready
• Query Compilation
• Vectorized Execution
• Autonomous
• Apache Licensed
Anthony Tomasic, Todd Mowry, Joy Arulraj, Prashanth Menon, Lin Ma, Dana Van Aken, Michael Zhang
Matthew Perron, Ran Xian, Jiexi Lin, Jianhong Li, Ziqi Wang, Yingjun Wu, Runshen Zhu
http://pelotondb.org
@ANDY_PAVLO