Using Rust to Build a Distributed Transactional Key-Value Database
LiuTang | tl@pingcap.com
About me
● Chief Architect at PingCAP
● TiDB and TiKV
● Open source projects
  ○ LedisDB
  ○ go-mysql
  ○ go-mysql-elasticsearch
  ○ rust-prometheus
  ○ ...
Agenda
● Introduction
● Hierarchy
  ○ Storage
  ○ Raft
  ○ Transaction
  ○ RPC Framework
  ○ Monitor
  ○ Test
● Combine them all
When we want to build a distributed transactional key-value database...
Consistency, Performance, Scalability, Stability, ACID, HA, Others…
A High Building, A Low Foundation
Language
Let’s start from scratch!!!
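RocksDB
[Diagram: LSM-tree layout. In memory: Memory Table and Immutable Memory Table. On disk: WAL, Info Log, Manifest, Current, and SST files in Level 0, Level 1, and Level 2. Flush writes the Immutable Memory Table to Level 0; Compaction moves SST files down the levels.]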
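The flush and compaction path in the diagram can be sketched with a toy LSM tree. This is a minimal, std-only illustration, not the rust-rocksdb API: `ToyLsm` and all of its methods are hypothetical names, the WAL is left out, and each Level 0 "SST" is just an in-memory sorted map.

```rust
use std::collections::BTreeMap;

/// Toy LSM tree: one mutable memtable plus a stack of immutable sorted runs.
/// Hypothetical type for illustration only; no WAL, no real files.
pub struct ToyLsm {
    memtable: BTreeMap<String, String>,
    /// Each run stands in for a Level 0 SST file; the newest run is last.
    level0: Vec<BTreeMap<String, String>>,
    memtable_limit: usize,
}

impl ToyLsm {
    pub fn new(memtable_limit: usize) -> Self {
        ToyLsm {
            memtable: BTreeMap::new(),
            level0: Vec::new(),
            memtable_limit,
        }
    }

    /// Write into the memtable and flush it once it reaches the size limit.
    pub fn put(&mut self, key: &str, value: &str) {
        self.memtable.insert(key.to_string(), value.to_string());
        if self.memtable.len() >= self.memtable_limit {
            self.flush();
        }
    }

    /// Freeze the memtable and push it as the newest Level 0 run.
    pub fn flush(&mut self) {
        if !self.memtable.is_empty() {
            let frozen = std::mem::take(&mut self.memtable);
            self.level0.push(frozen);
        }
    }

    /// Read the memtable first, then the runs from newest to oldest.
    pub fn get(&self, key: &str) -> Option<&String> {
        self.memtable
            .get(key)
            .or_else(|| self.level0.iter().rev().find_map(|run| run.get(key)))
    }

    /// Merge all runs into a single run; values from newer runs win.
    pub fn compact(&mut self) {
        let mut merged = BTreeMap::new();
        for run in self.level0.drain(..) {
            // Runs are drained oldest first, so newer values overwrite older ones.
            merged.extend(run);
        }
        if !merged.is_empty() {
            self.level0.push(merged);
        }
    }

    /// Number of runs currently on "disk".
    pub fn run_count(&self) -> usize {
        self.level0.len()
    }
}
```

Reads get slower as runs pile up, which is why compaction merges them in the background.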
https://github.com/pingcap/rust-rocksdb
Raft
[Diagram: a Client talks to a group of three nodes. Each node has a Raft Module, a Log, and a State Machine. Every Log holds the same entries (a = 1, b = 2), and every State Machine reaches the same state (a = 1, b = 2).]
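A log entry is applied to the state machine only after a majority of the group has stored it. The sketch below shows that one rule in isolation. It is not the raft-rs API: both function names are hypothetical, log indexes are assumed to start at 1, and the extra Raft condition that a leader only commits entries from its current term is left out.

```rust
use std::collections::HashMap;

/// Highest log index that is stored on a majority of the group.
/// `match_index[i]` is the last log index known to be replicated on node i
/// (the leader includes its own last index).
pub fn majority_commit_index(match_index: &[u64]) -> u64 {
    if match_index.is_empty() {
        return 0;
    }
    let mut sorted = match_index.to_vec();
    // Descending order: position n / 2 is held by at least n / 2 + 1 nodes.
    sorted.sort_unstable_by(|a, b| b.cmp(a));
    sorted[sorted.len() / 2]
}

/// Apply the committed prefix of a log of `key = value` entries
/// to a key-value state machine and return the resulting state.
pub fn apply_committed(log: &[(&str, i64)], commit_index: u64) -> HashMap<String, i64> {
    let mut state = HashMap::new();
    for (key, value) in log.iter().take(commit_index as usize) {
        state.insert(key.to_string(), *value);
    }
    state
}
```

With three nodes at indexes 2, 2, and 1, the commit index is 2, so both `a = 1` and `b = 2` can be applied.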
Multi-Raft
[Diagram: the Key Space is split into ranges, and each range is a Region replicated three times by its own Raft Group. Region 1 covers A - B, Region 2 covers B - C, Region 3 covers C - D.]
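Routing a key to its Region is a range lookup over the Region start keys. A minimal sketch, assuming each Region owns a half-open range [start, end); `RegionMap` and `Region` are hypothetical names, not TiKV types.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Included, Unbounded};

/// One Region: a contiguous key range served by one Raft group.
pub struct Region {
    pub id: u64,
    /// Exclusive end of the range.
    pub end_key: String,
}

/// Regions indexed by their inclusive start key.
#[derive(Default)]
pub struct RegionMap {
    by_start: BTreeMap<String, Region>,
}

impl RegionMap {
    pub fn insert(&mut self, start_key: &str, end_key: &str, id: u64) {
        let region = Region { id, end_key: end_key.to_string() };
        self.by_start.insert(start_key.to_string(), region);
    }

    /// Find the Region whose range [start, end) contains `key`.
    pub fn locate(&self, key: &str) -> Option<u64> {
        // The candidate is the Region with the greatest start key <= key.
        let (_, region) = self
            .by_start
            .range::<str, _>((Unbounded, Included(key)))
            .next_back()?;
        if key < region.end_key.as_str() {
            Some(region.id)
        } else {
            None
        }
    }
}
```

Because each Region is an independent Raft group, a request only involves the group that owns its key.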
Multi-Raft - Scalability [Diagram: four nodes A, B, C, D holding three replicas each of Region 1 and Region 2.]
Multi-Raft - Scalability [Diagram: Raft ConfChange - AddNode adds a fourth replica of Region 2.]
Multi-Raft - Scalability [Diagram: Raft ConfChange - RemoveNode removes one replica of Region 2, leaving three again.]
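The two steps above, add the new replica first and remove the old one second, can be sketched as operations on a replica set. This is an assumption-laden toy: the `ConfChange` enum here is a local stand-in rather than the raft-rs type, and in a real group each change is itself proposed and committed through the Raft log before it takes effect.

```rust
use std::collections::BTreeSet;

/// Local stand-in for a Raft membership change (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConfChange {
    AddNode(u64),
    RemoveNode(u64),
}

/// Apply one membership change to a Region's replica set.
/// Returns true if the set actually changed.
pub fn apply_conf_change(peers: &mut BTreeSet<u64>, change: ConfChange) -> bool {
    match change {
        ConfChange::AddNode(id) => peers.insert(id),
        ConfChange::RemoveNode(id) => peers.remove(&id),
    }
}

/// Move a replica from node `from` to node `to`.
/// Adding before removing keeps the replica count from dropping
/// below its original value during the move.
pub fn move_replica(peers: &mut BTreeSet<u64>, from: u64, to: u64) -> bool {
    if !peers.contains(&from) || peers.contains(&to) {
        return false;
    }
    apply_conf_change(peers, ConfChange::AddNode(to));
    apply_conf_change(peers, ConfChange::RemoveNode(from));
    true
}
```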
https://github.com/pingcap/raft-rs
Transaction
Transaction
How do we keep data consistent across multiple Raft groups?

    let mut txn = store.begin();
    let value1 = txn.get(region1_key);
    let value2 = txn.get(region2_key);
    // do something with the values
    txn.set(region1_key, new_value1);
    txn.set(region2_key, new_value2);
    txn.commit(); // or txn.rollback()
Transaction
1. Inspired by Google Percolator
2. Optimized Two-Phase Commit (2PC)
3. Multiversion Concurrency Control (MVCC)
4. Snapshot Isolation
5. Optimistic Transaction
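How MVCC, snapshot isolation, and optimistic commit fit together can be sketched on a single node. This is a deliberate simplification and not Percolator's two-phase commit: there are no locks, no prewrite step, and no distribution. `MvccStore` and `Txn` are hypothetical types, and the timestamp counter stands in for a real timestamp service.

```rust
use std::collections::{BTreeMap, HashMap};

/// All committed versions of each key, indexed by commit timestamp.
#[derive(Default)]
pub struct MvccStore {
    versions: HashMap<String, BTreeMap<u64, String>>,
    next_ts: u64,
}

/// A transaction buffers its writes locally until commit (optimistic).
pub struct Txn {
    start_ts: u64,
    writes: HashMap<String, String>,
}

impl Txn {
    pub fn set(&mut self, key: &str, value: &str) {
        self.writes.insert(key.to_string(), value.to_string());
    }
}

impl MvccStore {
    fn alloc_ts(&mut self) -> u64 {
        self.next_ts += 1;
        self.next_ts
    }

    pub fn begin(&mut self) -> Txn {
        Txn { start_ts: self.alloc_ts(), writes: HashMap::new() }
    }

    /// Snapshot read: the transaction's own write if any, otherwise the
    /// newest version committed at or before its start timestamp.
    pub fn get(&self, txn: &Txn, key: &str) -> Option<String> {
        if let Some(value) = txn.writes.get(key) {
            return Some(value.clone());
        }
        self.versions
            .get(key)?
            .range(..=txn.start_ts)
            .next_back()
            .map(|(_, value)| value.clone())
    }

    /// Commit fails if any written key has a version newer than start_ts.
    pub fn commit(&mut self, txn: Txn) -> Result<u64, String> {
        for key in txn.writes.keys() {
            let newest = self
                .versions
                .get(key)
                .and_then(|versions| versions.keys().next_back().copied());
            if newest.map_or(false, |ts| ts > txn.start_ts) {
                return Err(format!("write conflict on {}", key));
            }
        }
        let commit_ts = self.alloc_ts();
        for (key, value) in txn.writes {
            self.versions.entry(key).or_default().insert(commit_ts, value);
        }
        Ok(commit_ts)
    }
}
```

Readers never block writers here: a reader that started before a commit keeps seeing the older version.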
gRPC
● Mode
  ○ Unary
  ○ Client streaming
  ○ Server streaming
  ○ Duplex streaming
● Using Futures to wrap the asynchronous C gRPC API

    let f = unary(service, method, request);
    let resp = f.wait();
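The idea of wrapping a callback-style asynchronous API in a handle with `wait()` can be shown with the standard library alone. This is only an analogy for the pattern, not grpc-rs code: `async_call` is a hypothetical stand-in for the C core (it just echoes the request from another thread), and `UnaryFuture` is a hypothetical type backed by a channel rather than a real future.

```rust
use std::sync::mpsc::{channel, Receiver, RecvError};
use std::thread;

/// Handle to a response that will arrive later.
pub struct UnaryFuture<T> {
    rx: Receiver<T>,
}

impl<T> UnaryFuture<T> {
    /// Block the caller until the response arrives.
    pub fn wait(self) -> Result<T, RecvError> {
        self.rx.recv()
    }
}

/// Hypothetical callback-based API: completes on another thread
/// and reports the result through `on_done`.
fn async_call<F>(request: String, on_done: F)
where
    F: FnOnce(String) + Send + 'static,
{
    thread::spawn(move || on_done(format!("echo: {}", request)));
}

/// Wrap the callback API: the callback sends the response into a
/// channel, and the returned handle receives it.
pub fn unary(request: &str) -> UnaryFuture<String> {
    let (tx, rx) = channel();
    async_call(request.to_string(), move |response| {
        // The receiver may already be dropped; ignore that case.
        let _ = tx.send(response);
    });
    UnaryFuture { rx }
}
```

The caller can start several calls before waiting on any of them, which is the main benefit over a blocking call.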
https://github.com/pingcap/grpc-rs
Prometheus
● Type
  ○ Counter
  ○ Gauge
  ○ Histogram

    lazy_static! {
        static ref HTTP_COUNTER: Counter = register_counter!(
            "http_request_total",
            "Total number of HTTP requests."
        ).unwrap();
    }

    HTTP_COUNTER.inc();
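Of the three types, the Histogram is the least obvious: a Counter only goes up, a Gauge goes up and down, and a Histogram counts observations into buckets that Prometheus exposes cumulatively. A minimal std-only sketch of that bucketing follows; it is a hypothetical type, not the rust-prometheus `Histogram`.

```rust
/// Toy histogram with fixed upper bounds plus an implicit +Inf bucket.
pub struct Histogram {
    /// Bucket upper bounds, sorted ascending.
    upper_bounds: Vec<f64>,
    /// One count per bound, then a final count for the +Inf bucket.
    counts: Vec<u64>,
    sum: f64,
}

impl Histogram {
    pub fn new(mut upper_bounds: Vec<f64>) -> Self {
        upper_bounds.sort_by(|a, b| a.partial_cmp(b).expect("bounds must not be NaN"));
        let bucket_count = upper_bounds.len() + 1;
        Histogram { upper_bounds, counts: vec![0; bucket_count], sum: 0.0 }
    }

    /// Record one observation in the first bucket whose bound is >= value.
    pub fn observe(&mut self, value: f64) {
        let index = self
            .upper_bounds
            .iter()
            .position(|&bound| value <= bound)
            .unwrap_or(self.upper_bounds.len());
        self.counts[index] += 1;
        self.sum += value;
    }

    /// Cumulative bucket counts: each entry includes all smaller buckets.
    pub fn cumulative(&self) -> Vec<u64> {
        let mut total = 0u64;
        self.counts
            .iter()
            .map(|count| {
                total += count;
                total
            })
            .collect()
    }

    /// Total number of observations.
    pub fn count(&self) -> u64 {
        self.counts.iter().sum()
    }

    /// Sum of all observed values.
    pub fn sum(&self) -> f64 {
        self.sum
    }
}
```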
https://github.com/pingcap/rust-prometheus
Testing
Testing - Failure Injection

    // Inject a failure
    fn function_foo() {
        fail_point!("foo");
    }

    // Run and trigger the failure
    FAILPOINTS=foo=panic cargo run
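The mechanism behind a fail point is small: a registry maps names to actions, and each fail point checks the registry when it is reached. The sketch below is a std-only illustration and not how fail-rs is implemented. All function names are hypothetical, only the `panic` action is handled, and the `;` separator between entries is an assumption about the variable's format.

```rust
use std::collections::HashMap;

/// Parse a spec such as "foo=panic;bar=off" into a name -> action map.
/// Entries without a name or without '=' are skipped.
pub fn parse_failpoints(spec: &str) -> HashMap<String, String> {
    spec.split(';')
        .filter_map(|item| {
            let mut parts = item.trim().splitn(2, '=');
            match (parts.next(), parts.next()) {
                (Some(name), Some(action)) if !name.is_empty() => {
                    Some((name.to_string(), action.to_string()))
                }
                _ => None,
            }
        })
        .collect()
}

/// Build the registry from the FAILPOINTS environment variable.
pub fn registry_from_env() -> HashMap<String, String> {
    parse_failpoints(&std::env::var("FAILPOINTS").unwrap_or_default())
}

/// A fail point: does nothing unless the registry says "panic" for `name`.
pub fn hit(registry: &HashMap<String, String>, name: &str) {
    if registry.get(name).map(String::as_str) == Some("panic") {
        panic!("failpoint {} triggered", name);
    }
}
```

Because the registry is empty unless configured, the same binary runs normally in production and fails on demand in tests.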
https://github.com/pingcap/fail-rs
Architecture
Architecture
[Diagram: a Client connects over gRPC to three nodes. Each node is a stack of Txn API, MVCC, Raft, and RocksDB. Prometheus monitors the system.]
https://github.com/pingcap/tikv
Beyond TiKV
A Distributed Relational Database
TiDB
[Diagram, top to bottom: Applications, MySQL Drivers (e.g. JDBC), MySQL Protocol, TiDB, RPC, TiKV.]
A Distributed Analytical Database
TiSpark
[Diagram: a Spark Driver submits a Job to a Spark Cluster of Workers, which reach TiKV over RPC.]
Hybrid Transactional/Analytical Processing Database
[Diagram: the full HTAP stack. A PD Cluster holds the metadata and serves TSO and data location. A TiDB Cluster and TiSpark (a Spark Driver submitting a Job to Workers in a Spark Cluster) both read and write the TiKV Cluster (Storage).]
Thank you! https://github.com/pingcap/tidb https://github.com/pingcap/tikv We are hiring… @China @Silicon Valley @Home