Validation in Optimistic Concurrency Control ACM SIGMOD 2015 - PowerPoint PPT Presentation

Validation in Optimistic Concurrency Control ACM SIGMOD 2015 Programming Contest Alexey Karyakin (Crisis) David R. Cheriton School of Computer Science University of Waterloo

The Problem Alexey Karyakin 2 06/02/15

Database and Query Statistics I • Cardinalities Primary key size, transaction and row insertion/deletion rate Queries / validation, predicates / query, length of queried txn range, flush batch size • Probabilities Satisfied predicates (per type), queries, validations How often a column is used in queries Alexey Karyakin 3 06/02/15

Database and Query Statistics II • Only recent transactions are validated (~10 3 transactions, ~10 5 rows) • 10 3 validations per batch, 30-50 queries per validation, 10 predicates in each query • Probability of a satisfied equality predicate is very low: ~10 -6 • Probability of a satisfied query is also low: 10 -3 Overall, satisfied validation probability: ~5% Alexey Karyakin 4 06/02/15

Strategy • Join ~10 5 rows against ~10 5 validation queries in each batch • Build an index on one side and iterate the other side • Which side (records or validation queries) to index? Index data and iterate over queries in each batch • Select the predicate with best selectivity for index lookup • Some queries cannot use hash index (no equalities) Resort to table scan Alexey Karyakin 5 06/02/15

Data Organization • Primary key stores existing (not yet deleted) record ids STX B-tree • Transaction records are log-structured New records are appended to the front Old records are deleted (“forget”) from the back only • No heap allocation or STL containers Alexey Karyakin 6 06/02/15

Indexing • A hash table is built for each column Small at start, automatically expanded • Each record has a link field for each column Records are chained in historic order • Looked-up record has to be validated Non-indexed columns Hash collisions Alexey Karyakin 7 06/02/15

Indexing Alexey Karyakin 8 06/02/15

Selectivity Estimation • Only one column may be used for index lookup Find the field with better selectivity Some columns have only a few unique values – scan is better • Column selectivity estimates the number of unique values • The same hash table is used • Average allocation distance The number of records inserted between updating a hash slot Alexey Karyakin 9 06/02/15

Table Scan • Transaction range is scanned backwards, starting with the most-recently inserted record • Most non-equality predicates (<, <=, >, >=) have high probability of evaluating to true • However, sometimes the queried value is outside of the range Min and max values are computed for blocks of records and are used to accelerate the scan Alexey Karyakin 10 06/02/15

Parallel Processing • Data is split into work elements which are transferred between threads via producer-consumer queues • Three-phase evaluation of each batch • Small number of threads in the pool • No locking or synchronization when accessing data Alexey Karyakin 11 06/02/15

Data Flow list of queries evaluated to true Work queue ... Table 0 queue ... input Main thread Work queue ... Table N queue Work queue output Stage 1: reading and parsing input, distributing data between table queues, batch processing coordination Alexey Karyakin 12 06/02/15

Data Flow list of queries evaluated to true Work queue ... Table 0 queue ... input Main thread Work queue ... Table N queue Work queue output output Stage 2: updating data and indexes, initial phase of query evaluation Alexey Karyakin 13 06/02/15

Data Flow list of queries evaluated to true Work queue ... Table 0 queue ... input Main thread Work queue ... Table N queue Work queue output Stage 3: complete query evaluation Alexey Karyakin 14 06/02/15

Results • Query evaluation is heavier than data updating • Most (90%) queries are evaluated using an index • CPU cache misses is the main performance cost • The degree of parallelism is low due to high access skew between tables (one table was especially heavy) • Cost of reading input and parsing is high, compared to actual processing Alexey Karyakin 15 06/02/15

Questions This page intentionally left blank Thank you. Alexey Karyakin 16 06/02/15

Validation in Optimistic Concurrency Control ACM SIGMOD 2015 - PowerPoint PPT Presentation

Validation in Optimistic Concurrency Control ACM SIGMOD 2015 Programming Contest Alexey Karyakin (Crisis) David R. Cheriton School of Computer Science University of Waterloo The Problem Alexey Karyakin 2 06/02/15 Database and Query

TicToc: Time Traveling Optimistic Concurrency Control Authors: Xiangyao Yu, Andrew Pavlo, Daniel

Concurrency Control Ensuring Isolation 354 Concurrency control Concurrency To increase

COMP31212: Concurrency Topics 4.1: Concurrency Patterns - Monitors Topic 4.1: Concurrency

Infinite Resources for Optimistic Concurrency Control with NOCC Theo Jepsen, Leandro Pacheco de

Distributed Systems (ICE 601) Transactions & Concurrency Control - Part1 Dongman Lee ICU

Distributed Systems (ICE 601) Concurrency Control - Part2 Dongman Lee ICU Class Overview

Optimistic Fair Priced Oblivious Transfer A. Rial B. Preneel Katholieke Universiteit Leuven -

Exa- to Yotta-scale Data An Optimistic View Rob Farber PNNL Optimistic about Storage Bandwidth

Concurrency What is concurrency? In computer science, concurrency is a property of systems which

Validation of National Burn Severity Validation of National Burn Severity Validation of National

Form Validation 1 CS380 What is form validation? 2 validation: ensuring that form's values

Mostly-Optimistic Concurrency Control for Highly Contended Dynamic Workloads on 1000 cores

Optimistic Concurrency Control Database Systems: The Complete Book Ch 18.8, 19 1

Mostly-Optimistic Concurrency Control for Highly Contended Dynamic Workloads on a Thousand Cores

Optimistic Concurrency Control April 13, 2017 1 Serializability Executing transactions

TicToc: Time Traveling Optimistic Concurrency Control Xiangyao Yu 1 , Andrew Pavlo 2 , Daniel

Applications and network performance Network Operations glen.turner@aarnet.edu.au How long?

LongRunningTransactionsinServiceOriented Environments infm3::SR

sled and rio Rust DB + io_uring = @sadisticsystems sled.rs who am I building Rust databases

Blockchain Tech UNSW COMP9243 18s1 Michael Sproul Warning Blockchain is all the rage, but

This course is important for... End users of DBS DB application programmers Database

Bitcoin: P2P Digital Currency Pern Hui Chia Q2S NTNU Trial lecture, 7 Dec 2012, Trondheim

Introduction to Database Systems Chapter 1 Instructor: Wang-Chien Lee wlee@wpi.edu or

Bitcoin Yongdae Kim 1 Cypherpunk v 1970 v

Sambuz

Useful Links

Newsletter

Mail Us

Validation in Optimistic Concurrency Control ACM SIGMOD 2015 - PowerPoint PPT Presentation

Validation in Optimistic Concurrency Control ACM SIGMOD 2015 Programming Contest Alexey Karyakin (Crisis) David R. Cheriton School of Computer Science University of Waterloo The Problem Alexey Karyakin 2 06/02/15 Database and Query

TicToc: Time Traveling Optimistic Concurrency Control Authors: Xiangyao Yu, Andrew Pavlo, Daniel

Concurrency Control Ensuring Isolation 354 Concurrency control Concurrency To increase

COMP31212: Concurrency Topics 4.1: Concurrency Patterns - Monitors Topic 4.1: Concurrency

Infinite Resources for Optimistic Concurrency Control with NOCC Theo Jepsen, Leandro Pacheco de

Distributed Systems (ICE 601) Transactions &amp; Concurrency Control - Part1 Dongman Lee ICU

Distributed Systems (ICE 601) Concurrency Control - Part2 Dongman Lee ICU Class Overview

Optimistic Fair Priced Oblivious Transfer A. Rial B. Preneel Katholieke Universiteit Leuven -

Exa- to Yotta-scale Data An Optimistic View Rob Farber PNNL Optimistic about Storage Bandwidth

Concurrency What is concurrency? In computer science, concurrency is a property of systems which

Validation of National Burn Severity Validation of National Burn Severity Validation of National

Form Validation 1 CS380 What is form validation? 2 validation: ensuring that form's values

Mostly-Optimistic Concurrency Control for Highly Contended Dynamic Workloads on 1000 cores

Optimistic Concurrency Control Database Systems: The Complete Book Ch 18.8, 19 1

Mostly-Optimistic Concurrency Control for Highly Contended Dynamic Workloads on a Thousand Cores

Optimistic Concurrency Control April 13, 2017 1 Serializability Executing transactions

TicToc: Time Traveling Optimistic Concurrency Control Xiangyao Yu 1 , Andrew Pavlo 2 , Daniel

Applications and network performance Network Operations glen.turner@aarnet.edu.au How long?

LongRunningTransactionsinServiceOriented Environments infm3::SR

sled and rio Rust DB + io_uring = @sadisticsystems sled.rs who am I building Rust databases

Blockchain Tech UNSW COMP9243 18s1 Michael Sproul Warning Blockchain is all the rage, but

This course is important for... End users of DBS DB application programmers Database

Bitcoin: P2P Digital Currency Pern Hui Chia Q2S NTNU Trial lecture, 7 Dec 2012, Trondheim

Introduction to Database Systems Chapter 1 Instructor: Wang-Chien Lee wlee@wpi.edu or

Bitcoin Yongdae Kim 1 Cypherpunk v 1970 v

Sambuz

Useful Links

Newsletter

Mail Us

Distributed Systems (ICE 601) Transactions & Concurrency Control - Part1 Dongman Lee ICU