Databases Have Forgotten About Single-Node Performance: A Wrongheaded Trade-Off
Matvey Arye, PhD, Core Database Engineer
mat@timescale.com · github.com/timescale
How computing used to look
How computing looks today
What happened? @timescale
Databases changed from scaling up to scaling out
We entered the MapReduce Era
But… MapReduce specializes in long jobs that touch lots of data
• Grep: 150 seconds
• Sort: 890 seconds
• Average job completion time of Google production workloads: 634 seconds
(Source: Google MapReduce paper)
Why the high latency?
Network Latencies • Connection setup/teardown • TCP ramp-up • Network transfer
Consensus is Expensive
• Dreaded two-phase commit
• Snapshot isolation even harder
• Coordinators often bottlenecks
The “Straggler” Problem
The high latency of distributed databases is nothing new

Graph connected component analysis:

System   | Cores | twitter_rv | uk_2007_05
Spark    | 128   | 1784s      | 8000s+
Giraph   | 128   | 200s       | 8000s+
GraphLab | 128   | 242s       | 714s
GraphX   | 128   | 251s       | 800s
Laptop   | 1     | 153s       | 417s
Laptop*  | 1     | 15s        | 30s

PageRank (20 iterations):

System   | Cores | twitter_rv | uk_2007_05
Spark    | 128   | 857s       | 1759s
Giraph   | 128   | 596s       | 1235s
GraphLab | 128   | 249s       | 833s
GraphX   | 128   | 419s       | 462s
Laptop   | 1     | 300s       | 651s

Source: Scalability! But at what COST? (McSherry et al.)
But that was 10 years ago. The world has changed.

Then                 | Now
160 GB spinning rust | 60 TB SSDs
2-core CPUs          | 64-core CPUs
4 GB RAM             | TBs of RAM
                     | GPUs, TPUs, edge computing
And data needs have become real-time:
• Financial & Marketing
• Datacenter & DevOps
• Web / Mobile Events
• Transportation & Logistics
• Industrial Machines
Shorter latency requirements mean we need to focus again on scaling up
So how do we improve single-node performance? (Lessons from TimescaleDB)
Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (premium LRS storage). Each row has 12 columns (1 timestamp, 1 indexed host ID, 10 metrics).
Insight #1 Partition, even on a single machine.
Partitioning on a single node: the hypertable is split into chunks (sub-tables) along a time dimension and a space dimension, with older chunks further along the time axis.
Chunks are “right-sized” Recent (hot) chunks fit in memory
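The partitioning above can be sketched in SQL. A minimal example, assuming a hypothetical `data` table; the column names and the 4-way space partitioning are illustrative, not the benchmark schema, and chunk sizing is left at TimescaleDB's defaults:

```sql
-- Illustrative schema: a narrow metrics table.
CREATE TABLE data (
  time    TIMESTAMPTZ      NOT NULL,
  host_id INTEGER          NOT NULL,
  temp    DOUBLE PRECISION
);

-- Turn it into a hypertable: chunks are cut along the time dimension
-- and hashed across a space dimension on host_id (4 partitions here).
SELECT create_hypertable('data', 'time', 'host_id', 4);
```

Inserts then land in whichever chunk covers the row's time interval and hash slice, so recent ("hot") chunks and their indexes stay small enough to live in memory.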
TimescaleDB vs. PostgreSQL (single-row inserts): 144K metrics/s (14.4K inserts/s).
TimescaleDB 0.5, Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (LRS storage). Each row has 12 columns (1 timestamp, 1 indexed host ID, 10 metrics).
TimescaleDB vs. PostgreSQL (batch inserts): 1.11M metrics/s, >20x PostgreSQL.
TimescaleDB 0.5, Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (LRS storage). Each row has 12 columns (1 timestamp, 1 indexed host ID, 10 metrics).
Other benefits to partitioning
Avoid querying chunks via constraint exclusion
SELECT time, device_id, temp FROM data
WHERE time > '2017-08-22 18:18:00+00'
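One way to see exclusion working is to EXPLAIN the query (a sketch; actual chunk names and plan shape vary by version and data):

```sql
-- The plan should list only the chunks whose time range can overlap
-- the WHERE clause, not every chunk of the hypertable.
EXPLAIN SELECT time, device_id, temp
FROM data
WHERE time > '2017-08-22 18:18:00+00';
```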
Efficient retention policies: drop chunks, don't delete rows ⇒ avoids vacuuming
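A retention policy then becomes a one-liner. A sketch using the drop_chunks call of the TimescaleDB 0.x era; the table name and the three-month window are illustrative:

```sql
-- Drop every chunk whose rows are all older than three months.
-- Dropping a chunk is a metadata change plus file removal: no per-row
-- DELETEs, no dead tuples, nothing for VACUUM to clean up.
SELECT drop_chunks(interval '3 months', 'data');
```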
Insight #2 Partition across many disks, even on a single machine
Single node: scaling up by adding disks
• How: chunks spread across many disks (elastically!), either RAIDed or via distinct tablespaces
• Benefits: faster inserts, parallelized queries
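The tablespace route might look like this (the paths and tablespace names are made up for illustration; attach_tablespace is the TimescaleDB call for binding tablespaces to a hypertable):

```sql
-- One tablespace per physical disk.
CREATE TABLESPACE disk1 LOCATION '/mnt/disk1/pgdata';
CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/pgdata';

-- Attach them to the hypertable; newly created chunks are then
-- spread across the attached tablespaces.
SELECT attach_tablespace('disk1', 'data');
SELECT attach_tablespace('disk2', 'data');
```

Because new chunks are created continually as time advances, disks can be added later and the hypertable grows onto them without reshuffling old data.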
Elasticity on a single node
Insight #3 Query Optimizations
Avoid querying chunks via constraint exclusion (more)
SELECT time, device_id, temp FROM data
WHERE time > now() - interval '24 hours'
Won't exclude chunks in plain PostgreSQL
Example Query Optimization
CREATE INDEX ON readings(time);
SELECT date_trunc('minute', time) AS bucket, avg(cpu)
FROM readings
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;
Will this use the index?
Example Query Optimization
CREATE INDEX ON readings(time);
SELECT date_trunc('minute', time) AS bucket, avg(cpu)
FROM readings
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;
TimescaleDB understands time
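The optimization rests on monotonicity: since date_trunc('minute', time) never decreases as time increases, a backward scan of the (time) index emits buckets already in ORDER BY ... DESC order, so the query can stop after 10 buckets instead of aggregating the whole table and sorting. EXPLAIN is the way to check whether the planner exploits this (a sketch; the exact plan depends on version and data):

```sql
-- With the monotonicity of date_trunc recognized, look for a backward
-- index scan feeding the aggregate, rather than a full scan plus Sort.
EXPLAIN SELECT date_trunc('minute', time) AS bucket, avg(cpu)
FROM readings
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;
```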
How to find bottlenecks? Profile and read flame graphs.
[Flame graph of query planning under 10-way concurrency: ~91% of samples (110) fall inside postgres`standard_planner, dominated by constraint exclusion (relation_excluded_by_constraint) and relation-size estimation (estimate_rel_size → smgrnblocks → mdnblocks → FileSeek).]
Insight #4 HA != horizontally distributed
Single Node + Replicas = High-availability
Replication is much cheaper than horizontal distribution
• Avoids the consensus penalty
• Can more easily be asynchronous
• Still provides HA
Results
Results vs. Cassandra (5 nodes): queries

Query type            | Scanned (h) | Devices | Metrics | Timescale query time (ms) | Cassandra multiplier
groupby-hour          | 24          | all     | 1       | 27866.50                  | 9.66
groupby-hour          | 24          | all     | 5       | 35827.60                  | 37.52
groupby-hour          | 24          | all     | 10      | 44912.30                  | 45.05
high-cpu              | 24          | all     | 1       | 49807                     | 32.95
cpu-max               | 12          | 8       | 10      | 126.90                    | 11.84
cpu-max               | 12          | 1       | 10      | 20.54                     | 37.41
high-cpu              | 24          | 1       | 1       | 56                        | 18.65
groupby-minute        | 12          | 1       | 1       | 33.80                     | 0.91
groupby-minute        | 1           | 8       | 1       | 23.50                     | 1.97
groupby-minute        | 12          | 1       | 5       | 27.60                     | 10.25
groupby-minute        | 1           | 8       | 5       | 20.40                     | 17.90
lastpoint             | all         | all     | 10      | 266.10                    | 2285.30
groupby-orderby-limit | all         | all     | 1       | 75.20                     | 3181.62

Higher multiplier indicates worse Cassandra performance.
TimescaleDB vs. Cassandra, insert metrics/s (5 nodes): Cassandra 150K, Timescale 745K
Results vs. Mongo: queries

Query type            | Scanned (h) | Devices | Metrics | Timescale query time (ms) | Mongo multiplier
groupby-hour          | 24          | all     | 1       | 29968                     | 7.63
groupby-hour          | 24          | all     | 5       | 39157                     | 5.51
groupby-hour          | 24          | all     | 10      | 49058                     | 5.50
high-cpu              | 24          | all     | 1       | 51323                     | 5.24
cpu-max               | 12          | 8       | 10      | 260                       | 1.14
cpu-max               | 12          | 1       | 10      | 28                        | 1.32
high-cpu              | 24          | 1       | 1       | 95                        | 1.17
groupby-minute        | 12          | 1       | 1       | 62                        | 0.77
groupby-minute        | 1           | 8       | 1       | 28                        | 1.95
groupby-minute        | 12          | 1       | 5       | 69                        | 0.94
groupby-minute        | 1           | 8       | 5       | 26                        | 2.17
lastpoint             | all         | all     | 10      | 453                       | 101.72
groupby-orderby-limit | all         | all     | 1       | 149                       | 1667.25

Higher multiplier indicates worse Mongo performance.
TimescaleDB vs. Mongo, insert metrics/s: Mongo 807K, Timescale 994K
Lessons for DB designers
• Concentrate on scale-up
• Consider (insights!):
  • Partitioning, even on a single node
  • Even across disks
• Performance analysis:
  • High-level: query optimization
  • Low-level: profiling
• High availability is possible on a single node
Lessons for DB users
• Absolute performance is as important as scaling numbers.
• Don't go horizontally distributed unless you have to.
• HA is not the same as horizontal scalability.
• Replication is cheaper than distribution.
• SQL and ACID are extremely useful.
Open Source (Apache 2.0): github.com/timescale/timescaledb
Join the community: slack.timescale.com
Timescale is hiring! Core Database Engineers • R&D Engineers • Solutions Engineers • Evangelists • Customer Success • careers.timescale.com