ClickHouse Deep Dive
Aleksei Milovidov
ClickHouse use cases

A stream of events:
› Actions of website visitors
› Ad impressions
› DNS queries
› E-commerce transactions
› …

We want to save information about these events and then glean some insights from it.
ClickHouse philosophy

› Interactive queries on data updated in real time
› Clean, structured data is needed
› Try hard not to pre-aggregate anything
› Query language: a dialect of SQL + extensions
Sample query in a web analytics system

Top 10 referers for a website for the last week:

SELECT Referer, count(*) AS count
FROM hits
WHERE CounterID = 111
  AND Date BETWEEN '2018-04-18' AND '2018-04-24'
GROUP BY Referer
ORDER BY count DESC
LIMIT 10
How to execute a query fast?

Read data fast
› Read only the needed columns: CounterID, Date, Referer
› Locality of reads (an index is needed!)
› Data compression

Process data fast
› Vectorized execution (block-based processing)
› Parallelize across all available cores and machines
› Specialization and low-level optimizations
Index needed!

The principle is the same as in classic DBMSes: the majority of queries will contain conditions on CounterID and (possibly) Date, so (CounterID, Date) fits the bill. Check this by mentally sorting the table by the primary key.

Differences:
› The table is physically sorted on disk
› The index is not a unique constraint
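A minimal table definition with this primary key might look as follows. This is a sketch, not the schema from the talk: the column types are assumptions, and the MergeTree engine used here is introduced on the later slides.

-- Hypothetical schema for the hits table used in the example query.
-- ORDER BY defines the sparse primary index; it is not a unique constraint.
CREATE TABLE hits
(
    CounterID UInt32,
    Date Date,
    Referer String
)
ENGINE = MergeTree()
ORDER BY (CounterID, Date)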
Index internals

[Diagram] For the primary key (CounterID, Date), primary.idx stores one (CounterID, Date) entry per 8192 rows (rows N, N+8192, N+16384, …). Each column (CounterID, Date, Referer, …) is stored in its own .bin data file with an accompanying .mrk file of marks that ties the index entries to positions in the data.
Things to remember about indexes

The index is sparse
› It must fit into memory
› The default granularity (8192) is good enough
› It does not create a unique constraint
› Performance of point queries is not stellar

The table is sorted according to the index
› There can be only one sort order
› Using the index is always beneficial
How to keep the table sorted

Inserted events arrive (almost) sorted by time, but we need them sorted by primary key.

MergeTree maintains a small set of sorted parts, a similar idea to an LSM tree.
How to keep the table sorted

[Diagram] Each INSERT creates a new part on disk, sorted by primary key. The table therefore consists of several sorted parts covering different insertion ranges (e.g. [M, N] and [N+1]), and ClickHouse merges them in the background into larger sorted parts (e.g. [M, N+1]).
Things to do while merging

Replace/update records
› ReplacingMergeTree
› CollapsingMergeTree

Pre-aggregate data
› AggregatingMergeTree

Metrics rollup
› GraphiteMergeTree
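As an illustration of the first case, here is a sketch of a ReplacingMergeTree table; the table and column names are hypothetical, not from the talk. Note that replacement happens only during background merges, so duplicates may be visible until a merge runs.

-- Keep only the latest version of each row: during background merges,
-- rows with the same sorting key collapse to the one with the highest `version`.
CREATE TABLE user_profiles
(
    UserID UInt64,
    Name String,
    version UInt32
)
ENGINE = ReplacingMergeTree(version)
ORDER BY UserID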
MergeTree partitioning

ENGINE = MergeTree … PARTITION BY toYYYYMM(Date)

› The table can be partitioned by any expression (default: by month)
› Parts from different partitions are not merged
› Easy manipulation of partitions:
  ALTER TABLE … DROP PARTITION
  ALTER TABLE … DETACH/ATTACH PARTITION
› MinMax index on the partitioning columns
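Putting this together with the earlier sketch, a partitioned version of the hypothetical hits table could look like this (schema is an assumption, not the original system):

CREATE TABLE hits_partitioned
(
    CounterID UInt32,
    Date Date,
    Referer String
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(Date)
ORDER BY (CounterID, Date);

-- Dropping a whole month of data is then a cheap metadata operation:
ALTER TABLE hits_partitioned DROP PARTITION 201804;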
Things to remember about MergeTree

Merging runs in the background
› Even when there are no queries!

Control the total number of parts
› Watch the rate of INSERTs
› The MaxPartsCountForPartition and DelayedInserts metrics are your friends
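One way to keep an eye on the part count (a sketch, not from the slides; the table name follows the hypothetical examples above) is to query the system.parts table directly:

-- Number of active parts per partition of a table.
SELECT partition, count() AS parts
FROM system.parts
WHERE active AND table = 'hits_partitioned'
GROUP BY partition
ORDER BY parts DESC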
When one server is not enough

› The data won't fit on a single server…
› You want to increase performance by adding more servers…
› Multiple simultaneous queries are competing for resources…

ClickHouse: sharding + Distributed tables!
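A Distributed table is defined on top of a local table that exists on every shard. A minimal sketch, assuming a cluster named my_cluster in the server config and a MergeTree table hits_local with the same structure on each shard (both names are hypothetical):

-- The Distributed engine stores no data itself; it routes queries to
-- `hits_local` on every shard of `my_cluster` and uses cityHash64(CounterID)
-- as the sharding key for INSERTs.
CREATE TABLE hits_distributed AS hits_local
ENGINE = Distributed(my_cluster, default, hits_local, cityHash64(CounterID))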
Reading from a Distributed table

[Diagram] A SELECT … FROM distributed_table GROUP BY column is rewritten as SELECT … FROM local_table GROUP BY column and sent to every shard (Shard 1, Shard 2, Shard 3). Each shard returns a partially aggregated result, and the initiating server merges them into the full result.
NYC taxi benchmark

CSV 227 GB, ~1.3 bln rows

SELECT passenger_count, avg(total_amount)
FROM trips
GROUP BY passenger_count

Shards    Time, s    Speedup
1         1.224
3         0.438      x2.8
140       0.043      x28.5
Inserting into a Distributed table

[Diagram] An INSERT INTO distributed_table is split by sharding_key % <number of shards> and forwarded as INSERT INTO local_table to the corresponding shards. By default this forwarding is asynchronous; with SET insert_distributed_sync = 1 the INSERT splits the data by sharding_key and writes to all shards synchronously.
Things to remember about Distributed tables

It is just a view
› It doesn't store any data by itself
› It will always query all shards

Ensure that the data is divided across shards uniformly
› either by inserting directly into the local tables,
› or by letting the Distributed table do it (but beware that inserts are async by default)
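The two insertion paths look roughly like this (a sketch; table names follow the earlier hypothetical examples):

-- Option 1: insert directly into the local table on each shard;
-- the application is responsible for distributing rows uniformly.
INSERT INTO hits_local (CounterID, Date, Referer) VALUES (111, '2018-04-20', 'https://example.com');

-- Option 2: insert through the Distributed table and let it split the data
-- by the sharding key; synchronous forwarding is opt-in.
SET insert_distributed_sync = 1;
INSERT INTO hits_distributed (CounterID, Date, Referer) VALUES (111, '2018-04-20', 'https://example.com');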
When failure is not an option

› Protection against hardware failure
› Data must always be available for reading and writing

ClickHouse: the ReplicatedMergeTree engine!
› Async master-master replication
› Works on a per-table basis
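A replicated table is declared by giving the engine a ZooKeeper path and a replica name. A sketch using the common macro-based convention (the path, macros, and schema are assumptions, not taken from the talk):

-- {shard} and {replica} are substituted from macros in each server's config.
CREATE TABLE hits_replicated
(
    CounterID UInt32,
    Date Date,
    Referer String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/hits_replicated', '{replica}')
PARTITION BY toYYYYMM(Date)
ORDER BY (CounterID, Date)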
Replication internals

[Diagram] An INSERT goes to any one replica (e.g. Replica 1), and the inserted block number is recorded in the replication queue kept in ZooKeeper. The other replicas (Replica 2, Replica 3) see the queue entry, fetch the block from the replica that has it, and carry out the scheduled merges themselves.
Replication and the CAP theorem

What happens in case of a network failure (partition)?

Not consistent
❋ As is any system with async replication
❋ But you can turn linearizability on

Highly available (almost)
❋ Tolerates the failure of one datacenter, if ClickHouse replicas are in at least 2 DCs and ZooKeeper replicas are in 3 DCs
❋ A server partitioned from the ZooKeeper quorum is unavailable for writes
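One way to tighten consistency per query is with the quorum settings; this is a sketch of my own, not necessarily what the slide refers to, and the values are illustrative:

-- Require the INSERT to be acknowledged by at least 2 replicas.
SET insert_quorum = 2;
INSERT INTO hits_replicated VALUES (111, '2018-04-20', 'https://example.com');

-- Make SELECTs read only data that was written with the quorum.
SET select_sequential_consistency = 1;
SELECT count() FROM hits_replicated;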
Putting it all together

[Diagram] A SELECT … FROM distributed_table fans out as SELECT … FROM replicated_table to one replica of each shard. The cluster layout is 3 shards x 2 replicas: Shards 1-3, each with Replica 1 and Replica 2.
Things to remember about replication

Use it!
› Replicas check each other
› Unsure whether an INSERT went through? Simply retry: the blocks will be deduplicated
› ZooKeeper is needed, but only for INSERTs (no added latency for SELECTs)

Monitor replica lag
› The system.replicas and system.replication_queue tables are your friends
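A quick way to watch replica lag from SQL (a sketch; the exact column set of system.replicas can vary between versions):

-- Replication delay and queue length for every replicated table on this server.
SELECT database, table, absolute_delay, queue_size
FROM system.replicas
ORDER BY absolute_delay DESC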
Brief recap

› Column-oriented
› Fast interactive queries on real-time data
› SQL dialect + extensions
› Bad fit for OLTP, key-value, and blob storage workloads
› Scales linearly
› Fault tolerant
› Open source!
Thank you

Questions? Or reach us at:
› clickhouse-feedback@yandex-team.com
› Telegram: https://t.me/clickhouse_en
› GitHub: https://github.com/yandex/ClickHouse/
› Google group: https://groups.google.com/group/clickhouse