Scaling Log-Structured KV-Stores featuring Monkey and Dostoevsky - PowerPoint PPT Presentation

Scaling Log-Structured KV-Stores featuring Monkey and Dostoevsky SIGMOD17 / SIGMOD18 Niv Dayan

Log-Structured KV-Stores

Why Log-Structured KV-Stores?

Why Log-Structured KV-Stores? fast writes

Why Log-Structured KV-Stores? memory storage

Why Log-Structured KV-Stores?

Why Log-Structured KV-Stores? byte-addressable block-addressable

write data

In-Place Writes write data

In-Place Writes B-trees write data

Log-Structured Writes

Log-Structured Writes buffer writes

Log-Structured KV-Stores fast writes buffer writes

Log-Structured KV-Stores fast writes fast reads massive data

Background

Background buffer The Log-Structured Merge-Tree

Background buffer LSM-tree

buffer

writes buffer

key value pairs buffer

key value Sherlock: a fictional detective Waldo: an inconspicuous traveler buffer

buffer gets full

levels 0 (buffer), 1: sort & flush

levels 0 (buffer), 1: sort & flush … sorted runs

levels 0 (buffer), 1, 2: sort-merge

levels 0 (buffer), 1, 2, 3: exponentially increasing capacities, one I/O per run
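The write path on the slides above (buffer fills, sort & flush, sort-merge into exponentially larger levels) can be sketched as follows. This is a minimal in-memory illustration, not the implementation of any real store: the class name `TinyLSM`, the parameters `buffer_capacity` and `ratio`, and the choice of one sorted run per level are assumptions made for the example.

```python
import bisect


class TinyLSM:
    """Hypothetical LSM-tree sketch: one sorted run per level, capacities grow by `ratio`."""

    def __init__(self, buffer_capacity=2, ratio=2):
        self.buffer_capacity = buffer_capacity
        self.ratio = ratio
        self.buffer = {}   # level 0: in-memory key-value pairs
        self.levels = []   # levels[i] holds the sorted run of level i + 1

    def _capacity(self, i):
        # Assumed capacity rule: level i + 1 holds buffer_capacity * ratio**(i + 1) entries.
        return self.buffer_capacity * self.ratio ** (i + 1)

    def put(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) >= self.buffer_capacity:   # buffer gets full
            run = sorted(self.buffer.items())          # sort ...
            self.buffer = {}
            self._merge_into(0, run)                   # ... and flush

    def _merge_into(self, i, incoming):
        if i == len(self.levels):
            self.levels.append([])
        merged = dict(self.levels[i])
        merged.update(incoming)            # newer entries overwrite older ones
        run = sorted(merged.items())       # sort-merge into a single run
        if len(run) > self._capacity(i):   # level overflows: push the run down
            self.levels[i] = []
            self._merge_into(i + 1, run)
        else:
            self.levels[i] = run

    def get(self, key):
        if key in self.buffer:
            return self.buffer[key]
        for run in self.levels:            # smallest (newest) level first
            keys = [k for k, _ in run]
            idx = bisect.bisect_left(keys, key)   # binary search within the run
            if idx < len(keys) and keys[idx] == key:
                return run[idx][1]
        return None


db = TinyLSM()
db.put("Sherlock", "a fictional detective")
db.put("Waldo", "an inconspicuous traveler")
print(db.get("Waldo"))
```

The lookup here scans every run with a binary search; the pointer and Bloom filter slides that follow describe how that per-run cost is reduced.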

where’s Waldo? levels 0 (buffer), 1, 2, 3: binary searching

where’s Waldo? levels 0 (buffer), 1, 2, 3: pointers, one I/O per run

where’s Waldo? levels 0 (buffer), 1, 2, 3: pointers, Bloom filters

where’s Waldo? pointers, Bloom filters: level 1 true negative

where’s Waldo? pointers, Bloom filters: level 1 true negative, level 2 false positive

where’s Waldo? pointers, Bloom filters: level 1 true negative, level 2 false positive, level 3 true positive
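The three lookup outcomes above (true negative, false positive, true positive) can be illustrated with a small Bloom filter. This is a sketch under stated assumptions: `BloomFilter`, `make_run`, and `lookup` are hypothetical names, the hash positions are derived from SHA-256 for simplicity, and each run is modeled as a Python dict standing in for data on storage.

```python
import hashlib


class BloomFilter:
    """Hypothetical Bloom filter: may return false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)   # one byte per bit, for readability

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def make_run(data, num_bits=64, num_hashes=3):
    """Pair a run (modeled as a dict) with a filter built over its keys."""
    bf = BloomFilter(num_bits, num_hashes)
    for key in data:
        bf.add(key)
    return bf, data


def lookup(key, runs):
    """Probe runs from the smallest level down; return (value, number of I/Os)."""
    ios = 0
    for bf, data in runs:
        if not bf.might_contain(key):
            continue                # negative: skip the run, no I/O
        ios += 1                    # positive: one I/O to read from the run
        if key in data:
            return data[key], ios   # true positive
        # otherwise a false positive: the I/O was wasted
    return None, ios


runs = [
    make_run({"Sherlock": "a fictional detective"}),
    make_run({"Watson": "a doctor"}),
    make_run({"Waldo": "an inconspicuous traveler"}),
]
print(lookup("Waldo", runs))
```

Every I/O beyond the one that finds the key comes from a false positive, which is why the false positive rates of the filters determine the lookup cost.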

levels 0 (buffer), 1, 2, 3: pointers, Bloom filters, merging frequency

merging writes reads

merging: Leveling (read-optimized), Tiering (write-optimized)

Leveling Tiering read-optimized write-optimized

Leveling Tiering read-optimized write-optimized gather

Leveling Tiering read-optimized write-optimized gather merge & flush

Leveling Tiering read-optimized write-optimized gather

Leveling Tiering read-optimized write-optimized gather merge

Leveling Tiering read-optimized write-optimized gather merge flush

Leveling Tiering read-optimized write-optimized gather merge

Leveling Tiering read-optimized write-optimized $\log_R(N)$

Leveling Tiering read-optimized write-optimized 1 run per level R runs per level $\log_R(N)$ size ratio

Leveling Tiering read-optimized write-optimized 1 run per level R runs per level size ratio R

Leveling Tiering read-optimized write-optimized 1 run per level 1 run per level size ratio R

Leveling Tiering read-optimized write-optimized 1 run per level T runs per level size ratio R
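The shape comparison above (about log_R(N) levels, with 1 run per level under leveling and R runs per level under tiering) can be made concrete with a short calculation. The function name `lsm_shape` is hypothetical, N is measured in multiples of the buffer size, and "R runs per level" is taken as the upper bound stated on the slides; these are assumptions of the sketch.

```python
def lsm_shape(num_entries, size_ratio, policy):
    """Return (levels, max total runs) for a hypothetical LSM-tree.

    num_entries is counted in multiples of the buffer size, so the number
    of levels is the smallest L with size_ratio**L >= num_entries.
    """
    if size_ratio < 2:
        raise ValueError("size_ratio must be at least 2")
    if policy not in ("leveling", "tiering"):
        raise ValueError("policy must be 'leveling' or 'tiering'")

    # Integer loop instead of math.log to avoid floating-point rounding.
    levels, capacity = 0, 1
    while capacity < num_entries:
        capacity *= size_ratio
        levels += 1
    levels = max(levels, 1)

    runs_per_level = 1 if policy == "leveling" else size_ratio
    return levels, levels * runs_per_level


for policy in ("leveling", "tiering"):
    print(policy, lsm_shape(1024, 4, policy))
```

With the same size ratio both policies have the same number of levels; tiering has more runs to probe on a lookup, while leveling merges more often on writes.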

Tiering (write-optimized): $O(N)$ runs per level, log; Leveling (read-optimized): 1 run per level, sorted array; size ratio R

log Tiering Leveling sorted array

log Tiering size ratio R Leveling sorted array

log Tiering size ratio R Leveling sorted array (larger R moves tiering toward a log and leveling toward a sorted array)

Monkey Dostoevsky

Monkey: Optimal Navigable Key-Value Store SIGMOD17

Monkey: Optimal Navigable Key-Value Store SIGMOD17 Niv Dayan, Manos Athanassoulis, Stratos Idreos

Monkey: Optimal Navigable Key-Value Store SIGMOD17 Bloom filters, data

Bloom filters, data: x bits/entry for every filter

Bloom filters, data: false positive rate $O(e^{-x})$ for every filter

Bloom filters: false positive rate $O(e^{-x})$ for every filter, summing to $O(e^{-x} \cdot \log_R(N))$ I/O

Bloom filters: false positive rate $O(e^{-x})$ for every filter; most memory (largest level)

Bloom filters: false positive rate $O(e^{-x})$ for every filter; most memory (largest level) saves at most 1 I/O!
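The two observations above (uniform filters cost about e^{-x} per level, and the largest filter holds most of the memory while saving at most one I/O) can be checked numerically. The sketch below assumes the standard Bloom filter false positive rate with an optimal number of hash functions, p = e^{-x ln(2)^2}, one run per level, and levels that each hold R times more entries than the previous one. All function names are hypothetical.

```python
import math


def uniform_fpr(bits_per_entry):
    """False positive rate of a Bloom filter with x bits/entry and optimal hash count."""
    return math.exp(-bits_per_entry * math.log(2) ** 2)


def expected_wasted_ios(bits_per_entry, num_levels):
    """Expected I/Os wasted on false positives when every level uses the same x."""
    return num_levels * uniform_fpr(bits_per_entry)


def largest_level_memory_share(size_ratio, num_levels):
    """Fraction of filter memory used by the largest level under uniform bits/entry."""
    # Level sizes relative to the largest level: 1, 1/R, 1/R^2, ...
    return 1.0 / sum(size_ratio ** -j for j in range(num_levels))


print(expected_wasted_ios(10, 5))
print(largest_level_memory_share(4, 5))
```

Under these assumptions, with R = 4 and 5 levels the largest level's filter takes about three quarters of the filter memory, yet it can prevent at most one of the I/Os in a lookup.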

reallocate

same memory - fewer false positives reallocate

relax false positive rates: $0 < p_0 < 1$, $0 < p_1 < 1$, $0 < p_2 < 1$

relax false positive rates: $0 < p_0 < 1$, $0 < p_1 < 1$, $0 < p_2 < 1$; model: read cost $= f(p_0, p_1, \ldots)$, memory footprint $= f(p_0, p_1, \ldots)$

relax false positive rates: $0 < p_0 < 1$, $0 < p_1 < 1$, $0 < p_2 < 1$; model: read cost $= \sum_{i=1}^{L} p_i$ (sum of false positive rates), memory footprint $= -\frac{N}{\ln(2)^2} \sum_{i=1}^{L} \frac{\ln(p_i)}{T^{L-i}}$ ($T$ is the size ratio, written $R$ on the other slides)

relax false positive rates: $0 < p_0 < 1$, $0 < p_1 < 1$, $0 < p_2 < 1$; model: read cost $= \sum_{i=1}^{L} p_i$, memory footprint $= -\frac{N}{\ln(2)^2} \sum_{i=1}^{L} \frac{\ln(p_i)}{T^{L-i}}$; optimize in terms of $p_0, p_1, \ldots$

false positive rate: $p_0 \approx O(e^{-x}/R^2)$, $p_1 \approx O(e^{-x}/R^1)$, $p_2 \approx O(e^{-x}/R^0)$

false positive rate: $O(e^{-x}/R^2)$, $O(e^{-x}/R^1)$, $O(e^{-x}/R^0)$: geometric progression, sum $= O(e^{-x})$ I/O
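The step from a geometric progression to a total of O(e^{-x}) can be spelled out. The derivation below is a sketch that treats the constants hidden in the O-notation as 1 and assumes L levels with size ratio R ≥ 2, indexed from the largest level (exponent 0) upward.

```latex
\sum_{j=0}^{L-1} \frac{e^{-x}}{R^{j}}
  = e^{-x} \cdot \frac{1 - R^{-L}}{1 - R^{-1}}
  < e^{-x} \cdot \frac{R}{R - 1}
  \le 2\, e^{-x}
  = O(e^{-x})
```

The bound does not depend on L, so the sum no longer grows with the number of levels, in contrast to the uniform allocation, whose sum is L times e^{-x}.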

$O(e^{-x} \cdot \log_R(N)) > O(e^{-x})$ I/O

$O(e^{-x} \cdot \log_R(N))$ $O(e^{-x})$ I/O

[Plot: read latency (ms) vs. number of entries (log scale); RocksDB: $O(e^{-x} \cdot \log_R(N))$ I/O, Monkey: $O(e^{-x})$ I/O]
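The reallocation idea (same memory, fewer false positives) can be checked against the model from the earlier slides: read cost is the sum of the p_i, and memory is proportional to the sum of ln(p_i) divided by size_ratio^(L-i). Minimizing the first subject to the second with a Lagrange multiplier gives p_i proportional to 1 / size_ratio^(L-i), which is the geometric progression above. The sketch below is a hedged illustration of that result, not the authors' code: the function names are hypothetical, levels are indexed 1..L with L the largest, and it only covers the case where every resulting p_i stays below 1.

```python
import math


def filter_memory_bits(num_entries, size_ratio, fprs):
    """Memory model from the slides; fprs[0] is level 1, fprs[-1] the largest level."""
    num_levels = len(fprs)
    return -num_entries / math.log(2) ** 2 * sum(
        math.log(p) / size_ratio ** (num_levels - i)
        for i, p in enumerate(fprs, start=1)
    )


def geometric_fprs(size_ratio, num_levels, uniform_fpr):
    """FPRs proportional to 1/size_ratio**(L - i), using the same memory as uniform_fpr."""
    weights = [size_ratio ** -(num_levels - i) for i in range(1, num_levels + 1)]
    total = sum(weights)
    depth_sum = sum(w * (num_levels - i) for i, w in enumerate(weights, start=1))
    # Solve for the largest level's FPR so that total memory is unchanged.
    p_largest = uniform_fpr * math.exp(math.log(size_ratio) * depth_sum / total)
    if p_largest >= 1:
        raise ValueError("memory budget too small for this sketch")
    return [p_largest * w for w in weights]


uniform = [0.01] * 5
tuned = geometric_fprs(size_ratio=4, num_levels=5, uniform_fpr=0.01)
print(sum(uniform), sum(tuned))
```

For this example the tuned allocation has a lower sum of false positive rates than the uniform one at an identical memory footprint: smaller levels get lower rates because their filters are cheap, and the largest level's rate is relaxed slightly to pay for it.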

Existing Monkey

Existing Monkey Dostoevsky

tiering Monkey leveling

I/O overheads with leveling point long range short range writes

point: false positive rates $O(e^{-x}/R^2)$, $O(e^{-x}/R)$, $O(e^{-x})$: exponentially decreasing

point: false positive rates $O(e^{-x}/R^2)$, $O(e^{-x}/R)$, $O(e^{-x})$ (largest level)

point long range short range writes largest level $O(e^{-x})$
