slide-1
SLIDE 1

CAS CS 460 [Fall 2019] - https://midas.bu.edu/classes/CS460/ - Manos Athanassoulis

CS460: Intro to Database Systems

Class 14: Log-Structured-Merge Trees

Instructor: Manos Athanassoulis

https://midas.bu.edu/classes/CS460/

based on slides from Niv Dayan

slide-2
SLIDE 2

Useful when?

  • Massive dataset
  • Rapid updates/insertions
  • Fast lookups

LSM-trees are for you.

slide-3
SLIDE 3

Why now?

Invented in 1996 by Patrick O'Neil (UMass Boston).

[Figure: timeline from 1980 to 2010]

slide-4
SLIDE 4

Outline

1. Storage devices
2. Indexing problem & basic solutions
3. Basic LSM-trees
4. Leveled LSM-trees
5. Tiered LSM-trees
6. Bloom filters

slide-5
SLIDE 5

Storage devices

slide-6
SLIDE 6

The Memory Hierarchy

  • Main memory: expensive, fast; holds metadata & frequently accessed data
  • Disk: cheap, slow; holds all data

slide-7
SLIDE 7

  • Main memory access: ≈100 ns
  • Disk access: ≈10 ms
  • ≈5-6 orders of magnitude difference

slide-8
SLIDE 8

Why is disk slow?

  • Random access is slow: the disk head has to move
  • Sequential access is faster: just let the disk spin

slide-9
SLIDE 9

  • Disk is accessed in blocks: 4 kilobyte chunks, coarse access granularity
  • Main memory is accessed in words: 64 byte chunks, fine access granularity

slide-11
SLIDE 11

Outline

1. Storage devices
2. Indexing problem & basic solutions
3. Basic LSM-trees
4. Leveled LSM-trees
5. Tiered LSM-trees
6. Bloom filters

slide-13
SLIDE 13

Indexing Problem & Basic Solutions

slide-14
SLIDE 14

Indexing Problem

names → phone numbers

  • Structure on disk?
  • Lookup cost?
  • Insertion cost?

slide-15
SLIDE 15

Results Catalogue

Compare and contrast data structures. What to use when?

Data structure      Lookup cost          Insertion cost
Sorted array
Log
B-tree
Basic LSM-tree
Leveled LSM-tree
Tiered LSM-tree

slide-17
SLIDE 17

Modeling Performance

Access times in the figure: ≈1 ns, ≈100 ns (main memory, 64 byte words), ≈10 ms (disk, 4 kilobyte blocks)

Measure the bottleneck: number of block reads/writes (I/Os)

slide-18
SLIDE 18

Sorted Array

  • N entries
  • B entries fit into a disk block
  • Array spans N/B disk blocks

Lookup method & cost? Binary search: $O(\log_2(N/B))$ I/Os

Insertion cost? Push entries: $O(\frac{1}{B} \cdot \frac{N}{B})$ I/Os

[Figure: sorted array on disk. Block 1: Anne, Arnold, Barbara; Block 2: Bob, Corrie, Doug; …; Block N/B: Yulia, Zack, Zelda. Buffer: James, Sara. Also labeled: array size, pointer.]
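The binary search over blocks can be sketched in a few lines of Python. This is only an illustration under assumptions: the three blocks are the ones visible in the figure (the elided blocks are ignored), each list access stands in for one disk I/O, and the names `lookup` and `blocks` are hypothetical.

```python
from typing import List, Tuple


def lookup(blocks: List[List[str]], key: str) -> Tuple[bool, int]:
    """Binary search over sorted blocks; returns (found, number of block reads)."""
    lo, hi = 0, len(blocks) - 1
    reads = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]  # stands in for one I/O: read block `mid`
        reads += 1
        if key < block[0]:
            hi = mid - 1      # key can only be in an earlier block
        elif key > block[-1]:
            lo = mid + 1      # key can only be in a later block
        else:
            # Searching inside a block is free in the I/O cost model.
            return key in block, reads
    return False, reads


# Assumption: only the three blocks shown in the figure.
blocks = [
    ["Anne", "Arnold", "Barbara"],
    ["Bob", "Corrie", "Doug"],
    ["Yulia", "Zack", "Zelda"],
]

print(lookup(blocks, "Zack"))
print(lookup(blocks, "Aaron"))
```

Each loop iteration halves the range of candidate blocks, which is where the $O(\log_2(N/B))$ bound on block reads comes from.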

slide-19
SLIDE 19

Results Catalogue

Data structure      Lookup cost          Insertion cost
Sorted array        $O(\log_2(N/B))$     $O(N/B^2)$
Log
B-tree
Basic LSM-tree
Leveled LSM-tree
Tiered LSM-tree

slide-21
SLIDE 21

Log (append-only array)

  • N entries
  • B entries fit into a disk block
  • Array spans N/B disk blocks

Lookup method & cost? Scan: $O(N/B)$

Insertion cost? Append: $O(1/B)$

[Figure: unsorted log on disk. Block 1: Doug, Zelda, Arnold; Block 2: Yulia, Zack, Barbara; …; Block N/B: Anne, Bob, Corrie. Buffer: James, Sara. Also labeled: array size, pointer.]
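A minimal sketch of the log, assuming a block holds B = 3 entries as in the figure and that a full buffer is written out with a single block write. The class name `AppendLog` and its counters are hypothetical and exist only to make the I/O counts visible.

```python
from typing import List, Tuple


class AppendLog:
    """Append-only log: entries are buffered and written one full block at a time."""

    def __init__(self, block_size: int) -> None:
        self.block_size = block_size      # B: entries per disk block
        self.buffer: List[str] = []       # in-memory buffer
        self.blocks: List[List[str]] = [] # blocks "on disk", in arrival order
        self.block_writes = 0

    def insert(self, entry: str) -> None:
        self.buffer.append(entry)
        if len(self.buffer) == self.block_size:
            self.blocks.append(self.buffer)  # one I/O for B entries
            self.buffer = []
            self.block_writes += 1

    def lookup(self, entry: str) -> Tuple[bool, int]:
        """Scan every block until the entry is found; returns (found, block reads)."""
        if entry in self.buffer:
            return True, 0                   # buffer hits cost no I/O
        reads = 0
        for block in self.blocks:
            reads += 1                       # one I/O per scanned block
            if entry in block:
                return True, reads
        return False, reads


log = AppendLog(block_size=3)
for name in ["Doug", "Zelda", "Arnold", "Yulia", "Zack", "Barbara",
             "Anne", "Bob", "Corrie", "James", "Sara"]:
    log.insert(name)

print(log.block_writes)       # block writes for 11 inserts
print(log.lookup("Nobody"))   # an unsuccessful lookup scans every block
```

Nine of the eleven inserts fill three blocks with three writes, i.e. 1/B writes per insert, while a lookup for a missing entry reads all N/B blocks.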

slide-22
SLIDE 22

Results Catalogue

Data structure      Lookup cost          Insertion cost
Sorted array        $O(\log_2(N/B))$     $O(N/B^2)$
Log                 $O(N/B)$             $O(1/B)$
B-tree
Basic LSM-tree
Leveled LSM-tree
Tiered LSM-tree

slide-24
SLIDE 24

B-tree

Lookup method & cost? Tree search: $O(\log_B(N/B))$

Insertion method & cost? Tree search & append: $O(\log_B(N/B))$

Depth: $O(\log_B(N/B))$

[Figure: B-tree with internal nodes holding keys such as Anne, Bob, Corrie, …, Yulia above leaf blocks holding Anne, Arnold, Barbara, Bob, Corrie, Doug, …, Yulia, Zack]

slide-25
SLIDE 25

Results Catalogue

Data structure      Lookup cost          Insertion cost
Sorted array        $O(\log_2(N/B))$     $O(N/B^2)$
Log                 $O(N/B)$             $O(1/B)$
B-tree              $O(\log_B(N/B))$     $O(\log_B(N/B))$
Basic LSM-tree
Leveled LSM-tree
Tiered LSM-tree

slide-26
SLIDE 26

B-trees

Goetz Graefe Microsoft, HP Fellow, now Google ACM Software System Award

“It could be said that the world’s information is at our fingertips because of B-trees”

slide-27
SLIDE 27

B-trees are no longer sufficient

  • Cheaper to store data
  • Workloads more insert-intensive

We need better insert performance.

slide-28
SLIDE 28

Results Catalogue

Data structure      Lookup cost          Insertion cost
Sorted array        $O(\log_2(N/B))$     $O(N/B^2)$
Log                 $O(N/B)$             $O(1/B)$
B-tree              $O(\log_B(N/B))$     $O(\log_B(N/B))$
Basic LSM-tree
Leveled LSM-tree
Tiered LSM-tree

Goal: combine sub-constant insertion cost with logarithmic lookup cost.

slide-29
SLIDE 29

Basic LSM-trees

slide-30
SLIDE 30

Basic LSM-tree

[Figure: a buffer and levels 1, 2, 3 of sorted arrays]

slide-31
SLIDE 31

Basic LSM-tree

[Figure: a buffer and levels 1, 2, 3 of sorted arrays]

Design principle #1:

  • Optimize for insertions by buffering
slide-32
SLIDE 32

Basic LSM-tree

[Figure: inserts enter the buffer; levels 1, 2, 3 hold sorted arrays]

Design principle #1:

  • Optimize for insertions by buffering
slide-33
SLIDE 33

Basic LSM-tree

[Figure: inserts enter the buffer; sort & flush buffer to disk as a sorted array]

Design principle #1:

  • Optimize for insertions by buffering
slide-34
SLIDE 34

Basic LSM-tree

[Figure: inserts enter the buffer; sort & flush buffer; a second sorted array has been flushed to disk]

Design principle #1:

  • Optimize for insertions by buffering
slide-35
SLIDE 35

Basic LSM-tree

[Figure: inserts enter the buffer; sort & flush buffer; a second sorted array has been flushed to disk]

Design principle #1:

  • Optimize for insertions by buffering

Design principle #2:

  • Optimize for lookups by sort-merging arrays
slide-36
SLIDE 36

Basic LSM-tree

[Figure: inserts enter the buffer; sort & flush buffer; the flushed arrays are sort-merged into one larger sorted array]

Design principle #1:

  • Optimize for insertions by buffering

Design principle #2:

  • Optimize for lookups by sort-merging arrays
slide-37
SLIDE 37

Basic LSM-tree

[Figure: sort-merge & eliminate duplicates; the arrays being merged hold two versions of the same entry, $X_1$ and $X_2$]

Design principle #1:

  • Optimize for insertions by buffering

Design principle #2:

  • Optimize for lookups by sort-merging arrays
slide-38
SLIDE 38

Basic LSM-tree

[Figure: sort-merge & eliminate duplicates; of the two versions $X_1$ and $X_2$, the merged array keeps only $X_2$]

Design principle #1:

  • Optimize for insertions by buffering

Design principle #2:

  • Optimize for lookups by sort-merging arrays
slide-39
SLIDE 39

Basic LSM-tree

[Figure: sort-merge & eliminate duplicates & discard original arrays; the merged array keeps $X_2$]

Design principle #1:

  • Optimize for insertions by buffering

Design principle #2:

  • Optimize for lookups by sort-merging arrays

slide-40
SLIDE 40

Basic LSM-tree – Example

[Figure: an empty buffer and levels 1, 2, 3 of sorted arrays]

slide-41
SLIDE 41

Basic LSM-tree – Example

[Figure: inserts 4, 6, 9 fill the buffer]

slide-42
SLIDE 42

Basic LSM-tree – Example

[Figure: the buffer holds 4, 6, 9; sort & flush buffer]

slide-43
SLIDE 43

Basic LSM-tree – Example

[Figure: 4 6 9 is now a sorted array on disk; the buffer accepts new inserts]

slide-44
SLIDE 44

Basic LSM-tree – Example

[Figure: inserts 3, 4, 8 fill the buffer; 4 6 9 is on disk]

slide-45
SLIDE 45

Basic LSM-tree – Example

[Figure: the buffer holds 3, 4, 8; sort & flush buffer; 4 6 9 is on disk]

slide-46
SLIDE 46

Basic LSM-tree – Example

[Figure: two sorted arrays on disk, 4 6 9 and 3 4 8]

slide-47
SLIDE 47

Basic LSM-tree – Example

[Figure: sort-merge of 4 6 9 and 3 4 8 produces 3 4 6 8 9]

slide-48
SLIDE 48

Basic LSM-tree – Example

[Figure: sort-merge & eliminate duplicates; $4_1$ 6 9 and 3 $4_2$ 8 produce 3 $4_2$ 6 8 9]

slide-49
SLIDE 49

Basic LSM-tree – Example

[Figure: sort-merge & eliminate duplicates & discard original arrays; 4 6 9 and 3 4 8 are discarded in favor of 3 4 6 8 9]

slide-50
SLIDE 50

Basic LSM-tree – Example

[Figure: only the merged array 3 4 6 8 9 remains on disk; the buffer accepts new inserts]

slide-51
SLIDE 51

Basic LSM-tree – Example

[Figure: inserts 2, 7, 8 fill the buffer; 3 4 6 8 9 is on disk]

slide-52
SLIDE 52

Basic LSM-tree – Example

[Figure: the buffer holds 2, 7, 8; sort & flush buffer; 3 4 6 8 9 is on disk]

slide-53
SLIDE 53

Basic LSM-tree – Example

[Figure: two sorted arrays on disk, 2 7 8 and 3 4 6 8 9]
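The example can be replayed with a small Python sketch. It is written under assumptions rather than taken from the slides: keys only (no values), a buffer of three entries, at most one sorted array per level, and a merge policy in which a flushed array that meets an existing array is sort-merged with it and pushed one level down. The names `BasicLSMTree` and `sort_merge` are hypothetical.

```python
from bisect import bisect_left
from typing import List, Optional


def sort_merge(newer: List[int], older: List[int]) -> List[int]:
    """Merge two sorted arrays, keeping a single copy of duplicate keys."""
    merged, i, j = [], 0, 0
    while i < len(newer) and j < len(older):
        if newer[i] < older[j]:
            merged.append(newer[i]); i += 1
        elif newer[i] > older[j]:
            merged.append(older[j]); j += 1
        else:
            merged.append(newer[i]); i += 1; j += 1  # duplicate: keep the newer copy
    merged.extend(newer[i:])
    merged.extend(older[j:])
    return merged


class BasicLSMTree:
    def __init__(self, buffer_capacity: int) -> None:
        self.buffer_capacity = buffer_capacity
        self.buffer: List[int] = []
        # levels[i] is the sorted array at level i + 1, or None if that level is empty
        self.levels: List[Optional[List[int]]] = []

    def insert(self, key: int) -> None:
        self.buffer.append(key)
        if len(self.buffer) == self.buffer_capacity:
            self._flush()

    def _flush(self) -> None:
        run = sorted(set(self.buffer))  # sort & flush buffer
        self.buffer = []
        level = 0
        while True:
            if level == len(self.levels):
                self.levels.append(None)
            if self.levels[level] is None:
                self.levels[level] = run
                return
            # Sort-merge, eliminate duplicates, discard the original array.
            run = sort_merge(run, self.levels[level])
            self.levels[level] = None
            level += 1

    def lookup(self, key: int) -> bool:
        if key in self.buffer:
            return True
        for run in self.levels:  # youngest to oldest
            if run is not None:
                i = bisect_left(run, key)  # binary search within the array
                if i < len(run) and run[i] == key:
                    return True
        return False


tree = BasicLSMTree(buffer_capacity=3)
for key in [4, 6, 9, 3, 4, 8, 2, 7, 8]:
    tree.insert(key)
print(tree.levels)
```

After the nine inserts the sketch holds the same two sorted arrays as the final slide of the example: 2 7 8 and 3 4 6 8 9.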

slide-54
SLIDE 54

Basic LSM-tree

Levels have exponentially increasing capacities.

[Figure: buffer and levels 1, 2, 3 of sorted arrays with capacities 1, 2, 4, 8]

slide-55
SLIDE 55

Basic LSM-tree – Lookup cost

[Figure: buffer and levels 1, 2, 3 of sorted arrays with capacities 1, 2, 4, 8]

  • Lookup method? Search youngest to oldest: $O(\log_2(N/B))$ levels
  • How? Binary search: $O(\log_2(N/B))$ I/Os per level
  • Lookup cost? $O((\log_2(N/B))^2)$ I/Os

slide-56
SLIDE 56

Basic LSM-tree – Insertion cost

[Figure: buffer and levels 1, 2, 3 of sorted arrays with capacities 1, 2, 4, 8]

  • How many times is each entry copied? $O(\log_2(N/B))$
  • What is the price of each copy? $O(1/B)$
  • Total insert cost? $O(\frac{1}{B} \cdot \log_2(N/B))$

slide-57
SLIDE 57

Results Catalogue

Data structure      Lookup cost              Insertion cost
Sorted array        $O(\log_2(N/B))$         $O(N/B)$
Log                 $O(N/B)$                 $O(1/B)$
B-tree              $O(\log_B(N/B))$         $O(\log_B(N/B))$
Basic LSM-tree      $O((\log_2(N/B))^2)$     $O(\frac{1}{B} \cdot \log_2(N/B))$
Leveled LSM-tree
Tiered LSM-tree

slide-58
SLIDE 58

Results Catalogue

Data structure      Lookup cost              Insertion cost
Sorted array        $O(\log_2(N/B))$         $O(N/B)$
Log                 $O(N/B)$                 $O(1/B)$
B-tree              $O(\log_B(N/B))$         $O(\log_B(N/B))$
Basic LSM-tree      $O((\log_2(N/B))^2)$     $O(\frac{1}{B} \cdot \log_2(N/B))$
Leveled LSM-tree
Tiered LSM-tree

Better insert cost and worse lookup cost compared with B-trees.

slide-59
SLIDE 59

Results Catalogue

Data structure      Lookup cost              Insertion cost
Sorted array        $O(\log_2(N/B))$         $O(N/B)$
Log                 $O(N/B)$                 $O(1/B)$
B-tree              $O(\log_B(N/B))$         $O(\log_B(N/B))$
Basic LSM-tree      $O((\log_2(N/B))^2)$     $O(\frac{1}{B} \cdot \log_2(N/B))$
Leveled LSM-tree
Tiered LSM-tree

Better insert cost and worse lookup cost compared with B-trees. Can we improve lookup cost?

slide-60
SLIDE 60

Declining Main Memory Cost

slide-61
SLIDE 61

Declining Main Memory Cost

In main memory, store a fence pointer for every disk block.

[Figure: fence pointers array 1, 10, 15, … in main memory; on disk, Block 1: 1, 3, 6; Block 2: 10, 11, 13; Block 3: 15, 16, 18; …]
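A minimal sketch of a fence-pointer lookup, assuming each fence pointer is the smallest key of its block (as the figure suggests) and using only the three blocks that are shown. The names `blocks_on_disk`, `fence_pointers`, and `lookup` are hypothetical.

```python
from bisect import bisect_right
from typing import Tuple

# Blocks of the sorted array "on disk", as shown in the figure.
blocks_on_disk = [[1, 3, 6], [10, 11, 13], [15, 16, 18]]

# One fence pointer per block, kept in main memory: the block's smallest key.
fence_pointers = [block[0] for block in blocks_on_disk]  # [1, 10, 15]


def lookup(key: int) -> Tuple[bool, int]:
    """Returns (found, number of block reads)."""
    # In-memory binary search: last block whose fence pointer is <= key.
    idx = bisect_right(fence_pointers, key) - 1
    if idx < 0:
        return False, 0          # smaller than every key: no I/O needed
    block = blocks_on_disk[idx]  # the single I/O
    return key in block, 1


print(lookup(11))
print(lookup(14))
```

The binary search runs over the in-memory fence pointers, so at most one block is read from disk per sorted array.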

slide-62
SLIDE 62

Results Catalogue – with fence pointers

Data structure      Lookup cost              Insertion cost
Sorted array        $O(\log_2(N/B))$         $O(N/B)$
Log                 $O(N/B)$                 $O(1/B)$
B-tree              $O(\log_B(N/B))$         $O(\log_B(N/B))$
Basic LSM-tree      $O((\log_2(N/B))^2)$     $O(\frac{1}{B} \cdot \log_2(N/B))$
Leveled LSM-tree
Tiered LSM-tree

slide-64
SLIDE 64

Results Catalogue – with fence pointers

Data structure      Lookup cost              Insertion cost
Sorted array        $O(1)$                   $O(N/B)$
Log                 $O(N/B)$                 $O(1/B)$
B-tree              $O(\log_B(N/B))$         $O(\log_B(N/B))$
Basic LSM-tree      $O((\log_2(N/B))^2)$     $O(\frac{1}{B} \cdot \log_2(N/B))$
Leveled LSM-tree
Tiered LSM-tree

slide-67
SLIDE 67

Results Catalogue – with fence pointers

Data structure      Lookup cost              Insertion cost
Sorted array        $O(1)$                   $O(N/B)$
Log                 $O(N/B)$                 $O(1/B)$
B-tree              $O(1)$                   $O(1)$
Basic LSM-tree      $O((\log_2(N/B))^2)$     $O(\frac{1}{B} \cdot \log_2(N/B))$
Leveled LSM-tree
Tiered LSM-tree

slide-69
SLIDE 69

Results Catalogue – with fence pointers

Data structure      Lookup cost              Insertion cost
Sorted array        $O(1)$                   $O(N/B)$
Log                 $O(N/B)$                 $O(1/B)$
B-tree              $O(1)$                   $O(1)$
Basic LSM-tree      $O(\log_2(N/B))$         $O(\frac{1}{B} \cdot \log_2(N/B))$
Leveled LSM-tree
Tiered LSM-tree

slide-70
SLIDE 70

Results Catalogue – with fence pointers

Data structure      Lookup cost              Insertion cost
Sorted array        $O(1)$                   $O(N/B)$
Log                 $O(N/B)$                 $O(1/B)$
B-tree              $O(1)$                   $O(1)$
Basic LSM-tree      $O(\log_2(N/B))$         $O(\frac{1}{B} \cdot \log_2(N/B))$
Leveled LSM-tree
Tiered LSM-tree

Quick sanity check: suppose $N = 2^{42}$ and $B = 2^{10}$

slide-71
SLIDE 71

Results Catalogue – with fence pointers

Data structure      Lookup cost     Insertion cost
Sorted array        $O(1)$          $O(2^{32})$
Log                 $O(2^{32})$     $O(2^{-10})$
B-tree              $O(1)$          $O(1)$
Basic LSM-tree      $O(32)$         $O(2^{-10} \cdot 32)$
Leveled LSM-tree
Tiered LSM-tree

Quick sanity check: suppose $N = 2^{42}$ and $B = 2^{10}$, so that $N/B = 2^{32}$ and $\log_2(N/B) = 32$
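The sanity check is plain arithmetic, so it can be reproduced directly. The sketch below simply evaluates the catalogue formulas for the stated N and B, treating every constant hidden by the O-notation as 1 (an assumption); the dictionary `catalogue` is a hypothetical name.

```python
import math

N = 2 ** 42  # number of entries
B = 2 ** 10  # entries per disk block

num_blocks = N // B             # N/B
levels = math.log2(num_blocks)  # log2(N/B)

# (lookup cost, insertion cost) in I/Os, with fence pointers
catalogue = {
    "Sorted array": (1, num_blocks),
    "Log": (num_blocks, 1 / B),
    "B-tree": (1, 1),
    "Basic LSM-tree": (levels, levels / B),
}

for name, (lookup_cost, insertion_cost) in catalogue.items():
    print(f"{name:15s} lookup ~ {lookup_cost:g} I/Os, insertion ~ {insertion_cost:g} I/Os")
```

With these numbers N/B is 2^32 and log2(N/B) is 32, so the basic LSM-tree pays a few dozen I/Os per lookup but only a small fraction of an I/O per insertion.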

slide-72
SLIDE 72

Leveled LSM-tree

Lookup cost Update cost

slide-73
SLIDE 73

Leveled LSM-tree

Lookup cost depends on number of levels. How to reduce it? Increase size ratio T.

[Figure: buffer and levels 1, 2, 3 of sorted arrays with capacities $T^0$, $T^1$, $T^2$, $T^3$]

slide-74
SLIDE 74

Leveled LSM-tree

Lookup cost depends on number of levels. How to reduce it? Increase size ratio T, e.g. size ratio of 4.

[Figure: buffer and levels 1, 2, 3 of sorted arrays with capacities 1, 4, 16, 64]

slide-75
SLIDE 75

Leveled LSM-tree

Lookup cost depends on number of levels. How to reduce it? Increase size ratio T, e.g. size ratio of 4.

[Figure: capacities 1, 4, 16, 64; inserts enter the buffer]

slide-76
SLIDE 76

Leveled LSM-tree

Lookup cost depends on number of levels. How to reduce it? Increase size ratio T, e.g. size ratio of 4.

[Figure: capacities 1, 4, 16, 64; inserts; flush: the buffer becomes a sorted array on disk]

slide-77
SLIDE 77

Leveled LSM-tree

Lookup cost depends on number of levels. How to reduce it? Increase size ratio T, e.g. size ratio of 4.

[Figure: capacities 1, 4, 16, 64; inserts; flush & sort-merge: the flushed buffer is sort-merged into the existing sorted array, which now holds two buffers' worth of entries]

slide-78
SLIDE 78

Leveled LSM-tree

Lookup cost depends on number of levels. How to reduce it? Increase size ratio T, e.g. size ratio of 4.

[Figure: capacities 1, 4, 16, 64; inserts; flush & sort-merge: the sorted array now holds three buffers' worth of entries]

slide-79
SLIDE 79

Leveled LSM-tree

Lookup cost depends on number of levels. How to reduce it? Increase size ratio T, e.g. size ratio of 4.

[Figure: capacities 1, 4, 16, 64; inserts; flush & sort-merge: the sorted array now holds four buffers' worth of entries]

slide-80
SLIDE 80

Leveled LSM-tree

Lookup cost depends on number of levels. How to reduce it? Increase size ratio T, e.g. size ratio of 4.

[Figure: capacities 1, 4, 16, 64; inserts; move: the sorted array moves to the next level]

slide-81
SLIDE 81

Leveled LSM-tree

Lookup cost depends on number of levels. How to reduce it? Increase size ratio T, e.g. size ratio of 4.

[Figure: capacities 1, 4, 16, 64; inserts continue after the move]

slide-82
SLIDE 82

Leveled LSM-tree

[Figure: buffer and levels 1, 2, 3 of sorted arrays with capacities 1, 4, 16, 64]

Lookup cost? $O(\log_T(N/B))$

Insertion cost? $O(\frac{T}{B} \cdot \log_T(N/B))$

slide-83
SLIDE 83

Leveled LSM-tree

Lookup cost? $O(\log_T(N/B))$

Insertion cost? $O(\frac{T}{B} \cdot \log_T(N/B))$

What happens as we increase the size ratio T? What happens when size ratio T is set to be N/B?

  • Lookup cost becomes: $O(1)$
  • Insert cost becomes: $O(N/B^2)$

The LSM-tree becomes a sorted array!
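The limiting case follows by substituting $T = N/B$ into the two leveled cost formulas; the short derivation below spells out that substitution and introduces nothing beyond the symbols already used on the slide.

```latex
\begin{align*}
\text{number of levels} &= \log_T(N/B) = \log_{N/B}(N/B) = 1 \\
\text{lookup cost} &= O\!\left(\log_T(N/B)\right) = O(1) \\
\text{insertion cost} &= O\!\left(\frac{T}{B} \cdot \log_T(N/B)\right)
  = O\!\left(\frac{N/B}{B} \cdot 1\right)
  = O\!\left(\frac{N}{B^2}\right)
\end{align*}
```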

slide-84
SLIDE 84

[Figure: trade-off plot with axes lookup cost and insertion cost; leveling spans from the basic LSM-tree to the sorted array]

slide-85
SLIDE 85

Results Catalogue – with fence pointers

Data structure      Lookup cost              Insertion cost
Sorted array        $O(1)$                   $O(N/B)$
Log                 $O(N/B)$                 $O(1/B)$
B-tree              $O(1)$                   $O(1)$
Basic LSM-tree      $O(\log_2(N/B))$         $O(\frac{1}{B} \cdot \log_2(N/B))$
Leveled LSM-tree    $O(\log_T(N/B))$         $O(\frac{T}{B} \cdot \log_T(N/B))$
Tiered LSM-tree

slide-86
SLIDE 86

Tiered LSM-tree

Lookup cost Insertion cost

slide-87
SLIDE 87

Tiered LSM-tree

  • Reduce the number of levels by increasing the size ratio.
  • Do not merge within a level.

[Figure: buffer and levels 1, 2, 3 of sorted arrays with capacities $T^0$, $T^1$, $T^2$, $T^3$]

slide-88
SLIDE 88

Tiered LSM-tree

  • Reduce the number of levels by increasing the size ratio.
  • Do not merge within a level.

E.g. size ratio of 4

[Figure: buffer and levels 1, 2, 3 of sorted arrays with capacities 1, 4, 16, 64]

slide-89
SLIDE 89

Tiered LSM-tree

  • Reduce the number of levels by increasing the size ratio.
  • Do not merge within a level.

E.g. size ratio of 4

[Figure: capacities 1, 4, 16, 64; inserts; flush: one sorted array on disk]

slide-90
SLIDE 90

Tiered LSM-tree

  • Reduce the number of levels by increasing the size ratio.
  • Do not merge within a level.

E.g. size ratio of 4

[Figure: capacities 1, 4, 16, 64; inserts; flush: two sorted arrays sit side by side, unmerged]

slide-91
SLIDE 91

Tiered LSM-tree

  • Reduce the number of levels by increasing the size ratio.
  • Do not merge within a level.

E.g. size ratio of 4

[Figure: capacities 1, 4, 16, 64; inserts; flush: three sorted arrays sit side by side, unmerged]

slide-92
SLIDE 92

Tiered LSM-tree

  • Reduce the number of levels by increasing the size ratio.
  • Do not merge within a level.

E.g. size ratio of 4

[Figure: capacities 1, 4, 16, 64; inserts; flush: four sorted arrays sit side by side, unmerged]

slide-93
SLIDE 93

Tiered LSM-tree

  • Reduce the number of levels by increasing the size ratio.
  • Do not merge within a level.

E.g. size ratio of 4

[Figure: capacities 1, 4, 16, 64; inserts; sort-merge: the four sorted arrays are merged into one larger sorted array]

slide-94
SLIDE 94

Tiered LSM-tree

  • Reduce the number of levels by increasing the size ratio.
  • Do not merge within a level.

E.g. size ratio of 4

[Figure: capacities 1, 4, 16, 64; inserts; only the merged sorted array remains]

slide-95
SLIDE 95

Tiered LSM-tree

[Figure: buffer and levels 1, 2, 3 of sorted arrays with capacities 1, 4, 16, 64]

Lookup cost? $O(T \cdot \log_T(N/B))$

Insertion cost? $O(\frac{1}{B} \cdot \log_T(N/B))$

slide-96
SLIDE 96

Tiered LSM-tree

Lookup cost? $O(T \cdot \log_T(N/B))$

Insertion cost? $O(\frac{1}{B} \cdot \log_T(N/B))$

What happens as we increase the size ratio T? What happens when size ratio T is set to be N/B?

  • Lookup cost becomes: $O(N/B)$
  • Insert cost becomes: $O(1/B)$

The tiered LSM-tree becomes a log!

slide-97
SLIDE 97

[Figure: trade-off plot with axes lookup cost and insertion cost; leveling spans from the basic LSM-tree to the sorted array, tiering spans from the basic LSM-tree to the log]

slide-98
SLIDE 98

Results Catalogue – with fence pointers

Data structure      Lookup cost                  Insertion cost
Sorted array        $O(1)$                       $O(N/B)$
Log                 $O(N/B)$                     $O(1/B)$
B-tree              $O(1)$                       $O(1)$
Basic LSM-tree      $O(\log_2(N/B))$             $O(\frac{1}{B} \cdot \log_2(N/B))$
Leveled LSM-tree    $O(\log_T(N/B))$             $O(\frac{T}{B} \cdot \log_T(N/B))$
Tiered LSM-tree     $O(T \cdot \log_T(N/B))$     $O(\frac{1}{B} \cdot \log_T(N/B))$
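To get a feel for how the size ratio T moves a design along the lookup/insertion trade-off, the leveled and tiered formulas can be evaluated side by side. This is a rough cost model only: constants hidden by the O-notation are taken to be 1, and the function names `num_levels`, `leveled_costs`, and `tiered_costs` are hypothetical.

```python
import math
from typing import Tuple


def num_levels(n: int, b: int, t: int) -> float:
    """log_T(N/B), computed via base-2 logarithms."""
    return math.log2(n / b) / math.log2(t)


def leveled_costs(n: int, b: int, t: int) -> Tuple[float, float]:
    """(lookup, insertion) I/Os for leveling: log_T(N/B) and T/B * log_T(N/B)."""
    levels = num_levels(n, b, t)
    return levels, (t / b) * levels


def tiered_costs(n: int, b: int, t: int) -> Tuple[float, float]:
    """(lookup, insertion) I/Os for tiering: T * log_T(N/B) and 1/B * log_T(N/B)."""
    levels = num_levels(n, b, t)
    return t * levels, levels / b


N, B = 2 ** 42, 2 ** 10
for T in (2, 4, 16, N // B):
    print(T, leveled_costs(N, B, T), tiered_costs(N, B, T))
```

At T = N/B there is a single level: the leveled costs collapse to those of a sorted array, O(1) lookups and O(N/B^2) insertions, and the tiered costs to those of a log, O(N/B) lookups and O(1/B) insertions.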

slide-99
SLIDE 99

Bloom filters

slide-100
SLIDE 100

Declining Main Memory Cost

slide-101
SLIDE 101

Bloom Filters

  • Answers set-membership queries
  • Smaller than array, and stored in main memory
  • Purpose: avoid accessing disk if entry is not in array
  • Subtlety: may return false positives.

[Figure: a Bloom filter filters accesses to an array that contains X]

slide-102
SLIDE 102

Bloom Filters

  • Answers set-membership queries
  • Smaller than array, and stored in main memory
  • Purpose: avoid accessing disk if entry is not in array
  • Subtlety: may return false positives.

[Figure: lookup for X arrives at the Bloom filter; the array contains X]

slide-104
SLIDE 104

Bloom Filters

  • Answers set-membership queries
  • Smaller than array, and stored in main memory
  • Purpose: avoid accessing disk if entry is not in array
  • Subtlety: may return false positives.

[Figure: lookup for X passes the Bloom filter; access on disk]

slide-105
SLIDE 105

Bloom Filters

  • Answers set-membership queries
  • Smaller than array, and stored in main memory
  • Purpose: avoid accessing disk if entry is not in array
  • Subtlety: may return false positives.

[Figure: lookup for Y arrives at the Bloom filter; the array contains X]

slide-108
SLIDE 108

Bloom Filters

  • Answers set-membership queries
  • Smaller than array, and stored in main memory
  • Purpose: avoid accessing disk if entry is not in array
  • Subtlety: may return false positives.

[Figure: lookup for Z arrives at the Bloom filter; the array contains X]

slide-110
SLIDE 110

Bloom Filters

  • Answers set-membership queries
  • Smaller than array, and stored in main memory
  • Purpose: avoid accessing disk if entry is not in array
  • Subtlety: may return false positives.

[Figure: lookup for Z passes the Bloom filter; access on disk]
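A minimal Bloom filter sketch in Python illustrates the behavior described above. The design choices here are assumptions, not taken from the slides: bit positions are derived from salted SHA-256 digests, one byte is used per bit for simplicity, and the names `BloomFilter`, `add`, and `might_contain` are hypothetical.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means "definitely not in the array": the disk access is skipped.
        # True means "possibly in the array": it may be a false positive.
        return all(self.bits[pos] for pos in self._positions(key))


bloom = BloomFilter(num_bits=64, num_hashes=3)
bloom.add("X")

for key in ["X", "Y", "Z"]:
    if bloom.might_contain(key):
        print(f"lookup for {key}: access the array on disk")
    else:
        print(f"lookup for {key}: skip the disk access")
```

A key that was added always passes the filter, so there are no false negatives. A key that was never added is usually rejected but can occasionally pass, in which case the disk access is wasted; giving the filter more bits makes this rarer.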

slide-111
SLIDE 111

Bloom Filters

The more main memory, the fewer false positives, and so the cheaper the lookups.

slide-112
SLIDE 112

Bloom Filters

The more main memory, the fewer false positives, and so the cheaper the lookups.

[Figure: plot with axes lookup cost and insertion cost]

slide-113
SLIDE 113

Conclusions

  • Write-optimized
  • Highly tunable
  • Backbone of many modern systems
  • Trade-off between lookup and insert cost (tiering/leveling, size ratio)
  • Trade main memory for lookup cost (fence pointers, Bloom filters)

Thank you!