Skewed Binary Search Trees. Gerth Stølting Brodal, University of Aarhus. Joint work with Gabriel Moruz, presented at ESA'06. CPH STL Workshop, University of Copenhagen, October 30, 2006.


SLIDE 1

Skewed Binary Search Trees

[Figure: skewed binary search tree on the keys 1 to 15, rooted at 6]

Gerth Stølting Brodal University of Aarhus Joint work with Gabriel Moruz presented at ESA’06

CPH STL Workshop, University of Copenhagen, October 30, 2006.

SLIDE 2

Perfectly Balanced Search Trees

[Figure: perfectly balanced binary search tree on the keys 1 to 15]

SLIDE 3

Skewed Binary Search Trees

[Figure: root x with a left subtree of ⌊α(n − 1)⌋ nodes (fraction α) and a right subtree of ⌈(1 − α)(n − 1)⌉ nodes (fraction 1 − α)]
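The split rule on this slide can be turned into a small recursive builder. The following is only a sketch under stated assumptions: the names `node_t`, `build_skewed` and the sentinel value of `NULLV` are hypothetical (only the identifier `NULLV` and the `key`/`left`/`right` fields appear in the talk's search code), and storing key k at array index k − 1 is a choice made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NULLV (-1)              /* assumed sentinel for "no child" */

typedef struct {
    int32_t key;
    int32_t left;               /* index of the left child, or NULLV */
    int32_t right;              /* index of the right child, or NULLV */
} node_t;

/* Build the skewed tree on the keys lo..hi (inclusive) into t[].
   The node with key k is stored at index k - 1 (illustrative choice).
   Returns the index of the subtree root, or NULLV for an empty range. */
int32_t build_skewed(node_t *t, int32_t lo, int32_t hi, double alpha)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    if (lo > hi) return NULLV;

    int32_t n = hi - lo + 1;
    /* floor(alpha * (n - 1)); truncation equals floor for values >= 0.
       The right subtree then gets the remaining ceil((1 - alpha)(n - 1)). */
    int32_t left_size = (int32_t)(alpha * (double)(n - 1));
    int32_t key = lo + left_size;
    int32_t idx = key - 1;

    t[idx].key = key;
    t[idx].left = build_skewed(t, lo, key - 1, alpha);
    t[idx].right = build_skewed(t, key + 1, hi, alpha);
    return idx;
}
```

Calling `build_skewed(t, 1, 15, 0.4)` on an array of 15 nodes yields a tree rooted at key 6 with children 2 and 10, which agrees with the α = 0.4 example in the talk. Floating-point rounding can matter when α(n − 1) is mathematically an integer, so exact arithmetic may be preferable in a careful implementation.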

SLIDE 4

Skewed Binary Search Trees

[Figure: binary search tree on the keys 1 to 15]

α = 0.5

SLIDE 5

Skewed Binary Search Trees

[Figure: binary search tree on the keys 1 to 15, rooted at 6]

α = 0.4

SLIDE 6

Skewed Binary Search Trees

[Figure: binary search tree on the keys 1 to 15]

α = 0.2

SLIDE 7

Skewed Binary Search Trees

[Figure: binary search tree on the keys 1 to 15]

α = 0.05

SLIDE 8

Skewed Binary Search Trees

Average Node Depth

≤ (1/H(α)) · ((n + 1)/n) · log2(n + 1) − 2

where H(α) = −α log2 α − (1 − α) log2(1 − α)

J. Nievergelt and E. M. Reingold, 1972
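As a sanity check on this bound, consider the perfectly balanced case α = 1/2 with n = 15. The worked numbers below assume the root is counted at depth 0, the convention under which the two values agree:

```latex
\begin{align*}
H(\tfrac12) &= -\tfrac12\log_2\tfrac12 - \tfrac12\log_2\tfrac12 = 1 \\
\frac{1}{H(\tfrac12)}\cdot\frac{n+1}{n}\cdot\log_2(n+1) - 2
  &= \frac{16}{15}\cdot 4 - 2 = \frac{34}{15} \approx 2.27 \\
\text{exact average depth}
  &= \frac{0\cdot 1 + 1\cdot 2 + 2\cdot 4 + 3\cdot 8}{15} = \frac{34}{15}
\end{align*}
```

So for this tree the bound is met with equality.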

SLIDE 9

1/H(α)

[Plot: 1/H(α) as a function of α]

SLIDE 10

Comparisons

[Plot: number of comparisons as a function of α]

n = 50,000

SLIDE 11

Running Time

[Plot: running time as a function of α]

Best running time achieved for α ≈ 0.3!?

SLIDE 12

Conclusion

Skewed binary search trees can beat perfectly balanced binary search trees!

SLIDE 13

Why?

SLIDE 14

Why?

The costs of going left and right are different!

Possible reasons:

  • Number of instructions
  • Branch mispredictions
  • Cache faults (what is a good memory layout?)
  • ...

SLIDE 15

Expected Cost

cost(α) = (α · {left cost} + (1 − α) · {right cost})/H(α)

[Plot: cost(α) as a function of α]

left cost = 1 and right cost = 3
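The minimum of this cost function can be located analytically. The derivation below is a sketch that is not part of the slides; the symbols c_ℓ and c_r for the left and right cost are notation introduced here for illustration.

```latex
\begin{align*}
\mathrm{cost}(\alpha) &= \frac{\alpha\,c_\ell + (1-\alpha)\,c_r}{H(\alpha)},
  \qquad H'(\alpha) = \log_2\frac{1-\alpha}{\alpha} \\
\mathrm{cost}'(\alpha) = 0
  &\iff (c_\ell - c_r)\,H(\alpha)
     = \bigl(\alpha\,c_\ell + (1-\alpha)\,c_r\bigr)\log_2\frac{1-\alpha}{\alpha} \\
  &\iff c_r \log_2 \alpha = c_\ell \log_2(1-\alpha)
   \iff \alpha^{c_r} = (1-\alpha)^{c_\ell} \\
c_\ell = 1,\ c_r = 3 &:\quad \alpha^3 + \alpha - 1 = 0
  \;\Rightarrow\; \alpha \approx 0.68,
  \qquad \mathrm{cost}(\alpha) = \frac{c_\ell}{\log_2(1/\alpha)} \approx 1.81
\end{align*}
```

For comparison, the balanced choice α = 0.5 gives cost(0.5) = (0.5 · 1 + 0.5 · 3)/1 = 2, so with these costs a tree skewed toward the cheaper side does slightly better.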

SLIDE 16

Expected Cost

cost(α) = (α · {left cost} + (1 − α) · {right cost})/H(α)

[Plot: cost(α) as a function of α]

left cost = 1 and right cost = 0 .. 28

SLIDE 17

Experimental setup

  • AMD Athlon XP 2400+
  • 2.0 GHz
  • 256 KB L2 cache
  • 64 KB L1 data cache
  • 64 KB L1 instruction cache
  • 1 GB RAM
  • Linux 2.6.8.1
  • GCC 3.3.2
  • Tree nodes = 12 bytes
  • No unsuccessful searches

SLIDE 18

Search Code

    while (root != NULLV) {
        if (key == t[root].key) return root;
        if (key > t[root].key) root = t[root].right;
        else root = t[root].left;
    }
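For readers who want to compile the loop, here is one way it could be wrapped into a function. This is a sketch, not the authors' code: the struct name `node_t`, the function name `bst_search`, the 32-bit field types and the value of `NULLV` are assumptions, chosen so that a node occupies 12 bytes as stated in the experimental setup. The final `return NULLV` covers unsuccessful searches, which the experiments did not exercise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NULLV (-1)              /* assumed sentinel for "no child" */

typedef struct {
    int32_t key;
    int32_t left;               /* index into the node array, or NULLV */
    int32_t right;              /* index into the node array, or NULLV */
} node_t;                       /* 3 x 4 bytes = 12 bytes */

/* Search for key in the tree stored in array t, starting at index root.
   Returns the index of the matching node, or NULLV if the key is absent. */
int32_t bst_search(const node_t *t, int32_t root, int32_t key)
{
    assert(t != NULL);
    while (root != NULLV) {
        if (key == t[root].key) return root;
        if (key > t[root].key) root = t[root].right;
        else root = t[root].left;
    }
    return NULLV;
}
```

Note that the loop tests `key > t[root].key` before falling through to the left branch, so a step to the right and a step to the left do not execute the same instruction sequence.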

SLIDE 19

Branch Mispredictions

[Plot: branch mispredictions as a function of α]

n = 50,000

SLIDE 20

Simple Layouts

[Figure: example tree on the keys 1 to 7 and its memory layouts]

  • Random (4 5 1 7 6 2 3): O((log n)/H(α)) I/Os
  • Inorder (1 2 3 4 5 6 7): O((log n)/H(α) − log B) I/Os
  • BFS (3 1 5 2 4 6 7): O((log n)/H(α) − log B) I/Os
  • DFSr (3 5 6 7 4 1 2): O((α + (1 − α)/B)/H(α) · log n) I/Os
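The BFS and DFSr orders on this slide can be reproduced with two short traversals. The sketch below rests on assumptions: the 7-key example is taken to be the skewed tree for α = 0.4 (which gives root 3, consistent with the layouts shown), DFSr is read as a preorder traversal that visits the right subtree first, and all names (`node_t`, `build_skewed`, `layout_dfs_right`, `layout_bfs`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NULLV (-1)              /* assumed sentinel for "no child" */

typedef struct { int32_t key, left, right; } node_t;

/* Skewed tree on keys lo..hi; the node with key k lives at index k - 1. */
int32_t build_skewed(node_t *t, int32_t lo, int32_t hi, double alpha)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    if (lo > hi) return NULLV;
    int32_t key = lo + (int32_t)(alpha * (double)(hi - lo));  /* floor */
    t[key - 1].key = key;
    t[key - 1].left = build_skewed(t, lo, key - 1, alpha);
    t[key - 1].right = build_skewed(t, key + 1, hi, alpha);
    return key - 1;
}

/* DFSr: emit the node, then its whole right subtree, then its left one.
   Writes keys to out[] starting at pos and returns the next free slot. */
int32_t layout_dfs_right(const node_t *t, int32_t root, int32_t *out, int32_t pos)
{
    if (root == NULLV) return pos;
    out[pos++] = t[root].key;
    pos = layout_dfs_right(t, t[root].right, out, pos);
    return layout_dfs_right(t, t[root].left, out, pos);
}

/* BFS: level by level, left to right. out[] doubles as the queue.
   Returns the number of keys written. */
int32_t layout_bfs(const node_t *t, int32_t root, int32_t *out)
{
    int32_t head = 0, tail = 0;
    if (root == NULLV) return 0;
    out[tail++] = t[root].key;
    while (head < tail) {
        int32_t idx = out[head++] - 1;      /* key k is stored at index k - 1 */
        if (t[idx].left != NULLV) out[tail++] = t[t[idx].left].key;
        if (t[idx].right != NULLV) out[tail++] = t[t[idx].right].key;
    }
    return tail;
}
```

In the DFSr order a right child always sits in the array slot directly after its parent, which is why a step to the right tends to stay within the current block.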

SLIDE 21

Running Time for Simple Layouts

[Plot: running time as a function of α for the Rand, Inord, BFS, DFSr and DFSl layouts]

DFS < Inorder < BFS < Random

DFS achieves the best performance for α ≈ 0.2!

SLIDE 22

Cache Faults for Simple Layouts

[Plot: cache misses as a function of α for the Rand, Inord, BFS, DFSr and DFSl layouts]

DFS ≈ expected cost with left cost = 1 and right cost = 1/B.

SLIDE 23

Blocked Layouts — k-level blocking

[Figure: binary search tree on the keys 1 to 15, rooted at 6]

k = 2

6 2 10 1 3 4 5 7 8 9 12 11 13 14 15

  • lay out the nodes of the first k levels
  • recurse on subtrees
  • a search uses O((log_B n)/H(α)) I/Os
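The two layout steps listed above can be sketched as a recursive routine. This is an illustration under assumptions rather than the authors' implementation: nodes inside a block are emitted in BFS order and the subtrees below a block are visited left to right, an interpretation chosen because it reproduces the k = 2 layout shown on this slide for the α = 0.4 tree. The names `node_t`, `build_skewed` and `layout_blocked` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NULLV (-1)              /* assumed sentinel for "no child" */

typedef struct { int32_t key, left, right; } node_t;

/* Skewed tree on keys lo..hi; the node with key k lives at index k - 1. */
int32_t build_skewed(node_t *t, int32_t lo, int32_t hi, double alpha)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    if (lo > hi) return NULLV;
    int32_t key = lo + (int32_t)(alpha * (double)(hi - lo));  /* floor */
    t[key - 1].key = key;
    t[key - 1].left = build_skewed(t, lo, key - 1, alpha);
    t[key - 1].right = build_skewed(t, key + 1, hi, alpha);
    return key - 1;
}

/* k-level blocking: write the keys of the first k levels below root to
   out[] (BFS order), then recurse on every subtree hanging below them.
   Starts writing at pos and returns the next free slot. */
int32_t layout_blocked(const node_t *t, int32_t root, int k, int32_t *out, int32_t pos)
{
    assert(k >= 1);
    if (root == NULLV) return pos;

    int32_t begin = pos;                    /* current level is out[begin..end) */
    out[pos++] = t[root].key;
    int32_t end = pos;

    for (int level = 1; level < k; level++) {
        for (int32_t i = begin; i < end; i++) {
            const node_t *v = &t[out[i] - 1];   /* key k is at index k - 1 */
            if (v->left != NULLV) out[pos++] = t[v->left].key;
            if (v->right != NULLV) out[pos++] = t[v->right].key;
        }
        begin = end;
        end = pos;
    }

    /* out[begin..end) now holds the last level of this block; the children
       of those nodes are the roots of the subtrees still to be laid out. */
    for (int32_t i = begin; i < end; i++) {
        const node_t *v = &t[out[i] - 1];
        pos = layout_blocked(t, v->left, k, out, pos);
        pos = layout_blocked(t, v->right, k, out, pos);
    }
    return pos;
}
```

With `build_skewed(t, 1, 15, 0.4)` and k = 2, the routine fills `out` with 6 2 10 1 3 4 5 7 8 9 12 11 13 14 15, the same sequence as on the slide.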

SLIDE 24

Blocked Layouts — pqDFSk

[Figure: binary search tree on the keys 1 to 15, each node annotated with its subtree size]

k = 3

6 10 2 12 13 14 15 1 5 4 3 9 8 7 11

  • lay out the k heaviest nodes in order of decreasing size
  • recurse on subtrees in order of decreasing size
  • a search uses O(logBα+1 n) I/Os

SLIDE 25

Blocked Layouts — veb

[Figure: binary search tree on the keys 1 to 15, each node annotated with its subtree size, with the top part marked]

6 10 2 12 13 14 15 1 7 8 9 3 4 5 11

  • top = ⌈√n⌉ heaviest nodes
  • recurse on top and bottom trees in order of decreasing size
  • a search uses O(logBα+1 n) I/Os

SLIDE 26

Running Time for Blocked Layouts

[Plot: running time as a function of α for the pqDFS, bDFS, vEB and DFSr layouts]

vEB achieves the fastest running time for α ≈ 0.25

SLIDE 27

Cache Faults for Blocked Layouts

[Plot: cache misses as a function of α for the pqDFS, bDFS, vEB and DFSr layouts]

vEB achieves the smallest number of cache faults

SLIDE 28

Experimental Summary

[Plots: comparisons, running time, branch mispredictions and cache misses as functions of α; the running-time and cache-miss panels compare the pqDFS, bDFS, vEB and DFSr layouts]

SLIDE 29

Conclusion

Skewed binary search trees beat perfectly balanced binary search trees because the costs of going left and right are different!
