The Dictionary ADT The dictionary ADT models a searchable collection - - PowerPoint PPT Presentation


The Dictionary ADT

  • The dictionary ADT models a searchable collection of key-element items
  • The main operations of a dictionary are searching, inserting, and deleting items
  • Multiple items with the same key are allowed
  • Applications:

– address book
– credit card authorization
– mapping host names (e.g., cs16.net) to internet addresses (e.g., 128.148.34.101)
– English dictionary

Dictionary ADT methods:

  • findElement(k): if the dictionary has an item with key k, returns its element, else returns the special element NO_SUCH_KEY
  • insertItem(k, o): inserts item (k, o) into the dictionary
  • removeElement(k): if the dictionary has an item with key k, removes it from the dictionary and returns its element, else returns the special element NO_SUCH_KEY
  • size(), isEmpty()
  • keys(), elements()
  • findAllElements(k), removeAllElements(k)

Implementing a Dictionary with an Unordered Sequence

  • searching and removing take O(n) time
  • inserting takes O(1) time
  • applications to log files (frequent insertions, rare searches and removals)

Implementing a Dictionary with an Ordered Sequence

  • searching takes O(log n) time (binary search)
  • inserting and removing take O(n) time
  • application to look-up tables (frequent searches, rare insertions and removals)


Binary Search

  • narrow down the search range in stages
  • “high-low” game
  • Example: findElement(7)

[Figure: four stages of findElement(7) on the sorted array 1 3 4 5 7 8 9 11 14 16 18 19, marking the low (l), mid (m), and high (h) positions at each stage until l = m = h.]

Pseudocode for Binary Search

Algorithm BinarySearch(S, k, low, high)
  if low > high then
    return NO_SUCH_KEY
  else
    mid ← ⌊(low + high) / 2⌋
    if k = key(mid) then
      return element(mid)
    else if k < key(mid) then
      return BinarySearch(S, k, low, mid-1)
    else
      return BinarySearch(S, k, mid+1, high)

[Figure: three stages of a binary search on the sorted array 2 4 5 7 8 9 12 14 17 19 22 25 27 28 33 37, marking low, mid, and high at each stage; the highlighted mid keys are 14, 25, and 19.]
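The recursive binary search can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes the sequence S is a Python list of (key, element) pairs sorted by key, and the names binary_search and NO_SUCH_KEY (a sentinel object) are chosen here for illustration.

```python
NO_SUCH_KEY = object()  # sentinel standing in for the special element (assumption)

def binary_search(S, k, low, high):
    """Search S[low..high], a list of (key, element) pairs sorted by key."""
    if low > high:
        return NO_SUCH_KEY              # empty range: the key is not present
    mid = (low + high) // 2             # integer midpoint of the current range
    key, element = S[mid]
    if k == key:
        return element
    elif k < key:
        return binary_search(S, k, low, mid - 1)    # continue in the left half
    else:
        return binary_search(S, k, mid + 1, high)   # continue in the right half

# The array from the findElement(7) example, with made-up elements.
keys = [1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19]
S = [(key, f"elem-{key}") for key in keys]
found = binary_search(S, 7, 0, len(S) - 1)
missing = binary_search(S, 6, 0, len(S) - 1)
```

Each call discards half of the remaining range, which is where the logarithmic number of steps comes from.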

Running Time of Binary Search

  • The range of candidate items to be searched is halved after each comparison
  • In the array-based implementation, access by rank takes O(1) time, thus binary search runs in O(log n) time

Running Time of Binary Search

  • binary search runs in O(log n) time for findElement(k)
  • If we need to find all elements with a given key (findAllElements(k)), it runs in O(log n + s) time, where s is the number of elements in the iterator returned.

– Simply do a binary search to find an element with key equal to k. Then step back through the array until you reach the first element equal to k. Finally, step forward through the array, adding each element to the iterator, until you reach the first element that is not equal to k. This takes O(log n) time for the search, then at most s steps to walk back to the beginning of the run of k's and s steps to return all of the elements equal to k. Therefore we have a solution running in at most O(log n + s) time.
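The search-then-scan procedure for findAllElements(k) on a sorted array might look like the sketch below. It assumes S is a Python list of (key, element) pairs sorted by key and returns a plain list in place of an iterator; the function name is illustrative.

```python
def find_all_elements(S, k):
    """Return every element whose key is k from S, a list of (key, element) pairs sorted by key."""
    low, high = 0, len(S) - 1
    hit = -1
    while low <= high:                  # O(log n): locate any one item with key k
        mid = (low + high) // 2
        if S[mid][0] == k:
            hit = mid
            break
        elif k < S[mid][0]:
            high = mid - 1
        else:
            low = mid + 1
    if hit < 0:
        return []                       # no item with key k
    first = hit
    while first > 0 and S[first - 1][0] == k:   # at most s steps back to the start of the run
        first -= 1
    result = []
    i = first
    while i < len(S) and S[i][0] == k:          # s steps forward, collecting the elements
        result.append(S[i][1])
        i += 1
    return result

S = [(2, "a"), (4, "b"), (7, "c"), (7, "d"), (7, "e"), (9, "f")]
```

Here find_all_elements(S, 7) collects the three elements stored under key 7 in array order.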

New Insertion Sort

What is the running time of insertion sort on a sequence implemented with an array? What if we use a binary search to find each insertion position?

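Binary Search Tree

  • Searching
  • Cost of Searching
  • Insertion
  • Deletion

[Figure: a binary search tree with keys 6, 9, 2, 4, 1, 8; edges along a search path are labeled with the comparison outcomes <, >, and =.]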

Binary Search Trees

  • A binary search tree is a binary tree T such that

– each internal node v stores an item (k, e) of a dictionary.
– keys stored at nodes in the left subtree of v are less than or equal to k.
– keys stored at nodes in the right subtree of v are greater than or equal to k.
– external nodes do not hold elements but serve as placeholders.

[Figure: a binary search tree storing the names Gregor, Fabio, Nicole, Bob, Frank.]


[Figure: a binary search tree containing the keys 10, 3, 1, 17, 8, 5, 9, 15, 20.]

Question: How can we traverse the tree so that we visit the elements in increasing key order?

Operations

  • Searching: findElement(k)
  • Inserting: insertItem(k, o)
  • Removing: removeElement(k)

Search

  • To search for a key k, we trace a downward path starting at the root
  • The next node visited depends on the outcome of the comparison of k with the key of the current node
  • If we reach a leaf, the key is not found and we return NO_SUCH_KEY
  • Example: findElement(4)

Algorithm findElement(k, v)
  if T.isExternal(v)
    return NO_SUCH_KEY
  if k < key(v)
    return findElement(k, T.leftChild(v))
  else if k = key(v)
    return element(v)
  else { k > key(v) }
    return findElement(k, T.rightChild(v))

[Figure: a binary search tree with keys 6, 9, 2, 4, 1, 8; the search path for 4 is labeled with the comparison outcomes <, >, and =.]
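A minimal Python sketch of findElement on a binary search tree with explicit external nodes. The Node class, the internal helper, and the NO_SUCH_KEY sentinel are assumptions made for illustration; the tree built at the end uses the keys 6, 9, 2, 4, 1, 8 from the example, in one arrangement consistent with the search-tree property.

```python
NO_SUCH_KEY = object()  # sentinel standing in for the special element (assumption)

class Node:
    """A tree node; an external (placeholder) node has no key and no children."""
    def __init__(self, key=None, element=None, left=None, right=None):
        self.key, self.element = key, element
        self.left, self.right = left, right

    def is_external(self):
        return self.left is None and self.right is None

def internal(key, element, left=None, right=None):
    """Build an internal node, filling missing children with external placeholders."""
    return Node(key, element, left or Node(), right or Node())

def find_element(k, v):
    if v.is_external():
        return NO_SUCH_KEY                  # fell off the tree: key not found
    if k < v.key:
        return find_element(k, v.left)
    elif k == v.key:
        return v.element
    else:                                   # k > v.key
        return find_element(k, v.right)

root = internal(6, "six",
                internal(2, "two", internal(1, "one"), internal(4, "four")),
                internal(9, "nine", internal(8, "eight")))
```

Searching for 4 visits 6, then 2, then 4; searching for 5 ends at the external right child of 4.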

Search Example I

  • A successful search traverses a path starting at the root and ending at an internal node
  • How about findAllElements(k)?

[Figure: successful findElement(76); the search path makes the comparisons 76 > 44, 76 < 88, 76 > 65, 76 < 82.]


Search Example I

Algorithm findAllElements(k, v, c):
  Input: The search key k, a node v of the binary search tree, and a container c
  Output: An iterator containing the found elements

  if v is an external node then
    return c.elements()
  if k = key(v) then
    c.addElement(v)
    return findAllElements(k, T.rightChild(v), c)
  else if k < key(v) then
    return findAllElements(k, T.leftChild(v), c)
  else {we know k > key(v)}
    return findAllElements(k, T.rightChild(v), c)

Note that after finding k, if it occurs again, it will be at the leftmost internal node of the right subtree. This holds when items with equal keys are always inserted into the right subtree.
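A minimal Python sketch of findAllElements on a binary search tree, assuming items with equal keys are always placed in the right subtree. The Node class and internal helper are illustrative, and a Python list stands in for the container c.

```python
class Node:
    """A tree node; an external (placeholder) node has no key and no children."""
    def __init__(self, key=None, element=None, left=None, right=None):
        self.key, self.element = key, element
        self.left, self.right = left, right

    def is_external(self):
        return self.left is None and self.right is None

def internal(key, element, left=None, right=None):
    """Build an internal node, filling missing children with external placeholders."""
    return Node(key, element, left or Node(), right or Node())

def find_all_elements(k, v, c):
    """Append to c every element with key k in the subtree rooted at v, then return c."""
    if v.is_external():
        return c
    if k == v.key:
        c.append(v.element)
        return find_all_elements(k, v.right, c)   # further copies can only be to the right
    elif k < v.key:
        return find_all_elements(k, v.left, c)
    else:                                          # k > v.key
        return find_all_elements(k, v.right, c)

# The second item with key 5 sits at the leftmost internal node of the root's right subtree.
root = internal(5, "first",
                internal(3, "x"),
                internal(8, "y", internal(5, "second", None, internal(7, "z"))))
```

The search follows a single root-to-external path of length at most h and stops at up to s matching nodes along it.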

Search Example II

  • An unsuccessful search traverses a path starting at the root and ending at an external node

[Figure: unsuccessful findElement(25); the search path makes the comparisons 25 < 44, 25 > 17, 25 < 32, 25 < 28 and then reaches a leaf node.]

Cost of Search: Worst Case

[Figure: a worst-case binary search tree storing the keys a, account, Africa, apple, arc.]

Average # of comparisons in the worst case (successful search):

  • The path to node i has length i; to get there we do 2i+1 comparisons
  • Avg cost = $\frac{1}{n}\sum_{i=0}^{n-1}(2i+1) = \frac{n^2}{n} = n$

Cost of Search: Worst Case

[Figure: a worst-case binary search tree storing the keys a, account, Africa, apple, arc.]

Average # of comparisons in the worst case (unsuccessful search):

  • An unsuccessful search takes 2n comparisons for n internal nodes


Cost of Search: Best Case

[Figure: three balanced trees with 7, 5, and 9 nodes, each with its nodes numbered from 1.]

  • Leaves are on the same level or on an adjacent level.
  • Length of path from root to node i = ⌊log i⌋
  • For a successful search, we do 2 comparisons at each node along the path plus one at the end.
  • Comparisons to node i: 2⌊log i⌋ + 1
  • Average # of comparisons in the best case: $\frac{1}{n}\sum_{i=1}^{n}\left(2\lfloor \log i \rfloor + 1\right) = O(\log n)$

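Cost of Search: Best Case

[Figure: three balanced trees with 7, 5, and 9 nodes, each with its nodes numbered from 1.]

  • Leaves are on the same level or on an adjacent level.
  • Length of path from root to node i = ⌊log i⌋
  • For a failed search, we do 2 comparisons at each node along the path plus two at the end. Only paths to external nodes count.
  • Average # of comparisons for a failed search: $\text{FAILED} = \frac{2E}{n+1} = \frac{2(I + 2n)}{n+1} = O(\log n)$, where I and E are the internal and external path lengths of the tree (E = I + 2n) and n+1 is the number of external nodes.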

Insertion I

  • To perform insertItem(k, e), let w be the node returned by TreeSearch(k, T.root())
  • If w is external, we know that k is not stored in T. We call expandExternal(w) on T and store (k, e) in w

expandExternal(v):

Transform v from an external node into an internal node by creating two new children

expandExternal(v):  { new1 and new2 are the new nodes }
  if isExternal(v)
    v.left ← new1
    v.right ← new2
    size ← size + 2

[Figure: a tree with nodes A, B, C, D, E before and after expandExternal, which adds the children new1 and new2.]
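A minimal Python sketch of expandExternal. The Node and Tree classes are assumptions made for illustration; size counts all nodes, internal and external, so an empty tree (a single external node) has size 1.

```python
class Node:
    def __init__(self, parent=None):
        self.item = None          # (key, element) once the node is internal
        self.parent = parent
        self.left = None
        self.right = None

    def is_external(self):
        return self.left is None and self.right is None

class Tree:
    def __init__(self):
        self.root = Node()        # an empty tree is a single external node
        self.size = 1             # counts all nodes, internal and external

    def expand_external(self, v):
        """Turn external node v into an internal node with two new external children."""
        if v.is_external():
            v.left = Node(parent=v)
            v.right = Node(parent=v)
            self.size += 2

t = Tree()
t.expand_external(t.root)         # the root becomes internal
t.root.item = (10, "ten")         # and can now store an item
```

Calling expand_external on a node that is already internal leaves the tree unchanged.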


Insertion II

  • If w is internal, we know another item with key k is stored at w. We call the algorithm recursively starting at T.rightChild(w) or T.leftChild(w)
  • The idea is to store the new item in an external node which either precedes or follows the items with the same key in an inorder traversal.
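The two insertion cases can be sketched together in Python. This sketch assumes the search always continues at the right child when the key is already present (the left child would work equally well); the Node class and function names are illustrative.

```python
class Node:
    def __init__(self):
        self.key = None
        self.element = None
        self.left = None
        self.right = None

    def is_external(self):
        return self.left is None and self.right is None

def tree_search(k, v):
    """Return the internal node holding k, or the external node where k belongs."""
    if v.is_external() or k == v.key:
        return v
    return tree_search(k, v.left if k < v.key else v.right)

def insert_item(root, k, e):
    w = tree_search(k, root)
    while not w.is_external():          # key already present: search again below w
        w = tree_search(k, w.right)
    w.key, w.element = k, e             # store (k, e) in w ...
    w.left, w.right = Node(), Node()    # ... and expand w into an internal node

def inorder(v, out):
    """Append the (key, element) items of the subtree rooted at v in inorder."""
    if not v.is_external():
        inorder(v.left, out)
        out.append((v.key, v.element))
        inorder(v.right, out)
    return out

root = Node()                           # an empty tree is a single external node
for k, e in [(5, "a"), (8, "b"), (3, "c"), (5, "d"), (5, "e")]:
    insert_item(root, k, e)
items = inorder(root, [])
```

With this choice, items sharing a key appear next to each other in the inorder listing, in the order they were inserted.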

Construct a Tree

What would be the result of constructing a tree from repeated insertions of the following sequences?

a. 5, 8, 3, 7, 1, 9, 2, 4, 6
b. 1, 2, 3, 4, 5, 6, 7, 8, 9
c. 5, 4, 6, 3, 7, 2, 8, 1, 9

When do you think trees work best?

Construct a Tree 2

What happens with repeated numbers? 5, 8, 3, 7, 1, 5, 9, 5, 2, 4, 5

What do you get from an inorder traversal of this tree?

Deletion

  • No children: simply remove
  • With one child: redirect (as indicated)
  • With two children: replace k with the rightmost node in its left subtree (or, vice versa, the leftmost node in its right subtree), that is, the node that either precedes or follows it in an inorder traversal.

[Figure: three example trees, one per case, with keys 10, 5, 15, 18, 3, 8; then 10, 5, 18, 3, 8; then 5, 18, 3, 8; each marks the node k to be deleted.]


Deletion I

  • To perform operation removeElement(k), we search for key k
  • Assume key k is in the tree, and let v be the node storing k
  • If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w)
  • Example: remove 4

[Figure: the tree with keys 6, 9, 2, 4, 1, 8, 5 before the removal, with v the node storing 4 and w its leaf child, and the tree with keys 6, 9, 2, 5, 1, 8 after it.]

removeAboveExternal(v):

removeAboveExternal(v):
  if isExternal(v) {
    p ← parent(v)
    s ← sibling(v)
    if isRoot(p)
      s.parent ← null and root ← s
    else {
      g ← parent(p)
      if (p is leftChild(g))
        g.left ← s
      else
        g.right ← s
      s.parent ← g
    }
    size ← size - 2
  }

[Figure: removeAboveExternal applied to a tree with nodes A, B, C, D, E, F, G, leaving the nodes B, D, A, C, G and then B, A, G; v marks the external node being removed.]
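A minimal Python sketch of removeAboveExternal with parent pointers. The Node and Tree classes and the attach and ext helpers are assumptions made for illustration, and the size bookkeeping is left out to keep the sketch short. The root test is made on p, the parent of v, since an external node v with a parent cannot itself be the root.

```python
class Node:
    def __init__(self, key=None, parent=None):
        self.key = key
        self.parent = parent
        self.left = None
        self.right = None

    def is_external(self):
        return self.left is None and self.right is None

class Tree:
    def __init__(self, root):
        self.root = root

    def remove_above_external(self, v):
        """Remove external node v and its parent p; v's sibling s takes p's place."""
        if not v.is_external() or v.parent is None:
            return                          # nothing to do for internal nodes or a lone root
        p = v.parent
        s = p.right if p.left is v else p.left
        if p is self.root:
            s.parent = None
            self.root = s
        else:
            g = p.parent
            if g.left is p:
                g.left = s
            else:
                g.right = s
            s.parent = g

def attach(parent, left, right):
    """Hypothetical helper: wire up both children of parent and return it."""
    parent.left, parent.right = left, right
    left.parent = right.parent = parent
    return parent

def ext():
    return Node()                           # a fresh external node

# Part of the "remove 4" example: w is the external left child of the node storing 4.
five = attach(Node(5), ext(), ext())
w = ext()
four = attach(Node(4), w, five)
two = attach(Node(2), ext(), four)
six = attach(Node(6), two, ext())
t = Tree(six)
t.remove_above_external(w)
```

After the call, the node storing 5 has taken the place of the node storing 4 as the right child of 2.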


Deletion II

  • We consider the case where the key k to be removed is stored at a node v whose children are both internal

– we find the internal node w that follows v in an inorder traversal
– we copy key(w) into node v
– we remove node w and its left child z (which must be a leaf) by means of operation removeAboveExternal(z)

  • Example: remove 3

[Figure: the tree with keys 1, 3, 2, 8, 6, 9, 5 before the removal, with v the node storing 3, w the node that follows it in inorder, and z the left child of w; and the tree with keys 1, 5, 2, 8, 6, 9 after it, with v now storing 5.]
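Both deletion cases can be sketched in one Python class. This is a sketch under stated assumptions, not a definitive implementation: the BST and Node classes, the NO_SUCH_KEY sentinel, and the choice to insert equal keys to the right are all illustrative. The usage at the end rebuilds a tree with the keys of the "remove 3" example.

```python
NO_SUCH_KEY = object()  # sentinel standing in for the special element (assumption)

class Node:
    def __init__(self, parent=None):
        self.key = self.element = None
        self.parent = parent
        self.left = self.right = None

    def is_external(self):
        return self.left is None and self.right is None

class BST:
    def __init__(self):
        self.root = Node()                    # empty tree: one external node

    def _search(self, k, v):
        while not v.is_external() and k != v.key:
            v = v.left if k < v.key else v.right
        return v

    def insert_item(self, k, e):
        w = self._search(k, self.root)
        while not w.is_external():            # equal keys go to the right (assumption)
            w = self._search(k, w.right)
        w.key, w.element = k, e
        w.left, w.right = Node(w), Node(w)

    def _remove_above_external(self, z):
        p = z.parent
        s = p.right if p.left is z else p.left
        g = p.parent
        s.parent = g
        if g is None:
            self.root = s
        elif g.left is p:
            g.left = s
        else:
            g.right = s

    def remove_element(self, k):
        v = self._search(k, self.root)
        if v.is_external():
            return NO_SUCH_KEY
        removed = v.element
        if v.left.is_external():              # v has a leaf child: remove v and that leaf
            self._remove_above_external(v.left)
        elif v.right.is_external():
            self._remove_above_external(v.right)
        else:                                 # both children internal
            w = v.right                       # w = node following v in inorder:
            while not w.left.is_external():   # leftmost internal node of the right subtree
                w = w.left
            v.key, v.element = w.key, w.element
            self._remove_above_external(w.left)
        return removed

def inorder_keys(v):
    if v.is_external():
        return []
    return inorder_keys(v.left) + [v.key] + inorder_keys(v.right)

t = BST()
for k in [1, 3, 2, 8, 6, 9, 5]:
    t.insert_item(k, str(k))
removed = t.remove_element(3)
```

Removing 3 copies the item with key 5 into v and then detaches the old node for 5 together with its external left child.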

Practice, practice, practice…

  • a. Delete the 3 from the tree you got from sequence (a): 5, 8, 3, 7, 1, 9, 2, 4, 6
  • b. Now delete node 5.

Summary

  • Consider a dictionary with n items implemented by means of a binary search tree of height h

– the space used is O(n)
– methods findElement, insertItem and removeElement take O(h) time

  • The height h is O(n) in the worst case and O(log n) in the best case
  • Cost of Inserting and Deleting = Cost of Search
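The gap between the worst-case and best-case height can be observed with a small Python sketch. It assumes a simplified representation chosen only for this illustration: each internal node is a list [key, left, right] and None plays the role of an external node. Height is counted as the number of internal nodes on the longest path from the root.

```python
def insert(node, k):
    """Insert k into a BST of nested [key, left, right] lists; None is an external node."""
    if node is None:
        return [k, None, None]
    if k < node[0]:
        node[1] = insert(node[1], k)
    else:
        node[2] = insert(node[2], k)
    return node

def height(node):
    """Number of internal nodes on the longest path from the root to an external node."""
    if node is None:
        return 0
    return 1 + max(height(node[1]), height(node[2]))

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

sorted_height = height(build([1, 2, 3, 4, 5, 6, 7, 8, 9]))   # keys arrive in sorted order
mixed_height = height(build([5, 8, 3, 7, 1, 9, 2, 4, 6]))    # keys arrive in mixed order
```

Inserting nine keys in sorted order produces a chain of height 9, while the mixed order produces a tree of height 4.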


Performance of a dictionary implementation with a binary search tree

Tree T with height h for n key-element items

  • uses O(n) space
  • size, isEmpty : O(1)
  • findElement, insertItem, removeElement : O(h)
  • findAllElements, removeAllElements : O(h + s)

– s = size of the iterators returned

Conclusion

  • To achieve good running time, we need to keep the tree balanced, i.e., with O(log n) height.
  • Various balancing schemes will be explored next.