binary search trees


  1. Binary Search Trees. These slides are not fully polished: some transitions are rough, some topics are not covered, and they probably contain mistakes. Be aware of this as you use them.

  2. Reflecting on Dictionaries

  3. Cost
     - Worst-case complexity, assuming the dictionary contains n entries
       - Unsorted array: lookup O(n), insert O(1) amortized
       - Array sorted by key: lookup O(log n), insert O(n)
       - Linked list: lookup O(n), insert O(1)
       - Hash table: lookup O(1) average and amortized, insert O(1) average and amortized
     - Hash dictionaries are clearly the best implementation
     - O(1) lookup and insertion are hard to beat!

  4. Cost
     - Hash dictionaries are clearly the best implementation
     - O(1) lookup and insertion are hard to beat! Or are they? Always read the fine print!
     - It's O(1) average
       - we could be (very) unlucky and incur an O(n) cost, e.g., if we use a poor hash function
     - It's O(1) amortized
       - from time to time, we need to resize the table, and then the operation costs O(n)
     - Operations like finding the entry with the minimum key cost O(n)
       - we have to check every entry
     - Using hash dictionaries is too risky or not good enough for applications that require a guaranteed (short) response time
     - But they are great for applications that don't have such constraints

  5. Goal
     - Develop a data structure that has guaranteed O(log n) worst-case complexity for lookup, insert and find_min. Always!
       - O(1) would be great, but we can't get that
     - Costs so far, next to the goal
       - Unsorted array: lookup O(n), insert O(1) amortized, find_min O(n)
       - Array sorted by key: lookup O(log n), insert O(n), find_min O(1)
       - Linked list: lookup O(n), insert O(1), find_min O(n)
       - Hash table: lookup O(1) average and amortized, insert O(1) average and amortized, find_min O(n)
       - Goal: lookup O(log n), insert O(log n), find_min O(log n)
     - Exercise: justify the find_min cost of each of the four existing implementations

  6. Getting Started
     - The only O(log n) so far is lookup in arrays sorted by key (the slide repeats the cost table of the previous slide)
       - That's binary search
       - Let's start there

  7. Searching Sorted Data

  8. Searching for a Number
     - Consider the following sorted array
       - indices: 0 1 2 3 4 5 6 7 8 (9 marks the end of the array)
       - values: -2 0 4 7 12 19 22 42 65
     - When searching for a number x using binary search, we always start by looking at the midpoint, index 4, which holds 12
     - Then, 3 things can happen
       - x = 12 (and we are done)
       - x < 12
       - x > 12

  9. Searching for a Number
     - If x < 12, the next index we look at is necessarily 2, which holds 4
     - If x > 12, the next index we look at is necessarily 7, which holds 42
     - So, after 12, we may look at these elements: 4 (if x < 12) or 42 (if x > 12)

  10. Searching for a Number
     - Assume x < 12, so we look at the element 4
       - if x = 4, we are done
       - if x < 4, we necessarily look at the element 0
       - if x > 4, we necessarily look at the element 7
     - So, after 4, we may look at these elements: 0 (if x < 4) or 7 (if x > 4)

  11. Searching for a Number
     - Assume x < 4, so we look at the element 0
       - if x = 0, we are done
       - if x < 0, we necessarily look at the element -2
     - So, after 0, we may look at this element: -2 (if x < 0)

  12. Searching for a Number
     - We can map out all possible sequences of elements binary search may examine, for any x
     - This is called a decision tree: at every step, it tells us how to decide what to do next
     - We are essentially hoisting the array by its midpoint, its two sides by their midpoints, etc.
     - The decision tree for the example array
       - at 12: go to 4 if x < 12, to 42 if x > 12
       - at 4: go to 0 if x < 4, to 7 if x > 4
       - at 42: go to 22 if x < 42, to 65 if x > 42
       - at 0: go to -2 if x < 0
       - at 22: go to 19 if x < 22
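
The decision tree can be checked against an ordinary binary search. The following is a minimal sketch in C, assuming a half-open search range [lo, hi) and the midpoint lo + (hi - lo) / 2, which reproduces the midpoints used above. The function name `binsearch` and the `probes` output parameter are made up for illustration; they are not part of the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Binary search for x in the sorted array A[0..n).
 * Returns the index of x, or -1 if x is not in A.
 * The elements examined are recorded, in order, in probes
 * (which must have room for them) and counted in *nprobes. */
int binsearch(int x, const int *A, int n, int *probes, int *nprobes) {
  assert(n >= 0 && probes != NULL && nprobes != NULL);
  int lo = 0;
  int hi = n;
  *nprobes = 0;
  while (lo < hi) {
    int mid = lo + (hi - lo) / 2;   /* midpoint of [lo, hi) */
    probes[*nprobes] = A[mid];      /* remember what we looked at */
    (*nprobes)++;
    if (A[mid] == x) return mid;
    if (x < A[mid]) hi = mid;       /* continue in the left half */
    else lo = mid + 1;              /* continue in the right half */
  }
  return -1;
}
```

On the example array, searching for 7 examines 12, 4 and 7, and searching for -2 examines 12, 4, 0 and -2: each search follows one path of the decision tree from the top down.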

  13. Searching for a Number
     - An array provides direct access to all elements
       - This is overkill for binary search
       - At any point, it needs direct access to at most two elements

  14. Searching for a Number
     - We can achieve the same access pattern by pairing up each element with two pointers
       - one to each of the two elements that may be examined next
     - Arrays gave us more power than needed
     - We are losing direct access to arbitrary elements,
       - but we retain access to the elements that matter to binary search
     - In the diagram, 12 points to 4 and 42; 4 points to 0 and 7; 42 points to 22 and 65; 0 points to -2; 22 points to 19

  15. A Type Declaration
     - We can capture this pattern in a type declaration
         typedef struct tree_node tree;
         struct tree_node {
           tree* left;
           int data;
           tree* right;
         };
     - A struct tree_node (or just node) is drawn as a box with three fields: left, data, right

  16. The End of the Line
     - What should the blank left/right fields point to?
       - NULL: each sequence of left/right pointers works like a NULL-terminated list
       - a dummy node: unmanageable (we used dummy nodes to get direct access to the end of a list)
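
With NULL standing for a missing child, the example tree can be built by hand. This is only a sketch in C: the helper names `node` and `example_tree` are hypothetical, and the type declaration is copied from the slides so that the block stands on its own.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tree_node tree;
struct tree_node {
  tree *left;
  int data;
  tree *right;
};

/* Hypothetical helper: allocate a node with the given children.
 * A NULL child means "nothing on that side". */
tree *node(tree *left, int data, tree *right) {
  tree *T = malloc(sizeof(tree));
  assert(T != NULL);
  T->left = left;
  T->data = data;
  T->right = right;
  return T;
}

/* Build the example tree: 12 at the root, 4 and 42 below it, etc. */
tree *example_tree(void) {
  tree *left  = node(node(node(NULL, -2, NULL), 0, NULL), 4, node(NULL, 7, NULL));
  tree *right = node(node(node(NULL, 19, NULL), 22, NULL), 42, node(NULL, 65, NULL));
  return node(left, 12, right);
}
```

Following left pointers from the root visits 12, 4, 0, -2 and then reaches NULL, just like walking a NULL-terminated list.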

  17. Searching
     - Searching for 7
       - 7 < 12: go left
       - 7 > 4: go right
       - 7 = 7: found
     - Cost
       - O(log n)
       - Same steps as binary search

  18. Searching
     - Searching for 5
       - 5 < 12: go left
       - 5 > 4: go right
       - 5 < 7: go left
       - nowhere to go: 5 is not there
     - Cost
       - O(log n)
       - Same steps as binary search
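
The two searches above follow the same loop. Here is a minimal sketch in C for a tree of integers; the function name `tree_contains` is an assumption made for this example, and the type declaration is repeated so the block is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct tree_node tree;
struct tree_node {
  tree *left;
  int data;
  tree *right;
};

/* Returns true if x occurs in T, assuming the nodes of T are ordered. */
bool tree_contains(const tree *T, int x) {
  while (T != NULL) {
    if (x == T->data) return true;   /* found */
    if (x < T->data) T = T->left;    /* go left */
    else T = T->right;               /* go right */
  }
  return false;                      /* nowhere to go: not there */
}
```

The loop takes one step per level of the tree, so the number of comparisons is bounded by the height of the tree.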

  19. Insertion
     - Inserting 5
       - 5 < 12: go left
       - 5 > 4: go right
       - 5 < 7: go left
       - nowhere to go: put it there, as the left child of 7
     - We put 5 where it should have been if it were there
     - Cost
       - O(log n)
       - This is what we were after!
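
Insertion can be sketched the same way as search: walk down as if looking for the value, and attach a new leaf where the walk runs out of tree. The C code below is an illustration only; the name `tree_insert` and the choice to ignore duplicates are assumptions, not taken from the slides.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tree_node tree;
struct tree_node {
  tree *left;
  int data;
  tree *right;
};

/* Insert x into the ordered tree T and return the root of the result.
 * If x is already in T, the tree is returned unchanged (an assumption). */
tree *tree_insert(tree *T, int x) {
  if (T == NULL) {
    /* Nowhere to go: this is where x should have been. */
    tree *leaf = malloc(sizeof(tree));
    assert(leaf != NULL);
    leaf->left = NULL;
    leaf->data = x;
    leaf->right = NULL;
    return leaf;
  }
  if (x < T->data) T->left = tree_insert(T->left, x);
  else if (x > T->data) T->right = tree_insert(T->right, x);
  return T;
}
```

Starting from the empty tree and inserting 12, 4, 42, 0, 7 and then 5 puts 5 to the left of 7, as in the walk-through above.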

  20. Trees

  21. Terminology
     - the root: 12
     - inner nodes: e.g., 4 and 42
     - leaves: e.g., -2 and 19
     - a branch (or subtree): a node together with everything below it
     - a tree: the whole structure

  22. Terminology
     - a node, its left child and its right child
     - the node is their parent

  23. Concrete Tree Diagrams (figure: the example tree with root 12, children 4 and 42, grandchildren 0, 7, 22 and 65, and leaves -2 and 19)

  24. Pictorial Abstraction
     - A generic tree (figure)
     - The empty tree: Empty

  25. What Trees Look Like
     - A tree can be
       - either empty (EMPTY)
       - or a root with a tree on its left and a tree on its right
     - Every tree reduces to these two cases

  26. A Minimal Tree Invariant
     - Just check that the data field is never NULL (this presumes data is a pointer, as it is once nodes store entries)
         bool is_tree(tree* T) {
           // Code for empty tree
           if (T == NULL) return true;
           // Code for non-empty tree
           return is_tree(T->left)
               && T->data != NULL
               && is_tree(T->right);
         }
     - What else should we check?
       - a node does not point to an ancestor
       - a node has at most one parent

  27. The BST Invariant
     - A BST is a valid tree whose nodes are ordered
         bool is_bst(tree* T) {
           return is_tree(T) && is_ordered(T);
         }
     - We will see later how to implement is_ordered

  28. Looking Up Entries

  29. Implementing lookup
         entry bst_lookup(tree* T, key k)
         //@requires is_bst(T);
         //@ensures …
         {
           // Code for empty tree
           if (T == NULL) return NULL;
           // Code for non-empty tree
           if (k == T->data) return T->data;
           if (k < T->data) return bst_lookup(T->left, k);
           //@assert k > T->data;
           return bst_lookup(T->right, k);
         }
     - But < and > work only for integers!
     - We want a dictionary that uses trees
       - to store entries of any type
       - and look them up using keys of any type

  30. A Client Interface
     - The BST dictionary will need a client interface that
       - requests the client to provide types entry and key
       - declares a function to extract the key of an entry
       - declares a function to compare two keys
     - Client Interface
         // typedef ______* entry;
         // typedef ______ key;

         key entry_key(entry e)
         /*@requires e != NULL; @*/ ;

         int key_compare(key k1, key k2)
         /*@ensures -1 <= \result && \result <= 1; @*/ ;
     - We could make it fully generic
       - but let's keep things simple

  31. Implementing lookup
         entry bst_lookup(tree* T, key k)
         //@requires is_bst(T);
         //@ensures \result == NULL || key_compare(entry_key(\result), k) == 0;
         {
           // Code for empty tree
           if (T == NULL) return NULL;
           // Code for non-empty tree
           int cmp = key_compare(k, entry_key(T->data));
           if (cmp == 0) return T->data;
           if (cmp < 0) return bst_lookup(T->left, k);
           //@assert cmp > 0;
           return bst_lookup(T->right, k);
         }
     - We can now even provide a useful postcondition
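
To see the client interface at work, here is a hedged sketch in plain C in which the client chooses word/count pairs as entries and strings as keys. The struct `wcount` and its fields are invented for this example; `entry_key`, `key_compare` and `bst_lookup` follow the declarations above, with the C0 contracts turned into an assert or left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Client side (hypothetical): entries are word/count pairs, keys are words. */
struct wcount {
  const char *word;
  int count;
};
typedef struct wcount *entry;
typedef const char *key;

key entry_key(entry e) {
  assert(e != NULL);            /* stands in for //@requires e != NULL; */
  return e->word;
}

int key_compare(key k1, key k2) {
  int c = strcmp(k1, k2);
  return (c > 0) - (c < 0);     /* normalize to -1, 0 or 1 */
}

/* Library side: nodes now store entries instead of ints. */
typedef struct tree_node tree;
struct tree_node {
  tree *left;
  entry data;
  tree *right;
};

entry bst_lookup(tree *T, key k) {
  if (T == NULL) return NULL;
  int cmp = key_compare(k, entry_key(T->data));
  if (cmp == 0) return T->data;
  if (cmp < 0) return bst_lookup(T->left, k);
  return bst_lookup(T->right, k);
}
```

The lookup code never inspects an entry directly. Swapping in a different entry type only requires the client to supply new definitions of `entry`, `key`, `entry_key` and `key_compare`.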

  32. Checking Ordering

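  33. Ordered Trees – I
     - Check each node y against its left child x and its right child z
         bool is_ordered(tree* T)
         //@requires is_tree(T);
         {
           // Code for empty tree
           if (T == NULL) return true;
           // Code for non-empty tree
           return (T->left == NULL || T->left->data < T->data)
               && (T->right == NULL || T->data < T->right->data)
               && is_ordered(T->left)
               && is_ordered(T->right);
         }
     - Example tree: 42 with children 12 and 49; 12 with children 0 and 88; 49 with children 6 and 99
       - every node passes this check, yet 88 sits to the left of 42 and 6 sits to its right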
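
The following C sketch runs this local check on the example tree and then searches it. It assumes the flattened diagram reads as 42 at the root with children 12 and 49, whose children are 0 and 88, and 6 and 99, respectively. `tree_contains` is a hypothetical integer search written for this example, and `data` is an int here as in the slide's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct tree_node tree;
struct tree_node {
  tree *left;
  int data;
  tree *right;
};

/* The local check: each node is compared with its two children only. */
bool is_ordered(const tree *T) {
  if (T == NULL) return true;
  return (T->left == NULL || T->left->data < T->data)
      && (T->right == NULL || T->data < T->right->data)
      && is_ordered(T->left)
      && is_ordered(T->right);
}

/* Search that trusts the ordering: go left if smaller, right if larger. */
bool tree_contains(const tree *T, int x) {
  while (T != NULL) {
    if (x == T->data) return true;
    if (x < T->data) T = T->left;
    else T = T->right;
  }
  return false;
}
```

On this tree the local check succeeds, but searching for 88 goes right at 42 and never reaches it, and searching for 6 goes left at 42 with the same outcome. Comparing a node with its children alone is therefore not enough to guarantee that search works.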
