Data Structures — Lecture Notes for CS 3110: Design and Analysis of Algorithms Norbert Zeh Faculty of Computer Science, Dalhousie University, 6050 University Ave, Halifax, NS B3H 2Y5, Canada nzeh@cs.dal.ca July 8, 2014
Contents

1 (a, b)-Trees
  1.1 Definition
  1.2 Representing (a, b)-Trees
  1.3 Searching (a, b)-Trees
    1.3.1 The Find Operation
    1.3.2 Minimum and Maximum
    1.3.3 Predecessor and Successor
    1.3.4 Range Searching
  1.4 Updating (a, b)-Trees
    1.4.1 Insertion
    1.4.2 Deletion
  1.5 Building (a, b)-Trees
  1.6 Concluding Remarks
  1.7 Chapter Notes

2 Data Structuring
  2.1 Orthogonal Line-Segment Intersection
  2.2 Three-Sided Range Searching
  2.3 General Line-Segment Intersection
  2.4 Chapter Notes

3 Dynamic Order Statistics
  3.1 Definition of the Problem
  3.2 Counting Problems
    3.2.1 Counting Line-Segment Intersections
    3.2.2 Orthogonal Range Counting and Dominance Counting
  3.3 The Order Statistics Tree
    3.3.1 Range Queries?
    3.3.2 An Augmented (a, b)-Tree
    3.3.3 Updates
    3.3.4 Select Queries
  3.4 Chapter Notes

4 Priority Search Trees
  4.1 Three-Sided Range Searching and Interval Overlap Queries
  4.2 Priority Search Trees
    4.2.1 Answering Range Queries on an (a, b)-Tree
    4.2.2 Searching by x and y
    4.2.3 Using a Priority Queue for y-Searching
    4.2.4 Combining Search Tree and Priority Queue
  4.3 Answering Queries
  4.4 Updating Priority Search Trees
    4.4.1 Insertions
    4.4.2 Deletions
    4.4.3 Node Splits
    4.4.4 Node Fusions
  4.5 Answering Interval Overlap Queries
  4.6 An Improved Update Bound
    4.6.1 Weight-Balanced (a, b)-Trees
    4.6.2 An Amortized Update Bound
  4.7 Chapter Notes

5 Range Trees
  5.1 Higher-Dimensional Range Searching
  5.2 Priority Search Trees?
  5.3 Two-Dimensional Range Trees
    5.3.1 Description
    5.3.2 Two-Dimensional Range Queries
    5.3.3 Building Two-Dimensional Range Trees
  5.4 Higher-Dimensional Range Trees
  5.5 Chapter Notes
Chapter 1
(a, b)-Trees

In this chapter, we discuss a rather elegant tree structure for representing sorted data: (a, b)-trees. In spirit, it is the same as a red-black tree or an AVL tree, that is, yet another balanced search tree. However, it is not a binary search tree whose height is kept logarithmic by clever rotations; its rebalancing rules are much more transparent, which is why I hope that you will feel more comfortable arguing about this structure than about red-black trees.

The discussion of (a, b)-trees is divided into several sections. In Section 1.1, we define (a, b)-trees and prove a number of useful properties, including that their height is $O(\lg n)$ as long as a and b are constants. Section 1.2 describes how (a, b)-trees are represented using standard programming language constructs. In Section 1.3, we argue that a number of query operations, including searching for an element, can be performed in logarithmic time on (a, b)-trees. Finally, in Section 1.4, we discuss how to insert elements into and delete elements from an (a, b)-tree.

1.1 Definition

Like binary search trees, (a, b)-trees are rooted trees. For a rooted tree T and a node v in T, we denote the subtree rooted at v by $T_v$. The nodes in $T_v$ are the descendants of v; we use Desc(v) to denote this set. The data items stored at the leaves of $T_v$ are denoted by Items(v). The keys of these items are denoted by Keys(v).

Note the fine distinction we make between keys and data items. If there were no such distinction, search trees would be rather useless: if we search for element 14 and simply return it if we find it, we know little more than before. The only additional information we have gained is that element 14 is indeed in our set. So you should think of each item we store in the dictionary as a record, much like in a database; the key of an item is just one of the pieces of information stored in the record.
For example, you may think about implementing Dalhousie's banner system. Then the elements we store in our database (that is, in our search tree) are records storing different pieces of information about each student. When we search for a student's record, we may locate this record using, for example, the student's banner ID as the search key; but the information we are interested in may be the student's transcript, email address, and so on. So, by locating the record, we have gained more information than we had before the search.

Having drawn this distinction between keys and data items, we will nevertheless use our search trees to store numbers; the key of a number is the number itself. This is to keep the discussion simple. However, you should keep in mind that a data item and its key are usually two different things. An (a, b)-tree is now defined as follows:
Definition 1.1 For two integers 2 ≤ a < b, an (a, b)-tree is a rooted tree with the following properties:

(AB1) The leaves are all at the same level (distance from the root).
(AB2) The data items are stored at the leaves, sorted from left to right.
(AB3) Every internal node that is not the root has between a and b children.
(AB4) If the root is not a leaf, it has between 2 and b children.
(AB5) Every node v stores a key key(v). For a leaf, key(v) is the key of the data item associated with this leaf. For an internal node, key(v) = min(Keys(v)).

[Figure 1.1. A (2, 4)-tree. Node keys by level, from the root down: 1; then 1, 34, 43; then 1, 15, 34, 41, 43, 66, 76, 90; then the leaves 1, 7, 11, 15, 23, 34, 36, 37, 41, 42, 43, 44, 51, 66, 71, 76, 77, 78, 81, 90, 92, 97.]

An example of a (2, 4)-tree is shown in Figure 1.1. The first two natural questions we would ask about an (a, b)-tree are what its size is if it stores n data items, and what its height is. As in any search tree, a search operation traverses a path from the root to a leaf, so the height has a significant impact on the running time of a search operation on an (a, b)-tree. The following two lemmas give favourable answers to these questions.

Lemma 1.1 An (a, b)-tree storing n items has height between $\log_b n$ and $\log_a (n/2) + 1$.

Proof. Assuming that the height of the tree is h, we prove below that the number, n, of data items stored in the tree is between $2a^{h-1}$ and $b^h$. Using elementary arithmetic, we obtain the desired bounds on h from this claim:

$$\begin{aligned} n &\le b^h \\ \log_b n &\le h \end{aligned}$$
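As a quick sanity check of Definition 1.1 and the bounds in Lemma 1.1, the following Python sketch rebuilds the tree of Figure 1.1 and tests properties (AB1) to (AB5). The class Node, the helper functions, and the constant FIGURE_1_1 are hypothetical names chosen for illustration only. The grouping of leaves under internal nodes is an assumption reconstructed from the keys shown in the figure; it is not the representation discussed in Section 1.2.

```python
import math


class Node:
    """A tree node with a key and an ordered list of children (hypothetical layout)."""

    def __init__(self, key, children=()):
        self.key = key
        self.children = list(children)


def build(groups):
    """Build a tree from nested lists of leaf keys; an int becomes a leaf."""
    if isinstance(groups, int):
        return Node(groups)
    kids = [build(g) for g in groups]
    # For sorted input, the leftmost child's key is the minimum key below.
    return Node(kids[0].key, kids)


def leaf_depths(v, depth=0):
    """Depths of all leaves in the subtree rooted at v."""
    if not v.children:
        return [depth]
    return [d for c in v.children for d in leaf_depths(c, depth + 1)]


def leaf_keys(v):
    """Keys(v): the keys stored at the leaves below v, from left to right."""
    if not v.children:
        return [v.key]
    return [k for c in v.children for k in leaf_keys(c)]


def height(v):
    """Number of edges on a path from v down to a leaf."""
    return 0 if not v.children else 1 + height(v.children[0])


def is_ab_tree(root, a, b):
    """Check properties (AB1) to (AB5) of Definition 1.1."""
    if not 2 <= a < b:
        return False
    if len(set(leaf_depths(root))) != 1:          # (AB1)
        return False
    keys = leaf_keys(root)
    if keys != sorted(keys):                      # (AB2)
        return False
    stack = [root]
    while stack:
        v = stack.pop()
        if not v.children:
            continue
        lowest = 2 if v is root else a            # (AB4) for the root, (AB3) otherwise
        if not lowest <= len(v.children) <= b:
            return False
        if v.key != min(leaf_keys(v)):            # (AB5)
            return False
        stack.extend(v.children)
    return True


# Assumed leaf grouping for Figure 1.1 (22 items).
FIGURE_1_1 = [
    [[1, 7, 11], [15, 23]],
    [[34, 36, 37], [41, 42]],
    [[43, 44, 51], [66, 71], [76, 77, 78, 81], [90, 92, 97]],
]

root = build(FIGURE_1_1)
n = len(leaf_keys(root))
h = height(root)
print(is_ab_tree(root, 2, 4))                               # True
print(2 * 2 ** (h - 1) <= n <= 4 ** h)                      # True: 8 <= 22 <= 64
print(math.log(n, 4) <= h <= math.log(n / 2, 2) + 1)        # True
```

Here the height is counted in edges from the root to a leaf, which is the convention under which the bounds $2a^{h-1} \le n \le b^h$ hold. The checker recomputes Keys(v) at every node, which is quadratic in the worst case but keeps the sketch close to the wording of the definition.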