Hashing: Performance of Set (and Map). OSU CSE, 14 September 2020


  1. Hashing (OSU CSE, 14 September 2020)

  2. Performance of Set (and Map)
     • How long does it take to execute each of the methods of Set2 (similarly Map2), which use a Queue as the data representation?
     • Assume that each call to a Queue kernel method executes in constant time, i.e., that the duration of a call is independent of the values of all the arguments, including the receiver

  3. Standard Methods
     • For almost every type in the OSU CSE components, including Queue, each of the three Standard methods (newInstance, clear, and transferFrom) takes constant time to execute

  4. Queue Kernel Methods
       Method (Op)    Execution Time (T_Op)
       enqueue(x)     T_enqueue = c_1
       dequeue(x)     T_dequeue = c_2
       length         T_length = c_3

  5. Set Kernel Methods
       Method         Execution Time
       add(x)
       remove(x)
       contains(x)
       size

  6. Set Kernel Methods: Look at the method body in Set2, and figure out how much work it does...

  7. Set Kernel Methods: add(x) takes c_4. It simply enqueues its argument; plus, there is some constant-time overhead just to make the call to add.

  8. Set Kernel Methods: add(x) takes c_4. Look at the method body in Set2, and figure out how much work it does...

  9. Set Kernel Methods: add(x) takes c_4; remove(x) takes c_5·|this| + c_6. It has to search through a Queue containing all the Set’s elements.

  10. Set Kernel Methods: add(x) takes c_4; remove(x) takes c_5·|this| + c_6. Raising the question: a worst case, an average case, ...?

  11. Set Kernel Methods
       Method         Execution Time
       add(x)         c_4
       remove(x)      c_5·|this| + c_6
       contains(x)    c_7·|this| + c_8
       size           c_9

  12. Linear Search
      • Linear search is the algorithm that examines, potentially, every item in a collection (e.g., code like moveToFront in Set2 and Map2) until it finds what it’s looking for
        – The name reflects the fact that its execution time is a linear function of the size of the collection (e.g., c_7·|this| + c_8)
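The linear-search behavior described above can be sketched in a few lines of Java. This is only an illustration under assumptions: it uses a plain java.util.List instead of the OSU Queue component, and the names LinearSearchSketch and itemsExamined are hypothetical. It counts how many items are examined, which is the quantity that grows linearly with the size of the collection.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper illustrating linear search; not part of the OSU CSE components. */
final class LinearSearchSketch {

    /**
     * Returns how many items are examined before target is found,
     * or items.size() if target is absent (every item was examined).
     */
    static <T> int itemsExamined(List<T> items, T target) {
        int examined = 0;
        for (T item : items) {
            examined++;
            if (item.equals(target)) {
                break; // found it; stop examining further items
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(13, 5, -2, 8);
        System.out.println(itemsExamined(items, 13)); // target is the first item
        System.out.println(itemsExamined(items, 99)); // target is absent
    }
}
```

When the target is absent, the count equals the size of the list, which is the worst case for this kind of search.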

  13. Some Common Execution Times [plot of T(n) against n] T(n): execution (“running”) time of some code as a function of the “size” of its input.

  14. Some Common Execution Times [plot of T(n) against n] n: “size” of the input for some code.

  15. Some Common Execution Times [plot of T(n) against n] Constant time, e.g., T(n) = c

  16. Some Common Execution Times [plot of T(n) against n] Log time, e.g., T(n) = a·log(n) + b

  17. Some Common Execution Times [plot of T(n) against n] Linear time, e.g., T(n) = a·n + b

  18. Some Common Execution Times [plot of T(n) against n] n log n time, e.g., T(n) = a·n·log(n) + b

  19. Some Common Execution Times [plot of T(n) against n] Quadratic time, e.g., T(n) = a·n^2 + b·n + c

  20. Some Common Execution Times [plot of T(n) against n] Exponential time, e.g., T(n) = 2^n
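To get a feel for how differently these running-time shapes grow, one option is to tabulate them for a few input sizes. The sketch below is an illustration only: the class name GrowthSketch, the chosen sizes, and the use of base-2 logarithms with all constants set to 1 are assumptions, not part of the slides.

```java
/** Hypothetical illustration of how common running-time shapes grow with n. */
final class GrowthSketch {

    /** Base-2 logarithm; choosing another base only changes the constant factor. */
    static double log2(double n) {
        return Math.log(n) / Math.log(2.0);
    }

    public static void main(String[] args) {
        long[] sizes = {10, 100, 1000};
        System.out.println("n, log2(n), n*log2(n), n^2");
        for (long n : sizes) {
            System.out.printf("%d, %.1f, %.1f, %d%n", n, log2(n), n * log2(n), n * n);
        }
        // 2^n grows so fast that it is shown only for two small values of n.
        System.out.println("2^10 = " + (1L << 10) + ", 2^40 = " + (1L << 40));
    }
}
```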

  21. Faster Execution?
      • Option 1 (preferred): Reduce the order of magnitude of the running time
        – Example: Change from quadratic time to linear time, or linear time to log time
      • Option 2 (better than nothing): Reduce the constant factor that multiplies the dominant term of the running time
        – Example: Change from a larger slope for a linear function to a smaller slope

  22. Faster Execution [plot of T(n) against n] Reduce by order of magnitude: T(n) = a·n + b

  23. Faster Execution [plot of T(n) against n] Reduce by order of magnitude: T(n) = c·log(n) + d

  24. Faster Execution [plot of T(n) against n] Reduce by a constant factor: T(n) = a·n + b

  25. Faster Execution [plot of T(n) against n] Reduce by a constant factor: T(n) = (a/10)·n + b

  26. Example: Faster Linear Search
      • Goal: Reduce the constant factor in the execution time of linear search, i.e., reduce it from a·n + b to something like (a/10)·n + b
      • Approach: Reduce the number of items that need to be examined to find the one you’re looking for, because, e.g.: (a/10)·n + b = a·(n/10) + b

  27. Hashing: The Intuition
      • Instead of searching through all the items, store the items in many smaller buckets and search through only one bucket that
        1. Can be quickly identified, and
        2. Must contain the item you’re looking for

  30. How To Identify The Bucket
      • Suppose you need to search through n items of type T, and you decide to organize the items into m buckets
      • Given x of type T, compute from it some integer value h(x)
      • Look in bucket number h(x) mod m

  31. How To Identify The Bucket: The buckets have indices 0, 1, ..., m-1 in an array of buckets called a hashtable.

  32. How To Identify The Bucket: The function h that maps each value of type T to an integer is called the hash function.

  33. How To Identify The Bucket: By “reducing” the hash function result modulo m, you are guaranteed to get the index of some bucket.

  34. How To Identify The Bucket: The insight for hashing: if you put the item in bucket h(x) mod m when you store it, then that is the only place you need to look for it when searching.
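One practical detail when turning "h(x) mod m" into Java: the % operator can return a negative result when its left operand is negative, so it does not always produce a valid bucket index. A minimal sketch, assuming the hash value h(x) has already been computed as an int and that m is positive, might use Math.floorMod instead. The names BucketIndexSketch and bucketIndex are hypothetical.

```java
/** Hypothetical helper; the name bucketIndex does not come from the slides. */
final class BucketIndexSketch {

    /** Returns h(x) mod m as an index in 0, 1, ..., m-1, given hashValue = h(x) and m > 0. */
    static int bucketIndex(int hashValue, int m) {
        // Math.floorMod is never negative for positive m, unlike the % operator.
        return Math.floorMod(hashValue, m);
    }

    public static void main(String[] args) {
        int m = 3;
        for (int hashValue : new int[] {13, 5, -2}) {
            System.out.println(hashValue + " -> bucket " + bucketIndex(hashValue, m)
                    + " (Java % would give " + (hashValue % m) + ")");
        }
    }
}
```

For a hash value of -2 and m = 3, the mathematical result is 1, whereas -2 % 3 evaluates to -2 in Java and could not be used as an array index.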

  35. Set Representation With Hashing
      • Suppose the data representation for a new Set implementation, say Set4, uses an instance variable like this:
          /**
           * Buckets for hashing.
           */
          private Set<T>[] hashTable;

  36. Set Representation With Hashing [figure: an abstract Set and its data representation using several “little Sets”]

  37. Set Representation With Hashing: Can we really do this: use Sets in the representation of a Set? Why is it not circular?
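To make the bucket idea concrete, here is a rough sketch of a set built from several small buckets. It is not the OSU Set4 implementation: the class name BucketSetSketch is hypothetical, each "little set" is represented by a java.util.ArrayList searched linearly, and x.hashCode() is assumed to play the role of h(x). Because the buckets use a different and simpler representation than the set being built, this particular sketch involves no circularity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a set stored in m buckets; not the OSU Set4 implementation. */
final class BucketSetSketch<T> {

    /** Buckets for hashing; each bucket is a small list searched linearly. */
    private final List<List<T>> hashTable;

    /** Creates an empty set with m buckets (m > 0). */
    BucketSetSketch(int m) {
        this.hashTable = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            this.hashTable.add(new ArrayList<T>());
        }
    }

    /** Returns the one bucket where x must be if it is present (assumes h(x) = x.hashCode()). */
    private List<T> bucketFor(T x) {
        return this.hashTable.get(Math.floorMod(x.hashCode(), this.hashTable.size()));
    }

    /** Adds x unless it is already present. */
    void add(T x) {
        List<T> bucket = this.bucketFor(x);
        if (!bucket.contains(x)) {
            bucket.add(x);
        }
    }

    /** Reports whether x is present, searching only one bucket. */
    boolean contains(T x) {
        return this.bucketFor(x).contains(x);
    }

    /** Removes x if present and reports whether anything was removed. */
    boolean remove(T x) {
        return this.bucketFor(x).remove(x);
    }

    /** Returns the total number of elements across all buckets. */
    int size() {
        int total = 0;
        for (List<T> bucket : this.hashTable) {
            total += bucket.size();
        }
        return total;
    }

    public static void main(String[] args) {
        BucketSetSketch<Integer> s = new BucketSetSketch<>(3);
        s.add(13);
        s.add(-2);
        System.out.println(s.contains(13) + " " + s.contains(5) + " " + s.size());
    }
}
```

Note that add here tolerates duplicates by checking first, and size sums over the buckets; both are simplifications chosen for the sketch and may differ from the contracts and costs in the OSU components.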

  38. Details
      • Suppose further (for illustration purposes) that:
        – T = Integer
        – h(x) = x
        – m = |$this.hashTable| = 3

  39. Details: Here and in upcoming contracts, we’ll model Java arrays as mathematical strings.

  40. Examples
       Abstract (this)    Concrete ($this.hashTable)
       {}                 <{}, {}, {}>
       {13}
       {5, 13}
       {-2, 13}

  41. Examples: {13} is represented as <{}, {13}, {}>

  42. Examples: Why is 13 in bucket 1? h(x) mod m = h(13) mod 3 = 13 mod 3 = 1

  43. Examples: {5, 13} is represented as <{}, {13}, {5}>

  44. Examples
       Abstract (this)    Concrete ($this.hashTable)
       {}                 <{}, {}, {}>
       {13}               <{}, {13}, {}>
       {5, 13}            <{}, {13}, {5}>
       {-2, 13}           <{}, {-2, 13}, {}>
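The examples above can be checked mechanically. The following sketch assumes T = Integer, h(x) = x, and m = 3 as in the slides; the class name HashTableExamples and the method represent are hypothetical, and java.util.TreeSet is used only so that each bucket prints in sorted order. Java prints lists and sets with square brackets, so <{}, {13}, {5}> appears as [[], [13], [5]].

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper that reproduces the abstract-to-concrete examples. */
final class HashTableExamples {

    /** Distributes elements into m buckets using h(x) = x and bucket index h(x) mod m. */
    static List<Set<Integer>> represent(int[] elements, int m) {
        List<Set<Integer>> hashTable = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            hashTable.add(new TreeSet<Integer>());
        }
        for (int x : elements) {
            // floorMod keeps the index in 0..m-1 even when x is negative.
            hashTable.get(Math.floorMod(x, m)).add(x);
        }
        return hashTable;
    }

    public static void main(String[] args) {
        System.out.println(represent(new int[] {}, 3));       // [[], [], []]
        System.out.println(represent(new int[] {13}, 3));     // [[], [13], []]
        System.out.println(represent(new int[] {5, 13}, 3));  // [[], [13], [5]]
        System.out.println(represent(new int[] {-2, 13}, 3)); // [[], [-2, 13], []]
    }
}
```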
