
Hash Tables



  1. Hash Tables

  2. LAST: Unbounded arrays, Amortized analysis • TODAY: Hashing, Genericity • NEXT: Implementing hash tables

  3. Introduction to C1 (genericity)

  4. Implicit contract for casting • (void*) x, where x has type tp*: //@ensures \hastag(tp*, x) • (tp*) y, where y has type void*: //@requires \hastag(tp*, y)

  5. The only operations allowed on p of type void* • Cast to another type: (int*) p • Compare to another void* value: p == q, where q is of type void* • Compare to NULL: p == NULL
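The casting discipline from slides 4 and 5 can be sketched in plain C. Note the assumptions: standard C has no \hastag annotation, so the "this void* really came from an int*" contract is only a comment here, and the helper names box_int, unbox_int, and same_pointer are hypothetical, not part of the course library.

```c
#include <assert.h>
#include <stdlib.h>

/* Box an int on the heap and hand it out as a void*.
   The caller must remember that the pointer really is an int*. */
void *box_int(int value) {
    int *p = malloc(sizeof(int));
    assert(p != NULL);        /* sketch: treat allocation failure as fatal */
    *p = value;
    return (void *)p;         /* cast tp* to void*: always allowed */
}

/* Recover the int.
   Precondition (unchecked in plain C): p was created from an int*. */
int unbox_int(void *p) {
    assert(p != NULL);        /* comparing a void* to NULL is allowed */
    int *ip = (int *)p;       /* cast back to the original pointer type */
    return *ip;
}

/* Comparing two void* values for equality is allowed. */
int same_pointer(void *p, void *q) {
    return p == q;
}
```

Nothing here dereferences the void* directly: the pointer is cast back to int* first, which mirrors the restriction on slide 5.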

  6. Hashing

  7. Reflecting on arrays • As a way to keep a collection of elements of the same type, like a set • As a mapping from indices to values, like a dictionary • Operations: insert, lookup • Goal: make these operations efficient

  8. Dictionaries (also known as maps or associative arrays) • An array is a mapping from indices to elements: in A[i] = e, the index i plays the role of the key and the element e the role of the entry • Dictionary: a mapping from keys to entries, where a key can be any kind of information • zipcode (key) to neighborhood name (entry) • Andrew id (key) to home address (entry) • SSN (key) to tax id (entry)

  9. Implementing dictionaries • Unsorted array with (key, entry) data: lookup O(n), insert O(1) amortized • Linked list with (key, entry) data: lookup O(n), insert O(1) • Array with (key, entry) data sorted by key: lookup O(log n), insert O(n) • Can we implement dictionaries such that both lookup and insert are about O(1)?
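To make the lookup costs on this slide concrete, here is a minimal sketch of the two array-based lookups, assuming integer keys and string entries. The struct pair type and the function names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct pair { int key; const char *entry; };

/* Unsorted array: scan every element, O(n) in the worst case. */
const char *lookup_unsorted(const struct pair *A, int n, int key) {
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (A[i].key == key) return A[i].entry;
    }
    return NULL;                      /* key not present */
}

/* Array sorted by key: binary search, O(log n).
   Precondition (unchecked): A[0..n) is sorted by key in ascending order. */
const char *lookup_sorted(const struct pair *A, int n, int key) {
    assert(n >= 0);
    int lo = 0, hi = n;               /* candidate range is [lo, hi) */
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2; /* avoids overflow of lo + hi */
        if (A[mid].key == key) return A[mid].entry;
        if (A[mid].key < key) lo = mid + 1;
        else hi = mid;
    }
    return NULL;                      /* key not present */
}
```

The sorted version pays for its faster lookup at insertion time, since keeping the array sorted means shifting up to n elements.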

  10. Example: Storing zipcodes using an array with length 5 (indices 0 to 4) • Some fun zip codes (key: value) • 90210: Beverly Hills • 10101: New York • 20500: White House • 44444: Newton Falls, OH • 94043: Googleplex • 15213: CMU • 15217: Squirrel Hill • 15122: Kennywood

  11. Example: Storing zipcodes using an array with length 5 • The key (a zipcode) is turned into a hash value and then into an array index: zipcode % 5 • Some fun zip codes (key: value) • 90210: Beverly Hills • 10101: New York • 20500: White House • 44444: Newton Falls, OH • 94043: Googleplex • 15213: CMU • 15217: Squirrel Hill • 15122: Kennywood
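The index computation from this slide is small enough to write down directly. This is a sketch assuming non-negative integer keys; the function name hash_index is hypothetical.

```c
#include <assert.h>

/* Map a zipcode (the key) to an index of an array of length m. */
int hash_index(int zipcode, int m) {
    assert(m > 0);
    assert(zipcode >= 0);   /* assumption: keys are non-negative, so % is too */
    return zipcode % m;     /* result is in the range [0, m) */
}
```

With m = 5, the slide's zip codes map as follows: 90210 and 20500 to index 0, 10101 to 1, 15217 and 15122 to 2, 94043 and 15213 to 3, and 44444 to 4. Eight keys cannot fit into five slots without sharing, which is why collisions have to be handled.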

  12. Design choices for handling collisions • Open addressing (e.g. linear probing) • Separate chaining

  13. Example: linear probing • Look for an empty slot somewhere predictable: the next position, then the one after that, … • Keys inserted, in order: 15217 Squirrel Hill, 20500 White House, 90210 Beverly Hills, 10101 New York • Resulting array: 0 “White House”, 1 “Beverly Hills”, 2 “Squirrel Hill”, 3 “New York”, 4 (empty)

  14. Example: linear probing • How do you know something is not in the table? • Keys inserted, in order: 15217 Squirrel Hill, 20500 White House, 90210 Beverly Hills, 10101 New York • Resulting array: 0 “White House”, 1 “Beverly Hills”, 2 “Squirrel Hill”, 3 “New York”, 4 (empty)
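One way to answer the question on this slide is sketched below: a lookup can stop as soon as it reaches an empty slot, because an insert would have used that slot. The sketch assumes a fixed table of length 5, non-negative integer keys, the index function key % 5, and no deletions (deleting would break the "stop at an empty slot" argument). All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define M 5   /* table length, as in the slide's example */

struct slot { int used; int key; const char *entry; };
struct table { struct slot slots[M]; };

/* Mark every slot as empty. */
void lp_init(struct table *T) {
    assert(T != NULL);
    for (int i = 0; i < M; i++) {
        T->slots[i].used = 0;
        T->slots[i].key = 0;
        T->slots[i].entry = NULL;
    }
}

/* Insert (key, entry); returns the index used, or -1 if the table is full.
   If the key is already present, its entry is overwritten. */
int lp_insert(struct table *T, int key, const char *entry) {
    assert(T != NULL && key >= 0);
    int start = key % M;
    for (int k = 0; k < M; k++) {
        int i = (start + k) % M;          /* next position, wrapping around */
        if (!T->slots[i].used || T->slots[i].key == key) {
            T->slots[i].used = 1;
            T->slots[i].key = key;
            T->slots[i].entry = entry;
            return i;
        }
    }
    return -1;                            /* every slot holds another key */
}

/* Look up key; returns its entry, or NULL if the key is not in the table. */
const char *lp_lookup(const struct table *T, int key) {
    assert(T != NULL && key >= 0);
    int start = key % M;
    for (int k = 0; k < M; k++) {
        int i = (start + k) % M;
        if (!T->slots[i].used) return NULL;   /* empty slot: key was never inserted */
        if (T->slots[i].key == key) return T->slots[i].entry;
    }
    return NULL;                          /* probed the whole table */
}
```

Inserting 15217, 20500, 90210, and 10101 in that order places them at indices 2, 0, 1, and 3, matching the array on the slide. Looking up 15122 then probes indices 2 and 3, reaches the empty slot 4, and reports that the key is absent.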

  15. Example: separate chaining (diagram: an array with indices 0 to 4)
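Separate chaining can be sketched as an array of linked lists, one chain per index. The following is a minimal illustration, assuming a fixed array of length 5, non-negative integer keys, and the index function key % 5; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define M 5   /* number of chains, as in the slide's example */

struct node { int key; const char *entry; struct node *next; };
struct chain_table { struct node *buckets[M]; };

/* Start with M empty chains. */
void ch_init(struct chain_table *T) {
    assert(T != NULL);
    for (int i = 0; i < M; i++) T->buckets[i] = NULL;
}

/* Walk the one chain the key hashes to. */
const char *ch_lookup(const struct chain_table *T, int key) {
    assert(T != NULL && key >= 0);
    for (const struct node *p = T->buckets[key % M]; p != NULL; p = p->next) {
        if (p->key == key) return p->entry;
    }
    return NULL;                     /* key not in its chain, so not in the table */
}

/* Overwrite the entry if the key exists, otherwise prepend a new node. */
void ch_insert(struct chain_table *T, int key, const char *entry) {
    assert(T != NULL && key >= 0);
    int i = key % M;
    for (struct node *p = T->buckets[i]; p != NULL; p = p->next) {
        if (p->key == key) { p->entry = entry; return; }
    }
    struct node *fresh = malloc(sizeof *fresh);
    assert(fresh != NULL);           /* sketch: treat allocation failure as fatal */
    fresh->key = key;
    fresh->entry = entry;
    fresh->next = T->buckets[i];
    T->buckets[i] = fresh;
}

/* Number of entries in chain i. */
int ch_chain_length(const struct chain_table *T, int i) {
    assert(T != NULL && 0 <= i && i < M);
    int len = 0;
    for (const struct node *p = T->buckets[i]; p != NULL; p = p->next) len++;
    return len;
}

/* Release all nodes. */
void ch_free(struct chain_table *T) {
    assert(T != NULL);
    for (int i = 0; i < M; i++) {
        struct node *p = T->buckets[i];
        while (p != NULL) { struct node *next = p->next; free(p); p = next; }
        T->buckets[i] = NULL;
    }
}
```

With the eight zip codes from the earlier slides, the chains at indices 0 through 4 end up with lengths 2, 1, 2, 2, and 1. Unlike linear probing, a collision never pushes an entry into another key's slot.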

  16. Cost analysis of separate chaining • If we have an array of size m and a total of n entries, how long does it take to look up an entry?

  17. Worst possible layout: all n entries land in a single chain of the array of size m, so lookup is O(n)

  18. Best possible layout: the n entries are spread evenly, so each of the m chains has length n/m and lookup is O(n/m)

  19. Cost analysis of separate chaining • Can we arrange for n/m to stay constant? • Use resizing, as we did with unbounded arrays
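The resizing idea can be sketched as follows: when the number of entries n would exceed the number of chains m, allocate a table twice as large and rehash every entry. The threshold n/m ≤ 1 and the doubling factor are assumptions made for this illustration, as are all type and function names, and keys are assumed to be non-negative integers.

```c
#include <assert.h>
#include <stdlib.h>

struct hnode { int key; const char *entry; struct hnode *next; };
struct htable { int m; int n; struct hnode **buckets; };  /* m chains, n entries */

static struct hnode **new_buckets(int m) {
    struct hnode **b = malloc((size_t)m * sizeof *b);
    assert(b != NULL);                 /* sketch: allocation failure is fatal */
    for (int i = 0; i < m; i++) b[i] = NULL;
    return b;
}

struct htable *ht_new(int m) {
    assert(m > 0);
    struct htable *H = malloc(sizeof *H);
    assert(H != NULL);
    H->m = m;
    H->n = 0;
    H->buckets = new_buckets(m);
    return H;
}

/* Move every node into a table with new_m chains; indices must be recomputed. */
static void ht_resize(struct htable *H, int new_m) {
    struct hnode **nb = new_buckets(new_m);
    for (int i = 0; i < H->m; i++) {
        struct hnode *p = H->buckets[i];
        while (p != NULL) {
            struct hnode *next = p->next;
            int j = p->key % new_m;    /* index under the new table size */
            p->next = nb[j];
            nb[j] = p;
            p = next;
        }
    }
    free(H->buckets);
    H->buckets = nb;
    H->m = new_m;
}

const char *ht_lookup(const struct htable *H, int key) {
    assert(H != NULL && key >= 0);
    for (const struct hnode *p = H->buckets[key % H->m]; p != NULL; p = p->next)
        if (p->key == key) return p->entry;
    return NULL;
}

void ht_insert(struct htable *H, int key, const char *entry) {
    assert(H != NULL && key >= 0);
    for (struct hnode *p = H->buckets[key % H->m]; p != NULL; p = p->next)
        if (p->key == key) { p->entry = entry; return; }
    if (H->n >= H->m) ht_resize(H, 2 * H->m);   /* keep n/m at most 1 */
    struct hnode *fresh = malloc(sizeof *fresh);
    assert(fresh != NULL);
    int i = key % H->m;                /* computed after any resize */
    fresh->key = key;
    fresh->entry = entry;
    fresh->next = H->buckets[i];
    H->buckets[i] = fresh;
    H->n++;
}

void ht_free(struct htable *H) {
    assert(H != NULL);
    for (int i = 0; i < H->m; i++) {
        struct hnode *p = H->buckets[i];
        while (p != NULL) { struct hnode *next = p->next; free(p); p = next; }
    }
    free(H->buckets);
    free(H);
}
```

Starting from m = 2 and inserting the eight zip codes, the table grows to 4 and then to 8 chains. Each resize costs O(n), but as with unbounded arrays the doubling spreads that cost over the inserts that preceded it, so insert stays O(1) amortized.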

  20. Implementing dictionaries • Unsorted array with (key, value) data: lookup O(n), insert O(1) amortized • Linked list with (key, value) data: lookup O(n), insert O(1) • Array with (key, value) data sorted by key: lookup O(log n), insert O(n) • Hash tables: lookup O(n/m) average, insert O(n/m) average; with resizing keeping n/m constant, both are O(1) average and amortized
