first things first


  1. First things first…
     • Project 1
       – Still working on grading
       – Definitely by tomorrow

     Hashing
     Introduction

     Second things second…
     • Project 2
       – try targets are now up.
       – Mail about VFSystem.find()
       – Minimum Submission
         • Entry.java and Document.java
         • Due this Friday, February 6th
         • REMEMBER MINIMUM SUBMISSION RULE!!!
       – Final Submission
         • Due Sunday, February 15th

     Third things third…
     • Exam 2: This Wednesday
       – Topics:
         • Java I/O
         • Recursion
         • Analysis of Algorithms
         • Searching
         • Sorting
     • Review Session
       – Tonight 4-6 Building 70 Auditorium

     Exam Topics
     • Java I/O
       – 4 Basic Classes:
         • Reader, Writer – for character data
         • InputStream, OutputStream – for byte data
       – Wrapper Classes – for high level I/O
       – Do not memorize methods…will provide Javadocs as needed.

     Exam Topics
     • Recursion
       – What is recursion
       – Step through a recursive function
       – Avoid this guy…
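As a reminder of how the four basic I/O classes and their wrappers fit together, here is a minimal sketch. The class name `IoSketch` and both method names are hypothetical, chosen only for illustration; the `java.io` classes themselves are standard. A `BufferedReader` wraps a character `Reader`, and a `DataOutputStream` wraps a byte `OutputStream`.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration class; not part of any course code.
class IoSketch {
    // Character data: BufferedReader (wrapper) around a Reader.
    static String firstLine(String text) {
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            return in.readLine(); // high-level call supplied by the wrapper
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Byte data: DataOutputStream (wrapper) around an OutputStream.
    static byte[] encodeInt(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value); // writes four bytes, high byte first
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

The wrappers add the convenient methods (`readLine`, `writeInt`); the wrapped objects decide where the characters or bytes actually come from or go to.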

  2. Exam Topics
     • Speaking of recursion
       – Note also that you can always turn a recursive solution into an iterative solution by creating and maintaining a “state stack”
         • Which is exactly what a recursive system does under the hood.
         • …and no, this will not be on the exam!!!

     Exam Topics
     • Analysis of algorithms
       – Big O
       – Big Theta
         • Difference between the two
       – Calculating
         • Loop within a loop

     Exam Topics
     • Sorting
       – Simple Sort
         • Insertion
         • Selection
         • Bubble
         • All Θ(n²) – average case
       – Divide and Conquer Sorts
         • Merge Sort
         • Quicksort
         • Both Θ(n log n) – average case

     Exam Topics
     • Searching
       – Linear Search
         • Θ(n)
       – Binary Search
         • Θ(log n)

     Exam Topics
     • Questions?

     Searching
     • Suppose we are given a collection of items and we will need to see if a given object is in the collection:
       – Linear Search
         • Θ(n)
       – Binary Search
       – Binary Search Tree
         • Θ(log n)
     • Can we do better?
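To make the Θ(n) versus Θ(log n) comparison concrete, here is a minimal sketch of both searches over an `int` array. The class `SearchSketch` and its method names are hypothetical, and the binary search assumes the array is already sorted in ascending order.

```java
// Hypothetical illustration class; not part of any course code.
class SearchSketch {
    // Linear search: look at each element in turn.
    // Worst case examines all n elements, hence Θ(n).
    static int linearSearch(int[] items, int target) {
        for (int i = 0; i < items.length; i++) {
            if (items[i] == target) {
                return i;
            }
        }
        return -1; // not found
    }

    // Binary search: ASSUMES the array is sorted ascending.
    // Each step halves the remaining range, hence Θ(log n).
    static int binarySearch(int[] sorted, int target) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // avoids overflow of (lo + hi)
            if (sorted[mid] == target) {
                return mid;
            } else if (sorted[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1; // not found
    }
}
```

Both methods return the index of the target, or -1 when it is absent.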

  3. Hashing
     • What if the object itself can give its location in the collection
     • This is called Hashing

     Hashing Terminology
     [Diagram: Object → Hash function → Index into bucket array → buckets; the bucket array is the Hash table]

     About Hashing functions
     • Converts object to index into bucket array.
     • Goal
       – Distribute objects equally among buckets
       – Bad function
         • Add first 3 character codes of a string
       – Good function
         • Add all character codes of a string
         • Address where object is found in memory
     • Should be Efficient

     About Hashing functions
     • Hashing rules
       – Hashing function called on same object must always return same value
       – Ideal hashing function will produce “almost random-like” values when applied on different objects.

     About Hashing functions
     • Ultimately, hashing function will need to fit within the bounds of an array.
       – index = (hash(O)) % n

     Operations on Hash tables
     • Insert
       – add an object to the hash table
     • Remove
       – remove an object from the hash table
     • Find
       – Determine if a given object is in the hash table.
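The “bad” and “good” string hash functions above, and the final `% n` step, can be sketched as follows. The class `HashFunctionSketch` and its method names are hypothetical. One assumption beyond the slides: the index step uses `Math.floorMod` rather than a bare `%`, because Java's `%` yields a negative result for a negative hash value (which `hashCode()` can return).

```java
// Hypothetical illustration class; not part of any course code.
class HashFunctionSketch {
    // "Bad" function: adds only the first 3 character codes,
    // so every string with the same 3-character prefix collides.
    static int badHash(String s) {
        int sum = 0;
        for (int i = 0; i < Math.min(3, s.length()); i++) {
            sum += s.charAt(i);
        }
        return sum;
    }

    // "Good" function from the slide: adds all character codes.
    static int goodHash(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);
        }
        return sum;
    }

    // index = hash(O) % n, made safe for negative hash values.
    static int toIndex(int hash, int n) {
        return Math.floorMod(hash, n);
    }
}
```

For example, `badHash("hashing")` and `badHash("hashtable")` are equal because both strings start with "has", while `goodHash` separates them.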

  4. Insert
     [Diagram: Object → Hash function → Index into bucket array → buckets]
     1) Apply hash function to object
     2) Add object to the index returned by the hash function

     Remove
     [Diagram: Object → Hash function → Index into bucket array → buckets]
     1) Apply hash function to object to see where it would be if in the hash table
     2) If item is there, remove it (and replace with a “blank object”)

     Find
     [Diagram: Object → Hash function → Index into bucket array → buckets]
     1) Apply hash function to object to see where it would be if in the hash table
     2) If item is there return true, else return false

     Advantages of hashing
     • Insert, Remove, Find
       – Performed in constant time
       – Time dependent only on complexity of hash function.

     Collisions
     • What happens if two objects hash to the same index?
       – Hash functions aren’t perfect!
       – When this happens, it is called a collision.
     • How do we handle collisions?

     Open address hashing
     • Ways to deal with collisions
       – Open-address hashing – find another spot to put it
         • Linear Probing – go to next unfilled bucket
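The Insert, Remove, and Find steps above, before any collision handling is added, might look like the following sketch. The class `SimpleHashTable` is hypothetical and stores strings only. It hashes by adding all character codes, and its `insert` simply reports failure on a collision instead of resolving it. Because nothing is probed here, a removed bucket is reset to `null` rather than to a “blank object”.

```java
// Hypothetical illustration class; deliberately has NO collision handling.
class SimpleHashTable {
    private final String[] buckets;

    SimpleHashTable(int n) {
        buckets = new String[n]; // assumes n > 0
    }

    // Hash: add all character codes, then fit into the array bounds.
    private int indexOf(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);
        }
        return sum % buckets.length;
    }

    // Returns false when a different object already occupies the bucket.
    boolean insert(String s) {
        int i = indexOf(s);
        if (buckets[i] != null && !buckets[i].equals(s)) {
            return false; // collision
        }
        buckets[i] = s;
        return true;
    }

    // True only if the object sits in the bucket its hash points to.
    boolean find(String s) {
        return s.equals(buckets[indexOf(s)]);
    }

    boolean remove(String s) {
        int i = indexOf(s);
        if (!s.equals(buckets[i])) {
            return false; // not in the table
        }
        buckets[i] = null;
        return true;
    }
}
```

Each operation computes one hash and touches one bucket, which is where the constant-time claim comes from. Inserting "ab" and then "ba" shows the problem: both add up to the same character-code sum, so the second insert collides.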

  5. Linear Probing – Insert
     [Diagram: Object → Hash function → Index into bucket array → buckets]
     1) Apply hash function to object to see where it should go
     2) If bucket is full, then find next available bucket.
     3) If no open bucket found, start again at top of hash table

     Linear Probing – Find
     [Diagram: Object → Hash function → Index into bucket array → buckets]
     1) Apply hash function to object to see where it should be
     2) If bucket has item in it, return true
     3) If bucket does not have object in it, but is not empty, traverse the table until either the object or an empty bucket is found

     Linear Probing – Why we need the “blank” object
     [Diagram: Object → Hash function → Index into bucket array → buckets]
     1) Apply hash function to object to see where it should be
     2) If bucket has item in it, return true
     3) If bucket does not have object in it, but is not empty, traverse the table until either the object or an empty bucket is found
     • If Remove left a truly empty bucket, step 3 would stop there and miss any object that was placed past it.

     Linear Probing
     • Clustering
       – If your hash function is less than optimal
         • Many objects hashing to the same index
         • End up with clustering
     • In the worst case
       – All objects hash to the same index
       – Must do a linear search through the hash table
       – Θ(n)

     Linear probing
     [Diagram: Object → Hash function → Index into bucket array → buckets, with a cluster of adjacent filled buckets]

     Double Hashing
     • Another way to deal with collisions
       – Open-address hashing – find another spot to put it
         • Double hashing – use a second hash function to determine how many slots forward to look
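The linear probing steps, including the role of the “blank” object, can be sketched as below. The class `LinearProbingTable`, the `BLANK` sentinel, and the helper `locate` are all hypothetical names. The sketch stores strings, hashes by adding character codes, and treats a table with no free bucket as a failed insert.

```java
// Hypothetical illustration class; a sketch, not a tuned implementation.
class LinearProbingTable {
    // Sentinel "blank object", compared by identity (==), never by equals.
    private static final String BLANK = new String("<blank>");
    private final String[] buckets;

    LinearProbingTable(int n) {
        buckets = new String[n]; // assumes n > 0
    }

    private int home(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);
        }
        return sum % buckets.length;
    }

    // Probe from the home bucket; stop at the object or a truly empty bucket.
    private int locate(String s) {
        int i = home(s);
        for (int probes = 0; probes < buckets.length; probes++) {
            if (buckets[i] == null) {
                return -1; // empty bucket: the object cannot be further on
            }
            if (buckets[i] != BLANK && buckets[i].equals(s)) {
                return i;
            }
            i = (i + 1) % buckets.length; // wrap around to the top
        }
        return -1;
    }

    boolean find(String s) {
        return locate(s) >= 0;
    }

    boolean insert(String s) {
        if (find(s)) {
            return true; // already present
        }
        int i = home(s);
        for (int probes = 0; probes < buckets.length; probes++) {
            if (buckets[i] == null || buckets[i] == BLANK) {
                buckets[i] = s; // blank buckets may be reused
                return true;
            }
            i = (i + 1) % buckets.length;
        }
        return false; // table is full
    }

    boolean remove(String s) {
        int i = locate(s);
        if (i < 0) {
            return false;
        }
        buckets[i] = BLANK; // keep the probe chain intact
        return true;
    }
}
```

With a table of size 8, "ab" and "ba" share a home bucket, so "ba" lands one slot later. After `remove("ab")`, `find("ba")` still succeeds only because the home bucket holds `BLANK` rather than `null`.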

  6. Double Hashing – Insert
     [Diagram: Object → hash1 → Index into bucket array; Object → hash2 → Index increment (2); buckets]
     1) Apply hash function to object to see where it should be
     2) If bucket is full, apply a second hash function to get an increment
     3) Apply increment to index until empty bucket is found

     Double Hashing
     • Double hashing
       – Hash function considerations
         • Must assure that increment returned by second hash function will result in all empty buckets being visited.
         • Can assure this by making the “range” of the two hashing functions relatively prime.
           – The two ranges have no common factors except 1.

     Double Hashing
     • Double hashing
       – Hash function considerations
         • In our example, range of hash1 is 8 (size of hash table)
         • Hash2 returns a 2.
         • 8 is a multiple of 2
         • Problem!

     Double Hashing
     • Double hashing
       – Hash function considerations
       – Finding relatively prime numbers
         • Make the size of the hash table prime
         • Make the range of hash2 the size of the hash table – 2.
           – Twin primes.
         • Note that hash2 should never return 0.

     Double Hashing – Find
     [Diagram: Object → hash1 → Index into bucket array; Object → hash2 → Index increment (2); buckets]
     1) Apply hash function to object to see where it should be
     2) If bucket contains object return true.
     3) Else apply a second hash function to get an increment
     4) Apply increment to index until object or empty bucket is found
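The “Problem!” with a table of size 8 and an increment of 2 can be checked directly by counting how many distinct buckets a probe sequence reaches. The class `DoubleHashingSketch` and both method names are hypothetical. The `hash2` shown is one possible way, assumed here, to get an increment in the range 1 to (table size – 2) that is never 0.

```java
// Hypothetical illustration class; not part of any course code.
class DoubleHashingSketch {
    // Counts the distinct buckets reached by repeatedly adding the
    // increment, starting from a non-negative start index.
    static int bucketsVisited(int tableSize, int start, int increment) {
        boolean[] seen = new boolean[tableSize];
        int count = 0;
        int i = start % tableSize;
        while (!seen[i]) {
            seen[i] = true;
            count++;
            i = (i + increment) % tableSize;
        }
        return count;
    }

    // One possible second hash (an assumption): values 1 .. tableSize - 2.
    // Assumes tableSize > 2; the "1 +" guarantees the result is never 0.
    static int hash2(int hash, int tableSize) {
        return 1 + Math.floorMod(hash, tableSize - 2);
    }
}
```

`bucketsVisited(8, 3, 2)` reaches only 4 of the 8 buckets, because 8 and 2 share the factor 2. `bucketsVisited(7, 3, 2)` reaches all 7: when the table size is prime, every increment from 1 to 5 is relatively prime to it.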

  7. Open-address hashing
     • In case of collision
       – Find another open bucket to place your object
         • Linear Probing – Search for empty bucket sequentially
         • Double hashing – Use a second hash function to get an increment
       – Questions?

     Next time
     • Chained Hashing
       – Another way to deal with collisions.
