16 Hash Tables

  1. 16 - Hash Tables

  2. Bit Vector Representation
     - Suppose we want to store a set S ⊆ [0, d].
     - A bit vector representation of S is a Boolean array B of size d+1 s.t. {i : B[i] is true} = S.
     - Eg: d = 20, S = {3, 7, 9}: B[3], B[7] and B[9] are true, every other cell is false.
     - Operations insert(x), remove(x), member(x) are all O(1).
     - Only practical where d is small.
     - Space inefficient if |S| ≪ d.
     - Copy, Union, Intersection are all Θ(d).
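The bit vector operations can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the class name `BitVectorSet` and the use of a Python list of booleans are assumptions made for the example.

```python
class BitVectorSet:
    """Set of integers in [0, d], stored as a Boolean array of size d + 1."""

    def __init__(self, d):
        self.bits = [False] * (d + 1)

    def insert(self, x):
        self.bits[x] = True       # O(1)

    def remove(self, x):
        self.bits[x] = False      # O(1)

    def member(self, x):
        return self.bits[x]       # O(1)

    def union(self, other):
        # Theta(d): touches every cell, however few elements are stored.
        # Assumes both sets were built with the same d.
        result = BitVectorSet(len(self.bits) - 1)
        result.bits = [a or b for a, b in zip(self.bits, other.bits)]
        return result


s = BitVectorSet(20)
for x in (3, 7, 9):
    s.insert(x)
print([i for i, bit in enumerate(s.bits) if bit])   # [3, 7, 9]
```

Note that `union` walks all d+1 cells even though only three are set, which is the space and time cost the slide warns about when |S| is much smaller than d.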

  3. Hash Functions
     - A hash function for a set D is a function h : D → M where |M| ≤ |D|, i.e. a map to a smaller set.
     - Eg: h : [0, MAXINT] → [0, 12], h(x) = x mod 13 (|M| = 13, MAXINT = 2,147,483,647).
     - There will be values x, y ∈ D s.t. x ≠ y but h(x) = h(y).
     - Notation: define h(S) = {y : y = h(x) and x ∈ S}.
     - Eg: h(3) = 3; h(7) = 7; h(13) = 0; h(15) = 2; h(20) = 7, so h({3, 7, 13, 15, 20}) = {0, 2, 3, 7}.
     - If h(x) = h(y) for x, y ∈ S with x ≠ y, we call it a collision (eg 7, 20).
     - We will want hash functions h s.t.:
       - ran(h) = [0, m−1] for some m ∈ ℕ (array indices)
       - h tends to distribute S uniformly over [0, m−1]
       - m = |M| will be prime
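The notation h(S) and the idea of a collision can be checked directly on the slide's example. The helper names `image` and `collisions` below are made up for this sketch.

```python
from collections import defaultdict

M = 13


def h(x):
    return x % M


def image(S):
    """h(S) = {h(x) : x in S}."""
    return {h(x) for x in S}


def collisions(S):
    """Groups of distinct elements of S that share a hash value."""
    buckets = defaultdict(list)
    for x in sorted(S):
        buckets[h(x)].append(x)
    return [group for group in buckets.values() if len(group) > 1]


S = {3, 7, 13, 15, 20}
print(sorted(image(S)))   # [0, 2, 3, 7]
print(collisions(S))      # [[7, 20]]
```

Five elements map to only four hash values, so at least one pair must collide; here it is 7 and 20.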

  4. Hash Function + Bit Vector
     - Let h : D → [0, m−1], and let B be a Boolean array of size m.
     - For a set S ⊆ D, set B[i] = true iff there is x ∈ S s.t. h(x) = i. So {i : B[i]} = h(S).
     - Eg: S = {3, 7, 13, 15, 20}; m = 13; h(x) = x mod 13; h(S) = {0, 2, 3, 7}, so B is true exactly at cells 0, 2, 3, 7.
     - {x : B[h(x)]} = {0, 2, 3, 7, 13, 15, 16, 20, 26, 28, ...}
     - "B[h(x)] = 1" suggests x ∈ S.
     - "B[h(x)] = 0" implies x ∉ S.
     - I.e. there may be false positives, but never false negatives.
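A short sketch of this structure, using the slide's S and h. The function names `build` and `maybe_member` are illustrative only.

```python
m = 13


def h(x):
    return x % m


def build(S):
    """Boolean array with B[i] true iff some x in S has h(x) = i."""
    B = [False] * m
    for x in S:
        B[h(x)] = True
    return B


def maybe_member(B, x):
    # True means "possibly in S"; False means "definitely not in S".
    return B[h(x)]


B = build({3, 7, 13, 15, 20})
print(maybe_member(B, 15))   # True  (15 is in S)
print(maybe_member(B, 16))   # True  (false positive: 16 mod 13 = 3 = h(3))
print(maybe_member(B, 4))    # False (4 is certainly not in S)
```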

  5. Bloom Filters
     - Let H = {h₁, h₂, ..., h_k} be a set of distinct hash functions for a set D, each with range [0, m−1].
     - For S ⊆ D, set B[i] = true if h(x) = i for some h ∈ H and x ∈ S; B[i] = false o.w.
     - To test for membership of x in S:
       - if B[h(x)] = true for all h ∈ H, return true
       - o.w. return false
     - We get a false positive only when h(x) is a collision for every h ∈ H.
     - B is a Bloom filter.
     - If m is large enough relative to |S| and the hᵢ are good quality, independent hash functions, then there will be few false positives.
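A minimal Bloom filter sketch follows. The two hash functions (x mod 13 and x mod 11) are chosen only to keep the arithmetic easy to follow; they are assumptions of this example, not the slides' functions, and are not good-quality hashes.

```python
class BloomFilter:
    def __init__(self, m, hash_functions):
        self.bits = [False] * m
        self.hash_functions = hash_functions   # each must map into [0, m-1]

    def insert(self, x):
        for h in self.hash_functions:
            self.bits[h(x)] = True

    def might_contain(self, x):
        # True only if every hash function points at a set bit.
        return all(self.bits[h(x)] for h in self.hash_functions)


def h1(x):
    return x % 13


def h2(x):
    return x % 11   # illustrative second hash; its range [0, 10] fits in [0, 12]


bf = BloomFilter(13, [h1, h2])
for x in (3, 7, 13, 15, 20):
    bf.insert(x)

print(bf.might_contain(15))   # True  (15 was inserted)
print(bf.might_contain(16))   # False (h1(16) = 3 is set, but h2(16) = 5 is not)
print(bf.might_contain(26))   # True  (false positive: bits 0 and 4 are both set)
```

With h1 alone, 16 would be a false positive; the second hash function rules it out. 26 still slips through because it collides under both functions.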

  6. Hash Tables
     - Let h : D → M be a hash function for D with M = [0, m−1].
     - Let A be an array of size |M| and type D ∪ {−} (− marks an empty cell), i.e. A : M → D ∪ {−}.
     - For a set S ⊆ D, we want:
       - A[h(x)] = x, for each x ∈ S
       - A[i] = −, if h(x) ≠ i for every x ∈ S
     - Eg: S = {2, 12, 17, 21}, h(x) = x mod 13, h(S) = {2, 12, 4, 8}, so A[2] = 2, A[12] = 12, A[4] = 17, A[8] = 21 and every other cell is −.
     - To check membership of x in S, return A[h(x)] = x.
     - A is a hash table.
     - But what if we have collisions?
     - Need collision handling. We will look at a few methods.

  7. Hashing with Separate Chaining
     - Let A be a size-m array of linked lists.
     - Set A[i] to be a list of the elements {x ∈ S : h(x) = i}.
     - To test for membership of x in S: return true iff x is in the list A[h(x)].
     - Eg: S = {1, 5, 7, 13, 18, …} [figure: array A with one linked list per cell]
     - To insert/remove x: insert/remove x from A[h(x)].
     - If h distributes S almost uniformly over M, the lists will be small, and time will be essentially O(1).
     - In the worst case, some lists have length Θ(n), and performance degrades to that of linked lists: Θ(n).
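Separate chaining might look like this in Python. The sketch assumes integer keys, h(x) = x mod m with m = 13, and Python lists standing in for the linked lists; the class name `ChainedHashSet` is made up for the example.

```python
class ChainedHashSet:
    def __init__(self, m=13):
        self.m = m
        self.table = [[] for _ in range(m)]   # one chain per cell

    def _h(self, x):
        return x % self.m

    def member(self, x):
        return x in self.table[self._h(x)]    # scans only one chain

    def insert(self, x):
        chain = self.table[self._h(x)]
        if x not in chain:
            chain.append(x)

    def remove(self, x):
        chain = self.table[self._h(x)]
        if x in chain:
            chain.remove(x)


s = ChainedHashSet()
for x in (1, 5, 7, 13, 18):
    s.insert(x)
print(s.table[5])      # [5, 18]  (5 and 18 collide and share a chain)
print(s.member(18))    # True
```

Every operation costs time proportional to the length of one chain, which is why the quality of h matters so much.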

  8. Hashing with Probing (Open Addressing)
     - Let A be an array of size m and type D ∪ {−}, and h : D → [0, m−1] a hash function.
     - Let f be a function f : ℕ → ℕ that has f(0) = 0 and is monotone increasing (i.e. x > y ⇒ f(x) > f(y)).
     - Define hᵢ(x) = (h(x) + f(i)) mod m, for i ∈ ℕ.
     - Ex: f(i) = i, h(x) = x mod 13:
       - h₀(3) = h(3) + 0 = 3
       - h₁(3) = h(3) + 1 = 4
       - h₂(3) = h(3) + 2 = 5
     - To resolve collisions, probe the sequence of cells: A[h₀(x)], A[h₁(x)], A[h₂(x)], ...

  9. Hashing with Probing (Open Addressing): hᵢ(x) = (h(x) + f(i)) mod m
     - To check for membership of x:
       - Examine the sequence of locations A[h₀(x)], A[h₁(x)], A[h₂(x)], ...
       - Stop at the first location containing x or −.
       - Return true if x was found, false otherwise.
     - To insert x:
       - Examine the sequence of locations A[h₀(x)], A[h₁(x)], A[h₂(x)], ...
       - Stop at the first location containing −, and store x there.
     - Choice of f(i) determines properties.
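The membership and insert procedures can be written once, with f passed in as a parameter. This is a sketch under two assumptions not stated on the slides: keys are integers hashed with x mod m, and the probe sequence is cut off after m probes so that the loops always terminate.

```python
EMPTY = None


def probe_sequence(x, f, m):
    """Yield h_0(x), h_1(x), ... where h_i(x) = (h(x) + f(i)) mod m."""
    h = x % m
    for i in range(m):            # bounded at m probes (assumption of this sketch)
        yield (h + f(i)) % m


def member(A, x, f):
    for j in probe_sequence(x, f, len(A)):
        if A[j] is EMPTY:
            return False          # hit an empty cell: x cannot be further along
        if A[j] == x:
            return True
    return False


def insert(A, x, f):
    for j in probe_sequence(x, f, len(A)):
        if A[j] is EMPTY or A[j] == x:
            A[j] = x
            return j              # cell where x now lives
    raise RuntimeError("no empty cell found in the first m probes")


def linear(i):
    return i


def quadratic(i):
    return i * i


print(list(probe_sequence(3, linear, 13))[:3])      # [3, 4, 5]
print(list(probe_sequence(3, quadratic, 13))[:4])   # [3, 4, 7, 12]
```

Swapping `linear` for `quadratic` changes which cells are visited without touching `member` or `insert`, which is the sense in which the choice of f determines the table's properties.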

  10. Hashing with Linear Probing
     - Let f(i) = i.
     - The sequence of locations to probe is: A[h(x)], A[h(x)+1], A[h(x)+2], A[h(x)+3], ... (+ is mod m)
     - Eg: Suppose h(x) = x mod 13, S = {2, 9, 18, 36} (so h(S) = {2, 5, 9, 10}) and A has A[2] = 2, A[5] = 18, A[9] = 9, A[10] = 36, every other cell −.
     - To insert 5:
       - compute h(5) = 5
       - see that A[5] ≠ −
       - see that A[6] = −, so set A[6] = 5
     - Now A has A[2] = 2, A[5] = 18, A[6] = 5, A[9] = 9, A[10] = 36.
     - To check if 5 ∈ S:
       - compute h(5) = 5
       - see that A[5] ≠ −, A[5] ≠ 5
       - see that A[6] = 5 and return true
     - To check if 31 ∈ S:
       - compute h(31) = 5
       - see that A[5] ≠ 31, A[5] ≠ −
       - see that A[6] ≠ 31, A[6] ≠ −
       - see that A[7] = − and return false
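The linear probing example can be replayed in code. The sketch below records which cells each search examines; the function names and the use of `None` for an empty cell are assumptions of the example.

```python
M = 13
EMPTY = None


def probes(x):
    """Linear probing: h(x), h(x)+1, h(x)+2, ... (mod M)."""
    return [(x % M + i) % M for i in range(M)]


def insert(A, x):
    for j in probes(x):
        if A[j] is EMPTY:
            A[j] = x
            return j
    raise RuntimeError("table is full")


def search(A, x):
    """Return (cells examined, whether x was found)."""
    examined = []
    for j in probes(x):
        examined.append(j)
        if A[j] is EMPTY:
            return examined, False
        if A[j] == x:
            return examined, True
    return examined, False


A = [EMPTY] * M
for x in (2, 9, 18, 36):
    insert(A, x)

print(insert(A, 5))    # 6
print(search(A, 5))    # ([5, 6], True)
print(search(A, 31))   # ([5, 6, 7], False)
```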

  11. Hashing with Quadratic Probing
     - Let f(i) = i².
     - The sequence of locations to probe is: A[h(x)+0], A[h(x)+1], A[h(x)+4], A[h(x)+9], ... (+ is mod m)
     - Eg: Suppose h(x) = x mod 13, S = {2, 9, 18, 36} (so h(S) = {2, 5, 9, 10}) and A has A[2] = 2, A[5] = 18, A[9] = 9, A[10] = 36, every other cell −.
     - To insert 35:
       - compute h(35) = 9
       - see that A[9] ≠ −
       - see that A[10] ≠ −
       - see that A[0] = − and store 35 there
     - Now A has A[0] = 35, A[2] = 2, A[5] = 18, A[9] = 9, A[10] = 36.
     - To check if 35 ∈ S:
       - compute h(35) = 9
       - see that A[9] ≠ −, A[9] ≠ 35
       - see that A[10] ≠ −, A[10] ≠ 35
       - see that A[0] = 35 and return true
     - To check if 22 ∈ S:
       - compute h(22) = 9
       - see that A[9], A[10], A[0], A[5] are not 22 or −
       - see that A[12] = − and return false
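The probe cells in this example follow from (h(x) + i²) mod 13, which can be verified with a few lines of Python. Representing the table as a dict from cell index to key is a shortcut taken for this sketch only.

```python
M = 13


def quadratic_probes(x, count):
    """First `count` cells of the quadratic probe sequence for x."""
    return [(x % M + i * i) % M for i in range(count)]


# Cell index -> key for S = {2, 9, 18, 36}; absent indices are empty cells.
A = {2: 2, 5: 18, 9: 9, 10: 36}


def first_empty(x):
    for j in quadratic_probes(x, M):
        if j not in A:
            return j
    return None


print(quadratic_probes(35, 3))   # [9, 10, 0]
cell = first_empty(35)
print(cell)                      # 0
A[cell] = 35

print(quadratic_probes(22, 5))   # [9, 10, 0, 5, 12]

# Only 7 of the 13 cells are ever reachable from a given start cell.
print(len(set(quadratic_probes(9, M))))   # 7
```

The last line shows why insertion can fail even when the table has free space: with m = 13 a key only ever probes 7 distinct cells, so an unlucky key can find all of its cells occupied once more than half the table is full.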

  12. Double Hashing
     - Let f(i) = i · hash₂(x), where hash₂ is a hash function for D that is different from h, with ran(hash₂) ⊆ [1, m−1].
     - The sequence of locations to probe is: A[h(x)], A[h(x) + hash₂(x)], A[h(x) + 2·hash₂(x)], ... (+ is mod m)
     - Eg: Suppose h(x) = x mod 13, S = {2, 9, 18, 36} (so h(S) = {2, 5, 9, 10}) and A has A[2] = 2, A[5] = 18, A[9] = 9, A[10] = 36, every other cell −.
     - To insert 15:
       - compute h(x) = 2
       - see that A[2] ≠ −
       - compute hash₂(x) = 6
       - see that A[8] = −, and store 15 there
     - Now A has A[2] = 2, A[5] = 18, A[8] = 15, A[9] = 9, A[10] = 36.
     - To check if 15 ∈ S, check A[2], then A[8], and return true.
     - To check if 10 ∈ S:
       - compute h(10) = 10
       - see that A[10] ≠ 10, A[10] ≠ −
       - compute hash₂(10) = 4
       - see that A[1] = −, and return false
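The slide does not say which second hash function it uses. The sketch below assumes hash₂(x) = 7 − (x mod 7), a common textbook choice that happens to reproduce the slide's values hash₂(15) = 6 and hash₂(10) = 4; treat it as a guess for illustration, not as the slides' definition.

```python
M = 13


def h(x):
    return x % M


def hash2(x):
    # Hypothetical second hash function; its range is [1, 7], so it is never 0.
    return 7 - (x % 7)


def double_hash_probes(x, count):
    """First `count` cells of the probe sequence h(x) + i * hash2(x) (mod M)."""
    return [(h(x) + i * hash2(x)) % M for i in range(count)]


print(hash2(15), double_hash_probes(15, 2))   # 6 [2, 8]
print(hash2(10), double_hash_probes(10, 2))   # 4 [10, 1]

# Because M is prime and 0 < hash2(x) < M, the sequence reaches every cell.
print(len(set(double_hash_probes(15, M))))    # 13
```

Unlike quadratic probing, two keys with the same h(x) usually get different step sizes, so they do not follow the same probe path.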

  13. Removal with Open Addressing
     - Suppose we have a hash table H for a set S containing x, and want to remove x.
     - If H uses separate chaining, we just delete x.
     - If H uses open addressing we cannot, because x affects the probe sequence for other elements.
     - Eg: Suppose h(x) = x mod 13, S = {2, 5, 9, 18, 36}, Linear Probing, and A was obtained as in our example: A[2] = 2, A[5] = 18, A[6] = 5, A[9] = 9, A[10] = 36.
     - Suppose we now delete 18, so A[5] = −.
     - Now, searching for 5 fails because A[h(5)] = A[5] = −.
     - One solution is to mark cells where we have deleted elements.

  14. Removal with Open Addressing II
     - In the previous example, to remove 18 we replace it with d: A[2] = 2, A[5] = d, A[6] = 5, A[9] = 9, A[10] = 36.
     - Now perform search & insert procedures as if A[5] has some key that we will never use.
     - To remove x:
       - examine the sequence of locations A[h₀(x)], A[h₁(x)], A[h₂(x)], ...
       - when x is found, replace it with d
     - Notice that search & insert work correctly as they are.
     - Insert can be modified to reclaim space. To insert x:
       - examine the sequence of probe locations
       - stop at the first one containing − or d, and store x there
     - NB: In implementation, d and − could be special values, or A could be an array of objects or structs with fields "empty" and "deleted".
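A sketch of removal with deletion markers, assuming linear probing, integer keys, and two sentinel objects `EMPTY` and `DELETED` playing the roles of − and d. The class name and the membership check at the top of `insert` are choices made for this example; the check keeps a key from being stored twice once deleted cells are reused.

```python
EMPTY = object()     # plays the role of "-"
DELETED = object()   # plays the role of "d"


class LinearProbingSet:
    def __init__(self, m=13):
        self.m = m
        self.cells = [EMPTY] * m

    def _probes(self, x):
        return ((x % self.m + i) % self.m for i in range(self.m))

    def member(self, x):
        for j in self._probes(x):
            if self.cells[j] is EMPTY:
                return False
            if self.cells[j] == x:        # DELETED never equals a key
                return True
        return False

    def insert(self, x):
        if self.member(x):
            return                        # already stored further along
        for j in self._probes(x):
            if self.cells[j] is EMPTY or self.cells[j] is DELETED:
                self.cells[j] = x         # reclaim deleted cells
                return
        raise RuntimeError("table is full")

    def remove(self, x):
        for j in self._probes(x):
            if self.cells[j] is EMPTY:
                return                    # x is not in the table
            if self.cells[j] == x:
                self.cells[j] = DELETED   # mark, do not empty
                return


t = LinearProbingSet()
for x in (2, 9, 18, 36, 5):
    t.insert(x)
t.remove(18)
print(t.member(5))              # True: the search walks past the marker in cell 5
t.insert(31)
print(t.cells[5])               # 31: the marked cell was reused
```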

  15. Load Factor
     - The load factor of a hash table H is λ = ((# of keys) + (# of elements marked d)) / m. (If H uses separate chaining, there are no d's.)
     - Good performance requires λ not too large.
     - For separate chaining: the average list length is about λ, so λ should not be much larger than 1.
     - For open addressing, want λ < 0.5, so that it is not too hard to find a place to make an insertion.

  16. Some Properties with Open Addressing
     - Linear Probing
       - Insertion always succeeds if λ < 1.
       - Primary clustering is a serious problem.
     - Quadratic Probing
       - Avoids primary clustering.
       - Exhibits secondary clustering, but this is less problematic.
       - Insertion always succeeds if λ < 0.5, but may fail if λ > 0.5 (even if there is space).
     - Double Hashing
       - Requires design of a suitable second hash function.
       - Requires computing 2 hash functions whenever probing beyond A[h₀(x)] is needed.

  17. Rehashing
     - Rehashing a hash table H means constructing a completely new hash table for the contents of H.
     - We may want to do it if:
       - λ is too large (close to 0.5 for open addressing, much larger than 1 for separate chaining)
       - Performance has become poor (which may result from clustering, from long linked lists, or from many removals)
     - Takes time Θ(n), under the assumption that insert is Θ(1).
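Rehashing for a separate-chaining table can be sketched as follows. The table is represented as a plain list of chains, keys are integers hashed with x mod (table size), and the sizes 3 and 13 are arbitrary values picked for the example.

```python
def load_factor(table):
    """Number of keys divided by the number of cells (chaining has no d marks)."""
    return sum(len(chain) for chain in table) / len(table)


def rehash(table, new_m):
    """Build a completely new chained table of size new_m with the same keys."""
    new_table = [[] for _ in range(new_m)]
    for chain in table:
        for x in chain:
            new_table[x % new_m].append(x)   # one O(1) insert per key
    return new_table


# A deliberately overloaded table: 6 keys in 3 cells.
old = [[] for _ in range(3)]
for x in (2, 9, 18, 36, 5, 31):
    old[x % 3].append(x)
print(load_factor(old))             # 2.0

new = rehash(old, 13)
print(load_factor(new) < 0.5)       # True: 6 keys in 13 cells
print(max(len(c) for c in new))     # 3 (18, 31 and 5 still share cell 5)
```

Each key is inserted exactly once into the new table, which is where the Θ(n) total cost comes from. The last line is a reminder that a lower load factor reduces collisions on average but does not eliminate them.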

  18. Hashing Properties
     - Well-designed hash tables are effective in practice, with fast insert, member, remove operations.
     - Require a good hash function for the domain of application.
     - Operations are O(1) on average, under assumptions that may not hold in practice:
       - all keys equally likely
       - hash function distributes keys uniformly
       - λ small
     - Do not support operations based on order of keys, such as: in-order enumeration, min, max, range lookups, union, intersection. (These are efficient with AVL Trees & B-Trees.)
