HashingHash tables and hash functionsCollisionsOpen hashingClosed hashingDesign and Analysis of Algorithms - Chapter 7 1HashingHashingA very efficient method for implementing a A very efficient method for implementing a dictionary, i.e., dictionary, i.e., a set with the operations:a set with the operations:– insert insert – find find – deletedeleteApplications:Applications:– databasesdatabases– symbol tablessymbol tablesDesign and Analysis of Algorithms - Chapter 7 2Hash tables and hash Hash tables and hash functionsfunctionsHash table:Hash table: an array with indices that correspond to an array with indices that correspond to bucketsbucketsHash function:Hash function: determines the bucket for each record determines the bucket for each recordExample:Example: student records, key=SSN. Hash function: student records, key=SSN. Hash function:hh((kk) = ) = kk mod mod mm((k k is a key andis a key and m m is the number of buckets) is the number of buckets)• if if mm = 1000, where is record with SSN= 315-17-4251 stored? = 1000, where is record with SSN= 315-17-4251 stored?Hash function must:Hash function must:•be easy to computebe easy to compute•distribute keys evenly throughout the tabledistribute keys evenly throughout the tableDesign and Analysis of Algorithms - Chapter 7 3CollisionsCollisionsIf If hh((k1k1)) = h = h((kk2) then there is a 2) then there is a collision. collision. Good hash functions result in fewer collisions.Good hash functions result in fewer collisions. Collisions can never be completely eliminated.Collisions can never be completely eliminated.Two types handle collisions differently: Two types handle collisions differently: •Open hashing - bucket points to linked list of all keys hashing to it.•Closed hashing – –one key per bucket –in case of collision, find another bucket for one of the keys (need Collision Collision resolution strategyresolution strategy))–linear probing: use next bucket – double hashing: use second hash function to compute incrementDesign and Analysis of Algorithms - Chapter 7 4Open hashingOpen hashingIf hash function distributes keys uniformly, average length If hash function distributes keys uniformly, average length of linked list will be of linked list will be n/mn/m Average number of probes = 1+Average number of probes = 1+αα/2/2 Worst-case is still linear!Worst-case is still linear! Open hashing still works if Open hashing still works if n>mn>m..Design and Analysis of Algorithms - Chapter 7 5Closed hashingClosed hashingDoes not work if Does not work if n>m.n>m. Avoids pointers.Avoids pointers. Deletions are Deletions are notnot straightforward. straightforward. Number of probes to insert/find/delete a key depends on Number of probes to insert/find/delete a key depends on load factor load factor αα = = nn//m m (hash table density) (hash table density) – successful search: (½) (1+ 1/(1- successful search: (½) (1+ 1/(1- αα))))– unsuccessful search: (½) (1+ 1/(1- unsuccessful search: (½) (1+ 1/(1- αα)²))²) As the table gets filled (As the table gets filled (αα approaches 1), number of probes approaches 1), number of probes increases dramatically: increases
View Full Document