DOC PREVIEW
UCLA PIC 10B - Advanced Hashing

This preview shows page 1-2-3 out of 10 pages.

Save
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Lecture 24 Advanced Hashing PIC 10B Todd Wittman Note We ll end 10 minutes early today for evaluations Hash Table Choices Hash tables are a very powerful data structure but unlike the other containers we ve studied they have parameters choices of methods that need to be determined How many buckets M should we use More buckets More memory Faster run time What hash function should we use A hash function has 2 basic steps First we need to map the key to a large integer x e g we could add up the ASCII values of the chars in a string Second we need to convert the integer x to an index in the range 0 M 1 Division Method x M Multiplication Method int M A x int A x 1 Choosing the Number of Buckets To choose the number of buckets you have to ask How fast do searches need to be How much memory do I have available In our example for 14 students at Jedi Academy it takes a large number of buckets to get down to zero collisions Can you modify our program so that it finds the smallest number of buckets needed for 0 collisions Choosing the Hash Function First we need to decide how to convert the key to an integer x Preferably a large integer bigger than the buckets M so we don t map to just one part of our table For strings we added up the ASCII values of each character for int i 0 i s length i x int s i But with this method the order of the characters does not matter So permutations map to the same integer H Yoda H odaY A common solution is to weight the chars For example we could fix a constant 1 B 2 and compute for int i 0 i s length i x int pow B i s i 2 Choosing the Hash Function After we have an integer x we need to restrict it to the range 0 M 1 There are 2 popular methods for doing this Division Method Take x mod the of buckets index x M Multiplication Method Fix a constant 0 A 1 Then take the fractional part of A x and multiply that by M Then round the result down to the nearest integer index int M A x int A x The value of A needs to be set experimentally Knuth recommends A 5 1 2 Can you modify our program so that it uses the multiplication method Note you are asked to do this for HW9 Are Hash Tables Used in Real Computer Programming I recently had a conversation with a 25 year veteran IT manager at Hewlett Packard about what data structures they use for database management Dad do you use hash tables at HP I have no idea what a hash table is I ve never heard of it So no we don t use hash tables OK then what data structure do you use to store your databases We have this really slick system In every record we identify a character string to be the key for the record Then we convert that string to a number that tells us the index of an array Usually we add up the ASCII characters in the string divide by some number then take the remainder after division Then when we want to look up a record we convert the key to an index and bang there it is Huh In my class we call that a hash table OK then we use hash tables 3 Multiple Keys The demo hash table program we just looked at allows us to look up records by student name But the registrar wants to be able to look up student records by typing in the name OR the ID How do we set up a hash table so that we can look up records by two different types of keys One solution is to design two hash functions so that every name ID hash to the same index H Luke Skywalker 357 1 Leia 0 1 2 Darth 2 4 C3PO 3 4 6 Yoda 5 6 G 333 222 111 357 357 Luke 357 358 359 360 Han Chewie 359 360 361 499 But this is pretty much impossible to construct Multiple Keys The solution is to construct multiple hash tables of pointers one for each type of key The actual records are stored in a master linked list Each entry in the hash table points to the record Name Hash Table 0 1 2 3 4 5 6 7 Luke Skywalker Leia Organa Master List Luke Skywalker 333 222 111 357 2 85 History Leia Organa 283 528 233 1 3 96 Political Science ID Hash Table 283 528 233 333 222 111 0 1 2 3 4 5 6 7 4 Inserting With Multiple Keys We should create classes that store pointers to the record with each type of key class NameKey class IDKey private private Record ptr Record ptr string name string id number Create 2 hash tables one for each key type Assume both hash tables have M buckets although it is not necessary for them to be same size HashTable NameKey string nameHash M HashTable IDKey string idHash M To insert a record into our database we first add the record to our master linked list We need to get back the address of the record we just inserted I created a LinkedList push back function that returns a pointer We hash each key to its position in its respective hash table We set the NameKey IDKey ptr variable to point to the record in the master list Example Muliti Hash Suppose we are given a file students txt that list student records Given the Record NameKey and IDKey classes write a program that looks up student information by name or ID quickly 5 Erasing With Multiple Keys This system is not set up for erasing records To erase a record we should make our system doubly linked Each record should have pointers to the respective hash tables Then when we erase a record we also erase the elements of the hash tables that point to it Name Hash Table 0 1 2 3 4 5 6 7 Luke Skywalker Leia Organa Master List Luke Skywalker 333 222 111 357 2 85 History Leia Organa 283 528 233 1 3 96 Political Science ID Hash Table 283 528 233 333 222 111 0 1 2 3 4 5 6 7 Preparing for the final exam Final Exam is Sunday 3 14 3 00 6 00pm in Young Hall 24 Exam is roughly length of 2 midterms All topics this semester are fair game with an emphasis on topics covered since 2nd midterm Practice final exams are online now Today Searching and Sorting Algorithms Linear vs Binary Search Selection Insertion Merge Quick Sort Friday Review of Classes and Data Structures Classes The Big 4 Dynamic Memory Operator Overloading Template Functions Template Classes Data Structures Vectors Linked Lists Stacks Queues Binary Trees Hash Tables Heaps Other Topics You …


View Full Document

UCLA PIC 10B - Advanced Hashing

Documents in this Course
Load more
Download Advanced Hashing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Advanced Hashing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Advanced Hashing and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?