UCF COP 3502 - Hash Table - D3038840

Home> Schools> University of Central Florida> Computer Programming (COP) > COP 3502> Hash Table

DOC PREVIEW

UCF COP 3502 - Hash Table

School name University of Central Florida

Course Cop 3502- Computer Science I

Pages 25

This preview shows page 1-2-24-25 out of 25 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 25 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 25 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 25 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 25 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 25 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Hash TableOutlineMotivationRecord ExampleExisting Data StructuresSlide 6Direct Access TableSlide 8Hash FunctionSlide 10Hash Table with Perfect HashCost SummaryIssues in hashingCollision ResolutionOpen AddressingHow to compute probe sequencesOpen Addressing ExampleLinear Probing: h(k, i ) = (h(k) + i ) mod m.Linear Probing: Clustering IssueQuadratic Probing: h(k, i ) = (h(k) + i 2) mod mAn Issue with Quadratic ProbingDynamic Table ExpansionSeparate ChainingSlide 24SummaryHash TableMarch 30 2009COP 3502, UCF1Outline•Hash Table:–Motivation–Direct Access Table–Hash Table•Solutions for Collision Problem:–Open Addressing:•Linear Probing•Quadratic Probing•Dynamic Table Expansion–Separate Chaining2Motivation•We have to store some records and perform the following:–add new records–delete records–search a record by key•Find a way to do these efficiently!keyrecordOther fields containingassociated data3Record Example9903030005678900123450033333tomdavidandybetty7356.881.590sid (key) name score9908080 bill 49...Consider this problem. We want to store 1,000 student records and search them by student id.Consider this problem. We want to store 1,000 student records and search them by student id....4Existing Data Structures•Use an array to store the records, in unsorted order–add - add the records as the last entry, very fast O(1)–delete a target - slow at finding the target, fast at filling the hole (just take the last entry) O(n)–search - sequential search, slow O(n)•Use an array to store the records, keeping them in sorted order–add - insert the record in proper position, much record movement, slow O(n)–delete a target - how to handle the hole after deletion? Much record movement, slow O(n)–search - binary search, fast O(log n)5Existing Data Structures•Binary Search Tree:–add: insert the record in proper position, fast O(logn)–delete a target: fast O(logn)–search: fast O(logn)6Direct Access Table:33333:123450::betty:andy::90:81.5:name score56789david 56.8:9908080:::bill:::49::9999999One way is to store the records in a huge array (index 0..9999999) The index is used as the student id, i.e. the record of the student with sid 0012345 is stored at A[12345]One way is to store the records in a huge array (index 0..9999999) The index is used as the student id, i.e. the record of the student with sid 0012345 is stored at A[12345]7Direct Access Table•Pros:–add- very fast O(1)–delete – very fast O(1)–search – very fast O(1)•Cons:–Waste a lot of memory. –Use a table of TEN MILLION entries to store ONE THOUSAND records. 8Hash FunctionImagine that we have such a magic function Hash. It maps the key (sid) of the 1000 records into the integers 0..999, one to one. No two different keys maps to the same number.Imagine that we have such a magic function Hash. It maps the key (sid) of the 1000 records into the integers 0..999, one to one. No two different keys maps to the same number.H(‘0012345’) = 134H(‘0033333’) = 67H(‘0056789’) = 764…H(‘9908080’) = 3function Hash(key): integer;hash code9Hash Table:betty:bill::90:49:name scoreandy 81.5::david:::56.8::0033333:9908080:0012345::0056789:3670764999134To store a record, we compute Hash(sid) for the record and store it at the location Hash(sid) of the array. To store a record, we compute Hash(sid) for the record and store it at the location Hash(sid) of the array. To search for a student, we only need to peek at the location Hash(target sid).To search for a student, we only need to peek at the location Hash(target sid).H(‘0012345’) = 134H(‘0033333’) = 67H(‘0056789’) = 764…H(‘9908080’) = 310Hash Table with Perfect Hash•Such magic function is called perfect hash–add – very fast O(1)–delete – very fast O(1)–search – very fast O(1)•But it is generally difficult to design perfect hash. (e.g. when the potential key space is large)11Cost SummaryWorst Case Average CaseImplementation Search Insert Delete Search Insert DeleteSorted Array log N N N log N N/2 N/2Unsorted Array N 1 N N/2 1 N/2Binary Search Tree N N N log N log N log NHash Table w/ Perfect Hash1 1 1 1 1 112Issues in hashing•Each hash should generate a unique number. If two different items produce the same hash code we have a collision in the data structure. Then what?•To deal with collisions, two issues must be addressed:1. Hash functions must minimize collisions (there are strategies to do this).2. When collisions do occur, we must know how to handle them.13Collision Resolution•Focus on issue #2 (collision resolution):–Assume the following hash function is a reasonably good one: h(k) = k%1000 (hash code = last 3 digits)•Two ways to resolve collisions:–Open Addressing: every hash table entry contains only one key. If a new key hashes to a table entry which is filled, systematically examine other table entries until you find one empty entry to place the new key.•Linear Probing•Quadratic Probing–Separate Chaining: every hash table entry contains a pointer to a linked list of keys that hash to the same entry. 14Open Addressing•Store all keys in the hash table itself.•Each slot contains either a key or NULL.•To search for key k:–Compute h(k) and examine slot h(k). Examining a slot is known as a probe.–Case 1: If slot h(k) contains key k, the search is successful. Case 2: If this slot contains NULLL, the search is unsuccessful.–Case 3: There’s a third possibility, slot h(k) contains a key that is not k. We compute the index of some other slot, based on k and on which probe (count from 0: 0th, 1st, 2nd, etc.) we’re on. Keep probing until we either find key k (successful search) or we find a slot holding NULL (unsuccessful search).15How to compute probe sequences•Linear probing: Given auxiliary hash function h, the probe sequence starts at slot h(k) and continues sequentially through the table, wrapping after slot m − 1 to slot 0. Given key k and probe number i (0 ≤ i < m), h(k, i ) = (h(k) + i ) mod m, m is the size of the table.•Quadratic probing: As in linear probing, the probe sequence starts at h(k). Unlike linear probing, it examines cells 1,4,9, and so on, away from the original probe point: h(k, i ) = (h(k) + i 2) mod m 16Open Addressing Example•Three students:–<0000001, A, 81.3>–<0001001, B, 92.5>–<0002001, C, 99.0>•Hash codes:–h(0000001) = 1%1000 = 1–h(0001001) = 1001%1000 = 1–h(0002001) = 2001%1000 =

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-24-25 out of 25 pages.

UCF COP 3502 - Hash Table

Sign up for free to view:

Please select your school