DOC PREVIEW
UW CSE 332 - Lecture Notes

This preview shows page 1-2-3-4-5-32-33-34-35-64-65-66-67-68 out of 68 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 68 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Slide 1Hash Table: Another dictionaryHash tablesHash functionsWho hashes what?More on rolesWhat to hash?Hashing integersCollision-avoidanceMore arguments for a prime table sizeWhat if we don’t have ints as keys?Java-esque String HashSpecializing hash functionsCombining hash functionsAdditional operationsHash Tables: A Different ADT?Collision resolutionSeparate ChainingSeparate ChainingSeparate ChainingSeparate ChainingSeparate ChainingSeparate ChainingThoughts on chainingTime vs. space (constant factors only here)A more rigorous chaining analysisSeparate Chaining DeletionAn Alternative to Separate Chaining: Open AddressingAn Alternative to Separate Chaining: Open AddressingAn Alternative to Separate Chaining: Open AddressingAn Alternative to Separate Chaining: Open AddressingAn Alternative to Separate Chaining: Open AddressingOpen addressing: Storing in the tableMore about Open AddressingTerminologyPrimary ClusteringAnalysis of Linear ProbingIn a chartOpen Addressing: Quadratic probingQuadratic Probing ExampleQuadratic Probing ExampleQuadratic Probing ExampleQuadratic Probing ExampleQuadratic Probing ExampleQuadratic Probing ExampleAnother Quadratic Probing ExampleAnother Quadratic Probing ExampleAnother Quadratic Probing ExampleAnother Quadratic Probing ExampleAnother Quadratic Probing ExampleAnother Quadratic Probing ExampleAnother Quadratic Probing ExampleFrom bad news to good newsClustering reconsideredOpen Addressing: Double hashingDouble-hashing analysisYet another reason to use a prime TablesizeMore double-hashing factsCharts: Double hashing (w/ uniform hashing) vs. Linear probingWe’ve explored different methods of collision detectionRehashingMore on rehashingHashing and comparingEqual objects must hash the sameJava bottom line(Incorrect) ExampleAside: Comparable/Comparator have rules tooFinal word on hashingCSE332: Data AbstractionsLecture 11: Hash TablesTyler RobisonSummer 20101Hash Table: Another dictionary2Aim for constant-time (i.e., O(1)) find, insert, and delete“On average” under some reasonable assumptionsA hash table is an array of some fixed sizeDefine a mapping from each key to a location in tableBasic idea:0…TableSize –1 hash function:index = h(key)hash tablekey space (e.g., integers, strings)Hash tables3There are m possible keys (m typically large, even infinite) but we expect our table to have only n items where n is much less than m (often written n << m)Many dictionaries have this propertyCompiler: All possible identifiers allowed by the language vs. those used in some file of one programDatabase: All possible student names vs. students enrolledAI: All possible chess-board configurations vs. those considered by the current playerHash functions4Hash function: Our key to index mappingAn ideal hash function:Is fast to compute“Rarely” hashes two “used” keys to the same indexOften impossible in theory; easy in practiceWill handle collisions a bit later0…TableSize –1 hash function:index = h(key)hash tablekey space (e.g., integers, strings)Who hashes what?5Hash tables can be genericTo store elements of type E, we just need E to be:1. Comparable: order any two E (like with all dictionaries)2. Hashable: convert any E to an intWhen hash tables are a reusable library, the division of responsibility generally breaks down into two roles:•We will learn both roles, but most programmers “in the real world” spend more time on the client side, while still having an understanding of the libraryEint table-indexcollision?collisionresolutionclienthash table libraryMore on roles6Two roles must both contribute to minimizing collisions•Client should aim for different ints for expected items–Avoid “wasting” any part of E or the 32 bits of the int•Library should aim for putting “similar” ints in different indices–conversion to index is almost always “mod table-size”–using prime numbers for table-size is commonEint table-indexcollision?collisionresolutionclienthash table librarySome ambiguity in terminology on which parts are “hashing”“hashing”?“hashing”?What to hash?7In lecture we will consider the two most common things to hash: integers and stringsIf you have objects with several fields, it is usually best to have most of the “identifying fields” contribute to the hash to avoid collisionsExample: class Person { String first; String middle; String last; int age; }Hashing integers8key space = integersUseful for examplesSimple hash function: h(key) = key % TableSizeClient: f(x) = xLibrary g(x) = x % TableSizeFairly fast and naturalExample:TableSize = 10Insert 7, 18, 41, 34, 10(As usual, ignoring data “along for the ride”)What could go wrong?Now insert 20….0123456789104134718Collision-avoidance9Collision: Two keys map to the same indexWith “x % TableSize” the number of collisions depends onthe ints insertedTableSizeLarger table-size tends to help, but not alwaysExample: Insert 12, 22, 32 with TableSize = 10 vs. TableSize = 6Technique: Pick table size to be prime. Why?Real-life data tends to have a pattern, and “multiples of 61” are probably less likely than “multiples of 60”Later we’ll see that one collision-handling strategy does provably better with prime table sizeUsually use something like 10 for examples though0 1212 3234 225More arguments for a prime table size10If TableSize is 60 and…Lots of data items are multiples of 5, wasting 80% of tableLots of data items are multiples of 10, wasting 90% of tableLots of data items are multiples of 2, wasting 50% of tableIf TableSize is 61…Collisions can still happen, but 5, 10, 15, 20, … will fill tableCollisions can still happen but 10, 20, 30, 40, … will fill tableCollisions can still happen but 2, 4, 6, 8, … will fill tableIn general, if x and y are “co-prime” (means gcd(x,y)==1), then (a * x) % y == (b * x) % y if and only if a % y == b % ySo, given table size y and keys as multiples of x, we’ll get a decent distribution if x & y are co-primeGood to have a TableSize that has not common factors with any “likely pattern” xWhat if we don’t have ints as keys?11If keys aren’t ints, the client must convert to an intTrade-off: speed and distinct keys hashing to distinct intsVery important example: StringsKey space K = s0s1s2…sm-1 Where si are chars: si


View Full Document

UW CSE 332 - Lecture Notes

Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?