DOC PREVIEW
MIT 6 006 - Introduction to Algorithms

This preview shows page 1-2-17-18-19-35-36 out of 36 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 36 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 36 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 36 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 36 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 36 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 36 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 36 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 36 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

6.006-Introduction to6.006Introduction to AlgorithmsLecture 7PfM li K lliProf.Manolis KellisCLRS: 11.4, 17.Unit #2 – Genomes, Hashing, and Dictionaries2Unit #2: HashingLtT G Diti i Hhi•Last Tues: Genomes, Dictionaries, Hashing– Intro, basic operations, collisions and chainingSi l if h hi i–Simple uniform hashing assumption• Last Thur: Faster hashing, hash functions–Hash functions in practice: div/mult/python– Faster hashing: Rolling Hash: O(n2)O(nlgn)– Faster comparison: Signatures: mismatch time• Today: Space issues– Dynamic resizing and amortized analysis– Open addressing, deletions, and probing– Advanced topics: universal hashing, fingerprints3Today: Hashing IΙI: Space issuesRHhfthii SUHA lli i tRev: Hash functs, chaining, SUHA, rolling, signaturesDynamic dictionaries: Resizing hash tablesWhen to resize: insertions, deletionsResizing operations, amortized analysisOpen addressing: Doing away w/ linked listsOperations: insertion, deletionOperations: insertion, deletionProbing: linear probing, double hashingPerformance analysis: UHA, open vs. chainingPerformance analysis: UHA, open vs. chainingAdvanced topics: Randomized algorithms (shht!)Ui lh hi f th hiUniversal hashing, perfect hashingFingerprinting, file signatures, false positivesRemember Hashing I and II•Hashing and hash functions•Hashing and hash functions Humongous universe of keys  itty bitty little space• Hash table as dictionaryy Insert/Search/Delete• Collisions by chainingBuild a linked list in each bucketBuild a linked list in each bucket Operation time is length of list• Simple Uniform Hashing Assumption Every item to uniform random bucket n items in size m table  average length n/m = α•Speeding up hashing•Speeding up hashing Rolling Hash: fast sequence of hash’s Signatures: fast comparison, avoid frequent mismatchesCi•Comparing genomes O(n4)ObinsearchL(n3lgn)Ohash(n2lgn)Oroll/sign(nlgn)Today: Hashing IΙI: Space issuesRHhfthii SUHA lli i tRev: Hash functs, chaining, SUHA, rolling, signatures Dynamic dictionaries: Resizing hash tablesWhen to resize: insertions, deletionsResizing operations, amortized analysisOpen addressing: Doing away w/ linked listsOperations: insertion, deletionOperations: insertion, deletionProbing: linear probing, double hashingPerformance analysis: UHA, open vs. chainingPerformance analysis: UHA, open vs. chainingAdvanced topics: Randomized algorithms (shht!)Ui lh hi f th hiUniversal hashing, perfect hashingFingerprinting, file signatures, false positivesDynamic DictionariesIbi liii dll•In substring application, inserted all at once then scanned• More generally, arbitrary sequence of insert, delete, find• How do we know how big the table will get?•What if we guess wrong?What if we guess wrong?too small  load high, operations too slowtoo largehigh initialization cost, consumes space,too large high initialization cost, consumes space, •Want m=(n) at all timespotentially more cache-misses•Want m=(n) at all timesSolution: Resize when neededSolution: Resize when needed•Start table at small constant sizeStart table at small constant size• When table too full, make it biggerhbl ki ll• When table too empty, make it smaller• How? Build a whole new hash table and insert items Pick new hash ‘seed’, recompute all hashesp Recreate new linked lists Time spent to rebuild: p(new-size) + #hashes x (HashTime)When to resize?When to resize?• Approach 1: whenever n > m, rebuild table to new size Sequence of n insertsE h i t b ilda factor of (HashTime) is suppressed hereEach increases n past m, causes rebuild Total work: (1 + 2 + … + n) = (n2)•Approach 2:Whenevern≥ 2m rebuild table to new•Approach 2: Whenever n≥ 2m, rebuild table to new sizeCostly inserts:insert 2ifor alli:Costly inserts:insert 2for all i: These cost: (1 + 2 + 4 + … + n) = (n) All other inserts take O(1) time – why? Inserting n items takes O(n) time Keeps m a power of 2 --- good for modAmortized AnalysisAmortized Analysis•If a sequence of n operations takes time T, thenIf a sequence of n operations takes time T, then each operation has amortized cost T/n Like amortizing a loan: payment per monthgpyp• Rebuilding when n ≥ 2m  some ops are very slow (n) for insertion that causes last resize• But on average, fastg, O(1) amortized cost per operation•Often, only care about total runtimeOften, only care about total runtime So averaging is fineInsertions+Deletions?b ild bl i h ?• Rebuild table to new size when n < m? Same as bad insert: O(n2) workR b ild h</2?•Rebuild when n<m/2? Makes a sequence of deletes fastWhat about an arbitrary sequence of inserts/deletes?What about an arbitrary sequence of inserts/deletes?• Suppose we have just rebuilt: m=n•Next rebuild a growat leastmmore inserts areNext rebuild a grow at least mmore inserts are needed before growing table Amortized cost O(2m / m)) = O(1)• Next rebuild a shrink  at least m/2 more deletes are needed before shrinkingCost to rebuildPaid after m insertionsneeded before shrinking Amortized cost O(m/2 / (m/2)) = O(1)Cost to rebuildPaid after m/2 deletionsPutting the two togetherPutting the two together•AlgorithmAlgorithm Keep m a power of 2 (good for mod)Grow (double m) whenn≥ mGrow (double m) when n≥ m Shrink (halve m) when n ≤ m/4•AnalysisAnalysis Just after rebuild: n=m/2Next rebuild a grow  at least m/2 more insertsg• Amortized cost O(2m / (m/2)) = O(1) Next rebuild a shrink  at least m/4 more deletes• Amortized cost O(m/2 / (m/4)) = O(1)SummarySummary•Arbitrary sequence of insert/delete/findArbitrary sequence of insert/delete/find• O(1) amortized time per operationToday: Hashing IΙI: Space issuesRHhfthii SUHA lli i tRev: Hash functs, chaining, SUHA, rolling, signatures Dynamic dictionaries: Resizing hash tables When to resize: insertions, deletions Resizing operations, amortized analysis Open addressing: Doing away w/ linked listsOperations: insertion, deletionOperations: insertion, deletionProbing: linear probing, double hashingPerformance analysis: UHA, open vs. chainingPerformance analysis: UHA, open vs. chainingAdvanced topics: Randomized algorithms (shht!)Ui lh hi f th hiUniversal


View Full Document

MIT 6 006 - Introduction to Algorithms

Documents in this Course
Quiz 1

Quiz 1

7 pages

Quiz 2

Quiz 2

12 pages

Quiz 2

Quiz 2

9 pages

Quiz 1

Quiz 1

10 pages

Quiz 2

Quiz 2

11 pages

Quiz 1

Quiz 1

12 pages

Graphs

Graphs

27 pages

Load more
Download Introduction to Algorithms
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Introduction to Algorithms and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Introduction to Algorithms 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?