Unformatted text preview:

6 006 Intro to Algorithms Recitation 07 February 25 2011 Resizing Hash Tables Hash tables perform well if the number of elements in the table remain proportional to the size of the table If we know exactly how many inserts deletes are going to be performed on a table we would be able to set the table size appropriately at initialization However it is often the case that we won t know what series of operations will be performed on a table We must have a strategy to deal a various number of elements in the hash table while preserving an average O 1 access insertion and removal operations To restrict the load balance so that it does not get too large slow search insert delete or too small waste of memory we will increase the size of the hash table if it gets too full and decrease the size of the hash table if it gets too empty Resizing a hash table consists of choosing a new hash function to map to the new size creating a hash table of the new size iterating through the elements of the old table and inserting them into the new table Consider a hash table that resolves collisions using the chaining method We will double the size of the hash table whenever we make an insert operation that results in the load balance exceeding 1 i e n m We will halve the size of the hash table whenever we make a delete operation that results in the load balance falling beneath 14 i e n m4 In the next sections we will analyze this approach and show that the average runtime of each insertion and deletion is still O 1 even factoring in the time it takes to resize the table Increasing Table Size After doubling the table size due to an insert n m2 and the load balance is 21 We will need at least m insert operations before the next time we double the size of the hash table The next resizing 2 will take O 2m time as that s how long it takes to create a table of size 2m 6 006 Intro to Algorithms Recitation 07 February 25 2011 On average since the number of elements is proportional to the size of the table at all times each of the m2 inserts before resizing will still take O 1 time The last insert will take O 2m time as we need to factor in the time it takes to resize the table We can use amortized analysis to argue that the average runtime of all the insertions is O 1 The last insert before resizing costs O 2m time but we needed m2 inserts before actually paying that cost We can imagine spreading the O 2m cost across the m2 inserts evenly which adds an additional average amortized cost of 2m O 0 5m per insert or O 1 per insert Since the cost of insertion before was O 1 adding an additional O 1 amortized cost to each insert doesn t affect the asymptotic runtime and insertions on average take O 1 time still Decreasing Table Size Similarly after halving the table size due to an deletion n m2 We will need at least m4 delete operations before the next time we halve the size of the hash table The cost of the next halving is O m2 to make a size m2 table The m4 deletes take O 1 time and the resizing cost of O m2 can be split evenly across those m4 0 5m deletes Each deletion has an additional average amortized cost of O 0 25m or O 1 This results in maintaining the O 1 average cost per deletion Performance of Open Addressing Recall that searching inserting and deleting an element using open addressing required a probe sequence e g linear probing quadratic probing double hashing To analyze the performance of operations in open addressing we must determine on average how many probes does it take before we execute the operation Before we made the simple uniform hashing assumption SUHA which meant a hash function mapped to any slot from 0 to m 1 with equal probability Now we make the uniform hashing assumption UHA which is a slight extension from SUHA UHA assumes that the probe sequence is a random permutation of the slots 0 to m 1 In other words each probe looks likes we re examining a random slot that we havent examined before If the table has load balance that means there is a p 1 probability that the first probe will find an empty slot under UHA If the first probe is a collision note that the probability that the 6 006 Intro to Algorithms Recitation 07 February 25 2011 second probe will find an empty slot is greater than p since there are an equal number of empty slots that we could insert in but were choosing randomly from a pool of fewer slots In general after each collision there is a probability of at least p that we will probe into an empty slot Using principles of probability if there is exactly a probability of p that we will find an empty slot at each probe then we expect to probe p1 times before we succeed For example if p 14 we expect to probe 4 times before we find an empty slot Since in our case our probability of success is actually increasing after each probe p1 is a high estimate on how many times we probe before we succeed Since p 1 we expect to probe at most p1 times Looking at the behavior of the 1 graph it is clear that with open addressing performance is fairly good until approaches too 1 close to 1 Universal Hashing With a fixed hashing function an adversary could select a series of keys to insert into the hash table that all collide giving the hash table worst case performance Universal hashing is the idea that we select the hash function randomly from a group of hash functions This means an adversary cannot choose keys that he knows will give worst case performance anymore since the adversary doesn t even know what hash function will be chosen for the table If we form the group of hash functions carefully we can assure that the expected time for each operations is O 1 even if there is an adversary who is trying to achieve worst case performance For universal hashing to work the group of hash functions H must be universal This means that for each pair of distinct keys k l in the universe of keys the number of hash functions in the This means for each pair of distinct keys the chances group for which h k h l is at most H m of picking a hash function in which they collide is at most m1 which is the same probability given by the simple uniform hashing assumption


View Full Document

MIT 6 006 - Resizing Hash Tables

Documents in this Course
Quiz 1

Quiz 1

7 pages

Quiz 2

Quiz 2

12 pages

Quiz 2

Quiz 2

9 pages

Quiz 1

Quiz 1

10 pages

Quiz 2

Quiz 2

11 pages

Quiz 1

Quiz 1

12 pages

Graphs

Graphs

27 pages

Load more
Loading Unlocking...
Login

Join to view Resizing Hash Tables and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Resizing Hash Tables and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?