VASSAR CMPU 102 - Advanced Implementation of Tables - D1675247

Home> Schools> Vassar College> (CMPU) > CMPU 102> Advanced Implementation of Tables

VASSAR CMPU 102 - Advanced Implementation of Tables

Course Cmpu 102- Computer Science II: Data Structures and Algorithms

Pages 4

Download Save

Unformatted text preview:

© 2006 Pearson Addison-Wesley. All rights reserved 13 B-1Chapter 13 (excerpts)Advanced Implementation ofTablesCS102 Sections 51 and 52Marc Smith and Jim Ten EyckSpring 2008© 2006 Pearson Addison-Wesley. All rights reserved 13 B-2AVL Trees• An AVL tree– A balanced binary search tree– Can be searched almost as efficiently as a minimum-height binary search tree– Maintains a height close to the minimum– Requires far less work than would be necessary to keepthe height exactly equal to the minimum• Basic strategy of the AVL method– After each insertion or deletion• Check whether the tree is still balanced• If the tree is unbalanced, restore the balance© 2006 Pearson Addison-Wesley. All rights reserved 13 B-3AVL Trees• Rotations– Restore the balance of a tree– Two types• Single rotation• Double rotationFigure 13-38Figure 13-38a) An unbalanced binary search tree; b) a balanced tree after a single left rotation© 2006 Pearson Addison-Wesley. All rights reserved 13 B-4AVL TreesFigure 13-42Figure 13-42a) Before; b) during; and c) after a double rotation© 2006 Pearson Addison-Wesley. All rights reserved 13 B-5AVL Trees• Advantage– Height of an AVL tree with n nodes is always veryclose to the theoretical minimum• Disadvantage– An AVL tree implementation of a table is more difficultthan other implementations© 2006 Pearson Addison-Wesley. All rights reserved 13 B-6Hashing• Hashing– Enables access to table items in time that is relativelyconstant and independent of the items• Hash function– Maps the search key of a table item into a location thatwill contain the item• Hash table– An array that contains the table items, as assigned by ahash function© 2006 Pearson Addison-Wesley. All rights reserved 13 B-7Hashing• A perfect hash function– Maps each search key into a unique location of the hash table– Possible if all the search keys are known• Collisions– Occur when the hash function maps more than one item into thesame array location• Collision-resolution schemes– Assign locations in the hash table to items with different searchkeys when the items are involved in a collision• Requirements for a hash function– Be easy and fast to compute– Place items evenly throughout the hash table© 2006 Pearson Addison-Wesley. All rights reserved 13 B-8Hash Functions• It is sufficient for hash functions to operate onintegers• Simple hash functions that operate on positiveintegers– Selecting digits– Folding– Module arithmetic• Converting a character string to an integer– If the search key is a character string, it can beconverted into an integer before the hash function isapplied© 2006 Pearson Addison-Wesley. All rights reserved 13 B-9Resolving Collisions• Two approaches to collision resolution– Approach 1: Open addressing• A category of collision resolution schemes that probe for anempty, or open, location in the hash table– The sequence of locations that are examined is the probesequence• Linear probing– Searches the hash table sequentially, starting from the originallocation specified by the hash function– Possible problem» Primary clustering© 2006 Pearson Addison-Wesley. All rights reserved 13 B-10Resolving Collisions• Approach 1: Open addressing (Continued)– Quadratic probing• Searches the hash table beginning with the original location that thehash function specifies and continues at increments of 12, 22, 32, andso on• Possible problem– Secondary clustering– Double hashing• Uses two hash functions• Searches the hash table starting from the location that one hashfunction determines and considers every nth location, where n isdetermined from a second hash function• Increasing the size of the hash table– The hash function must be applied to every item in the old hashtable before the item is placed into the new hash table© 2006 Pearson Addison-Wesley. All rights reserved 13 B-11Resolving Collisions• Approach 2: Restructuring the hash table– Changes the structure of the hash table so that it canaccommodate more than one item in the same location– Buckets• Each location in the hash table is itself an arraycalled a bucket– Separate chaining• Each hash table location is a linked list© 2006 Pearson Addison-Wesley. All rights reserved 13 B-12The Efficiency of Hashing• An analysis of the average-case efficiency ofhashing involves the load factor– Load factor !• Ratio of the current number of items in the table to themaximum size of the array table• Measures how full a hash table is• Should not exceed 2/3– Hashing efficiency for a particular search also dependson whether the search is successful• Unsuccessful searches generally require more time thansuccessful searches© 2006 Pearson Addison-Wesley. All rights reserved 13 B-13The Efficiency of Hashing• Linear probing– Successful search: ![1 + 1(1-!)]– Unsuccessful search: ![1 + 1(1- !)2]• Quadratic probing and double hashing– Successful search: -loge(1- !)/ !– Unsuccessful search: 1/(1- !)• Separate chaining– Insertion is O(1)– Retrievals and deletions• Successful search: 1 + (!/2)• Unsuccessful search: !© 2006 Pearson Addison-Wesley. All rights reserved 13 B-14The Efficiency of HashingFigure 13-50Figure 13-50The relative efficiency of four collision-resolution methods© 2006 Pearson Addison-Wesley. All rights reserved 13 B-15What Constitutes a Good HashFunction?• A good hash function should– Be easy and fast to compute– Scatter the data evenly throughout the hash table• Issues to consider with regard to how evenly a hashfunction scatters the search keys– How well does the hash function scatter random data?– How well does the hash function scatter nonrandom data?• General requirements of a hash function– The calculation of the hash function should involve the entiresearch key– If a hash function uses module arithmetic, the base should beprime© 2006 Pearson Addison-Wesley. All rights reserved 13 B-16Table Traversal: An InefficientOperation Under Hashing• Hashing as an implementation of the ADT table– For many applications, hashing provides the mostefficient implementation– Hashing is not efficient for• Traversal in sorted order• Finding the item with the smallest or largest value in its searchkey• Range query• In external storage, you can simultaneously use– A hashing implementation of the tableRetrieveoperation– A search-tree implementation of the ordered

View Full Document


School:
Email:
New Password:
Confirm Password:

VASSAR CMPU 102 - Advanced Implementation of Tables

Sign up for free to view:

Please select your school