DOC PREVIEW
Stanford CS 106B - Implementing Sets

This preview shows page 1 out of 3 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 3 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 3 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Eric Roberts Handout #48CS 106B May 20, 2009Implementing SetsIn Our Last Episode . . .• I reviewed the basics of mathematical set theory and used thattheory to design the interface for a general Set class.• In the first five minutes or so of today’s class, I will developan implementation of the Set class based on the generic BSTclass we built last week.• For the rest of the day, I will talk about optimizations to thatimplementation along with some general strategies about howto make implementations as efficient as possible.template <typename ElemType>class Set { Set(int (*cmpFn)(ElemType, ElemType) = OperatorCmp); ~Set(); int size(); bool isEmpty(); void clear(); void add(ElemType element); void remove(ElemType element); bool contains(ElemType element); bool equals(Set & otherSet); bool isSubsetOf(Set & otherSet); void unionWith(Set & otherSet); void intersect(Set & otherSet); void subtract(Set & otherSet); Iterator iterator(); template <typename ClientElemType> void mapAll(void (*fn)(ElemType elem, ClientElemType &data), ClientElemType &data);private:#include "setpriv.h"};#include "setimpl.cpp"#endifContents of the set.h Interface/* * File: setpriv.h * --------------- * This file contains the private section for the set.h interface. *//* Instance variables */ BST<ElemType> bst; /* The BST representing the set */The Easy Implementation• As is so often the case, the easy way to implement the Setclass is to build it out of data structures that you already have.In this case, it make sense to build Set on top of the BST class.• The private section looks like this:template <typename ElemType>Set<ElemType>::Set(int (*cmp)(ElemType, ElemType)) : bst(cmp) { /* Empty */}template <typename ElemType>Set<ElemType>::~Set() { /* Empty */}template <typename ElemType>int Set<ElemType>::size() { return bst.size();}template <typename ElemType>bool Set<ElemType>::isEmpty() { return bst.isEmpty();}template <typename ElemType>void Set<ElemType>::add(ElemType element) { bst.add(element);}. . . and so on . . .The setimpl.cpp Implementation/* * Implementation notes: Set operations * ------------------------------------ * The functions isSubsetOf, unionWith, intersect, and subtract * are similar in structure. Each one uses an iterator to walk over * the appropriate set. */template <typename ElemType>bool Set<ElemType>::isSubsetOf(Set & otherSet) { Iterator iter = iterator(); while (iter.hasNext()) { if (!otherSet.contains(iter.next())) return false; } return true;}template <typename ElemType>bool Set<ElemType>::equals(Set & otherSet) { return isSubsetOf(otherSet) && otherSet.isSubsetOf(*this);}The setimpl.cpp ImplementationInitial Versions Should Be Simple• When you are developing an implementation of a publicinterface, it is usually best—at least as a first cut—to write thesimplest possible code that satisfies the requirements of theinterface.• This approach has several advantages:– You can get the package out to clients much more quickly.– Simple implementations are much easier to get right.– You often won’t have any idea what optimizations are neededuntil you have experiential data from clients of that interface. Interms of overall efficiency, some optimizations are much moreimportant than others.Premature optimization is the root of all evil.—Don Knuth– 2 –Optimizing the Binary Search Tree• If you code the Set implementation using the simple “let BSTdo all the work” strategy, it will probably not be surprising todiscover that the necessary optimizations are in the BST classitself.• In class last Friday, I didn’t go through one of the mostimportant topics in the discussion of binary search trees: thequestion of whether a tree is balanced. Binary search treesdeliver O (log N) performance only if the left and rightbranches of each node are roughly comparable in height. Thecode as I presented it in lecture does nothing to ensure thatproperty.• Leaving this issue until today might well mirror the discoveryof such problems in the real world. The BST class gets far lessdirect use than the Set class, so it is likely that performanceproblems would show up only when clients began using Sets.A Question of Balance• Ideally, a binary search tree containing the names of Disney’sseven dwarves would look like this:• If, however, you happened to enter the names in alphabeticalorder, this tree would end up being a simple linked list inwhich all the left subtrees were NULL and the right linksformed a simple chain. Algorithms on that tree would run inO(N) time instead of O(log N) time.• A binary search tree is balanced if the height of its left andright subtrees differ by at most one and if both of thosesubtrees are themelves balanced.AVL Trees• The text presents an algorithm designed by Georgii Adel’son-Vel’skii and Evgenii Landis for keeping a trees in balance.That algorithm is based on operations called rotations that tryto keep a tree in balance. For example, a simple left rotationlooks like this:• Exercise: How would you write the RotateLeft function thatperforms this transformation? What would the prototype be?What is left out of this diagram that you need to think about?Sets and Efficiency• After you release the set package, you might discover thatclients use them often for particular types for which there aremuch more efficient data structures than binary trees.• One thing you could do easily is check to see whether theelement type was string and then use a Lexicon instead of abinary search tree. The resulting implementation would be farmore efficient. This change, however, would be valuable onlyif clients used Set<string> often enough to make it worthadding the complexity.• One type of sets that do tend to occur in certain types ofprogramming is Set<char>, which comes up, for example, ifyou want to specify a set of delimiter characters for a scanner.These sets can be made astonishingly efficient as described onthe next few slides.Character Sets• The key insight needed to make efficient character sets (or,equivalently, sets of small integers) is that you can representthe inclusion or exclusion of a character using a single bit. Ifthe bit is a 1, then that element is in the set; if it is a 0, it is notin the set.• You can tell what character value you’re talking about bycreating what is essentially an array of bits, with one


View Full Document

Stanford CS 106B - Implementing Sets

Download Implementing Sets
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Implementing Sets and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Implementing Sets 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?