The Set ADT List and Tree ImplementationsThe Set ADTSlide 3Slide 4Additional set operations ….The Dictionary ADTUnordered ListsUnordered ListsTime to perform LookupExpected-cost analysisAssume: Uniform distribution of access to keysAssume: non-uniform access to keysWhat is the optimal ordering?But you usually don’t know the frequency of access in advance …Move-to-front Heuristic:Transpose Heuristic:Ordered ListsSlide 18BinarySearchLookupBinary Search ExampleInterpolation SearchThe Set ADTList and Tree ImplementationsCSCI 2720Spring 2007Eileen KraemerThe Set ADTWe assume:Members drawn from a single universeSets are finite; universe may be infiniteNo duplicate membersBut some representations can be used to represent multisets (which may contain duplicate members)Note: ADT = abstract data typeInteresting because “is x in S?” is asked in many applications.The Set ADTMember(x,S)If x in S return true else return falseUnion(S,T)return S T, the set consisting of every x in S or T or bothIntersection(S,T)return S T, the set consisting of every x in both S and TDifference (S,T)return S – T, the set of all x in S that are not in TMakeEmptySet()Return the empty setThe Set ADTIsEmptySet(S)return true if S is the empty set, else return falseSize(S)return |S| (“cardinality of S”), the number of elements in the set SInsert(x,S)set S to S {x} ; no effect if x is already in SDelete(x,S)set S to S – {x}; no effect if x not in SEqual(S,T) Return true if S == T (if S and T have the same members)Iterate(S,F)perform operation F on each member of set S (in no particular order)Additional set operations ….Apply only if members are linearly orderedMin(S) – return smallest member of set SMax(S) – return largest member of set SApplies only if members are of form <K,I>Where K is a key drawn from some key space and I is some associated infoLookup(K,S)Given a key value K, return an Info I such that <K,I> is a member of S; if no such member exists return If a <K,I> is found : successful searchIf is returned : unsuccessful searchThe Dictionary ADTA set ADT with the operations:MakeEmptySetIsEmptySetInsertDeleteLookupUnordered ListsSimplest representation of setsMay use contiguous memory, singly linked, doubly linkedApplies to sets drawn from any universeKeys may be ordered or unorderedUnordered ListsLookupSimple sequential search; At each step, compare two keys for equalityIf dictionary contains n elements, then Lookup is Θ(n) in the worst case if the lists are unorderedInsert = Lookup + insertion costDelete = Lookup + deletion costTime to perform LookupCount the number of comparisons:How many (K == K’) operations are performed?Linked list:Worst case for successful search:•K is last key in list: n comparisonsWorst case for unsuccessful search:•Compare to all keys in list: n comparisonsSeems “okay” only if list is shortExpected-cost analysisHere, we consider the frequency and cost of each lookup and compute a weighted sum: the expected costAssume:Uniform distribution of access to keysExpected cost for linked lists: = average number of comparisons to lookup a key= (1+2+3+…+n)/n= (n+1)/2Thus:Expected cost for a lookup in linked list is Θ(n)Assume:non-uniform access to keysCloser to reality: relatively few keys account for most of the LookUpsLet the keys in the dictionary be K1, …, Kn, ordered by frequency of access (K1 accessed most frequently)Let pi be frequency of access to key Kip1 + p2 + … + pn = 1.0For successful searches:Expected cost = (p1 *1)+ (p2 *2)+ … + (pn *n) For unsuccessful searches:Cost is always θ(n)What is the optimal ordering?A list kept in frequency order:p1 >= p2 >= … >= pnIf the frequencies are sufficiently skewed, this can be much less than (n+1)/2Example: p1 = ½, p2 = ¼, p3= ⅛, …etc. Expected cost is :(1* ½ + 2* ¼ + 3* ⅛+ …..n* 2-n+1)/nless than 2If p=4, then cost = (1* ½ + 2* ¼ + 3* ⅛+4*⅛) = 15/32But you usually don’t know the frequency of access in advance …However, you can adjust the order of the list as the Lookups occur:Move-to-front Heuristic:Transpose Heuristic:Move-to-front Heuristic:After each successful search, move the item that was sought to the front of the listCost is no more than twice the optimum (proof in text)Example: {A, B, C, D}Search on C: cost=3, adjust to {C, A, B, D}Search on D: cost=4, adjust to {D, C, A, B}Search on C: cost=2, adjust to {C, D, A, B}Search on D: cost=1, adjust to {D, C, A, B}Transpose Heuristic:If the item sought is not the first in the list, move it one position forward by exchanging it with the item just before itPerformance (after steady state reached) is even better than Move-to-Front (but harder to prove)Example: {A, B, C, D}Search on C: cost=3, adjust to {A, C, B, D}Search on D: cost=4, adjust to {A, C, D, B}Search on C: cost=2, adjust to {C, A, D, B}Search on A, cost=2, adjust to {A, C, D, B}Ordered ListsIf the keys have some linear order, can maintain list in order by key valueBenefits:Unsuccessful search can stop when key of greater value is encountered: expected number of comparisons reduced to n/2Ordered ListsIf representation is tabular (contigous memory, “array-based”), can employ binary searchBinary search (worst case) between 1 and floor(lg n) + 1 comparisons in a successful search of a table of size n, either floor(lg n) or floor(lg n) + 1 comparisons in an unsuccessful search Floor(lg n) + 1 exactly, if n is one less than a power of 2BinarySearchLookupFunction BinarySearchLookUp(key K, table T[0..n-1]):info{return information stored with key K in T, or if K is not in T}Left = 0Right = n-1Repeat foreverif Right < Left thenreturn elseMiddle = floor((Left+Right)/2)if (K == Key(T[Middle])) then return Info(T[Middle])else if K < Key(T[Middle])then Right <= Middle -1else Left = Middle + 1Binary Search ExampleChoose an array of 16 values (have class provide values)Work through example on board.Interpolation Searchfunction InterpolationSearchLookup(key K, table T[0..n-1]): info//return info stored with K in ordered table T, or if K not present//table positions T[-1] and T[n] are assumed to be
View Full Document