Hashing: Log (10100) is a big numberHashing detailsChaining with hashingHashing problemsWhat about hash functionsTools: Solving Computational ProblemsQuota ExceededRecursive structure matches codeSolving Problems RecursivelyPrint words read, but print backwardsExponentiationFaster exponentiationBack to RecursionRecognizing recursion:The Power of Recursion: Brute forceRecasting the problemRecursive example 1Recursive example 2Recursive example 3CompSci 100e6.1Hashing: Log (10100) is a big numberComparison based searches are too slow for lots of dataHow many comparisons needed for a billion elements?What if one billion web-pages indexed?Hashing is a search method: average case O(1) searchWorst case is very bad, but in practice hashing is goodAssociate a number with every key, use the number to store the key•Like catalog in library, given book title, find the bookA hash function generates the number from the keyGoal: Efficient to calculateGoal: Distributes keys evenly in hash tableCompSci 100e6.2Hashing detailsThere will be collisions, two keys will hash to the same valueWe must handle collisions, still have efficient searchWhat about birthday “paradox”: using birthday as hash function, will there be collisions in a room of 25 people? Several ways to handle collisions, in general array/vector usedLinear probing, look in next spot if not found•Hash to index h, try h+1, h+2, …, wrap at end•Clustering problems, deletion problems, growing problemsQuadratic probing•Hash to index h, try h+12, h+22 , h+32 , …, wrap at end•Fewer clustering problemsDouble hashing•Hash to index h, with another hash function to j•Try h, h+j, h+2j, …0 1 2 3 n-1CompSci 100e6.3Chaining with hashingWith n buckets each bucket stores linked listCompute hash value h, look up key in linked list table[h]Hopefully linked lists are short, searching is fastUnsuccessful searches often faster than successful•Empty linked lists searched more quickly than non-emptyPotential problems?Hash table detailsSize of hash table should be a prime numberKeep load factor small: number of keys/size of tableOn average, with reasonable load factor, search is O(1)What if load factor gets too high? Rehash or other methodCompSci 100e6.4Hashing problemsLinear probing, hash(x) = x, (mod tablesize)Insert 24, 12, 45, 14, delete 24, insert 23 (where?)Same numbers, use quadratic probing (clustering better?)What about chaining, what happens?0 1 2 3 4 5 6 7 8 9 100 1 2 3 4 5 6 7 8 9 10241245 1424124514CompSci 100e6.5What about hash functionsHashing often done on strings, consider two alternativespublic static int hash(String s){ int k, total = 0; for(k=0; k < s.length(); k++){ total += s.charAt(k); } return total;}Consider total += (k+1)*s.charAt(k), why might this be better?Other functions used, always mod result by table sizeWhat about hashing other objects?Need conversion of key to index, not always simpleEver object has method hashCode()!CompSci 100e6.6Tools: Solving Computational ProblemsAlgorithmic techniques and paradigmsBrute-force/exhaustive, greedy algorithms, dynamic programming, divide-and-conquer, …Transcend a particular languageDesigning algorithms, may change when turned into codeProgramming techniques and paradigmsRecursion, memo-izing, compute-once/lookup, tables, …Transcend a particular languageHelp in making code work•Avoid software problems (propagating changes, etc.)•Avoid performance problemsCompSci 100e6.7Quota ExceededYou’re running out of disk spaceBuy moreCompress filesDelete filesHow do you find your “big” files?What’s big?How do you do this?CompSci 100e6.8Recursive structure matches code public static long THRESHOLD = 1000000L; // one million bytes public static void findBig(File dir, String tab) { File[] dirContents = dir.listFiles(); System.out.println(tab+"**:"+dir.getPath()); for(File f : dirContents){ if (f.isDirectory()) { findBig(f,tab+"\t"); } else { if (f.length() > THRESHOLD){ System.out.printf("%s%s%8d\n",tab,f.getName(), f.length()); } } } }Does findBig call itself?CompSci 100e6.9Solving Problems RecursivelyRecursion is an indispensable tool in a programmer’s toolkitAllows many complex problems to be solved simplyElegance and understanding in code often leads to better programs: easier to modify, extend, verify (and sometimes more efficient!!)Sometimes recursion isn’t appropriate, when it’s bad it can be very bad---every tool requires knowledge and experience in how to use itThe basic idea is to get help solving a problem from coworkers (clones) who work and act like you doAsk clone to solve a simpler but similar problemUse clone’s result to put together your answerNeed both concepts: call on the clone and use the resultCompSci 100e6.10Print words read, but print backwardsCould store all the words and print in reverse order, but …Probably the best approach, recursion works toopublic void printReversed(Scanner s){ if (s.hasNext()){ // reading succeeded? String word = s.next(); // store word printReversed(s); // print rest System.out.println(word); // print the word }}The function printReversed reads a word, prints the word only after the clones finish printing in reverse orderEach clone has own version of the code, own word variableWho keeps track of the clones?How many words are created when reading N words?•Can we do better?CompSci 100e6.11ExponentiationComputing xn means multiplying n numbers (or does it?)What’s the simplest value of n when computing xn?If you want to multiply only once, what can you ask a clone?public static double power(double x, int n){ if (n == 0){ return 1.0; } return x * power(x, n-1);}Number of multiplications?Note base case: no recursion, no clonesNote recursive call: moves toward base case (unless …)CompSci 100e6.12Faster exponentiationHow many recursive calls are made to computer 21024?How many multiplies on each call? Is this better? public static double power(double x, int n){ if (n == 0) { return 1.0; } double semi = power(x, n/2); if (n % 2 == 0) { return semi*semi; } return x * semi * semi;}What about an iterative version of this function?CompSci 100e6.13Back to
View Full Document