Analysis of AlgorithmsTime and spaceWhat does “size of the input” mean?Characteristic operationsExact valuesHigher-level languagesAverage, best, and worst casesConstant timeLinear timeConstant time is (usually) better than linear timeThe array subset problemmembersubsetAnalysis of array subset algorithmWhat about the constants?Simplifying the formulaeBig O notationBig O for subset algorithmCan we justify Big O notation?y = x2 + 3x + 5, for x=1..10y = x2 + 3x + 5, for x=1..20Common time complexitiesThe EndJan 15, 2019Analysis of Algorithms2Time and spaceTo analyze an algorithm means:developing a formula for predicting how fast an algorithm is, based on the size of the input (time complexity), and/ordeveloping a formula for predicting how much memory an algorithm requires, based on the size of the input (space complexity)Usually ti me is our biggest concernMost algorithms require a fixed amount of space3What does “size of the input” mean?If we are searching an array, the “size” of the input could be the size of the arrayIf we are merging two arrays, the “size” could be the sum of the two array sizesIf we are computing the nth Fibonacci number, or the nth factorial, the “size” is nWe choose the “size” to be a parameter that determines the actual time (or space) requiredIt is usually obvious what this parameter isSometimes we need two or more parameters4Characteristic operationsIn computing time complexity, one good approach is to count characteristic operationsWhat a “characteristic operation” is depends on the particular problemIf searching, it might be comparing two valuesIf sorting an array, it might be: comparing two values swapping the contents of two array locations both of the aboveSometimes we just look at how many times the innermost loop is executed5Exact valuesIt is sometimes possible, in assembly language, to compute exact time and space requirementsWe know exactly how many bytes and how many cycles each machine instruction takesFor a problem with a known sequence of steps (factorial, Fibonacci), we can determine how many instructions of each type are requiredHowever, often the exact sequence of steps cannot be known in advanceThe steps required to sort an array depend on the actual numbers in the array (which we do not know in advance)6Higher-level languagesIn a higher-level language (such as Java), we do not know how long each operation takesWhich is faster, x < 10 or x <= 9 ?We don’t know exactly what the compiler does with thisThe compiler almost certainly optimizes the test anyway (replacing the slower version with the faster one)In a higher-level language we cannot do an exact analysisOur timing analyses will use major oversimplificationsNevertheless, we can get some very useful results7Average, best, and worst casesUsually we would like to find the average time to perform an algorithmHowever,Sometimes the “average” isn’t well definedExample: Sorting an “average” arrayTime typically depends on how out of order the array isHow out of order is the “average” unsorted array? Sometimes finding the average is too difficultOften we have to be satisfied with finding the worst (longest) time requiredSometimes this is even what we want (say, for time-critical operations)The best (fastest) case is seldom of interest8Constant timeConstant time means there is some constant k such that this operation always takes k nanosecondsA Java statement takes constant time if:It does not include a loopIt does not include calling a method whose time is unknown or is not a constantIf a statement involves a choice (if or switch) among operations, each of which takes constant time, we consider the statement to take constant timeThis is consistent with worst-case analysis9Linear timeWe may not be able to predict to the nanosecond how long a Java program will take, but do know some things about timing: for (i = 0, j = 1; i < n; i++) { j = j * i; }This loop takes time k*n + c, for some constants k and c k : How long it takes to go through the loop once (the time for j = j * i, plus loop overhead) n : The number of times through the loop (we can use this as the “size” of the problem) c : The time it takes to initialize the loopThe total time k*n + c is linear in n10Constant time is (usually)better than linear timeSuppose we have two algorithms to solve a task:Algorithm A takes 5000 time unitsAlgorithm B takes 100*n time unitsWhich is better?Clearly, algorithm B is better if our problem size is small, that is, if n < 50Algorithm A is better for larger problems, with n > 50So B is better on small problems that are quick anywayBut A is better for large problems, where it matters moreWe usually care most about very large problemsBut not always!11The array subset problemSuppose you have two sets, represented as unsorted arrays: int[] sub = { 7, 1, 3, 2, 5 };int[] super = { 8, 4, 7, 1, 2, 3, 9 }; and you want to test whether every element of the first set (sub) also occurs in the second set (super): System.out.println(subset(sub, super));(The answer in this case should be false, because sub contains the integer 5, and super doesn’t)We are going to write method subset and compute its time complexity (how fast it is)Let’s start with a helper function, member, to test whether one number is in an array12member static boolean member(int x, int[] a) { int n = a.length; for (int i = 0; i < n; i++) { if (x == a[i]) return true; } return false;}If x is not in a, the loop executes n times, where n = a.lengthThis is the worst caseIf x is in a, the loop executes n/2 times on averageEither way, linear time is required: k*n+c13subsetstatic boolean subset(int[] sub, int[] super) { int m = sub.length; for (int i = 0; i < m; i++) if (!member(sub[i], super) return false; return true;}The loop (and the call to member) will execute:m = sub.length times, if sub is a subset of superThis is the worst case, and therefore the one we are most interested inFewer than sub.length times (but we don’t know how many)We would need to figure this out in order to compute average time complexityThe worst case is a linear number of times through the loopBut the loop body doesn’t take constant time, since it calls member, which takes
View Full Document