Calculus 220, Section 7.5: Least Squares
Notes prepared by Tim Pilachowski

So far we have dealt with given equations, and found both derivatives and integrals of those functions. In the real world, the equations didn't fall out of the sky; they were developed from data, that is, from observations made about a phenomenon, and a curve was found that fits the data. Hopefully you remember how to find the equation of a line given two points: calculate the slope m, substitute to find the y-coordinate of the y-intercept b, and write the equation y = mx + b. When there are more than two points, and they don't conveniently line up for us, we use partial derivatives to minimize the difference (error) between the observed data and the line of best fit. The process is called regression analysis, and the method is called least squares. We'll adopt the statistics convention and use the formula y = Ax + B for the line of regression.

Example A: A manufacturer has collected preliminary data relating number of units produced, x (measured in hundreds), and cost, y (measured in $1000s):

    x | 2   5   6   9
    y | 4   6   7   8

Find the equation that best represents cost as a function of number of units produced. Answer: y = 0.58x + 3.06.

[The data are pictured in a scatterplot, not reproduced here.] The line of regression will go through the middle of the data, but the question becomes: of all the lines that we might draw that seem to fit the data, which is the one that has the least error between the observed (actual) y-value and the regression's predicted value Ax + B?

    x_i                  | 2                5                6                9
    observed y_i         | 4                6                7                8
    regression Ax_i + B  | 2A + B           5A + B           6A + B           9A + B
    error E_i            | 4 - (2A + B)     6 - (5A + B)     7 - (6A + B)     8 - (9A + B)
    E_i^2                | [4 - (2A + B)]^2 [6 - (5A + B)]^2 [7 - (6A + B)]^2 [8 - (9A + B)]^2

Since some of the points will lie above the regression line and some will lie below it, some of the errors will be positive and some will be negative, so finding the sum of the errors would result in a canceling effect. Instead, to keep the value of each error and retain its effect, we'll find the sum of the squared errors:

    f(A, B) = [4 - (2A + B)]^2 + [6 - (5A + B)]^2 + [7 - (6A + B)]^2 + [8 - (9A + B)]^2

This is the function for which we want a minimum. We'll use the techniques of the previous sections (partial derivatives), and we'll need the chain rule.

The process above involved only four data points. If we had 400, or 4000, the same process would become very unwieldy very quickly. We can develop a general formula which can be applied to N = any number of points.

    x_i                  | x_1                  x_2                  ...  x_N
    observed y_i         | y_1                  y_2                  ...  y_N
    regression Ax_i + B  | Ax_1 + B             Ax_2 + B             ...  Ax_N + B
    error E_i            | y_1 - (Ax_1 + B)     y_2 - (Ax_2 + B)     ...  y_N - (Ax_N + B)
    E_i^2                | [y_1 - (Ax_1 + B)]^2 [y_2 - (Ax_2 + B)]^2 ...  [y_N - (Ax_N + B)]^2

Using the symbol \sum to mean "the sum of", the least squares error function we want to minimize is

    sum of squared errors = f(A, B) = \sum [y - (Ax + B)]^2

Note: I really should write f(A, B) = \sum_{i=1}^{N} [y_i - (Ax_i + B)]^2, but chose the version above for simplicity's sake.

Using the sum rule and chain rule, we get

    \partial f / \partial A = \sum 2[y - (Ax + B)](-x) = -2 \sum [xy - Ax^2 - Bx] = 0
    \partial f / \partial B = \sum 2[y - (Ax + B)](-1) = -2 \sum [y - Ax - B] = 0

Note that since we have N points, and thus have N terms in our sum, we can replace \sum B with NB. Dividing by -2 to get easier numbers and rearranging gives us

    A \sum x^2 + B \sum x = \sum xy
    A \sum x + NB = \sum y

Solving the second equation for B, we get

    B = (\sum y - A \sum x) / N

Substituting into the first equation and solving for A, we get

    A = (N \sum xy - \sum x \sum y) / (N \sum x^2 - (\sum x)^2)

Example A revisited: Use the formulas above to find the least squares regression equation.

    x   | 2   5   6   9  | \sum x   = 22
    y   | 4   6   7   8  | \sum y   = 25
    xy  | 8   30  42  72 | \sum xy  = 152
    x^2 | 4   25  36  81 | \sum x^2 = 146

    A = [4(152) - (22)(25)] / [4(146) - 22^2] = (608 - 550) / (584 - 484) = 58/100 = 0.58
    B = [25 - 0.58(22)] / 4 = 12.24 / 4 = 3.06

So the least squares regression equation is y = 0.58x + 3.06, as given above.
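The summation formulas above translate directly into a short program. The following Python snippet is a minimal sketch, not part of the original notes (the function name least_squares_line is my own); it applies the formulas for A and B to the Example A data and reproduces the answer y = 0.58x + 3.06.

```python
# A minimal sketch of the least squares formulas derived above,
# using only the Python standard library.

def least_squares_line(xs, ys):
    """Return (A, B) for the regression line y = Ax + B."""
    N = len(xs)
    sum_x = sum(xs)                              # \sum x
    sum_y = sum(ys)                              # \sum y
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # \sum xy
    sum_x2 = sum(x * x for x in xs)              # \sum x^2
    # A = (N \sum xy - \sum x \sum y) / (N \sum x^2 - (\sum x)^2)
    A = (N * sum_xy - sum_x * sum_y) / (N * sum_x2 - sum_x ** 2)
    # B = (\sum y - A \sum x) / N
    B = (sum_y - A * sum_x) / N
    return A, B

# Example A data: x = units produced (hundreds), y = cost ($1000s)
xs = [2, 5, 6, 9]
ys = [4, 6, 7, 8]
A, B = least_squares_line(xs, ys)
print(f"y = {A:.2f}x + {B:.2f}")  # prints: y = 0.58x + 3.06
```

In practice one would typically call a library routine such as numpy.polyfit(xs, ys, 1) for a degree-1 fit, but spelling out the sums keeps the connection to the derivation explicit.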