Engineering Analysis ENG 3420, Fall 2009
Dan C. Marinescu
Office: HEC 439 B; Office hours: Tu, Th 11:00-12:00

Lecture 22

Attention: the last homework (HW5) and the last project are due on Tuesday, November 24.

Last time:
  Linear regression
  Exponential, power, and saturation nonlinear models
  Linear least-squares regression
Today:
  Linear regression versus the sample mean
  Coefficient of determination
  Polynomial least-squares fit
  Multiple linear regression
  General linear least squares
  More on nonlinear models
  Interpolation (Chapter 15): polynomial interpolation, Newton interpolating polynomials, Lagrange interpolating polynomials
Next time:
  Splines

Quantification of Errors

For a straight line, the sum of the squares of the estimate residuals is

  S_r = sum_{i=1}^{n} e_i^2 = sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2

The standard error of the estimate is

  s_{y/x} = sqrt( S_r / (n - 2) )

Linear regression versus the sample mean

What is the difference between linear regression and simply computing the sample mean and drawing a horizontal line at that value? The difference is the spread: the histogram of the differences between the values predicted by linear regression and the actual sample values is narrower than the histogram of the differences from the mean.

Figure: regression data showing (a) the spread of the data around the mean of the dependent variable and (b) the spread of the data around the best-fit line. The reduction in spread represents the improvement due to linear regression.

Coefficient of Determination

The coefficient of determination is

  r^2 = (S_t - S_r) / S_t,   where S_t = sum_{i=1}^{n} (y_i - ybar)^2

r^2 represents the fraction of the original uncertainty explained by the model:
  For a perfect fit, S_r = 0 and r^2 = 1.
  If r^2 = 0, there is no improvement over simply picking the mean.
  If r^2 < 0, the model is worse than simply picking the mean.

Example (v in m/s, F in N). The best-fit line is F_est = -234.2857 + 19.47024 v.

  i    x_i    y_i    a_0 + a_1 x_i   (y_i - ybar)^2   (y_i - a_0 - a_1 x_i)^2
  1     10     25        -39.58          380,535              4,171
  2     20     70        155.12          327,041              7,245
  3     30    380        349.82           68,579                911
  4     40    550        544.52            8,441                 30
  5     50    610        739.23            1,016             16,699
  6     60   1220        933.93          334,229             81,837
  7     70    830       1128.63           35,391             89,180
  8     80   1450       1323.33          653,066             16,044
  sum  360   5135                      1,808,297            216,118
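The slide computes these fit statistics by hand; the same numbers can be reproduced with a short script. The sketch below is in plain Python rather than the MATLAB used elsewhere in the lecture (an assumption made so the example is self-contained), applying the standard least-squares formulas to the force-velocity data of the table above.

```python
# Pure-Python sketch: straight-line least-squares fit and the fit statistics
# (S_t, S_r, r^2, s_y/x) for the force-velocity data in the table above.
from math import sqrt

v = [10, 20, 30, 40, 50, 60, 70, 80]          # velocity x_i (m/s)
F = [25, 70, 380, 550, 610, 1220, 830, 1450]  # force y_i (N)
n = len(v)

# Least-squares slope a1 and intercept a0 for y = a0 + a1*x.
sx, sy = sum(v), sum(F)
sxy = sum(x * y for x, y in zip(v, F))
sxx = sum(x * x for x in v)
a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
a0 = sy / n - a1 * sx / n

# Spread around the mean (S_t), around the fit (S_r), and derived measures.
ybar = sy / n
St = sum((y - ybar) ** 2 for y in F)
Sr = sum((y - a0 - a1 * x) ** 2 for x, y in zip(v, F))
r2 = (St - Sr) / St            # coefficient of determination
syx = sqrt(Sr / (n - 2))       # standard error of the estimate

print(f"F_est = {a0:.4f} + {a1:.5f} v, r^2 = {r2:.4f}, s_y/x = {syx:.2f}")
```

Running this reproduces the slide's values: a0 = -234.2857, a1 = 19.47024, r^2 = 0.8805, s_y/x = 189.79.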
From the table, S_t = sum (y_i - ybar)^2 = 1,808,297 and S_r = sum (y_i - a_0 - a_1 x_i)^2 = 216,118, so

  s_y = sqrt( 1,808,297 / (8 - 1) ) = 508.26
  s_{y/x} = sqrt( 216,118 / (8 - 2) ) = 189.79
  r^2 = (1,808,297 - 216,118) / 1,808,297 = 0.8805

88.05% of the original uncertainty has been explained by the linear model.

Polynomial least-squares fit

MATLAB has a built-in function polyfit that fits a least-squares n-th order polynomial to data:
  p = polyfit(x, y, n)
    x: independent data
    y: dependent data
    n: order of the polynomial to fit
    p: coefficients of the polynomial f(x) = p1*x^n + p2*x^(n-1) + ... + pn*x + p(n+1)
MATLAB's polyval command can be used to evaluate the polynomial from its coefficients:
  y = polyval(p, x)

Fitting an m-th order polynomial to n data points: minimize

  S_r = sum_{i=1}^{n} e_i^2 = sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - a_2 x_i^2 - ... - a_m x_i^m)^2

The standard error is

  s_{y/x} = sqrt( S_r / (n - (m + 1)) )

because the m-th order polynomial has m + 1 coefficients. The coefficient of determination is, as before,

  r^2 = (S_t - S_r) / S_t,   S_t = sum_{i=1}^{n} (y_i - ybar)^2

Multiple Linear Regression

Now y is a linear function of two or more independent variables:

  y = a_0 + a_1 x_1 + a_2 x_2 + ... + a_m x_m

The best fit minimizes the sum of the squares of the estimate residuals:

  S_r = sum_{i=1}^{n} e_i^2 = sum_{i=1}^{n} (y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} - ... - a_m x_{m,i})^2

For example, when y = a_0 + a_1 x_1 + a_2 x_2, instead of a line we fit a plane.

General Linear Least Squares

Linear, polynomial, and multiple linear regression all belong to the general linear least-squares model:

  y = a_0 z_0 + a_1 z_1 + a_2 z_2 + ... + a_m z_m + e

where z_0, z_1, ..., z_m are a set of m + 1 basis functions and e is the error of the fit. The basis functions can be any functions of the data, but cannot contain any of the coefficients a_0, a_1, etc. The equation can be re-written for all data points as a matrix equation:

  {y} = [Z]{a} + {e}

where {y} is the vector of the n dependent data, {a} is the vector of the m + 1 coefficients, {e} contains the error at each point, and [Z] is

  [Z] = [ z_{0,1}  z_{1,1}  ...  z_{m,1}
          z_{0,2}  z_{1,2}  ...  z_{m,2}
            ...      ...    ...    ...
          z_{0,n}  z_{1,n}  ...  z_{m,n} ]

with z_{j,i} representing the value of the j-th basis function evaluated at the i-th point.

Solving General Linear Least Squares for the Coefficients

Generally, [Z] is an n x (m+1) matrix, so simple inversion cannot be used to solve for the m + 1 coefficients {a}. Instead, the sum of the squares of the estimate residuals is minimized:

  S_r = sum_{i=1}^{n} e_i^2 = sum_{i=1}^{n} ( y_i - sum_{j=0}^{m} a_j z_{j,i} )^2

The outcome of this minimization yields the normal equations:

  [Z]^T [Z] {a} = [Z]^T {y}

Example: given the column vectors x and y, find the coefficients for the best-fit parabola y = a0 + a1*x + a2*x^2:

  Z = [ones(size(x)) x x.^2];
  a = (Z'*Z)\(Z'*y);

MATLAB's left-divide will automatically include the [Z]^T terms if the matrix is not square, so

  a = Z\y;

would work as well. To calculate measures of fit:

  St = sum((y - mean(y)).^2);
  Sr = sum((y - Z*a).^2);
  r2 = 1 - Sr/St;
  syx = sqrt(Sr/(length(x) - length(a)));

Nonlinear Models

How to deal with nonlinear models, when we cannot fit a straight line to the sample data:
1. Transform the variables and solve for the best fit of the transformed variables. This works well for exponential, power, and saturation models, but not all equations can be transformed easily or at all.
2. Perform nonlinear regression to directly determine the least-squares fit.

To perform nonlinear regression, write a function that returns the sum of the squares of the estimate residuals for a fit, and then use the fminsearch function to find the values of the coefficients where a minimum occurs. The arguments to the function that computes S_r should be the coefficients, the independent variables, and the dependent variables.

Example: given two vectors of n observations, ym for the force F and xm for the velocity v, find the coefficients a0 and a1 for the best fit of the equation F = a0 * v^a1.

First, write a function called fSSR.m containing the following:

  function f = fSSR(a, xm, ym)
  yp = a(1)*xm.^a(2);
  f = sum((ym - yp).^2);

Then use fminsearch in the command window to obtain the values of a that minimize fSSR:

  a = fminsearch(@fSSR, [1, 1], [], v, F)

where [1, 1] is an initial guess for the [a0, a1] vector and [] is a placeholder for the options.

Comparison between the transformed version of the power equation and the direct method in our example: in the general case the two methods produce different results (the coefficients of the equations differ), and the direct method produces the larger r^2.
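The normal-equations solve shown in MATLAB above can also be sketched without any library solver. The following pure-Python version (an assumption: Gaussian elimination is written out by hand here instead of using MATLAB's left-divide) builds [Z] for the quadratic basis {1, x, x^2}, forms [Z]^T[Z]{a} = [Z]^T{y}, and solves for the coefficients. The data are synthetic, generated from a known quadratic so the recovered coefficients can be checked.

```python
# Pure-Python sketch of general linear least squares via the normal
# equations [Z]^T[Z]{a} = [Z]^T{y}, for the basis {1, x, x^2}.

def solve(A, b):
    """Solve A a = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]      # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    a = [0.0] * n
    for k in range(n - 1, -1, -1):                    # back substitution
        a[k] = (M[k][n] - sum(M[k][c] * a[c] for c in range(k + 1, n))) / M[k][k]
    return a

x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 + 3.0 * xi + 0.5 * xi ** 2 for xi in x]      # exact quadratic data

Z = [[1.0, xi, xi ** 2] for xi in x]                  # basis functions z0, z1, z2
ZtZ = [[sum(Z[i][j] * Z[i][k] for i in range(len(x)))
        for k in range(3)] for j in range(3)]         # [Z]^T [Z]
Zty = [sum(Z[i][j] * y[i] for i in range(len(x))) for j in range(3)]  # [Z]^T {y}
a = solve(ZtZ, Zty)                                   # recovers [a0, a1, a2]
```

Since the data lie exactly on y = 2 + 3x + 0.5x^2, the solve recovers those coefficients to machine precision; with noisy data the same code returns the least-squares fit.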
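Approach 1 above (transforming the variables) can be sketched for the power model: taking log10 of F = a0 * v^a1 gives log10(F) = log10(a0) + a1*log10(v), which is a straight line in log space, so the ordinary linear least-squares formulas apply. Again this is a plain-Python translation of the MATLAB workflow (an assumption for self-containedness), applied to the force-velocity data from earlier in the lecture.

```python
# Power-model fit F = a0 * v^a1 by variable transformation:
# fit a straight line to (log10 v, log10 F), then untransform the intercept.
from math import log10

v = [10, 20, 30, 40, 50, 60, 70, 80]
F = [25, 70, 380, 550, 610, 1220, 830, 1450]
X = [log10(x) for x in v]
Y = [log10(y) for y in F]
n = len(X)

sx, sy = sum(X), sum(Y)
sxy = sum(x * y for x, y in zip(X, Y))
sxx = sum(x * x for x in X)
a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # exponent of the power model
a0 = 10 ** (sy / n - a1 * sx / n)               # untransformed coefficient

print(f"F ~= {a0:.4f} * v^{a1:.4f}")
```

This gives roughly F = 0.274 * v^1.984. The direct method (fminsearch on the untransformed S_r) generally yields different coefficients and, as noted above, a larger r^2, because the transformed fit minimizes the error of the logarithms rather than of F itself.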