
Mathematical Modeling Lecture 17: Modeling of Data: Linear Regression

Introduction

In modeling of data, we are given a set of data points, and we want to fit a function with adjustable parameters to the data points. We want the function to approximate the data as well as possible, and to do that we must choose appropriate values for the parameters in the function. You choose these parameter values by first designing a merit function which you wish to minimize; when the merit function is minimized, the function and the data are in close agreement.

You can see that fitting a function to data becomes a problem of minimization in many dimensions (the number of adjustable parameters in your function is the dimension of the problem). Once we have fit the function to the data, we need to assess how good the fit actually is; there has to be some sort of statistical analysis of the fit.

The bulk of this discussion is based on Reference [1], which is an excellent first resource for a variety of applied numerical analysis.

General Set Up

We have N data points (t_i, y_i), i = 1, 2, \ldots, N, which we want to fit to a model function with M adjustable parameters, f(t) = f(t; \alpha_1, \ldots, \alpha_M).

We actually have a great deal of choice in what type of function we want to minimize; it can be anything that measures the relation of the data to the model function. The vector which compares the data to the model function at each point is given by

  \begin{pmatrix} y_1 - f(t_1; \alpha_1, \ldots, \alpha_M) \\ y_2 - f(t_2; \alpha_1, \ldots, \alpha_M) \\ \vdots \\ y_N - f(t_N; \alpha_1, \ldots, \alpha_M) \end{pmatrix}

We can minimize this vector based on a variety of different norms:

  l_1 norm:       \sum_{i=1}^{N} |y_i - f(t_i; \alpha_1, \ldots, \alpha_M)|

  l_p norm:       \left( \sum_{i=1}^{N} |y_i - f(t_i; \alpha_1, \ldots, \alpha_M)|^p \right)^{1/p}

  l_\infty norm:  \max_{1 \le i \le N} |y_i - f(t_i; \alpha_1, \ldots, \alpha_M)|

What is typically done is that the l_2 norm is used, since it is the Euclidean space norm, and the square of the norm is minimized (hence the name: least squares fit):

  minimize  F(\alpha_1, \ldots, \alpha_M) = \sum_{i=1}^{N} \left( y_i - f(t_i; \alpha_1, \ldots, \alpha_M) \right)^2

The above assumes we know the data points only. However, what if we know the data points and, for each data point, a standard deviation \sigma_i? How could this be incorporated into the function we wish to minimize? We can do the following:

  minimize  F(\alpha_1, \ldots, \alpha_M) = \sum_{i=1}^{N} \left( \frac{y_i - f(t_i; \alpha_1, \ldots, \alpha_M)}{\sigma_i} \right)^2    (1)

which assumes that each data point has a measurement error which is independently random and distributed as a normal distribution around the actual model f(t). This result is based on a great deal of statistics, and on the idea that random deviations will converge to a normal distribution. Of course, this may not be the case in practice.

Frequently, it is true that \sigma_i = \sigma is the same for all the data points. In that case, \sigma^2 can be factored out of the sum, and \sigma does not appear in the solution for \alpha_1 and \alpha_2. Since this is the case, if you are given data which does not have an associated error \sigma_i depending on the data point, you can simply set \sigma_i = 1 and proceed with the analysis.

Minimizing Eq. (1) is just a multivariable unconstrained minimization procedure, which yields the system of equations

  0 = \sum_{i=1}^{N} \left( \frac{y_i - f(t_i; \alpha_1, \ldots, \alpha_M)}{\sigma_i^2} \right) \left( \frac{\partial}{\partial \alpha_k} f(t_i; \alpha_1, \ldots, \alpha_M) \right),  k = 1, \ldots, M    (2)

which must be solved for the M unknowns \alpha_k.

Linear Regression

Linear regression does not mean fitting data to a straight line! The "linear" refers to the model's dependence on the parameters \alpha_k. However, for now we are interested in fitting to a straight line. In this case, our fitting function becomes

  f(t; \alpha_1, \alpha_2) = \alpha_1 + \alpha_2 t.

The system of equations in Eq. (2) becomes

  \alpha_1 \sum_{i=1}^{N} \frac{1}{\sigma_i^2} + \alpha_2 \sum_{i=1}^{N} \frac{t_i}{\sigma_i^2} = \sum_{i=1}^{N} \frac{y_i}{\sigma_i^2}

  \alpha_1 \sum_{i=1}^{N} \frac{t_i}{\sigma_i^2} + \alpha_2 \sum_{i=1}^{N} \frac{t_i^2}{\sigma_i^2} = \sum_{i=1}^{N} \frac{y_i t_i}{\sigma_i^2}    (3)

We can simplify the notation if we use the following:

  S = \sum_{i=1}^{N} \frac{1}{\sigma_i^2},  S_t = \sum_{i=1}^{N} \frac{t_i}{\sigma_i^2},  S_y = \sum_{i=1}^{N} \frac{y_i}{\sigma_i^2},  S_{tt} = \sum_{i=1}^{N} \frac{t_i^2}{\sigma_i^2},  S_{ty} = \sum_{i=1}^{N} \frac{t_i y_i}{\sigma_i^2},  \Delta = S S_{tt} - S_t^2.

The solution to Eq. (3) is given by

  \alpha_1 = \frac{S_{tt} S_y - S_t S_{ty}}{\Delta},  \alpha_2 = \frac{S S_{ty} - S_t S_y}{\Delta}

The Correlation Coefficient: How Good is Our Model Function?

All we need now is an estimate of how good our linear fit is. Reference [1] has a significantly expanded discussion of determining how good your linear regression model is.

We will consider the case where \sigma_i \sim \sigma for all i. This is frequently not a restrictive assumption, since the sources of error in measuring the data that lead to \sigma_i are frequently the same for all measurements.

We calculate what is called the correlation coefficient, R^2, which is the ratio of the model sum of squares to the total sum of squares:

  R^2 = \frac{\sum_{i=1}^{N} (f(t_i) - \bar{y})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2},  \bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i.

Thus, if R^2 \sim 1 the model is a good representation of the data, and the data are representative of a linear function. If R^2 \sim 0 the data are essentially random, and a linear function cannot represent the data well.

The results of this analysis for a particular data set (contained in the corresponding Mathematica file) are shown in Fig. 1. The data set is given for completeness in the Appendix.

Figure 1: Linear fit, f(t) = 4.74964 + 1.04711t, to a data set. For this fit, the correlation coefficient was found to be R^2 = 0.96, which indicates that the data do represent a linear function, and the linear function we found represents the data well.

Extrapolation

Once we have found the linear model of the data, what is it good for? Many times, what is of interest is the slope of the curve, or the y-intercept. Or, the model can be used for extrapolation.
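As a concrete check on the closed-form solution, the sums S, S_t, S_y, S_tt, S_ty and the resulting \alpha_1, \alpha_2, and R^2 can be computed directly. This is a minimal sketch; the data below are illustrative only, not the data set from the Appendix, and \sigma_i = 1 is assumed since no per-point errors are given:

```python
import numpy as np

# Illustrative, roughly linear data (NOT the Appendix data set)
t = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0])
y = np.array([4.5, 7.1, 8.8, 11.4, 13.0, 15.6, 17.2, 19.1, 21.5, 23.6, 25.9])
sigma = np.ones_like(t)  # no associated errors, so set sigma_i = 1

# The weighted sums defined in the text
w = 1.0 / sigma**2
S, St, Sy = w.sum(), (w * t).sum(), (w * y).sum()
Stt, Sty = (w * t**2).sum(), (w * t * y).sum()
Delta = S * Stt - St**2

# Solution of the 2x2 system, Eq. (3)
alpha1 = (Stt * Sy - St * Sty) / Delta  # intercept
alpha2 = (S * Sty - St * Sy) / Delta    # slope

# Correlation coefficient: model sum of squares over total sum of squares
f = alpha1 + alpha2 * t
ybar = y.mean()
R2 = ((f - ybar)**2).sum() / ((y - ybar)**2).sum()
```

With \sigma_i = 1 this reduces to ordinary least squares, so the result can be cross-checked against `np.polyfit(t, y, 1)`.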
In any case, we would like an estimate of the standard deviation of the model from the data. We can get an estimate by computing

  \sigma = \sqrt{ \frac{1}{N-2} \sum_{i=1}^{N} \left( y_i - f(t_i) \right)^2 }    (4)

If our data is normally distributed about the model function (which it may very well not be!), we would expect measurements to be within \pm\sigma of the model function 68% of the time, and within \pm 2\sigma 95% of the time. Figure 2 shows this result.

Figure 2: An example of the number of data points contained within \pm\sigma (left) and \pm 2\sigma (right) of the model function, with \sigma = 1.15 from Eq. (4). We get 60% within \pm\sigma and 95% within \pm 2\sigma.

We can use this \sigma to estimate the error in extrapolation. Since we are assuming the model function is accurately representing the
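The standard-deviation estimate and the \pm\sigma, \pm 2\sigma coverage counts of Fig. 2 can be sketched as follows. Again the data are illustrative, not the Appendix data set, and the N - 2 in the denominator accounts for the two fitted parameters:

```python
import numpy as np

# Illustrative data and its least-squares line (NOT the Appendix data set)
t = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0])
y = np.array([4.5, 7.1, 8.8, 11.4, 13.0, 15.6, 17.2, 19.1, 21.5, 23.6, 25.9])
alpha2, alpha1 = np.polyfit(t, y, 1)  # slope, intercept
f = alpha1 + alpha2 * t

# Estimate of the standard deviation of the model from the data
N = len(t)
sigma = np.sqrt(((y - f)**2).sum() / (N - 2))

# Fraction of data points within +/- sigma and +/- 2 sigma of the line
resid = np.abs(y - f)
frac_1sigma = (resid <= sigma).mean()      # ~68% expected if errors are normal
frac_2sigma = (resid <= 2 * sigma).mean()  # ~95% expected
```

For small N these fractions can differ noticeably from 68% and 95%, just as the 60% count in Fig. 2 does.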


U of M MATH 4452 - LECTURE NOTES
