This preview shows page 1 out of 4 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Final Exam Review - Model vs Data December 15, 2006 Model vs. Data In most experiments we can control a set of independent variables x and can measure the value of the dependent variable y. For such a system we can propose a model which relates the value of the dependent variable to the values of the independent variables as shown in Equation(1) y = f(x; Θ) (1) The aim of the generating a model for the system is to obtain an answer to the following three questions For what value of Θ is the deviation between model and data minimum? • Is the model consistent with the data? • • What are the error bars on the values of parameters? In the context of models we classify models as linear or non-linear. Linear models depend on the parameters Θ linearly. For example the log of the rate of an arrhenius reaction is linear in the parameters log(A) and ER a . This model is linear even tough log(A) is not linearly dependent on the dependent variable temperature T . Ealog(k) = log(A) − (2) RT Usually in solving these problems we make the following two assumption 1. The dependent variable y, that is being measured is distributed nor-mally (a gaussian distribution) around its mean value. This distri-bution can be due to many factors which are not in control of the experimentalist. 2. On the other hand the independent variables x are known exactly. 1 Cite as: Sandeep Sharma, course materials for 10.34 Numerical Methods Applied to Chemical Engineering, Fall 2006. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].� � � � � � � � Best values of parameters Let us assume that the model that we have is correct and that the standard deviation of the random errors in the measurement of the dependent variable y is σ. Then the probability of getting a measured value yi is given by the Equation(3). p(yi) ∝ exp (yi − f(xi; Θ))2 (3) −2σ2 If we perform N different experiments then the probability of obtaining a vector y measurements of the dependent variables is given in Equation(4). N� (yi − f(xi; Θ))2 p(y) exp∝ −2σ2 i=1 N1 � ∝ exp −2σ2 (yi − f(xi; Θ))2 (4) i=1 The probability of getting this vector y becomes highest when sum of squares of errors become minimum. Thus the idea of minimizing the s um of squares of errors is based on the assumption that the errors in the mea-surement are normally distributed. Central limit theorem ensures that this assumption is justified if we assume that each measurement is obtained by performing many repeats. If the model is linear then we have an analytical solution for the best fit values of the parameters. If the model is non-linear in parameters then there can potentially be multiple local minima and we have to be careful. The linear model can be written as shown in Equation(5). y = X Θ (5) The solution to the model is given by the expression in Equation(6). Θ = (XT X)−1XT y (6) A potentially better method of solving minimization problem is by per-forming SVD decomposition of X (recall SVD of X gives three matrices U , Σ and V which are related to X as X = UΣV ). The best fit values of Θ are given in Equation(7). N� Ui.y Θ = Vi (7) σii=1 2 Cite as: Sandeep Sharma, course materials for 10.34 Numerical Methods Applied to Chemical Engineering, Fall 2006. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].� � �Model consistency Let σ be the variance in the measured data. Then the probability that we get a vector y of measured values is given by Equation(4). Now we define a parameter χ2 as � �2 χ2 Nyi − f(xi; Θ) = σi=1 The least square method of calculating the parameters is nothing but the same as minimizing the value of χ2 . Also χ2 is the a sum of N normally distributed quantities with mean 0 and variance 1. This χ2 is itself a random variable and is distributed as chi-square distribution with N −dim(θ) degrees of freedom. This chi-square distribution can be used to quantify the goodness of the fit. The probability of a model being correct is given by the area under the curve of a chi-square pdf between the abscissa χ2 and inf. Confidence intervals If we know the value of σ we can assume that y is distirubted normally around its mean value ˆy with a variance σ. We can then go ahead and calculate the approximate probability distribution functions of the parameters. From the probability distribution functions of the parameters we can calculate the 95% confidence intervals for the parameters. If the model is linear in parameters then one would expect that if the pdf of y was normal then the pdf of the parameters would also be normal. This is infact true and for more than one parameter we obtain a higher dimension gaussian. The covariance matrix for the parameters Θ; cov(Θ) = σ2[XT X | ]−1 . The 95% confidence intervals ΘM for a parameter θ is given in Equation(8). θi = θM,j ± Z2.5σ [XT X | ]−1 −1/2 (8) θM jj An interesting thing to note in the above equation is that the error bars on a parameter θi depends on the matrix X and thus by cleverly chosing our experimental conditions we can use a X that minimizes the error bars on the parameter of interest. When the model is not linear we can still use Equation(8) to calculate the error bars on the parameters, only in this case we can generate a linearised design matrix using Equation(9) and then use Equation(7) to calculate the error bars on the matrix. ∂f (xi, Θ)Xi,j = (9) ∂θj This is the same way that matlab function nlinfit and nlparci work. Note that by using this linearized design matrix we lose the information 3 Cite as: Sandeep Sharma, course materials for 10.34 Numerical Methods Applied to Chemical Engineering, Fall 2006. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].of the covariance between different parameters. The graphical way to look at any pair of parameters is to plot the χ2 value for a range of these two parameters. To convert the χ2 plot to a plot of probability we just calculate the value of Δχ2 = χ2(θ1, θ2) − χ2 and this value of Δχ2 is distributed min with a chi-square distribution of 2 degrees of freedom. An example of this was worked out


View Full Document

MIT 10 34 - Model vs Data

Documents in this Course
Load more
Download Model vs Data
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Model vs Data and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Model vs Data 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?