HARVARD MATH 19B - Lecture 20



Math 19b: Linear Algebra with Probability
Oliver Knill, Spring 2011

Lecture 20: More data fitting

Last time, we saw how the geometric formula $P = A(A^T A)^{-1} A^T$ for the projection onto the image of a matrix $A$ allows us to fit data. Given a fitting problem, we write it as a system of linear equations

$$Ax = b.$$

While this system is not solvable in general, we can look for the point on the image of $A$ which is closest to $b$. This is the "best possible choice" of a solution and is called the least square solution:

The vector $x = (A^T A)^{-1} A^T b$ is the least square solution of the system $Ax = b$.

The most popular example of a data fitting problem is linear regression. Here we have data points $(x_i, y_i)$ and want to find the best line $y = ax + b$ which fits these data. But data fitting can be done with any finite set of functions. Data fitting can be done in higher dimensions too. We can for example look for the best surface fit through a given set of points $(x_i, y_i, z_i)$ in space. Also here, we find the least square solution of the corresponding system $Ax = b$, which is obtained by assuming all points to be on the surface.

1 Which paraboloid $ax^2 + by^2 = z$ best fits the data

     x    y    z
     0    1    2
    -1    0    4
     1   -1    3

In other words, find the least square solution for the system of equations for the unknowns $a, b$ which aims to have all data points on the paraboloid.

Solution: We have to find the least square solution to the system of equations

$$\begin{aligned}
a \cdot 0 + b \cdot 1 &= 2 \\
a \cdot 1 + b \cdot 0 &= 4 \\
a \cdot 1 + b \cdot 1 &= 3.
\end{aligned}$$

In matrix form this can be written as $A\vec{x} = \vec{b}$ with

$$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{bmatrix}, \qquad \vec{b} = \begin{bmatrix} 2 \\ 4 \\ 3 \end{bmatrix}.$$

We have

$$A^T A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \quad \text{and} \quad A^T \vec{b} = \begin{bmatrix} 7 \\ 5 \end{bmatrix}.$$

We get the least square solution with the formula

$$\vec{x} = (A^T A)^{-1} A^T \vec{b} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}.$$

The best fit is the function $f(x, y) = 3x^2 + y^2$, which produces an elliptic paraboloid.

2 A graphic from the Harvard Management Company Endowment Report of October 2010 is shown to the left. Assume we want to fit the growth using the functions $1, x, x^2$ and assume the years are numbered starting with 1990. What is the best parabola $a + bx + cx^2 = y$ which fits these data?

    quinquennium    endowment in billions
    1                5
    2                7
    3               18
    4               25
    5               27

We solved this example in class with linear regression, where we saw the best linear fit. With a quadratic fit, we get the system $A\vec{x} = \vec{b}$ with

$$A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \end{bmatrix}, \qquad \vec{b} = \begin{bmatrix} 5 \\ 7 \\ 18 \\ 25 \\ 27 \end{bmatrix}.$$

The solution vector is

$$\vec{x} = \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} -21/5 \\ 277/35 \\ -2/7 \end{bmatrix},$$

which indicates strong linear growth but some slowdown.

3 Here is a problem on data analysis from a website. We collect some data from users, but not everybody fills in all the data:

    Person 1    3  5  -  3  9  -  -  -  2  9
    Person 2    4  -  -  8  -  5  6  2  -  9
    Person 3    -  4  2  5  7  -  1  9  8  -
    Person 4    1  -  -  -  -  -  -  -  -  -

It is difficult to do statistics with this. One possibility is to filter out all data from people who do not fulfill a minimal requirement. Person 4, for example, did not do the survey seriously enough. We would throw this data away. Now, one could sort the data according to some important row. After that, one could fit the data with a function $f(x, y)$ of two variables. This function could be used to fill in the missing data. After that, we would go and seek correlations between different rows.

Whenever doing data reduction like this, one must always compare different scenarios and investigate how much the outcome changes when changing the data.

The left picture shows a linear fit of the above data. The second picture shows a fit with cubic functions.

Homework due March 23, 2011

1 Here is an example of a fitting problem where the solution is not unique:

    x    y
    0    1
    0    2
    0    3

Write down the corresponding fitting problem for linear functions $f(x) = ax + b = y$. What is going wrong?

2 If we fit data with a polynomial of the form $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots + a_7 x^7$, how many data points $(x_1, y_1), \dots, (x_m, y_m)$ do you expect to fit exactly if the points $x_1, x_2, \dots, x_m$ are all different?

3 The first 6 prime numbers 2, 3, 5, 7, 11 define the data points (1, 2), (2, 3), (3, 5), (5, 7), (6, 11) in the plane. Find the best parabola of the form $y = ax^2 + c$ which fits these
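
The least square solutions of Examples 1 and 2 in the lecture can be checked numerically. The following is a minimal sketch, not part of the original handout: it assumes we solve the normal equations $(A^T A)x = A^T b$ by Gauss-Jordan elimination with exact fractions, and the helper name `least_squares` and the variables `A1`, `b1`, `A2`, `b2` are our own choices for illustration.

```python
from fractions import Fraction


def least_squares(A, b):
    """Solve the normal equations (A^T A) x = A^T b exactly (hypothetical helper)."""
    m, n = len(A), len(A[0])
    # Build the augmented matrix [A^T A | A^T b] with exact rational entries.
    M = [
        [sum(Fraction(A[k][i]) * A[k][j] for k in range(m)) for j in range(n)]
        + [sum(Fraction(A[k][i]) * b[k] for k in range(m))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination; a missing pivot means A^T A is singular.
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("A^T A is not invertible: no unique least square solution")
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    # After elimination the last column holds the solution vector.
    return [row[n] for row in M]


# Example 1: paraboloid a*x^2 + b*y^2 = z, rows are (x^2, y^2).
A1 = [[0, 1], [1, 0], [1, 1]]
b1 = [2, 4, 3]
print([str(v) for v in least_squares(A1, b1)])  # ['3', '1']

# Example 2: parabola a + b*x + c*x^2 = y for x = 1, ..., 5.
A2 = [[1, x, x * x] for x in range(1, 6)]
b2 = [5, 7, 18, 25, 27]
print([str(v) for v in least_squares(A2, b2)])  # ['-21/5', '277/35', '-2/7']
```

Exact fractions are used here so that the output can be compared directly with the values $3, 1$ and $-21/5, 277/35, -2/7$ stated in the lecture. For larger problems a floating point routine from a numerical library would be the more usual choice.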

