Berkeley STAT 135 - The geometry of least squares

School name University of California, Berkeley

Course Stat 135- Concepts of Statistics

Pages 7

The geometry of least squares

We can think of a vector as a point in space, where the elements of the vector are the coordinates of the point. Consider, for example, the following vectors:

t = (−4, 0), u = (5, 0), v = (4, 4), w = (−2, 2)

Each of these vectors has two elements. If we regard the first element of a vector, say v, as the x-coordinate and the second element as the y-coordinate, each vector can be represented by a point in a two-dimensional space, as shown in the picture below.

This idea can be extended to vectors with three or more elements, but we need a space of three or more dimensions to do so. Our inability to visualize a multidimensional space, however, should not prevent us from thinking of any n-element vector as a point in an n-dimensional space.

A vector can also be viewed geometrically, as a directed line segment with an arrow starting from the origin and ending at the point representing the vector in the space. Draw this geometric representation in the plot below.

Again, this way of viewing vectors can be extended to vectors with three or more elements, as we can always imagine a line segment starting at the origin of an n-dimensional space and ending at the point representing the vector in the space.

This geometric representation of vectors raises two questions:

What are the length and direction of the line segment representing a vector?
What is the angle between two line segments representing two vectors?

We now have three ways of viewing a vector: algebraically, graphically, and geometrically. Let's measure the length of the line segment representing the vector v.
Using the Pythagorean theorem, the length of the vector v, that is, the line from the origin to the point v in the figure above, is:

$\|v\| = \sqrt{4^2 + 4^2} = \sqrt{32} = 4\sqrt{2}$

You can verify that the lengths of the vectors w and t are $2\sqrt{2}$ and 4, respectively.

More generally, for any n-dimensional vector v = (v1, v2, ..., vn), the length of v, denoted by ||v||, is defined as

$\|v\| = \sqrt{v'v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$

The length of a vector is also referred to as the norm of the vector. Note that the length of the geometric vector can be interpreted as the magnitude of the corresponding algebraic vector. Thus the larger the elements of the vector in absolute value, the greater its length and the larger the magnitude of the vector.

Now let's measure the angle between any two geometric vectors. Consider the angle θ between the two vectors u and v in the diagram below. Its cosine is

$\cos(\theta) = \frac{u'v}{\|u\|\,\|v\|}$

The numerator is the inner product of the two vectors. Thus, the cosine of the angle between two vectors is the ratio of their inner product to the product of their lengths (norms). For example, let's measure the angle between the vectors u and v from above.

$u'v = 5 \cdot 4 + 0 \cdot 4 = 20, \quad \|u\| = 5, \quad \|v\| = 4\sqrt{2}$

And hence

$\cos(\theta) = \frac{u'v}{\|u\|\,\|v\|} = \frac{20}{5 \cdot 4\sqrt{2}} = \frac{1}{\sqrt{2}}$

so θ is 45 degrees.

The angle between two vectors is directly related to the concepts of linear dependence and independence. Two nonzero vectors u and v are linearly dependent if and only if one vector can be written as a scalar multiple of the other. When this happens, the cosine of the angle between these two vectors is either +1 or −1. This implies that the angle is either 0 or 180 degrees. We may then conclude that when two vectors are linearly dependent, the angle between them is either 0 or 180 degrees, which means that the two line segments representing the vectors have the same (or opposite) direction. The converse of the statement is also true.

Two vectors u and v of the same order are perpendicular if and only if $u'v = 0$. We also say that these two vectors are orthogonal.
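
The length, angle, and orthogonality statements above can be checked numerically. The following is a minimal sketch in plain Python using the four example vectors; the helper names `inner`, `norm`, and `angle_degrees` are illustrative choices and do not come from the notes.

```python
import math

def inner(a, b):
    """Inner product a'b of two vectors with the same number of elements."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    """Length of a vector: ||a|| = sqrt(a'a)."""
    return math.sqrt(inner(a, a))

def angle_degrees(a, b):
    """Angle between two nonzero vectors, from cos(theta) = a'b / (||a|| ||b||)."""
    cos_theta = inner(a, b) / (norm(a) * norm(b))
    # Clamp to [-1, 1] so floating-point round-off cannot break acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

# The example vectors from the text.
t, u, v, w = (-4, 0), (5, 0), (4, 4), (-2, 2)

print(norm(v), norm(w), norm(t))  # lengths of v, w and t
print(angle_degrees(u, v))        # angle between u and v
print(angle_degrees(t, u))        # t = -0.8 * u, so t and u are linearly dependent
print(inner(v, w))                # an inner product of 0 means v and w are orthogonal
```

In this sketch t and u point in opposite directions, so the angle between them comes out as 180 degrees, while v and w have inner product 0 and are therefore orthogonal.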
Note that two linearly independent vectors may or may not be orthogonal. For example, u and v above are linearly independent, yet the angle between them is 45 degrees.

Now take a look at what happens graphically when we add or subtract two vectors, or when we multiply a vector by a scalar. Suppose you wish to add the vectors v and w. Algebraically, we have

$v + w = (4, 4) + (-2, 2) = (2, 6)$

Draw the geometric picture below. Notice that the line segment representing the sum of the two vectors is the diagonal of the parallelogram having the line segments representing the two vectors as adjacent edges. This is known as the parallelogram rule, and is always true for the sum of any two vectors.

What about the difference between two vectors? The distance between the vectors v and w is the same as the norm (length) of the vector v − w. It is sometimes useful to consider the difference between two vectors v and w as the sum of v and −w.

Multiplying a vector by a scalar amounts to multiplying each of its elements by the scalar. Draw a picture to convince yourself of this.

A vector of length one is called a normal vector (also known as a unit vector). Any non-zero vector can be normalized by dividing each of its elements by its norm. This transformation is called normalization.

The normalized versions of two vectors of the same direction are equal.

The elements of the normal vector can be interpreted as direction cosines. Simply stated, they are the cosines of the angles between the vector and each of the coordinate axes. Draw a picture to convince yourself of this.

Connection to Fitting by Least Squares

Consider observations (x1, y1), . . . , (xn, yn). Re-express these observations as two n-dimensional vectors, y = (y1, . . . , yn) and x = (x1, . . . , xn). Let $\mathbf{1}$ denote the vector in n-dimensional space of all 1's, i.e. $\mathbf{1} = (1, 1, \ldots, 1)$.

The span of x is all those vectors of the form cx, where c is a scalar. The span of x and $\mathbf{1}$ is the collection of vectors that can be expressed as $ax + b\mathbf{1}$.

Show the following:

1. $\sum_{i=1}^{n} (y_i - c)^2 = \|\,\_\_\_\_\,\|^2$
2. So minimizing the quantity above with respect to c is the same as finding the ____ in the linear span of ____.
3. Show that $\mathbf{1}$ is orthogonal to $y - \bar{y}\mathbf{1}$. Use a picture to verify this fact.
4. Use this fact to show that $\|y - c\mathbf{1}\|^2 = \|y - \bar{y}\mathbf{1}\|^2 + \|(\bar{y} - c)\mathbf{1}\|^2$
5. Next establish that $\bar{y}$ is the minimizer of $\sum_{i=1}^{n} (y_i - c)^2$, and that $\bar{y}\mathbf{1}$ is the closest point in the linear span of $\mathbf{1}$ to y.
6. Show that in general the closest vector to y in the linear span of a vector v is the projection $P_v y = \frac{y'v}{\|v\|^2} v$
7. Now consider the least squares fit from this geometric perspective: $\sum_{i=1}^{n} [y_i - (a + b x_i)]^2 = \|\,\_\_\_\_\,\|^2$. Minimizing with respect to a and b is equivalent to projecting ____ onto the linear span of ____ and ____, or finding the closest vector to ____ in the linear span of ____.
8. Show that the linear span of $\mathbf{1}$ and x is the same as the linear span of $\mathbf{1}$ and $x - \bar{x}\mathbf{1}$.
9. Explain why $P_{\mathbf{1},x}\, y = P_{\mathbf{1}}\, y + P_{x - \bar{x}\mathbf{1}}\, y$.
10. Show that $P_{\mathbf{1}}\, y = \bar{y}\mathbf{1}$ and $P_{x - \bar{x}\mathbf{1}}\, y = \frac{y'(x - \bar{x}\mathbf{1})}{\|x - \bar{x}\mathbf{1}\|^2} (x - \bar{x}\mathbf{1})$
11. Show that $\hat{b}$ obtained from
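
Several of the identities in this list can be checked numerically before proving them. The sketch below is a minimal illustration in plain Python, assuming a small made-up data set. The values of `x`, `y`, and `c`, and the helper names `inner` and `project`, are hypothetical and not part of the notes. It checks the orthogonality in step 3, the decomposition in step 4, the projection formula in steps 6 and 10, and the sum of projections in step 9.

```python
def inner(a, b):
    """Inner product a'b of two vectors with the same number of elements."""
    return sum(ai * bi for ai, bi in zip(a, b))

def project(y, v):
    """Projection of y onto the span of a nonzero vector v: (y'v / ||v||^2) v."""
    coef = inner(y, v) / inner(v, v)
    return [coef * vi for vi in v]

# Hypothetical observations, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(y)
ones = [1.0] * n

x_bar = sum(x) / n
y_bar = sum(y) / n
x_centered = [xi - x_bar for xi in x]   # the vector x - x_bar * 1
resid_mean = [yi - y_bar for yi in y]   # the vector y - y_bar * 1

# Step 3: the vector of ones is orthogonal to y - y_bar * 1.
print(inner(ones, resid_mean))

# Step 4: ||y - c1||^2 = ||y - y_bar 1||^2 + ||(y_bar - c) 1||^2 for an arbitrary c.
c = 1.5
lhs = sum((yi - c) ** 2 for yi in y)
rhs = inner(resid_mean, resid_mean) + n * (y_bar - c) ** 2
print(lhs, rhs)

# Step 10: projecting y onto the span of the ones vector gives y_bar in every element.
p_one = project(y, ones)

# Step 9: the ones vector and x - x_bar * 1 are orthogonal, so the projection
# onto their joint span is the sum of the two separate projections.
p_xc = project(y, x_centered)
fitted = [p1 + p2 for p1, p2 in zip(p_one, p_xc)]

# Coefficient on x - x_bar * 1 in the projection, and the implied intercept on x.
slope = inner(y, x_centered) / inner(x_centered, x_centered)
intercept = y_bar - slope * x_bar
print(slope, intercept)

# The residual of the fit is orthogonal to both the ones vector and x.
resid = [yi - fi for yi, fi in zip(y, fitted)]
print(inner(resid, ones), inner(resid, x))
```

A numerical check like this is not a proof, but it can catch a sign or index slip while working through the exercises.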
