Math 110 - Fall 05 - Lecture notes # 20 - Oct 14 (Friday)

We will cover the contents of Chapter 3 slightly differently than the
textbook, showing how to compute
   range(A), rank(A), nullspace(A), nullity(A),
   A^{-1} (if it exists),
   solve Ax=b (determine if there are no solutions, 1 solution,
               or infinitely many solutions)
all by computing a single "matrix factorization" of A, called the
"LU factorization" or "LU decomposition", which is a formula for A
as a product of much simpler matrices. The algorithm for computing
this factorization is just Gaussian elimination, where we keep track
of all the elementary matrix operations we do:
   reordering rows and/or columns
   adding a multiple of one row to another
   multiplying a row or column by a scalar
as a product of simple matrices. When you use computer software to
do matrix operations, it will return your answer as a product of
simpler matrices much like the ones we show here.

LU decomposition plays an important role in government high technology
policy: the largest computers in the world (called "supercomputers") are
regularly measured by how fast they can solve Ax=b using LU decomposition
on very large matrices, with the ratings appearing on the web site
www.top500.org. This web site keeps track of the 500 fastest computers
anywhere in the world (at least those not at secret government sites).
For example, as of June 2005, the top computer was an IBM Blue Gene/L,
consisting of 65,536 parallel processors. The machine solved
a linear system of n = 1,277,951 equations using the LU decomposition
algorithm described below at a speed of 136.8 Teraflops, that is,
136.8 trillion = 136.8 x 10^12 arithmetic operations per second.
Since the number of arithmetic operations required by the algorithm
will turn out to be about (2/3)*n^3, the time required to solve this
single linear system was
   (2/3)*(1,277,951)^3 / 136.8x10^12 = 10161 seconds ~ 2.8 hours
Recall that multiplying 2 matrices takes about 2*n^3 operations,
3 times as many operations as the LU decomposition.

A few years ago the fastest machine on the Top 500 list was the
NEC "Earth Simulator", which ran at a speed of 35.9 Teraflops, as fast
as the next four machines combined. This caught the attention of parts
of the US government responsible for national security, because NEC is
a Japanese company, and one national security goal of the US government
is to have better technology, including faster computers, than anyone
else in the world. This is because these computers are used for breaking
secret codes, designing weapons, and other national security purposes.
Indeed, the largest computers in the US have been built for these
purposes for decades. The speed of the NEC machine led to a flurry of
government studies and reports to analyze the situation, for example
www.sc.doe.gov/ascr.FOSCfinalreport.pdf. More aspects of implementing
linear algebra algorithms on parallel machines are discussed in the
course CS267, see www.cs.berkeley.edu/~demmel/cs267_Spr05.

One of the most widely used software packages for solving linear algebra
problems is called LAPACK (its parallel version is ScaLAPACK); for example
it is used within Matlab. This software was produced in a joint project
between UC Berkeley, the University of Tennessee, and other groups.
We have recently been funded to produce a new version. See
www.netlib.org/lapack for the current software.
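As a quick check of the timing arithmetic above, here is a minimal
sketch in Python (my own illustration, not part of the original notes):

   n = 1277951          # matrix dimension from the June 2005 Top500 run
   rate = 136.8e12      # 136.8 Teraflops = 136.8 x 10^12 operations/second
   ops = (2.0/3.0) * n**3       # LU decomposition costs about (2/3)*n^3
   print(ops / rate)            # about 10^4 seconds
   print(ops / rate / 3600)     # about 2.8 hours, as claimed above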
Ex: Suppose L is square and lower triangular:
   L = [ 1  0  0 ]
       [ 1  2  0 ]
       [ 3 -1  3 ]
Then we can solve L*x = b for x given any b by "substitution":
   Solve 1*x_1 = b_1 for x_1
   Solve 1*x_1 + 2*x_2 = b_2 for x_2
   Solve 3*x_1 - 1*x_2 + 3*x_3 = b_3 for x_3
As long as each L_ii is nonzero, so we can divide by it,
we can solve any L*x = b for x and get a unique answer.
In particular, the only solution of L*x = 0 is x = 0, so that
L is one-to-one. This implies that L is invertible.
We can compute L^{-1} by solving for it one column at a time:
Let X = L^{-1}, so L*X = I, and e_j = j-th standard basis
vector, so L*X*e_j = I*e_j, or
   L*(X*e_j) = e_j   or   L*(column j of X) = e_j
so we can solve for X = L^{-1} one column at a time.

Ex: Suppose U is square, upper triangular, and each U_ii is nonzero.
Then we can similarly solve any U*x = b for a unique x given b by
substitution, solving first for x_n, then x_{n-1}, and so on.
We can also compute U^{-1} as above.

Ex: Suppose A = L*U, where L and U are n by n and nonsingular.
Then A is invertible (why?), and
   A^{-1} = (L*U)^{-1} = U^{-1} * L^{-1}
and solving A*x = b for x given b means computing
   x = A^{-1}*b
     = (U^{-1} * L^{-1})*b
     = U^{-1} * (L^{-1}*b)   ... by associativity
     = U^{-1} * y            ... where we compute y = L^{-1}*b,
                                 i.e. solve L*y = b by substitution
where we finally compute x = U^{-1}*y, i.e. solve U*x = y by substitution
again. In other words, if we know A = L*U where L and U are invertible
triangular matrices, we can solve A*x = b by 2 steps:
   (1) solve L*y = b by substitution
   (2) solve U*x = y by substitution
If we want to compute A^{-1} we use the same idea as above,
solving A*X = I for X one column of X at a time.
(Code sketches of substitution and of this two-step solve appear
at the end of these notes.)

Stop&Ask: Suppose A = U*L where U and L are n by n and invertible.
How do we solve A*x = b?

It may seem strange to expect to know A in the form A = L*U, but
this is in fact exactly what Gaussian elimination applied to A computes.

Ex: A = [  3  -1   2 ]
        [  6   0  10 ]
        [ -3  -3 -17 ]
To apply Gaussian elimination, we do the following:
   for i = 1 to n-1 (n-1 = 3-1 = 2 in this case)
      subtract multiples of row i from rows i+1 through n,
      in order to zero out A below the diagonal
Subtract 2 * row 1 from row 2
Subtract -1 * row 1 from row 3,
   yielding A' = [ 3 -1 2 ; 0 2 6 ; 0 -4 -15 ]
Subtract -2 * row 2 from row 3,
   yielding A'' = [ 3 -1 2 ; 0 2 6 ; 0 0 -3 ]
Note that A'' = U is upper triangular. Let's take the multipliers
we computed and put them into a lower triangular matrix L:
   L = [  1  0  0 ]
       [  2  1  0 ]
       [ -1 -2  1 ]
Then we can confirm that L*U = A (see the elimination sketch at the
end of these notes). We will see that this is true in general, as
long as we never get a 0 on the diagonal.

ASK & WAIT: What happens if A_11 = 0? What happens at any point if some
diagonal entry = 0?

If A is m-by-n, we will only get a zero on the diagonal
when rank(A) < min(m,n), as we will see. First we will see what happens
assuming this is not the case, and then show the most general case.
We will use induction. To do so, we need some basic facts
about "block matrices", which are a useful notation to
keep track of what parts of the matrix we're working on:

Def. Let A be m by n, where m = m1 + m2 and n = n1 + n2.
Then we can write A as a "block matrix"
   A = [ A_11  A_12 ]
       [ A_21  A_22 ]
where A_11 and A_21 have n1 columns, A_12 and A_22 have n2 columns,
A_11 and A_12 have m1 rows, and A_21 and A_22 have m2 rows.
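To make the block notation concrete, here is a minimal sketch in Python
with numpy (my own illustration, not part of the original notes). It
partitions the 3-by-3 matrix from the elimination example above, with
m1 = n1 = 2 and m2 = n2 = 1, and checks that the blocks tile A:

   import numpy as np

   A = np.array([[ 3., -1.,   2.],
                 [ 6.,  0.,  10.],
                 [-3., -3., -17.]])
   m1 = n1 = 2                            # block sizes, so m2 = n2 = 1
   A11, A12 = A[:m1, :n1], A[:m1, n1:]    # top blocks: m1 rows each
   A21, A22 = A[m1:, :n1], A[m1:, n1:]    # bottom blocks: m2 rows each
   assert np.array_equal(np.block([[A11, A12], [A21, A22]]), A)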
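Next, a sketch of the two substitution algorithms described above,
continuing the same Python session; the function names are my own.
Forward substitution solves L*y = b from the top row down, and back
substitution solves U*x = y from the bottom row up:

   def forward_substitution(L, b):
       # Solve L*y = b where L is lower triangular with nonzero diagonal.
       n = L.shape[0]
       y = np.zeros(n)
       for i in range(n):
           # y_i = (b_i - sum_{j<i} L_ij * y_j) / L_ii
           y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
       return y

   def back_substitution(U, y):
       # Solve U*x = y where U is upper triangular with nonzero diagonal.
       n = U.shape[0]
       x = np.zeros(n)
       for i in range(n - 1, -1, -1):
           # x_i = (y_i - sum_{j>i} U_ij * x_j) / U_ii
           x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
       return x

Note that each substitution touches each matrix entry once, about n^2
operations, much cheaper than the (2/3)*n^3 of the factorization itself.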
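With those routines, the two-step solve of A*x = b when A = L*U, and the
column-at-a-time computation of X = A^{-1} by solving A*(column j of X)
= e_j, look like this (again a sketch building on the functions above):

   def lu_solve(L, U, b):
       # Solve A*x = b where A = L*U:
       y = forward_substitution(L, b)     # step (1): solve L*y = b
       return back_substitution(U, y)     # step (2): solve U*x = y

   def lu_inverse(L, U):
       # Compute A^{-1} one column at a time.
       n = L.shape[0]
       X = np.zeros((n, n))
       for j in range(n):
           e_j = np.zeros(n)
           e_j[j] = 1.0                   # j-th standard basis vector
           X[:, j] = lu_solve(L, U, e_j)  # column j of X solves A*x = e_j
       return X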
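Finally, a sketch of the elimination loop itself, saving the multipliers
in a lower triangular L exactly as in the 3-by-3 example above. It
assumes we never get a 0 on the diagonal, the case discussed above:

   def lu_factor(A):
       # Gaussian elimination without row reordering: returns L, U with
       # A = L*U, assuming no zero ever appears on the diagonal of U.
       n = A.shape[0]
       U = np.array(A, dtype=float)          # will become upper triangular
       L = np.eye(n)                         # multipliers go below its diagonal
       for i in range(n - 1):                # for i = 1 to n-1
           for k in range(i + 1, n):
               L[k, i] = U[k, i] / U[i, i]   # multiplier for row k
               U[k, :] -= L[k, i] * U[i, :]  # subtract multiple of row i
       return L, U

   A = np.array([[3., -1., 2.], [6., 0., 10.], [-3., -3., -17.]])
   L, U = lu_factor(A)
   assert np.allclose(L @ U, A)              # confirms L*U = A, as above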