UMD CMSC 351 - Lecture 26: Chain Matrix Multiplication - D2311954


School name University of Maryland, College Park

Course Cmsc 351- Algorithms

Lecture Notes CMSC 251

Lecture 26: Chain Matrix Multiplication
(Thursday, April 30, 1998)

Read: Section 16.1 of CLR.

Chain Matrix Multiplication: This problem involves the question of determining the optimal sequence for performing a series of operations. This general class of problem is important in compiler design for code optimization and in databases for query optimization. We will study the problem in a very restricted instance, where the dynamic programming issues are easiest to see.

Suppose that we wish to multiply a series of matrices

$$A_1 A_2 \cdots A_n.$$

Matrix multiplication is an associative but not a commutative operation. This means that we are free to parenthesize the above multiplication however we like, but we are not free to rearrange the order of the matrices. Also recall that when two (nonsquare) matrices are being multiplied, there are restrictions on the dimensions. A $p \times q$ matrix has $p$ rows and $q$ columns. You can multiply a $p \times q$ matrix $A$ times a $q \times r$ matrix $B$, and the result will be a $p \times r$ matrix $C$. (The number of columns of $A$ must equal the number of rows of $B$.) In particular, for $1 \le i \le p$ and $1 \le j \le r$,

$$C[i, j] = \sum_{k=1}^{q} A[i, k]\, B[k, j].$$

Observe that there are $pr$ total entries in $C$ and each takes $O(q)$ time to compute, thus the total time (e.g. number of multiplications) to multiply these two matrices is $p \cdot q \cdot r$.

Figure 33: Matrix Multiplication. (A $p \times q$ matrix $A$ times a $q \times r$ matrix $B$ gives a $p \times r$ matrix $C$; multiplication time = $pqr$.)

Note that although any legal parenthesization will lead to a valid result, not all involve the same number of operations. Consider the case of 3 matrices: let $A_1$ be $5 \times 4$, $A_2$ be $4 \times 6$ and $A_3$ be $6 \times 2$.

$$\mathrm{mult}[((A_1 A_2) A_3)] = (5 \cdot 4 \cdot 6) + (5 \cdot 6 \cdot 2) = 180,$$
$$\mathrm{mult}[(A_1 (A_2 A_3))] = (4 \cdot 6 \cdot 2) + (5 \cdot 4 \cdot 2) = 88.$$

Even for this small example, considerable savings can be achieved by reordering the evaluation sequence.
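The operation counts in the three-matrix example can be checked by actually multiplying small matrices and counting scalar multiplications. The following is a minimal sketch, not part of the notes: the helper names `multiply_counting` and `ones` are hypothetical, and the all-ones matrices are placeholder data chosen only because the dimensions are what matter here.

```python
def multiply_counting(A, B):
    """Multiply a p x q matrix A by a q x r matrix B (lists of rows).

    Returns (C, count), where count is the number of scalar
    multiplications performed; it always equals p * q * r.
    """
    p, q, r = len(A), len(B), len(B[0])
    # The number of columns of A must equal the number of rows of B.
    assert all(len(row) == q for row in A), "dimension mismatch"
    C = [[0] * r for _ in range(p)]
    count = 0
    for i in range(p):
        for j in range(r):
            for k in range(q):
                C[i][j] += A[i][k] * B[k][j]
                count += 1
    return C, count


def ones(rows, cols):
    """Hypothetical helper: a rows x cols matrix filled with 1s."""
    return [[1] * cols for _ in range(rows)]


# Dimensions from the example: A1 is 5x4, A2 is 4x6, A3 is 6x2.
A1, A2, A3 = ones(5, 4), ones(4, 6), ones(6, 2)

# ((A1 A2) A3)
T, c1 = multiply_counting(A1, A2)
left, c2 = multiply_counting(T, A3)

# (A1 (A2 A3))
U, c3 = multiply_counting(A2, A3)
right, c4 = multiply_counting(A1, U)

print(c1 + c2, c3 + c4)  # 180 88
print(left == right)     # True: associativity gives the same product
```

Both orders produce the same $5 \times 2$ result, but the second does so with fewer than half the multiplications.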
The Chain Matrix Multiplication problem is: Given a sequence of matrices $A_1, A_2, \ldots, A_n$ and dimensions $p_0, p_1, \ldots, p_n$ where $A_i$ is of dimension $p_{i-1} \times p_i$, determine the multiplication sequence that minimizes the number of operations.

Important Note: This algorithm does not perform the multiplications, it just figures out the best order in which to perform the multiplications.

Naive Algorithm: We could write a procedure which tries all possible parenthesizations. Unfortunately, the number of ways of parenthesizing an expression is very large. If you have just one item, then there is only one way to parenthesize. If you have $n$ items, then there are $n - 1$ places where you could break the list with the outermost pair of parentheses, namely just after the 1st item, just after the 2nd item, etc., and just after the $(n - 1)$st item. When we split just after the $k$th item, we create two sublists to be parenthesized, one with $k$ items, and the other with $n - k$ items. Then we could consider all the ways of parenthesizing these. Since these are independent choices, if there are $L$ ways to parenthesize the left sublist and $R$ ways to parenthesize the right sublist, then the total is $L \cdot R$. This suggests the following recurrence for $P(n)$, the number of different ways of parenthesizing $n$ items:

$$P(n) = \begin{cases} 1 & \text{if } n = 1, \\ \sum_{k=1}^{n-1} P(k)\, P(n - k) & \text{if } n \ge 2. \end{cases}$$

This is related to a famous function in combinatorics called the Catalan numbers (which in turn is related to the number of different binary trees on $n$ nodes). In particular $P(n) = C(n - 1)$ and

$$C(n) = \frac{1}{n + 1} \binom{2n}{n}.$$

Applying Stirling’s formula, we find that $C(n) \in \Omega(4^n / n^{3/2})$. Since $4^n$ is exponential and $n^{3/2}$ is just polynomial, the exponential will dominate, and this grows very fast. Thus, this will not be practical except for very small $n$.

Dynamic Programming Solution: This problem, like other dynamic programming problems, involves determining a structure (in this case, a parenthesization).
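As a quick check on the count from the Naive Algorithm discussion, the recurrence for $P(n)$ can be evaluated directly and compared with the closed form $C(n - 1)$. This is a small sketch with hypothetical function names; the `lru_cache` memoization is only there so the recurrence itself evaluates quickly and is not part of the notes.

```python
from functools import lru_cache
from math import comb  # available in Python 3.8+


@lru_cache(maxsize=None)
def num_parenthesizations(n):
    """P(n): number of ways to fully parenthesize a chain of n items."""
    if n == 1:
        return 1
    # Split just after the k-th item: k items on the left, n - k on the right.
    return sum(num_parenthesizations(k) * num_parenthesizations(n - k)
               for k in range(1, n))


def catalan(n):
    """C(n) = (1 / (n + 1)) * binomial(2n, n), computed exactly in integers."""
    return comb(2 * n, n) // (n + 1)


for n in range(1, 11):
    # The recurrence and the closed form agree: P(n) = C(n - 1).
    assert num_parenthesizations(n) == catalan(n - 1)
    print(n, num_parenthesizations(n))
```

Already at $n = 10$ there are 4862 parenthesizations, which illustrates why trying them all is impractical beyond very small chains.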
We want to break the problem into subproblems, whose solutions can be combined to solve the global problem.

For convenience we can write $A_{i..j}$ to be the product of matrices $i$ through $j$. It is easy to see that $A_{i..j}$ is a $p_{i-1} \times p_j$ matrix. In parenthesizing the expression, we can consider the highest level of parenthesization. At this level we are simply multiplying two matrices together. That is, for any $k$, $1 \le k \le n - 1$,

$$A_{1..n} = A_{1..k}\, A_{k+1..n}.$$

Thus the problem of determining the optimal sequence of multiplications is broken up into 2 questions: how do we decide where to split the chain (what is $k$?) and how do we parenthesize the subchains $A_{1..k}$ and $A_{k+1..n}$? The subchain problems can be solved by recursively applying the same scheme. The former problem can be solved by just considering all possible values of $k$. Notice that this problem satisfies the principle of optimality, because if we want to find the optimal sequence for multiplying $A_{1..n}$ we must use the optimal sequences for $A_{1..k}$ and $A_{k+1..n}$. In other words, the subproblems must be solved optimally for the global problem to be solved optimally.

We will store the solutions to the subproblems in a table, and build the table in a bottom-up manner. For $1 \le i \le j \le n$, let $m[i, j]$ denote the minimum number of multiplications needed to compute $A_{i..j}$. The optimum cost can be described by the following recursive definition. As a basis observe that if $i = j$ then the sequence contains only one matrix, and so the cost is 0. (There is nothing to multiply.) Thus, $m[i, i] = 0$. If $i < j$, then we are asking about the product $A_{i..j}$. This can be split by considering each $k$, $i \le k < j$, as $A_{i..k}$ times $A_{k+1..j}$.

The optimum time to compute $A_{i..k}$ is $m[i, k]$, and the optimum time to compute $A_{k+1..j}$ is $m[k+1, j]$. We may assume that these values have been computed previously and stored in our array. Since $A_{i..k}$ is a $p_{i-1} \times p_k$ matrix, and $A_{k+1..j}$ is a $p_k \times p_j$ matrix, the time to multiply them is $p_{i-1} \cdot p_k \cdot p_j$.
This suggests the following recursive rule for computing $m[i, j]$.

$$m[i, i] = 0$$
$$m[i, j] = \min_{i \le k < j} \left( m[i, k] + m[k+1, j] + p_{i-1}\, p_k\, p_j \right) \quad \text{for } i < j.$$

It is not hard to convert this rule into a procedure, which is given below. The only tricky part is arranging the order in which to compute the values. In the process of computing $m[i, j]$ we will need to access values $m[i, k]$ and $m[k+1, j]$ for $k$ lying between $i$ and $j$. This suggests that we should organize our computation according to the number of matrices in the subchain. Let $L = j - i + 1$ denote the length of the subchain being multiplied. The subchains of length 1 ($m[i, i]$) are trivial. Then we build up by computing the subchains of lengths $2, 3, \ldots, n$. The final answer is $m[1, n]$. We need to be a little careful
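The recursive rule for $m[i, j]$ and the fill order by subchain length described above can be sketched as follows. This is one possible rendering under stated assumptions, not the procedure from the notes (which is not shown in this excerpt). The function names, the 1-indexed list-of-lists layout, and in particular the split table `s` with the `parenthesize` helper are additions for illustration; `s` simply records which $k$ achieved the minimum so the order can be read back.

```python
def matrix_chain_order(p):
    """Bottom-up evaluation of the rule for m[i, j].

    p is the list of dimensions p[0..n]; matrix A_i is p[i-1] x p[i].
    Returns (m, s), both indexed from 1:
      m[i][j] = minimum number of multiplications to compute A_{i..j}
      s[i][j] = a split point k achieving that minimum (assumed helper table)
    """
    n = len(p) - 1
    # Row/column 0 are unused padding so indices match the text.
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][i] = 0 already
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):             # L = 2, 3, ..., n
        for i in range(1, n - length + 2):
            j = i + length - 1
            best = None
            for k in range(i, j):              # i <= k < j
                # m[i][k] and m[k+1][j] are shorter subchains, already filled.
                cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if best is None or cost < best:
                    best = cost
                    s[i][j] = k
            m[i][j] = best
    return m, s


def parenthesize(s, i, j):
    """Read the chosen multiplication order for A_{i..j} back from s."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({parenthesize(s, i, k)}{parenthesize(s, k + 1, j)})"


# The three-matrix example: A1 is 5x4, A2 is 4x6, A3 is 6x2.
m, s = matrix_chain_order([5, 4, 6, 2])
print(m[1][3])                # 88
print(parenthesize(s, 1, 3))  # (A1(A2A3))
```

The three nested loops give $O(n^3)$ time and the tables take $O(n^2)$ space, in contrast to the exponential number of parenthesizations a naive search would examine.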
