Literature Survey: Non-Negative Matrix Factorization

Joel A. Tropp

Institute for Computational Engineering and Sciences, 1 University Station, C0200, The University of Texas at Austin, Austin, TX 78712
E-mail address: [email protected]

Abstract. This article surveys recent research on Non-Negative Matrix Factorization (NNMF), a relatively new technique for dimensionality reduction. It is based on the idea that in many data-processing tasks, negative numbers are physically meaningless. The NNMF technique addresses this problem by placing non-negativity constraints on the data model. I discuss the applications of NNMF, the algorithms, and the qualitative results. Since many of the algorithms proposed for NNMF seem to lack a firm theoretical foundation, this article also surveys techniques for proving that iterative algorithms converge. It concludes with a description of additional investigations which are presently underway.

1. Introduction

A basic technique in dimensionality reduction is Principal Component Analysis (PCA), which calculates a set of basis vectors that can be used to approximate high-dimensional data optimally in the least-squares sense. The number of basis vectors is much smaller than the number of dimensions, so encoding the data as linear combinations of the basis vectors transforms it to a lower-dimensional space. This reduction can be used to improve the tractability of data analysis algorithms, to discover features in the data, or to find hidden variables. See Jolliffe [1] for a survey.

One major problem with PCA is that the basis vectors have both positive and negative components, and the data are represented as linear combinations of these vectors with positive and negative coefficients. The optimality of PCA can be traced back to this cancelation of signs. In many applications, the negative components contradict physical realities. For example, the pixels in a grayscale image have non-negative intensities, so an image with negative intensities cannot be reasonably interpreted.

To address this philosophical problem, several researchers suggested that the search for a representative basis should be confined to non-negative vectors. Formally, this idea can be interpreted as decomposing a non-negative matrix A into two non-negative factors V and H, i.e.,

\[
\min_{V \ge 0,\; H \ge 0} \|A - VH\|_F^2. \tag{1.1}
\]

Hence the paradigm is called Non-Negative Matrix Factorization (NNMF). We assume that V has far fewer columns than A has rows, so the approximation can succeed only if it discovers latent structure in the data. Although NNMF is computationally intensive, the last ten years have seen several efforts at implementing it for various purposes.

2. Non-Negative Matrix Factorization

In this section, we shall summarize the development of NNMF in several different research communities. This survey is by no means comprehensive, but it addresses the most important work.

2.1. Work of Paatero. The idea of non-negative matrix factorization can be traced back to a 1994 paper of Paatero and Tapper [2]. Their goal was to perform factor analysis on environmental data, a problem that involves finding a small number of root causes that explain a large set of measurements. Each factor is a positive combination of some elementary variables. In real circumstances, a factor is present (in which case it has a certain positive effect) or it is absent (in which case it has no effect). Therefore, it often makes sense to constrain the factors and their influences to be non-negative.

The problem can be posed formally. Suppose that the columns of A are the measurements, the columns of V are the factors, and the rows of H are the influences of each factor (called scores). Use W to denote the weight associated with each element, which indicates the level of confidence in that measurement. Paatero and Tapper advocate optimizing the functional

\[
\|W \cdot (A - VH)\|_F^2 \quad \text{subject to } V \ge 0 \text{ and } H \ge 0.
\]

Here, \(\cdot\) denotes the Hadamard (also known as the Schur or elementwise) product, and the inequalities are also elementwise.

Paatero and Tapper originally proposed using a constrained alternating least squares (ALS) algorithm to solve the problem [2]. This method fixes V and solves the optimization with respect to H. Then it reverses the roles of the variables and repeats the process ad infinitum. The algorithm is initialized with different random matrices in an effort to obtain a global optimum.
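To make the alternating scheme concrete, here is a minimal Python sketch of a constrained ALS iteration of this kind. It is an illustration rather than Paatero and Tapper's implementation: the function name weighted_als_nnmf, the use of scipy.optimize.nnls as the inner non-negative least-squares solver, and the fixed iteration count are all assumptions made for this sketch.

```python
import numpy as np
from scipy.optimize import nnls

def weighted_als_nnmf(A, W, r, n_iter=50, seed=0):
    """Approximately minimize ||W . (A - VH)||_F^2 over V, H >= 0.

    A : (m, n) non-negative data matrix (columns are measurements)
    W : (m, n) positive elementwise weights (confidence levels)
    r : number of factors, i.e. columns of V
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    V = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(n_iter):
        # Fix V and solve a weighted non-negative least-squares
        # problem for each column of H.
        for j in range(n):
            D = W[:, j, None] * V              # rows scaled by weights
            H[:, j], _ = nnls(D, W[:, j] * A[:, j])
        # Reverse the roles: fix H and solve for each row of V.
        for i in range(m):
            D = W[i, :, None] * H.T
            V[i, :], _ = nnls(D, W[i, :] * A[i, :])
    return V, H
```

With all weights equal to one, the same loop addresses the plain objective (1.1). As noted above, such an iteration only finds a local optimum, which is why the method is restarted from several random initializations.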
Paatero subsequently invented several other algorithms for attacking the optimization. His second algorithm, PMF2, may be viewed as a version of the alternating method with a plethora of ad hoc modifications, which complicate the implementation enormously [4]. Later, Paatero presented a more general method, called the Multilinear Engine, for finding multi-factor models with non-negativity constraints [5]. In these models, the approximant VH is replaced by a longer product of matrices. The program uses a modified conjugate gradient algorithm to solve the optimization problem.

By now, Paatero's community has generated an extensive body of work on non-negative factor analysis, which is impossible to review here. These papers share several shortcomings from the present point of view. First, they concentrate on a specific application of NNMF, so it is not clear how relevant they are to the wider body of applications. Second, the algorithms they use are frequently Byzantine, which precludes straightforward modification for other domains. Third, they do not seem to prove that the algorithms converge or to address the properties of the solutions which their algorithms yield. Indeed, it appears more or less impossible to ascertain the theoretical complexity or convergence of their methods. They make claims based primarily on empirical evidence.

2.2. Conic and Convex Coding. Independent of Paatero, Lee and Seung introduced the concept of NNMF in a 1997 paper on unsupervised learning [6]. They begin by considering the following encoding problem. Suppose that the columns of V are fixed feature vectors and that a is an input vector to be encoded. The goal is to minimize the reconstruction error

\[
\min_{h} \|a - Vh\|_2^2.
\]

Different learning techniques can be obtained from various constraints on the vector h. PCA corresponds to an unconstrained minimization, while Vector Quantization (VQ) requires that h equal one of the canonical basis vectors (i.e., a single unit component with the remaining entries zero).
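The effect of these constraint sets is easy to see numerically. The following sketch (my own illustration, not code from Lee and Seung [6]) encodes a single input vector against a fixed feature matrix V in three ways: unconstrained least squares for the PCA-style encoding, non-negative least squares for the constraint h >= 0 that the NNMF model imposes, and a nearest-column search for VQ.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
V = rng.random((8, 3))   # fixed non-negative feature vectors (columns)
a = rng.random(8)        # input vector to encode

# Unconstrained least squares: the PCA-style encoding.
h_pca, *_ = np.linalg.lstsq(V, a, rcond=None)

# Non-negative least squares: the encoding with h >= 0.
h_nn, _ = nnls(V, a)

# Vector quantization: h must equal a canonical basis vector e_k,
# so the encoding simply selects the closest feature vector.
k = np.argmin(np.linalg.norm(V - a[:, None], axis=0))
h_vq = np.eye(3)[k]

for name, h in [("PCA", h_pca), ("NNMF", h_nn), ("VQ", h_vq)]:
    print(f"{name:>4}: h = {np.round(h, 3)}, "
          f"error = {np.linalg.norm(a - V @ h):.3f}")
```

Since the VQ feasible set sits inside the non-negative orthant, which in turn sits inside all of R^r, the three reconstruction errors are always ordered: unconstrained <= non-negative <= VQ.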