Wright CS 707 - Latent Semantic Indexing

Text preview:

Latent Semantic Indexing
Adapted from lectures by Prabhakar Raghavan, Christopher Manning, and Thomas Hofmann.

Contents
Today's topic · Linear Algebra Background · Eigenvalues & Eigenvectors · Matrix-vector multiplication · Example · Eigen/diagonal Decomposition · Diagonal decomposition: why/how · Diagonal decomposition: example · Example continued · Symmetric Eigen Decomposition · Exercise · Time out! · Singular Value Decomposition · SVD example · Low-rank Approximation · Reduced SVD · Approximation error · SVD Low-rank approximation · Latent Semantic Indexing via the SVD · What it is · Vector Space Model: Pros · Problems with Lexical Semantics · Polysemy and Context · Latent Semantic Indexing (LSI) · Goals of LSI · Latent Semantic Analysis · Performing the maps · Empirical evidence · Failure modes · But why is this clustering? · Intuition from block matrices · Simplistic picture · Some wild extrapolation · LSI has many other applications · Reduced Model (K = 2) · LSI, SVD, & Eigenvectors · Computing Similarity in LSI

Today's topic
Latent Semantic Indexing.
- Term-document matrices are very large.
- But the number of topics that people talk about is small (in some sense): clothes, movies, politics, …
- Can we represent the term-document space by a lower-dimensional latent space?

Linear Algebra Background

Eigenvalues & Eigenvectors
For a square m × m matrix S, a non-zero vector v is a (right) eigenvector with eigenvalue λ if
  Sv = λv.
How many eigenvalues are there at most? Sv = λv has a non-zero solution v only if det(S − λI) = 0. This is an m-th order equation in λ, so it can have at most m distinct solutions (the roots of the characteristic polynomial), which can be complex even though S is real.

Matrix-vector multiplication: example
S = [30 0 0; 0 20 0; 0 0 1] has eigenvalues 30, 20, 1 with corresponding eigenvectors
  v1 = (1, 0, 0)ᵀ, v2 = (0, 1, 0)ᵀ, v3 = (0, 0, 1)ᵀ.
On each eigenvector, S acts as a multiple of the identity matrix, but as a different multiple on each.
Any vector (say x = (2, 4, 6)ᵀ) can be viewed as a combination of the eigenvectors: x = 2v1 + 4v2 + 6v3.

Matrix-vector multiplication
Thus a matrix-vector multiplication such as Sx (S, x as on the previous slide) can be rewritten in terms of the eigenvalues and eigenvectors:
  Sx = S(2v1 + 4v2 + 6v3)
     = 2Sv1 + 4Sv2 + 6Sv3
     = 2λ1v1 + 4λ2v2 + 6λ3v3
     = 60v1 + 80v2 + 6v3.
Even though x is an arbitrary vector, the action of S on x is determined by the eigenvalues and eigenvectors.

Matrix-vector multiplication (continued)
Observation: the effect of "small" eigenvalues is small. If we ignored the smallest eigenvalue (1), then instead of Sx = (60, 80, 6)ᵀ we would get (60, 80, 0)ᵀ. These two vectors are similar (in terms of cosine similarity), or close (in terms of Euclidean distance).
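To make the observation above concrete, here is a minimal numpy sketch (not from the slides; the variable names follow the running example) that expands Sx in the eigenbasis and compares the exact product with the version that drops the smallest eigenvalue:

```python
import numpy as np

# S and x from the running example above.
S = np.diag([30.0, 20.0, 1.0])
x = np.array([2.0, 4.0, 6.0])

eigvals, eigvecs = np.linalg.eig(S)   # here: 30, 20, 1 and the standard basis

Sx = S @ x                            # exact product: (60, 80, 6)

# Rebuild Sx from the eigen-expansion, ignoring the smallest eigenvalue (1).
keep = eigvals > 1.0                  # keeps 30 and 20; a choice made for this sketch
Sx_approx = sum(lam * (v @ x) * v
                for lam, v in zip(eigvals[keep], eigvecs[:, keep].T))

cosine = (Sx @ Sx_approx) / (np.linalg.norm(Sx) * np.linalg.norm(Sx_approx))
print(Sx)         # [60. 80.  6.]
print(Sx_approx)  # [60. 80.  0.]
print(cosine)     # ~0.998: "similar" in the sense of the slide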
Eigenvalues & Eigenvectors (continued)
For symmetric matrices, eigenvectors for distinct eigenvalues are orthogonal:
  if Sv1 = λ1v1 and Sv2 = λ2v2 with λ1 ≠ λ2, then v1 · v2 = 0.
All eigenvalues of a real symmetric matrix are real:
  if S = Sᵀ, every root λ of det(S − λI) = 0 is real.
All eigenvalues of a positive semi-definite matrix are non-negative:
  if wᵀSw ≥ 0 for all w ∈ ℝⁿ, then Sv = λv implies λ ≥ 0.

Example
Let S = [2 1; 1 2]. Real, symmetric.
Then det(S − λI) = det([2−λ 1; 1 2−λ]) = (2 − λ)² − 1 = 0.
The eigenvalues are 1 and 3 (non-negative, real). Plug in these values and solve for the eigenvectors.
The eigenvectors, (1, −1)ᵀ and (1, 1)ᵀ, are orthogonal (and real).

Eigen/diagonal Decomposition
Let S be a square matrix with m linearly independent eigenvectors (a "non-defective" matrix).
Theorem: there exists an eigen decomposition (cf. the matrix diagonalization theorem)
  S = UΛU⁻¹, with Λ diagonal.
The columns of U are eigenvectors of S, and the diagonal elements of Λ are the eigenvalues of S. The decomposition is unique for distinct eigenvalues.

Diagonal decomposition: why/how
Let U have the eigenvectors as columns: U = [v1 … vm]. Then SU can be written
  SU = S[v1 … vm] = [λ1v1 … λmvm] = [v1 … vm] diag(λ1, …, λm) = UΛ.
Thus SU = UΛ, i.e. U⁻¹SU = Λ, and S = UΛU⁻¹.

Diagonal decomposition: example
Recall S = [2 1; 1 2], with λ1 = 1 and λ2 = 3. The eigenvectors (1, −1)ᵀ and (1, 1)ᵀ form
  U = [1 1; −1 1].
Inverting, we have U⁻¹ = [1/2 −1/2; 1/2 1/2]. (Recall UU⁻¹ = I.) Then
  S = UΛU⁻¹ = [1 1; −1 1] [1 0; 0 3] [1/2 −1/2; 1/2 1/2].

Example continued
Let's divide U (and multiply U⁻¹) by √2:
  S = [1/√2 1/√2; −1/√2 1/√2] [1 0; 0 3] [1/√2 −1/√2; 1/√2 1/√2] = QΛQᵀ, where Q⁻¹ = Qᵀ.
Why? Stay tuned …

Symmetric Eigen Decomposition
If S is a symmetric matrix:
Theorem: there exists a (unique) eigen decomposition S = QΛQᵀ, where Q is orthogonal: Q⁻¹ = Qᵀ, the columns of Q are normalized, mutually orthogonal eigenvectors, and everything is real.

Exercise
Examine the symmetric eigen decomposition, if any, for each of the following matrices:
  [0 1; 1 0]   [0 −1; 1 0]   [1 2; 2 3]   [2 2; 2 4]

Time out!
"I came to this class to learn about text retrieval and mining, not to have my linear algebra past dredged up again …" But if you want to dredge, Strang's Applied Mathematics is a good place to start.
What do these matrices have to do with text? Recall our M × N term-document matrices … But everything so far needs square matrices, so …

Singular Value Decomposition
For an M × N matrix A of rank r there exists a factorization (Singular Value Decomposition = SVD):
  A = UΣVᵀ, where U is M × M, Σ is M × N, and V is N × N.
The columns of U are orthogonal eigenvectors of AAᵀ, and the columns of V are orthogonal eigenvectors of AᵀA.
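Picking the 2 × 2 symmetric example back up, here is a minimal numpy check of S = QΛQᵀ (a sketch, not part of the deck; `eigh` is numpy's eigensolver for symmetric matrices):

```python
import numpy as np

# The symmetric 2x2 example from the slides above.
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh handles symmetric matrices: real eigenvalues in ascending
# order, and orthonormal eigenvectors in the columns of Q.
lam, Q = np.linalg.eigh(S)

print(lam)                      # [1. 3.]
print(Q.T @ Q)                  # identity: Q is orthogonal, Q^-1 = Q^T
print(Q @ np.diag(lam) @ Q.T)   # reconstructs S = Q Lambda Q^T
```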

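The SVD statement above can be checked the same way. This sketch factors a small arbitrary matrix (the 4 × 3 shape and random entries are assumptions for illustration) and verifies both A = UΣVᵀ and that the columns of U are eigenvectors of AAᵀ, with eigenvalues equal to the squared singular values:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))     # an arbitrary M x N matrix, M=4, N=3

# full_matrices=True gives the M x M U and N x N V of the slide;
# numpy returns V^T directly and the singular values as a vector s.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Reconstruct A = U Sigma V^T by embedding s in an M x N Sigma.
Sigma = np.zeros((4, 3))
np.fill_diagonal(Sigma, s)
print(np.allclose(A, U @ Sigma @ Vt))                   # True

# Columns of U are eigenvectors of A A^T: (A A^T) u_i = s_i^2 u_i.
AAt = A @ A.T
print(np.allclose(AAt @ U[:, 0], s[0]**2 * U[:, 0]))    # True
```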

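Where the deck goes next (per the contents list): LSI keeps only the k largest singular values of the term-document matrix to get a low-rank approximation, and documents become k-dimensional vectors in the latent space. As a forward-looking sketch, assuming a made-up 5-term × 4-document count matrix and k = 2 (both assumptions of this sketch, not data from the slides):

```python
import numpy as np

# A made-up term-document count matrix: 5 terms x 4 documents.
A = np.array([[2.0, 1.0, 0.0, 0.0],   # "ship"
              [1.0, 2.0, 0.0, 0.0],   # "boat"
              [0.0, 0.0, 2.0, 1.0],   # "vote"
              [0.0, 0.0, 1.0, 2.0],   # "election"
              [1.0, 0.0, 1.0, 0.0]])  # "report"

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                   # keep only the 2 largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Documents as k-dimensional vectors in the latent space.
docs_k = np.diag(s[:k]) @ Vt[:k, :]
print(np.round(A_k, 2))
print(np.round(docs_k, 2))
```

In this toy example the first two documents (boat-themed) and the last two (election-themed) land on nearly the same latent directions, which previews the clustering intuition the rest of the deck develops.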