
Component Analysis and Discriminants

Contents: Component Analysis vs Discriminants; PCA: Linear Projections of Data; Projection to a Lower Dimensional Space; Linear Projection; Principal Components Analysis; Data Matrix; Scatter Matrix; Projection; Variance along Projection; Optimization Problem; Characteristic Equation; First Principal Component; Other Principal Components; Alternative Derivation of PCA; Projection onto k Eigenvectors; Example of PCA; PCA Using the Covariance Matrix; Text Retrieval Application of PCA; Latent Semantic Indexing (LSI); LSI Method; Singular Value Decomposition; LSI Method: First Two Principal Components of the Document-Term Matrix; LSI Practical Issues

Component Analysis and Discriminants
- Reducing dimensionality when:
  1. Classes are disregarded: Principal Component Analysis (PCA)
  2. Classes are considered: Discriminant Analysis
- [Figure: two-class data in the (x1, x2) plane]

Component Analysis vs Discriminants
- Two classical approaches for finding effective linear transformations:
  - PCA (Principal Component Analysis): the projection that best represents the data in a least-squares sense
  - MDA (Multiple Discriminant Analysis): the projection that best separates the data in a least-squares sense

PCA: Linear Projections of Data
- Excessive dimensionality x = [x1, x2, ..., xd] causes:
  - computational difficulty
  - visualization issues
- Solution: combine features to reduce dimensionality
- Linear combinations, e.g., 2x1 + 3x2 + x3, are simple to compute and tractable
- Project high-dimensional data onto a lower-dimensional space

Projection to a Lower Dimensional Space
- Allow the computer to search for interesting directions
- [Figure: in the (x1, x2) plane, the direction defined by x1 - x2 = 0 gives good separation; the direction x1 + x2 = 0 gives poor separation]

Linear Projection
- Equation of a line through the origin: x1 + 2x2 + 4x3 = 0, i.e.
  [1 2 4] [x1 x2 x3]^T = x1 + 2x2 + 4x3 = 0,
  which can be written as w^T x = 0 where w = [1 2 4]^T
- The projection of a point x along a direction with projection weights a is
  a^T x = \sum_{j=1}^{d} a_j x_j
- Example: the direction defined by x1 - x2 = 0, or a = [1 -1]^T
  - Projection of (1, 6) along [1 -1]: 1 - 6 = -5
  - Projection of (1, 1) along [1 -1]: 1 - 1 = 0
  - Projection of (1, -1) along [1 -1]: 1 + 1 = 2

Principal Components Analysis
- We will look at the data matrix, the scatter matrix and the covariance matrix to arrive at the best projection of the data

Data Matrix
- Let X be an n x d data matrix of n samples on d variables
- Each row of the matrix is of the form x(i)^T, where x(i) is a d x 1 column vector
- Assume X is mean-centered, so that the mean of each variable has been subtracted from that variable

Scatter Matrix
- S_t = \sum_{k=1}^{n} (x_k - m)(x_k - m)^T, a d x d matrix since (x_k - m) is d x 1 and (x_k - m)^T is 1 x d
- The scatter matrix is (n - 1) times the covariance matrix
- Relationship between the data matrix and the scatter matrix: S_t = X^T X is the d x d scatter matrix of the data, since X has zero mean

Projection
- Let a be a d x 1 column vector of projection weights that results in the largest variance when the data matrix X is projected along a
- The projection of any data vector x is the linear combination a^T x = \sum_{j=1}^{d} a_j x_j
- The projected values of all data vectors in the data matrix X onto a can be expressed as Xa, an n x 1 column vector

Variance along Projection
- The variance along a is
  \sigma_a^2 = (Xa)^T (Xa) = a^T X^T X a = a^T S_t a,
  where S_t = X^T X is the d x d covariance matrix of the data, since X has zero mean
- Thus the variance is a function of both a and S
- Maximizing the variance along a is not well-defined, since we can increase it without limit by increasing the size of the components of a

Optimization Problem
- Impose a normalization constraint on the a vectors: a^T a = 1
- The optimization problem is then to maximize
  u = a^T S a - \lambda (a^T a - 1),
  where the first term is the variance criterion, the second is the normalization criterion, and \lambda is a Lagrange multiplier
- Solution: differentiating with respect to a yields
  \partial u / \partial a = 2 S a - 2 \lambda a = 0,
  which reduces to (S - \lambda I) a = 0, the characteristic equation of S
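The steps above (mean-center X, form S = X^T X, solve (S - \lambda I)a = 0 under a^T a = 1) can be illustrated with a short NumPy sketch. This is a minimal illustration, not part of the original slides; the toy data values and variable names are invented, and eigh is used because S is real and symmetric.

```python
import numpy as np

# Toy data matrix: n = 5 samples, d = 3 variables (illustrative values only)
X = np.array([[2.5, 2.4, 0.5],
              [0.5, 0.7, 1.1],
              [2.2, 2.9, 0.4],
              [1.9, 2.2, 0.9],
              [3.1, 3.0, 0.2]])

# Mean-center each variable, as assumed for the data matrix X
X = X - X.mean(axis=0)

# Scatter matrix S = X^T X (d x d); dividing by n - 1 would give the covariance matrix
S = X.T @ X

# Solve the characteristic equation (S - lambda I) a = 0.
# eigh handles symmetric matrices and returns eigenvalues in ascending order,
# so reverse both eigenvalues and eigenvectors to get the largest first.
evals, evecs = np.linalg.eigh(S)
evals, evecs = evals[::-1], evecs[:, ::-1]

# First principal component: unit eigenvector with the largest eigenvalue
a1 = evecs[:, 0]

# Projection of all samples onto the first principal component: Xa (n x 1)
z1 = X @ a1

# The maximized variance a^T S a equals the largest eigenvalue,
# and so does the (unnormalized) variance of the projected values
print(np.allclose(a1 @ S @ a1, evals[0]))  # True
print(np.allclose(z1 @ z1, evals[0]))      # True
```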
Characteristic Equation
- Given a d x d matrix M, a very important class of linear equations has the form
  M x = \lambda x   (M is d x d, x is d x 1),
  which can be rewritten as (M - \lambda I) x = 0
- If M is real and symmetric, there are d possible solution vectors (vectors x that satisfy the characteristic equation), called eigenvectors e_1, ..., e_d, with associated eigenvalues \lambda_1, ..., \lambda_d

First Principal Component
- Here the matrix M is the covariance matrix S, and the characteristic equation is (S - \lambda I) a = 0
- The roots are the eigenvalues; the corresponding eigenvectors are the principal components
- The first principal component is the eigenvector associated with the largest eigenvalue of S

Other Principal Components
- The second principal component lies in the direction orthogonal to the first and has the second largest eigenvalue, and so on
- [Figure: first and second principal components of data in the (x1, x2) plane]

Alternative Derivation of PCA
- The DHS text gives another way to derive the fact that the eigenvectors of the scatter matrix are the principal components
- Setting up the squared-error criterion function of the data, the vector e that minimizes it satisfies the characteristic equation
- This can be generalized from a one-dimensional projection to a projection of dimension d', whose directions are the eigenvectors of S forming the principal components

Projection onto k Eigenvectors
- The variance of the data projected onto the first k eigenvectors is \sum_{j=1}^{k} \lambda_j
- The squared error in approximating the true data matrix X using only the first k eigenvectors is
  \sum_{j=k+1}^{p} \lambda_j / \sum_{l=1}^{p} \lambda_l
- Usually 5 to 10 principal components capture 90% of the variance in the data

Example of PCA
- Scree plot: the amount of variance explained by each consecutive eigenvalue
- CPU data (8 variables); eigenvalues of the correlation matrix: 63.26, 10.70, 10.30, 6.68, 5.23, 2.18, 1.31, 0.34
- Weights put by the first component on the eight variables: 0.199, -0.365, -0.399, -0.336, -0.331, -0.298, -0.421, -0.423
- [Figures: scree plot and scatterplot matrix for the CPU data]

PCA Using the Covariance Matrix
- Proportions of variation attributable to the different components: 96.02, 3.93, 0.04, 0.01, 0, 0, 0, 0
- [Figures: scree plots of (a) eigenvalues from the correlation matrix and (b) eigenvalues from the covariance matrix]

Graphical Use of PCA
- First two principal components of six-dimensional data (17 pills; the variables are the times at which a specified proportion of the pill has dissolved: 10%, 30%, 50%, 70%, 75%, 90%)
- Pill 3 is very different
- [Figure: projection onto the first two principal components, Component 1 vs Component 2]

Text Retrieval Application of PCA
- Retrieving documents based on the vector space model (key words)
- Vector space representation of documents
- A query is represented as a vector; the distance is the cosine of the angle between the query and the document
- The data matrix is represented in terms of principal components using the Singular Value Decomposition
- Document-term matrix:

        database  SQL  index  regression  likelihood  linear
  D1       24      21     9        0           0         3
  D2       32      10     5        0           3         0
  D3       12      16     5        0           0         0
  D4        6       7     2        0           0         0
  D5       43      31    20        0           3         0
  D6        2       0     0       18           7        16
  D7        0       0     1       32          12         0
  D8        3       0     0       22           4         2
  D9        1       0     0       34          27        25
  D10       6       0     0       17           4        23

Latent Semantic Indexing (LSI)
- Disadvantage of exclusive ...
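To make the SVD-based retrieval idea above concrete, here is a minimal sketch. It is not the course's exact LSI procedure; it assumes NumPy, and the query vector, the two-component truncation, and the helper cosine are illustrative choices, not from the slides.

```python
import numpy as np

# Document-term matrix from the slides: rows D1..D10,
# columns: database, SQL, index, regression, likelihood, linear
M = np.array([
    [24, 21,  9,  0,  0,  3],
    [32, 10,  5,  0,  3,  0],
    [12, 16,  5,  0,  0,  0],
    [ 6,  7,  2,  0,  0,  0],
    [43, 31, 20,  0,  3,  0],
    [ 2,  0,  0, 18,  7, 16],
    [ 0,  0,  1, 32, 12,  0],
    [ 3,  0,  0, 22,  4,  2],
    [ 1,  0,  0, 34, 27, 25],
    [ 6,  0,  0, 17,  4, 23],
], dtype=float)

# Singular value decomposition M = U diag(s) Vt
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Keep the first two components (the first two principal directions)
k = 2
docs_2d = U[:, :k] * s[:k]      # document coordinates in the reduced space

# A query mentioning only "SQL" and "index" (illustrative, not from the slides)
query = np.array([0, 1, 1, 0, 0, 0], dtype=float)
query_2d = query @ Vt[:k].T     # fold the query into the same 2-D space

def cosine(u, v):
    """Cosine of the angle between two vectors, the measure the slides
    use to compare a query with a document."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Rank documents by cosine similarity to the query in the reduced space;
# the database/SQL group (D1..D5) should rank highest for this query
scores = [cosine(query_2d, d) for d in docs_2d]
print(sorted(range(1, 11), key=lambda i: -scores[i - 1]))
```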

