UCLA STAT 231 - Dimension Reduction

Lecture note for Stat 231: Pattern Recognition and Machine Learning
Lecture 4-7: Dimension Reduction

One common way to cope with sparse, high-dimensional data is to assume smooth density functions over the empty regions of the feature space. The other is to reduce the dimension of the feature space, for example by projecting each feature vector into a lower-dimensional space.

Dimension reduction techniques

Common techniques for dimension reduction:
1. Principal component analysis (PCA)
2. Fisher linear discriminant analysis
3. Independent component analysis (ICA)
4. Multi-dimensional scaling (MDS)
5. Over-complete bases coding
6. Transformed component analysis (TCA)

PCA

Principal component analysis (PCA), also called the Karhunen-Loeve transform in functional space, is widely used for dimension reduction. In vision it became popular through the eigenface example. There are many ways to derive PCA; here we study it from the perspective of dimension reduction.

Given: n samples {x_1, x_2, ..., x_n} in d-dimensional space.

Objective: project them into a d' < d dimensional space, that is, approximate each vector x_k by

    x_k \approx m + \sum_{i=1}^{d'} a_{ki} e_i

Criterion: minimize the sum of squared errors

    J(m, a, e) = \sum_{k=1}^{n} \Big\| \Big( m + \sum_{i=1}^{d'} a_{ki} e_i \Big) - x_k \Big\|^2

The result of minimizing this error is:
- m is the sample mean;
- e_i is the eigenvector of the covariance matrix with the i-th largest eigenvalue;
- a_{ki} is the projection of x_k onto e_i.

The book derives this in three separate steps; as the result is so well known, we do not unfold the details here. A small numerical sketch of the recipe appears after the PCA discussion below.

Example on face representation

400 face images, each labeled with 122 points.

Eigen-vectors for Geometry and Photometry

(Figures in the slides: the leading eigenvectors of the landmark geometry and of the face photometry.)

Problem with PCA

PCA reduces the dimension of each class individually. The resulting components represent the data within each class well, but they may not be good for discrimination. For example, suppose two classes have 2D Gaussian-like densities, represented by two ellipses. The classes are well separable, but if we project the data onto the first principal component (i.e., from 2D to 1D), they become inseparable (with very low Chernoff information); the best projection is along the short axis. This is a typical issue in contrasting generative and discriminative models: generative models aim at representing the data faithfully, while discriminative models target telling objects apart.
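
To make the PCA recipe above concrete, here is a minimal NumPy sketch. The synthetic data, the sample count n = 200, the dimension d = 5, and the choice d' = 2 are illustrative assumptions, not values from the lecture.

    import numpy as np

    # Toy data: n = 200 samples in d = 5 dimensions (illustrative values only).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

    d_prime = 2                                  # target dimension d' < d

    # m is the sample mean.
    m = X.mean(axis=0)
    Xc = X - m

    # e_i are the eigenvectors of the covariance matrix with the largest eigenvalues.
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d_prime]
    E = eigvecs[:, order]                        # d x d' basis {e_1, ..., e_d'}

    # a_ki is the projection of x_k onto e_i.
    A = Xc @ E                                   # n x d' coefficients

    # Reconstruction x_k ~ m + sum_i a_ki e_i, and the squared-error criterion J.
    X_hat = m + A @ E.T
    J = np.sum((X_hat - X) ** 2)
    print("sum of squared errors J:", J)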
Fisher linear discriminant

Instead of asking which projection best represents the data, the Fisher linear discriminant asks which direction w best separates two classes. Projecting every sample onto w gives y = w^T x, so the projected class means \tilde{m}_1, \tilde{m}_2 and scatters \tilde{s}_1^2, \tilde{s}_2^2 are 1D variables. The Fisher criterion maximizes the between-class separation relative to the within-class scatter,

    J(w) = (\tilde{m}_1 - \tilde{m}_2)^2 / (\tilde{s}_1^2 + \tilde{s}_2^2).

This is a typical criterion used in almost all discriminative methods. Its maximizer is

    w \propto S_W^{-1} (m_1 - m_2),

where S_W is the within-class scatter matrix and m_1, m_2 are the class means in the original d-dimensional space.

Observation: in the Bayes decision study before, for two Gaussian classes with equal covariance matrices the decision boundary is a straight line whose normal is the Fisher linear discriminant direction.

Multiple discriminant analysis

For c classes we compute c - 1 discriminants, that is, we project the d-dimensional features into a (c - 1)-dimensional space. The Fisher linear discriminant is the special case c = 2. For example, with c = 3 the two discriminants span a 2D plane; in the slides' figure, the left projection separates the three classes better than the right. The columns of the projection matrix are the leading generalized eigenvectors of S_B w = \lambda S_W w, where S_B and S_W are the between-class and within-class scatter matrices. Numerical sketches of the two-class and multi-class cases follow at the end of these notes.

Examples in
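
Returning to the Fisher linear discriminant: below is a minimal two-class sketch in NumPy. The synthetic Gaussian data (chosen to mimic the two-ellipse example) and all variable names are illustrative assumptions, not part of the lecture.

    import numpy as np

    rng = np.random.default_rng(1)

    # Two elongated 2D Gaussian-like classes (illustrative data).
    cov = [[4.0, 1.5], [1.5, 1.0]]
    X1 = rng.multivariate_normal([0.0, 0.0], cov, size=150)
    X2 = rng.multivariate_normal([1.0, 3.0], cov, size=150)

    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)

    # Within-class scatter S_W = S_1 + S_2.
    S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)

    # Fisher direction: w proportional to S_W^{-1} (m_1 - m_2).
    w = np.linalg.solve(S_W, m1 - m2)
    w /= np.linalg.norm(w)

    # Project to 1D; the projected means and scatters are the 1D variables
    # entering the Fisher criterion J(w).
    y1, y2 = X1 @ w, X2 @ w
    s1_sq = np.sum((y1 - y1.mean()) ** 2)
    s2_sq = np.sum((y2 - y2.mean()) ** 2)
    J = (y1.mean() - y2.mean()) ** 2 / (s1_sq + s2_sq)
    print("Fisher direction:", w, " criterion J(w):", J)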


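For multiple discriminant analysis, a corresponding sketch under the same caveats: with c = 3 toy Gaussian classes in d = 4 dimensions, the features are projected onto the c - 1 = 2 leading generalized eigenvectors of S_B w = lambda S_W w. SciPy is assumed available for the generalized symmetric eigensolver.

    import numpy as np
    from scipy.linalg import eigh    # generalized symmetric eigensolver

    rng = np.random.default_rng(2)

    # c = 3 toy Gaussian classes in d = 4 dimensions (illustrative assumption).
    means = [np.zeros(4),
             np.array([3.0, 1.0, 0.0, 0.0]),
             np.array([0.0, 3.0, 1.0, 0.0])]
    classes = [rng.normal(size=(100, 4)) + mu for mu in means]

    m_all = np.vstack(classes).mean(axis=0)
    S_W = np.zeros((4, 4))
    S_B = np.zeros((4, 4))
    for Xi in classes:
        mi = Xi.mean(axis=0)
        S_W += (Xi - mi).T @ (Xi - mi)                     # within-class scatter
        S_B += len(Xi) * np.outer(mi - m_all, mi - m_all)  # between-class scatter

    # Solve S_B w = lambda S_W w; keep the c - 1 = 2 leading eigenvectors.
    eigvals, eigvecs = eigh(S_B, S_W)
    W = eigvecs[:, np.argsort(eigvals)[::-1][:2]]          # d x (c - 1) projection

    Y = np.vstack(classes) @ W    # features in the 2D discriminant space
    print("projection matrix W:\n", W)
    print("projected data shape:", Y.shape)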