UT CS 395T - Visual Recognition - D2783384

School name University of Texas at Austin

Course Cs 395t- Multicore Operating Systems Implementation

Pages 5

Visual Recognition: Faces
Tuesday, January 23

Why faces?
• Natural applications in human-computer interfaces (teleconferencing, assistive technology), organizing personal photos, surveillance, …
• Well-studied category, special structure
• We'll touch on only a few general approaches

Faces
• Detection: given an image, where is the face?
• Recognition: whose face is it?
[Figure: example face, labeled "Ann". Image credit: H. Rowley]

Challenges
• Face pose
• Occlusions
• Illumination
• Variable components (glasses, mustache, etc.)
• Differences in expression

Approaches
• Subspaces
– e.g. Turk and Pentland, Belhumeur and Kriegman
• Shape and appearance models
– e.g. Cootes and Taylor, Blanz and Vetter
• Boosting
– e.g. Viola and Jones
• Neural networks
– e.g. Rowley et al.
• SVMs
– e.g. Heisele et al., Guo et al.
• HMMs
– e.g. Nefian et al.

Eigenpictures/Eigenfaces
• Sirovich and Kirby 1987: PCA to compress face images
• Turk and Pentland 1991: PCA + nearest neighbors to classify face images
• Main idea: face images are highly correlated; a low-dimensional subspace captures most appearance variation

Images as high-dimensional points
• Around d = 80,000 pixels each
• To represent the space accurately, want number of samples >> d
• But the space of face images is actually much smaller than the space of all 80,000-dimensional images

PCA intuition
• Construct a lower-dimensional linear subspace that best explains the variation of the training examples
[Figure: axes "Pixel value 1" and "Pixel value 2" with direction $u_1$; points labeled "A face image" and "A (non-face) image"]

PCA
• N data points: $x_1, \dots, x_N$, with $x_i \in \mathbb{R}^d$
• Mean vector $\mu$, covariance matrix $\Sigma$
• What unit vector $u \in \mathbb{R}^d$ captures the most possible variance of the data?

PCA
[Figure: axes "Pixel value 1" and "Pixel value 2" with direction $u_1$; equation not recovered, annotated "projection of data point" and "covariance of data points"]
• Maximizing this (the variance of the data projected onto $u$) is an eigenvalue problem → use the eigenvector(s) of $\Sigma$ that correspond to the largest eigenvalue(s) as the new basis.

Eigenfaces
• Premise: the set of faces lies in a subspace of the set of all images
• Use PCA to determine the k (k < d) vectors $u_1, \dots, u_k$ that span that subspace: $x \approx \mu + w_1 u_1 + w_2 u_2 + \dots + w_k u_k$
• Then essentially use nearest neighbors in "face space" coordinates $(w_1, \dots, w_k)$ to do recognition

Eigenfaces
• Training images: $x_1, \dots, x_N$
• Top eigenvectors: $u_1, \dots, u_k$
• Mean: $\mu$
• Face $x$ in "face space" coordinates
[Figure: $x$ written as a sum of images; images not recovered]

Eigenface recognition
• Process labeled training images:
– Run PCA
– Project each training image onto the subspace
• Given a novel image:
– Project onto the subspace
– If the reconstruction error is too large: not a face
– Else if too far from any training face: unknown face
– Else: classify as the closest training face in the k-dimensional subspace

Small demo
• Eigenfaces on the face images in the Caltech-4 database
• 435 images, same scale, aligned
[Figures: mean; 9 closest to mean; principal component 1; principal component 2]
[Figures: visualizing the primary modes of variation]
[Figures: clustering in the face subspace]

Limitations
• PCA is useful for representing data, but the directions of most variance are not necessarily useful for classification (see work by Belhumeur & Kriegman using LDA)
• Not appropriate for all data: PCA is fitting a hyperplane to the data / a Gaussian where $\Sigma$ is the covariance matrix (see nonlinear techniques)
• In this application, assumptions about pre-processing applied to face images may be unrealistic
• Suited for what kinds of categories?

Fisherfaces
Belhumeur et al., PAMI 1997
Rather than maximize the total scatter of the projected data as in PCA, maximize the ratio of between-class scatter to within-class scatter by using Fisher's Linear Discriminant

Non-linear dimensionality reduction
• Locally Linear Embedding (LLE), Roweis and Saul
• Isomap, Tenenbaum et al.
• Kernel PCA, Schölkopf et al.
• Laplacian Eigenmaps, Belkin and Niyogi
Image credit: Roweis and Saul

Active appearance models
• Eigenfaces model appearance only, and so cannot be robust to shape, pose and expression changes
• Active appearance models (Cootes and Taylor) model shape and appearance

Active appearance models
Factor out the faces' shape differences when comparing their texture / appearance

Coming up
• For Thursday: more on faces
– Read Viola and Jones, and Sinha et al.
– Review on Viola and Jones due
– Zubair will present
• For Tuesday: part-based models
– Read Felzenszwalb and Huttenlocher
– Review due
– Pushkala will
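
Worked note on the PCA slides: the statement that maximizing the projected variance is an eigenvalue problem can be spelled out with the standard Lagrange-multiplier argument. This is a sketch, not taken from the slides; it assumes $\Sigma$ is the sample covariance with $1/N$ normalization (the slides do not state the normalization), and $\lambda$ is a multiplier introduced here for the unit-norm constraint on $u$.

```latex
\begin{align*}
\Sigma &= \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)(x_i-\mu)^{\top} \\
\operatorname{var}(u) &= \frac{1}{N}\sum_{i=1}^{N}\bigl(u^{\top}(x_i-\mu)\bigr)^{2} = u^{\top}\Sigma u \\
L(u,\lambda) &= u^{\top}\Sigma u - \lambda\,(u^{\top}u - 1) \\
\nabla_u L = 0 &\;\Longrightarrow\; \Sigma u = \lambda u, \qquad \operatorname{var}(u) = u^{\top}\Sigma u = \lambda
\end{align*}
```

Every stationary point is therefore an eigenvector of $\Sigma$, and the variance it captures equals its eigenvalue, so the best single direction $u_1$ is the eigenvector with the largest eigenvalue.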

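Worked note on the eigenface recognition slide: the train / project / threshold / nearest-neighbor procedure can be sketched in NumPy as below. This is a minimal illustration under stated assumptions, not the code behind the slides' demo: the function names (`fit_eigenfaces`, `project`, `classify`), the Euclidean distance, and both threshold values are hypothetical choices made here, and the data are synthetic points on a 3-dimensional subspace rather than the Caltech-4 images.

```python
import numpy as np

def fit_eigenfaces(X, k):
    """Run PCA on the rows of X (N images, each flattened to d pixels).

    Returns the mean image mu, shape (d,), and the top-k eigenvectors U, shape (d, k).
    """
    mu = X.mean(axis=0)
    A = X - mu  # centered data, shape (N, d)
    # The right singular vectors of A are the eigenvectors of the covariance
    # A.T @ A / N, ordered by decreasing eigenvalue. Using the SVD avoids
    # forming the d x d covariance matrix when d is large.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return mu, Vt[:k].T

def project(X, mu, U):
    """Face-space coordinates (w_1, ..., w_k) for one image or a batch of images."""
    return (X - mu) @ U

def classify(x, mu, U, W_train, labels, face_thresh, known_thresh):
    """Apply the three-way decision from the recognition slide to one image x."""
    w = project(x, mu, U)
    reconstruction = mu + U @ w
    # Large reconstruction error: x lies far from the face subspace.
    if np.linalg.norm(x - reconstruction) > face_thresh:
        return "not a face"
    # Euclidean nearest neighbor in face space (the metric is an assumption).
    dists = np.linalg.norm(W_train - w, axis=1)
    nearest = int(np.argmin(dists))
    if dists[nearest] > known_thresh:
        return "unknown face"
    return labels[nearest]

# Synthetic stand-in for aligned face images: N points on a k-dim subspace of R^d.
rng = np.random.default_rng(0)
N, d, k = 40, 100, 3
basis, _ = np.linalg.qr(rng.normal(size=(d, k)))  # orthonormal columns, (d, k)
X_train = 5.0 + rng.normal(size=(N, k)) @ basis.T
labels = [f"person_{i}" for i in range(N)]

mu, U = fit_eigenfaces(X_train, k)
W_train = project(X_train, mu, U)

# Hypothetical thresholds, chosen by hand for this synthetic data.
FACE_THRESH, KNOWN_THRESH = 1.0, 10.0

known = classify(X_train[5], mu, U, W_train, labels, FACE_THRESH, KNOWN_THRESH)
stranger = classify(mu + 100.0 * U[:, 0], mu, U, W_train, labels,
                    FACE_THRESH, KNOWN_THRESH)
clutter = classify(mu + 5.0 * rng.normal(size=d), mu, U, W_train, labels,
                   FACE_THRESH, KNOWN_THRESH)
print(known, "|", stranger, "|", clutter)
```

In practice both thresholds would need to be tuned on held-out data, and real images would have to be aligned and scaled consistently first, which is the pre-processing assumption the Limitations slide flags.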