Minimum Probability of Error Image Retrieval
Nuno Vasconcelos
SVCL, ECE Department, UCSD

Outline: introduction · signal and image indexing and retrieval · content-based retrieval · decision-theoretic formulation · MPE retrieval systems · feature representation · bounds on recognition accuracy · optimal subspace and embedding selection · feature transformations · experiments (Corel image database, COIL object database, precision at 20% recall, complexity) · probability density models and invariance · embedded multi-resolution mixtures · database structure and hierarchical search · mixture hierarchies and hierarchical estimation (NIPS 98) · applications: recognition, semantic classification (indoor vs. outdoor, sky detection), personal picture collections

Slide 2: Introduction
• to acquire, store, and process images is now trivial
• digital imagery is pervasive and easy to move around
• problem: finding the things that we are interested in
• hence the interest in visual search and retrieval engines

Slide 3: Signal indexing and retrieval
• the long-term vision is:
1) do for signals what search engines have done for text (always searchable, accurate)
2) semantics: from "best matches" to "best answers" ("what was the average rainfall in the top five states?")
3) support user interaction and personalization (both within a single session and between multiple sessions)
• in the short term we concentrate on the image/video database domain

Slide 4: Image indexing and retrieval
• at the intersection of a number of
disciplines:
1) vision/image processing for representation (what are good features?)
2) learning for personalization (relevance feedback, "this is my dog")
3) database theory for speed (indexing)
• one would expect that putting these together should be easy
• nothing could be further from the truth!
(diagram: indexing/retrieval at the intersection of vision/image processing, learning/pattern recognition, and database theory)

Slide 5: Why is this interesting?
• as a vision/IP, pattern recognition, or DB problem it has various "non-standard" features:
1) very large amounts of data
2) very large numbers of classes (up to one per image)
3) no control over imaging conditions
4) "highly non-linear" user interaction
5) non-metric spaces
• these properties break all the standard assumptions of vision, learning, and DB theory
• lots of interesting open questions at both the theoretical and practical levels (e.g. scalability, all the difficult vision issues, etc.)
• lots of practical application scenarios too

Slide 6: Currently
• annotate all images and do text queries
• problems:
– annotation is time consuming
– images and words have multiple interpretations
– "annotations reflect the interpretation of who created them, not that of who is searching"

Slide 7: Content-based retrieval
• allow users to express queries directly in the visual domain:
– user provides a query image
– system extracts low-level features (texture, color, shape)
– the query signature is compared with those extracted from the database
– top matches are returned
(diagram: texture, color, and shape similarity)

Slide 8: Formulation
• retrieval as classification
• the database as a set Y = {1, …, C} of image classes
• the query as a new data point to classify

Slide 9: Decision-theoretic formulation
• given: a feature space X and a set Y = {1, …, C} of classes
• goal: design a map g*: X → Y that minimizes the probability of retrieval error,
  g* = argmin_g P(g(x) ≠ y)
• the Bayes classifier is optimal:
  g*(x) = argmax_i P(y = i | x) = argmax_i p(x | y = i) P(y = i)

Slide 10: Decision-theoretic retrieval
1) select a feature space
2) estimate the conditional density p̃(x | y = i) in this space for each class, and classify with
  g(x) = argmax_i p̃(x | y = i) P̃(y = i)
3) retrieval = asking "which class density is most likely to have generated the query examples?"

Slide 11: Retrieval architecture
(diagram: retrieval architecture — the query is compared against each database class model and the top matches are returned)
• should guarantee the MPE solution
• three main components:
– feature transformation
– density models
– similarity function
• we already have the optimal similarity function
• can we pick the other two optimally as well?

Slide 12: Traditional solutions
• most do not really scale well to this type of problem:
– e.g. (Vailaya et al.) report 12 days of computing time to apply a traditional feature selection method (sequential search) to a relatively small problem (2 classes, 1K images per class)
– practical systems rely on "empirical features" (e.g. color histograms, edginess, cepstral coefficients, …)
– these tend to work well, but only on specific domains
• basic idea:
– start from a library of "empirical feature sets"
– select the MPE set for the problem at hand
• problem: how do we know which set that is?

Slide 13: MPE retrieval systems
• the probability of error is lower bounded by the Bayes error:
  L* = 1 − E_x[ max_i P(y = i | x) ]
• theorem: for a retrieval problem with observation space Z and a feature transformation T: Z → X, the Bayes error on X can never be smaller than that on Z.
Equality is achieved if and only if T is invertible:
  L*_X = L*_Z if T is invertible, L*_X > L*_Z otherwise
• intrinsic result: emphasis on features is a bad idea

Slide 14: Feature representation
• theorem: given a retrieval problem with equiprobable classes, a feature space X, unknown class-likelihoods p(x | y = i), and a decision
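The theorem on slide 13 (the Bayes error after a feature transformation T: Z → X can never be smaller than the Bayes error on Z) can be checked numerically on a small discrete problem. The sketch below uses made-up class-conditional distributions over four symbols with two equiprobable classes; the particular distributions and merging maps are illustrative assumptions, not examples from the talk.

```python
import numpy as np

def bayes_error(likelihoods, priors):
    """L* = 1 - sum_z max_i p(z | y=i) P(y=i) for a discrete space."""
    joint = likelihoods * priors[:, None]   # shape (classes, symbols)
    return 1.0 - joint.max(axis=0).sum()

priors = np.array([0.5, 0.5])               # equiprobable classes
# class-conditional distributions over Z = {0, 1, 2, 3} (made up)
p_z = np.array([[0.4, 0.3, 0.2, 0.1],
                [0.1, 0.2, 0.3, 0.4]])

# a non-invertible T merging {0, 3} -> 0 and {1, 2} -> 1 destroys the
# decision boundary, so the Bayes error on X increases
p_x = np.stack([p_z[:, [0, 3]].sum(axis=1),
                p_z[:, [1, 2]].sum(axis=1)], axis=1)

L_Z = bayes_error(p_z, priors)              # ~0.3
L_X = bayes_error(p_x, priors)              # 0.5
print(L_Z, L_X)

# an invertible T (here a permutation of Z) leaves the Bayes error unchanged
perm = p_z[:, [2, 0, 3, 1]]
assert np.isclose(bayes_error(perm, priors), L_Z)
assert L_X >= L_Z
```

Note that the merge was chosen to cross the decision boundary; a merge that respects the decision regions would keep the error unchanged, which is why minimizing this loss is exactly the feature-design criterion the slides advocate.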
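The decision-theoretic retrieval rule of slides 9 and 10 (rank database classes by how likely their estimated density is to have generated the query features) can be sketched as follows. The 1-D Gaussian class models, the function names, and all parameter values are illustrative stand-ins for real image-feature densities.

```python
import numpy as np

def log_gaussian(x, mu, sigma):
    """Log of a univariate Gaussian density evaluated at the points x."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def mpe_retrieve(query, class_params, priors):
    """Rank classes by log p(query | y=i) + log P(y=i), best first."""
    scores = []
    for (mu, sigma), prior in zip(class_params, priors):
        # i.i.d. query feature vectors: per-sample log-likelihoods add up
        scores.append(log_gaussian(query, mu, sigma).sum() + np.log(prior))
    return np.argsort(scores)[::-1]

rng = np.random.default_rng(0)
classes = [(0.0, 1.0), (5.0, 1.0), (10.0, 1.0)]   # three "image classes"
priors = [1 / 3, 1 / 3, 1 / 3]
query = rng.normal(5.0, 1.0, size=20)             # query drawn from class 1
ranking = mpe_retrieve(query, classes, priors)
print(ranking[0])   # → 1
```

With well-separated class densities the query's source class dominates the likelihood score, which is the sense in which this similarity function is already optimal (slide 11); the open questions are the choice of feature transformation and density model.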