UCSD CSE 252C - Using Multiple Segmentations

Using Multiple Segmentations to Discover Objects and their Extent in Image Collections
Carolina Galleguillos
PDF created with pdfFactory trial version www.pdffactory.com

Introduction

Goal: Given a collection of unlabelled images, discover visual object categories and their segmentation automatically.

Approach:
1) Produce multiple segmentations of each image.
2) Discover clusters of similar segments.
3) Score all segments by how well they fit an object cluster.

Background

The task of discovering object and scene categories has been addressed by [Fei-Fei & Perona, 2005], [Quelhas et al., 2005] and [Sivic et al., 2005], borrowing tools from the statistical text analysis community (pLSA and LDA) that use a bag-of-words approach.

Mapping onto the visual domain:
§ Images are treated as documents.
§ Cluster affine invariant point descriptors as visual words.
§ Each image is represented by a histogram of visual words.

Issues: Visual words are not always as descriptive as text (visual phonemes or visual letters).

Background: Bag-of-words Approaches

Represent an image as a histogram of “visual words”:
• Detect affine covariant regions.
• Represent each region by a SIFT descriptor.
• Build a visual vocabulary by k-means clustering (K ~ 1,000).
• Assign each region to the nearest cluster centre.
[Figure: example visual-word histogram]

Visual word shortcomings

Visual Polysemy: A single visual word occurring on different (but locally similar) parts of different object categories.
Visual Synonyms: Two different visual words representing a similar part of an object (e.g. the wheel of a motorbike).
If the object and its background are highly correlated, modelling the entire image can actually help recognition.

[Figure: images and their multiple segmentations; example categories: Cars, Buildings]

Intuition #1: All segmentations are wrong, but some segments are good.
Intuition #2: All good segments are alike, each bad segment is bad in its own way.
Multiple segmentations are used to produce groups of visual words.

The Algorithm

Given a large collection of unlabeled images:
1. For each image, compute multiple candidate segmentations using Normalized Cuts.
2. For each segment, compute a histogram of visual words.
3. Perform topic discovery using LDA over all segments in the collection, treating each segment as a document.
4. For each topic, sort the segments using KL divergence.

Multiple segmentations

We use Normalized Cuts, varying two parameter settings: the number of segments and the image scale.

Discovering Objects

Representing segments: Segments -> Find visual words -> Form histograms
Finding coherent segment clusters (topics): Discover topics (objects)
w ... visual words
d ... documents (images)
z ... topics ('objects')
P(w|d), P(z|d) and P(w|z) are multinomial distributions.
Use statistical text analysis techniques such as Latent Semantic Analysis (LSA), Probabilistic LSA [Hofmann '99] or Latent Dirichlet Allocation (LDA) [Blei et al. '03]. Here we chose LDA.

Latent Dirichlet Allocation [Blei et al., 2003]

Generative probabilistic model for collections of discrete data such as text corpora.
LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities.
Diagram annotation: Dirichlet prior
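
As a compact reminder of what "generative model" means here, the LDA generative process for a single document (a segment, in this setting) can be written as below. This is a sketch in the standard notation of Blei et al. (2003), not taken from the slides: the symbols \alpha (Dirichlet parameter), \theta (per-document topic mixture), \beta (topic-word matrix with \beta_{ij} = P(w = j | z = i)) and N (number of visual words in the document) are assumed.

```latex
\begin{align*}
\theta &\sim \mathrm{Dirichlet}(\alpha) \\
z_n \mid \theta &\sim \mathrm{Multinomial}(\theta), \qquad n = 1, \dots, N \\
w_n \mid z_n, \beta &\sim \mathrm{Multinomial}(\beta_{z_n}) \\
p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta)
  &= p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)
\end{align*}
```

Read top to bottom: draw a topic mixture for the document, draw a topic for each word position, then draw the visual word from that topic's word distribution.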
Further annotations on the LDA graphical model (one per build of the slide):
• Multinomial distribution of topics (or topic mixture)
• Topic
• Word
• Matrix P(wi=1 | zi=1)
• Free variational parameters

Segment scoring

Compare segment distributions against the learned topic distribution over visual words using KL divergence.
[Figure: three histograms (x-axis: visual words, y-axis: probability) showing the learned topic distribution and two segment distributions, one with KL divergence 1.89 and one with KL divergence 2.90]

Segmentations and their KL divergence
[Figure: example segmentations with their KL divergence]

[Figure: two result panels. Retrieval accuracy: average precision for MSRC. Segmentation accuracy: average overlap area.]
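
The histogram and scoring steps (steps 2 and 4 of the algorithm) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the slides do not say which direction of the KL divergence is used or how empty histogram bins are handled, so the sketch assumes KL(segment || topic) with a small additive smoothing constant `eps`. The function names, the toy topic and the toy segments are all hypothetical.

```python
import math
from collections import Counter


def word_histogram(word_ids, vocab_size, eps=1e-6):
    """Normalized histogram of visual-word ids for one segment.

    eps is an assumed additive smoothing term so that no bin is exactly zero.
    """
    counts = Counter(word_ids)
    total = len(word_ids) + eps * vocab_size
    return [(counts.get(w, 0) + eps) / total for w in range(vocab_size)]


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    # The floor on q guards against division by zero if a topic has empty bins.
    return sum(pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0)


def rank_segments(segments, topic, eps=1e-6):
    """Return (segment name, KL score) pairs, best-fitting segment first."""
    vocab_size = len(topic)
    scores = {
        name: kl_divergence(word_histogram(ids, vocab_size, eps), topic)
        for name, ids in segments.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Toy data (hypothetical): a 4-word vocabulary, one learned topic, two segments.
topic = [0.7, 0.1, 0.1, 0.1]
segments = {
    "seg_a": [0, 0, 0, 1, 0, 2, 0, 3, 0, 0],  # mostly word 0, like the topic
    "seg_b": [3, 3, 2, 3, 1, 3, 3, 2, 3, 3],  # mostly word 3, unlike the topic
}
ranking = rank_segments(segments, topic)
for name, score in ranking:
    print(f"{name}: KL = {score:.3f}")
```

A lower score means the segment's visual-word distribution is closer to the topic, so the segment whose histogram resembles the topic is ranked first. In the real pipeline the topic vector would come from the LDA fit (P(w|z)) and the word ids from the vector-quantized SIFT descriptors inside each segment.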

