Slide titles:
  Slide 1
  Task
  Outline
  Content-based Image Retrieval on the WWW
  PART I: A System for Large-scale, Content-based Image Retrieval on the WWW
  System Overview
  Visual Features describe the Images
  Collateral Text as an additional Feature
  Retrieval in 2 Steps
  Retrieval: Text
  Retrieval – Visual Features (MPEG-7)
  Retrieval – Challenges for Visual Features
  A Combined Distance for the MPEG-7 Features
  Clustering speeds up the search
  Relevance Feedback Improves the Results
  Relevance Feedback: Query Vector Movement
  Relevance Feedback: Weight Adaptation
  Implementation – Software and Hardware
  Part II: The Semantics Within
  Semantics: Combining Text and Visual Features
  Identifying a Method to find the Semantics
  Method: Data Mining for Semantic Clues
  Frequent Itemsets and Association Rules
  Example & Advantages
  Using FIMI to find the itemsets
  Diapers and Beer !!?
  Characteristics of the Itemsets and Rules
  Exploiting the Itemsets and Rules
  Selecting Interesting Low-Level Clusters based on Rules
  The Visual Link
  The Visual Link: A Graph-Based Approach
  The Visual Link: An Example
  The Visual Link: An Approximation
  Discussion & Demo
  Discussion: Precision
  Before we continue … some numbers
  And now … the moment you’ve all been waiting for …
  Conclusions
  Questions
  Outlook
  Thanks
  Which Rules are of Interest?
  Characteristics and Challenges
  Characteristics of the Itemsets and Rules - Overall
  Why keyword filtering of the results does not work
  Proposal: Semantic Clusters

A System for Large-scale, Content-based Web Image Retrieval - and the Semantics within
Till Quack
Thursday, May 27, 2004

Task
  Create a content-based image retrieval system for the WWW
    • Large-scale, one order of magnitude larger than existing systems. Means O(10^6) items
    • Relevance Feedback
  Explore and exploit the semantics within
  Take large-scale, content-based image retrieval one step closer to commercial applications

Outline
  Content-based Image Retrieval on the WWW
  PART I: A System for Image Retrieval on the WWW
    • Features
    • Retrieval
    • Relevance Feedback
    • Software Design
  PART II: The Semantics within
    • Identifying a Method to find Semantics
    • Data Mining for Semantic Clues
    • Frequent Itemset Mining and Association Rules
    • The Visual Link
  Discussion & Demonstration
  Conclusions & Outlook

Content-based Image Retrieval on the WWW
  Characteristics of the data repository
    • Size: 4.2 billion documents in Google’s index
    • Diversity: Documents in any context, language
    • Control: Anybody can publish anything
    • Dynamics: Ever changing
  System Requirements
    • FAST
    • SCALABLE
    • Make use of all the information available
  Motivation for a new system
    • Existing systems
      – Either pure text (Google)
      – Or pure content-based
    • Large-Scale

PART I: A System for Large-scale, Content-based Image Retrieval on the WWW
  Ullrich Moenich
  Till Quack
  Lars Thiele

System Overview
  [Diagram. Components shown: World Wide Web, DMOZ Data, Image Spider, Keyword Extraction, Feature Extraction, Image Description (Keywords, Visual Features), Images (Binaries), Keyword Indexing, Inverted Index (keyid | imageid), Clustering (Cluster 1, Cluster 2, ..., Cluster n), mySQL. Stages: Offline, Retrieval. Retrieval steps: Keyword Request, Matching Images, User picks relevant images, Nearest Neighbor Search, Matching Images.]

Visual Features describe the Images
  Global Features from MPEG-7 Standard
  Currently no Segmentation
    • Reasons: Scalability and the diversity of the data
  Texture Features
    Edge Histogram Descriptor (EHD)
      • Histogram of quantized edge directions. 80 dimensions
    Homogeneous Texture Descriptor (HTD)
      • Output of Gabor filter-bank. 62 dimensions
  Color Features
    Scalable Color Descriptor (SCD)
      • Color Histogram. 256, 128, 64 or 32 dimensions
    Dominant Color Descriptor (DCD)
      • Up to 8 dominant colors (3d color-space) and their percentages
        – 32 “dimensions”
      • “Bins” defined for each image

Collateral Text as an additional Feature
  ALT Tag and Collateral Text around images
  VERY uncontrolled annotation
  Stemming: Porter Stemmer
    • Example: training -> train
    • More matching terms for boolean queries
    • But also some new ambiguities
      – train: to train [verb] / the train [noun]

Retrieval in 2 Steps
  [Diagram: the System Overview diagram again, with the two retrieval steps marked.]
  1. Text Retrieval
  2. Visual Nearest Neighbor Search

Retrieval: Text
  Options
    • Boolean query on inverted index
    • Vector Space Model
    • LSI etc.
  Choice
    • Ranked boolean queries on inverted index
    • Ranking: tf*idf
  Reasons
    • Speed
    • Sparsity of data:
      – 600 000 Keywords in total
      – 1 document: 10-50 words

  Keyword   ImageId   tf
  shoe      1233      1
  sport     1233      1
  red       1233      1
  banana    1234      1
  fruit     1234      2
  Order     1234      1

  Keyid     ImageId   tf
  124       1233      1
  341       1233      1
  345       1233      1
  445       1234      1
  75        1234      2
  875       1234      1

Retrieval – Visual Features (MPEG-7)
  K-Nearest Neighbor search (K-NN)
    • Find K closest candidates c_i to query image q in a vector space
    • Distance: Minkowski metrics for distance d(c_i, q), namely L1 and L2 norms
  Most MPEG-7 descriptors are high-dimensional vectors
  The “dimensionality curse” applies
    • High dimensional spaces behave “weirdly”
    • In particular the distances are not too meaningful

Retrieval – Challenges for Visual Features
  We have several (visual) feature types
    • How can we combine them?
  Our database is very large.
    • How can we search it fast enough? i.e. how can we avoid comparing the query vector with each database entry?

A Combined Distance for the MPEG-7 Features
  We use a combined distance of all the visual feature types
  The individual distances occupy different ranges in different distributions
  The distributions were transformed to a normal distribution in the range [0,1]
  The distances are then combined linearly

Clustering speeds up the search
  Problem
    • Millions of items in DB
    • Linear search over the whole dataset too slow
    • Looking only for the K nearest neighbors anyway
  (One) Solution
    • Partition the data into Clusters, identified by representative, the
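
The combined distance and the K-NN search from the slides above might be sketched roughly as follows. This is only an illustration under stated assumptions, not the system's actual code: the slides do not say which transform maps the distances into [0,1], so a 3-sigma Gaussian normalization is assumed here, and the names `fit_normalizer`, `combined_distance`, `k_nearest`, the feature keys and the weights are all hypothetical. The search is a plain linear scan; the clustering step is left out.

```python
import statistics


def minkowski(a, b, p):
    """Minkowski distance of order p (p=1 gives L1, p=2 gives L2)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def fit_normalizer(sample_distances):
    """Return a function that maps raw distances of one feature type into [0, 1].

    Assumption: 3-sigma Gaussian normalization fitted on a sample of
    distances; the slides do not name the transform that was used.
    """
    mean = statistics.fmean(sample_distances)
    std = statistics.pstdev(sample_distances) or 1.0  # guard against zero spread

    def normalize(d):
        z = (d - mean) / (3.0 * std)                  # mostly within [-1, 1]
        return min(1.0, max(0.0, (z + 1.0) / 2.0))    # shift and clip to [0, 1]

    return normalize


def combined_distance(query, candidate, normalizers, weights, p=1):
    """Weighted linear combination of the normalized per-feature distances."""
    total = 0.0
    for name, weight in weights.items():
        raw = minkowski(query[name], candidate[name], p)
        total += weight * normalizers[name](raw)
    return total


def k_nearest(query, database, normalizers, weights, k, p=1):
    """Linear-scan K-NN: ids of the k images with the smallest combined distance."""
    scored = sorted(
        (combined_distance(query, feats, normalizers, weights, p), image_id)
        for image_id, feats in database.items()
    )
    return [image_id for _, image_id in scored[:k]]
```

Normalizing each feature type separately keeps a descriptor with a large raw distance range from dominating the linear combination.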
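
The ranked boolean queries on the inverted index ("Retrieval: Text") could look roughly like the sketch below. It is a minimal illustration, not the system's implementation: it assumes AND semantics for the boolean query and the common idf variant log(N/df), neither of which the slides specify. The function names are hypothetical, and the index is keyed by keyword string for readability, whereas the slides map keywords to numeric key ids.

```python
import math
from collections import defaultdict


def build_inverted_index(postings):
    """Build keyword -> {image_id: tf} from (keyword, image_id, tf) rows."""
    index = defaultdict(dict)
    for keyword, image_id, tf in postings:
        index[keyword][image_id] = tf
    return index


def ranked_boolean_and(index, query_terms, num_images):
    """Keep images that contain every query term, ranked by summed tf*idf."""
    lists = [index.get(term, {}) for term in query_terms]
    if not lists:
        return []
    # Boolean AND: intersect the posting lists of all query terms.
    matches = set(lists[0])
    for plist in lists[1:]:
        matches &= set(plist)
    # Rank the surviving images; idf = log(N / df) is an assumed variant.
    scores = {
        image_id: sum(
            plist[image_id] * math.log(num_images / len(plist))
            for plist in lists
        )
        for image_id in matches
    }
    return sorted(scores, key=lambda i: (-scores[i], i))


# Rows taken from the example table on the slide.
rows = [
    ("shoe", 1233, 1), ("sport", 1233, 1), ("red", 1233, 1),
    ("banana", 1234, 1), ("fruit", 1234, 2), ("Order", 1234, 1),
]
index = build_inverted_index(rows)
result = ranked_boolean_and(index, ["shoe", "red"], num_images=2)
```

With only 10-50 words per document out of roughly 600 000 keywords, each posting list stays short, which is why intersecting lists is fast compared with scoring dense vectors.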