Classic Information Retrieval

• Although search has changed, classic techniques still provide foundations: our starting point

Information Retrieval
• User wants information from a collection of “objects”: information need
• User formulates need as a “query”
  – Language of information retrieval system
• System finds objects that “satisfy” query
• System presents objects to user in “useful form”
• User determines which objects from among those presented are relevant

Information Retrieval cont.
• Define each of the words in quotes
  – Information object
  – Query
  – Satisfying objects
  – Useful presentation
• Notion of relevance critical
  – What does the user really want?
  – Insufficient structure for exact retrieval
• Develop algorithms for the search and retrieval tasks

Think first about text documents
• Early digital searches: digital card catalog
  – Subject classifications, keywords
• “Full text”: words + English structure
  – No “meta-structure”
• Classic study
  – Gerard Salton, SMART project, 1960s

Scaling
• Which attributes have changed from the 1960s to the online searches of today?
  Some of the answers we had on the board:
  – Much, much larger collections
  – Heterogeneous collections
  – Dynamic collections: docs come, go, change
  – Decentralized / distributed collections
  – More diverse users
    • Use for relevance?
  – More complex queries
  – Much, much more computing resources
• How do they change the problem?

Develop models
Begin with document models on the board:
• Document is a ______ of terms*
  – Set
  – Bag
  – Sequence
* “term” is used instead of “word” to signal more general possibilities: serial numbers, nonsense,
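
The three document models named on the "Develop models" slide (set, bag, sequence) can be compared on one tiny document. This is only a minimal sketch: the sample text and the whitespace tokenizer `tokenize` are assumptions made for illustration and do not come from the slides.

```python
from collections import Counter


def tokenize(text):
    """Naive tokenizer (an assumption for this sketch): lowercase, split on whitespace."""
    return text.lower().split()


# Hypothetical sample document, chosen because it repeats terms.
doc = "to be or not to be"

# Sequence model: keeps both the order of terms and their repetitions.
as_sequence = tokenize(doc)

# Bag (multiset) model: keeps how often each term occurs, drops order.
as_bag = Counter(as_sequence)

# Set model: keeps only which terms occur, drops order and counts.
as_set = set(as_sequence)

print("sequence:", as_sequence)
print("bag:     ", dict(as_bag))
print("set:     ", sorted(as_set))
```

Each model discards more information than the one before it: the sequence can be reduced to the bag, and the bag to the set, but not the other way around.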