Hyper-Searching the Web

Search Engines
• Basic search (index)
• Cluster search (themes)
• Meta-search (outsource)
• “Smarter” meta-search (themes + outsource)

Basic search engine
• Examples: AltaVista, InfoSeek, HotBot, Lycos, Excite, Google, etc.
• Maintains an index for every word found
• Processes through crawling, indexing, and returning results

Basic search engine
• Different ranking systems used
  - most use heuristics (easiest solution): count the number of keywords that appear
  - Google uses PageRank

Basic search engine
• No idea of searcher’s intent, so the “best” result is hard to achieve
• Problems with synonymy and polysemy
  - ex. car and automobile (synonymy)
  - ex. jaguar (polysemy)
• One solution: store semantic relations
  - can only help with synonymy
• Can’t identify concepts/author intent
  - ex. IBM site does not say “computer”

Cluster search engine
• Example: Clusty
• Clusters results into categories/themes
• Can show results that would be ranked lower in another search engine
  - because words have different meanings, can show the less searched-for sense

Meta-search engine
• Examples: Dogpile, Surfwax, Copernic, etc.
• Sends searcher’s query to a database of search engines
• Claimed to be no better than its database; often the referenced search engines are small, free, commercial
• Users can create their own on Google, with up to 5,000 URLs as the “database”

“Smarter” meta-search engine
• Example: Clever project (not yet available online)
• Includes clustering and linguistic analysis
[Diagram: query “cat”; AltaVista, Yahoo, Google, Clever; clustered results: Cat – feline, Cat – power, Cat – equipment, Cat – scans, etc.]

The Clever Project
• Uses hyperlinks to locate hubs and authorities
  “a respected authority is a page that is referred to by many good hubs; a useful hub is a location that points to many valuable authorities”

The Clever Project
• Obtains a list of webpages from a standard index & follows hyperlinks to increase own database
  - resulting collection = “root set”
  - each page gets a numerical hub & authority score

The Clever Project
• Similar to PageRank in how scores are determined: guesses & constant recalculation
  - useful by-product: clusters sites
• Adds to competition because competitors don’t have to acknowledge each other through hyperlinks

Clever vs. Google
GOOGLE
  - gives initial rankings
  - keeps pages independent of queries
  - faster
  - looks forward “link to link”
CLEVER
  - root sets per keyword
  - page priority through query context
  - forwards & backwards “hub and authority”
  - sometimes too broad ex.
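
The hub and authority scoring from the Clever Project slides can be illustrated with a small sketch. It is a generic mutual-reinforcement loop over a toy link graph, not Clever's actual implementation: the function names, the `toy_links` data, the fixed iteration count, and the normalization step are all assumptions made for illustration.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit length so they stay bounded across iterations."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hub_authority_scores(links, iterations=50):
    """Return (hub, authority) score dicts for a link graph.

    links maps each page to the list of pages it links to
    (hypothetical toy data standing in for a "root set").
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Start from a uniform guess and refine it repeatedly.
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # A good authority is pointed to by good hubs.
        new_auth = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # A good hub points to good authorities.
        new_hub = {
            page: sum(new_auth[dst] for dst in links.get(page, []))
            for page in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Hypothetical pages: h1..h3 act as hubs, a1..a3 as authorities.
toy_links = {
    "h1": ["a1", "a2"],
    "h2": ["a1"],
    "h3": ["a1", "a2", "a3"],
}

hub, auth = hub_authority_scores(toy_links)
print(sorted(auth, key=auth.get, reverse=True)[:3])
print(sorted(hub, key=hub.get, reverse=True)[:3])
```

In this toy graph the page linked to by every hub ends up with the highest authority score, and the page linking to every authority ends up with the highest hub score, which matches the quoted definition of hubs and authorities.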