Hyper-Searching the Web

Search Engines
• Basic search (index)
• Cluster search (themes)
• Meta-search (outsource)
• “Smarter” meta-search (themes + outsource)

Basic search engine
• Examples: AltaVista, InfoSeek, HotBot, Lycos, Excite, Google, etc.
• Maintains an index of every word found
• Works by crawling, indexing, and returning results

Basic search engine
• Different ranking systems are used
 - most use heuristics (the easiest solution), such as counting the number of keywords that appear
 - Google uses PageRank

Basic search engine
• No idea of the searcher’s intent, so the “best” result is hard to achieve
• Problems with synonymy and polysemy
 - synonymy, ex. car and automobile
 - polysemy, ex. jaguar
• One solution: store semantic relations
 - can only help with synonymy
• Can’t identify concepts or author intent
 - ex. IBM site does not say “computer”

Cluster search engine
• Example: Clusty
• Clusters results into categories/themes
• Can show results that would be ranked lower in another search engine
 - because words have different meanings, it can show the less searched-for ones

Meta-search engine
• Examples: Dogpile, Surfwax, Copernic, etc.
• Sends the searcher’s query to a database of search engines
• Claimed to be no better than its database; the referenced search engines are often small, free, or commercial
• Users can create their own on Google, with up to 5,000 URLs as the “database”

“Smarter” meta-search engine
• Example: Clever project (not available online yet)
• Includes clustering and linguistic analysis
[Diagram labels: query “cat”; AltaVista, Yahoo, Google; Clever; result themes: Cat – feline, Cat – power, Cat – equipment, Cat – scans, etc.]

The Clever Project
• Uses hyperlinks to locate hubs and authorities
“a respected authority is a page that is referred to by many good hubs; a useful hub is a location that points to many valuable authorities”

The Clever Project
• Obtains a list of webpages from a standard index and follows hyperlinks to enlarge its own collection
 - resulting collection = “root set”
 - each page gets a numerical hub score and authority score

The Clever Project
• Similar to PageRank in its method: initial guesses refined by repeated calculations
 - useful by-product: clusters sites
• Adds to competition, because competitors don’t have to acknowledge their competition through hyperlinks

Clever vs. Google
GOOGLE
 - gives initial rankings
 - keeps pages independent of queries
 - faster
 - looks forward, “link to link”
CLEVER
 - root sets per keyword
 - page priority through query context
 - looks forwards and backwards, “hub and authority”
 - sometimes too broad ex.
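
The hub and authority scoring described for the Clever project can be illustrated with a small iteration of the kind commonly known as HITS. This is a minimal sketch, not the Clever project's actual implementation: the function name `hits`, the iteration count, and the toy link graph are all hypothetical. Every page starts with a guess of 1.0, and the scores are then recalculated repeatedly, with an authority collecting the hub scores of pages that link to it and a hub collecting the authority scores of pages it links to.

```python
from math import sqrt


def hits(links, iterations=50):
    """Return (hub, authority) score dicts for a link graph.

    links: dict mapping each page to the list of pages it links to.
    Hypothetical helper for illustration only.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Initial guess: every page is an equally good hub and authority.
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub score: sum of authority scores of the pages linked to.
        new_hub = {
            p: sum(new_auth[dst] for dst in links.get(p, []))
            for p in pages
        }

        # Normalize so the scores stay bounded across iterations.
        a_norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in new_auth.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}

    return hub, auth


# Toy graph (made up): three hub-like pages pointing at three content pages.
links = {
    "hub1": ["a", "b"],
    "hub2": ["a", "b", "c"],
    "hub3": ["a"],
}

hub, auth = hits(links)
best_authority = max(auth, key=auth.get)
best_hub = max(hub, key=hub.get)
```

In this toy graph, `best_authority` is the page with the most links from good hubs and `best_hub` is the page pointing at the most valuable authorities, matching the quoted definition. Pages that nothing links to end with an authority score of zero.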

