Elimination of Junk Document Surrogate Candidates through Pattern Recognition

Eunyee Koh, Daniel Caruso, Andruid Kerne, Ricardo Gutierrez-Osuna
Interface Ecology Lab | Center for Study of Digital Libraries
Computer Science Department, Texas A&M University, College Station, TX 77843, USA
{eunyee, dcaruso, andruid, rgutier}@cs.tamu.edu

ABSTRACT

A surrogate is an object that stands for a document and enables navigation to that document. Hypermedia is often represented with textual surrogates, even though studies have shown that image and text surrogates facilitate the formation of mental models and overall understanding. Surrogates may be formed by breaking a document down into a set of smaller elements, each of which is a surrogate candidate. While processing these surrogate candidates from an HTML document, relevant information may appear together with less useful junk material, such as navigation bars and advertisements.

1 INTRODUCTION

Representing large collections of documents to users in ways that facilitate understanding the essential meanings that the documents convey is a hard problem. This is a form of Vannevar Bush's problem, which frames our field: there is too much information [4]. Surrogates are information elements selected from a specific document, which can be used in place of the original document [3, 25]. Most responses to search queries are represented in the form of lists of textual surrogates [14, 32, 35]. Yet studies have shown that users prefer image and text surrogates, and understand them more readily [10, 20]. Further, image and text representations facilitate the formation of mental models [13].

Building good image and text surrogates for a document is neither simple nor straightforward. One approach to this problem is to explicitly include image and text surrogates among the metadata that is specified for each document, just as abstracts are kept as textual representations. Image and text surrogates function as boosters [28] that add value to the process of content aggregation by promoting
