WSU CSE 6363 - Personalized Web Search

INTRODUCTION
PROBLEM
  User Search History
  User Profile
  Matrix Representation of User Search History and User Profile
  A Category Hierarchy
  Inference of User Search Intention
ALGORITHMS TO LEARN PROFILES
  Two LLSF-based Algorithms
  Rocchio-based Algorithm
  kNN
  Adaptive Learning
MAPPING QUERIES TO CATEGORIES
  Using User Profile Only
  Using General Profile Only
  Using Both User and General Profiles
EXPERIMENTS
  Data Sets
  Performance Metric
  Experimental Results
CONCLUSION
ACKNOWLEDGMENTS
REFERENCES

Personalized Web Search by Mapping User Queries to Categories

Fang Liu
Department of Computer Science, University of Illinois at Chicago
Chicago, IL 60607
(312) 996-4881
[email protected]

Clement Yu
Department of Computer Science, University of Illinois at Chicago
Chicago, IL 60607
(312) 996-2318
[email protected]

Weiyi Meng
Department of Computer Science, SUNY at Binghamton
Binghamton, NY 13902
(607) 777-4311
[email protected]

ABSTRACT
Current web search engines are built to serve all users, independent of the needs of any individual user. Personalization of web search is to carry out retrieval for each user incorporating his/her interests. We propose a novel technique to map a user query to a set of categories, which represent the user's search intention. This set of categories can serve as a context to disambiguate the words in the user's query. A user profile and a general profile are learned from the user's search history and a category hierarchy respectively. These two profiles are combined to map a user query into a set of categories. Several learning and combining algorithms are evaluated and found to be effective. Among the algorithms to learn a user profile, we choose the Rocchio-based method for its simplicity, efficiency and its ability to be adaptive. Experimental results indicate that our technique to personalize web search is both effective and efficient.
Categories and Subject Descriptors
H.3.4 [Information Storage and Retrieval]: Systems and Software – User profiles and alert services

General Terms
Algorithms, Performance, Experimentation, Design.

Keywords
Personalization, Search Engine, Category Hierarchy, Information Filtering

1. INTRODUCTION
As the amount of information on the Web increases rapidly, it creates many new challenges for Web search. When the same query is submitted by different users, a typical search engine returns the same result, regardless of who submitted the query. This may not be suitable for users with different information needs. For example, for the query "apple", some users may be interested in documents dealing with "apple" as "fruit", while other users may want documents related to Apple computers. One way to disambiguate the words in a query is to associate a small set of categories with the query. For example, if the category "cooking" or the category "fruit" is associated with the query "apple", then the user's intention becomes clear. Current search engines such as Google or Yahoo! have hierarchies of categories to help users specify their intentions. The use of hierarchical categories such as the Library of Congress Classification is also common among librarians.

A user may associate one or more categories with his/her query manually. For example, a user may first browse a hierarchy of categories and select one or more categories in the hierarchy before submitting his/her query. By utilizing the selected categories, a search engine is likely to return documents that are more suitable to the user. Unfortunately, a category hierarchy shown to a user is usually very large, and as a result, an ordinary user may have difficulty in finding the proper paths leading to the suitable categories. Furthermore, users are often too impatient to identify the proper categories before submitting their queries.
An alternative to browsing is to obtain a set of categories for a user query directly from a search engine. However, the categories returned by a typical search engine are often too numerous and independent of any particular user. In addition, many of the returned categories do not reflect the intention of the searcher. This paper studies how to supply, for each user, a small set of categories as a context for each query submitted by the user, based on his/her search history. Specifically, we provide a strategy to (1) model and gather the user's search history, (2) construct a user profile based on the search history and construct a general profile based on the ODP (Open Directory Project¹) category hierarchy, (3) deduce appropriate categories for each user query based on the user's profile and the general profile, and (4) perform numerous experiments to demonstrate that our strategy of combining a user profile and a general profile (general knowledge) is both effective and efficient. The categories obtained from the proposed method are likely to be related to the user's interest and, therefore, can provide a proper context for the user query.

Consider the situation where a mobile user wants to retrieve documents using his/her PDA. Since the bandwidth is limited and the display is small, it may not be practical to transmit a large number of documents for the user to choose the relevant ones. Suppose that it is possible to show the retrieved documents on one screen. If these documents are not relevant to the user, there is no easy way for the user to direct the search engine to retrieve relevant documents. With the use of our proposed technique, a small number of categories with respect to the user's query are shown. If none of the categories is desired, the next set of categories is provided. This is continued until the user clicks on the desired categories, usually one, to express his/her intention. As will be demonstrated by our experiments, the user usually finds the categories of interest among the first 3 categories obtained by our system. Since 3 categories

¹ RDF dumps of the Open Database are available for download from http://dmoz.org/rdf.html

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CIKM'02, November 4–9, 2002, McLean, Virginia, USA. Copyright 2002 ACM 1-58113-492-4/02/0011…$5.00.
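
The strategy outlined in the introduction (map a query to a short ranked list of categories using both a user profile and a general profile, then show the categories three at a time) can be illustrated with a minimal Python sketch. This is not the paper's implementation. It assumes that each profile stores a weighted term vector per category, that a query is scored against a category by cosine similarity, and that the two scores are blended with a hypothetical weight `alpha`. The excerpt does not say how the profiles are actually combined, and the function names and all sample weights below are invented for illustration.

```python
import math
from collections import Counter
from typing import Dict, Iterator, List

# Assumed representation: category name -> {term: weight}.
Profile = Dict[str, Dict[str, float]]


def cosine(query_terms: List[str], category_vector: Dict[str, float]) -> float:
    """Cosine similarity between a bag-of-words query and a category's term vector."""
    counts = Counter(query_terms)
    dot = sum(n * category_vector.get(term, 0.0) for term, n in counts.items())
    q_norm = math.sqrt(sum(n * n for n in counts.values()))
    c_norm = math.sqrt(sum(w * w for w in category_vector.values()))
    if q_norm == 0.0 or c_norm == 0.0:
        return 0.0
    return dot / (q_norm * c_norm)


def rank_categories(query: str, user_profile: Profile,
                    general_profile: Profile, alpha: float = 0.7) -> List[str]:
    """Rank categories by a weighted sum of user-profile and general-profile scores.

    The linear blend and the default alpha are assumptions made for this sketch.
    """
    terms = query.lower().split()
    categories = set(user_profile) | set(general_profile)
    scores = {
        c: alpha * cosine(terms, user_profile.get(c, {}))
           + (1.0 - alpha) * cosine(terms, general_profile.get(c, {}))
        for c in categories
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(categories, key=lambda c: (-scores[c], c))


def batches(ranked: List[str], size: int = 3) -> Iterator[List[str]]:
    """Yield successive groups of categories, as on a small-screen device."""
    for start in range(0, len(ranked), size):
        yield ranked[start:start + size]


# Invented sample data: this user has mostly searched for cooking topics.
user_profile: Profile = {"Cooking": {"apple": 0.9, "recipe": 0.8}}
general_profile: Profile = {
    "Cooking": {"apple": 0.3, "recipe": 0.6},
    "Computers": {"apple": 0.7, "mac": 0.7},
    "Fruit": {"apple": 0.8, "orchard": 0.5},
    "Travel": {"hotel": 1.0},
}

ranked = rank_categories("apple", user_profile, general_profile)
for group in batches(ranked):
    print(group)
```

With this sample data, the user's history pushes "Cooking" ahead of "Fruit" and "Computers" for the ambiguous query "apple", and the remaining category is only offered if the user rejects the first group of three.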

