Stanford CS 276B - Lecture Notes - D826084

Home> Schools> Stanford University> Computer Science (CS) > CS 276B> Lecture Notes

Stanford CS 276B - Lecture Notes

Course Cs 276b- Text Information Retrieval, Mining, and Exploitation

Pages 8

Download Save

Unformatted text preview:

1CS276BWeb Search and MiningWinter 2005Lecture 5(includes slides borrowed from Jon Herlocker)Recap: Project and PracticumWe hope you’ve been thinking about projects!Revised concrete project plan due todayInitial project presentation: Thursday and TuesdayAbout 10 minutes per groupAbout 5 minutes presentations and a few minutes discussionA chance to explain and focus what you are doing and why it’s interestingPlan for TodayRecommendation Systems (RS)The most prominent type of which goes under the name Collaborative Filtering (CF)What are they are and what do they do?A couple of algorithmsGoing beyond simple behavior: contextHow do you measure them?Recommendation SystemsGiven a set of users and itemsItems could be documents, products, other users …Recommend items to a user based onPast behavior of this and other usersWho has viewed/bought/liked what?Additional information on users and itemsBoth users and items can have known attributes [age, genre, price, …]What do RSs achieve?Help people make decisionsExamples: Where to spend attentionWhere to spend moneyHelp maintain awarenessExamples:New productsNew informationSample ApplicationsEcommerceProduct recommendations - amazonCorporate IntranetsRecommendation, finding domain experts, …Digital LibrariesFinding pages/books people will likeMedical ApplicationsMatching patients to doctors, clinical trials, …Customer Relationship ManagementMatching customer problems to internal experts2Well-known recommender systems: Amazon and NetflixCorporate intranets - document recommendationCorporate intranets - “expert”findingInputs to intranet systemBehaviorusers’ historical “transactions”Contextwhat the user appears to be doing nowUser/domain attributesadditional info about users, documents …Inputs - more detailPast transactions from users:which docs viewedcontent/attributes of documentswhich products purchasedpages bookmarkedexplicit ratings (movies, books … )Current context:browsing historysearch(es) issuedExplicit role/domain info:Role in an enterpriseDocument taxonomiesInterest profilesExample - behavior onlyUsers Docs viewedU1U2d1d2d3U1 viewed d1, d2, d3.U2 views d1, d2.Recommend d3 to U2.?3Expert finding - simple exampleRecommend U1 to U2 as someone to talk to?U1U2d1d2d3Simplest Algorithm: Naïve k Nearest NeighborsU viewed d1, d2, d5.Look at who else viewed d1, d2 or d5.Recommend to U the doc(s) most “popular” among these users.UVWd1d2d5Simple algorithm - shortcomingTreats all other users as equally importantIgnores the fact that some users behaved more like me in the pastTypical RS issuesLarge item space Usually with item attributesLarge user baseUsually with user attributes (age, gender, city, …)Some evidence of customer preferencesExplicit ratings (powerful, but harder to elicit)Observations of user activity (purchases, page views, emails, what was printed, …)Typically extremely sparse, even when user has an opinionUsers ItemsObserved preferencesThe RS SpaceItem-ItemLinksUser-UserLinksLinks derived from similar attributes, similar content, explicit cross referencesLinks derived from similar attributes, explicit connections(Ratings, purchases, page views, laundry lists, play lists)DefinitionsA recommendation system is any system which provides a recommendation/prediction/opinion to a user on itemsRule-based systems use manual rules to do thisAn item similarity/clustering system uses item links to recommend items like ones you likeA classic collaborative filtering system uses the links between users and items as the basis of recommendationsCommonly one has hybrid systems which use all three kinds of links in the previous picture4Link typesUser attributes-based RecommendationMale, 18-35: Recommend The MatrixContent SimilarityYou liked The Matrix: recommend The Matrix ReloadedCollaborative FilteringPeople with interests like yours also liked Kill BillRule-based recommendationsIn practice – rule-based systems are common in commerce enginesMerchandizing interfaces allow product managers to promote itemsCriteria include inventory, margins, etc.Must reconcile these with algorithmic recommendationsMeasuring collaborative filteringHow good are the predictions?How much of previous opinion do we need?Computation.How do we motivate people to offer their opinions?Matrix viewA = UsersDocsAij = 1 if user i viewed doc j,= 0 otherwise.AAt : Entries give # of docs commonly viewed by pairs of users.Voting AlgorithmRow i of AAt: Vector whose jthentry is the # of docs viewed by both i and j.Call this row ri’e.g., (0, 7, 1, 13, 0, 2, ….)What’s on the diagonal of AAt?Voting AlgorithmThen ri °A is a vector whose kthentry gives a weighted vote count to doc kemphasizes users who have high weights in ri.Recommend doc(s) with highest vote counts.How does this differ fromthe simple algorithm?riA5Voting Algorithm - implementation issuesWouldn’t implement using matrix operationsuse weight-propagation on compressed adjacency listsNeed to log and maintain “user views doc”relationship. typically, log into databaseupdate vote-propagating structures periodically.For efficiency, discard all but the heaviest weights in each rionly in fast structures, not in back-end database.WritepseudocodeForward pointerThere are connections between CF and web link analysis:The voting algorithm may be viewed as oneiteration of the Hubs/Authorities algorithm Different setting/algorithmEach user i rates some docs (products, … )say a real-valued rating vikfor doc kin practice, one of several ratings on a formThus we have a ratings vector vi for each user(with lots of zeros)Compute a correlation coefficient between every pair of users i,jdot product of their ratings vectors (symmetric, scalar) measure of how much user pair i,j agrees: wijPredict user i’s utility for doc kSum (over users j such that vjkis non-zero) wijvjkOutput this as the predicted utility for user i on doc k.So how does this differfrom the voting algorithm?It really doesn’t …Same algorithm, different scenarioImplicit (user views doc) vs. Explicit (user assigns rating to doc)Boolean vs. real-valued utilityIn practice,

View Full Document


School:
Email:
New Password:
Confirm Password:

Stanford CS 276B - Lecture Notes

Sign up for free to view:

Please select your school