CS276B Text Information Retrieval, Mining, and Exploitation
Lecture 10
Feb 18, 2003

Recap – last time
- Vector space classification
- Nearest neighbor classification
- Support vector machines
- Hypertext classification

Today’s topics
- Recommendation systems
  - What they are and what they do
  - A couple of algorithms
  - Going beyond simple behavior: context
- How do you measure them?
- Begin: how do you design them “optimally”?

Recommendation Systems
- Given a set of users and items
  - Items could be documents, products, other users ...
- Recommend items to a user based on
  - Past behavior of this and other users
  - Additional information on users/items

Sample Applications
- Corporate intranets
  - Recommendation, finding domain experts, ...
- Ecommerce
  - Product recommendations (Amazon)
- Medical applications
  - Matching patients to doctors, clinical trials, ...
- Customer relationship management
  - Matching customer problems to internal experts in a support organization

Corporate intranets – document recommendation

Corporate intranets – “expert” finding

Inputs to system
- Behavior: users’ historical “transactions”
- Context: what the user appears to be doing now
- Role/domain: additional info about users, documents ...

Inputs – more detail
- Past transactions from users:
  - Which docs viewed
  - Content/attributes of documents
  - Which products purchased
  - Pages bookmarked
  - Explicit ratings (movies, books ...)
- Current context:
  - Browsing history
  - Search(es) issued
- Explicit role/domain info:
  - Role in an enterprise
  - Document taxonomies
  - Interest profiles

Example – behavior only
- U1 viewed d1, d2, d3; U2 views d1, d2.
- Recommend d3 to U2?

Expert finding – simple example
- Same viewing pattern: U1 viewed d1, d2, d3; U2 views d1, d2.
- Recommend U1 to U2 as someone to talk to?

Simplest Algorithm
- U viewed d1, d2, d5.
- Look at who else viewed d1, d2, or d5.
- Recommend to U the doc(s) most “popular” among these users.

Simple algorithm – shortcoming
- Treats all other users as equally important
- Ignores the fact that some users behaved more like me in the past

Measuring collaborative filtering
- How good are the predictions?
- How much previous opinion do we need?
- Computation
- How do we motivate people to offer their opinions?

Other aspects
- Rule-based recommendations
- Working in user space vs. item space
- Building regression models of users

Rule-based recommendations
- In practice, rule-based systems are common in commerce engines
- Merchandising interfaces allow product managers to promote items
  - Criteria include inventory, margins, etc.
- Must reconcile these with algorithmic recommendations

User space vs. item space
- Should we work with user similarity or item similarity?
  - As with general clustering
- Recommendations could come from similar users
- Or opinions could come from items similar to the one we seek an opinion on
- Similar based on what?
- In some cases, can use both cues

Matrix view
- A is the users × docs matrix: Aij = 1 if user i viewed doc j, 0 otherwise.
- Entries of AA^T give the number of docs commonly viewed by pairs of users.

Voting Algorithm
- Row i of AA^T: the vector whose jth entry is the number of docs viewed by both users i and j.
- Call this row ri, e.g., (0, 7, 1, 13, 0, 2, ...).
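The matrix view above can be sketched in a few lines of plain Python. This is an illustrative toy, not the course's reference implementation: the viewing matrix `A` is made up, and the recommendation rule is the weighted vote count the voting-algorithm slides describe (row ri of AA^T, then ri times A, restricted to unseen docs).

```python
# Toy sketch of the voting algorithm (illustrative; A is a made-up example).
# A[i][j] = 1 if user i viewed doc j, else 0.
A = [
    [1, 1, 1, 0],  # U0 viewed d0, d1, d2
    [1, 1, 0, 0],  # U1 viewed d0, d1
    [0, 1, 0, 1],  # U2 viewed d1, d3
]

def recommend(A, i):
    """Recommend to user i the unseen doc with the highest weighted vote.

    r_i[j] = number of docs viewed by both users i and j (row i of A A^T);
    votes[k] = sum_j r_i[j] * A[j][k], over docs k that user i has not seen.
    """
    n_users, n_docs = len(A), len(A[0])
    # Row i of A A^T: co-view counts between user i and every user j.
    r_i = [sum(A[i][k] * A[j][k] for k in range(n_docs))
           for j in range(n_users)]
    r_i[i] = 0  # zero out the diagonal entry (user i's own view count)
    # Weighted vote for each doc that user i has not yet viewed.
    votes = {k: sum(r_i[j] * A[j][k] for j in range(n_users))
             for k in range(n_docs) if A[i][k] == 0}
    return max(votes, key=votes.get) if votes else None

print(recommend(A, 1))  # recommends d2: U0 co-viewed 2 docs with U1
```

Mirroring the earlier behavior-only example, U1's heaviest co-viewer (U0) pulls the recommendation toward d2; the "simple algorithm" would instead count all co-viewers equally.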
- What’s on the diagonal of AA^T?

Voting algorithm
- Then ri · A is a vector whose kth entry gives a weighted vote count to doc k
  - Emphasizes users who have high weights in ri
- Recommend the doc(s) with the highest vote counts.
- How does this differ from the simple algorithm?

Voting Algorithm – implementation issues
- Wouldn’t implement using matrix operations
  - Use weight propagation on compressed adjacency lists
- Need to log and maintain the “user views doc” relationship
  - Typically, log into a database
  - Update vote-propagating structures periodically
- For efficiency, discard all but the heaviest weights in each ri
  - Only in the fast structures, not in the back-end database
- Write pseudocode

Exercise
- The voting algorithm may be viewed as one iteration of the Hubs/Authorities algorithm from CS276A (as in Lecture 3).
- Derive the extension to the full Hubs/Authorities algorithm, with convergence.
  - Make sure all users don’t get the same recommendations!
- How do you interpret the top “hubs”?

Different setting/algorithm
- Each user i rates some docs (products, ...)
  - Say a real-valued rating Uik for doc k
  - In practice, one of several ratings on a form
- Thus we have a ratings vector Ui for each user (with lots of zeros)
- Compute a correlation coefficient between every pair of users i, j
  - Dot product of their ratings vectors (symmetric, scalar)
  - A measure Sij of how much the user pair i, j agrees

Predict user i’s utility for doc k
- Sum Sij Ujk over users j such that Ujk is non-zero.
- Output this as the predicted utility for user i on doc k.
- So how does this differ from the voting algorithm? It really doesn’t ...

Same algorithm, different scenario
- Implicit (user views doc) vs. explicit (user assigns rating to doc)
- Boolean vs. real-valued utility
- In practice, must convert user ratings on a form (say on a scale of 1–5) to real-valued utilities
  - Can be a fairly complicated mapping
    - Likeminds function (Greening white paper)
  - Requires understanding the user’s interpretation of the form

Rating interface

Early systems
- GroupLens (U. of Minnesota) (Resnick/Iacovou/Bergstrom/Riedl)
  - netPerceptions company
- Tapestry (Goldberg/Nichols/Oki/Terry)
- Ringo (MIT Media Lab) (Shardanand/Maes)
- Experiment with variants of these algorithms

Recap slide 6 – Inputs
- Past
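The explicit-ratings prediction ("Predict user i's utility for doc k") can be sketched the same way. Again a toy, not a reference implementation: the ratings matrix `U` is invented, Sij is the plain dot product the slides name, and the sketch skips j = i since user i has no rating on a doc we are predicting for.

```python
# Toy sketch of the correlation-based prediction (illustrative example data).
# U[i][k] is user i's rating of doc k; 0 means "unrated".
U = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
]

def similarity(U, i, j):
    """S_ij: dot product of the two users' ratings vectors."""
    return sum(a * b for a, b in zip(U[i], U[j]))

def predict(U, i, k):
    """Predicted utility of doc k for user i:
    sum of S_ij * U[j][k] over users j (j != i) who rated doc k."""
    return sum(similarity(U, i, j) * U[j][k]
               for j in range(len(U)) if j != i and U[j][k] != 0)

print(predict(U, 1, 1))  # 21*3 + 9*1 = 72
```

With a 0/1 ratings matrix this reduces exactly to the voting algorithm's weighted vote count, which is the point the "Same algorithm, different scenario" slide makes.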