UCSD CSE 291 - Recommender Systems - D2545573

Home> Schools> University of California, San Diego> Computer Science & Engineering (CSE) > CSE 291> Recommender Systems

DOC PREVIEW

UCSD CSE 291 - Recommender Systems

School name University of California, San Diego

Course Cse 291-

Pages 37

This preview shows page 1-2-17-18-19-36-37 out of 37 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 37 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Recommender Systems New Approaches with Netflix Dataset Robert Bell Yehuda Koren AT T Labs ICDM 2007 Presented by Matt Rodriguez Outline Overview of Recommender System Approaches which are Content based or Collaborative Filtering New Nearest Neighbor model from the leaders of the Netflix Competition Background Adomvicious Tuzihilan IEEE 2005 Toward the Next Generation of Recommender Systems A Survey of the State of the Art and Possible Extensions Bell Koren ACM 2007 Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights Approaches for Recommender Systems Content Based Approach user is recommended items similar to the ones the user preferred in the past Collaborative Filtering user is recommended items that people with similar tastes liked in the past Hybrid Approach combine the two approaches Content Based approach Content based system uses a utility function The Drawbacks 1 Costs associated with gathering users profiles and extracting features from the content 2 Limited Content Analysis 3 Overspecialization 4 No notion of overall popularity global effect 5 No new features for the new trends Collaborative Filtering approach CF uses ratings from other users to recommend items Useful when item features are expensive or difficult to determine Memory based approach uses a function to determine rating of a user item pair the function examines the ratings of the nearest neighbors NNs 1 Neighbor of the user 2 Neighbor of the item Model based approach uses the ratings to learn a model which is then used to make rating predictions Memory Based Approach Derive a rating from similar users or items A similarity function determines the neighbors Unified Collaborative and Content Based Model Pitfalls of Recommender Systems How to handle new users or new items Overspecialization Bad to recommend items that are too similar to items previously Goal is to find new items that are desirable Sparsity 1 Many users do not have sufficient ratings 2 Many items do not have sufficient ratings Outline Overview of Recommender System Approaches which are Content based or Collaborative Filtering New Nearest Neighbor model from the leading group in the Netflix Competition Scalable Collaborative Filtering with Jointly Derived Neighborhood Weights Contributions are normalizing global effects through the dataset Capture interdependence of the neighbors by deriving their weights simultaneously Present a scalable technique for finding similarities between large number of users Previous Rating Techniques Compare user user similarities or item item similarities Koren Bell New Strategy 1 Remove global effects 2 Find Interpolation weights for the neighbors simultaneously by solving an optimization problem 3 Reduce the dimensionality of users by performing an iterative version of SVD Addresses the following concerns 1 NN methods are not good at accounting for global effects 2 Previous weighting methods fail to account for interdependencies between neighbors 3 Previously interpolation weights always sum to one If there is no useful neighborhood information then it is better to ignore it 4 NN methods work poorly if the number of ratings differs substantially Normalizing by Removing Global Effects Some users may give higher ratings than others users Some items may systematically receive higher ratings that other items Ratings may change over time For global effects consider 1 average rating for an item 2 average rating for a user 3 number of ratings for an item 4 number of ratings for a user Parameter Estimation for Global Effects Sequentially estimate one effect at a time At each step use the residuals from the previous step Parameter Estimation For sparse data user parameter estimate is not reliable because there is too much variance Shrink the parameter based on how many ratings it had This is an example of Bayesian shrinkage Approximation in Estimating User Parameter Koren Bell use a simpler estimator used for the user parameter The user parameter can be estimated from the posterior mean Examples of Global Effects Results are from the Probe dataset 1 4 million ratings Allows rating to change over time and based on average rating or support Effects are processed sequentially Neighborhood Relationships Model Overview Compares item item similarity Defines a model that can predict a rating given weights Sets up an optimization problem to derive the weights Accounts for sparsity by shrinking ratings with low support toward the common mean Jointly derives the weights using a QP solver Neighborhood Relationships Model Find the K most similar items using similarity function Provides a prediction for a rating once the weights have been calculated Computing the weights simultaneously is a least squares optimization problem Optimal Weights A is K x K matrix that stores item item ratings by different users b is a vector with ratings from similar items multiplied by the current item This ignores many pairwise relationships among ratings by the same user Neighborhood Relationships Model Normalizes A and b by dividing it by the number of users that rated items i and j Neighborhood Relationship Model This steps shrinks ratings that have not been rated by many users Preprocessing 1 Calculate similarity between each item pairs 2 Shrink sij based on then number of ratings 3 Normalize A based on the number of users rated each pair of items Preprocessing is quadratic in number of items linear in ratings Calculation of the weights The weights are constrained to be nonnegative This makes it a QP optimization problem Determine the weights by using a quadratic algorithm Algorithm uses Gradient Projection method Rating process 1 Compute item item similarity score 2 Compute item item matrix 3 Given a user and an item find the neighboring K items 4 Use the QP solver to solve K x K system of equations which derives the weights 5 Compute the rating for that specific user and item User Oriented computation Many internet recommendation system will have more users than items If a user has not rated items similar to an item we can find other users that are similar and derive a rating from them Users can passively provide other information to the system transactions web page views use this information to find similar users Low dimensional embedding of the users Regularizing R Iteratively perform SVD Unifying user and item relations in single model A different user v might ratings might be a good predictor for some

View Full Document