CMU CS 15740 - Efficient Parallel Access Patterns for Sparse Matrices


Contents

1 Introduction
2 Collaborative Filtering and the Netflix Prize Dataset
3 Pseudo Singular Value Decomposition
4 Experimental Results
    Sequential Baseline, Movie Order
    Movie Order
    Local Copy, Movie Order
    Local Copy, User Order
    Two Pass
    Greedy Static Order
5 Future Work
6 Conclusion

Efficient Parallel Access Patterns for Sparse Matrices

Kevin Waugh
kwaugh@andrew

Abstract

A notable characteristic of the scientific computing and machine learning problem domains is the large amount of data to be analyzed and manipulated. Here, carefully crafted parallel algorithms have the potential to make a massive impact on the size of a problem that is deemed tractable. Sparse matrices are a data structure frequently encountered within these domains. Typically, these sparse matrices have a non-uniform structure, which can make the task of designing efficient parallel algorithms for sparse matrices much more complex than for their dense counterparts. In this report, we consider multiple ways of parallelizing the most computationally intensive section of a typical sparse matrix algorithm: its inner loop. We compare the theoretical trade-offs between each of the approaches and empirically test performance on the Netflix Prize data set [4, 5], a large collaborative filtering task that seeks to learn movie preferences from a sparse collection of ratings.

1 Introduction

Large-scale scientific computing and machine learning have rapidly become popular research areas in the last decade. The advances in these areas have led to many practical applications that have dramatically impacted society in a generally beneficial way. For example, real-time weather simulation, financial market analysis, web indexing and searching, voice and face recognition, product recommendation, intrusion and fraud detection, and control automation are tasks that computers were rarely used for or trusted with fifteen years ago. Though these applications would not be possible without the staggering advancements made by researchers in these areas, it is safe to say that the upward trend in processor performance and memory capacity has been pivotal to this progression. Recently, however, the rate at which single-processor performance improves has stagnated. Instead, computer architects have shifted focus towards multi-core designs to keep up with the ever-growing computational demands. This paradigm shift will require the efficient design of parallel algorithms to effectively make use of the newest computing resources.

A sparse matrix is a frequently recurring data structure in scientific computing and machine learning. Unlike a dense matrix, most of the entries in a sparse matrix are zero and are not explicitly stored in memory. For example, a web search engine might use a matrix of word frequency counts for a set of web pages. Here, a web page typically uses a small subset of the words in the dictionary, which leads to most of the matrix containing zeros. Similarly, a social networking graph, where each edge indicates some type of relationship, will likely be sparse when represented as an adjacency matrix. In these cases, a machine learning algorithm for creating a search index or inferring relationships must be equipped to handle this sparse structure explicitly to gain traction on any reasonably sized data set.

Let us denote an m by n sparse matrix as a pair (A, S), where (i, j) ∈ S is a non-zero entry in the matrix and A(i, j) is its value. A recurring pattern in sparse matrix algorithms is, then, to repeatedly evaluate functions f and g on the non-zero entries to compute

    u = \sum_{(i,j) \in S} f(i, j, A(i, j)) \quad \text{and} \quad v = \sum_{(i,j) \in S} g(i, j, A(i, j))    (1)

where u ∈ R^m and v ∈ R^n are row and column vectors, respectively.
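For concreteness, here is a minimal sketch of how the pair (A, S) and the inner loop of equation (1) might be realized, assuming a coordinate (COO) layout that stores only the entries of S, and assuming each evaluation of f contributes to component i of u and each evaluation of g to component j of v, as in a matrix-vector product. The names Entry, SparseMatrix, and sweep are illustrative, not taken from the report.

```cpp
#include <cstddef>
#include <vector>

// One stored non-zero: the pair (i, j) from S together with A(i, j).
struct Entry {
    std::size_t i;  // row index
    std::size_t j;  // column index
    double a;       // value A(i, j)
};

// Coordinate (COO) layout: an m-by-n matrix holding only its non-zeros.
struct SparseMatrix {
    std::size_t m, n;
    std::vector<Entry> entries;  // the set S, with values attached
};

// One sweep of the access pattern in equation (1): evaluate f and g on
// every non-zero entry, accumulating u in R^m and v in R^n.
template <typename F, typename G>
void sweep(const SparseMatrix& A, F f, G g,
           std::vector<double>& u, std::vector<double>& v) {
    u.assign(A.m, 0.0);
    v.assign(A.n, 0.0);
    for (const Entry& e : A.entries) {
        u[e.i] += f(e.i, e.j, e.a);  // row-side contribution
        v[e.j] += g(e.i, e.j, e.a);  // column-side contribution
    }
}
```

For example, taking f(i, j, a) = a * x[j] and g identically zero recovers the sparse matrix-vector product u = Ax. It is this single sequential loop over S that the report considers parallelizing.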
This model encompasses a large class of algorithms, including the vast array centered around matrix-vector products, which have been well studied and optimized [2, 6]. For example, it covers solution techniques for linear systems, such as solving a positive definite system with the conjugate gradient method; a variety of matrix factorization algorithms, including those for computing eigenvector and singular value decompositions; and optimization algorithms like interior-point methods for linear and quadratic programming.

Despite its simple structure, no one approach to parallelizing this family of algorithms is obviously best. Usually, determining how to effectively distribute this computation will depend on the non-uniform pattern of sparseness in the data, in addition to the functions f and g. In this report, we will examine a variety of schemes for parallelizing algorithms in this family. We will compare and contrast these schemes empirically, on both shared-memory machines and a chip multiprocessor, as well as theoretically, in terms of both computational and memory resource overhead and cache performance.

This report is organized as follows. First, we will introduce the Netflix Prize data set and its collaborative filtering task, which we will use for our empirical study. Then, we will describe one possible approach to collaborative filtering, the pseudo-singular value decomposition learned using gradient descent, whose inner loop follows the noted sparse access pattern. Finally, we will investigate and contrast, both theoretically and empirically, a variety of ways to parallelize this approach.

2 Collaborative Filtering and the Netflix Prize Dataset

A collaborative filtering task is most easily described as a recommendation system. Take, for example, an online store selling a number of related items. The store might wish to infer the buying habits of a user to strategically advertise the goods that are most likely to be purchased. Similarly, an online movie or music retailer might allow a user to rate music that they like and dislike; the system could then recommend similar music or movies. More formally, a collaborative filtering system takes as input a set of k user/item/rating tuples and gives as output a function that predicts any unobserved tuple. The quality of such a system is usually measured by the root mean squared error of its predictions on a withheld test set.
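To make this quality measure precise, the sketch below evaluates a predictor by its root mean squared error over a withheld test set of user/item/rating triples. The Rating and rmse names are illustrative, and the predictor is left abstract as any callable from a (user, item) pair to a predicted rating.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One observed user/item/rating triple from the withheld test set.
struct Rating {
    std::size_t user;
    std::size_t item;
    double value;
};

// Root mean squared error of an arbitrary predictor over the test set.
template <typename Predict>
double rmse(const std::vector<Rating>& test, Predict predict) {
    if (test.empty()) return 0.0;  // avoid dividing by zero on an empty set
    double sum_sq = 0.0;
    for (const Rating& r : test) {
        const double err = predict(r.user, r.item) - r.value;
        sum_sq += err * err;
    }
    return std::sqrt(sum_sq / static_cast<double>(test.size()));
}
```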
Good collaborative filtering systems consider groups of users, or perhaps all of them at once, when learning preferences. This is because the observation vector for a single user is typically sparsely populated, and little can be inferred from a single user's ratings in isolation. That is, one typically assumes that, though there are many distinct users of the system, many of them share similar tastes. These similar
