Solving Large Scale Kernel Machines using Random Features

Motivation
● Terabyte and even petabyte scale datasets
● Batch kernel methods have good performance but do not scale well to massive datasets
● Recall that the kernel matrix K is n x n, where n is the number of samples
● Suppose we have 10^6 samples. Storage requirements for K: 8 bytes per double * 10^6 * 10^6 = 8 TB of memory
● Expensive to evaluate a new test point: f(x) = sum_{i=1..N} c_i k(x_i, x) => O(Nd) operations

Random Features* to Approximate Kernel Functions
● Approximate a shift-invariant kernel (e.g., Gaussian) with a randomized feature map z so that k(x, y) ≈ z(x)'z(y), where z(x) = sqrt(2/D) [cos(w_1'x + b_1), ..., cos(w_D'x + b_D)], with the w_i drawn from the kernel's Fourier transform and the b_i uniform on [0, 2*pi] (a code sketch follows the slides)
● What do we gain?
● Scale to very large datasets with competitive accuracy
● O(Dd) operations to evaluate a new test point
● Linear learning methods for non-linear kernels
* Rahimi and Recht. Random Features for Large-Scale Kernel Machines. NIPS 2007.

Project Goals
● Understand the technique of random features
● Compare the performance of various random feature sets to traditional kernel methods
● Evaluate the performance and feasibility of this technique on very large datasets, i.e. ImageNet

Resources
Papers:
● Rahimi and Recht. Random Features for Large-Scale Kernel Machines. NIPS 2007.
● Rahimi and Recht. Weighted Sums of Random Kitchen Sinks: Replacing Minimization with Randomization in Learning. NIPS 2008.
Code:
● http://berkeley.intel-research.net/arahimi/c/random-features/random-features.tgz
Datasets:
● Datasets used in the paper as a benchmark: http://berkeley.intel-research.net/arahimi/c/random-features/data/
● Really big
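
Below is a minimal NumPy sketch of the random Fourier feature map described on the slides, independent of the authors' released C code linked under Resources. The function name random_fourier_features and the parameter choices (D, gamma, the toy data) are illustrative assumptions, not taken from the slides.

# A minimal sketch of random Fourier features for the Gaussian (RBF) kernel,
# following Rahimi & Recht (NIPS 2007). Names and parameter values here are
# illustrative assumptions, not the slides' or the authors' code.
import numpy as np

def random_fourier_features(X, D=500, gamma=1.0, seed=None):
    """Map X (n x d) to Z (n x D) so that Z[i] . Z[j] ~ exp(-gamma * ||x_i - x_j||^2)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # For k(x, y) = exp(-gamma * ||x - y||^2), the Fourier transform is a
    # Gaussian with variance 2 * gamma, so draw w ~ N(0, 2 * gamma * I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    # z(x) = sqrt(2 / D) * cos(W'x + b); costs O(Dd) per point, as on the slide.
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Usage: compare the approximate and exact Gaussian kernels on toy data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
Z = random_fourier_features(X, D=2000, gamma=0.5, seed=1)
K_approx = Z @ Z.T
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-0.5 * sq_dists)
print("max abs error:", np.abs(K_approx - K_exact).max())

Training a linear model (e.g., ridge regression or a linear SVM) on Z then stands in for the kernel machine; evaluating a new test point costs O(Dd) for the feature map plus O(D) for the inner product, which is the gain claimed above.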