Purdue CS 59000 - Statistical Machine Learning


CS 59000 Statistical Machine Learning
Lecture 9
Alan Qi

Outline
- Review of Parzen windows
- K-nearest-neighbour classification
- Linear regression with basis functions
- Ridge regression and the lasso
- Bayesian model selection
- Bayes factors
- Empirical Bayes

Nonparametric Methods (4)
Assume observations are drawn from a density p(x), and consider a small region R containing x such that

  P = \int_R p(x) \, dx

The probability that K out of N observations lie inside R is Bin(K | N, P), and if N is large,

  K \simeq N P

If the volume V of R is sufficiently small, p(x) is approximately constant over R, so

  P \simeq p(x) V

Thus

  p(x) \simeq K / (N V)

What is the relation to the histogram method?

Nonparametric Methods (5)
Kernel Density Estimation: fix V and estimate K from the data. Let R be a hypercube of side h centred on x, and define the kernel function (Parzen window)

  k(u) = 1 if |u_i| \le 1/2 for i = 1, \dots, D, and 0 otherwise

It follows that

  K = \sum_{n=1}^{N} k((x - x_n) / h)

and hence

  p(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{h^D} \, k((x - x_n) / h)

What are the relations to the histogram method, and what is this estimator's drawback?

Nonparametric Methods (5)
To avoid discontinuities in p(x), use a smooth kernel, e.g. a Gaussian:

  p(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{(2\pi h^2)^{D/2}} \exp\left(-\frac{\|x - x_n\|^2}{2h^2}\right)

Any kernel k(u) such that

  k(u) \ge 0 and \int k(u) \, du = 1

will work. h acts as a smoother.

Nonparametric Methods (6)
Nearest-Neighbour Density Estimation: fix K and estimate V from the data. Consider a hypersphere centred on x and let it grow to a volume V* that includes K of the given N data points. Then

  p(x) \simeq K / (N V*)

K acts as a smoother.

K-Nearest-Neighbours for Classification (1)
Given a data set with N_k data points from class C_k and \sum_k N_k = N, we have

  p(x | C_k) = K_k / (N_k V)

and correspondingly

  p(x) = K / (N V)

Since p(C_k) = N_k / N, Bayes' theorem gives

  p(C_k | x) = p(x | C_k) p(C_k) / p(x) = K_k / K

Then how do we classify the data points?

K-Nearest-Neighbours for Classification (2)
[Figures: K-nearest-neighbour decision regions for K = 1 and K = 3.]

K-Nearest-Neighbours for Classification (3)
- K acts as a smoother.
- As N \to \infty, the error rate of the 1-nearest-neighbour classifier is never more than twice the optimal error rate (obtained from the true conditional class distributions).

Nonparametric vs Parametric
Nonparametric models (other than histograms) require storing and computing with the entire data set.
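That cost is easy to see in code: a Parzen-window estimate must visit every stored data point at evaluation time. Below is a minimal sketch of the Gaussian-kernel estimator described above; the function name and the NumPy implementation are my own, not from the lecture.

```python
import numpy as np

def gaussian_kde(x, data, h):
    """Gaussian Parzen-window estimate:
    p(x) = (1/N) * sum_n (2*pi*h^2)^(-D/2) * exp(-||x - x_n||^2 / (2*h^2))."""
    data = np.asarray(data, dtype=float)        # shape (N, D): the entire data set
    _, d = data.shape
    sq_dists = np.sum((data - x) ** 2, axis=1)  # ||x - x_n||^2 for every n
    norm = (2.0 * np.pi * h ** 2) ** (d / 2.0)  # Gaussian normalising constant
    return float(np.mean(np.exp(-sq_dists / (2.0 * h ** 2)) / norm))

# Sanity check: with a single data point at the origin and h = 1, the
# estimate at x = 0 is the standard normal density, 1/sqrt(2*pi).
p0 = gaussian_kde(np.zeros(1), [[0.0]], 1.0)
```

Shrinking h makes the estimate spikier around the data points; growing it smooths the estimate out, which is the sense in which h acts as a smoother.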
Parametric models, once fitted, are much more efficient in terms of storage and computation.

[The following slides appear as titles only in this preview; their equations and figures were not captured in the extraction:]

Linear Regression
Basis Functions
Examples of Basis Functions (1)
Examples of Basis Functions (2)
Maximum Likelihood Estimation (1)
Maximum Likelihood Estimation (2)
Sequential Estimation
Regularized Least Squares
More Regularizers
Visualization of Regularized Regression
Bayesian Linear Regression
Posterior Distributions of Parameters
Predictive Posterior Distribution
Examples of Predictive Distribution

Question
Suppose we use Gaussian basis functions. What will happen to the predictive distribution if we evaluate it at places far from all training data points?

Equivalent Kernel
Given the posterior mean m_N = \beta S_N \Phi^T t, the predictive mean is

  y(x, m_N) = m_N^T \phi(x) = \sum_{n=1}^{N} \beta \, \phi(x)^T S_N \phi(x_n) \, t_n = \sum_{n=1}^{N} k(x, x_n) \, t_n

with the equivalent kernel

  k(x, x') = \beta \, \phi(x)^T S_N \phi(x')

Basis functions and their equivalent kernels: Gaussian, polynomial, sigmoidal.
[Figures: equivalent kernels for each basis-function family.]

Covariance between two predictions:

  cov[y(x), y(x')] = \beta^{-1} k(x, x')

The predictive mean at nearby points will be highly correlated, whereas for more distant pairs of points the correlation will be smaller.
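To make the equivalent-kernel view concrete, here is a small sketch that builds k(x, x_n) = beta * phi(x)^T S_N phi(x_n) for Gaussian basis functions and forms the predictive mean as a weighted sum of the training targets. The toy data set, the hyperparameter values alpha and beta, and all names are my own choices for illustration, not from the lecture.

```python
import numpy as np

def design_matrix(x, centres, s):
    # Gaussian basis functions: phi_j(x) = exp(-(x - mu_j)^2 / (2 s^2))
    return np.exp(-(np.asarray(x)[:, None] - centres[None, :]) ** 2 / (2.0 * s ** 2))

def equivalent_kernel(x_eval, x_train, centres, s, alpha, beta):
    """k(x, x_n) = beta * phi(x)^T S_N phi(x_n),
    where S_N^{-1} = alpha * I + beta * Phi^T Phi."""
    Phi = design_matrix(x_train, centres, s)
    S_N = np.linalg.inv(alpha * np.eye(len(centres)) + beta * Phi.T @ Phi)
    return beta * design_matrix(x_eval, centres, s) @ S_N @ Phi.T

# Toy 1-D data set; arbitrary hyperparameters alpha = 0.01, beta = 25.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, 20)
t = np.sin(3.0 * x_train)
centres = np.linspace(-1.0, 1.0, 9)
K = equivalent_kernel(np.array([0.3]), x_train, centres, 0.3, 1e-2, 25.0)
pred_mean = (K @ t)[0]   # y(x) = sum_n k(x, x_n) * t_n
```

Inspecting the row of K shows the localisation described above: training points near x = 0.3 receive the largest weights. The covariance between two predictions is then beta^{-1} * k(x, x'), so nearby evaluation points yield highly correlated predictive means.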

