Problem Set 2
10-601 Machine Learning

MATLAB tips
• Conventional wisdom: MATLAB hates loops
  – May be less of an issue with the most recent versions
  – Ideally, use matrix operations whenever possible
• Examples: [shown as figures on the slides]

Logistic regression implementation
• In my code that solves for w, the only loop is the one that iterates until the weight vector has converged
• My code converges in ~40k iterations (35.6 s) when
  – lambda = 0
  – w and v initialized to 1
  – k = 0.5
• You can use 'tic' and 'toc' to time your code

Logistic regression implementation
• Tracking the likelihood and the norm of the difference between successive weight vectors at every iteration can be informative
• L1 regularization, lambda = 5
[Figure: norm of the weight-vector difference (x 10^-3, roughly 0.98–1.18) vs. iteration, 0–4500]
[Figure: the same quantity (x 10^-3, roughly 1.000–1.035) vs. iteration, zoomed to the first ~140 iterations]

Logistic regression implementation
• To test your code, observe what happens to the weight vector as you increase lambda
[Figure: histogram of weight values, roughly -100 to 80, with counts up to ~40]

Bayesian linear regression
[Figure from Bishop: sequential Bayesian learning of y(x, w) = w0 + w1*x; panels show the prior/posterior over w and sampled fits after no observations, 1 observation, 2 observations, and 20 observations]

We also discussed…
• Why the Gaussian prior on w in Bayesian linear regression is a conjugate prior, as well as how to compute the posterior
• How to compute information gain in a decision tree with continuous attributes
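The vectorization advice above can be illustrated with a small sketch. Since the slides' MATLAB examples were not recovered, the example below uses NumPy instead, where the same principle applies (array operations instead of explicit loops); the function names and data are illustrative, not from the problem set.

```python
import numpy as np

# Looped version: squared distance from each row of X to a vector w.
def sq_dists_loop(X, w):
    n = X.shape[0]
    out = np.empty(n)
    for i in range(n):          # explicit loop over examples
        diff = X[i] - w
        out[i] = diff @ diff
    return out

# Vectorized version: one array expression, no interpreter-level loop.
def sq_dists_vec(X, w):
    diff = X - w                # broadcasting subtracts w from every row
    return np.einsum('ij,ij->i', diff, diff)

X = np.arange(12.0).reshape(4, 3)
w = np.ones(3)
```

Both versions compute the same values; on large arrays the vectorized form is typically far faster, which is exactly the point of the "MATLAB hates loops" advice.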
View Full Document
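The slides describe a single loop that iterates until the weight vector converges, tracking the norm of the difference between successive weight vectors. A minimal NumPy sketch of that loop is below: plain gradient ascent on the logistic log-likelihood with step size k and weights initialized to 1, as on the slides. The L1 penalty, the auxiliary vector v, and the toy data are not from the slides and are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

def fit_logistic(X, y, k=0.5, tol=1e-6, max_iter=100_000):
    """Gradient ascent on the logistic log-likelihood.
    Stops when the norm of the change in w falls below tol."""
    n, d = X.shape
    w = np.ones(d)                        # slides initialize weights to 1
    for it in range(max_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (y - p)              # gradient of the log-likelihood
        w_new = w + k * grad / n
        diff = np.linalg.norm(w_new - w)  # the quantity tracked on the slides
        w = w_new
        if diff < tol:
            break
    return w, it + 1

# Toy data (hypothetical): label determined mostly by the first feature.
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(size=200), np.ones(200)])  # feature + bias
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)
w, iters = fit_logistic(X, y)
```

Logging `diff` (and the log-likelihood) at every iteration produces exactly the kind of convergence curves shown in the figures above, and wrapping the call in timing code (`tic`/`toc` in MATLAB) lets you compare step sizes.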