CS 59000 Statistical Machine Learning
Lecture 9
Yuan (Alan) Qi

Outline
Review of model comparison
Linear classification:
• Discriminant functions
• Probabilistic generative models
• Probabilistic discriminative models

Likelihood, Parameter Posterior & Evidence
Likelihood and evidence
Parameter posterior distribution and evidence

Evidence Penalizes Over-Complex Models
Given M parameters
Maximizing the evidence leads to a natural trade-off between data fitting and model complexity.

Evidence Approximation & Empirical Bayes
Approximate the evidence by maximizing the marginal likelihood, where the hyperparameters are set to the values that maximize the evidence.
Known as empirical Bayes or type-2 maximum likelihood.

Model Evidence and Cross-Validation
Fitting polynomial regression models (figure panels: root-mean-square error; model evidence)

Classification Approaches
• Discriminant functions: directly assign an input vector to a specific class.
• Probabilistic generative models: model the data generation process (class-conditional densities and class priors) and use Bayes' rule.
• Probabilistic discriminative models: model the posterior class probabilities directly.

Distance from a point x to the decision surface
Hint:

Fisher's Linear Discriminant
Find a projection onto a line such that samples from different classes are well separated.

A Naïve Choice of Separation Measure

Problem of Naïve Separation Criterion

Scatter of Data in Each Class

Solution: Normalization by Scatter
Fisher linear discriminant

Cost Function
Within-class and between-class scatter matrices

Generalized Eigenvalue Problem
Maximize
How to derive the above equation?

Fisher's Linear Discriminant: Example
(figure panels: projection that maximizes mean separation; FLD projection)

Perceptron
Generalized linear model
Minimize
where M denotes the set of all misclassified patterns

Stochastic Gradient Descent

Probabilistic Generative Models

Gaussian Class-Conditional Densities
Conditional densities of data:
The posterior distribution for label/class:

Maximum Likelihood Estimation
Let
Likelihood function

Maximum Likelihood Estimation

Discrete Features
Naïve Bayes classification:

Probabilistic Discriminative Models
Instead of modeling the class-conditional densities p(x | C_k), model the posterior p(C_k | x) directly.

Logistic Regression
Let
Likelihood function

Maximum Likelihood Estimation
Note
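
A small sketch for the Fisher's Linear Discriminant slides above. It assumes the standard two-class result that the optimal projection is proportional to S_W^{-1}(m2 - m1), where S_W is the within-class scatter matrix and m1, m2 are the class means, and it compares the Fisher criterion J(w) for that direction against the naïve choice of projecting onto the line joining the means. The toy data, the function names, and the restriction to 2-D inputs are illustrative assumptions, not part of the lecture.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """2x2 scatter matrix: sum over points of (x - m)(x - m)^T."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class1, class2):
    """Unit vector proportional to S_W^{-1} (m2 - m1)."""
    m1, m2 = mean(class1), mean(class2)
    s1, s2 = scatter(class1, m1), scatter(class2, m2)
    sw = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    d = [m2[0] - m1[0], m2[1] - m1[1]]
    w = [inv[0][0] * d[0] + inv[0][1] * d[1],
         inv[1][0] * d[0] + inv[1][1] * d[1]]
    norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
    return [w[0] / norm, w[1] / norm]


def fisher_criterion(w, class1, class2):
    """J(w) = (projected mean gap)^2 / (sum of projected class scatters)."""
    def project(points):
        return [w[0] * p[0] + w[1] * p[1] for p in points]
    y1, y2 = project(class1), project(class2)
    mu1, mu2 = sum(y1) / len(y1), sum(y2) / len(y2)
    s1 = sum((y - mu1) ** 2 for y in y1)
    s2 = sum((y - mu2) ** 2 for y in y2)
    return (mu2 - mu1) ** 2 / (s1 + s2)


# Toy data (made up): two elongated, nearly parallel clusters.
class1 = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.2)]
class2 = [(1.0, 0.0), (2.0, 1.0), (3.0, 2.0), (4.0, 2.8)]

w_fld = fisher_direction(class1, class2)

# Naive alternative: project onto the line joining the class means.
m1, m2 = mean(class1), mean(class2)
gap = [m2[0] - m1[0], m2[1] - m1[1]]
gap_norm = (gap[0] ** 2 + gap[1] ** 2) ** 0.5
w_naive = [gap[0] / gap_norm, gap[1] / gap_norm]

j_fld = fisher_criterion(w_fld, class1, class2)
j_naive = fisher_criterion(w_naive, class1, class2)
print("J(FLD) =", j_fld, " J(naive) =", j_naive)
```

On this data the mean-difference direction runs almost along the long axis of both clusters, so the projected classes overlap heavily; normalizing by scatter rotates the projection so that the within-class spread shrinks and J(w) is much larger.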
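
A minimal sketch for the Perceptron and Stochastic Gradient Descent slides above. It assumes the usual perceptron criterion, E_P(w) = -sum over n in M of w^T phi_n t_n with targets t_n in {-1, +1}, so that a stochastic gradient step on one misclassified pattern is w <- w + eta * phi_n * t_n. The data, the learning rate, and the function names are hypothetical.

```python
def activation(w, x):
    """Linear activation w^T x; x already includes the bias feature."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_perceptron(features, targets, eta=1.0, max_epochs=100):
    """Cycle through the data, updating w only on misclassified patterns."""
    w = [0.0] * len(features[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in zip(features, targets):
            # A pattern belongs to the misclassified set M when w^T x t <= 0.
            if activation(w, x) * t <= 0:
                # Stochastic gradient step on -w^T x t for this pattern.
                w = [wi + eta * t * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # M is empty, so the criterion is zero
            break
    return w


# Toy, linearly separable data (made up); the first feature is the bias 1.
features = [(1.0, 2.0, 2.0), (1.0, 3.0, 1.5), (1.0, 2.5, 3.0),
            (1.0, -1.0, -1.0), (1.0, -2.0, 0.0), (1.0, 0.0, -2.5)]
targets = [1, 1, 1, -1, -1, -1]

w = train_perceptron(features, targets)
errors = sum(1 for x, t in zip(features, targets) if activation(w, x) * t <= 0)
print("weights:", w, "misclassified:", errors)
```

Because the toy data is linearly separable, the loop stops once no pattern is misclassified. On non-separable data the updates would never settle, which is why the sketch caps the number of epochs.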
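
A sketch for the Gaussian Class-Conditional Densities and Maximum Likelihood Estimation slides above, under simplifying assumptions that are not taken from the lecture: one-dimensional inputs, two classes, and a variance shared by both classes. With those assumptions the maximum likelihood estimates are the class fraction, the per-class sample means, and the pooled variance, and the class posterior reduces to a logistic sigmoid of a linear function of x.

```python
import math


def fit_gaussian_generative(xs, ts):
    """ML estimates for a 1-D, two-class model with a shared variance."""
    x1 = [x for x, t in zip(xs, ts) if t == 1]
    x2 = [x for x, t in zip(xs, ts) if t == 0]
    prior1 = len(x1) / len(xs)      # fraction of points in class 1
    mu1 = sum(x1) / len(x1)         # class means
    mu2 = sum(x2) / len(x2)
    # Pooled variance: within-class squared deviations over all N points.
    var = (sum((x - mu1) ** 2 for x in x1)
           + sum((x - mu2) ** 2 for x in x2)) / len(xs)
    return prior1, mu1, mu2, var


def posterior_class1(x, prior1, mu1, mu2, var):
    """p(C1 | x) = sigmoid(w * x + w0) for shared-variance Gaussians."""
    w = (mu1 - mu2) / var
    w0 = (mu2 ** 2 - mu1 ** 2) / (2.0 * var) + math.log(prior1 / (1.0 - prior1))
    return 1.0 / (1.0 + math.exp(-(w * x + w0)))


# Toy data (made up): class 1 on the right, class 0 on the left.
xs = [1.0, 2.0, 3.0, -1.0, -2.0, -3.0]
ts = [1, 1, 1, 0, 0, 0]

params = fit_gaussian_generative(xs, ts)
p_mid = posterior_class1(0.0, *params)    # halfway between the class means
p_right = posterior_class1(2.0, *params)  # at the class-1 mean
print("p(C1 | x=0) =", p_mid, " p(C1 | x=2) =", p_right)
```

The linear form of the exponent comes from the quadratic terms in x cancelling when both classes share a variance; with class-specific variances the decision boundary would be quadratic instead.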
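
A sketch for the Logistic Regression slides above. It assumes the standard model p(C1 | phi) = sigma(w^T phi) with targets t_n in {0, 1}, whose negative log-likelihood is the cross-entropy error with gradient sum over n of (y_n - t_n) phi_n. Plain batch gradient descent is used here for brevity; that is a substitution chosen for this example, not necessarily the optimizer used in the lecture. The data, step size, and iteration count are illustrative.

```python
import math


def sigmoid(a):
    """Logistic sigmoid, 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


def neg_log_likelihood(w, features, targets):
    """Cross-entropy error: -sum_n [t_n ln y_n + (1 - t_n) ln(1 - y_n)]."""
    total = 0.0
    for x, t in zip(features, targets):
        y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        total -= t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return total


def fit_logistic(features, targets, step=0.5, iterations=500):
    """Batch gradient descent on the mean cross-entropy error."""
    n = len(features)
    w = [0.0] * len(features[0])
    for _ in range(iterations):
        grad = [0.0] * len(w)
        for x, t in zip(features, targets):
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            # Per-pattern gradient of the error is (y_n - t_n) * phi_n.
            for j, xj in enumerate(x):
                grad[j] += (y - t) * xj
        w = [wj - step * gj / n for wj, gj in zip(w, grad)]
    return w


# Toy data (made up): features are (bias, x); the classes overlap slightly,
# so the maximum likelihood weights stay finite.
features = [(1.0, -2.0), (1.0, -1.5), (1.0, -1.0), (1.0, -0.5),
            (1.0, 0.5), (1.0, 1.0), (1.0, 1.5), (1.0, 2.0)]
targets = [0, 0, 0, 1, 0, 1, 1, 1]

nll_initial = neg_log_likelihood([0.0, 0.0], features, targets)
w = fit_logistic(features, targets)
nll_final = neg_log_likelihood(w, features, targets)
p_right = sigmoid(w[0] + 2.0 * w[1])   # p(C1 | x = 2)
p_left = sigmoid(w[0] - 2.0 * w[1])    # p(C1 | x = -2)
print("NLL:", nll_initial, "->", nll_final)
```

If the two classes were perfectly separable, the likelihood would keep increasing as the weights grow without bound, so a fixed iteration budget (or a prior on w) would be what keeps the fit finite.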