University of California, Berkeley
College of Engineering
Department of Electrical Engineering and Computer Sciences
Professors: N. Morgan / B. Gold
EE 225D, Spring 1999

Pattern Classification
Lecture 8

Speech Pattern Recognition
• Soft pattern classification plus temporal sequence integration
• Supervised pattern classification: class labels used in training
• Unsupervised pattern classification: class labels not available or used
• Training: learning the parameters of the classifier
• Testing: classify an independent test set, compare with the labels, and score

[Figure: pattern → Feature Extraction → feature vector (x_1, x_2, ..., x_d) → Classification → class ω_k, 1 ≤ k ≤ K]

Feature Extraction Criteria
• Class discrimination
• Generalization
• Parsimony (efficiency)

[Figure: plosive + vowel energies E(t) for 2 different gains]

A constant gain C scales the energy contour E(t) but drops out of the time derivative of the log energy:

\frac{\partial}{\partial t}\log(C\,E(t)) = \frac{\partial}{\partial t}(\log C + \log E(t)) = \frac{\partial}{\partial t}\log E(t)

so this feature behaves the same at either gain.

Feature Vector Size
• Best representations for discrimination on the training set are large (highly dimensioned)
• Best representations for generalization to the test set are (typically) succinct

Dimensionality Reduction
• Principal components (i.e., SVD, KL transform, eigenanalysis, ...) (see the sketch below)
• Linear Discriminant Analysis (LDA)
• Application-specific knowledge
• Feature selection via PR evaluation
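A minimal sketch of the principal-components option, computed by SVD as the slide suggests. This is illustrative code, not from the course; the function name pca and the toy data are ours.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto their top principal components."""
    # Center the data; principal directions are directions of
    # maximum variance about the mean.
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the eigenvectors
    # of the sample covariance matrix (the KL / eigenanalysis view).
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Toy usage: reduce 20-dimensional feature vectors to 3 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
print(pca(X, 3).shape)  # (100, 3)
```

Note the caution under Some Opinions below: keeping the directions of maximum variance does not guarantee keeping the discriminative ones.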
PR Methods
• Minimum Distance
• Discriminant Functions
• Linear Discriminant
• Nonlinear Discriminant (e.g., quadratic, neural networks)
• Statistical Discriminant Functions

Minimum Distance
• A vector or matrix represents each stored element
• Define a distance function
• Choose the class of the stored element closest to the new input
• The choice of distance is equivalent to implicit statistical assumptions
• For speech, temporal variability complicates this

Problems with Min Distance
• Proper scaling of dimensions (size, discrimination)
• In high dimensions, the space is sparsely sampled

Decision Rule for Min Distance
• Nearest Neighbor (NN): in the limit of infinite samples, at most twice the error of the optimum classifier
• k-Nearest Neighbor (kNN) (see the sketch below)
• Lots of storage for large problems; potentially large searches
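A concrete version of the NN/kNN rule above, as a minimal brute-force sketch (our code, not the course's). Euclidean distance is assumed here; per the Minimum Distance slide, that choice is itself an implicit statistical assumption.

```python
import numpy as np

def knn_classify(x, train_X, train_y, k=1):
    """Label x by majority vote among its k nearest training samples.

    k = 1 is the plain nearest-neighbor (NN) rule.
    """
    # Brute force: every training sample is stored and every distance
    # is computed, the storage/search cost noted in the slide above.
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy usage with two 2-D classes.
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = np.array([0, 0, 1, 1])
print(knn_classify(np.array([0.2, 0.1]), train_X, train_y, k=3))  # 0
```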
Some Opinions
• Better to throw away bad data than to reduce its weight
• Dimensionality reduction based on variance is often a bad choice for supervised pattern recognition

Discriminant Analysis
• Discriminant functions g_i(x): maximum for the correct class, minimum for the others
• They define a decision surface between classes
• A linear decision surface in 2 dimensions is a line, in 3 a plane; in general it is called a hyperplane
• For 2 classes, the surface is at g_1(x) = g_2(x)
• In the 2-class quadratic case, the surface is a quadric: x^T A x + b^T x + c = 0

Training Discriminant Functions
• Minimum distance
• Fisher linear discriminant (see the sketch below)
• Gradient learning
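A minimal sketch of the two-class Fisher linear discriminant listed above (illustrative code; the function name and toy data are ours). It picks the projection direction that maximizes between-class separation relative to within-class scatter.

```python
import numpy as np

def fisher_direction(X1, X2):
    """Fisher discriminant direction w for two sample matrices."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter S_w = S_1 + S_2.
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    # w is proportional to S_w^{-1} (m_1 - m_2); solving the linear
    # system avoids an explicit matrix inverse.
    w = np.linalg.solve(Sw, m1 - m2)
    return w / np.linalg.norm(w)

# Toy usage: two overlapping 2-D Gaussian classes. One simple rule
# is to threshold the projection midway between the projected means.
rng = np.random.default_rng(1)
X1 = rng.normal(loc=[0.0, 0.0], size=(50, 2))
X2 = rng.normal(loc=[3.0, 2.0], size=(50, 2))
w = fisher_direction(X1, X2)
threshold = ((X1.mean(axis=0) + X2.mean(axis=0)) @ w) / 2.0
```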
Generalized Discriminators - ANNs
• McCulloch-Pitts neural model
• Rosenblatt Perceptron
• Multilayer systems

The Perceptron
[Figure: McCulloch-Pitts neuron / Rosenblatt perceptron]

Perceptron Convergence
If the classes are linearly separable, the following rule converges in a finite number of steps (see the training sketch at the end of these notes). For each pattern x presented at time step k, the standard fixed-increment rule is:

w_{k+1} = w_k + x   if x is in class 1 and w_k^T x ≤ 0
w_{k+1} = w_k - x   if x is in class 2 and w_k^T x ≥ 0
w_{k+1} = w_k       otherwise

Multilayer Perceptrons
• Heterogeneous, "hard" nonlinearity (DAID, 1961)
• Homogeneous, "soft" nonlinearity ("modern" MLP)

Some PR Issues
• Testing on the training set
• Training on the test set
• Number of parameters vs. number of training examples: overfitting and overtraining
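The training sketch referenced in the Perceptron Convergence section above (our code, not the course's). Encoding the labels as +1/-1 folds the two cases of the fixed-increment rule into the single update w <- w + y_k x_k on errors; an increment of 1 is assumed.

```python
import numpy as np

def train_perceptron(X, y, max_epochs=100):
    """Fixed-increment perceptron training.

    X: (n, d) patterns; y: labels in {+1, -1}. If the classes are
    linearly separable, this stops after finitely many updates.
    """
    # Append a constant 1 so the bias is learned as w[-1].
    X = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for x_k, y_k in zip(X, y):
            if y_k * (w @ x_k) <= 0:   # pattern misclassified
                w += y_k * x_k         # w + x for class +1, w - x for class -1
                errors += 1
        if errors == 0:                # a full pass with no mistakes
            return w
    return w

# Toy linearly separable problem (AND-like labels in 2-D).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1, -1, -1, 1])
w = train_perceptron(X, y)
X_aug = np.hstack([X, np.ones((4, 1))])
print(np.sign(X_aug @ w))  # [-1. -1. -1.  1.]
```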