CS 6375 Machine Learning, Spring 2015
Homework 4 (100 points total)
Part I: Written problems (65 points). Due 03/13/2015.

1. Nearest neighbor classifier (10 pts)

For the data set shown in the figure below, draw a decision boundary for k = 1 and for k = 3 (a rough sketch is okay). What can you say about overfitting?

[Figure: training data, with one panel for the 1-NN boundary and one for the 3-NN boundary]

2. Nearest neighbor classifier (10 pts)

Consider the k-nearest-neighbor rule in a 2-class problem with equal prior probabilities. There is only one attribute, with continuous values. Assume that n data points (your training set) are generated independently in the following way: first a class is randomly picked (again, the two classes have equal prior probabilities), then a sample is randomly generated for this class from its class-conditional probability density. Assume further that the class-conditional densities P(X | C1) and P(X | C2) are uniform over [0, 1] and [10, 11], respectively. The distance used for kNN is just the absolute difference between two values. Show that for an odd value of k, the average probability of error is given by

    P_n(e) = (1 / 2^n) * sum_{j=0}^{(k-1)/2} C(n, j)

where C(n, j) denotes the binomial coefficient.

3. Cross-validation (CV) (10 pts)

The following table shows the training data for a binary classification task: the first column is the index of the data point, the second shows the value of the feature (there is just one attribute), and the last gives the class label. We will use a nearest neighbor classifier for this problem.

    Sample       1    2    3    4    5    6    7    8    9    10
    Attribute 1  0.1  0.7  1.0  1.6  2.0  2.5  3.2  3.5  4.1  4.9
    Class label

(a) What is the 5-fold CV error of 1-NN on this data set? Split the data like this: fold I = samples {1, 2}, fold II = {3, 4}, fold III = {5, 6}, fold IV = {7, 8}, fold V = {9, 10}.

(b) What is the 2-fold CV error of 3-NN on this data set? Use the first 5 data points as the first subset and the last 5 points as the other subset.

4. HMM application paper reading (15 pts)

Read a paper that uses an HMM for some application (e.g., in bioinformatics, speech recognition, or language processing) and summarize it. Please describe the task clearly; try to make it understandable by someone who is not working
in that particular field but has some general machine learning background. Please also explain what the hidden states and the transition probabilities represent for the particular task, and how those parameters are estimated.

5. HMM (20 pts)

Consider a 3-state (S1, S2, S3), 2-output (V1, V2) HMM specified by the following model parameters.

Initial state vector:

    pi = (1.0, 0.0, 0.0)

State transition matrix (row = current state, column = next state, both ordered S1, S2, S3):

    A = | 0.5  0.4  0.1 |
        | 0.0  0.6  0.4 |
        | 0.0  0.0  1.0 |

Observation probability matrix (one column per state S1, S2, S3):

    B        S1   S2   S3
      V1    0.7  0.6  0.2
      V2    0.3  0.4  0.8

(a) Draw the state transition diagram for the above model, labeling each arc with the probability of the transition.

(b) Show the trellis of the allowable paths through the model for an observation sequence of length 4. Assume the final state must be S3. Note: for your graph, use the x-axis for time steps and the y-axis for states. Please use 1 as the starting time.

(c) For the observation sequence O = V1 V1 V1 V2, what is the probability P(O | model), considering all possible paths that end in S3 at time T = 4? Use the forward algorithm for this question.

(d) For the same observation sequence, what is P(O | model) along the most likely path, and what is that path? Use the Viterbi algorithm for this problem. Again, assume the final state must be S3.
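For parts (c) and (d) of Problem 5, the forward and Viterbi recursions can be sketched as below, with the final state pinned to S3 as the problem requires. This is only a checking aid for your hand-worked trellis, not a substitute for showing the work; note that states and outputs are 0-indexed in the code (S3 is index 2, V1 is index 0).

```python
# Forward and Viterbi recursions for the HMM of Problem 5.
# States S1, S2, S3 -> indices 0, 1, 2; outputs V1, V2 -> indices 0, 1.
pi = [1.0, 0.0, 0.0]
A = [[0.5, 0.4, 0.1],
     [0.0, 0.6, 0.4],
     [0.0, 0.0, 1.0]]
# B[j][v] = P(emit V_{v+1} | state S_{j+1})  (transpose of the table above)
B = [[0.7, 0.3],
     [0.6, 0.4],
     [0.2, 0.8]]
O = [0, 0, 0, 1]          # observation sequence V1 V1 V1 V2

def forward(pi, A, B, O, end_state):
    """P(O, final state = end_state | model) via the forward algorithm."""
    alpha = [pi[j] * B[j][O[0]] for j in range(len(pi))]
    for t in range(1, len(O)):
        alpha = [sum(alpha[i] * A[i][j] for i in range(len(pi))) * B[j][O[t]]
                 for j in range(len(pi))]
    return alpha[end_state]

def viterbi(pi, A, B, O, end_state):
    """Most likely state path ending in end_state, and its joint probability."""
    n = len(pi)
    delta = [pi[j] * B[j][O[0]] for j in range(n)]
    back = []
    for t in range(1, len(O)):
        new_delta, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] * A[i][j])
            ptr.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][O[t]])
        delta, back = new_delta, back + [ptr]
    # Backtrack from the required final state.
    path = [end_state]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return delta[end_state], path[::-1]

print(forward(pi, A, B, O, end_state=2))        # answer to part (c)
p, path = viterbi(pi, A, B, O, end_state=2)
print(p, [f"S{s + 1}" for s in path])           # answer to part (d)
```

Because the model is left-to-right (A is upper-triangular), many trellis arcs have probability zero, which keeps the hand calculation small.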
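The formula you are asked to prove in Problem 2 can be sanity-checked numerically. The sketch below is not part of the assignment; it picks an arbitrary n = 5 and k = 3, simulates the generative process described in the problem, runs kNN on a fresh test point, and compares the empirical error against P_n(e) = (1/2^n) * sum_{j=0}^{(k-1)/2} C(n, j).

```python
import random
from math import comb

def knn_error_rate(n, k, trials=100_000, seed=0):
    """Empirical kNN error when the class-conditionals are U[0, 1]
    for class 0 and U[10, 11] for class 1, with equal priors."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        # Training set: pick a class uniformly, then draw x for that class.
        train = [(rng.random() + 10 * c, c)
                 for c in (rng.randrange(2) for _ in range(n))]
        # Test point is drawn the same way.
        c_test = rng.randrange(2)
        x_test = rng.random() + 10 * c_test
        # k nearest neighbors by absolute distance; majority vote (k odd).
        neighbors = sorted(train, key=lambda p: abs(p[0] - x_test))[:k]
        votes = sum(c for _, c in neighbors)
        pred = 1 if 2 * votes > k else 0
        errors += (pred != c_test)
    return errors / trials

def exact_error(n, k):
    """P_n(e) = (1/2^n) * sum_{j=0}^{(k-1)/2} C(n, j)."""
    return sum(comb(n, j) for j in range((k - 1) // 2 + 1)) / 2 ** n

n, k = 5, 3
print(exact_error(n, k))        # 0.1875 for n = 5, k = 3
print(knn_error_rate(n, k))     # should agree to ~2 decimal places
```

The simulation also suggests the key step of the proof: since same-class points lie within distance 1 of the test point and other-class points at distance at least 9, kNN errs exactly when fewer than (k+1)/2 of the n training points share the test point's class, and that count is Binomial(n, 1/2).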
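The fold bookkeeping in Problem 3 can be sketched as follows. The class labels must be taken from the original handout table; the `labels` list below is a made-up placeholder purely so the sketch runs, so the number it prints says nothing about the homework answer.

```python
# k-fold CV with a kNN classifier on a single continuous attribute.
x = [0.1, 0.7, 1.0, 1.6, 2.0, 2.5, 3.2, 3.5, 4.1, 4.9]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]   # PLACEHOLDER labels, not the real ones

# Folds as specified in part (a): I = {1, 2}, II = {3, 4}, ..., V = {9, 10},
# written here with 0-based indices.
folds = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]

def knn_predict(train_x, train_y, q, k=1):
    """Majority vote over the k nearest training points to query q."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: abs(train_x[i] - q))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0

def cv_error(x, y, folds, k=1):
    """Fraction of held-out points misclassified across all folds."""
    mistakes = 0
    for fold in folds:
        train_x = [x[i] for i in range(len(x)) if i not in fold]
        train_y = [y[i] for i in range(len(x)) if i not in fold]
        for i in fold:
            mistakes += (knn_predict(train_x, train_y, x[i], k) != y[i])
    return mistakes / len(x)

print(cv_error(x, labels, folds, k=1))
```

For part (b), pass `folds=[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]` and `k=3` instead.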