Midterm CS 6375 Spring 2015 Solutions The exam is closed book You are allowed a one page cheat sheet Answer the questions in the spaces provided on the question sheets If you run out of room for an answer use an additional sheet available from the instructor and staple it to your exam NAME UTD ID if known Question Points Score Decision Trees Linear Classi ers Neural Networks Support Vector Machines Point Estimation Naive Bayes Instance Based Learning Extra Credit Linear Regression 10 10 20 10 10 10 10 20 Total 100 1 Spring 2015 Final Page 2 of 16 March 26 2015 Spring 2015 Final Page 3 of 16 March 26 2015 Question 1 Decision Trees 10 points Consider the training dataset given below A B and C are the attributes and Y is the class variable A B C 1 0 0 1 0 0 0 1 1 0 1 1 Y 0 Yes 1 Yes 0 No 1 No 1 No 0 Yes a 2 points Can you draw a decision tree having 100 accuracy on this training set If you answer is yes draw the decision tree in the space provided below If your answer is no explain why Solution No Because Examples 2 and 4 have the same feature vector but different classes b 3 points Which attribute among A B and C has the highest information gain Explain your answer Solution IG A H 3 3 1 IG B H 3 3 1 IG C H 3 3 1 Therefore all three attributes have the same information gain 2 H 2 1 1 2 H 2 1 1 2 H 2 1 1 2 H 2 1 2 H 2 1 2 H 2 1 Spring 2015 Final Page 4 of 16 March 26 2015 c 3 points You are given a collection of datasets that are linearly separable Is it always possible to construct a decision tree having 100 accuracy on such datasets True or False Explain your answer Solution True A decision tree can represent any Boolean function Since a linear classi er can be represented using a Boolean function and any linear classi er will have 100 accuracy on the training dataset the decision tree representing the Boolean function associated with the linear classi er will also have 100 accuracy on the training dataset d 2 points You are given two decision trees that represent the same target concept In other words given the same input the two decision trees will always yield the same output Does it imply that the two decision trees have the same number of nodes True or False Explain your answer Solution False Consider two decision trees representing the target concept x1 x2 x2 x3 Draw one along the ordering x2 x1 and x3 and draw the second one along the ordering x1 x2 and x3 The rst decision tree has 7 nodes while the second one has 9 nodes Please verify Spring 2015 Final Page 5 of 16 March 26 2015 Question 2 Linear Classi ers 10 points Recall that a linear threshold function or a linear classi er is given by If w0 cid 80 class is positive otherwise it is negative Assume that 1 is true and 0 is false i wixi 0 then a 5 points Consider a function over n Binary features de ned as follows If at least k variables are false then the class is positive otherwise the class is negative Can you represent this function using a linear threshold function If your answer is YES then give a precise numerical setting of the weights Otherwise clearly explain why this function cannot be represented using a linear threshold function Solution YES Consider the following setting of weights wi 1 for all i and w0 n k 0 5 If at least k variables are false then cid 80 i wixi n k Therefore w0 wixi n k n k 0 5 0 5 0 On the other hand if fewer than k variables are false then cid 80 n k 1 Therefore i wixi n k 1 w0 wixi n k 1 n k 0 5 0 5 0 cid 88 i cid 88 i b 5 points Consider a linear threshold function over n real valued features having w0 0 namely the bias term is zero Can you always represent such a function using a linear threshold function having only n 1 features Answer YES or NO and brie y explain your answer Note that no credit will be given if your explanation is incorrect Solution NO Because we have to nd a weight vector v0 v1 vn 1 such that n cid 88 i 1 wixi v0 vixi n 1 cid 88 i 1 This is not possible because the left hand side has n variables and the right hand side has only n 1 variables Spring 2015 Final Page 6 of 16 March 26 2015 Question 3 Neural Networks 20 points a 10 points Draw a neural network that represents the function f x1 x2 x3 de ned below You can only use two types of units linear units and sign units Recall that the linear unit takes as input weights and attribute values and outputs w0 cid 80 i wixi while the sign unit outputs 1 if w0 cid 80 i wixi 0 and 1 otherwise x1 x2 x3 0 0 0 1 0 0 0 1 0 1 1 0 0 0 1 1 0 1 0 1 1 1 1 1 f x1 x2 x3 10 5 5 10 5 10 10 10 Note that to get full credit you have to write down the precise numeric weights e g 1 0 5 1 etc as well as the precise units used at each hidden and output node Solution Several solutions are possible The simplest solution is to have two hidden layers and one output node The rst hidden layer has three nodes with each unit being a sign unit the second layer has one node which is also a sign unit and the output node is a linear unit The three hidden nodes in the rst hidden layer represent the following functions respectively 1 If x1 x2 x3 evaluates to true output a 1 otherwise output a 1 An example setting of weights is w1 1 w2 1 and w3 1 and w0 0 5 2 If x1 x2 x3 evaluates to true output a 1 otherwise output a 1 An example setting of weights is w1 1 w2 1 and w3 1 and w0 0 5 3 If x1 x2 x3 evaluates to true output a 1 otherwise output a 1 An example setting of weights is w1 1 w2 1 and w3 1 and w0 0 5 The input to the hidden unit in the second hidden layer is the output of all hidden nodes in the rst hidden layer It represents the following function 1 If at least one of the inputs is a 1 the unit outputs a 1 otherwise it outputs a 1 It thus represents an OR node An example setting of weights is w0 0 …
View Full Document