UIC CS 583 - Midterm Review

1. (a) Name three classification techniques. No need to explain how they work.
   (b) (3%) How do you describe overfitting in classification?
   (c) (3%) Given the following decision tree, generate all the rules from the tree. Note that we have two classes, Yes and No.

   [Figure: a decision tree whose internal nodes test Age (>= 40 / < 40), Sex (M / F), income (>= 50k / < 50k), and job (y / n); each leaf is labeled Yes or No.]

   (d) List three objective interestingness measures of rules, and list two subjective interestingness measures of rules. No need to explain.
   (e) (5%) To build a naïve Bayesian classifier, we can make use of association rule mining. How do you compute P(Ai = aj | C = ck) from association rules, where Ai is an attribute, aj is a value of Ai, and ck is a class value of the class attribute C?

2. (10%) Given the following table with three attributes, a1, a2, and a3:

   a1  a2  a3
   C   B   H
   B   F   S
   A   F   F
   C   B   H
   B   F   G
   B   E   O

   We want to mine all the large (or frequent) itemsets in the data. Assume the minimum support is 30%. Following the Apriori algorithm, give the set of large itemsets in L1, L2, ..., and candidate itemsets in C2, C3, ... (after the join step and the prune step). What additional pruning can be done in candidate generation, and how? (See the Apriori sketch after the questions.)

3. (10%) In multiple minimum support association rule mining, we can assign a minimum support to each item, called the minimum item support (MIS). We define that an itemset {item1, item2, ...} is large (or frequent) if its support is greater than or equal to min(MIS(item1), MIS(item2), ...). Given the transaction data:

   {Beef, Bread}
   {Bread, Cloth}
   {Bread, Cloth, Milk}
   {Cheese, Boots}
   {Beef, Bread, Cheese, Shoes}
   {Beef, Bread, Cheese, Milk}
   {Bread, Milk, Cloth}

   and the following minimum item support assignments for the items in the transaction data:

   MIS(Milk) = 50%, MIS(Bread) = 70%

   with the MIS values for the rest of the items in the data all 25%, follow the MSapriori algorithm and give the set of large (or frequent) itemsets in L1, L2, .... (See the multiple-minimum-support sketch after the questions.)

4. (10%) Given the following training data, which has two attributes A and B and a class C, compute all the probability values required to build a naïve Bayesian classifier. Ignore smoothing. (See the counting sketch after the questions.)

   A  B  C
   m  t  y
   m  s  y
   g  q  y
   h  s  y
   g  q  y
   g  q  n
   g  s  n
   h  t  n
   h  q  n
   m  t  n

   Answer:
   P(C=y) =          P(C=n) =
   P(A=m | C=y) =    P(A=g | C=y) =    P(A=h | C=y) =
   P(A=m | C=n) =    P(A=g | C=n) =    P(A=h | C=n) =
   P(B=t | C=y) =    P(B=s | C=y) =    P(B=q | C=y) =
   P(B=t | C=n) =    P(B=s | C=n) =    P(B=q | C=n) =

5. Using agglomerative clustering, cluster the following one-dimensional data: 1, 2, 4, 6, 9, 11, 20, 23, 27, 30, 34, 100, 120, 130. You are required to draw the cluster tree and write the value of the cluster center represented by each node next to the node. (See the clustering sketch after the questions.)

   [Figure: a cluster tree over these points, with cluster centers such as 1.5, 3.25, 5.5, 10, 15.2, 21.5, 26.8, 28.5, 30.3, 36.9, 116.7, and 125 written next to the nodes.]

6. Given the following positive and negative data points, draw a possible decision tree partition and a possible SVM decision surface, respectively.

   [Figure: a scatter of positive and negative points, shown in two panels titled "Draw a possible decision tree partition" and "Draw a SVM decision surface".]

7. In a marketing application, a predictive model is built to score a test database to identify likely customers. After scoring, the following configuration of 10 bins is obtained. Each number in the second row is the number of positive cases in the test data that fall into the corresponding bin. Draw the lift chart for the results. Your drawing should be reasonably accurate. (See the lift sketch after the questions.)

   Bin 1  Bin 2  Bin 3  Bin 4  Bin 5  Bin 6  Bin 7  Bin 8  Bin 9  Bin 10
   240    120    40     30     20     20     10     8      6      6

8. Given the classification results in the following confusion matrix, compute the classification accuracy, precision, and recall scores of the positive data. (See the metrics sketch after the questions.)

                Classified as
   Correct      Positive   Negative
   Positive     50         10
   Negative     5          200
9. Given the following table with three attributes, a1, a2, and a3:

   a1  a2  a3
   C   B   H
   B   F   S
   A   F   F
   C   B   H
   B   F   G
   B   E   O

   we want to mine all the large (or frequent) itemsets using the multiple minimum support technique. If we have the following minimum item support assignments for the items, MIS(a2=F) = 60%, with the MIS values for the rest of the items in the data all 30%, follow the MSapriori algorithm and give the set of large (or frequent) itemsets in L1, L2, ... and candidate itemsets in C2, C3, ... (after the join step and the prune step). (The multiple-minimum-support sketch after the questions applies here as well.)
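For Question 2, the following is a minimal Python sketch of one way to run Apriori over the table, treating each cell as an "attribute=value" item. The function names (apriori_gen, apriori) and the item encoding are illustrative assumptions, not part of the original exam.

```python
from itertools import combinations

def apriori_gen(prev_large, k):
    """Join pairs of (k-1)-itemsets sharing a (k-2)-prefix, then prune
    any candidate that has an infrequent (k-1)-subset."""
    prev = sorted(tuple(sorted(s)) for s in prev_large)
    candidates = set()
    for a in prev:
        for b in prev:
            if a[:-1] == b[:-1] and a[-1] < b[-1]:           # join step
                cand = frozenset(a + (b[-1],))
                if all(frozenset(sub) in prev_large          # prune step
                       for sub in combinations(sorted(cand), k - 1)):
                    candidates.add(cand)
    return candidates

def apriori(transactions, minsup):
    n = len(transactions)
    sup = lambda s: sum(s <= t for t in transactions) / n
    L = {frozenset([i]) for t in transactions for i in t}
    L = {s for s in L if sup(s) >= minsup}                   # L1
    levels, k = [L], 2
    while L:
        C = apriori_gen(L, k)                                # Ck
        L = {c for c in C if sup(c) >= minsup}               # Lk
        if L:
            levels.append(L)
        k += 1
    return levels

# The table from Question 2, one "attribute=value" item per cell.
rows = [("a1=C","a2=B","a3=H"), ("a1=B","a2=F","a3=S"),
        ("a1=A","a2=F","a3=F"), ("a1=C","a2=B","a3=H"),
        ("a1=B","a2=F","a3=G"), ("a1=B","a2=E","a3=O")]
for k, Lk in enumerate(apriori([frozenset(r) for r in rows], 0.30), 1):
    print(f"L{k}:", sorted(tuple(sorted(s)) for s in Lk))
```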
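For Questions 3 and 9, this is a brute-force sketch of the multiple-minimum-support "large" test only; the real MSapriori algorithm sorts items by MIS and generates candidates level by level, which this simplified check deliberately skips. Names such as is_large and DEFAULT_MIS are assumptions. For Question 9, swap in "a2=F" with MIS 60% and a 30% default.

```python
from itertools import combinations

# Transactions and MIS assignments from Question 3.
transactions = [frozenset(t) for t in (
    {"Beef", "Bread"}, {"Bread", "Cloth"}, {"Bread", "Cloth", "Milk"},
    {"Cheese", "Boots"}, {"Beef", "Bread", "Cheese", "Shoes"},
    {"Beef", "Bread", "Cheese", "Milk"}, {"Bread", "Milk", "Cloth"},
)]
MIS = {"Milk": 0.50, "Bread": 0.70}   # every other item: 25%
DEFAULT_MIS = 0.25
n = len(transactions)

def support(s):
    return sum(s <= t for t in transactions) / n

def is_large(s):
    # Large iff support >= the minimum MIS among the member items.
    return support(s) >= min(MIS.get(i, DEFAULT_MIS) for i in s)

items = sorted({i for t in transactions for i in t})
k = 1
while True:
    Lk = [c for c in combinations(items, k) if is_large(frozenset(c))]
    if not Lk:   # no larger itemset can be large once a level is empty
        break
    print(f"L{k}:", Lk)
    k += 1
```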
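For Question 4, each required probability is just a relative frequency once smoothing is ignored: P(C=c) is the fraction of rows with class c, and P(attr=v | C=c) is count(attr=v and C=c) / count(C=c). A small counting sketch over the training table:

```python
from collections import Counter

# Training rows (A, B, C) from Question 4.
rows = [("m","t","y"), ("m","s","y"), ("g","q","y"), ("h","s","y"),
        ("g","q","y"), ("g","q","n"), ("g","s","n"), ("h","t","n"),
        ("h","q","n"), ("m","t","n")]

class_count = Counter(c for _, _, c in rows)
for c in sorted(class_count):
    print(f"P(C={c}) = {class_count[c]}/{len(rows)}")

# Conditional probabilities, no smoothing.
for idx, attr in ((0, "A"), (1, "B")):
    for v in sorted({r[idx] for r in rows}):
        for c in sorted(class_count):
            joint = sum(1 for r in rows if r[idx] == v and r[2] == c)
            print(f"P({attr}={v} | C={c}) = {joint}/{class_count[c]}")
```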
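For Question 5, a sketch of agglomerative clustering that repeatedly merges the two clusters whose centers (mean values) are closest. The centroid-distance choice is an assumption, since the question does not name a linkage method, but the printed merge order and centers are exactly what the cluster tree records.

```python
# One-dimensional data from Question 5.
data = [1, 2, 4, 6, 9, 11, 20, 23, 27, 30, 34, 100, 120, 130]
clusters = [[x] for x in data]
center = lambda c: sum(c) / len(c)

# Merge the closest pair of cluster centers until one cluster remains.
while len(clusters) > 1:
    i, j = min(((i, j) for i in range(len(clusters))
                for j in range(i + 1, len(clusters))),
               key=lambda p: abs(center(clusters[p[0]]) - center(clusters[p[1]])))
    merged = sorted(clusters[i] + clusters[j])
    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    print(f"merge {merged}  center = {center(merged):.2f}")
```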
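For Question 7, a lift chart plots the cumulative percentage of positives captured by the top-scored bins against the random baseline (10% of the positives per decile). This sketch computes the points the drawing would need; the tabular output format is only illustrative.

```python
# Positive cases per bin from Question 7; 500 positives in total.
bins = [240, 120, 40, 30, 20, 20, 10, 8, 6, 6]
total = sum(bins)
cum = 0
print("top % of data   % of positives captured   random baseline")
for i, b in enumerate(bins, start=1):
    cum += b
    print(f"{10*i:12d}%   {100*cum/total:22.1f}%   {10*i:14d}%")
```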
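For Question 8, the standard definitions applied to the matrix: accuracy is the fraction of all cases classified correctly, while precision and recall are computed for the positive class.

```python
# Confusion matrix from Question 8 (rows = correct, columns = classified as).
tp, fn = 50, 10     # correct Positive classified as Positive / Negative
fp, tn = 5, 200     # correct Negative classified as Positive / Negative

accuracy  = (tp + tn) / (tp + fn + fp + tn)   # (50 + 200) / 265
precision = tp / (tp + fp)                    # 50 / (50 + 5)
recall    = tp / (tp + fn)                    # 50 / (50 + 10)
print(f"accuracy = {accuracy:.3f}, precision = {precision:.3f}, recall = {recall:.3f}")
```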

