Data Mining (Extra Credits)

Given Data (From Text Book)

- T100 - {M,O,N,K,E,Y}
- T200 - {D,O,N,K,E,Y}
- T300 - {M,A,K,E}
- T400 - {M,U,C,K,Y}
- T500 - {C,O,K,I,E}
- Min_sup = 60% = Minimum Support (%)
- Min_conf = 80% = Minimum Confidence (%)
- Let the classes be E (Item 11) and Y (Item 10).

Transactional Database

Columns are Item 1 (M) through Item 11 (E). Y (Item 10) and E (Item 11) are the class columns.

TID   M  D  C  O  A  U  N  K  I  Y  E
T100  +  -  -  +  -  -  +  +  -  +  +
T200  -  +  -  +  -  -  +  +  -  +  +
T300  +  -  -  -  +  -  -  +  -  -  +
T400  +  -  +  -  -  +  -  +  -  +  -
T500  -  -  +  +  -  -  -  +  +  -  +

Minimum Support Count

- Total number of transactions = 5
- Minimum support count = 60% of 5 = 3

Frequent 1-Itemsets

1-itemsets with their support counts:

M 3, D 1, C 2, O 3, A 1, U 1, N 2, K 5, E 4, Y 3, I 1

Frequent 1-itemsets (minimum support count = 3):

M 3, O 3, K 5, E 4, Y 3

Frequent 2-Itemsets

Candidate 2-itemsets:

{M,O} 1, {M,K} 3, {M,E} 2, {M,Y} 2, {O,K} 3, {O,E} 3, {O,Y} 2, {K,E} 4, {K,Y} 3, {E,Y} 2

Use the Apriori principle for pruning. Every candidate is built from frequent 1-itemsets, so pruning removes nothing and the candidate list is unchanged.

Frequent 2-itemsets (minimum support count = 3):

{M,K} 3, {O,K} 3, {O,E} 3, {K,E} 4, {K,Y} 3

Frequent 3-Itemsets

Candidate 3-itemsets:

{O,K,E} 3, {K,E,Y} 2

Use the Apriori principle for pruning. {K,E,Y} is dropped because its subset {E,Y} is not frequent.

Candidate 3-itemsets after pruning:

{O,K,E} 3

Frequent 3-itemsets (minimum support count = 3):

{O,K,E} 3

There are no frequent 4-itemsets.

All Frequent Itemsets

- M - 3
- O - 3
- K - 5
- E - 4
- Y - 3
- {M,K} - 3
- {O,K} - 3
- {O,E} - 3
- {K,E} - 4
- {K,Y} - 3
- {O,K,E} - 3

Association Rules

- C(X -> Y) = P(Y|X) = sup_count(X ∪ Y) / sup_count(X)
- Only the frequent 2-itemsets and 3-itemsets are used, because a 1-itemset cannot be split into an antecedent and a consequent.
- Let the classes be Y (Item 10) and E (Item 11).
- So we are interested in rules of the form A -> Y (Item 10) and A -> E (Item 11).

Association Rules by Classification

Rule_No  Rule         Confidence  Confidence (%)
R1       O -> E       3/3         100%
R2       K -> E       4/5         80%
R3       K -> Y       3/5         60%
R4       {O,K} -> E   3/3         100%

Since our classes are E and Y, rule R3 cannot be included at a minimum confidence of 80%. We therefore reduce the minimum confidence to 60% to include R3.

Selected Rules

Rule_No  Rule         [Actual Support, Actual Confidence]
R1       O -> E       [60%, 100%]
R2       K -> E       [80%, 80%]
R3       K -> Y       [60%, 60%]
R4       {O,K} -> E   [60%, 100%]

Test Data

Columns are Item 1 (M) through Item 11 (E). Y (Item 10) and E (Item 11) are the class columns.

TID   M  D  C  O  A  U  N  K  I  Y  E
T100  +  -  -  -  +  -  -  +  -  +  -
T200  +  -  -  +  -  -  +  +  +  -  +
T300  -  -  +  +  +  -  -  -  -  +  -
T400  +  -  +  +  -  +  -  -  -  -  +
T500  -  +  +  -  -  -  -  +  +  -  +

- T100 satisfies the rule: K -> Y [Success]
- T200 satisfies the rule: {O,K} -> E [Success]
- T300 satisfies the rule: {C,O,A} -> Y [Failure]
- T400 satisfies the rule: O -> E [Success]
- T500 satisfies the rule: K -> E [Success]
- Predictive accuracy = 80% (4 of 5 test transactions)
- Error rate = 20%
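
The support counts and rule confidences in this worked example can be reproduced with a short script. The following is a minimal Python sketch, not the textbook's implementation: the names `TRANSACTIONS`, `support_count`, `apriori`, and `confidence` are hypothetical helpers introduced here for illustration. The join step simply unions pairs of frequent (k-1)-itemsets, a simplification of the usual prefix-based join, so it may generate more candidates before pruning than the slides list.

```python
from itertools import combinations

# Training transactions from the slides (T100..T500).
TRANSACTIONS = {
    "T100": {"M", "O", "N", "K", "E", "Y"},
    "T200": {"D", "O", "N", "K", "E", "Y"},
    "T300": {"M", "A", "K", "E"},
    "T400": {"M", "U", "C", "K", "Y"},
    "T500": {"C", "O", "K", "I", "E"},
}
MIN_SUP_COUNT = 3  # 60% of 5 transactions


def support_count(itemset, transactions=TRANSACTIONS):
    """Number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions.values() if items <= t)


def apriori(transactions=TRANSACTIONS, min_count=MIN_SUP_COUNT):
    """Return {frozenset: support_count} for all frequent itemsets."""
    items = sorted(set().union(*transactions.values()))
    level = {frozenset([i]) for i in items
             if support_count([i], transactions) >= min_count}
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support_count(s, transactions)
        k += 1
        # Join step (simplified): unions of frequent (k-1)-itemsets of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori principle): every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
        level = {c for c in candidates
                 if support_count(c, transactions) >= min_count}
    return frequent


def confidence(antecedent, consequent, transactions=TRANSACTIONS):
    """conf(X -> Y) = sup_count(X U Y) / sup_count(X)."""
    x = set(antecedent)
    both = x | set(consequent)
    return support_count(both, transactions) / support_count(x, transactions)


if __name__ == "__main__":
    ordered = sorted(apriori().items(),
                     key=lambda kv: (len(kv[0]), sorted(kv[0])))
    for itemset, count in ordered:
        print(sorted(itemset), count)
    # Rules whose consequent is one of the two class items, E or Y.
    for x, y in [("O", "E"), ("K", "E"), ("K", "Y"), ("OK", "E")]:
        print(f"{sorted(x)} -> {y}: {confidence(x, y):.0%}")
```

Under the confidence formula above, O -> E evaluates to sup_count({O,E}) / sup_count({O}) = 3/3, since O occurs in three transactions and E accompanies it in all three.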

