CSE 634/590 Data mining
Submitted By: Moieed Ahmed

Training data (D):

Student  Grade   Income  Buys
CS       High    Low     Milk
CS       High    High    Bread
Math     Low     Low     Bread
CS       Medium  High    Milk
Math     Low     Low     Bread

Item codes:

Student=CS (I1), Student=Math (I2), Grade=High (I3), Grade=Medium (I4), Grade=Low (I5), Income=High (I6), Income=Low (I7), Buys=Milk (I8), Buys=Bread (I9)

Item presence per record (+ present, - absent):

I1  I2  I3  I4  I5  I6  I7  I8  I9
+   -   +   -   -   -   +   +   -
+   -   +   -   -   +   -   -   +
-   +   -   -   +   -   +   -   +
+   -   -   +   -   +   -   +   -
-   +   -   -   +   -   +   -   +

Let the minimum support count be 2. Since we have 5 records, minimum support = 2/5 = 40%.
Let the minimum confidence required be 70%.

Scan D for count of each candidate:

C1
Item Set  Support Count
{I1}      3
{I2}      2
{I3}      2
{I4}      1
{I5}      2
{I6}      2
{I7}      3
{I8}      2
{I9}      3

Compare candidate support count with minimum support count:

L1
Item Set  Support Count
{I1}      3
{I2}      2
{I3}      2
{I5}      2
{I6}      2
{I7}      3
{I8}      2
{I9}      3

Generate C2 candidates from L1; scan D for count of each candidate:

C2 (item set followed by support count)
{I1,I2} 0   {I1,I3} 2   {I1,I4} 1   {I1,I5} 0   {I1,I6} 2   {I1,I7} 1   {I1,I8} 2   {I1,I9} 1
{I2,I3} 0   {I2,I4} 0   {I2,I5} 2   {I2,I6} 0   {I2,I7} 2   {I2,I8} 0   {I2,I9} 2
{I3,I4} 0   {I3,I5} 0   {I3,I6} 1   {I3,I7} 1   {I3,I8} 1   {I3,I9} 1
{I4,I5} 0   {I4,I6} 1   {I4,I7} 0   {I4,I8} 1   {I4,I9} 0
{I5,I6} 0   {I5,I7} 2   {I5,I8} 0   {I5,I9} 2
{I6,I7} 0   {I6,I8} 1   {I6,I9} 1
{I7,I8} 1   {I7,I9} 2
{I8,I9} 0

Compare candidate support count with minimum support count:

L2
Item Set  Support Count
{I1,I3}   2
{I1,I6}   2
{I1,I8}   2
{I2,I5}   2
{I2,I7}   2
{I2,I9}   2
{I5,I7}   2
{I5,I9}   2
{I7,I9}   2

1. The join step: To find Lk, a set of candidate k-itemsets is generated by joining Lk-1 with itself. This set of candidates is denoted Ck.
Lk: itemsets
Ck: candidates

Considering {I2,I5} and {I7,I9} from L2, to arrive at L3 we join L2 x L2, and thus we have {I2,I5,I7} and {I2,I5,I9} among the candidates generated from L2.
Considering {I1,I3} and {I1,I6} from L2, we generate the candidate {I1,I3,I6}.

2. The prune step: Ck is a superset of Lk, that is, its members may or may not be frequent. All candidates having a count no less than the minimum support count are frequent by definition, and therefore belong to Lk. Ck, however, can be huge.

Thus, {I2,I5,I7} and {I2,I5,I9} from the join step are kept since they meet the minimum support count, but {I1,I3,I6} is discarded since it does not meet the support count needed.

Generate C3 candidates from L2; scan D for count of each candidate:

C3
Item Set     Support Count
{I2,I5,I7}   2
{I2,I5,I9}   2
{I2,I7,I9}   2
{I5,I7,I9}   2

Compare candidate support count with minimum support count:

L3
Item Set     Support Count
{I2,I5,I7}   2
{I2,I5,I9}   2
{I2,I7,I9}   2
{I5,I7,I9}   2

Generating 4-itemset Frequent Pattern

Generate C4 candidates from L3; scan D for count of each candidate:

C4
Item Set        Support Count
{I2,I5,I7,I9}   2

Compare candidate support count with minimum support count:

L4
Item Set        Support Count
{I2,I5,I7,I9}   2

• When mining association rules for use in classification, we are only interested in association rules of the form
  (p1 ^ p2 ^ ... ^ pl) → Aclass = C
  where the rule antecedent is a conjunction of items p1, p2, ..., pl (l ≤ n), associated with a class label C.
• In our example, Aclass would be either I8 or I9 on the RHS, that is, to predict whether a student with given characteristics buys Milk or Bread.
• Let the minimum confidence required be 70%.
• Considering l = {I2,I5,I7,I9}
• Its nonempty proper subsets are {{I2},{I5},{I7},{I9},{I2,I5},{I2,I7},{I2,I9},{I5,I7},{I5,I9},{I7,I9},{I2,I5,I7},{I2,I5,I9},{I2,I7,I9},{I5,I7,I9}}
• R1: I2 ^ I5 ^ I7 → I9 [40%, 100%]
  ◦ Confidence = sc{I2,I5,I7,I9} / sc{I2,I5,I7} = 2/2 = 100%
  ◦ R1 is selected

Considering 3-itemset Frequent Pattern
• R2: I5 ^ I7 → I9 [40%, 100%]
  ◦ Confidence = sc{I5,I7,I9} / sc{I5,I7} = 2/2 = 100%
  ◦ R2 is selected
• R3: I2 ^ I7 → I9 [40%, 100%]
  ◦ Confidence = sc{I2,I7,I9} / sc{I2,I7} = 2/2 = 100%
  ◦ R3 is selected
• R4: I2 ^ I5 → I9 [40%, 100%]
  ◦ Confidence = sc{I2,I5,I9} / sc{I2,I5} = 2/2 = 100%
  ◦ R4 is selected

Considering 2-itemset Frequent Pattern
• R5: I5 → I9 [40%, 100%]
  ◦ Confidence = sc{I5,I9} / sc{I5} = 2/2 = 100%
  ◦ R5 is selected
• R6: I2 → I9 [40%, 100%]
  ◦ Confidence = sc{I2,I9} / sc{I2} = 2/2 = 100%
  ◦ R6 is selected
• R7: I7 → I9 [40%, 66.67%]
  ◦ Confidence = sc{I7,I9} / sc{I7} = 2/3 = 66.67%
  ◦ R7 is rejected at the 70% threshold
• R8: I1 → I8 [40%, 66.67%]
  ◦ Confidence = sc{I1,I8} / sc{I1} = 2/3 = 66.67%
  ◦ R8 is rejected at the 70% threshold

Selected rules:
• I2 ^ I5 ^ I7 → I9 [40%, 100%]
• I2 ^ I5 → I9 [40%, 100%]
• I2 ^ I7 → I9 [40%, 100%]
• I5 ^ I7 → I9 [40%, 100%]
• I5 → I9 [40%, 100%]
• I2 → I9 [40%, 100%]
• We reduce the minimum confidence to 66% to include I8 on the R.H.S.; I7 → I9 then qualifies as well:
• I1 → I8 [40%, 66.67%]
• I7 → I9 [40%, 66.67%]

Student  Grade   Income  Buys
Math     Low     Low     Bread
CS       Low     Low     Milk
Math     Low     Low     Milk
Math     Low     Low     Bread
CS       Medium  High    Milk

• First Tuple: Can be written as I2 ^ I5 ^ I7 → I9 [Success]
  The above rule is correctly classified, and hence the Math student with low grade and low income buys bread.
• Second Tuple: Can be written as I1 → I8 [Success]
  The above rule is correctly classified.
• Third Tuple: Can be written as I2 ^ I5 ^ I7 → I8 [Error]
  The above rule is not correctly classified.

Student  Grade   Income  Buys
Math     Low     Low     Bread
CS       Low     Low     Milk
Math     Low     Low     Milk
Math     High    Low     Bread
CS       Medium  High    Bread

• Fourth Tuple: Can be written as I2 ^ I7 → I9 [Success]
  The above rule is correctly classified, and hence the Math student with low grade and low income buys bread.
• Fifth Tuple: Can be written as I1 → I9 [Success]
  The above rule is correctly classified.

Hence we have 80% predictive accuracy and 20% error.
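The encoding of the training table into items I1 to I9 can be sketched in Python as below. This is only an illustrative sketch: the names `ITEM_CODES`, `encode`, and `presence_row` are hypothetical helpers, not part of the slides, and the item labels simply follow the I1..I9 numbering used above.

```python
# Hypothetical mapping from (attribute, value) to the item codes used in the slides.
ITEM_CODES = {
    ("Student", "CS"): "I1",
    ("Student", "Math"): "I2",
    ("Grade", "High"): "I3",
    ("Grade", "Medium"): "I4",
    ("Grade", "Low"): "I5",
    ("Income", "High"): "I6",
    ("Income", "Low"): "I7",
    ("Buys", "Milk"): "I8",
    ("Buys", "Bread"): "I9",
}
COLUMNS = ("Student", "Grade", "Income", "Buys")
ITEM_ORDER = ["I1", "I2", "I3", "I4", "I5", "I6", "I7", "I8", "I9"]

# The five training records from the slides.
TRAINING_ROWS = [
    ("CS", "High", "Low", "Milk"),
    ("CS", "High", "High", "Bread"),
    ("Math", "Low", "Low", "Bread"),
    ("CS", "Medium", "High", "Milk"),
    ("Math", "Low", "Low", "Bread"),
]


def encode(row):
    """Turn one table row into the set of item codes it contains."""
    return frozenset(ITEM_CODES[(col, val)] for col, val in zip(COLUMNS, row))


def presence_row(transaction):
    """Render a transaction as the +/- presence row over I1..I9."""
    return " ".join("+" if item in transaction else "-" for item in ITEM_ORDER)


transactions = [encode(row) for row in TRAINING_ROWS]

if __name__ == "__main__":
    for t in transactions:
        print(presence_row(t))
```

Each record becomes a set of four items (one per attribute), which is the transaction form that the support counts are computed over.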

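The level-wise search from C1/L1 up to C4/L4 can be sketched as follows. This is a minimal sketch, not the author's implementation: the function names `support_count` and `apriori` are assumptions, and the prune step here uses the standard Apriori subset test (a candidate is dropped if any of its (k-1)-subsets is not frequent), which is why {I1,I3,I6} never reaches the counting stage. Because C2 is generated from L1, pairs containing I4 are not generated at all.

```python
from itertools import combinations

# The five training records, already encoded as item sets (I1..I9 as in the slides).
TRANSACTIONS = [
    {"I1", "I3", "I7", "I8"},
    {"I1", "I3", "I6", "I9"},
    {"I2", "I5", "I7", "I9"},
    {"I1", "I4", "I6", "I8"},
    {"I2", "I5", "I7", "I9"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def apriori(transactions, min_count):
    """Return [L1, L2, ...], each a dict mapping frequent itemset -> support count."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]  # C1
    levels = []
    k = 1
    while candidates:
        # Scan D for the count of each candidate, then compare with min_count.
        counts = {c: support_count(c, transactions) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= min_count}
        if not frequent:
            break
        levels.append(frequent)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        ]
    return levels


if __name__ == "__main__":
    for k, level in enumerate(apriori(TRANSACTIONS, min_count=2), start=1):
        print(f"L{k}:")
        for itemset, count in sorted(level.items(), key=lambda kv: sorted(kv[0])):
            print("  ", sorted(itemset), count)
```

With a minimum support count of 2 this yields four levels, ending with the single 4-itemset {I2,I5,I7,I9}.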

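The confidence of rules R1 to R8 can be checked with a short sketch like the one below. It assumes the usual definition, confidence(A → c) = sc(A ∪ {c}) / sc(A), with the antecedent's support count in the denominator; the names `confidence` and `select` are hypothetical. Under that definition sc{I7} = 3, so I7 → I9 comes out at 2/3, the same as I1 → I8.

```python
# The five training records, encoded as item sets (I1..I9 as in the slides).
TRANSACTIONS = [
    {"I1", "I3", "I7", "I8"},
    {"I1", "I3", "I6", "I9"},
    {"I2", "I5", "I7", "I9"},
    {"I1", "I4", "I6", "I8"},
    {"I2", "I5", "I7", "I9"},
]

# Rules R1..R8 from the slides as (antecedent, class item) pairs.
RULES = [
    ({"I2", "I5", "I7"}, "I9"),  # R1
    ({"I5", "I7"}, "I9"),        # R2
    ({"I2", "I7"}, "I9"),        # R3
    ({"I2", "I5"}, "I9"),        # R4
    ({"I5"}, "I9"),              # R5
    ({"I2"}, "I9"),              # R6
    ({"I7"}, "I9"),              # R7
    ({"I1"}, "I8"),              # R8
]


def support_count(itemset):
    """Number of training transactions that contain every item of itemset."""
    return sum(1 for t in TRANSACTIONS if set(itemset) <= t)


def confidence(antecedent, consequent):
    """sc(antecedent plus consequent) divided by sc(antecedent)."""
    return support_count(set(antecedent) | {consequent}) / support_count(antecedent)


def select(rules, min_conf):
    """Keep the rules whose confidence is at least min_conf."""
    return [(a, c) for a, c in rules if confidence(a, c) >= min_conf]


if __name__ == "__main__":
    for antecedent, consequent in RULES:
        support = support_count(set(antecedent) | {consequent}) / len(TRANSACTIONS)
        conf = confidence(antecedent, consequent)
        print(f"{' ^ '.join(sorted(antecedent))} -> {consequent} [{support:.0%}, {conf:.2%}]")
```

Calling `select(RULES, 0.70)` keeps the rules with 100% confidence, while lowering the threshold to 0.66 also admits the two rules at 2/3.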