UCSB CS 290 - Association Rule Learning




Unsupervised Data Mining
Association Rule Learning
PR, ANN, & ML

Association Rule Analysis
- Popular in mining databases
- Automated discovery of sets of variables that occur frequently, or of one(s) leading to other(s)

Association Rule Analysis (cont)

Market Basket Analysis
- Retail outlets
  - Placement of merchandise (affinity positioning)
  - Cross advertising
- Banks
- Insurance link analysis for fraud
- Medical symptom analysis

Co-occurrence Matrix
Customer 1: beer, pretzels, potato chips, aspirin
Customer 2: diapers, baby lotion, grapefruit juice, baby food, milk
Customer 3: soda, potato chips, milk
Customer 4: soup, beer, milk, ice cream
Customer 5: soda, coffee, milk, bread
Customer 6: beer, potato chips
- Interesting cases can have 10^4 variables and 10^8 samples
- Co-occurrence gives only pair-wise association

Practical Solutions
- Run up against the curse of dimensionality
  - With 10^4 variables, each with many possible values, a very large # of samples is needed to populate the space; "bump" hunting at fine scale is not possible
- Look for regions in the probability space with high density
- Even for binary variables, there are 2^k (e.g., 2^{1,000}) possible 1,0-tuples; must have efficient search algorithms

Simplification
- Assuming binary variables
  - If not, force them to be binary
  - Instead of 6 different education levels, just 2 (college and above, or below)
- Change of variables
  - Initially (X1, ..., Xp)
  - Each with (S1, ..., Sp) possible values
  - K = S1 + ... + Sp
  - Create binary variables Zk: 1 if the corresponding variable Xi assumes value Sij, 0 otherwise

Apriori Algorithm
- Threshold t
- 1st pass: single-variable sets: each must have occurrence larger than t
- 2nd pass: pair-wise variable sets: together must have occurrence larger than t
- ...
- mth pass: only those tuples from the (m-1)th pass that have probability higher than t are considered
- To avoid combinatorial explosion, t cannot be too low

Tuples to Rules
- Tuples {Zk} to A => B
  - A: antecedent
  - B: consequent
- T(A=>B): support, the probability of simultaneously observing A and B, P(A&B)
- C(A=>B) = T(A=>B)/T(A): confidence, the probability P(B|A)
- L(A=>B) = C(A=>B)/T(B): lift, the ratio P(A&B)/(P(A)P(B))

Examples
- K = {peanut butter, jelly, bread}
- {peanut butter, jelly} => bread
- Support of 0.03: {peanut butter, jelly, bread} appears in 3% of sample baskets
- Confidence of 82%: if peanut butter and jelly are purchased, then 82% of the time bread is also purchased
- Lift of 1.9: if bread appears in 43% of sample baskets, then 0.82/0.43 = 1.9

