UH COSC 6340 - COSC 6340 A SURVEY OF ASSOCIATION RULES

This preview shows page 1-2-3-4-30-31-32-33-34-62-63-64-65 out of 65 pages.


INTRODUCTION
ASSOCIATION RULE PROBLEM
BASIC ALGORITHMS
In this section we provide a survey of existing algorithms to generate association rules. Most algorithms used to identify large itemsets can be classified as either sequential or parallel. In most cases, it is assumed that the itemsets are identified a
Sequential Algorithms
AIS
SETM
Consider the example given in Table 4 to illustrate the apriori_gen(). Large itemsets after the third pass are shown in the first column. Suppose a transaction contains {Apple, Bagel, Chicken, Eggs, DietCoke}. After joining L3 with itself, C4 will be
Large Itemsets in the third pass (L3)
Item
3.1.4 Apriori-TID
Here it is noted that the work involved in generating Ck+1 does not depend on the size of database, rather on the size of Lk. Also, one can compute several families of Ck+1, Ck+2, . . . , Ck+e for some e>1 directly from Lk. The time complexity for dete
Item
3.2.4 Discussion
3.2.5 Future of Parallel Algorithms
CLASSIFICATION AND COMPARISON OF ALGORITHMS
5.2 Temporal and Spatial Association Rules
5.3 Quantitative Association Rules
Table Name: Person
5.5 Multiple Min-supports Association Rules
6 MAINTENANCE OF DISCOVERED ASSOCIATION RULES
SUMMARY
BIBLIOGRAPHY
[Lee1997a] S.D. Lee and David W. Cheung, Maintenance of Discovered Association Rules: When to Update? Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD), May 11, 1997, Tucson, Arizona
APPENDIX A: Sample Dataset
APPENDIX B: Sample Code
APPENDIX C: Products
Cost: Non Known
APPENDIX D: Notation

A SURVEY OF ASSOCIATION RULES

Margaret H. Dunham, Yongqiao Xiao
Department of Computer Science and Engineering
Southern Methodist University
Dallas, Texas 75275-0122

Le Gruenwald, Zahid Hossain
Department of Computer Science
University of Oklahoma
Norman, OK 73019

ABSTRACT: Association rules are one of the most researched areas of data mining and have recently received much attention from the database community.
They have proven to be quite useful in the marketing and retail communities as well as in other, more diverse fields. In this paper we provide an overview of association rule research.

1 INTRODUCTION

Data mining is the discovery of hidden information found in databases and can be viewed as a step in the knowledge discovery process [Chen1996] [Fayyad1996]. Data mining functions include clustering, classification, prediction, and link analysis (associations). One of the most important data mining applications is that of mining association rules. Association rules, first introduced in 1993 [Agrawal1993], are used to identify relationships among a set of items in a database. These relationships are not based on inherent properties of the data themselves (as with functional dependencies), but rather on co-occurrence of the data items. Example 1 illustrates association rules and their use.

Example 1: A grocery store has weekly specials for which advertising supplements are created for the local newspaper. When an item, such as peanut butter, has been designated to go on sale, management determines what other items are frequently purchased with peanut butter. They find that bread is purchased with peanut butter 30% of the time and that jelly is purchased with it 40% of the time. Based on these associations, special displays of jelly and bread are placed near the peanut butter, which is on sale. They also decide not to put these items on sale. These actions are aimed at increasing overall sales volume by taking advantage of the frequency with which these items are purchased together.

There are two association rules mentioned in Example 1. The first one states that when peanut butter is purchased, bread is purchased 30% of the time. The second one states that 40% of the time when peanut butter is purchased so is jelly. Association rules are often used by retail stores to analyze market basket transactions.
The discovered association rules can be used by management to increase the effectiveness (and reduce the cost) associated with advertising, marketing, inventory, and stock location on the floor. Association rules are also used for other applications such as prediction of failure in telecommunications networks by identifying what events occur before a failure. Most of our emphasis in this paper will be on market basket analysis; however, in later sections we will look at other applications as well.

The objective of this paper is to provide a thorough survey of previous research on association rules. In the next section we give a formal definition of association rules. Section 3 contains the description of sequential and parallel algorithms as well as other algorithms to find association rules. Section 4 provides a new classification and comparison of the basic algorithms. Section 5 presents generalizations and extensions of association rules. In Section 6 we examine the generation of association rules when the database is being modified. In the appendices we provide information on different association rule products, data sources, and source code available in the market, and include a table summarizing the notation used throughout the paper.

2 ASSOCIATION RULE PROBLEM

A formal statement of the association rule problem is [Agrawal1993] [Cheung1996c]:

Definition 1: Let I = {I1, I2, …, Im} be a set of m distinct attributes, also called literals. Let D be a database, where each record (tuple) T has a unique identifier and contains a set of items such that T ⊆ I. An association rule is an implication of the form X ⇒ Y, where X, Y ⊂ I are sets of items called itemsets, and X ∩ Y = ∅. Here, X is called the antecedent and Y the consequent.

Two important measures for association rules, support (s) and confidence (α), can be defined as follows.

Definition 2: The support (s) of an association rule is the ratio (in percent) of the records that contain X ∪ Y to the total number of records in the database.
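
As an illustration of Definitions 1 and 2, the following minimal Python sketch computes the support of a rule X ⇒ Y over a small database. The transaction data, the function names support_percent and rule_support_percent, and the choice to represent each record as a frozenset are assumptions made for this example; they do not come from the paper.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy database D: each record T is a set of items (T is a subset of I).
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"peanut butter", "bread", "jelly"}),
    frozenset({"peanut butter", "jelly"}),
    frozenset({"bread", "milk"}),
    frozenset({"peanut butter", "milk"}),
    frozenset({"milk"}),
]


def support_percent(itemset: Iterable[str],
                    transactions: List[FrozenSet[str]]) -> float:
    """Percentage of records that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    # A record contains the itemset when the itemset is a subset of the record.
    hits = sum(1 for record in transactions if items <= record)
    return (100.0 * hits) / len(transactions)


def rule_support_percent(x: Iterable[str], y: Iterable[str],
                         transactions: List[FrozenSet[str]]) -> float:
    """Support of the rule X => Y: percentage of records containing X union Y."""
    antecedent, consequent = frozenset(x), frozenset(y)
    # Definition 1 requires the antecedent and consequent to be disjoint.
    if antecedent & consequent:
        raise ValueError("X and Y must be disjoint")
    return support_percent(antecedent | consequent, transactions)


if __name__ == "__main__":
    print(rule_support_percent({"peanut butter"}, {"bread"}, TRANSACTIONS))
    print(rule_support_percent({"peanut butter"}, {"jelly"}, TRANSACTIONS))
```

In this made-up database, one of the five records contains both peanut butter and bread, and two contain both peanut butter and jelly, so the two rules have supports of 20% and 40% respectively. These figures are properties of the toy data only and are unrelated to the percentages quoted in Example 1.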
Therefore, if we say that the support of a rule is 5% then it means that 5% of the total records contain X ∪ Y. Support is the statistical significance of an association rule. Grocery store managers probably would not be concerned about how peanut butter and bread are related if less than 5% of store transactions have this combination of purchases. While a high support is often desirable for association

