Data Mining: Introduction
Dr. Hany Saleeb

Chapter 1. Introduction
- Motivation: Why data mining?
- What is data mining?
- Data Mining: On what kind of data?
- Data mining functionality
- Are all the patterns interesting?
- Classification of data mining systems
- Major issues in data mining

Motivation: “Necessity is the Mother of Invention”
- Data explosion problem
  - Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories
- We are drowning in data, but starving for knowledge!
- Solution: Data warehousing and data mining
  - Data warehousing and on-line analytical processing
  - Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases

Evolution of Database Technology
- 1960s: Data collection, database creation, IMS and network DBMS
- 1970s: Relational data model, relational DBMS implementation
- 1980s: RDBMS, advanced data models (extended-relational, OO, deductive, etc.) and application-oriented DBMS (spatial, scientific, engineering, etc.)
- 1990s-2000s: Data mining and data warehousing, multimedia databases, and Web databases

What Is Data Mining?
- Data mining (knowledge discovery in databases):
  - Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases
- What is not data mining?
  - (Deductive) query processing
  - Expert systems or statistical programs

Why Data Mining? Potential Applications
- Database analysis and decision support
  - Market analysis and management
    - Target marketing, customer relationship management, market basket analysis, cross selling, market segmentation
  - Risk analysis and management
    - Forecasting, customer retention, improved underwriting, quality control, competitive analysis
  - Fraud detection and management
- Other applications
  - Text mining (news groups, documents) and Web analysis
  - Intelligent query answering

[Figure: Scope of Data Mining. Panels: Management’s Decision World, Interface, Data Miner’s Analytical World. Labels: business outlook, industry conditions, product offering, customer analysis, strategic options, competitive actions, etc.; problem development and management; reporting and evaluations; project design; data collection and preparation; model building; validation.]

Market Analysis and Management (1)
- Where are the data sources for analysis?
  - Credit card transactions, loyalty cards, discount coupons, customer complaint calls, plus (public) lifestyle studies
- Target marketing
  - Find clusters of “model” customers who share the same characteristics: interest, income level, spending habits,
    etc.
- Determine customer purchasing patterns over time
  - Conversion of a single to a joint bank account: marriage, etc.
- Cross-market analysis
  - Associations/correlations between product sales
  - Prediction based on the association information

Market Analysis and Management (2)
- Customer profiling
  - Data mining can tell you what types of customers buy what products (clustering or classification)
- Identifying customer requirements
  - Identifying the best products for different customers
  - Use prediction to find what factors will attract new customers
- Provides summary information
  - Various multidimensional summary reports
  - Statistical summary information (data central tendency and variation)

Corporate Analysis and Risk Management
- Finance planning and asset evaluation
  - Cash flow analysis and prediction
  - Contingent claim analysis to evaluate assets
  - Cross-sectional and time series analysis (financial-ratio, trend analysis, etc.)
- Resource planning
  - Summarize and compare the resources and spending
- Competition
  - Monitor competitors and market directions
  - Group customers into classes and apply a class-based pricing procedure
  - Set pricing strategy in a highly competitive market

Fraud Detection and Management (1)
- Applications
  - Widely used in health care, retail, credit card services, telecommunications (phone card fraud), etc.
- Approach
  - Use historical data to build models of fraudulent behavior and use data mining to help identify similar instances
- Examples
  - Auto insurance: detect a group of people who stage accidents to collect on insurance
  - Money laundering: detect suspicious money transactions (US Treasury's Financial Crimes Enforcement Network)
  - Medical insurance: detect professional patients, rings of doctors, and rings of references

Fraud Detection and Management (2)
- Detecting inappropriate medical treatment
  - The Australian Health Insurance Commission found that in many cases blanket screening tests were requested (saving Australian $1m/yr).
- Detecting telephone fraud
  - Telephone call model: destination of the call, duration, time of day or week.
    Analyze patterns that deviate from an expected norm.
  - British Telecom identified discrete groups of callers with frequent intra-group calls, especially mobile phones, and broke a multimillion-dollar fraud.
- Retail
  - Analysts estimate that 38% of retail shrink is due to dishonest employees.

Other Applications
- Sports
  - IBM Advanced Scout analyzed NBA game statistics (shots blocked, assists, and fouls) to gain competitive advantage for New York Knicks and Miami Heat
- Astronomy
  - JPL and the Palomar
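
Example (illustrative sketch, not from the slides): the telephone-fraud slide describes modeling calls by destination, duration, and time of day or week, then looking for patterns that deviate from an expected norm. One simple way to picture "deviation from a norm" is a z-score test on call duration alone. The function name flag_unusual_calls, the threshold of 3 standard deviations, and the sample numbers below are all assumptions made for illustration; a real system would likely use more attributes and a richer model.

```python
from statistics import mean, stdev


def flag_unusual_calls(history, new_calls, threshold=3.0):
    """Return the durations in new_calls that lie more than `threshold`
    sample standard deviations from the mean of `history`.

    history   -- past call durations (minutes) for one caller; needs >= 2 values
    new_calls -- recent call durations (minutes) to check
    threshold -- assumed cut-off; 3.0 is a common rule of thumb, not a standard
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No variation in the history: anything different from the norm stands out.
        return [d for d in new_calls if d != mu]
    return [d for d in new_calls if abs(d - mu) / sigma > threshold]


# Hypothetical data: a caller whose calls usually last a few minutes.
history_minutes = [3, 5, 4, 6, 5, 4, 5, 3, 4, 6]
new_minutes = [5, 4, 95, 6]

suspicious = flag_unusual_calls(history_minutes, new_minutes)
print(suspicious)
```

Here only the 95-minute call is flagged, because it is far outside the caller's usual range. The same idea could be applied per destination or per time slot, at the cost of needing more history for each group.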