Chapter 1: Introduction to Data Mining, Warehousing, and Visualization
Modern Data Warehousing, Mining, and Visualization: Core Concepts by George M. Marakas
Spring 2012
Modern Data Warehousing, Mining & Visualization, 2003, George Marakas

Slide outline
- Objectives
- 1-1: The Modern Data Warehouse
- 1-2: Data Warehouse Roles and Structures
- Elements of a DW
- Position of the Data Warehouse Within the Organization (Figure 1-2)
- Data Mining Example: Service Quality vs. Training
- 1-4: The Cost of DW
- 1-5: Data Mining: Farmers and Explorers
- 1-6: Foundations of Data Mining
- 1-6 & 1-7: The Foundations of Data Mining
- Data Mining: A General Approach
- A General Approach (continued)
- The Data Warehouse and Data Mining
- Volumes of Data: The Biggest Challenge
- 1-9: Foundations of Data Visualization [DV]
- Dr. John Snow used a map to show the source of cholera was a water pump, thus proving the disease was water borne.
- DV: Opportunity and Timing
- DV & DM: Future Success Drivers
- The End

Objectives
- What is the purpose and motivation for developing a Data Warehouse (DW)?
- Position of DW within IT infrastructure
- Relationship between DW and business data mart
- What can a DW do?
- Foundations for Data Mining
- Steps in a typical data mining project
- What is a “Correlation”? KEY CONCEPT
- History of Data Visualization vis-à-vis DW

1-1: The Modern Data Warehouse
A data warehouse is a copy of transaction data specifically structured for querying, analysis and reporting.
Note that the data warehouse contains a copy of the transactions.
These are not updated or changed later by the transaction system.
Also note that this data is specially structured, and may have been transformed when it was placed in the warehouse.

1-2: Data Warehouse Roles and Structures
The DW has the following primary functions:
- It is a direct reflection of the business rules of the enterprise.
- It is the collection point for strategic information.
- It is the historical store of strategic information.
- It is the source of information later delivered to data marts.
- It is the source of stable data regardless of how the business processes may change.

Elements of a DW
- Extract
- Transform
- Store
[ETS]

Position of the Data Warehouse Within the Organization (Figure 1-2)

Data Mining Example: Service Quality vs. Training
Courtesy: MicroStrategy (2005)

Sales Analysis
- Determine real-time product sales to make vital pricing and distribution decisions.
- Analyze historical product sales to determine success or failure attributes.
- Evaluate successful products and determine key success factors.
- Use corporate data to understand the margin as well as the revenue implications of a decision.
- Rapidly identify preferred customer segments based on revenue and margin.
- Quickly isolate past preferred customers who no longer buy.
- Identify daily what product is in the manufacturing and distribution pipeline.
- Instantly determine which salespeople are performing, on both a revenue and margin basis, and which are behind.

Financial Analysis
- Compare actuals to budgets on an annual, monthly and month-to-date basis.
- Review past cash flow trends and forecast future needs.
- Identify and analyze key expense generators.
- Instantly generate a current set of key financial ratios and indicators.
- Receive near-real-time, interactive financial statements.

Human Resource Analysis
- Evaluate trends in benefit program use.
- Identify the wage and benefits costs to determine company-wide variation.
- Review compliance levels for EEOC and other regulated activities.

Other Areas
- Warehouses have also been applied to areas such as logistics, inventory, purchasing, detailed transaction analysis and load balancing.

Table 1-1: Examples of Common DW Applications

Table 1-2

Costs
- Hardware, software, development personnel and consultant costs.
- Operational costs like ongoing systems maintenance.

Benefits

Added revenue
- Will the new (business objective) process generate new customers (what is the estimated value?)
- Will the new (business objective) process increase the buying propensity of existing customers (by how much?)
- Is the new process necessary to ensure that the competition doesn't offer a demanded service that you can't match?

Reduced costs
- What costs of current systems will be eliminated?
- Is the new process intended to make some operation more efficient?
If so, how, and what is the dollar value?

Comparison of Typical DW Costs and Benefits

1-4: The Cost of DW
- Expenditures can be categorized as one-time initial costs or as recurring, ongoing costs.
- The initial costs can be further divided into hardware costs and software costs.
- Expenditures can also be categorized as capital costs (associated with acquisition of the warehouse) or as operational costs (associated with running and maintaining the warehouse).

Cost of a Data Warehouse
Rule of thumb: $1 million per 1 terabyte of data

Capital costs, recurring
- Hardware maintenance
- Software maintenance
- Terminal analysis
- Middleware

Capital costs, one-time
- Hardware: disk, CPU, network, terminal analysis
- Software: DBMS, terminal analysis, middleware, log utility, processing, metadata infrastructure

Operational costs, recurring
- Ongoing refreshment
- Integration transformation
- Data model maintenance
- Record identification maintenance
- Metadata infrastructure maintenance
- Archival of data
- Data aging within the DW

Operational costs, one-time
- Integration/transformation processing specification
- Metadata infrastructure population
- System of record definition
- Data dictionary language definition
- Network transfer definition
- CASE/Repository interface
- Initial data warehouse
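
The rule of thumb in 1-4 and the two ways of categorizing expenditures (one-time vs. recurring, capital vs. operational) can be illustrated with a small Python sketch. This is only a rough illustration, not a costing method from the text: the function names `estimate_dw_cost` and `total_by_category` are hypothetical, and every dollar amount in `example_items` is an invented placeholder.

```python
from collections import defaultdict

# Rule of thumb quoted in section 1-4: about $1 million per terabyte of data.
COST_PER_TB_USD = 1_000_000


def estimate_dw_cost(size_tb, cost_per_tb=COST_PER_TB_USD):
    """Rough warehouse cost estimate in USD (hypothetical helper)."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    return size_tb * cost_per_tb


def total_by_category(items):
    """Sum amounts per (timing, kind) pair.

    Each item is (name, timing, kind, amount_usd), where timing is
    "one-time" or "recurring" and kind is "capital" or "operational".
    """
    totals = defaultdict(float)
    for _name, timing, kind, amount_usd in items:
        totals[(timing, kind)] += amount_usd
    return dict(totals)


# Placeholder amounts, invented purely for illustration.
example_items = [
    ("Disk", "one-time", "capital", 400_000.0),
    ("DBMS", "one-time", "capital", 250_000.0),
    ("Hardware maintenance", "recurring", "capital", 60_000.0),
    ("Ongoing refreshment", "recurring", "operational", 90_000.0),
]

if __name__ == "__main__":
    # A 2.5 TB warehouse under the rule of thumb.
    print(estimate_dw_cost(2.5))
    for key, total in sorted(total_by_category(example_items).items()):
        print(key, total)
```

Keeping the timing and the kind as separate labels on each item mirrors the slide's point that the same expenditure can be classified along both dimensions at once.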