Data Warehousing

Built for strategic planning, not tactical.

Why Do This?
- Improve existing business processes
- Understand historical phenomena
- Seek patterns and "golden nuggets"

It's hard and expensive. Not for the weak of heart.

DW and DM
- Data Warehouse: collect data
- Data Mining: analyze collected data
- Yin/Yang: complementary opposites that interact within a greater whole

Examples of DM Use
- Predict future price of fine French wines
- Determine best metrics to predict athletic performance (baseball, football)
- Define best practices for certain medical diagnoses
- Refusing service to high-risk people for auto insurance and rental cars
- Determining level of service for frequent flyers

Moneyball

"Subjectivity ruled the day in evaluating players," he said. "We had a completely new set of metrics that bore no resemblance to anything you'd seen. We didn't solve baseball. But we reduced the inefficiency of our decision making."

A particular problem in baseball is appearance bias: the notion that some athletes look more like great baseball players than others. It's also an issue in business, DePodesta said, citing a data point from Malcolm Gladwell on height and business success. Gladwell found that although just 3.9 percent of American males are 6-foot-2 or taller, about 30 percent of Fortune 500 CEOs are 6-foot-2 or taller.

Comparison of Queries: SQL, OLAP, Data Mining

Online Analytical Processing (OLAP)
- Examine collected data to find out why something happened
- One-time event?
- Is there a pattern or trend that can be exploited?
- Why were sales in SW low for 4th Quarter?

OLAP
- OLAP can look very much like complex SQL queries, but there is a difference
- OLAP operates on multi-dimensional data (cubes)
- OLAP provides drill-up/drill-down to further explore what you see and learn
- OLAP requires tools that allow pivoting and cube formulas like totals and aggregations

OLAP Example: MicroStrategy

MicroStrategy Wisdom Tool

I pulled these data points directly from the Facebook likes of each of the brand pages using a free consumer tool from MicroStrategy called Wisdom. Using this tool, I can even tell that Coca-Cola fans are likely to also enjoy the odd Oreo cookie and bag of Pringles. Wisdom illustrates the potential for companies to leverage the massive amount of data embedded in Facebook to help target more refined advertising campaigns, build new product affinities, and ultimately deliver greater value to their customers. New analytics tools from MicroStrategy and others designed to integrate with social networks will make this possible.

Data Mining Models and Tasks
- Directed
- Undirected

Directed Ex: Classification
- Directed because it deals with discrete outcomes like pass/fail, sunny/cloudy/rain
- Build pre-defined classes, then fit new points into the tree
- Typically decision trees, regression

Classification Process: Model Construction

Training Data -> Classification Algorithms -> Classifier (Model)

Training Data:

NAME   RANK             YEARS   TENURED
Mike   Assistant Prof   3       no
Mary   Assistant Prof   7       yes
Bill   Professor        2       yes
Jim    Associate Prof   7       yes
Dave   Assistant Prof   6       no
Anne   Associate Prof   3       no

Classifier (Model): IF rank = 'professor' OR years > 6 THEN tenured = 'yes'

Classification Process: Use the Model in Prediction
- Use test data to refine the model
- Run the model with new data and get a prediction

Testing Data:

NAME      RANK             YEARS   TENURED
Tom       Assistant Prof   2       no
Merlisa   Associate Prof   7       no
George    Professor        5       yes
Joseph    Assistant Prof   7       yes

Unseen Data: (Jeff, Professor, 4) -> Tenured?

Decision Trees
- Most common usage of classification methods
- A decision tree model consists of a set of decision rules for dividing a large heterogeneous population into smaller, more homogeneous groups with respect to a particular target variable

An Example: History of when I play golf

Outlook    Temperature   Humidity   Windy   Play
sunny      hot           high       false   N
sunny      hot           high       true    N
overcast   hot           high       false   P
rain       mild          high       false   P
rain       cool          normal     false   P
rain       cool          normal     true    N
overcast   cool          normal     true    P
sunny      mild          high       false   N
sunny      cool          normal     false   P
rain       mild          normal     false   P
sunny      mild          normal     true    P
overcast   mild          high       true    P
overcast   hot           normal     false   P
rain       mild          high       true    N

Resulting decision tree model:

Outlook
  sunny    -> Humidity
                high   -> N
                normal -> P
  overcast -> P
  rain     -> Windy
                true   -> N
                false  -> P

Undirected Ex: Association
- Undirected, as there are no pre-defined classes to fit data into
- Take existing data and see if things go together
- Looks for associations between events
- Sometimes called Market Basket Analysis
- People who buy beer buy chips
- People who buy soup buy crackers

Example

A typical example is in retail, where historic data might identify that customers who purchase the Gladiator DVD and the Patriot DVD also purchase the Braveheart DVD. The historic data might indicate that the first two DVDs are purchased by only 5% of all customers. But 70% of these then also purchase Braveheart. This is an interesting group of customers. As a business, we may be able to take advantage of this observation by targeting advertising of the Braveheart DVD to customers who have purchased both Gladiator and Patriot.

Market Basket Analysis (MBA)

Retail: each customer purchases a different set of products, in different quantities, at different times. MBA uses this information to:
- Identify who customers are (not by name)
- Understand why they make certain purchases
- Gain insight about its merchandise (products): fast and slow movers, products which are purchased together, products which might benefit from promotion
- Take action: store layouts, which products to put on specials, promote, coupons

Combining all of this with a customer loyalty card, it becomes even more valuable.

Not Always Retail

The Australian Health Insurance Commission discovered an unexpected correlation between two pathology tests performed by pathology laboratories and paid for by insurance (Viveros et al., 1999). It turned out that only one of the tests was actually necessary, yet regularly both were being performed. The insurance organization was able to reduce over-payment by disallowing payment for both tests, resulting in a saving of some half a million dollars per year.

Lipstick and the Economy

Rising Lipstick Sales May Mean Pouting Economy
LEAD STORY DATELINE: The Wall Street Journal, November 26, 2001

Recently there have been decided increases in the sales of lipstick in the United States. For example, the company that produces MAC Cosmetics reported a 12 percent increase in sales over the last month. Overall, the market research firm Information Resources Inc. reported that sales of
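
Returning to the tenure example from the classification slides: the two-step process (build the model from training data, then check it against test data and apply it to unseen data) can be sketched in a few lines of Python. This is only an illustrative sketch. The rows are read off the slides' training and testing tables, the rule is assumed to be "rank is Professor OR years > 6" (the comparison operators are inferred from the training rows), and the names `Person`, `predict_tenured`, and `accuracy` are hypothetical helpers, not part of any tool mentioned in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    rank: str
    years: int


def predict_tenured(person: Person) -> bool:
    """Assumed rule: IF rank = professor OR years > 6 THEN tenured = yes."""
    return person.rank == "Professor" or person.years > 6


# (person, actual tenured label) pairs, as read from the slides' tables.
TRAINING = [
    (Person("Mike", "Assistant Prof", 3), False),
    (Person("Mary", "Assistant Prof", 7), True),
    (Person("Bill", "Professor", 2), True),
    (Person("Jim", "Associate Prof", 7), True),
    (Person("Dave", "Assistant Prof", 6), False),
    (Person("Anne", "Associate Prof", 3), False),
]

TESTING = [
    (Person("Tom", "Assistant Prof", 2), False),
    (Person("Merlisa", "Associate Prof", 7), False),
    (Person("George", "Professor", 5), True),
    (Person("Joseph", "Assistant Prof", 7), True),
]


def accuracy(rows) -> float:
    """Fraction of rows where the rule's prediction matches the label."""
    correct = sum(predict_tenured(person) == label for person, label in rows)
    return correct / len(rows)


if __name__ == "__main__":
    print("training accuracy:", accuracy(TRAINING))
    print("testing accuracy:", accuracy(TESTING))
    jeff = Person("Jeff", "Professor", 4)  # the unseen row from the slides
    print("Jeff tenured?", "yes" if predict_tenured(jeff) else "no")
```

Under these assumptions the rule fits all six training rows but misclassifies one test row (Merlisa, an Associate Prof with 7 years who is not tenured). That is the kind of miss the "use test data to refine the model" step is meant to surface.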


UMD BMGT 301 - Data Warehousing
