# BGSU STAT 4440 - Logistic Regression - 1 slide per page (25 pages)

## Logistic Regression - 1 slide per page

- Pages: 25
- School: Bowling Green State University - Main Campus
- Course: Stat 4440 - Data Mining in Business Analytics

### DATA MINING: Logistic Regression

Dr. İbrahim Çapar, Assistant Professor

### Learning objective

- What is logistic regression?
- Interpreting the results of logistic regression

### Logistic Regression

- Extends the idea of linear regression to situations where the outcome variable is categorical.
- Widely used, particularly where a structured model is useful to explain (profiling) or to predict.
- We focus on binary classification, i.e. $Y = 0$ or $Y = 1$.

### The Logit

Goal: find a function of the predictor variables that relates them to a 0/1 outcome.

- Instead of $Y$ as the outcome variable (as in linear regression), we use a function of $Y$ called the logit.
- The logit can be modeled as a linear function of the predictors.
- The logit can be mapped back to a probability, which in turn can be mapped to a class.

### Logistic Response & Logit Function

- $p$ = probability of belonging to class 1.
- We need to relate $p$ to the predictors with a function that guarantees $0 \le p \le 1$.
- The standard linear function does not:

$$p = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q$$

### Binary Response Variable and Linear Model

[Figure: LP model fitted to a binary (0/1) response]

### Logistic Response & Logit Function

Logit function:

$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q$$

or, equivalently, the logistic response:

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q)}}$$

### Linear Model and Logit Model

[Figure: LP model and logit model compared on the 0 to 1 scale]

### The Odds

The odds of an event are defined as

$$\text{Odds} = \frac{p}{1-p}$$

Or, given the odds of an event, the probability of the event can be computed by

$$p = \frac{\text{Odds}}{1 + \text{Odds}}$$

### The Odds

$$\text{Odds} = e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q}$$

or

$$\ln(\text{Odds}) = \beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q$$

### Odds and Logit as function of p

[Figure: odds and logit plotted as functions of $p$]

### An Example: Universal Bank

| Variable | Description |
|---|---|
| ID | Customer ID |
| Age | Customer's age in completed years |
| Experience | Number of years of professional experience |
| Income | Annual income of the customer ($000) |
| ZIP Code | Home address ZIP code |
| Family | Family size of the customer |
| CCAvg | Average spending on credit cards per month ($000) |
| Education | Education level. 1: Undergrad; 2: Graduate; 3: Advanced/Professional |
| Mortgage | Value of house mortgage, if any ($000) |
| Personal Loan | Did this customer accept the personal loan offered in the last campaign? |
| Securities Account | Does the customer have a securities account with the bank? |
| CD Account | Does the customer have a certificate of deposit (CD) account with the bank? |
| Online | Does the customer use internet banking facilities? |
| CreditCard | Does the customer use a credit card issued by UniversalBank? |

### Data Exploration

[Figure: data exploration]

### Data Preprocessing

- Partition: 60% training, 40% validation.
- Create 0/1 dummy variables for categorical predictors.

### Single Predictor Model

Modeling loan acceptance on income ($x$):

$$\ln(\text{odds}) = \ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x = -6.2774 + 0.0385x$$

### Single Predictor Model

$$\ln\left(\frac{p}{1-p}\right) = -6.2774 + 0.0385x$$

or

$$p = \frac{1}{1 + e^{-(-6.2774 + 0.0385x)}} = \frac{1}{1 + e^{6.2774 - 0.0385x}}$$

### Interpretation

$$\ln\left(\frac{p}{1-p}\right) = -6.2774 + 0.0385x, \qquad e^{0.0385} = 1.039$$

When everything else is held constant, a one-thousand-dollar increase in income multiplies the odds of accepting the personal loan by 1.039, i.e. increases the odds by 3.9% ($1.039 - 1 = 0.039$), on average.

### Seeing the Relationship

[Figure: estimated probability of loan acceptance against income]

### Classifying Observations

- The model produces an estimated probability of being a "1".
- Convert it to a classification by establishing a cutoff level.
- If the estimated probability > cutoff, classify as "1".

### Ways to Determine Cutoff

- 0.50 is a popular initial choice.
- Additional considerations (see Chapter 5):
  - Maximize classification accuracy
  - Maximize sensitivity (subject to a minimum level of specificity)
  - Minimize false positives (subject to a maximum false negative rate)
  - Minimize expected cost of misclassification (need to specify costs)

### Considering All Variables

[Table: logistic regression output with all predictors]

### Interpretation

Credit Card, EduGrad: 0.35, 48.17

### Performance

[Figure: model performance]

### Performance

[Figure: model performance]
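The preprocessing step in the slides (a 60/40 training/validation partition and 0/1 dummy variables for categorical predictors) can be sketched in plain Python as follows. This is only an illustration: the records are made up, `partition` and `education_dummies` are hypothetical helpers, the name `EduProf` is an assumed counterpart to the `EduGrad` variable mentioned in the slides, and treating Undergrad as the reference category is also an assumption.

```python
import random


def partition(rows, train_fraction=0.6, seed=1):
    """Shuffle a copy of the rows and split it into training and validation sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def education_dummies(level):
    """Turn Education (1, 2, 3) into two 0/1 dummies.

    Undergrad (1) is assumed to be the reference category, so it maps to
    all zeros. EduProf is a hypothetical name for the level-3 dummy.
    """
    return {
        "EduGrad": 1 if level == 2 else 0,
        "EduProf": 1 if level == 3 else 0,
    }


# Made-up records: (customer id, education level).
customers = [(i, 1 + i % 3) for i in range(10)]

train, valid = partition(customers)
print(len(train), "training rows,", len(valid), "validation rows")
print(education_dummies(2))
```

Using one fewer dummy than the number of categories avoids perfectly collinear predictors; which category serves as the reference only changes how the coefficients are read.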
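The relations between probability, odds, and logit described in the slides can be checked numerically. The sketch below uses only the standard definitions (odds = p / (1 - p), logit = ln(odds)); the function names are illustrative and not taken from the course material.

```python
import math


def odds_from_prob(p):
    """Odds = p / (1 - p), defined for 0 <= p < 1."""
    return p / (1.0 - p)


def prob_from_odds(odds):
    """Invert the odds: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)


def logit(p):
    """Logit = natural log of the odds."""
    return math.log(odds_from_prob(p))


def logistic(z):
    """Map a logit value back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


for p in (0.1, 0.5, 0.9):
    print(f"p={p:.1f}  odds={odds_from_prob(p):.3f}  logit={logit(p):+.3f}")
```

The logit ranges over the whole real line while `logistic` always returns a value strictly between 0 and 1, which is why the linear model is placed on the logit scale rather than on `p` directly.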
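To see what the single-predictor model implies, the fitted equation can be evaluated directly. The sketch below takes the coefficients from the slides as an intercept of -6.2774 and an income coefficient of 0.0385; the negative sign on the intercept is an assumption, since the preview text shows no signs. The function names are hypothetical.

```python
import math

B0 = -6.2774  # intercept (negative sign assumed, see the note above)
B1 = 0.0385   # coefficient on Income, measured in $000


def loan_odds(income_k):
    """Estimated odds of accepting the loan at a given income ($000)."""
    return math.exp(B0 + B1 * income_k)


def loan_probability(income_k):
    """Estimated probability of accepting the loan at a given income ($000)."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * income_k)))


# Multiplicative change in odds per extra $1,000 of income.
odds_ratio = math.exp(B1)
print(f"odds ratio per $1,000: {odds_ratio:.3f}")

for income in (50, 100, 150, 200):
    print(f"income={income}k  p={loan_probability(income):.3f}")
```

The ratio `loan_odds(x + 1) / loan_odds(x)` equals `exp(B1)` for every `x`, which is what the 3.9% statement in the slides expresses; the change in probability, by contrast, depends on the starting income.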
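The cutoff rule and the criteria for choosing a cutoff listed in the slides can be illustrated with a small example. The labels and estimated probabilities below are made up, the helper names are hypothetical, and the misclassification costs (a false negative costing five times a false positive) are an assumption chosen only to show how costs can change the preferred cutoff.

```python
def classify(probs, cutoff=0.5):
    """Classify as 1 when the estimated probability exceeds the cutoff."""
    return [1 if p > cutoff else 0 for p in probs]


def summarize(actual, probs, cutoff, cost_fp=1.0, cost_fn=1.0):
    """Accuracy, sensitivity, specificity and average misclassification cost."""
    predicted = classify(probs, cutoff)
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, y in pairs if a == 1 and y == 1)
    fn = sum(1 for a, y in pairs if a == 1 and y == 0)
    fp = sum(1 for a, y in pairs if a == 0 and y == 1)
    tn = sum(1 for a, y in pairs if a == 0 and y == 0)
    n = len(pairs)
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "avg_cost": (cost_fp * fp + cost_fn * fn) / n,
    }


# Made-up validation data: true classes and estimated probabilities.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.92, 0.61, 0.35, 0.48, 0.22, 0.10, 0.55, 0.05]

for cutoff in (0.5, 0.3):
    print(cutoff, summarize(actual, probs, cutoff, cost_fp=1.0, cost_fn=5.0))
```

In this toy data both cutoffs give the same accuracy, but the lower cutoff trades specificity for sensitivity and has the lower average cost once false negatives are weighted more heavily.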
