Factors Affecting the Income of Canada's Residents in the 1970s

Group 5: Ben Wright, Bin Ren, Hong Wang, Jake Stamper, James Rogers, Yuejing Wu

Data Source
- Census of Canada
- Collected by the Canadian Government in 1971
- 102 different occupational categories
- 4 occupational categories had incomplete data
- Categories represent data aggregated over 1000s of employees

Definition of variables
- Gender: % of women in occupation
- Years of Education: average number of years of education per worker
- Job prestige: rating assigned based on a social survey conducted in the mid-1960s
- Job types: Blue collar (e.g. janitor), Professional (e.g. lawyer), White collar (e.g. insurance agent)

What factors affected the occupational income of Canada's residents in 1971?

Step 1: Data preparation
- Removal of incomplete observations: 4 types of employment were not classified into a type (baby sitters, athletes, newsboys and farmers)
- Removal of non-descriptive statistics: Census code

Step 2: Exploratory data analysis
1. Professional occupations have higher average income, prestige scores and years of education than blue and white collar jobs
2. White collar jobs on average employ a larger percentage of women

Step 3: Pairwise scatter plot to see the relationships between variables
[Figure: pairwise scatter plot matrix; panel annotations 57, 87, 45, 70]

Step 4: Linear regression
Data output

Variable     Coefficient   StdDev     T value   P stat
Education    131.18        288.75     0.454     0.650
Women        -53.235       9.83       -5.415    0.00000050
Prestige     139.20        36.40      3.82      0.00024
Type b.c.    7.32          3037.27    0.002     0.998
Type prof    516.47        3519.59    0.147     0.884
Type w.c.    355.31        3135.86    0.113     0.91

R2 = 0.9023, F stat = 120, P value = 0.00000000000000022

Step 5: Test the validity of linear regression
- Normality: Data is skewed towards ...
- Heteroskedasticity: Variance is not constant
- R2 = 90%
- Data is heteroskedastic; need to perform data transformation

Step 6: Log Transformation
- log(income) approximates a normal distribution

Results of linear regression on log transformation

Variable     Coef      StdDev    T value   P stat
Education    0.0076    0.0255    0.3       0.765
Women        -0.0085   0.0009    -9.467    3.01e-15
Prestige     0.0208    0.0033    6.340     8.42e-09
Type b.c.    7.8720    0.1811    43.462    <2e-16
Type prof    7.8584    0.3019    26.034    <2e-16
Type w.c.    7.9428    0.2453    32.379    <2e-16

- Education is not a significant variable and can be removed from the model

Are different models needed for different ranges of variables?
[Figure: linear relationship panels for the variables Women, Prestige and Type]
- Linear model explains the entire range of observations

Outliers affecting the model
- Possible outliers: the model may not account for a variable which explains these data points
- Model disregarding outliers: the total sum of squared residuals is further reduced by removing outliers

Final Model
This means that, regardless of your job type, if you switched between jobs with the same level of prestige (e.g. 62) to one which had a lower percentage of women (e.g. 57% to 10%), you could increase your income.

Conclusions
- The level of prestige associated with a particular occupation, more than education, best describes the income it will earn
- Occupations which employ a higher percentage of women will offer a lower income
- Job type (i.e. b.c., w.c. or prof) can be used to explain income differences between occupations
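
The job-switch interpretation under Final Model can be worked through numerically. The sketch below is illustrative only. It borrows the Women and Prestige coefficients from the log-transformation regression table, which still includes education, because the final model's own coefficients are not given in the text. It also assumes the Women coefficient is negative: the minus sign is not visible in the table, but the conclusions state that a higher percentage of women goes with lower income. The function name income_ratio is hypothetical.

```python
import math

# Coefficients on log(income), taken from the log-transformation table.
# Assumption: the Women coefficient is negative (sign not visible in the
# table; inferred from the stated conclusion about women and income).
COEF_WOMEN = -0.0085
COEF_PRESTIGE = 0.0208


def income_ratio(women_from, women_to, prestige_from, prestige_to):
    """Ratio of predicted incomes (new job / old job).

    Job type and education are held fixed, so their terms cancel
    when the two predicted log incomes are subtracted.
    """
    delta_log = (COEF_WOMEN * (women_to - women_from)
                 + COEF_PRESTIGE * (prestige_to - prestige_from))
    # The model is linear in log(income), so a difference in logs
    # becomes a multiplicative factor on income.
    return math.exp(delta_log)


# Same prestige (62), percentage of women drops from 57 to 10.
ratio = income_ratio(women_from=57, women_to=10,
                     prestige_from=62, prestige_to=62)
print(f"predicted income ratio: {ratio:.2f}")
```

Under these assumptions the ratio comes out at roughly 1.49, i.e. about 49% higher predicted income for the job with fewer women. Because the response is log(income), the effect is a percentage change and not a fixed dollar amount.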