UT Knoxville STAT 201 - 5) sld_nominal_variables - D38770


School name University of Tennessee

Course Stat 201- Introduction to Statistics

Pages 52

This preview shows page 1-2-3-25-26-27-28-50-51-52 out of 52 pages.


REPRESENTING NOMINAL PREDICTOR VARIABLES

Nominal Variables
- Predictors in a regression equation do not have to be quantitative variables
- Predictors can be nominal variables
E.g.,
- Male vs Female
- American, German, Japanese, Italian persons
- Different conditions of an experiment

Analysis Strategy for Nominal Variables
- Typically the effect of a nominal variable is assessed with ANOVA
- ANOVA is a special case of linear regression
- We can assess nominal variables with OLS regression

Nominal Variable in Regression
- Beta assesses change in DV per unit change in IV
- Need to represent the levels of the nominal variable with numbers to estimate the amount the DV changes with shifts from one level of a nominal variable to another level
- Regression doesn’t “understand” male or female
- Multiple comparisons are necessary to extract all of the info from a nominal variable with more than 2 levels
- A variable with g levels requires g-1 comparisons
- Each of the g-1 comparisons is represented by a separate predictor

Three Coding Systems
- Dummy Coding
- Effects Coding
- Contrast Coding

Similarity Among Coding Systems
- When the g-1 predictors are simultaneously included in the regression equation, the 3 coding systems produce the same results for the overall model (i.e., R^2 and F-value)
- So, when the g-1 predictors are treated as a set to represent the nominal variable, the 3 coding systems produce the same conclusion regarding the omnibus effect of the nominal variable

Difference Among Coding Systems
- Each coding system tests different hypotheses regarding the comparisons among the levels of the nominal variable
- So, the coding systems produce different estimated regression parameters (B) for the g-1 predictors and different statistical tests of those predictors (i.e., t)
(When the nominal variable has only 2 levels, the effects coding and contrast coding systems are identical, and both result in the same statistical test of the regression parameter, i.e., the same t- and p-value, as dummy coding does. However, the value of the regression parameter will differ between dummy coding and contrast/effects coding.)

A Data Set
- Randomly assign depressed patients to one of three therapy conditions. Subsequently, assess depression (1-8, higher numbers = greater depression)

No Therapy     Smiling Therapy     Exercise Therapy
    7                 2                   3
    7                 1                   1
    6                 1                   2
    6                 2                   -
    -                 2                   -
Mean = 6.50       Mean = 1.60         Mean = 2.00
n = 4             n = 5               n = 3

- Unequal n is not a problem for the regression analysis (it may compromise causal inference)

Data In SAS
[SAS listing not included in the text preview]

To Compare ANOVA & Regression
A one-factor ANOVA indicated that the omnibus effect of therapy on depression was significant, F(2, 9) = 64.97, p = .0001. Two orthogonal contrasts revealed that depression was greater in the no-therapy condition than in the mean of the smiling and exercise therapy conditions, F(1, 9) = 123.4, p = .0001, and the latter conditions did not differ, F(1, 9) = 0.64, p = .4433.

Therapy: F(2, 9) = 64.97, p = .0001
No-therapy vs Smiling & Exercise: F(1, 9) = 123.4, p = .0001
Smiling vs Exercise: F(1, 9) = 0.64, p = .4433

Dummy Coding
- One of the g levels of the nominal variable is treated as a reference level
- The g-1 predictors compare the other levels to the reference level (only when the g-1 predictors are fully partialled)
E.g., Therapy has 3 levels, so 2 predictors are needed
- Treat no-therapy as the reference level
predictor1: smiling vs no-therapy
predictor2: exercise vs no-therapy

Creating g-1 Dummy-Coded Predictors
- Participant receives either 0 or 1 on each of the g-1 predictors
- Receive a 1 if in the condition being compared to the reference level
- Receive a 0 if not in the condition being compared to the reference level
- Receive a 0 if in the condition that serves as the reference level

Dummy Coding as a Function of Condition

Condition      d_smil   d_exer
No-therapy       0        0
Smiling          1        0
Exercise         0        1

Unpartialled & Partialled Predictors
- Unpartialled X1 compares smiling with “not-smiling” (i.e., a weighted mean of exercise and no-therapy)
- Unpartialled X2 compares exercise with “not-exercise” (i.e., a weighted mean of smiling and no-therapy)
- X1 and X2 contain redundant information and are correlated
- Partialling X2 from X1 removes “exercise” info and creates a comparison of smiling with no-therapy
- Partialling X1 from X2 removes “smiling” info and creates a comparison of exercise with no-therapy
Partialling gives the dummy-coded predictors their unique meaning

Creating Dummy-Coded Predictors in SAS
[SAS listing not included in the text preview]

Testing Therapy in SAS
- The set of g-1 predictors contains the effect of Therapy
depression = d_smil d_exer   vs   depression = (intercept only)
Difference between models yields the effect of Therapy

    proc reg;
      model depress = d_smil d_exer;
    run;

SAS Output
- Model comparison reveals the same results as ANOVA
Omnibus effect of Therapy: F(2, 9) = 64.97, p = .0001

Interpretation of Regression Parameters

Condition Mean
No-therapy   Smiling   Exercise
   6.50        1.60      2.00

Depression = 6.50 - 4.90(d_smil) - 4.50(d_exer)

- Y-intercept: mean of the reference level
- B_d_smil = -4.90 (difference between smiling and no-therapy)
Mean level of depression is 4.9 points lower in the smiling-therapy condition than in the no-therapy condition (i.e., 1.60 - 6.50 = -4.90)
- B_d_exer = -4.50 (difference between exercise and no-therapy)
Mean level of depression is 4.5 points lower in the exercise-therapy condition than in the no-therapy condition (i.e., 2.00 - 6.50 = -4.50)

Recovering Condition Means from Model
- Plug the 0/1 codes for each condition into the model
- No-therapy is coded 0 for d_smil and 0 for d_exer
Depression = 6.50 - 4.90(0) - 4.50(0) = 6.50
- Smiling therapy is coded 1 for d_smil and 0 for d_exer
Depression = 6.50 - 4.90(1) - 4.50(0) = 1.60
- Exercise therapy is coded 0 for d_smil and 1 for d_exer
Depression = 6.50 - 4.90(0) - 4.50(1) = 2.00

sr^2 for Dummy-Coded Predictors
- The unique interpretation of each dummy-coded predictor is obtained by partialling each from the other
- Therefore, the sr^2 values are from a simultaneous regression and they do not sum to the R^2 of the full model
- Because the g-1 predictors are a set (i.e., Therapy), it doesn’t make sense to enter them hierarchically (they can be entered as a set in a hierarchical model to partial out the effects of other variables)
- sr^2 for d_smil indicates the proportion of variability in depression that is explained by the difference between smiling therapy and no-therapy

Effects Coding
- A partialled effects-coded predictor compares the mean of a given level of the nominal variable with the unweighted mean of the means of all levels of the nominal variable
E.g., \bar{X}_{smiling} - \frac{\bar{X}_{smiling} + \bar{X}_{exercise} + \bar{X}_{no\text{-}therapy}}{3}
- Derived from an ANOVA framework in which the “effect”
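
As a rough cross-check of the dummy-coding results reported earlier, the following Python sketch rebuilds the two 0/1 predictors from the slide's data set, fits the regression by ordinary least squares, and recovers the three condition means from the fitted coefficients. The slides use SAS; Python is substituted here only for illustration. The names `scores`, `solve`, and `model_means` are hypothetical, and solving the normal equations by hand is an assumption made to keep the example dependency-free, not the method SAS uses internally.

```python
# Depression scores from the data set above, keyed by therapy condition.
scores = {
    "no_therapy": [7, 7, 6, 6],
    "smiling": [2, 1, 1, 2, 2],
    "exercise": [3, 1, 2],
}
reference = "no_therapy"                          # coded 0 on every predictor
compared = [c for c in scores if c != reference]  # g - 1 = 2 dummy predictors

# Design matrix: intercept column plus one 0/1 column per compared condition.
X, y = [], []
for condition, values in scores.items():
    for value in values:
        X.append([1.0] + [1.0 if condition == c else 0.0 for c in compared])
        y.append(float(value))


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# OLS via the normal equations (X'X) b = X'y.
k = len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
intercept, b_smil, b_exer = solve(xtx, xty)

# Recover each condition mean by plugging its 0/1 codes into the fitted model.
model_means = {
    "no_therapy": intercept,
    "smiling": intercept + b_smil,
    "exercise": intercept + b_exer,
}
print(f"Depression = {intercept:.2f} + ({b_smil:.2f})(d_smil) + ({b_exer:.2f})(d_exer)")
for condition, mean in model_means.items():
    print(f"{condition}: {mean:.2f}")
```

Changing `reference` to another condition would change the intercept and both slopes (each slope is always a difference from the reference mean), while the recovered condition means stay the same; note that the unpacked names `b_smil` and `b_exer` assume no-therapy is the reference.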
