UT SW 388R7 - Solving Standard Multiple Regression Problems

This preview shows page 1-2-3-26-27-28 out of 28 pages.


Solving Standard Multiple Regression Problems

We have used regression analysis to examine relationships between interval level dependent variables and interval and dichotomous level independent variables. However, many important independent variables, such as race, religion, and marital status, are nominal level variables. We can extend our use of multiple regression to include these variables by dummy-coding them. We can also apply dummy-coding to ordinal level variables and avoid any contention about using ordinal level variables in statistics requiring interval level data.

Dummy-coding is done by converting the categories of the nominal level variable into a set of dichotomous variables. The number of dichotomous variables needed to dummy-code a variable is one less than the number of categories in the original variable. Thus, if a variable had three categories, we would retain its information using two dummy-coded variables. If a variable had four categories, we would replace it with three dummy-coded variables, and so forth.

The basic idea behind dummy-coding is to select one category of the variable as the “reference” category, and compare all other categories to it. A difference in means would be reported as the difference between one of the retained categories and the reference category.

For example, if we were studying participation in extracurricular activities by class standing, we would have categories for freshmen, sophomores, juniors, and seniors. If we selected freshmen as the reference category, we would interpret the b coefficients as the difference between freshmen and sophomores, between freshmen and juniors, and between freshmen and seniors.

This dummy-coding strategy is referred to as “indicator” coding to distinguish it from “deviation” coding (also called “effects” coding). Deviation coding lets us compare each category of the independent variable to the average for all categories.
Instead of interpreting the b coefficients as the difference between freshmen and sophomores, we would interpret the coefficient for the deviation-coded variable as the difference between sophomores and the average for all four class standings. Deviation coding, like indicator coding, requires us to designate one category as the reference category, which gets no dummy-coded variable of its own in the regression.

An example of indicator coding is shown in the following table:

Categories of        New Variable    New Variable    New Variable
Original Variable    SophStudents    JrStudents      SrStudents
Freshmen              0               0               0
Sophomores            1               0               0
Juniors               0               1               0
Seniors               0               0               1

If a student was in the freshman reference category, they would have zeros for all three of the new variables. If a student was a sophomore, they would have a one for the new variable SophStudents, and a zero for the other new variables. If a student was a junior, they would have a one for the new variable JrStudents, and a zero for the other new variables. If a student was a senior, they would have a one for the new variable SrStudents, and a zero for the other new variables.

Deviation coding differs by assigning a negative one (-1) to the reference category rather than a zero, as shown in the following table:

Categories of        New Variable    New Variable    New Variable
Original Variable    SophStudents    JrStudents      SrStudents
Freshmen             -1              -1              -1
Sophomores            1               0               0
Juniors               0               1               0
Seniors               0               0               1

Interpretations of the relationships under the two coding schemes differ because each scheme makes a different comparison. Indicator coding compares each category to the reference category, while deviation coding compares each category to the mean of all categories. Since the comparisons are different, the slope (b) coefficients for the categories will be different under the two schemes.

In some SPSS procedures, dummy-coding is done automatically if you declare the variable to be a factor (general linear models) or a categorical covariate (logistic regression).
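To make the two coding tables and their interpretation concrete, here is a minimal Python sketch. It is not the SPSS script used for this week's problems. The scores are made-up numbers chosen only for illustration, and the names `scores`, `code_row`, `solve`, and `fit` are hypothetical. The sketch builds both sets of codes for the class-standing example and fits an ordinary least-squares regression under each scheme.

```python
# Made-up extracurricular activity counts, two students per class standing.
# These numbers are illustrative only; they do not come from any data set.
scores = {
    "Freshmen":   [2.0, 4.0],   # group mean 3
    "Sophomores": [3.0, 5.0],   # group mean 4
    "Juniors":    [5.0, 7.0],   # group mean 6
    "Seniors":    [6.0, 8.0],   # group mean 7
}
CATEGORIES = list(scores)
REFERENCE = "Freshmen"


def code_row(category, scheme):
    """Return the codes for SophStudents, JrStudents, SrStudents."""
    retained = [c for c in CATEGORIES if c != REFERENCE]
    if category == REFERENCE:
        # Indicator coding gives the reference category zeros;
        # deviation coding gives it -1 on every new variable.
        fill = 0 if scheme == "indicator" else -1
        return [fill] * len(retained)
    return [1 if category == c else 0 for c in retained]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(scheme):
    """Least-squares fit of score on an intercept plus the three codes."""
    rows, y = [], []
    for category, values in scores.items():
        for value in values:
            rows.append([1] + code_row(category, scheme))
            y.append(value)
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return [round(c, 6) for c in solve(xtx, xty)]


indicator_fit = fit("indicator")   # [intercept, b_Soph, b_Jr, b_Sr]
deviation_fit = fit("deviation")
print("indicator:", indicator_fit)
print("deviation:", deviation_fit)
```

With these made-up scores, the indicator fit has an intercept equal to the freshman mean (3) and slopes of 1, 3, and 4, the differences between each retained category and freshmen. The deviation fit has an intercept equal to the mean of the four category means (5) and slopes of -1, 1, and 2, the differences between each retained category and that mean. The groups here are the same size, so the mean of the category means equals the mean for all students; with unequal group sizes the two would differ.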
These SPSS procedures offer a limited number of choices over which category is treated as the reference group, usually the last or first category. To choose another category, we would have to recode the variable so that our chosen group had the highest or lowest code number. We can also use SPSS Recode commands to dummy-code any variable and use the results in any procedure that we want.

The script which we will use for this week’s problems does deviation coding for each non-metric variable, including variables that are dichotomous and could be used in the regression analysis without dummy-coding. The rationale for dummy-coding dichotomous variables is to make their interpretation consistent with the interpretation for nominal variables, i.e. comparing the category to the mean for both categories.

The Problem in Blackboard

The Statement about Level of Measurement

The problem statement tells us:
• the variables included in the analysis
• whether each variable should be treated as metric or non-metric
• the reference category for non-metric variables to be dummy-coded
• the alpha for both the statistical relationships and the diagnostic tests

The first statement in the problem asks about level of measurement. Standard multiple regression requires that the dependent variable and the metric independent variables be interval level, and that the non-metric independent variables be dummy-coded if they are not dichotomous. The only way we would violate the level of measurement would be to use a nominal variable as the dependent variable, or to attempt to dummy-code an interval level variable that was not grouped.

Marking the Statement about Level of Measurement

Satisfying the Assumptions of Multiple Regression

Mark the check box as a correct statement because:
• The metric dependent variable "occupational prestige score" [prestg80] is interval level, satisfying the requirement for dependent variables.
• "Income" [rincom98] is ordinal level, but the problem calls for treating it as metric, applying the common convention of treating ordinal variables as interval level.
• "Highest academic degree" [degree] is ordinal level and could be treated as metric, but the problem calls for dummy-coding it, so we will satisfy the requirement by treating it as non-metric.
• The non-metric independent variable "race" [race] is nominal level, but will satisfy the requirement for independent variables when dummy-coded.

The next four statements identify the
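The reasoning behind marking the level-of-measurement statement can be sketched as a small check. This is a hedged illustration, not part of the course script. The variable names, levels, and treatments come from the bullet list above, while the `Variable` class, the `violations` function, and the `grouped` flag are hypothetical. The check encodes only the two violations named in the text: a nominal dependent variable, and dummy-coding an interval level variable that is not grouped.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    name: str
    level: str             # "nominal", "ordinal", "interval", or "dichotomous"
    treatment: str         # "metric" or "non-metric" (to be dummy-coded)
    grouped: bool = False  # hypothetical flag: interval values recorded in groups


def violations(dependent, independents):
    """Return the level-of-measurement violations named in the text.

    An empty list means the statement is marked as correct.
    """
    problems = []
    if dependent.level == "nominal":
        problems.append(
            f"{dependent.name}: nominal variable used as the dependent variable"
        )
    for var in independents:
        dummy_coded = var.treatment == "non-metric"
        if dummy_coded and var.level == "interval" and not var.grouped:
            problems.append(
                f"{var.name}: ungrouped interval variable would be dummy-coded"
            )
    return problems


# Variables and treatments as described in the bullet list above.
dependent = Variable("prestg80", "interval", "metric")
independents = [
    Variable("rincom98", "ordinal", "metric"),
    Variable("degree", "ordinal", "non-metric"),
    Variable("race", "nominal", "non-metric"),
]

found = violations(dependent, independents)
print("correct statement" if not found else found)
```

For the variables in this problem the list of violations is empty, which matches marking the check box as a correct statement.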

