Agenda
Soc 5811 Lab #12
11.28.05

I. Welcome
   1. Review last lab.
   2. Lab handouts, datasets, and other information can be found at: http://www.tc.umn.edu/~long0324/

II. Objectives
   1. Multiple regression analysis.
   2. Dummy variables and interaction terms.
   3. We’ll briefly discuss outliers, multicollinearity, and other important issues necessary for your final paper.
   4. Next week we will discuss more advanced techniques regarding dummy variables, interaction terms, multicollinearity, and outliers. Everything we do today, however, should get you pretty far on your final paper.

III. Using sets when you have lots of variables…
   1. Many of you are using datasets with hundreds, or even thousands, of variables. SPSS allows you to create variable sets so that only the variables you are interested in show up in the dialog boxes.
   2. To create a set of variables, go to Utilities and Define Sets. Name your subset and place the appropriate variables in the box. After you have added all the variables, click Add Set.
   3. To use the set you have created, go to Utilities and Use Sets. Place only the sets you want to use in the Sets in Use box (i.e., remove ALLVARIABLES, but keep your new set and NEWVARIABLES so that any variables you construct later will be included as well).
   4. Create a set with the following 2002 GSS variables we will be using today:
      educ paeduc papres80 age region polviews sex rincom98 attend pres80

IV. Multiple regression example
   1. What is the relationship between a respondent’s education (educ) and his or her father’s education (paeduc) and job prestige (papres80)?
   2. After checking the frequencies for missing values, construct a matrix scatterplot to determine whether a linear relationship exists among the three variables.
   3. The assumptions for multiple regression are a little different from those for bivariate regression. Each is listed below, but we will focus primarily on the properties of the error term.
      a. The relationships between the dependent variable and each of the independent variables must be linear. Just like bivariate! Also, the model must be properly specified (we will discuss this briefly).
      b. All variables must be measured without error. This is probably the hardest to ensure, since some error is common in survey construction and aggregate data.
      c. The error term, or residual, must be conditionally normal and homoskedastic. It also has a mean of zero (why is this true?). Finally, the error term cannot be correlated with any of the independent variables in the model. To determine whether you have met the assumptions for the error term, save the unstandardized and standardized residuals when conducting the regression analysis (click the Save button in the regression window). To assess conditional normality, construct histograms of the residuals for different values of each independent variable (just like bivariate regression). To assess homoskedasticity, plot the residuals against each independent variable and look at the variance around the regression line. Does it appear equal?
      d. Lastly, the error terms in systems of equations cannot be correlated. This is beyond the scope of this class, but it will be important later in your life. I promise.
   4. Conduct a multiple regression analysis with the above variables. The SPSS process is the same as for bivariate regression, only with more than one independent variable. Check the linearity assumption before you begin; the error assumptions can only be checked after you have saved the residuals from the regression analysis.
   5. What did you find?
      a. First, did we meet the assumptions for multiple regression?
      b. How much of the variance in education is explained by the linear combination of father’s education and job prestige? What is the adjusted variance explained?
      c. Which independent variable has the biggest impact? How do the standardized coefficients tell us this?
      d. Write out the regression equation:
   6. What does SPSS do with cases with missing values for some of the independent variables? The default for SPSS is called listwise deletion, in which any case with a missing value on any of the independent or dependent variables is excluded from the analysis completely. The alternative is pairwise deletion, in which the correlation for each pair of variables is calculated using all cases that have values for both variables, regardless of missing values on other variables. However, in this case you are potentially calculating correlations for different groups of cases within the same regression equation. Thus, it is best to stick to the default.
   7. What are control variables? Control variables are simply uninteresting variables that still explain a lot of the variance in the model. If you do not include control variables, you may get spurious effects from other independent variables. For example, population and GDP largely explain many cross-national phenomena, which isn’t very interesting. By always including GDP and population in the model, however, you get more “real” effects of the other independent variables. Another example is predicting prison sentences based on gender while controlling for violent crime: violent crime convictions will always be a big predictor of long prison sentences. There is no special procedure for using control variables in SPSS. Just add them to the model like any other independent variable.

V. Multiple regression with dummy variables
   1. Recall that dummy variables allow you to compare groups within the regression equation. Coefficients for dummy variables are not slopes, but rather differences in constants. Dummy variables can be constructed from any nominal variable. Remember to always exclude one dummy category from your equation (it becomes the reference category), or the model will “blow up,” because the full set of dummies is perfectly collinear with the constant.
   2. In the spirit of last year’s election, test whether one’s region (region) has any effect on one’s political views (polviews), controlling for age (age), education (educ), and income (rincom98). Construct dummy variables for region, excluding Pacific as your reference category. Be sure to check the frequencies of your other independent variables for missing values.
   3. After conducting the regression, what did you find? What relationships were significant? Is it surprising that respondents in the eastern-southern United States have higher constants? (Think about how the polviews scale is constructed.)
   4. Construct a dummy variable for female (using sex).
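
The two ideas in the dummy-variable section (drop cases listwise, then exclude one category as the reference) can be sketched outside SPSS. The Python below is only an illustration under stated assumptions: the cases are invented toy values, not GSS data; the helper names listwise_delete, design_matrix, and ols are hypothetical; and the model contains region dummies only, without the age, education, and income controls used in the lab exercise. It uses exact fractions so that the "blow up" from including every category shows up as an exact zero pivot. It is not how SPSS computes the regression.

```python
from fractions import Fraction

# Invented toy cases (NOT real GSS data): (region, polviews); None = missing.
RAW = [("Pacific", 3), ("Pacific", 4), ("Pacific", 2),
       ("South", 5), ("South", 6), ("South", 4), ("South", None),
       ("Midwest", 4), ("Midwest", 5), ("Midwest", 3), ("Midwest", 4),
       (None, 4)]


def listwise_delete(cases):
    """Drop every case that has a missing value on any variable."""
    return [c for c in cases if all(v is not None for v in c)]


def design_matrix(cases, reference):
    """Constant plus one 0/1 dummy per region, omitting the reference."""
    categories = sorted({region for region, _ in cases} - {reference})
    X = [[Fraction(1)] + [Fraction(int(region == c)) for c in categories]
         for region, _ in cases]
    y = [Fraction(score) for _, score in cases]
    return categories, X, y


def ols(X, y):
    """Solve the normal equations (X'X)b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y], held as exact fractions.
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if A[r][col] != 0), None)
        if pivot is None:
            raise ValueError("perfect collinearity: exclude one dummy category")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


complete = listwise_delete(RAW)          # 10 of the 12 toy cases remain
categories, X, y = design_matrix(complete, reference="Pacific")
coefs = ols(X, y)
for name, b in zip(["(constant)"] + categories, coefs):
    print(name, float(b))
```

In this toy model the constant equals the mean of the reference group (Pacific), and each dummy coefficient equals that group's mean minus the Pacific mean, which is what "differences in constants" refers to. Calling design_matrix(complete, reference=None) keeps all three dummies, and ols then raises the collinearity error.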


U of M SOC 5811 - Lecture notes
