PSU STAT 501 - Qualitative predictor variable

Stat 501 Lab 13

1 A two-group qualitative predictor variable

The dexterity.txt data set contains the results of a study on the effect of biofeedback and manual dexterity on the ability of patients to perform a complicated task accurately. Twenty-eight (28) patients were randomly selected from those referred for physical therapy. The 28 patients were then randomly assigned to either receive or not receive biofeedback. Specifically, the data collected were:

• the response $y$, the number of consecutive repetitions of the task completed before an error was made
• $x_1$ = dexter, a score of the patient’s manual dexterity
• $x_2$ = bio, a qualitative variable indicating “yes” the patient received biofeedback or “no” the patient did not receive biofeedback.

We’ll use the data set to get practice at fitting linear regression models with a two-group qualitative variable:

1. Create a scatter plot with y on the y-axis and dexter on the x-axis. In doing so, use the qualitative (“grouping”) variable bio to denote whether each data point is from a patient who received biofeedback or from a patient who did not.

2. Use Minitab’s Manip >> Code command to create a new indicator variable, bio01, say, that contains (0, 1) codes: the number 0 when bio = no and the number 1 when bio = yes. If we assume the data follow the regression model:

$$Y_i = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon_i$$

where $x_2$ = bio01, what is the mean response function for the group of patients who did not receive biofeedback? And, what is the mean response function for the group of patients who did receive biofeedback?

3. Fit the multiple linear regression model with y as the response and $x_1$ = dexter and $x_2$ = bio01 as predictors. What is the estimated regression function for the group of patients who did not receive biofeedback? And, what is the estimated regression function for the group of patients who did receive biofeedback?

4. What do each of the estimated regression coefficients $b_0$, $b_1$ and $b_2$ tell us?

5. To get a visual picture of the two different estimated regression functions, create another scatter plot of the data, but this time “annotate” the graph with the two estimated lines. What is the defining characteristic of the estimated regression lines? Do the lines appear to fit the data well?

6. What parameter quantifies the difference in the mean number of consecutive repetitions between patients receiving biofeedback and those who don’t (for all levels of manual dexterity)? Use the regression output you obtained in question (3) to estimate this parameter with 95% confidence. Can we be confident that there is a difference in the mean number between the two groups?

7. Now, let’s investigate what impact a different coding scheme has on the regression analysis. Create a new indicator variable, bio11, say, that contains (−1, 1) codes: the number −1 when bio = no and the number 1 when bio = yes. Now, if we assume the data follow the regression model:

$$Y_i = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon_i$$

where $x_2$ = bio11, what is the mean response function for the group of patients who did not receive biofeedback? And, what is the mean response function for the group of patients who did receive biofeedback?

8. Now, fit the multiple linear regression model with y as the response and $x_1$ = dexter and $x_2$ = bio11 as predictors. What is the estimated regression function for the group of patients who did not receive biofeedback? And, what is the estimated regression function for the group of patients who did receive biofeedback? Are the estimated regression functions different than those obtained when using the (0, 1) coding scheme?

9. What do each of the estimated regression coefficients $b_0$, $b_1$ and $b_2$ tell us now? Is the interpretation of any of the estimated regression coefficients different than the interpretations when using the (0, 1) coding scheme?

2 A three-group qualitative predictor variable

The data set realestate.txt contains a random sample of data collected on residential sales in a large city. The variables collected are:

• y = sales, the sales price, in thousands of dollars
• area, the area of the home in hundreds of square feet
• bed, the number of bedrooms in the house
• rooms, the total number of rooms in the house
• age, the age of the house, in years
• loc, where A represents a home in the inner suburbs, B represents a home in the outer suburbs, and C represents a home in the downtown area

We’ll use the data set to get practice at fitting a linear regression model with a three-group qualitative variable as well as interaction terms:

1. Create a scatter plot with y = sales on the y-axis and area on the x-axis. In doing so, use the qualitative (“grouping”) variable loc to denote whether each data point is from a house in the inner suburbs, the outer suburbs or downtown. If you attempted to draw a best fitting line through each of the three sets of data points, would your lines be parallel?

2. Because the qualitative variable loc distinguishes between three groups (A, B, and C), we need to create two indicator variables, $x_2$ and $x_3$, say, in order to fit a linear regression model to these data. The new indicator variables should be defined as follows:

loc   x2   x3
A     1    0
B     0    1
C     0    0

Use Minitab’s Manip >> Code command to create the new indicator variables in your worksheet. If we assume the data follow the regression model:

$$Y_i = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \varepsilon_i$$

where $x_1$ = area and $x_2$ and $x_3$ are defined as above, what is the mean response function for the houses in the inner suburbs? for houses in the outer suburbs? for houses in the downtown area? Does this model allow the three regression lines to intersect?

3. Fit the above multiple linear regression model with y = sales as the response and $x_1$ = area and $x_2$ and $x_3$ as the indicator variables. Be sure to include the interaction terms as well. What is the estimated regression function for houses in the inner suburbs? for houses in the outer suburbs? for houses in the downtown area? (For our purposes here, don’t worry about the potential outliers and influential points that Minitab flags.)

4. To get a visual picture of the three different estimated regression functions, create another scatter plot of the data, but this time “annotate” the graph with the three estimated lines. (Don’t forget the F3 key will erase any previous work in the Graph >> Plot command.) What is the defining characteristic of the estimated regression lines? Do the lines appear to fit the data well?

5. Recall that the three estimated regression
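
The comparison of the (0, 1) and (−1, 1) coding schemes in Section 1 can be tried out numerically. The lab itself uses Minitab. The sketch below uses Python with NumPy instead, and it runs on synthetic data, not on dexterity.txt. The sample values, the coefficients used to simulate them, and the helper names `fit` and `group_lines` are all assumptions made for illustration.

```python
import numpy as np

# Synthetic stand-in for dexterity.txt: every value here is made up.
rng = np.random.default_rng(0)
n = 28
dexter = rng.uniform(50, 100, size=n)      # hypothetical dexterity scores
bio_yes = np.arange(n) % 2 == 1            # True = received biofeedback
y = 2.0 + 0.1 * dexter + 3.0 * bio_yes + rng.normal(0, 1, size=n)


def fit(indicator):
    """Least-squares fit of y on an intercept, dexter, and one indicator."""
    X = np.column_stack([np.ones(n), dexter, indicator])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # array [b0, b1, b2]


def group_lines(b, code_no, code_yes):
    """(intercept, slope) of the fitted line for the 'no' and 'yes' groups."""
    return (b[0] + b[2] * code_no, b[1]), (b[0] + b[2] * code_yes, b[1])


bio01 = np.where(bio_yes, 1.0, 0.0)        # (0, 1) coding
bio11 = np.where(bio_yes, 1.0, -1.0)       # (-1, 1) coding

b01 = fit(bio01)
b11 = fit(bio11)

lines01 = group_lines(b01, 0.0, 1.0)
lines11 = group_lines(b11, -1.0, 1.0)

print("coefficients, (0, 1) coding: ", b01)
print("coefficients, (-1, 1) coding:", b11)
print("group lines,  (0, 1) coding: ", lines01)
print("group lines,  (-1, 1) coding:", lines11)
```

Because bio11 = 2 · bio01 − 1, both design matrices span the same column space, so the fitted group lines agree while the individual coefficients $b_0$ and $b_2$ change. Comparing the two printed coefficient vectors is one way to check an answer to questions 8 and 9.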
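
The three-group interaction model in Section 2 can be sketched the same way. The example below builds the indicator variables $x_2$ and $x_3$ from the coding table, adds the two interaction columns, and reads off one fitted line per location. It is a minimal illustration in Python with NumPy on synthetic data. It does not use realestate.txt, and the simulated intercepts, slopes, and group sizes are assumptions.

```python
import numpy as np

# Synthetic stand-in for realestate.txt: every value here is made up.
rng = np.random.default_rng(1)
loc = np.repeat(["A", "B", "C"], 20)                 # 20 hypothetical homes per location
area = rng.uniform(10, 40, size=loc.size)            # hundreds of square feet
true_intercept = np.where(loc == "A", 50.0, np.where(loc == "B", 30.0, 80.0))
true_slope = np.where(loc == "A", 4.0, np.where(loc == "B", 5.0, 2.5))
sales = true_intercept + true_slope * area + rng.normal(0, 5, size=loc.size)

# Indicator coding from the table: A -> (1, 0), B -> (0, 1), C -> (0, 0).
x2 = (loc == "A").astype(float)
x3 = (loc == "B").astype(float)

# Columns: intercept, x1, x2, x3, x1*x2, x1*x3.
X = np.column_stack([np.ones(loc.size), area, x2, x3, area * x2, area * x3])
b0, b1, b2, b3, b12, b13 = np.linalg.lstsq(X, sales, rcond=None)[0]

# Substituting each group's (x2, x3) codes gives its own intercept and slope.
fitted = {
    "A": (b0 + b2, b1 + b12),
    "B": (b0 + b3, b1 + b13),
    "C": (b0, b1),
}

for group, (intercept, slope) in fitted.items():
    print(f"loc {group}: sales = {intercept:.2f} + {slope:.2f} * area")
```

With both interaction terms included, each location gets its own intercept and its own slope, so the three fitted lines are not constrained to be parallel. In this sketch they coincide with the lines obtained by fitting a separate simple regression within each location.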

