Duke STA 101 - Lab Assignment 5

STA 101.02
May 27, 2002
2002: Summer Session I

Lab Assignment 5: Simple Linear Regression

Fitting a Regression Line.

Download the class data from the course website, and open it with JMP-IN. Let’s explore our class survey data.

1. Analyze the distributions of all the variables. Are any of them normally distributed? If yes, which ones?
   Yes...height, mom’s height, and GPA
   Which results do you find the most surprising?
   This is subjective, but I found it interesting that so few students and their parents are left handed.

2. Look at the correlations between all of the continuous variables (see lab 4 for directions).
   • Choose the variables that have the strongest positive correlations. Repeat for negative correlations.
     strongest positive: time spent outside of class in major and non-major (.6332)
     strongest negative: tv and GPA (-.7294)
   • Which pair(s) of variables appear to have the weakest correlations?
     haircut and nonmajor (.0013)

3. Plot Exercise (x) and GPA (y).
   • Describe the shape of the scatter plot.
     There appears to be a negative trend (i.e., negative correlation between exercise and GPA) in the data, as well as a couple of possible outliers.
   • Is the relationship between Exercise and GPA positively or negatively correlated?
     negatively correlated
   • Add the 95% density ellipse to the scatter plot. Are there any outlying points? If so, what is their gender? To remove the ellipse, select the red arrow next to fit below the scatter plot and choose Remove Fit.
     Yes, 2 outlying points corresponding to 1 male and 1 female.
   • Let’s fit a regression line to our data. Click on the red arrow next to Bivariate Fit and choose Fit Line. What is the regression equation? How are these values interpreted?
     GPA = 3.39 - .025*Exercise is the regression equation. For each unit increase in Exercise, the predicted GPA decreases by .025 units. The intercept, 3.39, is the predicted GPA for a student with Exercise = 0.
   • Do students with higher GPAs tend to exercise more or less than those with lower GPAs?
     less
   • Does this mean extra exercise causes low GPAs?
     NO!!!
     There could be confounding variables.
   • What is the rms error?
     .381
   • How well does the line fit the data?
     fairly well...the p-value for the Goodness of Fit test is .014, indicating a good fit.
   • Select the red arrow next to Linear fit and choose Plot Residuals. Are the residuals homoscedastic? Is there a pattern to the residuals?
     Yes, the residuals appear to be homoscedastic (i.e., constant variance in the spread of the residual points). And there do not appear to be any patterns in the residuals, a further indicator that this is a good model to fit the data.

4. To fit the regression lines by class year, first remove the previous fitted line by selecting the red triangle by Linear fit and choosing Remove Fit. Then, select the red triangle beside Bivariate Fit and choose Group By and then select Class. Now, go back to Bivariate Fit and select Fit Line; this will calculate the regression lines for each class separately. Remove the fit for grad students.
   • Is the relationship the same for these three groups? If not, explain how they differ.
     No, seniors have a positive slope, while sophomores and juniors have a negative slope (where the sophomores have the steepest slope).
   • Which group has the smallest slope? Largest?
     smallest: seniors
     largest: sophomores
   • Are the correlations between the groups positively or negatively correlated?
     see above
   • How do the R² and RMS error values differ between the groups?
     sophomores: R²=.79, rms=.15
     juniors: R²=.23, rms=.36
     seniors: R²=.0062, rms=.44
   • For which group is Exercise a better predictor of GPA?
     juniors based on the p-value of the goodness of fit test (.012) and also juniors appear to have the best residual plot (homoscedastic without any patterns).

Eyeing the Least Squares Line.

The principle of least squares can be easily seen with one of the sample scripts included in the Sample Scripts folder.
Select File → Open. In the Open dialog box, change the Files of Type drop-down to list “JSL Scripts (*.JSL)”, then select the subdirectory JMP IN Scripts, and choose the script demoLeastSquares.jsl. To execute the script, select Edit → Run Script. You will see a scatter plot with two small rectangles on it. These two rectangular handles are draggable, and are used in this case to move the line to a position that you think best summarizes the data. Press the Your Residuals button. Use the handles to move the line around until you think the residuals (in blue) are as small as they can be. (Be sure to click on the handles. If you miss, the program will create a new data point. If this happens, click on Delete Last Point to recreate the original dataset.) Press the Your Square button. You will see each residual expanded into a square with the height of the square equal to the value of the residual. The area of each square represents the quantity (predicted - actual)². Again try to minimize the total area covered by the blue squares. Press the LS Line button and check to see if you correctly determined the least squares line. Press the LS Residuals and LS Squares buttons as a further check. How close was your line to the actual least squares line? Close the least squares demonstration window, and then close the script.
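
The same comparison can be sketched outside JMP IN. The Python below is a minimal illustration and is not the demoLeastSquares.jsl script: the data points and the hand-placed "guess" line are invented for the example, and the function names are hypothetical. It totals the squared residuals, (predicted - actual)², for a guessed line and for the line given by the standard closed-form least squares formulas.

```python
# Illustrative data only: these (x, y) pairs are invented, not the class survey.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.4, 3.1, 3.3, 2.9, 3.0]


def sum_squared_residuals(intercept, slope, xs, ys):
    """Total area of the squares: sum of (predicted - actual)^2."""
    return sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# A hand-placed line, standing in for dragging the handles by eye.
guess_intercept, guess_slope = 3.5, -0.15
guess_total = sum_squared_residuals(guess_intercept, guess_slope, xs, ys)

ls_intercept, ls_slope = least_squares_line(xs, ys)
ls_total = sum_squared_residuals(ls_intercept, ls_slope, xs, ys)

print(f"guessed line: total squared residual = {guess_total:.4f}")
print(f"LS line: y = {ls_intercept:.3f} + {ls_slope:.3f}x, total = {ls_total:.4f}")
```

Whatever values are tried for the guessed intercept and slope, the guessed total can never fall below the least squares total. This is the property the blue squares in the demonstration make visible.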

