ISU STAT 511 - HOMEWORK

Stat 511 HW#3 Spring 2004

1. The lm function in R allows one to do weighted least squares, i.e. minimize $\sum_i w_i (y_i - \hat{y}_i)^2$ for positive weights $w_i$. For the $V_1$ case of the Aitken model of Problem 6 from HW 2, find the BLUEs of the 4 cell means using lm and an appropriate vector of weights. (Type > help(lm) in R in order to get help with the syntax.)

2. On the Web page http://www.public.iastate.edu/~vardeman/stat511/511data.html you will find the file homes.TXT. We're going to do some statistical analysis on this set of home sale price data obtained from the Ames City Assessor's Office. Data on sales May 2002 through June 2003 of 1½ and 2 story homes built 1945 and before, with (above grade) size of 2500 sq ft or less and lot size 20,000 sq ft or less, located in Low- and Medium-Density Residential zoning areas are given in this file. $n = 88$ different homes fitting this description were sold in Ames during this period. (2 were actually sold twice, but only the second sales prices of these were included in our data set.) For each home, the value of the response variable

Price = recorded sales price of the home

and the values of 14 potential explanatory variables were obtained.
These variables are
- Size, the floor area of the home above grade in sq ft
- Land, the area of the lot the home occupies in sq ft
- Bed Rooms, a count of the number in the home
- Central Air, a dummy variable that is 1 if the home has central air conditioning and is 0 if it does not
- Fireplace, a count of the number in the home
- Full Bath, a count of the number of full bathrooms above grade
- Half Bath, a count of the number of half bathrooms above grade
- Basement, the floor area of the home's basement (including both finished and unfinished parts) in sq ft
- Finished Bsmt, the area of any finished part of the home's basement in sq ft
- Bsmt Bath, a dummy variable that is 1 if there is a bathroom of any sort (full or half) in the home's basement and is 0 otherwise
- Garage, a dummy variable that is 1 if the home has a garage of any sort and is 0 otherwise
- Multiple Car, a dummy variable that is 1 if the home has a garage that holds more than one vehicle and is 0 otherwise
- Style (2 Story), a dummy variable that is 1 if the home is a 2 story (or a 2½ story) home and is 0 otherwise
- Zone (Town Center), a dummy variable that is 1 if the home is in an area zoned as "Urban Core Medium Density" and 0 otherwise

The first row of the file has the variable names in it. (You might open this file by double clicking on the link to have a look at it.) While connected to the network, enter these data into R using the command
> homes<-read.table("http://www.public.iastate.edu/~vardeman/stat511/homes.TXT",header=T)
In theory, one should also be able to get this loaded by placing homes.TXT into an appropriate directory of the local machine (where R knows to search) and issuing the above command with homes.TXT only (instead of the URL). But although I could make it work last year, I haven't managed to get this to work this year. Use the command
> homes
to view the data frame. It should have 15 columns and 88 rows.
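For loading from a local copy, one approach that usually works is to pass read.table a full path rather than a bare file name, since a bare name is resolved relative to the working directory reported by getwd(). The following is a minimal sketch using a made-up three-row file. The file name homes_demo.TXT, its contents, and the object homes_demo are hypothetical stand-ins, not the actual homes.TXT.

```r
# Hypothetical stand-in for homes.TXT: a tiny whitespace-delimited file
# with a header row, written to R's temporary directory.
demo_path <- file.path(tempdir(), "homes_demo.TXT")
writeLines(c("Price Size Land",
             "120000 1400 8000",
             "95000 1100 6500",
             "143500 1850 9200"),
           demo_path)

# A full path does not depend on the working directory.
# getwd() shows where a bare name such as "homes.TXT" would be looked up.
homes_demo <- read.table(demo_path, header = TRUE)

# Check the shape of what was read: number of rows, then number of columns.
dim(homes_demo)
names(homes_demo)
```

With the real file, the same dim() check should report 88 rows and 15 columns, as stated above.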
Now create two matrices that will be used to fit a regression model to some of these data. Type
> Y<-as.matrix(homes[,1])
> X<-as.matrix(homes[,c(2,5,10,11,13)])
Note the use of [] to select columns from the data frame. Here, the function as.matrix is used to create a matrix from one or more columns of the data frame. To add a column of ones to the model matrix, type
> X0<-rep(1,length(Y))
> X<-cbind(X0,X)

Make a scatterplot matrix for $y, x_1, x_2, \ldots, x_5$. To do this, first load the lattice package. (Look under the "Packages" heading on the R GUI, select "Load package" and then lattice.) Then type
> splom(~homes[,c(1,2,5,10,11,13)],aspect="fill")
If you had to guess based on this plot, which single predictor do you think is probably the best predictor of Price? Do you see any evidence of multicollinearity (correlation among the predictors) in this graphic?

Also compute a sample correlation matrix for $y, x_1, x_2, x_3, x_4,$ and $x_5$. You may compute the matrix using the cor() function and round the printed values to four places using the round() function as
> round(cor(homes[,c(1,2,5,10,11,13)]),4)

Use the qr() function to find the rank of $\mathbf{X}$.

Use R matrix operations on the $\mathbf{X}$ matrix and $\mathbf{Y}$ vector to find the estimated regression coefficient vector $\mathbf{b}_{OLS}$, the estimated mean vector $\hat{\mathbf{Y}}$, and the vector of residuals $\mathbf{e} = \mathbf{Y} - \hat{\mathbf{Y}}$. Plot the residuals against the fitted means. After loading the MASS package, this can be done using the following code.
> b<-solve(t(X)%*%X)%*%t(X)%*%Y
> yhat<-X%*%b
> e<-Y-yhat
> par(fin=c(6.0,6.0),pch=18,cex=1.5,mar=c(5,5,4,2))
> plot(yhat,e,xlab="Predicted Y",ylab="Residual",main="Residual Plot")
Type
> help(par)
to see the list of parameters that may be set on a graphic. What does the first specification above do, i.e. what does fin=c(6.0,6.0) do?

Plot the residuals against home size. You may use the following code.
> plot(homes$Size,e,xlab="Size",ylab="Residual",main="Residual Plot")
And you can add a smooth trend line to the plot by typing
> lines(loess.smooth(homes$Size,e,0.90))
What happens when you type
> lines(loess.smooth(homes$Size,e,0.50))
(The values 0.90 and 0.50 are values of a "smoothing parameter." You could have discovered this (and more) about the loess.smooth function by typing
> help(loess.smooth)
)

Now plot the residuals against each of $x_2, x_3, x_4,$ and $x_5$.

Create a normal plot from the values in the residual vector. You can do so by typing
> qqnorm(e,main="Normal Probability Plot")
> qqline(e)

Now compute the sum of squared residuals and the corresponding estimate of $\sigma^2$, namely

$\hat{\sigma}^2 = \dfrac{(\mathbf{Y} - \hat{\mathbf{Y}})'(\mathbf{Y} - \hat{\mathbf{Y}})}{n - \mathrm{rank}(\mathbf{X})}$

Use this and compute an estimate of the covariance matrix for $\mathbf{b}_{OLS}$, namely

$\hat{\sigma}^2 (\mathbf{X}'\mathbf{X})^{-1}$

Sometimes you may want to write a summary matrix out to a file. This can be done as follows. First prepare the row and column labels and round all entries to 4 places using the code
> case<-1:88
> temp<-cbind(case,homes[,c(2,5,10,11,13)],Y,yhat,e)
> temp<-round(temp,4)
Then, with the MASS package loaded (in order to make the write.matrix function available), the code
> write.matrix(temp,file="c:/temp/regoutput.out")
will write output to the file c:/temp/regoutput.out (you may choose another name and destination for this file).
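The estimate of $\sigma^2$ and the covariance matrix estimate described above can be tried out on simulated data before applying them to homes. The sketch below is only an illustration under stated assumptions: the simulated data, the seed, and the names x1, x2, y, r, sse, sigma2, covb, and fit are hypothetical and are not part of the assignment. It uses the same matrix operations as the residual-plot code, then compares the result with what lm reports.

```r
set.seed(511)                        # arbitrary seed, for reproducibility only
n  <- 30                             # hypothetical sample size
x1 <- rnorm(n)                       # hypothetical predictors
x2 <- rnorm(n)
y  <- 2 + 3 * x1 - x2 + rnorm(n)     # hypothetical response

Y <- as.matrix(y)
X <- cbind(X0 = rep(1, n), x1, x2)   # model matrix with a column of ones

r    <- qr(X)$rank                         # rank of X from its QR decomposition
b    <- solve(t(X) %*% X) %*% t(X) %*% Y   # OLS coefficient vector
yhat <- X %*% b                            # fitted means
e    <- Y - yhat                           # residuals

sse    <- sum(e^2)                   # (Y - Yhat)'(Y - Yhat)
sigma2 <- sse / (n - r)              # estimate of sigma^2
covb   <- sigma2 * solve(t(X) %*% X) # estimated covariance matrix of b

# Cross-check against lm(); the two should agree up to rounding error.
fit <- lm(y ~ x1 + x2)
all.equal(unname(covb), unname(vcov(fit)))
```

The comparison with vcov(fit) is there only as a sanity check. For the homework itself, the matrix-operation route is what the problem asks for.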

