UCLA STATS 100C - Homework

This preview shows page 1 out of 2 pages.


University of California, Los Angeles
Department of Statistics

Statistics 100C
Instructor: Nicolas Christou

Homework 5

Exercise 1
Please refer to homework 4, exercise 3.

a. Test the overall significance of the model. The easiest way to do this is to first find SSE and SST. Then you can compute SSR and then the F statistic.

b. Test the following hypothesis:

   $H_0: \beta_1 - 2\beta_2 = 0$
   $H_a: \beta_1 - 2\beta_2 \neq 0$

   The test statistic will be:

   $$t = \frac{a'\hat{\beta} - 0}{s_e \sqrt{a'(X'X)^{-1}a}}.$$

   Before you compute the test statistic above, write down the vector $a$. It will help you extract the elements needed from $(X'X)^{-1}$ to find $\mathrm{var}(\hat{\beta}_1 - 2\hat{\beta}_2)$.

c. Find a confidence interval for $E(y_g)$ when $x_g' = (1 \;\; 24 \;\; 29)$. Use:

   $$\hat{y}_g \pm t_{\alpha/2;\, n-k-1} \, s_e \sqrt{x_g'(X'X)^{-1}x_g}.$$

d. Compute $R^2$ for these data.

Exercise 2
Show that the error sum of squares $SSE = e'e$ is equal to the following expressions:

$$SSE = Y'Y - \hat{\beta}'X'X\hat{\beta} = Y'Y - \hat{\beta}'X'Y = Y'Y - \hat{\beta}'X'\hat{Y}.$$

Exercise 3
Suppose that for a multiple regression problem the units of the $i$th independent variable are millimeters. Explain what would happen to the estimate $\hat{\beta}_i$ of $\beta_i$ and to its variance if we express the $i$th independent variable in meters instead of millimeters. Hint: Multiply $X$ by a diagonal matrix containing 0.001 in the $(i, i)$th position and 1's in the other diagonal positions.

Exercise 4
Most people like to eat out at restaurants that offer supposedly top quality food. But do we pay for the quality of food or for something else? One hundred restaurants were selected from the area of Westwood, Brentwood, and Santa Monica (the data are from 2000).
The source of this data set is http://www.zagat.com, and the data can be accessed in R as follows:

    a <- read.table("http://www.stat.ucla.edu/~nchristo/statistics100C/restaurant.txt", header=TRUE)

In this data set there are four variables:

food:  The food rating for each restaurant on a scale from 1-30 (30 being the best).
decor: The decor rating for each restaurant on a scale from 1-30.
ser:   The service rating for each restaurant on a scale from 1-30.
cost:  The cost ($) for dinner including one drink and the tip for each restaurant.

Answer the following questions:

a. Construct the scatterplots of the variable cost on each of the other three variables.

b. Run the following 3 regressions:
   i.   cost on food
   ii.  cost on decor
   iii. cost on ser

c. From your answer to question (b), with which of the three independent variables is cost most correlated?

d. Use the fitted line of the best of the three regressions to predict the cost of a dinner at a restaurant which has food rating 20, decor rating 16, and service rating 13.

e. Now, add the three ratings (food, decor, and service) to create a total rating variable. So, now you have a new variable called total. Plot cost against total.

f. Regress cost on total. Is there a stronger relationship ($R^2$) between cost and total than in any of the previous regressions from question (b)? Write down the fitted regression line.

g. Check the assumptions of the model of part (f). Plot and print the residuals against the fitted values and against total. Are there any violations of the assumptions?

h. Using the model of part (f), predict the cost of a dinner at a restaurant of your choice (perhaps your favorite restaurant if you have one). You will of course need the values of food, ser, and decor.

This regression model was first applied to data from restaurants in New York City by Professor Jeff Simonoff of the Statistics Department of the Stern School of Business at New York University.

Exercise 5
Use the data of exercise 4.
Consider the multiple regression model

$$\text{cost}_i = \beta_0 + \beta_1 \text{food}_i + \beta_2 \text{decor}_i + \beta_3 \text{ser}_i + \epsilon_i$$

a. Construct the $n \times 4$ design matrix $X$, and compute $X'X$.

b. Obtain the least squares estimates $\hat{\beta}$ using matrix and vector operations.

c. Verify that your answers are correct by using the lm function in R.

d. Compute the hat matrix $H$ and show only the elements of the first 5 rows and
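The matrix steps in Exercise 5 can be sketched in R as below. This is only an illustration under stated assumptions, not the course's solution: to stay self-contained it builds a synthetic data frame `a` with the same column names as the restaurant data (food, decor, ser, cost), and the simulated values and coefficients are made up. Replacing the synthetic block with the `read.table` call from Exercise 4 should apply the same steps to the real data.

```r
# Synthetic stand-in for the restaurant data. The values are made up
# for illustration and are NOT the Zagat data.
set.seed(1)
n <- 100
a <- data.frame(
  food  = sample(10:30, n, replace = TRUE),
  decor = sample(10:30, n, replace = TRUE),
  ser   = sample(10:30, n, replace = TRUE)
)
a$cost <- 5 + 0.5 * a$food + 1.2 * a$decor + 0.8 * a$ser + rnorm(n, sd = 4)

# n x 4 design matrix: a column of ones followed by the three ratings.
X <- cbind(intercept = 1, food = a$food, decor = a$decor, ser = a$ser)
y <- a$cost

XtX <- crossprod(X)           # X'X
Xty <- crossprod(X, y)        # X'y
beta_hat <- solve(XtX, Xty)   # solves the normal equations (X'X) b = X'y

# Check the matrix result against lm().
fit <- lm(cost ~ food + decor + ser, data = a)
print(all.equal(as.vector(beta_hat), unname(coef(fit))))

# Hat matrix H = X (X'X)^{-1} X'.
H <- X %*% solve(XtX) %*% t(X)

# SSE from the residuals, and from the identity Y'Y - beta_hat' X'Y
# stated in Exercise 2.
e <- y - X %*% beta_hat
sse <- sum(e^2)
sse_identity <- sum(y^2) - drop(crossprod(beta_hat, Xty))
```

Using `solve(XtX, Xty)` avoids forming the inverse explicitly for the estimates; the explicit inverse is still needed for the hat matrix. A sub-block of `H` can be inspected with ordinary indexing, for example `H[1:5, 1:5]`.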

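Returning to the hint in Exercise 3, the rescaling can be written out as a short derivation. This is a sketch under the usual assumption $\mathrm{var}(Y) = \sigma^2 I$; the symbols $D$, $X^{*}$ and $\hat{\beta}^{*}$ are introduced here for illustration and do not appear in the assignment.

```latex
% D is diagonal with 0.001 in position (i,i) and 1 elsewhere, so X^* = X D
% expresses the i-th column in meters instead of millimeters.
\begin{align*}
X^{*} &= X D, \\
\hat{\beta}^{*} &= (X^{*\prime} X^{*})^{-1} X^{*\prime} Y
               = (D X' X D)^{-1} D X' Y
               = D^{-1} (X'X)^{-1} X' Y
               = D^{-1} \hat{\beta}, \\
\operatorname{var}(\hat{\beta}^{*}) &= \sigma^{2} (X^{*\prime} X^{*})^{-1}
               = \sigma^{2} D^{-1} (X'X)^{-1} D^{-1}.
\end{align*}
```

Because $D^{-1}$ has 1000 in position $(i, i)$, this gives $\hat{\beta}_i^{*} = 1000\,\hat{\beta}_i$ and $\mathrm{var}(\hat{\beta}_i^{*}) = 10^{6}\,\mathrm{var}(\hat{\beta}_i)$, while the fitted values $X^{*}\hat{\beta}^{*} = X\hat{\beta}$ are unchanged.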
