PRACTICE PROBLEMS for MIDTERM 1 2006

22S:30/105 Statistical Methods and Computing
Spring 2005
Instructor: Cowles
Midterm 1

Show your work on any problems that involve calculations. There are 50 total points on this midterm. Point values for each question are shown in parentheses. I will grade on a curve.

Name: ____________________    Course no. (30 or 105): ________

1. (6) The faculty in the department of Linguistics at the UI are:

    Jill Beckman
    Maureen Burke
    Rob Chametzky
    William Davies
    Alice Davison
    Elena Gavruseva
    Marc Light
    Rosemary Plapp
    Catherine Ringen
    Jerzy Rubach
    Roumyana Slabakova
    Bob Wachal

    62964 88145 83083 69453 46109 59505 69680 00900 19687 12633

   Use the list of random digits above to choose a simple random sample of three of these people to serve on a committee. Write enough on the list of names, and make markings on the list of random digits, so that I can tell what procedure you used. Write the names of the three people you selected here.

2. The dataset for this problem includes distances and cheapest airline fares to certain destinations for passengers flying out of Baltimore, MD, as of 1/8/1995. The variables are:

    dest    destination
    dist    distance from Baltimore in miles
    fare    fare in dollars

   (a) (4) Refer to the stem and leaf plot for the dist variable below. What is the median of the distribution of values?

    Stem  Leaf   #
      14  0      1
      12  2      1
      10  0      1
       8  5      1
       6  149    3
       4  18     2
       2  17     2
       0  9      1
    Multiply Stem.Leaf by 10**+2

   (b) Refer to the SAS output attached to the end of the exam to answer the following questions.

       i. (1) Which is the response variable, dist or fare?

       ii. (2) What proportion of the variability in the response variable is explained by the explanatory variable? (numeric answer)

       iii. (2) Suppose that city A is 100 miles farther from Baltimore than city B. How much higher or lower a fare would the regression model predict for city A than for city B?

       iv. (2) Two points are plotted as circles on the scatterplot. If these points were removed, would the sample correlation coefficient r be more likely to get larger or get smaller? Briefly justify your answer.

       v. (2) One point on the scatterplot is plotted as an x. Is this point likely to be influential? Briefly justify your answer.

3. (5) In 2002 the scores of the 1.3 million students who took the Scholastic Aptitude Test (SAT) could be described by a Normal distribution with mean 1020 and standard deviation 207. What proportion of students scored between 1000 and 1400?

4. (2) At a political gathering there are 25 people over age 50 and 15 people under age 50. You choose at random 5 of those over age 50 and separately choose at random 3 of those under age 50 to interview about attitudes toward Social Security reform. The sample of 8 people that you obtain is a (circle one)

   (a) convenience sample
   (b) judgment sample
   (c) simple random sample
   (d) stratified random sample
   (e) none of the above

5. Researchers wish to investigate the relationship between the number of hours of sleep that people living in the Iowa City/Coralville area get each night and the number of colds they get each year. The researchers randomly select 100 numbers from the Iowa City/Coralville telephone directory. They call these numbers and ask the following questions of the person who answers the telephone:

   (a) How old are you?
   (b) How many hours did you sleep last night?
   (c) How would you describe your general health: excellent, good, fair, poor?
   (d) How many colds did you have in the past year?

   (a) (2) This research study is (circle one)
       i. an experiment
       ii. an observational study
       iii. none of the above

   (b) (2) The population of interest is (circle one)
       i. all people living in the Iowa City/Coralville area
       ii. all people listed in the Iowa City/Coralville phone directory
       iii. the people living in the homes whose telephone numbers are selected
       iv. the people who actually answer the questions
       v. none of the above

   (c) (2) What data type is the variable "number of colds in the past year" from question (d)? (circle one)
       i. binary
       ii. nominal
       iii. ordinal
       iv. discrete quantitative
       v. continuous quantitative
       vi. none of the above

   (d) (2) What data type is the variable "general health" from question (c)? (circle one)
       i. binary
       ii. nominal
       iii. ordinal
       iv. discrete quantitative
       v. continuous quantitative
       vi. none of the above

   (e) (2) Which plot or plots from the list below could be used to summarize the distribution of the variable "general health" from question (c)? (circle as many as are correct)
       i. bar graph
       ii. boxplot
       iii. histogram
       iv. line plot
       v. pie chart
       vi. scatterplot
       vii. stem and leaf plot
       viii. none of the above

6. Consider the variable "age" in undergraduate students at the University of Iowa.

   (a) (1) The distribution of ages of undergraduate students is most likely to be (circle one)
       i. skewed to the left
       ii. skewed to the right
       iii. approximately symmetric

   (b) (2) Briefly justify your answer to the preceding question.

   (c) (1) From the list below, circle the best choice of numeric summary of this variable.
       i. frequency table
       ii. mean and standard deviation
       iii. 5-number summary

   (d) (2) Briefly justify your answer to the preceding question.

SAS output for problem 2(b)

The REG Procedure
Model: MODEL1
Dependent Variable: fare

    Number of Observations Read    12
    Number of Observations Used    12

Analysis of Variance

    Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model               1             24735          24735      17.59    0.0018
    Error              10             14060     1406.02467
    Corrected Total    11             38795

    Root MSE           37.49700    R-Square    0.6376
    Dependent Mean    166.50000    Adj R-Sq    0.6013
    Coeff Var          22.52072

Parameter Estimates

    Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
    Intercept     1              82.57767          22.74905       3.63      0.0046
    dist          1               0.11776           0.02808       4.19      0.0018

[Scatterplot. Model: MODEL1, Dependent Variable: fare. Vertical axis: fare, ticks from 50 to 300. Horizontal axis: dist, ticks from 0 to 1500. Two points are plotted as circles (o) and one point is plotted as an x.]
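The summary statistics printed in the REG Procedure output are linked to the table entries by standard least-squares identities, so the printout can be checked against itself. The following is a minimal Python sketch of that check, not part of the original exam. The variable and function names are hypothetical, and the numbers are copied from the output as printed (already rounded), so agreement is only to the displayed precision.

```python
import math

# Values copied from the SAS printout (rounded as displayed there).
SS_MODEL = 24735.0        # Model sum of squares
SS_TOTAL = 38795.0        # Corrected Total sum of squares
MS_ERROR = 1406.02467     # Error mean square
DF_MODEL, DF_ERROR, DF_TOTAL = 1, 10, 11
DEPENDENT_MEAN = 166.5
INTERCEPT, INTERCEPT_SE = 82.57767, 22.74905
SLOPE, SLOPE_SE = 0.11776, 0.02808


def recompute_summary():
    """Recompute the printed summary statistics from the table entries."""
    r_square = SS_MODEL / SS_TOTAL
    # Adjusted R-square penalizes for the number of fitted parameters.
    adj_r_square = 1 - (1 - r_square) * DF_TOTAL / DF_ERROR
    root_mse = math.sqrt(MS_ERROR)
    return {
        "r_square": r_square,
        "adj_r_square": adj_r_square,
        "root_mse": root_mse,
        "coeff_var": 100 * root_mse / DEPENDENT_MEAN,
        "f_value": (SS_MODEL / DF_MODEL) / MS_ERROR,
        # Each t value is the estimate divided by its standard error.
        "t_intercept": INTERCEPT / INTERCEPT_SE,
        "t_slope": SLOPE / SLOPE_SE,
    }


if __name__ == "__main__":
    for name, value in recompute_summary().items():
        print(f"{name:>13}: {value:.4f}")
```

With one explanatory variable, the F value is the square of the slope's t value up to rounding, which is why both rows report the same p-value of 0.0018.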