Final Exam - Biometrics 301, Fall 2000

NAME: SOLUTIONS    SSN:    DISCUSSION SECTION: ___

Pollution of waterways is one of the most serious problems facing the world today. Pollutant levels in various bodies of water are important to study in order to understand the extent of the problem. In particular, prediction of future levels of pollutants based on current characteristics is used in order to develop strategies to address pollution.

The following analysis is based on the 1984 and 1985 concentrations of PCBs, measured in parts per billion (ppb), in water samples taken from 37 U.S. bays and estuaries. The locations were chosen because it was believed that these bays and estuaries represent "typical" waterways along the coastlines of the United States. The scientists sampled at the same location within each waterway each year, and the same laboratory and technicians were used to analyze the water samples. The raw data are listed below.

LOCATION                PCB_84   PCB_85    LOCATION                PCB_84   PCB_85
Casco Bay                95.28    77.55    Merrimack River          52.97    29.23
Salem Harbor            533.58   403.10    Boston Harbor          17104.9   736.00
Buzzards' Bay           308.46   192.15    Narragansett Bay        159.96   220.60
E. Long Island Sound     10.00     8.62    W. Long Island Sound    234.43   174.31
Raritan Bay             443.89   529.28    Delaware Bay              2.50   130.67
Lower Chesapeake Bay     51.00    39.74    Pamlico Sound             0.00     0.00
Charleston Harbor         9.10     8.43    Sapelo Sound              0.00     0.00
St. Johns River         140.00   120.04    Tampa Bay                 0.00     0.00
Apalachicola Bay         12.00    11.93    Mobile Bay                0.00     0.00
Round Island              0.00     0.00    Mississippi R. Delta     34.00    30.14
Barataria Bay             0.00     0.00    San Antonio Bay           0.00     0.00
Corpus Christi Bay        0.00     0.00    San Diego Harbor        422.10   531.67
San Diego Bay             6.74     9.30    Dana Point                7.06     5.74
Seal Beach               46.71    46.47    San Pedro Canyon        159.56   176.90
Santa Monica Bay         14.00    13.69    Bodega Bay                4.18     4.89
Coos Bay                  3.19     6.60    Columbia River Mouth      8.77     6.73
Nisqually Beach           4.23     4.28    Commencement Bay         20.60    20.50
Elliott Bay              20.60    20.50    Lutak Inlet               5.50     5.80
Nahku Bay                 6.60     5.08

Question 1. One reason that this data set was collected was to determine whether PCB levels in U.S. waterways decreased between 1984 and 1985. A statistical test will be performed.

A (2 pts). State the hypotheses to be tested in words.

  Ho: The mean PCB level in U.S. waterways is the same for the two years 1984 and 1985.
  HA: The mean PCB level was lower in 1985 than in 1984.

B (2 pts). State the hypotheses to be tested in statistical terms (i.e. use the Greek symbols for the population parameters being tested!).

  Ho: µ84 = µ85, i.e. µ84 - µ85 = 0
  HA: µ84 > µ85, i.e. µ84 - µ85 > 0

E (3 pts). Of the tests you learned this semester, which is appropriate for these hypotheses and data? Circle the correct answer.

  Test of a Single Binomial Proportion
  Test of a Single Population Mean
  Paired T-test For the Difference of Two Means
  Test of Two Population Proportions
  Test of Two Population Means Using Independent Samples

F (5 pts). For whichever test you believe to be appropriate, does it appear that the assumptions of the test have been met? Describe each assumption and your conclusions for each assumption. Note: if it isn't possible to tell from the information given, say so.
Assumptions and conclusions:

  Assumption: The samples are paired.
  Conclusion: Yes, since the data were collected in the same manner in each waterway in each year.

  Assumption: The n pairs are a random sample of pairs.
  Conclusion: No, the waterways were selected to represent a typical set of waterways in the U.S.

  Assumption: The number of pairs is large.
  Conclusion: Yes, n = 37, which is larger than 30, so we can invoke the Central Limit Theorem and treat the sampling distribution of the mean difference as approximately Normal.

G (3 pts). The value of the test statistic is -0.99879 and the p-value of the test is 0.1623. State your conclusion. Use a significance level of α = 0.08.

  p-value = 0.1623 > α = 0.08. Hence we fail to reject the null hypothesis. There is insufficient evidence to suggest that mean PCB levels fell between 1984 and 1985. Further, these results are possibly suspect because the assumption of a random sample was not met.

Questions 2-3. Next, some regression analyses were performed on the data. As part of these analyses, the PCB data were transformed, where

  LNPCB_84 = log10(PCB_84 + 1) and LNPCB_85 = log10(PCB_85 + 1),

and log10 is the logarithm to the base 10. Two regression analyses were performed. In the first, the response variable is PCB_85 and the explanatory variable is LNPCB_84. In the second regression, the response variable is LNPCB_85 and the explanatory variable is LNPCB_84.

Question 2. Refer to the analysis of PCB_85 as a function of LNPCB_84 below.
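As a check on Question 1, part G, the paired t-test can be recomputed from the raw data table. The sketch below is illustrative only and is not part of the original exam. It assumes the differences are taken as PCB_85 minus PCB_84 (inferred from the negative sign of the reported statistic) and that the reported p-value is the lower-tail probability. The helper names `t_pdf`, `t_cdf`, and `paired_t` are hypothetical, and the t-distribution tail is obtained by simple numerical integration rather than from a statistics package.

```python
import math

# (PCB_84, PCB_85) pairs in ppb, transcribed from the raw data table.
pairs = [
    (95.28, 77.55), (52.97, 29.23), (533.58, 403.10), (17104.9, 736.00),
    (308.46, 192.15), (159.96, 220.60), (10.00, 8.62), (234.43, 174.31),
    (443.89, 529.28), (2.50, 130.67), (51.00, 39.74), (0.00, 0.00),
    (9.10, 8.43), (0.00, 0.00), (140.00, 120.04), (0.00, 0.00),
    (12.00, 11.93), (0.00, 0.00), (0.00, 0.00), (34.00, 30.14),
    (0.00, 0.00), (0.00, 0.00), (0.00, 0.00), (422.10, 531.67),
    (6.74, 9.30), (7.06, 5.74), (46.71, 46.47), (159.56, 176.90),
    (14.00, 13.69), (4.18, 4.89), (3.19, 6.60), (8.77, 6.73),
    (4.23, 4.28), (20.60, 20.50), (20.60, 20.50), (5.50, 5.80),
    (6.60, 5.08),
]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def paired_t(data):
    """Return (n, mean difference, t statistic, lower-tail p-value)."""
    # Assumption: difference = 1985 value minus 1984 value.
    d = [y85 - y84 for y84, y85 in data]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)
    t = mean / math.sqrt(var / n)
    return n, mean, t, t_cdf(t, n - 1)


n, mean_diff, t_stat, p_value = paired_t(pairs)
alpha = 0.08
print(f"n = {n}, mean difference = {mean_diff:.3f} ppb")
print(f"t = {t_stat:.5f}, one-sided p-value = {p_value:.4f}")
print("reject Ho" if p_value < alpha else "fail to reject Ho")
```

Under these assumptions the statistic and p-value should come out close to the values quoted in part G. With SciPy available, `scipy.stats.ttest_rel` with `alternative="less"` would be the more usual route; the standard-library version is used here only to keep the sketch self-contained.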
Bivariate Fit of PCB_85 By LNPCB_84

[Scatterplot: PCB_85 (vertical axis) against LNPCB_84 (horizontal axis), with the fitted line]

Linear Fit
  PCB_85 = -81.87351 + 142.23544 LNPCB_84

Summary of Fit
  RSquare                   0.660298
  Root Mean Square Error    104.0024
  Mean of Response          96.48486
  Observations              37

Parameter Estimates
  Term         Estimate    Std Error   t Ratio   Prob>|t|
  Intercept    -81.87351   27.56704      -2.97     0.0054
  LNPCB_84     142.23544   17.2446        8.25     <.0001

[Residual plot: Residual (vertical axis) against LNPCB_84 (horizontal axis)]

Annotations written next to the printout:
  This is a potential outlier because it does not follow the pattern of the rest of the data.
  These are both potential outliers because they are well above the fitted line (large positive residuals).
  There is definite curvature that is not captured by a straight line.

A (3 pts). Are there any obvious outliers or influential observations which could be affecting the fit of the estimated regression line? Explain briefly (use the space next to the printout above). Be sure to identify the suspect observations.

B (3 pts). Based on the plots, is a linear regression analysis appropriate for these data? Explain your answer briefly (use the space next to the printout above).

Question 3. Refer to the regression analysis of LNPCB_85 as a function of LNPCB_84.

Bivariate Fit of LNPCB_85 By LNPCB_84

[Scatterplot: LNPCB_85 (vertical axis) against LNPCB_84 (horizontal axis)]

Linear Fit
  LNPCB_85 =
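The numbers in the Question 2 printout can be cross-checked with a few lines of arithmetic. The sketch below is illustrative and not part of the original exam: it applies the log10(PCB + 1) transformation, recomputes each t ratio as estimate divided by standard error, and evaluates the fitted line PCB_85 = -81.87351 + 142.23544 LNPCB_84 at three locations from the raw data table. The function names `ln_pcb` and `predict_pcb85` and the choice of example rows are assumptions made for illustration.

```python
import math

# Estimates and standard errors copied from the Parameter Estimates table.
INTERCEPT, INTERCEPT_SE = -81.87351, 27.56704
SLOPE, SLOPE_SE = 142.23544, 17.2446


def ln_pcb(pcb):
    """Transformation used in the exam: log base 10 of (PCB + 1)."""
    return math.log10(pcb + 1.0)


def predict_pcb85(pcb84):
    """Fitted PCB_85 from the Question 2 line, given raw PCB_84 in ppb."""
    return INTERCEPT + SLOPE * ln_pcb(pcb84)


# Each t ratio in the printout is the estimate divided by its standard error.
t_intercept = INTERCEPT / INTERCEPT_SE
t_slope = SLOPE / SLOPE_SE
print(f"t ratio (Intercept) = {t_intercept:.2f}")
print(f"t ratio (LNPCB_84)  = {t_slope:.2f}")

# A few (location, PCB_84, PCB_85) rows from the raw data table,
# chosen arbitrarily for illustration.
rows = [
    ("Pamlico Sound", 0.00, 0.00),
    ("Casco Bay", 95.28, 77.55),
    ("Boston Harbor", 17104.9, 736.00),
]
for name, pcb84, pcb85 in rows:
    fitted = predict_pcb85(pcb84)
    residual = pcb85 - fitted
    print(f"{name:15s} fitted = {fitted:8.2f}  residual = {residual:8.2f}")
```

One thing the sketch makes visible: any location with PCB_84 = 0 has LNPCB_84 = 0, so its fitted value equals the intercept, which is a negative concentration. That is consistent with the annotation that a straight line does not capture the pattern in these data.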