
MIT OpenCourseWare
http://ocw.mit.edu

18.443 Statistics for Applications
Spring 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

18.443 Problem Set 6

1. In this problem, please save all the parameter values and expected values you estimate, not only to those digits displayed, but to all the digits your calculator computes, by storing them in your calculator memory. One reason is that there can be cancellation in subtractions resulting in loss of significant digits. Also, some estimates found in Problem 1 will be used again in Problem 2.

In a random sample of 10,000 people, the following probabilities were observed or would be predicted for blood types:

Type   Observed   Theory I          Theory II
A      0.4541     u(1 − v)          p^2 + 2pr
B      0.0774     (1 − u)v          q^2 + 2qr
O      0.4384     (1 − u)(1 − v)    r^2
AB     0.0301     uv                2pq

According to Theory I, there are two forms (alleles) of one gene, C and c, and two forms of another gene, D and d. Here C has probability u, so c has probability 1 − u, and D has probability v, so d has probability 1 − v. Combinations of genes would give the following blood types: CD = AB, Cd = A, cD = B, cd = O. Having C is independent of having D. (A person gets one allele from each parent; we assume one is dominant over another, as in CC or Cc both giving “C,” and DD or Dd both giving “D.”) Estimate the probability u from the observed relative frequency of A’s plus AB’s, and v from B’s plus AB’s.

In Theory II, there are three forms of one gene, C, c and t. The probabilities are p for C, q for c, and r for t, so p + q + r = 1. Each person has two of the alleles, one from each parent. Since Cc = cC, Ct = tC and ct = tc (we assume here it doesn’t matter which parent each came from), there are 6 possible different combinations of two alleles a person can have. Blood groups would be determined as follows: CC or Ct is A, cc or ct is B, Cc is AB, and tt is O.
Estimate (p + r)^2 from the observed relative frequencies of A’s plus O’s, then take the square root to estimate p + r. Estimate q as 1 − p − r. Likewise estimate (q + r)^2 from the observed relative frequencies of B’s plus O’s, take the square root to estimate q + r, and then estimate p as 1 − q − r. Then estimate r as 1 − p − q.

Test both theories by chi-squared. Which fits the data better? Is either theory rejected by the data, at the 0.05 level?

2. For the same data as in the previous problem, again test the same two hypotheses, but instead of a χ^2 test, use the Wilks likelihood ratio test. Since the maximum likelihood is hard to compute for Theory II, don’t compute it, but again use the estimates as for the χ^2 test. We know that the likelihood, maximized at the unknown MLEs, will be at least as large as it is at the estimated parameters used. By Wilks’s theorem, if a d-dimensional hypothesis H0 is true, then for n large (which it is), his likelihood ratio statistic has approximately a χ^2 distribution with k − 1 − d degrees of freedom, where k is the number of categories, since the dimension of the full multinomial model is k − 1. Besides doing the tests, see if the values of the two statistics are similar, for a non-rejected hypothesis.

3. Consider Problem 21 in Section 11.6, pp. 462-463 of Rice, but changed as follows. Omit the smallest and largest measurements for each of the two types of bearings, leaving 8 for each. Then, for the data on Type I, normality is not rejected, but for Type II it is.
   (a) In what way(s) can one see that the Type II data may be non-normal?
   (b) The non-normality indicates that the analysis in part (a) of Rice’s problem would be inappropriate, so we don’t need to do it, and this also answers his part (c). So, compare the two samples by a nonparametric method as in Rice’s part (b).

4. Problem 38 in §11.6, pp. 466-467 of Rice, but only for Pyrazolone-T and omitting the observations (0, 0), leaving 9 pairs of observations.

5. An example on pp. 477-478 of Rice has a table with data from 7 different labs.
The Shapiro-Wilk test was applied to the four separate data sets from Labs 1, 2, 3, and 7, and none of the four tests rejected normality, where the mean and variance could depend on the lab (even though, as the box plot shows, the observation 3.81 from Lab 7 might be suspected of being an outlier).
   (a) Find the sample variance of the 10 observations from Lab 2.
   (b) Find the sample variance of the 10 observations from Lab 3.
   (c) Test whether these two variances are the same, using an F-test at level 0.05. It should be a two-sided test, but the tables of the F distribution in Rice are for one-sided tests. So, see if the ratio of the larger to the smaller of the two sample variances is larger than the 0.975 quantile of the F distribution with the appropriate degrees of freedom in numerator and denominator.
   (d) Find also the p-value of the test, in other words the probability of observing a value of the F statistic as large or larger than the one, Fobs, actually observed. You can do this in R by finding the probability that F ≤ Fobs as pf(Fobs, n1, n2), where n1 and n2 are the degrees of freedom in the numerator and denominator respectively, then subtracting from 1 to get P(F > Fobs). (If R isn’t conveniently available to you, there is a p-value calculator for F distributions provided on the Web at http://davidmlane.com/hyperstat/F_table.html. Or, when I Google “F distribution” I find the site as the third listed one after Wikipedia and Wolfram.) Then multiply by 2 to get a p-value for the 2-sided test.
   (e) There are C(7, 2) = 21 possible comparisons of two different labs. The p-value from part (d) is not really valid because the specific comparison was chosen after noticing from the box plots that the Lab 2
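
The recipe in Problems 1 and 2 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not a prescribed solution: the function names are hypothetical, both statistics are evaluated at the estimates described in Problem 1 (not at true MLEs for Theory II), and the degrees of freedom are taken as k − 1 − d = 4 − 1 − 2 = 1 for both theories, which is one reading of the Wilks statement in Problem 2. With 1 degree of freedom, the upper tail of the χ^2 distribution equals erfc(sqrt(x / 2)), so no statistics library is needed.

```python
import math

N = 10_000  # sample size stated in Problem 1
OBSERVED = {"A": 0.4541, "B": 0.0774, "O": 0.4384, "AB": 0.0301}


def theory_one(obs):
    """Predicted probabilities under Theory I, using u = A + AB and v = B + AB."""
    u = obs["A"] + obs["AB"]
    v = obs["B"] + obs["AB"]
    return {"A": u * (1 - v), "B": (1 - u) * v,
            "O": (1 - u) * (1 - v), "AB": u * v}


def theory_two(obs):
    """Predicted probabilities under Theory II, using the square-root estimates."""
    p_plus_r = math.sqrt(obs["A"] + obs["O"])  # estimates p + r
    q_plus_r = math.sqrt(obs["B"] + obs["O"])  # estimates q + r
    q = 1 - p_plus_r
    p = 1 - q_plus_r
    r = 1 - p - q
    return {"A": p * p + 2 * p * r, "B": q * q + 2 * q * r,
            "O": r * r, "AB": 2 * p * q}


def pearson_statistic(obs, expected, n):
    """Pearson chi-squared: n * sum (observed - expected)^2 / expected."""
    return n * sum((obs[t] - expected[t]) ** 2 / expected[t] for t in obs)


def likelihood_ratio_statistic(obs, expected, n):
    """Likelihood ratio statistic: 2n * sum observed * log(observed / expected)."""
    return 2 * n * sum(obs[t] * math.log(obs[t] / expected[t]) for t in obs)


def p_value_one_df(statistic):
    """Upper-tail chi-squared probability, valid only for 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


for name, model in (("Theory I", theory_one), ("Theory II", theory_two)):
    expected = model(OBSERVED)
    x2 = pearson_statistic(OBSERVED, expected, N)
    g2 = likelihood_ratio_statistic(OBSERVED, expected, N)
    print(f"{name}: X^2 = {x2:.4f} (p = {p_value_one_df(x2):.3g}), "
          f"G^2 = {g2:.4f} (p = {p_value_one_df(g2):.3g})")
```

Working in double precision throughout plays the role of the "store all digits" advice in Problem 1: the estimates p, q and r come from subtractions of nearly equal quantities, so rounding intermediate values would visibly change the statistics.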


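The two-sided F-test of Problem 5(c)-(d) can also be sketched without R. The code below is a rough stand-in under stated assumptions: `f_cdf` is a hypothetical helper that approximates what R's pf(Fobs, n1, n2) returns by integrating the F density with Simpson's rule (it assumes a numerator degrees of freedom above 2), and the two samples are made-up numbers for illustration, not the lab data from Rice.

```python
import math
from statistics import variance


def f_density(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0:
        return 0.0  # correct at x = 0 only when d1 > 2
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = (0.5 * d1 * math.log(d1 / d2) + (0.5 * d1 - 1) * math.log(x)
               - 0.5 * (d1 + d2) * math.log1p(d1 * x / d2) - log_beta)
    return math.exp(log_pdf)


def f_cdf(x, d1, d2, steps=2000):
    """Approximate P(F <= x) by Simpson's rule on [0, x]; steps must be even."""
    if x <= 0:
        return 0.0
    h = x / steps
    total = f_density(0.0, d1, d2) + f_density(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_density(i * h, d1, d2)
    return total * h / 3


def two_sided_f_test(sample_a, sample_b):
    """Ratio of larger to smaller sample variance and the doubled upper-tail p-value."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        f_obs, d1, d2 = var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    else:
        f_obs, d1, d2 = var_b / var_a, len(sample_b) - 1, len(sample_a) - 1
    one_sided = 1 - f_cdf(f_obs, d1, d2)  # P(F > Fobs)
    return f_obs, min(1.0, 2 * one_sided)


# Hypothetical samples of 10 observations each (NOT the data from Rice).
sample_x = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 10.0, 9.7, 10.4, 10.1]
sample_y = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 10.0, 9.9, 10.0]

f_obs, p_two_sided = two_sided_f_test(sample_x, sample_y)
print(f"Fobs = {f_obs:.4f}, two-sided p-value = {p_two_sided:.4f}")
```

Putting the larger variance in the numerator and doubling the upper-tail probability mirrors the instructions in parts (c) and (d); the degrees of freedom follow whichever sample ends up in the numerator.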