UW-Madison STAT 333 - Final Examination

Prof. Bret Larget
Name:
Math 325W/525W
Final Examination

Please show work! The score you earn on each problem is based on your complete solution, not only on the final answer.
You may use your textbook, a calculator, and two sheets of prepared notes.

Problem 1: (10 points)
Investigators are interested in the effect of marijuana use during pregnancy on birth weight. Low birth weight is associated with many problems that occur later in life. The investigators take a sample of 17 pregnant marijuana users from volunteers in a local shelter and take a second sample of 26 women, presumed not to be marijuana users, from an area hospital. A 95% confidence interval for the difference in head circumference (associated with body weight) is 0.90 ± 0.29 cm.
(a) Can it be concluded that using marijuana during pregnancy decreases a baby’s head circumference? Explain.
(b) If cost and ethical considerations were irrelevant and statistical concerns were the only consideration, briefly describe an experiment design that would provide unbiased estimates of the effects of marijuana use during pregnancy.

Problem 2: (15 points)
There were 27 players drafted in the first round of the 1991 NBA draft. (Their draft positions ranged from 1 to 27 where 1 is the first person drafted.) Starting salaries ranged from a low of $180,000 to a high of $3,333,333. Two players, the 15th and 25th players selected, did not sign with the teams that drafted them. For the 25 players who signed contracts, the mean and standard deviation of the starting salaries are $1,320,000 and $885,000 respectively. The mean and standard deviation of the draft positions are 13.52 and 7.93 respectively. The correlation coefficient is −0.887.
(a) Write down the regression equation for predicting starting salary based on draft position.
(b) Use the equation to predict what salary the player drafted 15th might have expected.
(c) Use the equation to predict what salary the player drafted 25th might have expected.
(d) For each position lower in the draft, by how much does the starting salary decrease?
(e) What graph would you make to check on the validity of a linear fit to this data?

Problem 3: (15 points)
Does living in a rural area decrease total cholesterol levels? Three urban dwellers have total serum cholesterol measurements of 205, 196, and 241. Two rural dwellers have total serum cholesterol measurements of 129 and 175.
(a) Use the t-tools to test the null hypothesis that mean cholesterol levels are equal in rural and urban areas versus the alternative that mean cholesterol levels are lower in rural areas than in urban areas. Express the p-value as an area under a t curve.
(b) Test the same hypotheses with a permutation test. (Note. You can find the p-value without calculating the test statistic for all cases).
(c) Test the same hypotheses with a rank-sum test. (Note. You can find the p-value without calculating the test statistic for all cases).
(d) The data was not randomly sampled from the populations of interest. How does this affect any inferences you may make?
(e) What confounding variables might affect conclusions from a similar study with random samples of much greater size?

Problem 4: (10 points)
A biologist wishes to see if brain size in mammals is associated with average litter size. The biologist divides species into a group of 51 for whom the average litter size is less than two and a group of 45 for whom the average litter size is at least two. For each species, a relative brain size is calculated as 1000 × Brain Weight / Body Weight. Summary statistics for the sample data and the log-transformed sample data are shown below.

                          n    mean   std. dev.     min      Q1   median      Q3     max
avg. litter < 2
  raw data               51   6.886      5.460     0.42    2.48     5.00   10.48   20.00
  log-transformed data   51   1.552      0.952   −0.868   0.908    1.609   2.348   2.996
avg. litter ≥ 2
  raw data               45  10.968      9.837     0.94    3.39     7.97   18.61   36.35
  log-transformed data   45   1.949      1.016   −0.062   1.221    2.076   2.924   3.593

(a) Sketch side-by-side boxplots of the raw data and a separate side-by-side boxplot of the log-transformed data.
(b) Decide if the transformation is appropriate before further analysis. Explain your decision.
(c) Find a 95% confidence interval for the difference in population means based on either the raw data or the transformed data. Summarize your inference in the units of the original problem.

Problem 5: (10 points)
A researcher suspects that an antibody (CCK) may differ with gastrointestinal health. In a study with 25 guinea pigs, there are 9 healthy controls, 8 with gall stones, and 8 with ulcers.
Partial output from S-PLUS is below.

Df Sum of Sq Mean Sq F Value Pr(F)
treat 0.4328218
Residuals 0.5989222

(a) Use the information in the table and the problem description to complete an ANOVA table similar to the one in Display 5.19 on page 136 of The Sleuth.
(b) Use the F tables beginning on page 712 to find a range for the p-value.
(c) Does it appear that CCK is associated with gastrointestinal health in guinea pigs? Write a brief paragraph that interprets the results of the ANOVA study in the context of the problem.

Problem 6: (10 points)
Biologists are interested in determining a relationship between the number of species on an island as a function of the log of area of the island (A), the average elevation of the island (B), and the distance to the nearest island (C). They gather data from 30 islands in the Galapagos Archipelago. The fits from all possible models with only main effects are shown below.

Model      RSS   df
none    381081   29
A       146848   28
B       173547   28
C       381006   28
AB      136309   27
AC      130055   27
BC      173534   27
ABC     126681   26

(a) Which model is the best according to the Cp criterion?
(b) Which model is the best according to BIC?

Problem 7: (15 points)
Researchers in Europe assess the effect of the Chernobyl disaster by measuring the amounts of radioactive cesium in plants and soil. They wish to find an equation to predict the cesium amounts in mushrooms based on the soil measurements. Five samples yield the following concentrations (in Bq/kg).

                                         mean   standard deviation
mushroom     9    20    15    46    190   56.0                 76.2
soil        55   415   475    82   1310  467.4                507.8

The regression equation is (mushroom conc.) = −6.4115 + 0.1335 (soil conc.) and the residual sum of squares is 4852.4.
(a) Sketch a scatter plot of the data with the regression line drawn in.
(b) Which point do you expect to be most influential? Explain.
(c) Calculate the leverage of this point.
(d) Calculate the Cook’s distance of the point.
(e) Comment on the validity of the regression line for predicting mushroom cesium concentration when the soil
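
Worked sketch for Problem 2. The regression line asked for in parts (a) to (d) can be recovered from the summary statistics alone, using slope = r × (sd of salary) / (sd of position) and an intercept chosen so the line passes through the point of means. The function and variable names below are illustrative and not part of the exam; this is one possible way to organize the arithmetic, not an official solution.

```python
# Summary statistics quoted in Problem 2 (the 25 players who signed).
MEAN_SALARY = 1_320_000.0   # dollars
SD_SALARY = 885_000.0       # dollars
MEAN_POSITION = 13.52
SD_POSITION = 7.93
CORRELATION = -0.887


def fit_from_summary(mean_x, sd_x, mean_y, sd_y, r):
    """Return (intercept, slope) of the least-squares line of y on x."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x  # line passes through (mean_x, mean_y)
    return intercept, slope


def predict_salary(position):
    """Predicted starting salary in dollars for a given draft position."""
    intercept, slope = fit_from_summary(
        MEAN_POSITION, SD_POSITION, MEAN_SALARY, SD_SALARY, CORRELATION
    )
    return intercept + slope * position


intercept, slope = fit_from_summary(
    MEAN_POSITION, SD_POSITION, MEAN_SALARY, SD_SALARY, CORRELATION
)
print(f"salary = {intercept:,.0f} + ({slope:,.0f}) * position")
for pick in (15, 25):
    print(f"pick {pick}: predicted salary {predict_salary(pick):,.0f}")
```

The slope is the answer to part (d): it is the predicted change in salary for each one-position drop in the draft.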
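
Worked sketch for Problem 3(b). With only five observations there are just 10 ways to pick which two are labeled rural, so the permutation distribution can be enumerated exactly. The sketch below assumes the test statistic is the difference in group means (rural minus urban) and counts the relabelings at least as small as the observed one; the function name is hypothetical.

```python
from itertools import combinations

urban = [205, 196, 241]
rural = [129, 175]


def permutation_p_value(low_group, high_group):
    """One-sided exact permutation test that low_group has the smaller mean.

    Returns (count as extreme, number of relabelings, p-value).
    """
    pooled = list(low_group) + list(high_group)
    k = len(low_group)
    total = sum(pooled)

    def mean_difference(chosen):
        # Mean of the chosen "low" group minus mean of the remaining values.
        low_sum = sum(pooled[i] for i in chosen)
        return low_sum / k - (total - low_sum) / (len(pooled) - k)

    observed = mean_difference(range(k))  # the first k values are the real low group
    splits = list(combinations(range(len(pooled)), k))
    as_extreme = sum(1 for s in splits if mean_difference(s) <= observed)
    return as_extreme, len(splits), as_extreme / len(splits)


count, n_splits, p_value = permutation_p_value(rural, urban)
print(f"{count} of {n_splits} relabelings are as extreme; p = {p_value}")
```

This also illustrates the note in the problem: because the two rural values are the two smallest of the five, only the observed arrangement is as extreme as itself, so the p-value follows without computing the statistic for every case.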
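
Worked sketch for Problem 6. The comparison can be tabulated directly from the RSS and residual degrees of freedom. The definitions used here are assumptions, since the exam does not restate them: Cp = RSS / σ̂²(full) − n + 2p and BIC = n log(RSS / n) + p log(n), where p counts regression coefficients including the intercept and σ̂²(full) is the residual mean square of the ABC model. Check these against the textbook's versions before relying on the ranking.

```python
import math

N_ISLANDS = 30

# model name -> (residual sum of squares, residual df), as listed in Problem 6
MODELS = {
    "none": (381081, 29),
    "A": (146848, 28),
    "B": (173547, 28),
    "C": (381006, 28),
    "AB": (136309, 27),
    "AC": (130055, 27),
    "BC": (173534, 27),
    "ABC": (126681, 26),
}


def mallows_cp(rss, resid_df, sigma2_full, n=N_ISLANDS):
    """Mallows' Cp, with p = n - resid_df coefficients (intercept included)."""
    p = n - resid_df
    return rss / sigma2_full - n + 2 * p


def bic(rss, resid_df, n=N_ISLANDS):
    """BIC in the form n*log(RSS/n) + p*log(n)."""
    p = n - resid_df
    return n * math.log(rss / n) + p * math.log(n)


full_rss, full_df = MODELS["ABC"]
sigma2_full = full_rss / full_df  # error variance estimate from the full model

cp = {name: mallows_cp(rss, df, sigma2_full) for name, (rss, df) in MODELS.items()}
bic_scores = {name: bic(rss, df) for name, (rss, df) in MODELS.items()}

best_cp = min(cp, key=cp.get)
best_bic = min(bic_scores, key=bic_scores.get)

for name in MODELS:
    print(f"{name:5s} Cp = {cp[name]:7.2f}   BIC = {bic_scores[name]:7.2f}")
print("smallest Cp:", best_cp, "| smallest BIC:", best_bic)
```

Under both criteria smaller is better; the full model always has Cp equal to its number of coefficients, which is a quick check on the arithmetic.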
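
Worked sketch for Problem 7(c) and (d). For simple linear regression the leverage of case i is h_i = 1/n + (x_i − x̄)² / Σ(x_j − x̄)², and Cook's distance can be written D_i = e_i² h_i / (p σ̂² (1 − h_i)²) with p = 2 coefficients and σ̂² = RSS / (n − 2). The sketch below uses the regression equation and residual sum of squares quoted in the problem rather than refitting, so small rounding differences are expected; the helper names are illustrative.

```python
soil = [55, 415, 475, 82, 1310]
mushroom = [9, 20, 15, 46, 190]

# Values quoted in the problem statement.
INTERCEPT, SLOPE = -6.4115, 0.1335
RSS = 4852.4
N_COEF = 2  # intercept and slope


def leverages(x):
    """Leverage of each case in a simple linear regression on x."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]


def cooks_distances(x, y):
    """Cook's distance of each case, using the quoted fit and RSS."""
    h = leverages(x)
    sigma2 = RSS / (len(x) - N_COEF)
    residuals = [yi - (INTERCEPT + SLOPE * xi) for xi, yi in zip(x, y)]
    return [
        e ** 2 * hi / (N_COEF * sigma2 * (1 - hi) ** 2)
        for e, hi in zip(residuals, h)
    ]


h = leverages(soil)
d = cooks_distances(soil, mushroom)
most_extreme = max(range(len(soil)), key=lambda i: h[i])

print(f"highest-leverage case: soil = {soil[most_extreme]}")
print(f"leverage = {h[most_extreme]:.3f}, Cook's distance = {d[most_extreme]:.2f}")
```

The leverages sum to the number of coefficients, so a single case with leverage near 1 is carrying most of the information about the slope.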

