Statistics 201 Exam 1 – from Fall 2013
Practice Exam 1 for Spring 2014 - KEY

This practice exam is provided solely for the purpose of familiarizing you with the format and style of the Stat 201 exams. There is no explicit or implicit guarantee that the upcoming exam will ask similar questions. If you use the practice exam as your only tool to help you prepare for the upcoming exam, you most likely will not do well on the exam. You should still do the things you would have done if you did not have access to this practice exam, such as re-read the text, go over your class notes, re-work the online homework problems, and look at the list of exam topics provided and make sure that you understand all the concepts listed within it.

1. A study at a medical center examined 129 people to see if contracting hepatitis C was associated with having diabetes. Researchers used medical records to determine whether or not the subject has diabetes and/or hepatitis C. (Report your answers to TWO decimal places.)

                             HAS diabetes   Does NOT have diabetes   Total
HAS hepatitis C                   21                  12               33
Does NOT have hepatitis C         39                  57               96
Total                             60                  69              129

i) (3 points) What percent of the subjects do not have hepatitis C?
96 / 129 = 0.7442 = 74.42%

ii) (3 points) What percent of the subjects who have diabetes do not have hepatitis C?
39 / 60 = 0.6500 = 65.00%

iii) (3 points) What percent of the subjects with hepatitis C do not have diabetes?
12 / 33 = 0.3636 = 36.36%

iv) (4 points) Do you think there is an association between hepatitis C virus and diabetes? Circle your answer. YES NO
Using the mosaic plot below, briefly explain your answer in the space to the right of the plot.
It seems that subjects with diabetes were more likely to have hepatitis C. The "heights" of the bars are not equal.

2. (5 points) The response times of a particular web server are approximately normally distributed with a mean of 120 milliseconds and a standard deviation of 15 milliseconds.
Given that any value from a normal distribution with a z-score of -0.841 represents the 20th percentile, use this fact to calculate the 80th percentile for these response times.

** Recall that the normal distribution is symmetric; therefore the z-score for the 80th percentile is the negative of the z-score for the 20th percentile, that is, +0.841. Then, solve for y.
z = (y - µ) / σ
0.841 = (y - 120) / 15
y = 120 + 0.841(15) = 132.615 milliseconds

3. (5 points) Suppose that the wait time to reach a technical support person at a large software company is approximately normally distributed, with a mean of 4 minutes and a standard deviation of 0.5 minutes. Approximately what percent of customers will wait less than 2.75 minutes? Use one of the screenshots on the next page to help you answer this question. Show any work below, and write your final answer on the blank line below. Report your final answer to 4 decimal places.

z = (y - µ) / σ
z = (2.75 - 4) / 0.5 = -2.5

Customers who have wait times less than 2.75 minutes are 2.5 standard deviations BELOW the average wait time of 4.0 minutes. The circled graphic on the next page shows a standard normal model, with a mean of 0 and a standard deviation of 1, and a shaded area that corresponds to 2.5 standard deviations ABOVE the mean. Since the normal model is symmetric, 1 minus this shaded area gives the area we are interested in. So the final answer is 1 - 0.9938 = 0.0062.

Final Answer is ____0.0062_______.

Problem 3, continued.

4. A survey of college freshmen asked how many minutes per day they talk on the phone. The box plots below display the data for female and male students. Answer the questions below based on what you see in the image.

i) (3 points) Which of the following statements is true? (circle the best answer):
a) The median number of minutes spent on the phone by males is higher than the median number of minutes for females.
b) The minimum number of minutes is lower for males than the minimum for females.
c) Females have the greatest IQR.
d) None of the statements are true.

ii) (3 points) The group that seems to have the most outliers is (circle the best answer):
a) males
b) females
c) none of the groups appear to have outliers
d) cannot be determined from box plots

iii) (3 points) The group with the least amount of overall variation is (circle the best answer):
a) males
b) females
c) the groups appear to be similar in terms of overall variation
d) cannot be determined from box plots

5. The following data are from a Sporting Goods Company that manufactures and sells products. This small subset of the company's employee data is from its human resource department.

I                 C               Q       Q         Q           C           C          Q
Employee          Department      salary  years     # Vacation  Vision      Location   401k
                                          employed  days        insurance?             contribution
                                                                Yes or No?             amount (%)
Abram, Doug       Sales           35000   2         8           Y           Boston     2%
Baker, Baker      Production      41000   1         0           Y           Chicago    3%
Baker, Christine  Administration  29000   1         6           Y           Boston     1%
Johnson, Paul     Production      49000   5         5           N           Nashville  3%
Parker, Laura     Production      65000   8         12          Y           Boston     5%
Edwards, John     Maintenance     30000   2         0           Y           Boston     0%
Deal, Karl        Administration  32000   4         8           Y           Nashville  3%
Smith, Smith      Maintenance     32000   2         0           N           Nashville  0%
Wood, Janice      Sales           38000   2         0           N           Chicago    0%
Bunch, Chris      Sales           41000   3         5           Y           Nashville  4%
Morris, Bradley   Administration  35000   6         15          Y           Chicago    5%

i) (4 points) Indicate whether the variables are quantitative (Q), categorical (C), or identifier (I) in the space above the column headings. Label each column as only ONE type of variable.

ii) (2 points) Name two variables listed above that one could use to create a scatter plot.
Years
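
The arithmetic in Problems 1 through 3 can be double-checked with a short script. The following is a minimal Python sketch, not part of the original key: the variable names are illustrative, and it assumes Python 3.8 or later for `statistics.NormalDist`. The counts and distribution parameters are copied from the problems. Note that the exact 80th percentile from the inverse CDF (about 132.62) differs slightly from the key's 132.615 because the key uses the rounded z-score 0.841.

```python
from statistics import NormalDist

# Problem 1: counts copied from the two-way table (129 subjects in total).
hep_c_diabetes, hep_c_no_diabetes = 21, 12
no_hep_c_diabetes, no_hep_c_no_diabetes = 39, 57
total = (hep_c_diabetes + hep_c_no_diabetes
         + no_hep_c_diabetes + no_hep_c_no_diabetes)


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Percent of all subjects without hepatitis C.
pct_no_hep_c = pct(no_hep_c_diabetes + no_hep_c_no_diabetes, total)
# Percent of subjects with diabetes who do not have hepatitis C.
pct_diabetes_no_hep_c = pct(no_hep_c_diabetes,
                            hep_c_diabetes + no_hep_c_diabetes)
# Percent of subjects with hepatitis C who do not have diabetes.
pct_hep_c_no_diabetes = pct(hep_c_no_diabetes,
                            hep_c_diabetes + hep_c_no_diabetes)

# Association check: hepatitis C rate within each diabetes group.
rate_hep_c_given_diabetes = pct(hep_c_diabetes,
                                hep_c_diabetes + no_hep_c_diabetes)
rate_hep_c_given_no_diabetes = pct(hep_c_no_diabetes,
                                   hep_c_no_diabetes + no_hep_c_no_diabetes)

# Problem 2: 80th percentile of response times, Normal(mean=120, sd=15).
y_80_key = 120 + 0.841 * 15                      # key's method, rounded z
y_80_exact = NormalDist(120, 15).inv_cdf(0.80)   # exact inverse CDF

# Problem 3: P(wait < 2.75) for Normal(mean=4, sd=0.5), to 4 decimals.
p_wait = round(NormalDist(4, 0.5).cdf(2.75), 4)

print(pct_no_hep_c, pct_diabetes_no_hep_c, pct_hep_c_no_diabetes)
print(rate_hep_c_given_diabetes, rate_hep_c_given_no_diabetes)
print(y_80_key, y_80_exact, p_wait)
```

The two conditional rates in the association check are unequal, which is the same comparison the mosaic plot's unequal bar heights express.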