
Name: Nicole Tobison
Bios 6102
Homework 3: Categorical data analyses
Due: 3/16/2021 (Tue) noon, submit to Moodle

- 5 points for each question <full score: 110>
- Include all related SAS outputs under the corresponding question
- Show your steps of calculation and related SAS outputs in the ‘main’ answer sheet. The SAS codes are just for reference.
- Late policy: minus 10 points for each day after the deadline; not accepted after the 5th day

1. In a sample of 20 items, I found six to be defective. In constructing a confidence interval for the proportion of defectives, I should use:
a. the plus four method.
b. the large sample interval.
c. neither method.

2. A sample of 75 students found that 55 of them had cell phones. The margin of error for a 95% confidence interval estimate for the proportion of all students with cell phones is:

The margin of error is 0.10 or 10%.
$1.96\sqrt{\dfrac{\frac{55}{75}\times\frac{20}{75}}{75}} = 0.100$

3. We want to construct a 95% confidence interval for the true proportion of all adult males who have spent time in prison, with a margin of error of 0.02. From previous studies, we believe the proportion to be somewhere around 0.07. What is the minimum required sample size?

The minimum required sample size is 626 adult males.
$n = \left(\dfrac{1.96}{0.02}\right)^2 (0.07)(1-0.07) \approx 625.2$
Round up to 626.

4. You want to know which of two manufacturing methods will be better. You create 10 prototypes using the first process, and 10 using the second. There were 3 defectives in the first batch and 5 in the second. Find a 95% confidence interval for the difference in the proportion of defectives.

The 95% confidence interval for the difference in the proportion of defectives is [-0.620, 0.220].
$0.3 - 0.5 \pm 1.96\sqrt{\dfrac{0.3(1-0.3)}{10} + \dfrac{0.5(1-0.5)}{10}} = [-0.620373168,\ 0.220373168]$

5. A company wants to evaluate whether the response rate of the new drug B is better than the standard drug A. The data are listed below. Use the two-proportion approach to answer the following questions.
                 Drug A (standard)   Drug B (new)
Sample size      120                 150
Response rate    15%                 22%

5.a Provide the null and alternative hypothesis for comparing the response rate between these two drugs.

Ho: p1 – p2 = 0
Ha: p1 – p2 ≠ 0
Where p1 = the true response rate for drug A, and p2 = the true response rate for drug B.

5.b Give a 95% confidence interval for the difference (with calculation and SAS outputs) and briefly summarize what the data show.

A 95% confidence interval for the difference is [-0.162, 0.022].
$0.15 - 0.22 \pm 1.96\sqrt{\dfrac{0.15(1-0.15)}{120} + \dfrac{0.22(1-0.22)}{150}} = [-0.162067858,\ 0.022067858]$
We are 95% confident that the interval from -0.162 to 0.022 captures the true difference in response rates between the two drugs. Because the interval contains 0, there is no significant difference in response rate between these two drugs.

5.c Perform the significance test to evaluate the hypothesis. Your answer should include the test statistic, p-value and conclusion. Show your calculation and SAS outputs.

$z = \dfrac{0.15 - 0.22}{\sqrt{0.1889(1-0.1889)\left(\frac{1}{120}+\frac{1}{150}\right)}} = -1.46$
Chi-square = 2.132
p-value = 0.1442
Because p = 0.1442 > 0.05, we fail to reject the null hypothesis, and we conclude that there is no significant difference between the response rates for the two drugs.

6. One of the questions in a survey of high school students asks about lying to teachers. The following table gives the numbers of students who said that they lied to a teacher at least once during the past year, classified by gender. The researchers want to know whether gender is a risk factor for lying to teachers. Use the two-way table approach to answer the following questions.

                      Gender
Lied at least once    Male      Female
No                    9,659     4,620
Yes                   3,228     10,295

6a.
(1) List the null and alternative hypothesis and (2) Calculate appropriate percents to describe the results of this question (calculation & SAS)

Ho: no association between lying and gender
Ha: there is an association between lying and gender
$\hat{p}_1 = 0.25$; 25% of male students lied to teachers at least once.
$\hat{p}_2 = 0.69$; 69% of female students lied to teachers at least once.

6b. Use the Chi-square test to test whether there is a significant association between gender and lied at least once to a teacher. (manual & SAS)

$X^2 = \dfrac{(3228-6263.082)^2}{6263.082} + \dfrac{(10295-7248.69)^2}{7248.69} + \dfrac{(9659-6623.918)^2}{6623.918} + \dfrac{(4620-7666.31)^2}{7666.31} = 5352.1966$
Chi-square = 5352.1966 (manual); 5351.9397 (SAS)
p-value < 0.0001
There is a significant association between gender and lying at least once to a teacher.

6c. Summarize your conclusion.

Because p < 0.0001, which is less than 0.05, we reject the null hypothesis, and we conclude that there is a significant difference between the proportions of male students who lie at least once to a teacher and female students who lie at least once to a teacher.

7. A student survey reveals that only 564 of 1,200 students surveyed voted in the past student government elections. After closer inspection, it is discovered that 226 female students voted and 338 male students voted. Of the 1200 students participating in the survey, 710 were female and 490 were male. What is the odds ratio of a male student voting in the past student elections to a female student voting in the past student elections?

$p_{\text{male}} = 0.6898$, $p_{\text{female}} = 0.3183$
$\text{odds}_{\text{male}} = 2.2237$, $\text{odds}_{\text{female}} = 0.4669$
log odds male = log(2.2237) = 0.7992; log odds female = log(0.4669) = -0.7616
B0 = -0.7616, B1 = 1.5608
odds ratio (male to female) $= \dfrac{e^{-0.7616+1.5608}}{e^{-0.7616}} = e^{1.5608} = 4.7626$
The odds of voting in the past student elections for a male student are 4.7626 times the odds for a female student.

8. A physician wants to know whether age at breast cancer diagnosis has a significant impact on breast cancer stage (high vs. low).
He used the following SAS codes to categorize age at diagnosis to three sub-groups. The following logistic regression is to predict high tumor stage. Based on the following information, calculate the odds of high tumor stage for the three age groups (<=45, 45.1-50, and >50).

    if Age_at_pathological_diagnosis=. then age_dx_g3=.;
    else if Age_at_pathological_diagnosis<=45 then age_dx_g3=1;
    else if Age_at_pathological_diagnosis<=50 then age_dx_g3=2;
    else age_dx_g3=3;

Odds age <=45: $e^{-0.6391} = 0.5278$
Odds age 45.1 – 50: $e^{0.3337} = 1.3961$
Odds age >50: $e^{0.0412} = 1.0421$

9. Myopia (i.e., nearsightedness) is a result of environmental and genetic factors. In Singapore the percent of military personnel having myopia increased dramatically over a 20-year
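
The hand calculations in questions 2 through 7 above can be cross-checked with a short script. The following is a minimal Python sketch and not the SAS output the assignment asks for. The function names are illustrative, and the response counts of 18 out of 120 and 33 out of 150 for question 5 are an assumption derived from the stated 15% and 22% rates.

```python
import math

Z95 = 1.96  # critical value used in the written answers


def margin_of_error(successes, n, z=Z95):
    """Large-sample margin of error for a single proportion."""
    p = successes / n
    return z * math.sqrt(p * (1 - p) / n)


def sample_size(p_guess, margin, z=Z95):
    """Smallest n whose margin of error does not exceed `margin`."""
    return math.ceil((z / margin) ** 2 * p_guess * (1 - p_guess))


def diff_ci(p1, n1, p2, n2, z=Z95):
    """Unpooled (Wald) confidence interval for p1 - p2."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p1 - p2
    return d - z * se, d + z * se


def pooled_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and its two-sided p-value."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return z, math.erfc(abs(z) / math.sqrt(2))


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def odds_ratio(x1, n1, x2, n2):
    """Odds of the event in group 1 divided by the odds in group 2."""
    return (x1 / (n1 - x1)) / (x2 / (n2 - x2))


print("Q2 margin of error:", margin_of_error(55, 75))
print("Q3 sample size:", sample_size(0.07, 0.02))
print("Q4 interval:", diff_ci(0.3, 10, 0.5, 10))
print("Q5b interval:", diff_ci(0.15, 120, 0.22, 150))
# Assumed counts: 15% of 120 = 18 responders, 22% of 150 = 33 responders.
print("Q5c z and p-value:", pooled_z_test(18, 120, 33, 150))
# Rows are No/Yes, columns are Male/Female, as in the question 6 table.
print("Q6 chi-square:", chi_square_2x2(9659, 4620, 3228, 10295))
print("Q7 odds ratio (male to female):", odds_ratio(338, 490, 226, 710))
```

Because `chi_square_2x2` works from the unrounded counts, it should land closer to the SAS value than to the manual value in 6b, which appears to use expected counts built from rounded row proportions. Likewise, `odds_ratio` skips the rounded log odds, so it can differ from the written answer in the fourth decimal place.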
