UI STAT 2010 - Statistical Methods and Computing - D2161869


School name University of Iowa

Course Stat 2010- Statistical Methods and Computing

Pages 5


22S:30/105 Statistical Methods and Computing
Instructor: Cowles
Lab 6, Apr. 9, 2008
Inference for Proportions

1 Inference about a single population proportion

Diana M. Bailey (The American Journal of Occupational Therapy, 1990) conducted a study to examine the reasons why occupational therapists have left the field of occupational therapy. Her sample consisted of female certified occupational therapists who had left the profession either permanently or temporarily. Out of 696 subjects who responded to the data-gathering survey, 438 (or 63%) had planned to take time off from their jobs to have and raise children.

On the basis of these data, we wish to compute a confidence interval for the unknown proportion in the sampled population whose reason for leaving the field is to take time off to have and raise children.

1. What is the sampled population?
2. What is (are) the population parameter(s) of interest?
3. Is this a one-sample, paired-sample, or two-independent-sample problem?
4. Are the rules of thumb met so that we can use a normal approximation to carry out our test?
5. What is the point estimate for p, the proportion of occupational therapists who leave the field for reasons other than having and raising kids?
6. What is the 95% confidence interval? What does the confidence interval mean?
7. At the .01 significance level, carry out a hypothesis test of the hypotheses
       H0: p = 0.25
       Ha: p ≠ 0.25
8. Can you reject H0? What does this mean substantively?
9. Interpret the p-value.

SAS code

Creating the dataset:

    data leave ;
    input child $ count ;
    datalines ;
    Y 438
    N 258
    ;

Proc freq makes a table of counts and percents:

    proc freq data = leave ;
    tables child ;
    weight count ;
    run ;

SAS output

                      The FREQ Procedure

                                  Cumulative    Cumulative
    child    Frequency    Percent  Frequency      Percent
    ------------------------------------------------------
    N             258      37.07        258        37.07
    Y             438      62.93        696       100.00

To get a level 1 - α confidence interval for the true population proportion p, add the binomial alpha = alpha0 option to the end of the tables statement.

    proc freq data = leave ;
    tables child / binomial alpha = .05 ;
    weight count ;
    run ;

This code requests a 95% c.i. To get a 99% c.i., you would specify alpha = .01. Note that this code also automatically produces a hypothesis test of H0: p = 0.5.

SAS output

                      The FREQ Procedure

                                  Cumulative    Cumulative
    child    Frequency    Percent  Frequency      Percent
    ------------------------------------------------------
    N             258      37.07        258        37.07
    Y             438      62.93        696       100.00

       Binomial Proportion for child = N
       ---------------------------------
       Proportion                 0.3707
       ASE                        0.0183
       95% Lower Conf Bound       0.3348
       95% Upper Conf Bound       0.4066

       Exact Conf Bounds
       95% Lower Conf Bound       0.3347
       95% Upper Conf Bound       0.4078

       Test of H0: Proportion = 0.5

       ASE under H0               0.0190
       Z                         -6.8229
       One-sided Pr <  Z          <.0001
       Two-sided Pr > |Z|         <.0001

To carry out a one-sample z test of the hypothesis

       H0: p = p0
       Ha: p ≠ p0

add the binomial (p = p0) option on the end of the tables statement. The following code tests the null hypothesis that the population proportion of occupational therapists leaving the field for reasons other than to have and raise kids is 0.25. Note that it also automatically produces a 95% c.i. for p.

    proc freq data = leave ;
    tables child / binomial (p = 0.25) ;
    weight count ;
    run ;

SAS output

                      The FREQ Procedure

                                  Cumulative    Cumulative
    child    Frequency    Percent  Frequency      Percent
    ------------------------------------------------------
    N             258      37.07        258        37.07
    Y             438      62.93        696       100.00

       Binomial Proportion for child = N
       ---------------------------------
       Proportion                 0.3707
       ASE                        0.0183
       95% Lower Conf Bound       0.3348
       95% Upper Conf Bound       0.4066

       Exact Conf Bounds
       95% Lower Conf Bound       0.3347
       95% Upper Conf Bound       0.4078

       Test of H0: Proportion = 0.25

       ASE under H0               0.0164
       Z                          7.3532
       One-sided Pr >  Z          <.0001
       Two-sided Pr > |Z|         <.0001

2 Comparing two population proportions

Research has suggested that alcoholism may be related to clinical depression. An investigation by Winokur and Coryell (American Journal of Psychiatry, 1991) explored this possible relationship. In 210 families of females with clinical depression, they found that alcoholism was present in 89. In 299 control families, alcoholism was present in 94. Do these data provide evidence that alcoholism occurs in a different proportion of families in which unipolar major depression occurs than in which there is no diagnosis
of depression? Carry out a hypothesis test at the .05 significance level.

1. What is (are) the populations of interest?
2. What is (are) the population parameters of interest?
3. Is this a one-sample, paired-sample, or two-independent-sample problem?
4. Is the hypothesis one- or two-sided?
5. What are the null and alternative hypotheses for the test?
6. Are the rules of thumb met so that we can use a normal approximation to carry out our test?
7. If the null hypothesis is true, what is our best estimate, based on this data, of the common proportion of alcoholism in both populations of families?
8. What is your conclusion based on the statistical analysis?

First we must key in our data.

    data depress ;
    input depress $ alcohol $ count ;
    datalines ;
    Y Y 89
    Y N 121
    N Y 94
    N N 205
    ;
    run ;

Next we use the Chi-square test option of proc freq to do the hypothesis test.

    proc freq data = depress ;
    tables depress * alcohol / chisq ;
    weight count ;
    run ;

SAS output

    TABLE OF DEPRESS BY ALCOHOL

    DEPRESS     ALCOHOL

    Frequency|
    Percent  |
    Row Pct  |
    Col Pct  |N       |Y       |  Total
    ---------+--------+--------+
    N        |    205 |     94 |    299
             |  40.28 |  18.47 |  58.74
             |  68.56 |  31.44 |
             |  62.88 |  51.37 |
    ---------+--------+--------+
    Y        |    121 |     89 |    210
             |  23.77 |  17.49 |  41.26
             |  57.62 |  42.38 |
             |  37.12 |  48.63 |
    ---------+--------+--------+
    Total         326      183      509
                64.05    35.95   100.00

    STATISTICS FOR TABLE OF DEPRESS BY ALCOHOL

    Statistic                      DF     Value      Prob
    ------------------------------------------------------
    Chi-Square                      1     6.415     0.011
    Likelihood Ratio Chi-Square     1     6.385     0.012
    Continuity Adj. Chi-Square      1     5.949     0.015
    Mantel-Haenszel Chi-Square      1     6.402     0.011
    Fisher's Exact Test (Left)                      0.996
                        (Right)                  7.46E-03
                        (2-Tail)                    0.015
    Phi Coefficient                       0.112
    Contingency Coefficient               0.112
    Cramer's V                            0.112

    Sample Size = 509

3 Proc freq for data read in from a dataset of individual observations

Do not use the weight statement in proc freq if each observation should be given weight 1.

Here is an example problem based on the dataset dieldrin.dat from the course web page. Stacey, Perriman, and Whitney (1985) studied pesticide residues in human milk in Western Australia in 1979-80. Earlier research had discovered high pesticide levels. Stacey et al. hoped to show that levels had decreased due to stronger government
controls over the use of pesticides on food crops. They did find decreases for several types of pesticides, but levels of dieldrin had increased substantially. This dataset has information from 45 donors. The variables are age, newburb, termite, and above.

    data milk ;
    infile '/group/ftp/pub/kcowles/datasets/dieldrin.dat' ;
    input age newburb termite above ;
    run ;

    proc freq data = milk ;
    tables above / binomial alpha = .01 ;
    run ;

4 …
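The SAS output in Sections 1 and 2 can be checked by hand. What follows is a sketch using the usual large-sample formulas, which the handout itself does not write out. The symbols \hat{p}, n, p_0, \hat{p}_1, \hat{p}_2, the pooled estimate \hat{p}_{pool}, and the critical value 1.96 are notation assumed here for illustration, and all results are rounded.

```latex
% Section 1: proportion for child = N (258 of n = 696), ASE, and 95% c.i.
\begin{align*}
\hat{p} &= \frac{258}{696} \approx 0.3707, \\
\mathrm{ASE} &= \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
              = \sqrt{\frac{(0.3707)(0.6293)}{696}} \approx 0.0183, \\
\hat{p} \pm 1.96\,\mathrm{ASE} &\approx 0.3707 \pm 0.0359
              \;\Rightarrow\; (0.3348,\ 0.4066).
\end{align*}
% Section 1: z statistic for H0: p = p0, using the standard error under H0
\begin{align*}
z &= \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}, \\
p_0 = 0.5:\quad z &= \frac{258/696 - 0.5}{\sqrt{(0.5)(0.5)/696}} \approx -6.82, \\
p_0 = 0.25:\quad z &= \frac{258/696 - 0.25}{\sqrt{(0.25)(0.75)/696}} \approx 7.35.
\end{align*}
% Section 2: pooled two-sample z statistic; its square is the Chi-Square value
\begin{align*}
\hat{p}_1 &= \frac{89}{210} \approx 0.4238, \qquad
\hat{p}_2 = \frac{94}{299} \approx 0.3144, \qquad
\hat{p}_{pool} = \frac{89 + 94}{210 + 299} = \frac{183}{509} \approx 0.3595, \\
z &= \frac{\hat{p}_1 - \hat{p}_2}
          {\sqrt{\hat{p}_{pool}(1-\hat{p}_{pool})\left(\frac{1}{210} + \frac{1}{299}\right)}}
   \approx \frac{0.1094}{0.0432} \approx 2.53, \qquad z^2 \approx 6.41.
\end{align*}
```

These values agree with the ASE, confidence bounds, and Z statistics in the Section 1 output, and with the Chi-Square value of 6.415 in the Section 2 output, up to rounding.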
