Duke STA 101 - Chi-square test


Chi-squared test
FPP 28

More types of inference for nominal variables
Nominal data is categorical with more than two categories.
Compare observed frequencies of a nominal variable to hypothesized probabilities:
  One categorical variable with more than two categories
  Chi-squared goodness of fit test
Test if two nominal variables are independent:
  Two categorical variables with at least one having more than two categories
  Chi-squared test of independence

Goodness of fit test
Do people admit themselves to hospitals more frequently close to their birthday?
Data from a random sample of 200 people admitted to hospitals:

Days from birthday    Number of admissions
within 7              11
8-30                  24
31-90                 69
91+                   96

Assume there is no birthday effect, that is, people admit randomly. Then,
Pr(within 7) = 15/365 = .0411
Pr(8 - 30) = 46/365 = .1260
Pr(31 - 90) = 120/365 = .3288
Pr(91+) = 184/365 = .5041
So, in a sample of 200 people, we’d expect
8.22 to be in “within 7”
25.20 to be in “8 - 30”
65.76 to be in “31 - 90”
100.82 to be in “91+”

If admissions are random, we expect the sample frequencies and hypothesized probabilities to be similar.
But, as always, the sample frequencies are affected by chance error.
So, we need to see whether the sample frequencies could plausibly have resulted from chance error if the hypothesized probabilities are true. Let’s build a hypothesis test.

Goodness of fit test: Hypothesis
Claim (alternative hyp.)
is that admission probabilities change according to days since birthday.
Opposite of claim (null hyp.) is that probabilities are in accordance with random admissions.
H0: Pr(within 7) = .0411, Pr(8 - 30) = .1260, Pr(31 - 90) = .3288, Pr(91+) = .5041
HA: probabilities different from those in H0.

Goodness of fit test: Test statistic
Chi-squared test statistic:

$$X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

Cell     Obs    Exp       Dif      Dif^2    Dif^2/Exp
In 7     11     8.22      2.78     7.73     .94
8-30     24     25.20     -1.20    1.44     .057
31-90    69     65.76     3.24     10.50    .16
91+      96     100.82    -4.82    23.23    .23

$$X^2 = .94 + .057 + .16 + .23 \approx 1.39$$

The JMP output, which uses the hypothesized probabilities rounded to three decimals (.041, .126, .329, .504), gives $X^2 = 1.397$. That value is used in what follows.

Goodness of fit test: Calculate p-value
$X^2$ has a chi-squared distribution with degrees of freedom equal to the number of categories minus 1. In this case, df = 4 - 1 = 3.
To get a p-value, calculate the area under the chi-squared curve to the right of 1.397.
Using JMP, this area is 0.706. If the null hypothesis is true, there is about a 70% chance of observing a value of $X^2$ as or more extreme than 1.397.
Using the table, the p-value is between 0.70 and 0.90.

[Chi-squared table]

JMP output admissions
Distributions: Days

Frequencies
Level       Count    Prob
31 - 90     69       0.34500
8 - 30      24       0.12000
91+         96       0.48000
Within 7    11       0.05500
Total       200      1.00000
4 Levels

Test Probabilities
Level       Estim Prob    Hypoth Prob
31 - 90     0.34500       0.32900
8 - 30      0.12000       0.12600
91+         0.48000       0.50400
Within 7    0.05500       0.04100

Test                ChiSquare    DF    Prob>Chisq
Likelihood Ratio    1.3063       3     0.7276
Pearson             1.3974       3     0.7061
Method: Fix hypothesized values, rescale omitted

Goodness of fit test: Judging p-value
The .70 is a large p-value, indicating that the difference between the observed and expected counts could well occur by random chance when the null hypothesis is true. Therefore, we cannot reject the null hypothesis.
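The goodness of fit arithmetic can be checked with a short script. This is a minimal sketch in standard-library Python rather than the JMP workflow used in the slides. The names `chi2_sf_df3` and `goodness_of_fit` are hypothetical helpers introduced here, and the p-value uses a closed-form tail area that holds only for 3 degrees of freedom. It runs the test twice: once with the three-decimal probabilities shown in the JMP output and once with the four-decimal probabilities stated in H0.

```python
import math


def chi2_sf_df3(x):
    """Area to the right of x under the chi-squared curve with df = 3.

    Closed form that is valid only for 3 degrees of freedom.
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def goodness_of_fit(observed, probs):
    """Return expected counts and the statistic sum((obs - exp)^2 / exp)."""
    n = sum(observed)
    expected = [n * p for p in probs]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, x2


# Observed admissions: within 7, 8-30, 31-90, 91+ days from birthday.
observed = [11, 24, 69, 96]

# Hypothesized probabilities as they appear in the JMP output (three decimals).
expected_jmp, x2_jmp = goodness_of_fit(observed, [0.041, 0.126, 0.329, 0.504])
p_jmp = chi2_sf_df3(x2_jmp)

# Hypothesized probabilities as stated in H0 (four decimals).
expected_h0, x2_h0 = goodness_of_fit(observed, [0.0411, 0.1260, 0.3288, 0.5041])
p_h0 = chi2_sf_df3(x2_h0)

print(f"JMP probabilities: X^2 = {x2_jmp:.4f}, p-value = {p_jmp:.4f}")
print(f"H0 probabilities:  X^2 = {x2_h0:.4f}, p-value = {p_h0:.4f}")
```

Either set of probabilities leads to the same decision: the statistic is about 1.39 to 1.40 on 3 degrees of freedom and the p-value is close to 0.7.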
There is not enough evidence to conclude that admission rates change according to days from birthday.

Independence test
Is birth order related to delinquency?
Nye (1958) randomly sampled 1154 high school girls and asked if they had been “delinquent”.

Birth order    Delinquent    Not delinquent
Eldest         24            450
In Between     29            312
Youngest       35            211
Only           23            70

Sample of conditional frequencies
% delinquent for each birth order status:

Oldest      .05
Middle      .085
Youngest    .14
Only        .25

Based on conditional frequencies, it appears that the youngest are more delinquent.
Could these sample frequencies have plausibly occurred by chance if there is no relationship between birth order and delinquency?

Test of independence: Hypotheses
Want to show that there is some relationship between birth order and delinquency.
Opposite is that there is no relationship.
H0: birth order and delinquency are independent.
HA: birth order and delinquency are dependent.

Implications of independence: Expected counts
Under independence, Pr(oldest and delinquent) = Pr(oldest) * Pr(delinquent).
Estimate Pr(oldest) as the marginal frequency of oldest, 474/1154.
Estimate Pr(delinquent) as the marginal frequency of delinquent, 111/1154.
Hence, estimate Pr(oldest and delinquent) as

$$\frac{474}{1154} \times \frac{111}{1154} = .0395$$

The expected number of oldest and delinquent, under independence, equals

$$1154 \times \frac{474}{1154} \times \frac{111}{1154} = 45.59$$

This is repeated for all the other cells in the table.

Test of independence: Expected counts
Next we compare the observed counts with the expected counts to get a test statistic.

Birth order    Delinquent    Not delinquent
Oldest         45.59         428.41
In Between     32.80         308.2
Youngest       23.66         222.34
Only           8.95          84.05

Use the $X^2$ statistic as the test statistic:

$$
\begin{aligned}
X^2 &= \frac{(24 - 45.59)^2}{45.59} + \frac{(450 - 428.41)^2}{428.41} + \frac{(29 - 32.80)^2}{32.80} + \frac{(312 - 308.2)^2}{308.2} \\
&\quad + \frac{(35 - 23.66)^2}{23.66} + \frac{(211 - 222.34)^2}{222.34} + \frac{(23 - 8.95)^2}{8.95} + \frac{(70 - 84.05)^2}{84.05} \\
&= 42.245
\end{aligned}
$$

Test of independence: Calculate the p-value
$X^2$ has a chi-squared distribution with degrees of freedom
df = (number of rows - 1) * (number of columns - 1)
In the delinquency problem, df = (4 - 1) * (2 - 1) = 3.
The area under the chi-squared curve to the right of 42.245 is less than .0001.
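If birth order and delinquency are independent, there is only a very small chance of getting an $X^2$ as or more extreme than 42.245.

JMP output for chi-squared test
Freq: Column 3

Birth Order    Delinquency       Count    Total %    Col %    Row %    Expected    Cell Chi^2
Eldest         Not delinquent    450      38.99      43.14    94.94    428.407     1.0883
Eldest         Delinquent        24       2.08       21.62    5.06     45.5927     10.2263
In Between     Not delinquent    312      27.04      29.91    91.50    308.2       0.0468
In Between     Delinquent        29       2.51       26.13    8.50     32.7998     0.4402
Only Child     Not delinquent    70       6.07       6.71     75.27    84.0546     2.3500
Only Child     Delinquent        23       1.99       20.72    24.73    8.94541     22.0819
Youngest       Not delinquent    211      18.28      20.23    85.77    222.338     0.5782
Youngest       Delinquent        35       3.03       31.53    14.23    23.662      5.4327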
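The independence test for the birth order data can be reproduced the same way. The sketch below uses only standard-library Python and is not the JMP procedure from the slides. The function names `chi2_sf_df3` and `independence_test` are hypothetical, and the tail-area formula is a closed form that applies only when df = 3, which is the case for this 4 x 2 table.

```python
import math


def chi2_sf_df3(x):
    """Area to the right of x under the chi-squared curve with df = 3.

    Closed form that is valid only for 3 degrees of freedom.
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def independence_test(table):
    """Chi-squared test of independence for a table given as a list of rows.

    Returns the expected counts, the X^2 statistic, and the degrees of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Under independence: expected count = row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    x2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, x2, df


# Rows: eldest, in between, youngest, only. Columns: delinquent, not delinquent.
observed = [[24, 450], [29, 312], [35, 211], [23, 70]]

expected, x2, df = independence_test(observed)
p_value = chi2_sf_df3(x2)  # appropriate here because df == 3

for row in expected:
    print([round(e, 2) for e in row])
print(f"X^2 = {x2:.3f}, df = {df}, p-value = {p_value:.2e}")
```

The expected counts agree with the table of expected counts to two decimals, and the p-value falls well below .0001, so the hypothesis of independence is rejected.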

