Stat 13, UCLA, Ivo Dinov

Slide 1
UCLA STAT 13: Introduction to Statistical Methods for the Life and Health Sciences
Instructor: Ivo Dinov, Asst. Prof. of Statistics and Neurology
Teaching Assistants: Brandi Shanata & Tiffany Head
University of California, Los Angeles, Fall 2007
http://www.stat.ucla.edu/~dinov/courses_students.html

Slide 2
Lecture Set 8: The T Test, Wilcoxon-Mann-Whitney Test

Slide 3: Application
Example: Nine observations of surface soil pH were made at each of two different locations. Do the data suggest that the true mean soil pH values differ for the two locations? Test using α = 0.05, and be sure to check any necessary assumptions for the validity of your test.

Location 1   Location 2
   8.10         7.85
   7.89         7.30
   8.00         7.73
   7.85         7.27
   8.01         7.58
   7.82         7.27
   7.99         7.50
   7.80         7.23
   7.93         7.41

LineChartDemo1b: http://socr.ucla.edu/htmls/SOCR_Charts.html

Slide 4: Application
QQNormalPlotDemo: http://socr.ucla.edu/htmls/SOCR_Charts.html
To check the assumption of normality (necessary for the t test with such a small sample size in each group), we will construct a normal probability plot for each group.

Slide 5: Application
QQData2DataDemo: http://socr.ucla.edu/htmls/SOCR_Charts.html
Equidistributed samples in the two groups?
Calculate a QQ probability plot of one group against the other.

Slide 6: Application
BoxAndWhiskerChartDemo1: http://socr.ucla.edu/htmls/SOCR_Charts.html
Box plot input, Location 1: 8.1, 7.89, 8, 7.85, 8.01, 7.82, 7.99, 7.8, 7.93
Box plot input, Location 2: 7.85, 7.3, 7.73, 7.27, 7.58, 7.27, 7.5, 7.23, 7.41
• #1 Formulate hypotheses
Ho: μ1 − μ2 = 0 (no difference between the true mean soil pH of locations 1 & 2)
Ha: μ1 − μ2 ≠ 0 (there is a difference between the true mean soil pH of locations 1 & 2)

Slide 7: Application
• #2 Calculate the test statistic
Descriptive Statistics: Location 1, Location 2

Variable     N  N*    Mean  SE Mean   StDev  Minimum      Q1  Median      Q3  Maximum
Location 1   9   0  7.9322   0.0335  0.1005   7.8000  7.8350  7.9300  8.0050   8.1000
Location 2   9   0  7.4600   0.0740  0.2220   7.2300  7.2700  7.4100  7.6550   7.8500

$$SE_{\bar{y}_1 - \bar{y}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{0.1005^2}{9} + \frac{0.222^2}{9}} = 0.081$$

$$t_s = \frac{(\bar{y}_1 - \bar{y}_2) - 0}{SE_{\bar{y}_1 - \bar{y}_2}} = \frac{(7.9322 - 7.460) - 0}{0.081} = 5.827$$

Slide 8: Application
• #3 Calculate the p-value

$$df = \frac{\left(SE_1^2 + SE_2^2\right)^2}{\dfrac{SE_1^4}{n_1 - 1} + \dfrac{SE_2^4}{n_2 - 1}} = \frac{\left(0.0335^2 + 0.074^2\right)^2}{\dfrac{0.0335^4}{9 - 1} + \dfrac{0.074^4}{9 - 1}} = 11.103 \approx 11$$

p < 2(0.0005) = 0.001 (SOCR)

Slide 9: Application
• #4 Conclusion
Because p < 0.001 < 0.05, we reject Ho.
CONCLUSION: These data show a statistically significant difference between the true mean pH of Location 1 and Location 2 (P < 0.001).

Slide 10: Application
• #5 SOCR Analysis: http://www.socr.ucla.edu/htmls/SOCR_Analyses.html

Slide 11: Application
• #5 SOCR Analysis: http://www.socr.ucla.edu/htmls/SOCR_Analyses.html
Result of Two Independent Sample T-Test:
Variable 1 = Location 1
  Sample Size = 8
  Sample Mean = 6.169
  Sample Variance = .386
  Sample SD = .621
Variable 2 = Location 2
  Sample Size = 8
  Sample Mean = 5.291
  Sample Variance = .425
  Sample SD = .652
Degrees of Freedom = 14
Pooled Sample Variance = .405
Pooled Sample SD = .637
T-Statistics (Pooled) = -2.757
One-Sided P-Value (Pooled) = .008
Two-Sided P-Value (Pooled) = .015

Slide 12: Application
Confidence interval for μ1 − μ2
Suppose we calculated a 95% confidence interval to be:

$$(\bar{y}_1 - \bar{y}_2) \pm t(df)_{0.025}\, SE_{\bar{y}_1 - \bar{y}_2} = (7.932 - 7.460) \pm t(11)_{0.025}\,(0.081) = 0.472 \pm 2.201\,(0.081) = (0.294,\ 0.650)$$

Does this interval surprise you?

Slide 13: Application
• Corresponding computer output:
Two-Sample T-Test and CI: Location 1, Location 2
Two-sample T for Location 1 vs Location 2

             N   Mean  StDev  SE Mean
Location 1   9  7.932  0.100    0.033
Location 2   9  7.460  0.222    0.074

Difference = mu (Location 1) - mu (Location 2)
Estimate for difference: 0.472222
95% CI for difference: (0.293459, 0.650985)
T-Test of difference = 0 (vs not =): T-Value = 5.81  P-Value = 0.000  DF = 11

Slide 14: CI and Hypothesis-Testing relationship
• Consider a 95% confidence interval for μ1 − μ2 and its relationship to the t test at α = 0.05.
Both use $\bar{y}_1 - \bar{y}_2$ and $SE_{\bar{y}_1 - \bar{y}_2}$ in their calculations.

CI: $$(\bar{y}_1 - \bar{y}_2) \pm t(df)_{\alpha/2}\, SE_{\bar{y}_1 - \bar{y}_2}$$

Ts: $$t_s = \frac{(\bar{y}_1 - \bar{y}_2) - 0}{SE_{\bar{y}_1 - \bar{y}_2}}$$

Slide 15: CI and Hypothesis-Testing relationship
• With a t test we reject Ho if the p-value is less than α and fail to reject otherwise. This is the same as saying that we reject if t_s is beyond ±t0.025, and fail to reject otherwise.

Slide 16: CI and Hypothesis-Testing relationship
• Focusing on the upper half of the distribution and remembering the symmetry, we fail to reject when

$$|t_s| = \frac{|\bar{y}_1 - \bar{y}_2|}{SE_{\bar{y}_1 - \bar{y}_2}} < t_{0.025}$$

Further manipulation gives us:

$$|\bar{y}_1 - \bar{y}_2| < t_{0.025}\, SE_{\bar{y}_1 - \bar{y}_2}$$
$$-t_{0.025}\, SE_{\bar{y}_1 - \bar{y}_2} < \bar{y}_1 - \bar{y}_2 < t_{0.025}\, SE_{\bar{y}_1 - \bar{y}_2}$$
$$-(\bar{y}_1 - \bar{y}_2) - t_{0.025}\, SE_{\bar{y}_1 - \bar{y}_2} < 0 < -(\bar{y}_1 - \bar{y}_2) + t_{0.025}\, SE_{\bar{y}_1 - \bar{y}_2}$$
$$(\bar{y}_1 - \bar{y}_2) + t_{0.025}\, SE_{\bar{y}_1 - \bar{y}_2} > 0 > (\bar{y}_1 - \bar{y}_2) - t_{0.025}\, SE_{\bar{y}_1 - \bar{y}_2}$$

• Therefore, we fail to reject Ho: μ1 − μ2 = 0 (for the not-equal-to alternative) if the confidence interval contains 0.

Slide 17: CI and Hypothesis-Testing relationship
• If a two-tailed t test and a confidence interval give us the same result, why learn both?
There are advantages to each one:
• Confidence interval: shows the magnitude of the difference between μ1 and μ2.
• T test: has a p-value, which describes the strength of evidence that μ1 and μ2 are really different.

Slide 18: More on the significance level α
• Choose a significance level BEFORE analyzing the data.
Example: Say df = 15 and α = 0.05.
• If t_s is in either tail we will reject Ho. The chance of this happening due to random variation is 0.05. I.e., P(reject Ho) = 0.05, if Ho is true.
• Because we are assuming that Ho is true, all t_s values on the t curve would deviate from 0 only because of sampling error.
• This means:
  95% would fail to reject Ho
  2.5% would reject Ho (−t_s)
  2.5% would reject Ho (t_s)
In other words, a total of 5% would reject Ho when Ho is actually true. This is an incorrect conclusion caused by sampling error alone!

Slide 19: More on the significance level α
• When we are analyzing one data set in real life at the 0.05 level and our conclusion is to reject Ho, there are two possible scenarios:
1. Ho is in fact false
2. Ho is true, but we were unlucky (5%)
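
The claim on slide 18 that about 5% of samples reject Ho even when Ho is true can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: it uses a one-sample t statistic with n = 16 (so df = 15, as in the slide's example), a standard normal population with true mean 0, and the tabled critical value t(15)0.025 = 2.131. The function name type_one_error_rate, the number of trials, and the seed are hypothetical choices, not part of the lecture.

```python
import math
import random


def type_one_error_rate(n=16, t_crit=2.131, trials=20000, seed=0):
    """Estimate P(reject Ho | Ho true) for a two-tailed one-sample t test.

    Assumptions (illustrative only): normal population, true mean 0,
    df = n - 1 = 15, critical value 2.131 taken from a t table.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Ho is true by construction: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
        se = math.sqrt(variance / n)
        t_s = (mean - 0.0) / se
        # Reject when t_s falls in either tail.
        if abs(t_s) > t_crit:
            rejections += 1
    return rejections / trials


rate = type_one_error_rate()
print(f"Estimated P(reject Ho | Ho true) = {rate:.3f}")
```

The printed fraction should land near 0.05, with the rejections split roughly evenly between the two tails; every one of them is a wrong conclusion produced by sampling error alone.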
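
The soil pH calculations from slides 7 to 16 (standard error, test statistic, degrees of freedom, confidence interval, and the link between the interval and the test) can be retraced in a few lines. This is a minimal sketch rather than the SOCR or Minitab procedure: the helper name welch_summary is hypothetical, and the critical value 2.201 is simply copied from slide 12 instead of being computed.

```python
import math
import statistics

# Soil pH observations from slide 3.
location_1 = [8.10, 7.89, 8.00, 7.85, 8.01, 7.82, 7.99, 7.80, 7.93]
location_2 = [7.85, 7.30, 7.73, 7.27, 7.58, 7.27, 7.50, 7.23, 7.41]

T_CRIT = 2.201  # t(11)_0.025 as quoted on slide 12 (assumed, not computed here)


def welch_summary(y1, y2, t_crit):
    """Return (difference, SE, t_s, df, CI) for two independent samples."""
    n1, n2 = len(y1), len(y2)
    diff = statistics.mean(y1) - statistics.mean(y2)
    # Squared standard errors of each sample mean: s^2 / n.
    se1_sq = statistics.variance(y1) / n1
    se2_sq = statistics.variance(y2) / n2
    se = math.sqrt(se1_sq + se2_sq)
    t_s = (diff - 0.0) / se
    # Degrees of freedom formula from slide 8.
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    ci = (diff - t_crit * se, diff + t_crit * se)
    return diff, se, t_s, df, ci


diff, se, t_s, df, ci = welch_summary(location_1, location_2, T_CRIT)

# Slide 16: we reject Ho exactly when the interval does not contain 0.
reject = abs(t_s) > T_CRIT
ci_contains_zero = ci[0] < 0.0 < ci[1]

print(f"difference = {diff:.4f}, SE = {se:.4f}")
print(f"t_s = {t_s:.2f}, df = {df:.2f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"reject Ho: {reject}, CI contains 0: {ci_contains_zero}")
```

Because the sketch works from the unrounded sample statistics, its values can differ from the hand calculation in the last digit or two; the degrees of freedom still truncate to the 11 used on the slides, and the interval agrees with the computer output on slide 13 to three decimals.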