UCLA STATS 10 - Significance Testing

Slide 1: UCLA STAT 10 - Introduction to Statistical Reasoning
Instructor: Ivo Dinov, Asst. Prof. in Statistics and Neurology
Teaching Assistants: Yan Xiong, Will Anderson
UCLA Statistics, University of California, Los Angeles, Winter 2002
http://www.stat.ucla.edu/~dinov/

Slide 2: Chapter 26: Significance Testing - Using Data to Test Hypotheses
- Getting started
- What do we test? Types of hypotheses
- Measuring the evidence against the null
- Hypothesis testing as decision making
- Why tests should be supplemented by intervals

Slide 3: ESP (extrasensory perception) or just guessing?
A deck with equal numbers of Zener/Rhine card types was used; n = 60,000 random draws resulted in 12,489 correct guesses. Can sampling variation alone account for Pratt & Woodruff's success rate of 20.82% correct vs. the 20% expected under pure guessing?
[Figure: sample proportions from 7 "just-guessing" games, plotted against the true value for just guessing (0.200) and Pratt & Woodruff's proportion (0.2082).]

Slide 4: ESP or just guessing?
[Figure: sample proportions from 400 "just-guessing" experiments, again compared with the true value for just guessing (0.200) and Pratt & Woodruff's proportion.]

Slide 5: Was Cavendish's experiment biased?
A number of famous early experiments measuring physical constants have later been shown to be biased. For the mean density of the Earth, the true value is 5.517. Cavendish's data (from a previous example):
5.36, 5.29, 5.58, 5.65, 5.57, 5.53, 5.62, 5.29, 5.44, 5.34, 5.79, 5.10, 5.27, 5.39, 5.42, 5.47, 5.63, 5.34, 5.46, 5.30, 5.75, 5.68, 5.85
n = 23, sample mean = 5.483, sample SD = 0.1904

Slide 6: Was Cavendish's experiment biased?
Simulate taking 400 sets of 23 measurements from N(5.517, 0.1904) and plot the resulting sample means. The Cavendish mean (5.483) lies about 0.034 below the true value (5.517), and 21.5% of the simulated means were smaller than the Cavendish mean. Are the Cavendish values unusually different from the true mean?

Slide 7: Cavendish: measuring distances in standard errors
Figure 9.1.3: sample t-values from 400 unbiased experiments (each t-value is the distance between the sample mean and 5.517 in standard errors). The Cavendish t-value is -0.844, and 20.5% of samples had t-values smaller than this, so the Cavendish data lie within the central 60% of the distribution.

Slide 8: Figure 9.1.4: Student(df = 22) density, with tail area 0.204 on each side beyond t = ±0.844.

Slide 9: Measuring the distance between the true value and the estimate in terms of the SE
- Intuitive criterion: an estimate is credible if it is not far from its hypothesized true value.
- But how far is "far away"?
- Compute the distance in standardized terms:

      T = (estimator - true parameter value) / SE

- The reason is that the distribution of T is known in some cases (Student's t, or N(0,1)). The estimate (observed value) is typical or atypical according to whether it falls near the center or in the tail of that distribution.

Slide 10: Comparing CIs and significance tests
- These are different methods for coping with the uncertainty about the true value of a parameter caused by sampling variation in estimates.
- Confidence interval: a fixed level of confidence is chosen. We determine the range of possible values for the parameter that are consistent with the data (at the chosen confidence level).
- Significance test: only one possible value for the parameter, called the hypothesized value, is tested. We determine the strength of the evidence (confidence) provided by the data against the proposition that the hypothesized value is the true value.

Slide 11: Review
- What intuitive criterion did we use to determine whether the hypothesized parameter value (p = 0.2 in the ESP example, µ = 5.517 in the Earth-density example) was credible in light of the data? (Determine whether the data-driven parameter estimate is consistent with the pattern of variation we would expect to see if the hypothesis were true. If the hypothesized value is correct, our estimate should not be far from it.)
- Why was µ = 5.517 credible in Example 2, whereas p = 0.2 was not credible in Example 1? (The first estimate is consistent, and the second is not, with the pattern of variation of the hypothesized true process.)

Slide 12: Review
- What do t0-values tell us? (Whether our estimate is typical or atypical, i.e., consistent or inconsistent with our hypothesis.)
- What is the essential difference between the information provided by a confidence interval (CI) and by a significance test (ST)? (Both quantify uncertainty. A CI fixes the level of confidence and determines the range of possible parameter values; an ST fixes one possible value and determines the level of confidence.)

Slide 13: Guiding principles
Hypotheses: we cannot rule in a hypothesized value for a parameter; we can only determine whether there is evidence to rule out a hypothesized value. The null hypothesis tested is typically a skeptical reaction to a research hypothesis.

Slide 14: Comments
- Why can't we prove (rule in) that a hypothesized value of a parameter is exactly true? (Because estimates constructed from data always carry sampling error, and possibly non-sampling error, which affect the result. Even repeating the 60,000-trial ESP test, we would get estimates like 0.2, 0.200001, 0.199999, and so on, none of which need equal the theoretically correct value, 0.2, exactly.)
- Why use the rule-out principle? (Since we cannot use the rule-in method, we instead look for compelling evidence against the hypothesized value, in order to reject it.)
- Why are the null hypothesis and significance testing typically used? (H0 is the skeptical reaction to a research hypothesis; the significance test checks whether differences or effects seen in the data can be explained simply in terms of sampling variation.)

Slide 15: Comments
- How can researchers try to demonstrate that effects or differences seen in their data are real? (Reject the hypothesis that there are no effects.)
- How does the alternative hypothesis typically relate to a belief, hunch, or research hypothesis that initiates a study? (H1 = Ha specifies the type of departure from the null hypothesis H0 (the skeptical reaction) that we expect: the research hypothesis itself.)
- In Cavendish's mean Earth-density data,
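Slide 3's question, whether sampling variation alone can account for 12,489 correct guesses in 60,000 draws, can also be checked numerically. Below is a minimal sketch using the normal approximation to the sampling distribution of a proportion; the slides argue via simulation instead, so the approximation here is an added illustration, not the slides' method:

```python
import math

# Numbers from Slide 3 (Pratt & Woodruff ESP data): 60,000 draws,
# 12,489 correct, chance success rate 0.2 under pure guessing.
n, correct, p0 = 60_000, 12_489, 0.2

p_hat = correct / n                    # observed proportion, about 0.2082
se = math.sqrt(p0 * (1 - p0) / n)      # SE of a proportion under the null
z = (p_hat - p0) / se                  # distance from 0.2 in standard errors

print(f"p_hat = {p_hat:.4f}, SE = {se:.5f}, z = {z:.2f}")
```

The standardized distance comes out near 5 standard errors, so an excess this large would essentially never arise from guessing alone, which is why p = 0.2 is not credible for these data.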

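The simulation behind Slides 6-7 (400 unbiased experiments of 23 measurements each, drawn from N(5.517, 0.1904)) can be sketched as follows. The fraction obtained will differ somewhat from the slides' 21.5%/20.5% because the draws are random; likewise, the slides report t = -0.844, apparently from a slightly more precise sample mean than the rounded 5.483 used here, which gives about -0.856:

```python
import random
import statistics

random.seed(0)  # reproducible draws
mu0, sigma, n = 5.517, 0.1904, 23   # hypothesized mean, SD, sample size (Slide 5)
xbar_obs = 5.483                    # Cavendish's sample mean

# 400 "unbiased experiments": sets of 23 measurements from N(5.517, 0.1904),
# as in the Slide 6 simulation; record each experiment's sample mean.
means = [statistics.mean(random.gauss(mu0, sigma) for _ in range(n))
         for _ in range(400)]
frac_smaller = sum(m < xbar_obs for m in means) / len(means)

# Observed t-value: distance of the Cavendish mean from 5.517 in standard errors.
se = sigma / n ** 0.5
t_obs = (xbar_obs - mu0) / se

print(f"fraction of simulated means below {xbar_obs}: {frac_smaller:.3f}")
print(f"Cavendish t-value: {t_obs:.3f}")
```

Roughly a fifth of unbiased experiments produce a mean at least as far below 5.517 as Cavendish's, which is the sense in which his data lie comfortably within the central part of the distribution.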

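Slide 10's contrast between a confidence interval and a significance test can be made concrete with the Cavendish summary statistics. This is a sketch of the two calculations side by side; the critical value 2.074 for Student's t with 22 degrees of freedom is taken from standard tables rather than computed:

```python
# Cavendish summary statistics from Slide 5 (n = 23, mean = 5.483,
# SD = 0.1904) and the hypothesized true Earth density 5.517.
n, xbar, s, mu0 = 23, 5.483, 0.1904, 5.517

se = s / n ** 0.5
t_crit = 2.074  # two-sided 95% critical value of Student's t, df = 22

# Confidence interval: fix the confidence level, get a range of
# parameter values consistent with the data.
ci = (xbar - t_crit * se, xbar + t_crit * se)

# Significance test: fix one hypothesized value, measure the evidence
# against it in standard errors.
t_stat = (xbar - mu0) / se

print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"t statistic for H0: mu = 5.517 -> {t_stat:.3f}")
```

Here 5.517 lies inside the interval and |t| < 2.074, so the two views agree: the hypothesized value is credible given the data, exactly as the Review slides conclude.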