PSY 394U: Do It Yourself Statistics

Chapter 4: One Sample Tests

In this chapter we will follow up with some more concrete examples based upon the concepts introduced in the last chapter. We will learn how to determine whether a descriptive statistic (such as the median, mean, standard deviation, etc.) is or is not consistent with a predicted value. We will also learn how to determine the range of values over which we can expect a statistic to vary, that is, how to determine the sampling distribution. In fact, these are really two sides of the same coin.

In order to use Monte Carlo and bootstrapping methods, we are going to need to know how to probe our sampling distributions for some information. The three most common types of information we extract from a sampling distribution are:

1) the standard deviation (which is the standard error of the mean under traditional methods),
2) 95% confidence intervals, and
3) the probability of some particular value arising from the conditions that generated the sampling distribution (a significance level).

These three are tightly coupled and can be thought of as different ways of expressing the same basic information.

Going to the movies

Consider the data shown in Figure 4.1, which shows a histogram of how many people (out of a sample of 100) have seen x movies in the past week. These data (moviewatch.txt) are available on the webpage; download them and work through the examples in this chapter. The data are highly skewed because the vast majority of people (70%) have seen either 0 or 1 movie in the past week, and 7 people in the sample must be movie critics, having seen 7 or more movies per week.

Figure 4.1: The number of people reporting having seen x movies last week vs. x, the number of movies.

Let's say we wanted to find out if people typically watched more than one movie per week. We could tackle this problem a couple of different ways. First, let's take a bad but very easy approach: let us test whether the mean number of movies watched was larger than 1. The mean of our moviewatch data is 1.59, and traditional statistics using the Central Limit Theorem (CLT) can easily tell us how likely it is that this mean comes from a sampling distribution that is truly centered around 1.

With the MATLAB Statistics Toolbox (or any off-the-shelf statistical software) we can simply do a one-sample t-test of the hypothesis that our measured mean is significantly greater than 1, without the need to understand the underlying principles. We type

    [h, p, ci, stats] = ttest(moviewatch, 1, .05, 'right')

and then look at the output:

    h =
         1
    p =
        0.0092
    ci =
        1.1812       Inf
    stats =
        tstat: 2.3962
           df: 99
           sd: 2.4622

If you remember what a t-test is about, this should be fairly clear even if you are new to MATLAB. If you are rusty on the t-test, however, what the above command is saying is "test to see if the mean of moviewatch is greater than a mean of 1.0". What the output is telling us is that if the true mean were 1 movie per week, and the data were distributed normally, and we were willing to accept the mean as a good measure of central tendency for these data, then there is about a 1% chance (the p-value of 0.0092) that we would have seen a mean as large as or larger than the one we actually obtained. Note that this is below two of the common cut-off values for statistical significance (0.01 and 0.05).

A more do-it-yourself approach (but one still reliant on the above assumptions) is the following. First, we compute the standard deviation of the data and then use it to compute the expected standard error of the sampling distribution of the mean using the Central Limit Theorem:

    mymean = mean(moviewatch);    % compute the mean
    myn = length(moviewatch);     % number of samples
    mysd = std(moviewatch);       % the standard deviation
    myse = mysd/sqrt(myn);        % the standard error by CLT

Now we can picture what the sampling distribution of the mean should look like: we just need to draw a Gaussian distribution whose mean is our measured mean (1.590) and whose standard deviation is the standard error we just computed (0.246). We also know that around 95% of the distribution should fall within 2 standard errors of the mean, that is, between about 1.098 and 2.082. This gives us a way to check our drawing.

    xvals = linspace(0, 3);                        % make an x axis
    distofmeans = normpdf(xvals, mymean, myse);    % normal dist.
    figure; plot(xvals, distofmeans);              % plot it
    % and draw a dashed line at x = 1 for reference
    line([1 1], [0 max(distofmeans)], 'LineStyle', '--');

The result is shown in Figure 4.2, and should look very much like what you get when you enter the above commands. Notice that this analysis gives us qualitatively the same result as the traditional t-test: it looks fairly unlikely that our measured mean (1.59) and a mean of 1 belong to the same distribution. To be more quantitative about this, we could compute the area of our sampling distribution that lies below a mean of 1:

    normcdf(1, mymean, myse)
    ans =
        0.0083

And this gives us about a 1% chance of seeing a mean as small as or smaller than 1 given that the true mean is equal to 1.59, our measured mean. Notice that we've asked the mirror-image question from the traditional t-test ("could a value of 1 come from a distribution centered on 1.59?" vs. "could a value of 1.59 come from a distribution centered on 1?"), but it amounts to the same thing, and we get the same answer when we assume the same standard error about these numbers. [1]

Figure 4.2: The sampling distribution of the mean for the number of movies per week, by the Central Limit Theorem. The dashed line shows that an average of 1 movie per week is highly unlikely.

[1] The small discrepancy (0.0083 vs. 0.0092) comes from the fact that the t-test uses Gosset's (i.e., Student's) t distribution rather than the standard normal distribution; the t distribution is the technically correct choice when the population variance is estimated from a sample variance. The difference is negligible for large (n > 30) sample sizes.

Alternatively, we can report our mean value with its 95% confidence interval. To compute the confidence interval under the Central Limit Theorem, we can use the inverse normal cumulative distribution function (norminv), or use the ci value from the ttest function.

    ci = norminv([.025 .975], mymean, myse)
    ci =
        1.1074    2.0726

In English, this function call says "Give me the 2.5 and 97.5 percentiles of a normal distribution whose mean is the same as the mean of my distribution and whose standard deviation is the same as the standard error of my distribution." Note that the 95% confidence interval does not include 1 and thus …
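
As a cross-check that does not need the moviewatch.txt file, the sketch below recomputes the quantities above from the summary statistics quoted in the text (mean 1.59, standard deviation 2.4622, n = 100). It is only an illustration under stated assumptions: the inputs are rounded values copied from the text, the variable names mu0, df, p_t, ci_t_lower, p_norm and ci_norm are made up here, and it assumes the Statistics Toolbox functions tcdf, tinv, normcdf and norminv are available.

```matlab
% Summary statistics as quoted in the text (rounded, so results are approximate)
mymean = 1.59;      % sample mean of moviewatch
mysd   = 2.4622;    % sample standard deviation (stats.sd from ttest)
myn    = 100;       % number of samples
mu0    = 1;         % hypothesized mean, movies per week (illustrative name)

myse = mysd / sqrt(myn);                      % standard error by CLT

% Right-tailed one-sample t-test done by hand
tstat = (mymean - mu0) / myse;                % t statistic
df    = myn - 1;                              % degrees of freedom
p_t   = 1 - tcdf(tstat, df);                  % chance of a t this large or larger
ci_t_lower = mymean - tinv(0.95, df) * myse;  % lower bound of the one-sided 95% CI

% Normal-approximation versions used in the do-it-yourself approach
p_norm  = normcdf(mu0, mymean, myse);         % area of the distribution below 1
ci_norm = norminv([.025 .975], mymean, myse); % two-sided 95% CI

fprintf('t = %.4f, p (t dist) = %.4f, p (normal) = %.4f\n', tstat, p_t, p_norm);
fprintf('one-sided t CI lower bound = %.4f\n', ci_t_lower);
fprintf('normal 95%% CI = [%.4f %.4f]\n', ci_norm(1), ci_norm(2));
```

Up to rounding of the inputs, these values should agree with the ttest, normcdf and norminv outputs quoted above. The gap between p_t and p_norm is the t-versus-normal discrepancy described in footnote 1.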


UT PSY 394U - Chapter 4 One Sample Tests
