Introduction to Statistics in Psychology
PSY 201
Professor Greg Francis
Lecture 16
Sampling distribution of the mean
Why this course exists.

INFERENTIAL STATISTICS

When we get a set of data, it is either from all possible sources (population) or from a subset of sources (sample).
With inferential statistics we take a random sample and try to infer something about the population.
We want to do two things:
1. Test hypotheses about parameters (measures of the population).
2. Estimate parameters.

To estimate a parameter: $\mu$ with $\bar{X}$, $\sigma$ with $s$.
1. Sample randomly. Calculate the estimate.
2. Compare the estimate to an underlying distribution of estimates from other samples.
3. Consider the probability associated with outcomes under random sampling, and make inferences.

SAMPLING

Suppose we have a population with a mean $\mu$ and a standard deviation $\sigma$.
Suppose we take a sample from the population and calculate a sample mean $\bar{X}_1$.
Suppose we take a different sample from the population and calculate a sample mean $\bar{X}_2$.
Suppose we take a different sample from the population and calculate a sample mean $\bar{X}_3$.

DISTRIBUTION

The different sample means $\bar{X}_i$ that are calculated will be related to each other because they all come from the same population, which has a population mean of $\mu$.
We can consider a distribution of the sample means (same idea as the distribution of the sum of dice rolls).

[Figure: frequency distribution of sample means; x-axis: Sample Mean, y-axis: Frequency]

This distribution involves frequencies of means rather than frequencies of scores.
For most of inferential statistics we do not deal with the frequency distribution of scores.
A sampling distribution is the distribution of values of the statistic under consideration, from all possible samples of a given size.
Currently the statistic is the sample mean $\bar{X}$.

SAMPLING DISTRIBUTION

How do we get the sampling distribution?
E.g., suppose you have a population of 5 people with math scores and you take samples of size 3.
You must consider every possible group of 3 people from the population. It turns out there are 10 such groups.
NOTE: the number of samples is greater than the size of the population!
For a population of size 50 with samples of size 30, there are 47,129,212,243,960 such groups.

CENTRAL LIMIT THEOREM

Fortunately, there are theorems that tell us what the distribution will look like.
As the sample size $n$ increases, the sampling distribution of the mean for simple random samples of $n$ cases, taken from a population with a mean equal to $\mu$ and a finite variance equal to $\sigma^2$, approximates a normal distribution.
Another theorem (based on unbiased estimation) tells us that the mean of the sampling distribution is $\mu$.

STANDARD ERROR

Theorems on unbiased estimates also give us the sampling distribution variance (and standard deviation).
Denote the sampling distribution variance as $\sigma^2_{\bar{X}}$. It turns out that

$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$

where $\sigma^2$ = variance in the population and $n$ = size of sample.

Of course, the standard deviation of the sampling distribution is the square root of the variance:

$\sigma_{\bar{X}} = \sqrt{\sigma^2_{\bar{X}}}$ or $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

This is also called the standard error of the mean.

WHY BOTHER

Suppose you know that for a population $\mu = 455$ and $\sigma = 100$ (an example involving SAT scores).
Then we know the following about a sampling distribution involving sample sizes of 144 students:
1. The distribution is normal.
2. The mean of the distribution is 455.
3. The standard error of the mean is $100 / \sqrt{144} = 8.33$.

This is something we can work with!

[Figure: normal sampling distribution of the mean; x-axis: Sample Mean, 400 to 500]

Convert sample mean values to z scores.
Calculate percentages (proportions).

PROBABILITY

We can answer questions like: what is the probability of randomly selecting a sample with a mean $\bar{X}$ such that $440 \le \bar{X} \le 460$?
The answer is an area under the curve.

[Figure: normal sampling distribution of the mean with the area between 440 and 460 marked; x-axis: Sample Mean, 400 to 500]

Everything is just like before. Convert raw scores (sample means) to z scores:

$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$

$z_{440} = \frac{440 - 455}{8.33} = -1.80$

$z_{460} = \frac{460 - 455}{8.33} = 0.60$

From the standard normal table we can calculate that the area between those z scores is 0.6898.

SAMPLING DISTRIBUTION

The sampling distribution has two critical properties:
1. As the sample size $n$ increases, the sampling distribution of the mean becomes more like the normal distribution in shape, even when the population distribution is not normal.
2. As the sample size $n$ increases, the variability of the sampling distribution of the mean decreases (the standard error decreases).

SHAPE

With large sample sizes, all sampling distributions look like normal distributions.
This means the conclusions we draw from sampling distributions are not dependent on the shape of the population distribution!
A remarkable result that is due to the central limit theorem.

VARIABILITY

From our calculation of the standard error, $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$, we see that increasing $n$ makes for smaller values of $\sigma_{\bar{X}}$.
E.g., for $n = 144$ in our previous example, $\sigma_{\bar{X}} = 8.33$.

[Figure: sampling distribution of the mean for n = 144; x-axis: Sample Mean, 400 to 500]

But if $n = 20$, $\sigma_{\bar{X}} = \frac{100}{\sqrt{20}} = 22.36$ (compare to the 8.33 with $n = 144$).

[Figure: sampling distribution of the mean for n = 20; x-axis: Sample Mean, 400 to 500]

Or if $n = 1000$, $\sigma_{\bar{X}} = \frac{100}{\sqrt{1000}} = 3.16$ (compare to the 8.33 with $n = 144$).

[Figure: sampling distribution of the mean for n = 1000; x-axis: Sample Mean, 400 to 500]

Increasing the sample size decreases the variability of sample means.
Makes sense if you think about it.
This will get more important later.

[Figure: two sampling distributions of the mean shown for comparison; x-axis: Sample Mean, 400 to 500]

SAMPLING

To use the sampling distribution like we want to, we must have random samples.
Without random sampling, our calculations about the probability of sample means are not valid.
There are lots of methods of sampling that emphasize different aspects of the data.

TYPES OF SAMPLING

More detail in PSY 203 Experimental Methods.
simple random sampling
systematic sampling
cluster sampling
stratified random sampling

WHY STATISTICS WORKS

We have two ways of finding the sampling distribution of the mean:
1. Gather lots of samples, calculate means and standard deviations: virtually impossible.
2. Calculate the mean and standard deviation of the population, use the central limit theorem: relatively easy.
The central limit theorem allows us to do inferential statistics. Without it, much of this course would not exist.
(Actually, there is one other way to do statistics.)

CONCLUSIONS

The sampling distribution looks like a normal distribution.
There are methods of calculating its mean and standard deviation if $\mu$ and $\sigma$ are known.
Samples must be randomly selected.

NEXT TIME

Using sampling distributions.
Evidence that the theorems work.
And: Marvel at my predictive powers.
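
The arithmetic in the SAT example can be checked with a few lines of Python. This is only a sketch, not part of the lecture: the helper names `standard_error` and `prob_mean_between` are made up for illustration, and the standard library's `NormalDist` stands in for the standard normal table.

```python
from math import comb, sqrt
from statistics import NormalDist

MU = 455.0     # population mean from the SAT example
SIGMA = 100.0  # population standard deviation from the SAT example


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_mean_between(low: float, high: float, mu: float, sigma: float, n: int) -> float:
    """P(low <= sample mean <= high), treating the sampling distribution as normal."""
    sampling_dist = NormalDist(mu, standard_error(sigma, n))
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)


# Number of possible samples: 3 out of 5 people, and 30 out of 50.
print(comb(5, 3), comb(50, 30))  # 10 47129212243960

# Standard error for the three sample sizes discussed in the lecture.
for n in (20, 144, 1000):
    print(n, round(standard_error(SIGMA, n), 2))  # 22.36, 8.33, 3.16

# z scores for sample means of 440 and 460 when n = 144.
se = standard_error(SIGMA, 144)
print(round((440 - MU) / se, 2), round((460 - MU) / se, 2))  # -1.8 0.6

# Probability that a random sample of 144 has a mean between 440 and 460.
print(round(prob_mean_between(440, 460, MU, SIGMA, 144), 4))  # 0.6898
```

The code uses the unrounded standard error (8.333...), which gives the same z scores and area as the rounded table-based values here.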

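
The two critical properties of the sampling distribution (shape and variability) can also be explored by simulation. The sketch below is not from the lecture. It assumes a hypothetical, strongly skewed population (exponential, with a mean near 100), and the population size, number of samples, and seed are all arbitrary. It draws many simple random samples and compares the standard deviation of the sample means with $\sigma / \sqrt{n}$.

```python
import random
from statistics import fmean, pstdev


def simulate_sample_means(population, n, num_samples, rng):
    """Draw num_samples simple random samples of size n and return their means."""
    return [fmean(rng.sample(population, n)) for _ in range(num_samples)]


rng = random.Random(201)  # fixed seed so the run is repeatable (arbitrary choice)

# Hypothetical non-normal population: exponential scores with a mean near 100.
population = [rng.expovariate(1 / 100) for _ in range(50_000)]
mu = fmean(population)
sigma = pstdev(population)
print("population mean and sd:", round(mu, 1), round(sigma, 1))

for n in (5, 30, 200):
    means = simulate_sample_means(population, n, 2_000, rng)
    # The mean of the sample means should sit near mu, and their standard
    # deviation should sit near the theoretical standard error sigma / sqrt(n).
    print(
        "n =", n,
        "| mean of means:", round(fmean(means), 1),
        "| sd of means:", round(pstdev(means), 2),
        "| sigma / sqrt(n):", round(sigma / n ** 0.5, 2),
    )
```

Plotting a histogram of `means` for each `n` would show the shape property as well: the histogram should look more normal as `n` grows, even though the population itself is skewed.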

Purdue PSY 20100 - Lecture notes
