Introduction to Statistics in Psychology
PSY 201
Professor Greg Francis
Lecture 16
Sampling distribution of the mean
Why this course exists.

INFERENTIAL STATISTICS
when we get a set of data it is either from all possible sources (population) or a subset of sources (sample)
with inferential statistics we take a random sample and try to infer something about the population
we want to do two things
1. test hypotheses about parameters (measures of the population).
2. estimate parameters.

INFERENTIAL STATISTICS
to estimate a parameter ($\mu$ with $\bar{X}$, $\sigma$ with $s$, ...)
1. Sample randomly. Calculate the estimate.
2. Compare the estimate to an underlying distribution of estimates from other samples.
3. Consider probability associated with outcomes with random sampling and make inferences.

SAMPLING
suppose we have a population with a mean $\mu$ and a standard deviation $\sigma$
suppose we take a sample from the population and calculate a sample mean $\bar{X}_1$
suppose we take a different sample from the population and calculate a sample mean $\bar{X}_2$
suppose we take a different sample from the population and calculate a sample mean $\bar{X}_3$

DISTRIBUTION
the different $\bar{X}_i$ sample means that are calculated will be related to each other because they all come from the same population, which has a population mean of $\mu$
we can consider a distribution of the sample means
(same idea as distribution of sum of dice rolls)
[Figure: distribution of sample means; x-axis: Sample Mean (0 to 120), y-axis: Frequency]

DISTRIBUTION
this distribution involves frequencies of means rather than frequencies of scores
for most of inferential statistics we do not deal with the frequency distribution of scores
A sampling distribution is the distribution of values of the statistic under consideration, from all possible samples of a given size.
currently, the statistic is the sample mean $\bar{X}$

SAMPLING DISTRIBUTION
how do we get the sampling distribution?
e.g. suppose you have a population of 5 people with math scores and you take sample sizes of 3
you must consider every possible group of 3 people from the population
turns out there are 10 such groups
NOTE: the number of samples is greater than the size of the population!
For a population of size 50 with samples of size 30 there are 47,129,212,243,960 such groups

CENTRAL LIMIT THEOREM
fortunately, there are theorems that tell us what the distribution will look like
as the sample size ($n$) increases, the sampling distribution of the mean for simple random samples of $n$ cases, taken from a population with a mean equal to $\mu$ and a finite variance equal to $\sigma^2$, approximates a normal distribution
another theorem based on unbiased estimation tells us that the mean of the sampling distribution is $\mu$

STANDARD ERROR
theorems on unbiased estimates also give us the sampling distribution variance and standard deviation
denote the sampling distribution variance as $\sigma^2_{\bar{X}}$
it turns out that
$$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$$
where
• $\sigma^2$ = variance in the population
• $n$ = size of sample

STANDARD ERROR
of course the standard deviation of the sampling distribution is the square root of the variance
$$\sigma_{\bar{X}} = \sqrt{\sigma^2_{\bar{X}}}$$
or
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
also called the standard error of the mean

WHY BOTHER?
suppose you know that for a population, $\mu = 455$ and $\sigma = 100$ (an example involving SAT scores)
then we know the following about a sampling distribution involving sample sizes of 144 students
1. The distribution is (approximately) normal.
2. The mean of the distribution is 455.
3. The standard error of the mean is $100/\sqrt{144} = 8.33$.

WHY BOTHER?
this is something we can work with!
[Figure: sampling distribution of the mean; x-axis: Sample Mean (400 to 500)]
convert sample mean values to z-scores
calculate percentages, proportions

PROBABILITY
we can answer questions like
what is the probability of randomly selecting a sample with a mean $\bar{X}$ such that $440 < \bar{X} < 460$?
area under the curve
[Figure: sampling distribution of the mean; x-axis: Sample Mean (400 to 500)]

PROBABILITY
everything is just like before
convert raw scores (sample means) to z-scores
$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$
$$z_{440} = \frac{440 - 455}{8.33} = -1.80$$
$$z_{460} = \frac{460 - 455}{8.33} = 0.60$$
From the standard normal table, we can calculate that the area between those z-scores is 0.6898

SAMPLING DISTRIBUTION
the sampling distribution has two critical properties
1. As sample size ($n$) increases, the sampling distribution of the mean becomes more like the normal distribution in shape, even when the population distribution is not normal.
2. As the sample size ($n$) increases, the variability of the sampling distribution of the mean decreases (the standard error decreases).

SHAPE
with large sample sizes, all sampling distributions of the mean look like normal distributions
means the conclusions we draw from sampling distributions are not dependent on the shape of the population distribution!
a remarkable result that is due to the central limit theorem

VARIABILITY
from our calculation of standard error:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
we see that increasing $n$ makes for smaller values of $\sigma_{\bar{X}}$
e.g. for $n = 144$ in our previous example $\sigma_{\bar{X}} = 8.33$
[Figure: sampling distribution of the mean for n = 144; x-axis: Sample Mean (400 to 500)]

VARIABILITY
but if $n = 20$,
$$\sigma_{\bar{X}} = \frac{100}{\sqrt{20}} = 22.36$$
compare to the 8.33 with $n = 144$
[Figure: sampling distribution of the mean for n = 20; x-axis: Sample Mean (400 to 500)]

VARIABILITY
OR if $n = 1000$,
$$\sigma_{\bar{X}} = \frac{100}{\sqrt{1000}} = 3.16$$
compare to the 8.33 with $n = 144$
[Figure: sampling distribution of the mean for n = 1000; x-axis: Sample Mean (400 to 500)]

VARIABILITY
increasing the sample size decreases the variability of sample means
makes sense if you think about it
[Figure: sampling distributions of the mean; x-axis: Sample Mean (400 to 500)]

SAMPLING
to use the sampling distribution like we want to, we must have random samples
without random sampling, our calculations about probability of sample means are not valid
(this will get more important later)
lots of methods of sampling that emphasize different aspects of the data

TYPES OF SAMPLING
more detail in PSY 203: Experimental Methods.
• simple random sampling
• systematic sampling
• cluster sampling
• stratified random sampling

WHY STATISTICS WORKS
we have two ways of finding the sampling distribution of the mean
1. gather lots of samples, calculate means and standard deviations (virtually impossible)
2. calculate mean and standard deviation of the population, use central limit theorem (relatively easy)
the central limit theorem allows us to do inferential statistics; without it, much of this course would not exist
(actually there is one other way to do statistics...)

CONCLUSIONS
sampling distribution looks like a normal distribution
methods of calculating mean and standard deviation if $\mu$ and $\sigma$ are known
samples must be randomly selected

NEXT TIME
using sampling distributions
evidence that the theorems work
and
Marvel at my predictive
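
The "every possible group of 3 people" idea from the SAMPLING DISTRIBUTION slide can be sketched in a few lines of Python. This is only an illustration: the five math scores below are hypothetical (the lecture does not list them), and the variable names are made up for the example. It enumerates all samples of size 3, confirms there are 10 of them, checks that the mean of the sample means equals the population mean, and reproduces the count quoted for a population of 50 with samples of 30.

```python
from itertools import combinations
from math import comb
from statistics import mean

# Hypothetical math scores for a population of 5 people (not from the lecture).
population = [70, 75, 80, 85, 90]
n = 3  # sample size

# Every possible group of n people, and the mean of each group.
samples = list(combinations(population, n))
sample_means = [mean(s) for s in samples]

num_samples = len(samples)          # number of distinct samples of size 3
mean_of_means = mean(sample_means)  # mean of the sampling distribution
population_mean = mean(population)  # mu

# Count of samples of size 30 from a population of size 50.
big_count = comb(50, 30)

print("number of samples:", num_samples)
print("mean of sample means:", mean_of_means)
print("population mean:", population_mean)
print("samples of 30 from 50:", f"{big_count:,}")
```

Enumeration like this is only practical for tiny populations, which is why the lecture turns to theorems about the shape, mean, and variance of the sampling distribution.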
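
The SAT example from the WHY BOTHER?, PROBABILITY, and VARIABILITY slides can be checked numerically. The sketch below assumes the sampling distribution is treated as exactly normal, and it uses Python's `statistics.NormalDist` in place of the standard normal table; the helper name `standard_error` is invented for this illustration.

```python
from math import sqrt
from statistics import NormalDist

# Population values from the lecture's SAT example.
mu = 455
sigma = 100


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


# Standard errors for the three sample sizes discussed in the slides.
se_20 = standard_error(sigma, 20)
se_144 = standard_error(sigma, 144)
se_1000 = standard_error(sigma, 1000)

# z-scores for sample means of 440 and 460 when n = 144.
z_440 = (440 - mu) / se_144
z_460 = (460 - mu) / se_144

# Area under the standard normal curve between the two z-scores.
std_normal = NormalDist()  # mean 0, standard deviation 1
p_between = std_normal.cdf(z_460) - std_normal.cdf(z_440)

print(f"standard error, n=20:   {se_20:.2f}")
print(f"standard error, n=144:  {se_144:.2f}")
print(f"standard error, n=1000: {se_1000:.2f}")
print(f"z(440) = {z_440:.2f}, z(460) = {z_460:.2f}")
print(f"P(440 < sample mean < 460) = {p_between:.4f}")
```

The printed values should agree with the slides to the precision shown there; small differences can appear if the standard error is rounded to 8.33 before the z-scores are computed.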



Purdue PSY 20100 - Lecture notes
