TOPIC (8) – SAMPLING VARIABILITY AND SAMPLING DISTRIBUTIONS

Recall that we typically cannot census the entire population of interest, so we take a sample from that population in order to make estimates and draw conclusions about the population. The sample mean x̄ is the estimator of the unknown population mean μ. Similarly, the sample standard deviation s is the estimator of the unknown population standard deviation σ.

Numerical summaries describing sample data are called STATISTICS.
Numerical summaries describing population data are called PARAMETERS.

                            Statistics    Parameters
Mean                        x̄             μ
Standard Deviation          s             σ
Coefficient of Variation    CV            CV
Median                      m             M
Correlation Coefficient     r             ρ
Proportion of Successes     p             π

Any statistic that can be calculated from a sample will vary from sample to sample. As a result, these statistics have “sampling variability”, which implies that they have probability distributions, known as sampling distributions.

A) SAMPLING DISTRIBUTION of the Sample Mean x̄

Important Point: The value of x̄ will vary with each sample taken from the population.

EXAMPLE
Suppose we had a very small population of 5 units with X-values {2, 4, 8, 10, 14}. Here, μ = 7.6 and σ ≈ 4.27 (the population standard deviation, dividing by N = 5).
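The two population values can be checked with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and it assumes σ is the population standard deviation (dividing by N rather than N − 1), which evaluates to about 4.27 for these five values.

```python
from statistics import mean, pstdev

# The five X-values of the small example population.
population = [2, 4, 8, 10, 14]

mu = mean(population)       # population mean
sigma = pstdev(population)  # population standard deviation (divides by N)

print(f"mu = {mu:.4f}, sigma = {sigma:.4f}")
```

Using `stdev` instead of `pstdev` would divide by N − 1 and give a larger value, which is the estimator s rather than the parameter σ.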
What is the frequency distribution of the sample mean x̄ based on a random sample of 2 units?

[Figure: histogram of the sample means for n = 2; horizontal axis from 0 to 15, vertical axis: Count]

Sample      x̄       Sample      x̄       Sample      x̄
(2, 4)      3       (4, 2)      3       (2, 2)      2
(2, 8)      5       (8, 2)      5       (4, 4)      4
(2, 10)     6       (10, 2)     6       (8, 8)      8
(2, 14)     8       (14, 2)     8       (10, 10)    10
(4, 8)      6       (8, 4)      6       (14, 14)    14
(4, 10)     7       (10, 4)     7
(4, 14)     9       (14, 4)     9
(8, 10)     9       (10, 8)     9
(8, 14)     11      (14, 8)     11
(10, 14)    12      (14, 10)    12

Mean of x̄: $\mu_{\bar{x}} = \frac{1}{25}(3 + \dots + 14) = 7.6$

Std. Deviation of x̄: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{2}} = \frac{4.27}{\sqrt{2}} \approx 3.02$

We can think of the list of sample means x̄ as a population!

Some Things To Note About The Behavior Of Sample Means:

1) x̄ varies from sample to sample (called SAMPLING VARIABILITY).

2) The average of the sample means = the average of the population sampled: $\mu_{\bar{x}} = \mu$. The sample mean x̄ is said to be UNBIASED for the population mean μ.

3) The frequency distribution of the sample means does not match the distribution of the original population: it is centered in the same place, but the shape and variability (range) are different.

4) Knowing the frequency distribution for the sample means allows us to calculate probabilities for the sample mean. For example, what is the probability that the mean from a randomly selected sample is within 1 unit of the true population mean? Five of the 25 samples have a mean of 7 or 8, so Pr(6.6 < x̄ < 8.6) = 5/25 = 20%. How would you interpret this?

5) The variability of the sample means < the variability of the X-values in the population sampled: $\sigma_{\bar{x}} < \sigma$

6) The frequency distribution of the sample means is called the SAMPLING DISTRIBUTION of x̄.
a) Its shape and its variability, $\sigma_{\bar{x}}$, depend on the sample size.
b) Its center, $\mu_{\bar{x}}$, depends on whether the sampling is unbiased or not.
All three characteristics (shape, location, dispersion) depend on the sampling method, i.e. all can change if the method changes.

Effects Of Sample Size

1) Let’s take samples of size 3 with replacement.
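The size-2 results above, and the size-3 case just introduced, can be checked by brute-force enumeration. The following is a hedged sketch rather than part of the original notes: the function name `summarize` and its parameters `lo` and `hi` are illustrative, and it assumes every ordered sample drawn with replacement is equally likely. It also reports how many distinct samples remain when order is ignored.

```python
from itertools import combinations_with_replacement, product
from math import sqrt

population = [2, 4, 8, 10, 14]

def summarize(n, lo=6.6, hi=8.6):
    """Enumerate every ordered sample of size n drawn with replacement
    (assumed equally likely) and summarize the resulting sample means."""
    means = [sum(s) / n for s in product(population, repeat=n)]
    mu_xbar = sum(means) / len(means)
    sigma_xbar = sqrt(sum((m - mu_xbar) ** 2 for m in means) / len(means))
    return {
        "ordered": len(means),  # number of ordered samples
        # number of distinct samples when order is ignored
        "distinct": sum(1 for _ in combinations_with_replacement(population, n)),
        "mean": mu_xbar,        # mean of the sample means
        "sd": sigma_xbar,       # standard deviation of the sample means
        # share of samples whose mean lies strictly between lo and hi
        "prob": sum(lo < m < hi for m in means) / len(means),
    }

for n in (2, 3):
    print(n, summarize(n))
```

For n = 2 this reproduces the 25 samples tabulated above and the 5/25 probability; for n = 3 the same call gives the corresponding exact values, which can be set beside any Normal approximation.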
Ignoring the order in which units are drawn, the total number of distinct samples is 35.

[Figure: Frequency Distribution of Sample Means, n = 3, with the expected Normal curve; horizontal axis: upper boundaries (x <= boundary); vertical axis: No. of obs]

Sample
(2, 4, 8)      (2, 8, 10)     (2, 10, 14)    (4, 8, 10)     (4, 10, 14)
(2, 2, 4)      (2, 4, 4)      (8, 14, 14)    (10, 10, 14)   (10, 14, 14)
(2, 2, 2)      (4, 4, 4)      (8, 8, 8)      (10, 10, 10)   (14, 14, 14)
(2, 4, 10)     (2, 4, 14)     (2, 8, 14)     (4, 8, 14)     (8, 10, 14)
(2, 2, 8)      (2, 8, 8)      (2, 2, 10)     (2, 10, 10)    (2, 2, 14)
(2, 14, 14)    (4, 4, 8)      (4, 8, 8)      (4, 4, 10)     (4, 10, 10)
(4, 4, 14)     (4, 14, 14)    (8, 8, 10)     (8, 10, 10)    (8, 8, 14)

$\mu_{\bar{x}} = 7.6$

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4.27}{\sqrt{3}} \approx 2.47$

Increasing the sample size made the shape even more normal-looking and decreased the variability as well.

What is the answer to the question about the probability Pr(6.6 < x̄ < 8.6) now? We can get an approximate answer using the fact that it looks like x̄ is normally distributed with a mean of 7.6 and a standard deviation of about 2.5:

$\Pr(6.6 < \bar{x} < 8.6) \approx \Pr\left(\frac{6.6 - 7.6}{2.5} < Z < \frac{8.6 - 7.6}{2.5}\right) = \Pr(-0.4 < Z < 0.4) \approx 0.31$

which is higher than the 20% we calculated for sample means based on samples of 2 units. So taking a larger sample is better for accuracy of the sample mean.

Let’s put what we’ve learned about sample means (plus some additional information) into one statement:

SAMPLING DISTRIBUTION of x̄:

Suppose we have a population with a mean μ and a standard deviation σ and we take a sample of size n. As long as the sample is random and either we keep the sample size to less than 5% of the population or we sample with replacement, the frequency distribution of the sample mean has the following characteristics:

1. $\mu_{\bar{x}} = \mu$

2. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

3. The shape of the distribution is as follows:
a) a bell-curve (Normal), if the original population that we sampled has a Normal distribution.
b) (CENTRAL LIMIT THEOREM) approximately a bell-curve if the sample size is relatively large, regardless of the shape of the frequency distribution of the original population. “Relatively large” = 30 or more.

EXAMPLE
In a study of the evolutionary history of the amphipod Gammarus minus, one of the variables used to distinguish subspecies is the length of the