PROBABILITY AND STATISTICS IN COMPUTER SCIENCE AND SOFTWARE ENGINEERING
Chapter 9: Statistical Inference

OVERVIEW
We’ve been exploring how to create confidence intervals for estimators of population parameters.
We saw that if the estimator $\hat{\theta}$ is unbiased and Normally distributed, the confidence interval has the form
$$\hat{\theta} \pm z_{\alpha/2}\,\sigma(\hat{\theta}),$$
where $\sigma(\hat{\theta})$ is the standard error of the estimator.
This is an interval centered on the estimate $\hat{\theta}$, and the value $z_{\alpha/2}\,\sigma(\hat{\theta})$ is the margin.
The z-value $z_{\alpha/2}$ is the critical value of a standard Normal variable that leaves probability $\alpha/2$ in the upper tail; $1-\alpha$ is the confidence level.

OVERVIEW
We then applied this to the sample mean estimator $\bar{X}$, which is unbiased and Normally distributed (or approximately Normal for large n).
We had earlier derived the form of the standard error for the sample mean, so the confidence interval became
$$\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}.$$
Here $\sigma$ is the standard deviation of the population distribution, assumed known, and n is the size of the sample.

OVERVIEW
We applied this to the difference of sample means (collected from two populations, with sample sizes n and m):
$$\bar{X} - \bar{Y} \pm z_{\alpha/2}\,\sqrt{\frac{\sigma_X^2}{n} + \frac{\sigma_Y^2}{m}}.$$
Again, we assume the standard deviations of the populations are known.
We saw how to select n large enough to make the margin smaller than a given tolerance.

OVERVIEW
We then began to explore what to do when the population standard deviation is unknown.
If the sample size n is large and the estimator is Normally distributed, we can approximate the standard error of the estimator with the sample estimate $s(\hat{\theta})$, which in the case of the sample mean is
$$s(\bar{X}) = \frac{s}{\sqrt{n}},$$
where s is the sample standard deviation.
A similar formula works for the difference of sample means.

OVERVIEW
We applied this idea to an estimator for population proportions.
If we have a population with a subpopulation A, and we want to estimate the probability p that a random observation from the population belongs to the subpopulation, we can take a sample of size n and compute
$$\hat{p} = \frac{\text{number of sampled items that belong to } A}{n}.$$
This estimator has an unknown standard error, $\sqrt{p(1-p)/n}$, because it depends on the value p we are trying to estimate.

OVERVIEW
We found a confidence interval for this estimator, which is unbiased and, for large n, approximately Normally distributed (it has the form of a sum of random variables).
There was a similar interval for the difference of two population proportions, and we again saw how to select n large enough to make the margin smaller than a selected tolerance.

OVERVIEW
Finally, we saw what to do when we have a Normally distributed unbiased estimator, an unknown population standard deviation, and a small n.
This required the Student’s t-distribution. Using it, we can form confidence intervals for the population mean:
$$\bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}.$$
We used the t-distribution with n-1 degrees of freedom to find this margin.

SUMMARY
                               Sample Size
Variance/Standard Deviation    Small (n < 30)    Large (n ≥ 30)
Known                          Z variable        Z variable
Unknown                        T variable        Z variable

• This table shows the random variable used to create a confidence interval for the population mean from a sample mean
• If the sample size is small, we assume the population distribution is Normal
• For unknown variance, we use the sample variance as an estimate

OVERVIEW
In this lecture, we will continue to explore what happens when we do not know the standard deviation of the underlying population distribution.
We’ll see how to compare the means of two populations with unknown standard deviations.
This will be broken up into two cases: when both populations have the same (unknown) standard deviation, and when they have different standard deviations.
Again, we will estimate the standard deviation(s) with our sample statistics.
We will then move on to Hypothesis Testing.
We’ll see how to use the tools we have been developing to test hypotheses, and we’ll develop the language necessary to formulate hypothesis tests.
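
The recap above can be turned into a few lines of Python. What follows is a minimal sketch under stated assumptions, not code from the course: the function names (z_critical, mean_interval, proportion_interval, sample_size_for_mean) are hypothetical, the n < 30 cutoff is the rule of thumb from the summary table, and the t critical value has to be passed in from a t-table because the Python standard library has no t-distribution quantile function.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, stdev


def z_critical(alpha):
    """Return z_{alpha/2}: the standard Normal value with alpha/2 in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def mean_interval(sample, alpha, sigma=None, t_critical=None):
    """(1 - alpha) confidence interval for a population mean.

    Follows the summary table (hypothetical helper, for illustration only):
      sigma known              -> Z variable with sigma
      sigma unknown, n >= 30   -> Z variable with the sample estimate s
      sigma unknown, n < 30    -> T variable with n - 1 degrees of freedom;
                                  t_critical must be read from a t-table.
    """
    n = len(sample)
    center = mean(sample)
    if sigma is not None:
        margin = z_critical(alpha) * sigma / sqrt(n)
    elif n >= 30:
        margin = z_critical(alpha) * stdev(sample) / sqrt(n)
    else:
        if t_critical is None:
            raise ValueError("small sample with unknown sigma needs t_critical")
        margin = t_critical * stdev(sample) / sqrt(n)
    return center - margin, center + margin


def proportion_interval(successes, n, alpha):
    """Large-sample interval for a proportion; the unknown standard error
    sqrt(p(1-p)/n) is estimated by plugging in p_hat."""
    p_hat = successes / n
    margin = z_critical(alpha) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size_for_mean(sigma, alpha, tolerance):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) is at most tolerance."""
    return ceil((z_critical(alpha) * sigma / tolerance) ** 2)
```

For example, sample_size_for_mean(sigma=2.0, alpha=0.05, tolerance=0.5) returns 62, because (1.96 · 2 / 0.5)² ≈ 61.5 and n must be rounded up. For a small sample with unknown standard deviation, a call such as mean_interval(data, 0.05, t_critical=2.262) uses the tabulated value for 9 degrees of freedom, which is appropriate only when data holds 10 observations.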