
Is $\sigma$ known? Is $n$ large? What statistic is used? What is its distribution?

Degrees of Freedom (df)

Statisticians use the term "degrees of freedom" to describe the number of values in the final calculation of a statistic that are free to vary. Consider, for example, the statistic $S^2$, called the sample variance and defined as follows:

$$S^2 = \frac{\sum_{k=1}^{n} (x_k - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1}$$

Interpretation of $S^2$: the average squared deviation (approximately, because $n - 1$ rather than $n$ is used as the denominator).

Sample standard deviation: $S = \sqrt{S^2}$

Interpretation: approximately the average deviation (the average of the magnitudes, or absolute values, of the deviations). If $S$ is small, the sample data are more concentrated around the mean of the sample.

$\bar{x}$ measures the central tendency of the data, and is used to estimate the population mean. $S$ or $S^2$ measures the variation (dispersion) of the data, that is, whether the data are concentrated or more spread out, and is used to estimate the population standard deviation or variance.

To calculate $S^2$ for a random sample, we must first calculate the mean of that sample and then compute the sum of the squared deviations from that mean. While there are $n$ such squared deviations, only $n - 1$ of them are, in fact, free to assume any value whatsoever. [Suppose that the first $n - 1$ squared deviations $(x_1 - \bar{x})^2, (x_2 - \bar{x})^2, \dots, (x_{n-1} - \bar{x})^2$ are free to assume values, i.e., $x_1, x_2, \dots, x_{n-1}$ are free to assume values. Then $x_n$ is not free to take a value, because $\bar{x} = (1/n)(x_1 + x_2 + \dots + x_{n-1} + x_n)$ and hence $x_n = n\bar{x} - (x_1 + x_2 + \dots + x_{n-1})$.] This is because the final squared deviation from the mean must include the one value of X such that the sum of all the Xs divided by $n$ equals the obtained mean of the sample. All of the other $n - 1$ squared deviations from the mean can, theoretically, have any values whatsoever.
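The constraint described above can be illustrated with a minimal Python sketch. The data values, the fixed mean, and the helper name `last_value` are hypothetical and chosen only for illustration. Once the mean and the first $n - 1$ values are chosen, the last value is forced, and the deviations from the mean sum to zero.

```python
import statistics


def last_value(free_values, mean):
    """Return the single value x_n that makes the sample mean equal `mean`.

    Uses x_n = n * mean - (x_1 + ... + x_{n-1}), where n counts all values.
    """
    n = len(free_values) + 1
    return n * mean - sum(free_values)


# Hypothetical data: x_1, x_2, x_3 are chosen freely, the mean is fixed at 5.
free_values = [4.0, 7.0, 1.0]
mean = 5.0

x_n = last_value(free_values, mean)      # the one value that is NOT free
sample = free_values + [x_n]

deviations = [x - mean for x in sample]  # these always sum to zero
s2 = sum(d * d for d in deviations) / (len(sample) - 1)

print("forced last value:", x_n)
print("sum of deviations:", sum(deviations))
print("sample variance S^2:", s2)
# statistics.variance also divides by n - 1, so it should agree with s2.
print("statistics.variance:", statistics.variance(sample))
```

Changing any of the three free values changes the forced fourth value, but never the fact that it is determined by the others and the mean.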
For these reasons, the statistic $S^2$ is said to have only $n - 1$ degrees of freedom. In other words, once the sample mean $\bar{x}$ is specified, only $n - 1$ of the $n$ values $x_k$ [equivalently, $n - 1$ of the $n$ squared deviations $(x_k - \bar{x})^2$] are free to take their values. That is why the "degrees of freedom" is defined as $n - 1$, not $n$.

A further justification comes from the following identity (obtained by inserting $\bar{x}$, i.e., writing $x_k - \mu = (x_k - \bar{x}) + (\bar{x} - \mu)$, and simplifying):

$$\frac{\sum_{k=1}^{n} (x_k - \mu)^2}{\sigma^2} = \frac{(n - 1) S^2}{\sigma^2} + \frac{n (\bar{x} - \mu)^2}{\sigma^2}$$

Left side: for a random sample from a normal population, this is Chi-square with $n$ degrees of freedom.

Right side: the total degrees of freedom must also be $n$. But the second statistic is $Z^2$ (i.e., Chi-square with 1 degree of freedom), and it is independent of the first statistic (Chi-square with $n - 1$ degrees of freedom) because $\bar{x}$ and $S^2$ are independent. So $S^2$ has $n - 1$ degrees of freedom.

$$T = \frac{\bar{x} - \mu}{S / \sqrt{n}} = \frac{Z}{\sqrt{U / (n - 1)}}$$

where $Z = (\bar{x} - \mu) / (\sigma / \sqrt{n})$ and $U = (n - 1) S^2 / \sigma^2 \sim$ Chi-square($n - 1$) (the first statistic in the sum above). This explains why the t distribution has $n - 1$ degrees of freedom.

http://www.animatedsoftware.com/statglos/sgdegree.htm

Sampling distribution (distribution of any statistic, like $\bar{x}$, $S^2$, etc.)

The sampling distribution of a statistic is the set of values that we would obtain if we drew an infinite number of random samples from a given population and calculated the statistic on each sample. In doing so, all samples must be of the same size ($n$). While it is not possible for anyone to actually draw an infinite number of samples, the concept of a sampling distribution can be understood by carefully considering the following theoretical exercise. Imagine that our population consists of only three numbers: 2, 3 and 4. Our plan is to draw an infinite number of random samples of size $n = 2$ and form a sampling distribution of the sample means.
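Because the population is so small, the exercise can be carried out exactly rather than by repeated drawing. The following Python sketch lists every equally likely ordered sample of size 2 and tabulates the sample means. It assumes sampling with replacement, so the same number may be drawn twice, and the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 3, 4]
n = 2  # sample size

# Every ordered sample of size n drawn with replacement (equally likely).
samples = list(product(population, repeat=n))

# Exact sample mean of each sample, kept as a fraction to avoid rounding.
means = [Fraction(sum(s), n) for s in samples]

# How often each value of the mean occurs among the possible samples.
counts = Counter(means)

print("sample  mean")
for s, m in zip(samples, means):
    print(s, float(m))

print("mean  relative frequency")
for m in sorted(counts):
    print(float(m), Fraction(counts[m], len(samples)))
```

The relative frequencies printed in the second table are the heights of the bars in a histogram of the sampling distribution of the mean.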
The accompanying illustration shows this population, and its first two columns show each of the possible random samples (of size $n = 2$) that might be drawn from this population. If the first item is a 2, the second item can be a 2 again, a 3 or a 4. Remember: we are drawing a random sample and our population is small, hence we are sampling with replacement. If our first item happened to be a 3, the second item might be a 2, a 3 again, or a 4, and if our first item happened to be a 4, the second item could be a 2, a 3 or a 4 again. As seen here, there are only 9 possible combinations of two numbers in a given sample, and each of the combinations is equally likely. The same is not true, however, for the means of the samples. The third column in the illustration shows the mean of each of the possible samples, and the histogram shows the relative frequency of each of these means. In doing so, the histogram provides a detailed representation of the sampling distribution of means (for samples of size $n = 2$) that would be obtained if we were, in fact, able to draw an infinite number of random samples from the indicated population and graphically represent their frequency distribution in a histogram.

The Central Limit Theorem is a statement about the characteristics of the sampling distribution of means of random samples from a given population. That is, it describes the characteristics of the distribution of values we would obtain if we were able to draw an infinite number of random samples of a given size from a given population and calculated the mean of each sample. The Central Limit Theorem consists of three statements:

[1] The mean of the sampling distribution of $\bar{x}$ is equal to the mean of the population from which the samples were drawn, i.e., $\mu_{\bar{x}} = \mu$

[2] The variance of the sampling distribution of means is equal to the variance of the population from which the samples were drawn divided by the size of the samples.
  2 = 2 / n[3] If the original population is distributed normally (i.e. it is bell shaped), the samplingdistribution of means will also be normal. If the original population is not normallydistributed, the sampling distribution of means will increasingly approximate a normaldistribution as sample size increases. (i.e. when increasingly large samples are drawn) If sample size n >= 30, is approximately normal and thus  -    - --------------- and ----------------  /  n S /  nare


UNCP MAT 2100 - Degrees of Freedom
