ASU MAT 294 - Module III Hypothesis Testing

Module III – Hypothesis Testing

The sampling distribution of the sample mean is the distribution of all possible sample means ($\bar{x}$) of a given sample size from a population. The larger the sample size, the smaller the sampling error tends to be in estimating a population mean, $\mu$, by a sample mean $\bar{x}$.

For samples of size $n$, the mean of the variable $\bar{x}$ is denoted by $\mu_{\bar{x}}$, and $\mu_{\bar{x}} = \mu$ for each sample size.

The population standard deviation is denoted by $\sigma$. For samples of size $n$, the standard deviation of the variable $\bar{x}$ is denoted by $\sigma_{\bar{x}}$, and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ for each sample size.

Note:
- If the population variable is normally distributed, then $\bar{x}$ is normally distributed regardless of the sample size.
- If the sample size is large, then $\bar{x}$ is approximately normally distributed, regardless of the distribution of the population variable.

Inferential Statistics

68.26-95.44-99.74 Rule: If the population variable is normally distributed, the 68.26-95.44-99.74 Rule states that 95.44% of all possible observations lie within 2 standard deviations to either side of the mean. If we apply this rule to the variable $\bar{x}$, 95.44% of all samples of size $n$ have a mean within $2\sigma_{\bar{x}} = 2\sigma/\sqrt{n}$ of $\mu$.

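As a rough illustration of the formulas $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ and $\mu \pm 2\sigma_{\bar{x}}$, the Python sketch below computes the standard deviation of the sample mean and the band that holds about 95.44% of sample means. The values $\mu = 100$, $\sigma = 15$, and $n = 36$ are made-up numbers chosen only for illustration, and the function names are hypothetical; none of them come from the course notes.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)


def two_se_band(mu, sigma, n):
    """Interval mu +/- 2 * sigma_xbar, which holds about 95.44% of sample
    means when x-bar is (approximately) normally distributed."""
    se = standard_error(sigma, n)
    return (mu - 2 * se, mu + 2 * se)


# Hypothetical population values, chosen only for illustration.
mu, sigma, n = 100.0, 15.0, 36
se = standard_error(sigma, n)
low, high = two_se_band(mu, sigma, n)
print(f"sigma_xbar = {se}")
print(f"about 95.44% of sample means fall in [{low}, {high}]")
```

With these assumed values, quadrupling the sample size from 36 to 144 halves $\sigma_{\bar{x}}$ from 2.5 to 1.25, which is the sense in which larger samples tend to give smaller sampling error.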
Equivalently, in terms of intervals, 95.44% of all samples of size $n$ have the property that the interval $[\bar{x} - 2\sigma_{\bar{x}},\ \bar{x} + 2\sigma_{\bar{x}}]$ contains $\mu$. The interval $[\bar{x} - 2\sigma_{\bar{x}},\ \bar{x} + 2\sigma_{\bar{x}}]$ is called a confidence interval for $\mu$, and 95.44% is its confidence level: the interval computed from any one sample may or may not contain $\mu$, but 95.44% of all such intervals do.

Hypothesis Tests

Terminology:
- A hypothesis is a statement that something is true.
- The null hypothesis is the hypothesis to be tested.
  Notation: $H_0: \mu = \mu_0$
- The alternative hypothesis is a hypothesis to be considered as an alternative to the null hypothesis.
  Notation:
  $H_a: \mu \neq \mu_0$ (two-tailed test)
  $H_a: \mu < \mu_0$ (left-tailed test)
  $H_a: \mu > \mu_0$ (right-tailed test)

Basic logic behind carrying out the hypothesis test for a normally distributed population variable:
- If the sample mean $\bar{x}$ is approximately equal to the hypothesized population mean $\mu_0$, we are inclined not to reject $H_0$.
- If the sample mean $\bar{x}$ differs too much from the hypothesized population mean $\mu_0$, we are inclined to reject $H_0$ and conclude that the alternative hypothesis is true.
- Using the "95.44%" part of the 68.26-95.44-99.74 Rule, if the sample mean $\bar{x}$ is more than two standard deviations ($2\sigma_{\bar{x}}$) from the hypothesized population mean $\mu_0$, we reject the null hypothesis ($H_0: \mu = \mu_0$) and conclude the alternative hypothesis ($H_a: \mu \neq \mu_0$).

Properties of Chi-square ($\chi^2$) curves
- The total area under a $\chi^2$-curve equals 1.
- A $\chi^2$-curve starts at 0 on the horizontal axis and extends indefinitely to the right, approaching the horizontal axis asymptotically.
- A $\chi^2$-curve is right-skewed.
- As the number of degrees of freedom ($df = n - 1$, where $n$ is the sample size) becomes larger, $\chi^2$-curves look increasingly like normal curves.

[Figure: $\chi^2$-curves for df = 5, df = 10, and df = 19.]

A variable is said to have a chi-square distribution if its distribution has the shape of a chi-square curve.

Chi-square goodness of fit test

This procedure can be used to perform a hypothesis test about the distribution of a qualitative variable or a discrete quantitative variable that has only finitely many possible values.

Example: A violent crime is classified as murder, forcible rape, robbery, or aggravated assault.

Distribution of violent crimes in the United States in 1995

Type of violent crime    Relative frequency
Murder                   0.012
Forcible rape            0.054
Robbery                  0.323
Aggravated assault       0.611
Total                    1.000

Sample results for 500 randomly selected violent-crime reports from last year

Type of violent crime    Frequency
Murder                     9
Forcible rape             26
Robbery                  144
Aggravated assault       321
Total                    500

Population: last year's reported violent crimes
Variable: type of violent crime
Possible values of the variable: murder, forcible rape, robbery, and aggravated assault

Null hypothesis to be tested:
$H_0$: Last year's violent-crime distribution is the same as the 1995 distribution.
Alternative hypothesis:
$H_a$: Last year's violent-crime distribution is different from the 1995 distribution.

Expected frequencies if last year's violent-crime distribution is the same as the 1995 distribution:
Expected frequency $E = np$, where $n$ is the sample size and $p$ is the relative frequency from the distribution of violent crimes in 1995.

Type of violent crime    Relative frequency (p)    Expected frequency (E = np)
Murder                   0.012                     500(0.012) = 6
Forcible rape            0.054                     500(0.054) = 27
Robbery                  0.323                     500(0.323) = 161.5
Aggravated assault       0.611                     500(0.611) = 305.5

Question: Do the frequencies observed last year match the expected frequencies?
To answer this question, we perform the following steps:
- Determine whether the expected frequencies satisfy the assumptions below:
  1. All expected frequencies are 1 or greater. (Yes)
  2. At most 20% of the expected frequencies are less than 5. (Yes: none of the expected frequencies is less than 5.)
- Decide the significance level, $\alpha$.
  We will choose to perform the test at the 5% significance level, or $\alpha = 0.05$.
  (Type I error: rejecting the null hypothesis when in fact it is true. The probability of making a Type I error is called the significance level, $\alpha$, of a hypothesis test.)
- Compute the test statistic ($\chi^2$ = the sum of the chi-square subtotals) that measures how good the fit is.

Type of violent crime    Observed      Expected      Difference    (O - E)^2    Chi-square subtotal
(x)                      frequency O   frequency E   O - E                      (O - E)^2/E
Murder                     9             6.0            3.0           9.00       1.500
Forcible rape             26            27.0           -1.0           1.00       0.037
Robbery                  144           161.5          -17.5         306.25       1.896
Aggravated assault       321           305.5           15.5         240.25       0.786
Total                    500           500.0            0.0                      4.219

  From the table, $\chi^2 = \sum (O - E)^2/E = 4.219$.
- Find the critical value $\chi^2_{\alpha}$ with $df = k - 1$, where $k$ is the number of possible values of the variable "type of violent crime". In our example $k = 4$, so $df = 4 - 1 = 3$ and $\chi^2_{0.05} = 7.815$ from the table provided.

[Figure: $\chi^2$-curve with the regions labeled "Do not reject $H_0$" and "Reject $H_0$".]
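The steps of the goodness of fit example can be sketched in plain Python as a check on the arithmetic. The observed counts, the 1995 relative frequencies, and the critical value 7.815 are copied from the example; the function names are hypothetical, and the decision rule (reject $H_0$ when the statistic exceeds the critical value) is assumed to be the usual right-tailed rule for this test. Computed without rounding the subtotals, the statistic comes out at about 4.22, which matches the table's 4.219 up to rounding.

```python
# Data copied from the tables in the example.
categories = ["Murder", "Forcible rape", "Robbery", "Aggravated assault"]
rel_freq_1995 = [0.012, 0.054, 0.323, 0.611]  # p, the 1995 distribution
observed = [9, 26, 144, 321]                  # O, sample of 500 reports

# Critical value for alpha = 0.05 and df = 3, as read from the table
# referenced in the notes (not computed here).
CRITICAL_VALUE = 7.815


def expected_frequencies(p, n):
    """Expected frequency E = n * p for each category."""
    return [n * p_i for p_i in p]


def assumptions_hold(expected):
    """All E are at least 1, and at most 20% of the E values are below 5."""
    small = sum(1 for e in expected if e < 5)
    return all(e >= 1 for e in expected) and small <= 0.2 * len(expected)


def chi_square_statistic(obs, exp):
    """Sum of the chi-square subtotals (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


n = sum(observed)
expected = expected_frequencies(rel_freq_1995, n)
chi2 = chi_square_statistic(observed, expected)
df = len(categories) - 1

# Assumed right-tailed rule: reject H0 only if the statistic
# exceeds the critical value.
reject_h0 = chi2 > CRITICAL_VALUE

print("assumptions hold:", assumptions_hold(expected))
print(f"df = {df}, chi-square = {chi2:.3f}, critical value = {CRITICAL_VALUE}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Under these assumptions the statistic falls below 7.815, so the sketch reports "do not reject H0": at the 5% significance level, the sample does not provide sufficient evidence that last year's violent-crime distribution differs from the 1995 distribution.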

