LECTURE 25

Outline
• Reference: Section 9.4
• Course VI Underground Guide Evaluations
  https://sixweb.mit.edu/student/evaluate/6.041-f2010
  https://sixweb.mit.edu/student/evaluate/6.431-f2010
• Review of simple binary hypothesis tests
  – examples
• Testing composite hypotheses
  – is my coin fair?
  – is my die fair?
  – goodness of fit tests

Simple binary hypothesis testing
  – null hypothesis $H_0$: $X \sim p_X(x; H_0)$ [or $f_X(x; H_0)$]
  – alternative hypothesis $H_1$: $X \sim p_X(x; H_1)$ [or $f_X(x; H_1)$]
  – Choose a rejection region $R$; reject $H_0$ iff data $\in R$
• Likelihood ratio test: reject $H_0$ if
  $$\frac{p_X(x; H_1)}{p_X(x; H_0)} > \xi \quad \text{or} \quad \frac{f_X(x; H_1)}{f_X(x; H_0)} > \xi$$
  – fix false rejection probability $\alpha$ (e.g., $\alpha = 0.05$)
  – choose $\xi$ so that $P(\text{reject } H_0; H_0) = \alpha$

Example (test for normal mean)
• $n$ data points, i.i.d.
  $H_0$: $X_i \sim N(0, 1)$
  $H_1$: $X_i \sim N(1, 1)$
• Likelihood ratio test; rejection region:
  $$\frac{(1/\sqrt{2\pi})^n \exp\{-\sum_i (X_i - 1)^2/2\}}{(1/\sqrt{2\pi})^n \exp\{-\sum_i X_i^2/2\}} > \xi$$
  – algebra: reject $H_0$ if $\sum_i X_i > \xi'$
• Find $\xi'$ such that
  $$P\Big(\sum_{i=1}^n X_i > \xi'; H_0\Big) = \alpha$$
  – use normal tables

Example (test for normal variance)
• $n$ data points, i.i.d.
  $H_0$: $X_i \sim N(0, 1)$
  $H_1$: $X_i \sim N(0, 4)$
• Likelihood ratio test; rejection region:
  $$\frac{(1/(2\sqrt{2\pi}))^n \exp\{-\sum_i X_i^2/(2 \cdot 4)\}}{(1/\sqrt{2\pi})^n \exp\{-\sum_i X_i^2/2\}} > \xi$$
  – algebra: reject $H_0$ if $\sum_i X_i^2 > \xi'$
• Find $\xi'$ such that
  $$P\Big(\sum_{i=1}^n X_i^2 > \xi'; H_0\Big) = \alpha$$
  – the distribution of $\sum_i X_i^2$ is known (derived distribution problem)
  – "chi-square" distribution; tables are available

Composite hypotheses
• Got $S = 472$ heads in $n = 1000$ tosses; is the coin fair?
  – $H_0$: $p = 1/2$ versus $H_1$: $p \neq 1/2$
• Pick a "statistic" (e.g., $S$)
• Pick shape of rejection region (e.g., $|S - n/2| > \xi$)
• Choose significance level (e.g., $\alpha = 0.05$)
• Pick critical value $\xi$ so that: $P(\text{reject } H_0; H_0) = \alpha$
  Using the CLT: $P(|S - 500| \le 31; H_0) \approx 0.95$; $\xi = 31$
• In our example: $|S - 500| = 28 < \xi$
  $H_0$ not rejected (at the 5% level)

Is my die fair?
• Hypothesis $H_0$: $P(X = i) = p_i = 1/6$, $i = 1, \ldots, 6$
• Observed occurrences of $i$: $N_i$
• Choose form of rejection region; chi-square test: reject $H_0$ if
  $$T = \sum_i \frac{(N_i - n p_i)^2}{n p_i} > \xi$$
• Choose $\xi$ so that:
  $P(\text{reject } H_0; H_0) = 0.05$
  $P(T > \xi; H_0) = 0.05$
• Need the distribution of $T$: (CLT + derived distribution problem)
  – for large $n$, $T$ has approximately a chi-square distribution
  – available in tables

Do I have the correct pdf?
• Partition the range into bins
  – $n p_i$: expected incidence of bin $i$ (from the pdf)
  – $N_i$: observed incidence of bin $i$
  – Use chi-square test (as in die problem)
• Kolmogorov-Smirnov test: form empirical CDF, $\hat{F}_X$, from data
  (http://www.itl.nist.gov/div898/handbook/)
• $D_n = \max_x |F_X(x) - \hat{F}_X(x)|$
• $P(\sqrt{n}\, D_n \ge 1.36) \approx 0.05$

What else is there?
• Systematic methods for coming up with shape of rejection regions
• Methods to estimate an unknown PDF (e.g., form a histogram and "smooth" it out)
• Efficient and recursive signal processing
• Methods to select between less or more complex models
  – (e.g., identify relevant "explanatory variables" in regression models)
• Methods tailored to high-dimensional unknown parameter vectors and huge number of data points (data mining)
• etc.
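
The tests on these slides can be checked numerically. The sketch below is only an illustration, not part of the lecture: it computes the threshold for the normal-mean test, the CLT critical value for the coin example (n = 1000, S = 472), the chi-square statistic for a die, and the Kolmogorov-Smirnov statistic D_n. All function names are made up for this sketch. The die counts and the uniform sample are hypothetical data. The chi-square critical value 11.07 is an assumption taken from standard tables for 5 degrees of freedom (6 faces minus 1), which the slides do not state explicitly.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def normal_mean_threshold(n, alpha=0.05):
    """Threshold xi' with P(sum_i X_i > xi'; H0) = alpha for X_i ~ N(0, 1).

    Under H0 the sum is N(0, n), so xi' = sqrt(n) * z_{1 - alpha}.
    """
    return math.sqrt(n) * STD_NORMAL.inv_cdf(1 - alpha)


def coin_critical_value(n, alpha=0.05, p0=0.5):
    """CLT approximation of xi with P(|S - n*p0| > xi; H0) ~ alpha."""
    sigma = math.sqrt(n * p0 * (1 - p0))      # standard deviation of S under H0
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)     # two-sided normal quantile
    return z * sigma


def chi_square_statistic(counts, probs):
    """T = sum_i (N_i - n*p_i)^2 / (n*p_i)."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def ks_statistic(data, cdf):
    """D_n = max_x |F_X(x) - F_hat_X(x)| for a continuous model CDF."""
    xs = sorted(data)
    n = len(xs)
    # The empirical CDF jumps from (i - 1)/n to i/n at the i-th sorted point,
    # so the largest gap occurs just before or just after one of the jumps.
    return max(
        max(i / n - cdf(x), cdf(x) - (i - 1) / n)
        for i, x in enumerate(xs, start=1)
    )


# Normal-mean test with a hypothetical n = 25 data points.
print(f"normal mean: reject H0 if sum X_i > {normal_mean_threshold(25):.2f}")

# Coin example from the slides: S = 472 heads in n = 1000 tosses.
n_tosses, heads = 1000, 472
xi = coin_critical_value(n_tosses)
reject_coin = abs(heads - n_tosses / 2) > xi
print(f"coin: xi = {xi:.2f}, |S - n/2| = {abs(heads - n_tosses / 2):.0f}, "
      f"reject H0: {reject_coin}")

# Die example with hypothetical counts N_1..N_6 (n = 120 rolls).
die_counts = [18, 22, 20, 25, 15, 20]
t_stat = chi_square_statistic(die_counts, [1 / 6] * 6)
CHI2_CRIT_5DF = 11.07  # assumed: 95th percentile of chi-square(5), from tables
reject_die = t_stat > CHI2_CRIT_5DF
print(f"die: T = {t_stat:.2f}, reject H0: {reject_die}")

# KS test of a hypothetical sample against the Uniform(0, 1) CDF, F(x) = x.
rng = random.Random(0)
sample = [rng.random() for _ in range(200)]
d_n = ks_statistic(sample, lambda x: x)
reject_ks = math.sqrt(len(sample)) * d_n >= 1.36  # large-n 5% threshold
print(f"KS: sqrt(n) * D_n = {math.sqrt(len(sample)) * d_n:.3f}, "
      f"reject H0: {reject_ks}")
```

For the coin, the computed critical value rounds to the slides' ξ = 31, and |S − 500| = 28 falls below it, so H0 is not rejected at the 5% level. The standard library's NormalDist stands in for the "normal tables" mentioned on the slides; a statistics package would normally supply the chi-square quantile instead of a hard-coded table value.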