ISU STAT 401 - Lecture 33

Stat 401 B – Lecture 33

1. Categorical Variables
Response: Highway MPG
Explanatory: Type of drive
  - All Wheel
  - Rear Wheel
  - Front Wheel

2. Indicator Variables
We have used indicator variables so that we can trick JMP into analyzing the data using multiple regression.

3. Categorical Variables
There is a more straightforward analysis that can be done with categorical explanatory variables.

4. Categorical Variables
The analysis is an extension of the two independent sample analysis we did at the beginning of the semester.
  - Body mass index for men and women (Lectures 4 and 5).

5. Analysis of Variance
Response: numerical, Y
Explanatory: categorical, X
Total Sum of Squares:
$$SS_{Total} = \sum (y - \bar{y})^2$$

6. Sum of Squares Total
$$\bar{y} = 27.7$$
$$SS_{Total} = \sum (y - \bar{y})^2 = 3669.0$$
$$df = 99$$

7. Analysis of Variance
Partition the Total Sum of Squares into two parts.
  - Due to differences among the sample means for the categories.
  - Due to variation within categories, i.e., error variation.

8. Sum of Squares Factor
$$SS_{Factor} = \sum n_i (\bar{y}_i - \bar{y})^2$$
  - $n_i$ = number of observations in category $i$
  - $\bar{y}_i$ = sample mean for category $i$

9. Category Sample Means

  Category       Mean     Sample Size
  All Wheel      22.608   23
  Rear Wheel     26.529   17
  Front Wheel    29.983   60

10. Sum of Squares Drive
$$SS_{Factor} = \sum n_i (\bar{y}_i - \bar{y})^2$$
$$SS_{Drive} = 23(22.608 - 27.7)^2 + 17(26.529 - 27.7)^2 + 60(29.983 - 27.7)^2$$
$$SS_{Drive} = 932.3$$

11. Sum of Squares Error
$$SS_{Error} = \sum (n_i - 1) s_i^2$$
  - $n_i$ = number of observations in category $i$
  - $s_i^2$ = sample variance for category $i$

12. Category Sample Variances

  Category       Variance   Sample Size
  All Wheel      15.613     23
  Rear Wheel      9.890     17
  Front Wheel    37.881     60

13. Sum of Squares Error
$$SS_{Error} = \sum (n_i - 1) s_i^2$$
$$SS_{Error} = 22(15.613) + 16(9.89) + 59(37.881)$$
$$SS_{Error} = 2736.7$$

14. Mean Square
A mean square is the sum of squares divided by its associated degrees of freedom.
A mean square is an estimate of variability.

15. Mean Square Factor
The mean square factor estimates the variability due to differences among category sample means.
If the mean square factor is large, this indicates the category sample means are quite different.

16. Mean Square Error
The mean square error estimates the naturally occurring variability, i.e., the error variance, $\sigma^2$.
This is the ruler against which the variability among sample means is measured.

17. Test of Hypothesis
H0: all the category population means are equal.
HA: some of the category population means are not equal.
Similar to the test of model utility.

18. Test Statistic
F = MSFactor / MSError
P-value = Prob > F
If the P-value is small, reject H0 and declare that at least two of the categories have different population means.

19. Analysis of Variance

  Source            df      SS         MS         F
  Factor (Model)    k - 1   SSFactor   MSFactor   MSFactor / MSError
  Error             N - k   SSError    MSError
  Total             N - 1   SSTotal

20. Analysis of Variance

  Source            df   SS       MS        F
  Factor (Model)     2    932.3   466.15    16.52
  Error             97   2736.7    28.213
  Total             99   3669.0

21. Test of Hypothesis
F = 16.52, P-value < 0.0001
Because the P-value is so small, there are some categories that have different population means.

22. JMP
Response, Y: Highway MPG (numerical)
Explanatory, X: Drive (categorical)
Fit Y by X

23. [Figure: Oneway Analysis of Highway MPG By Drive. Vertical axis: Highway MPG (10 to 60). Horizontal axis: Drive (All, Front, Rear).]

24. JMP Fit Y by X
From the red triangle pull-down menu next to Oneway select Means/Anova.
Display options:
  - Uncheck Mean Diamonds
  - Check Mean Lines

25. Test of Significance
The test of significance, like the test of model utility, is very general.
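As a rough numerical check on the sums of squares, mean squares, and F statistic worked out above, the ANOVA table can be rebuilt from the category summaries alone. This is a minimal sketch, not JMP's procedure: the names `groups` and `one_way_anova` are made up for illustration, and the grand mean is computed as the size-weighted average of the rounded category means, so results agree with the slides only up to rounding.

```python
# Per-category summaries copied from the lecture tables:
# (sample size n_i, sample mean, sample variance)
groups = {
    "All Wheel": (23, 22.608, 15.613),
    "Rear Wheel": (17, 26.529, 9.890),
    "Front Wheel": (60, 29.983, 37.881),
}


def one_way_anova(groups):
    """Build a one-way ANOVA table from per-category summary statistics."""
    stats = list(groups.values())
    N = sum(n for n, _, _ in stats)  # total number of observations
    k = len(stats)                   # number of categories

    # Assumption of this sketch: grand mean = size-weighted mean of the
    # category means (the slides use the rounded value 27.7).
    grand_mean = sum(n * mean for n, mean, _ in stats) / N

    # SS_Factor = sum of n_i * (ybar_i - ybar)^2
    ss_factor = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in stats)
    # SS_Error = sum of (n_i - 1) * s_i^2
    ss_error = sum((n - 1) * var for n, _, var in stats)

    df_factor, df_error = k - 1, N - k
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error

    return {
        "df_factor": df_factor,
        "df_error": df_error,
        "ss_factor": ss_factor,
        "ss_error": ss_error,
        "ss_total": ss_factor + ss_error,
        "ms_factor": ms_factor,
        "ms_error": ms_error,
        "F": ms_factor / ms_error,
    }


table = one_way_anova(groups)
for name, value in table.items():
    print(f"{name:10s} {value:10.3f}")
```

The sketch stops at the F statistic. Turning it into a P-value needs the F distribution with 2 and 97 degrees of freedom, which is left to a statistics package.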
We know there are some categories with different population means, but which categories are they?

26. Multiple Comparisons
In ANOVA, a statistically significant F test is often followed up by a procedure for comparing the sample means of the categories.

27. Least Significant Difference
One multiple comparison method is called Fisher's Least Significant Difference, LSD.
This is the smallest difference in sample means that would be declared statistically significant.

28. Least Significant Difference
$$LSD = t^{*} \, RMSE \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$
where $t^{*}$ is a value for 95% confidence and $df_{Error}$ degrees of freedom, and $n_i$ and $n_j$ are the sample sizes for categories $i$ and $j$.

29. Least Significant Difference
When the number of observations in each category is the same, there is one value of LSD for all comparisons.
When the numbers of observations in each category are different, there is one value of LSD for each comparison.

30. Compare All to Rear Wheel
All Wheel: $n_i = 23$
Rear Wheel: $n_j = 17$
$t^{*} = 1.9847$
RMSE = 5.3116

31. Compare All to Rear Wheel
$$LSD = 1.9847\,(5.3116) \sqrt{\frac{1}{23} + \frac{1}{17}}$$
$$LSD = 3.372$$

32. Compare All to Rear Wheel
All Wheel: mean = 22.609
Rear Wheel: mean = 26.529
Difference in means = 3.92
3.92 is bigger than the LSD = 3.37, therefore the difference between All Wheel and Rear Wheel is statistically significant.

33. JMP – Fit Y by X
From the red triangle pull-down menu next to Oneway select Compare Means – Each Pair, Student's t.

34. [JMP output: Oneway Analysis of Highway MPG By Drive, Means Comparisons, Comparisons for each pair using Student's t]

  t = 1.98472, Alpha = 0.05

  Abs(Dif)-LSD
            Front      Rear       All
  Front    -1.9247     0.5574     4.7892
  Rear      0.5574    -3.6159     0.5489
  All       4.7892     0.5489    -3.1087
  Positive values show pairs of means that are significantly different.

  Level    Letter   Mean
  Front    A        29.983333
  Rear     B        26.529412
  All      C        22.608696
  Levels not connected by same letter are significantly different.

  Level    - Level   Difference   Lower CL   Upper CL   p-Value
  Front    All       7.374638     4.789243   9.960033   <.0001*
  Rear     All       3.920716     0.548860   7.292572   0.0231*
  Front    Rear      3.453922     0.557427   6.350416   0.0199*

35. Regression vs ANOVA
Note that the P-values for the comparisons are the same as the P-values for the slope estimates in the regression on indicator variables.

36. Regression vs ANOVA
Multiple regression with indicator variables and ANOVA give you exactly the same
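
The least significant difference comparisons from the multiple comparisons slides can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it takes t* = 1.9847 and RMSE = 5.3116 directly from the slides rather than deriving them, uses the rounded category means, and the names `lsd`, `compare`, `sizes`, and `means` are hypothetical helpers introduced here.

```python
import math

# Values quoted on the slides (not derived here).
T_STAR = 1.9847   # t value for 95% confidence, 97 error degrees of freedom
RMSE = 5.3116     # root mean square error

sizes = {"All": 23, "Rear": 17, "Front": 60}
means = {"All": 22.609, "Rear": 26.529, "Front": 29.983}


def lsd(t_star, rmse, n_i, n_j):
    """Fisher's least significant difference for categories of size n_i, n_j."""
    return t_star * rmse * math.sqrt(1.0 / n_i + 1.0 / n_j)


def compare(a, b):
    """Return (|difference in means|, LSD, significant?) for categories a, b."""
    diff = abs(means[a] - means[b])
    threshold = lsd(T_STAR, RMSE, sizes[a], sizes[b])
    return diff, threshold, diff > threshold


for a, b in [("Front", "All"), ("Rear", "All"), ("Front", "Rear")]:
    diff, threshold, significant = compare(a, b)
    print(f"{a} vs {b}: diff = {diff:.3f}, LSD = {threshold:.3f}, "
          f"significant = {significant}")
```

Because the sample sizes differ, each pair gets its own LSD, which is why the sketch recomputes the threshold inside `compare` instead of using a single cutoff.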

