PSU STAT 501 - Best subsets regression

Model selection: Best subsets regression

Statement of problem
• A common problem is that there is a large set of candidate predictor variables.
• The goal is to choose a small subset from the larger set so that the resulting regression model is simple, yet has good predictive ability.

Example: Cement data
• Response y: heat evolved during hardening of cement, in calories per gram
• Predictor x1: % of tricalcium aluminate
• Predictor x2: % of tricalcium silicate
• Predictor x3: % of tetracalcium alumino ferrite
• Predictor x4: % of dicalcium silicate

[Figure: plot of y, x1, x2, x3, x4]

Two basic methods of selecting predictors
• Stepwise regression: Enter and remove predictors, in a stepwise manner, until there is no justifiable reason to enter or remove more.
• Best subsets regression: Select the subset of predictors that does the best at meeting some well-defined objective criterion.

Why best subsets regression?

# of predictors (p-1)   # of regression models
1                       2: ( ) (x1)
2                       4: ( ) (x1) (x2) (x1, x2)
3                       8: ( ) (x1) (x2) (x3) (x1, x2) (x1, x3) (x2, x3) (x1, x2, x3)
4                       16: 1 none, 4 one, 6 two, 4 three, 1 four

• If there are p-1 possible predictors, then there are 2^(p-1) possible regression models containing the predictors.
• For example, 10 predictors yield 2^10 = 1024 possible regression models.
• A best subsets algorithm determines the best subsets of each size, so that the choice of the final model can be made by the researcher.

What is used to judge “best”?
• R-squared
• Adjusted R-squared
• MSE (or S = square root of MSE)
• Mallows' Cp

R-squared

\[ R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO} \]

Use the R-squared values to find the point where adding more predictors is not worthwhile because it leads to only a very small increase in R-squared.

Adjusted R-squared or MSE

\[ R_a^2 = 1 - \left(\frac{n-1}{n-p}\right)\frac{SSE}{SSTO} = 1 - \left(\frac{n-1}{SSTO}\right) MSE \]

Adjusted R-squared increases only if MSE decreases, so adjusted R-squared and MSE provide equivalent information.
Find a few subsets for which MSE is smallest (or adjusted R-squared is largest), or so close to the smallest (largest) that adding more predictors is not worthwhile.

Mallows' Cp criterion
The goal is to minimize the total standardized mean square error of prediction:

\[ \Gamma_p = \frac{1}{\sigma^2} \sum_{i=1}^{n} E\left[\hat{Y}_{ip} - E(Y_i)\right]^2 \]

which equals:

\[ \Gamma_p = \frac{1}{\sigma^2} \left\{ \sum_{i=1}^{n} \left[E(\hat{Y}_{ip}) - E(Y_i)\right]^2 + \sum_{i=1}^{n} Var(\hat{Y}_{ip}) \right\} \]

which in English is:

\[ \Gamma_p = \text{some bias} + \text{some variance} \]

Mallows' Cp statistic

\[ C_p = \frac{SSE_p}{MSE(X_1, \ldots, X_{p-1})} - (n - 2p) \]

estimates \Gamma_p, where:
• SSE_p is the error sum of squares for the fitted (subset) regression model with p parameters.
• MSE(X_1, …, X_{p-1}) is the MSE of the model containing all p-1 predictors. It is an unbiased estimator of σ^2.
• p is the number of parameters in the (subset) model.

Facts about Mallows' Cp
• Subset models with small Cp values have a small total standardized MSE of prediction.
• When the Cp value is …
  – near p, the bias is small (next to none),
  – much greater than p, the bias is substantial,
  – below p, it is due to sampling error; interpret as no bias.
• For the largest model with all possible predictors, Cp = p (always).

Using the Cp criterion
• So, identify subsets of predictors for which:
  – the Cp value is smallest, and
  – the Cp value is near p (if possible).
• In general, though, don't always choose the largest model just because it yields Cp = p.

Best Subsets Regression: y versus x1, x2, x3, x4
Response is y

Vars   R-Sq   R-Sq(adj)   C-p     S        x1  x2  x3  x4
1      67.5   64.5        138.7   8.9639               X
1      66.6   63.6        142.5   9.0771       X
2      97.9   97.4        2.7     2.4063   X   X
2      97.2   96.7        5.5     2.7343   X           X
3      98.2   97.6        3.0     2.3087   X   X       X
3      98.2   97.6        3.0     2.3121   X   X   X
4      98.2   97.4        5.0     2.4460   X   X   X   X

Stepwise Regression: y versus x1, x2, x3, x4
Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.15
Response is y on 4 predictors, with N = 13

Step          1        2        3        4
Constant      117.57   103.10   71.65    52.58

x4            -0.738   -0.614   -0.237
T-Value       -4.77    -12.62   -1.37
P-Value       0.001    0.000    0.205

x1                     1.44     1.45     1.47
T-Value                10.40    12.41    12.10
P-Value                0.000    0.000    0.000

x2                              0.416    0.662
T-Value                         2.24     14.44
P-Value                         0.052    0.000

S             8.96     2.73     2.31     2.41
R-Sq          67.45    97.25    98.23    97.87
R-Sq(adj)     64.50    96.70    97.64    97.44
C-p           138.7    5.5      3.0      2.7

Example: Modeling PIQ

[Figure: plot of PIQ, MRI, Height, Weight]

Best Subsets Regression: PIQ versus MRI, Height, Weight
Response is PIQ

Vars   R-Sq   R-Sq(adj)   C-p    S        MRI  Height  Weight
1      14.3   11.9        7.3    21.212   X
1      0.9    0.0         13.8   22.810        X
2      29.5   25.5        2.0    19.510   X    X
2      19.3   14.6        6.9    20.878   X            X
3      29.5   23.3        4.0    19.794   X    X       X

Stepwise Regression: PIQ versus MRI, Height, Weight
Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.15
Response is PIQ on 3 predictors, with N = 38

Step          1        2
Constant      4.652    111.276

MRI           1.18     2.06
T-Value       2.45     3.77
P-Value       0.019    0.001

Height                 -2.73
T-Value                -2.75
P-Value                0.009

S             21.2     19.5
R-Sq          14.27    29.49
R-Sq(adj)     11.89    25.46
C-p           7.3
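
The criteria described above (R-squared, adjusted R-squared, S, and Cp) can be computed for every subset of predictors by brute force. The following is a minimal sketch in plain Python, not the algorithm any statistical package is confirmed to use. The helper names `solve`, `sse`, and `best_subsets` are hypothetical, and the data are synthetic (generated from a fixed seed), not the cement or PIQ data from the slides.

```python
from itertools import combinations
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def sse(X, y, cols):
    """Error sum of squares for an intercept plus the predictors in cols."""
    rows = [[1.0] + [row[j] for j in cols] for row in X]
    p = len(cols) + 1
    XtX = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    Xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    beta = solve(XtX, Xty)  # least squares via the normal equations
    return sum(
        (yi - sum(bj * rj for bj, rj in zip(beta, r))) ** 2
        for r, yi in zip(rows, y)
    )


def best_subsets(X, y):
    """Fit all 2^(p-1) subset models and report each selection criterion."""
    n, k = len(y), len(X[0])
    ybar = sum(y) / n
    ssto = sum((yi - ybar) ** 2 for yi in y)
    mse_full = sse(X, y, range(k)) / (n - (k + 1))  # MSE of the full model
    out = []
    for size in range(k + 1):
        for cols in combinations(range(k), size):
            p = size + 1  # number of parameters, including the intercept
            e = sse(X, y, cols)
            out.append({
                "cols": cols,
                "p": p,
                "r2": 1 - e / ssto,
                "r2_adj": 1 - (n - 1) / (n - p) * e / ssto,
                "cp": e / mse_full - (n - 2 * p),
                "s": (e / (n - p)) ** 0.5,
            })
    return out


# Synthetic data (an assumption for illustration): y depends on
# predictors 0 and 1 only; predictor 2 is pure noise.
random.seed(0)
n = 30
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(n)]
y = [2 + 3 * r[0] - 2 * r[1] + random.gauss(0, 1) for r in X]

results = best_subsets(X, y)
for r in sorted(results, key=lambda r: r["cp"]):
    print(r["cols"], r["p"], round(r["r2"], 3), round(r["r2_adj"], 3),
          round(r["cp"], 1), round(r["s"], 3))
```

Enumerating every subset is only practical for a small number of predictors, since the number of models doubles with each added predictor. For the full model the computed Cp equals p, matching the fact stated in the slides.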

