MIAMI IES 612 - Lecture Notes

Example: Manatee Deaths predicted from Number of Boats Registered
IES 612/STA 4-573/STA 4-576
Spring 2005
Week 03 – IES612-lecture-week03.doc
07:13 Monday, January 14, 2019

Checking Model Assumptions (OL 13.4) – an initial visit

RECALL: Basic Model
$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$ ["simple linear regression"]
$\epsilon_i \sim$ indep. $N(0, \sigma^2)$

Definition:
[Def 1] (Raw) Residuals = observed response – predicted response, or $e_i = y_i - \hat{y}_i$
[Def 2] (Standardized Residuals) $e_i / \sqrt{MSE} = e_i / \sqrt{s^2}$ = residual / std. dev.
[Def 3] (Studentized Residuals) $e_i / \sqrt{MSE (1 - h_{ii})} = e_i / \sqrt{s^2 (1 - h_{ii})}$ = residual / adj. SD

For each assumption: Diagnostic? (How do you check the assumption?) Remediation?

1. $E(\epsilon_i) = 0$ -> $E(Y_i) = \beta_0 + \beta_1 X_i$ -> line is a reasonable model for describing mean change as a function of x
   Diagnostic:
     D1.1: Plot $e_i$ vs. $\hat{y}_i$
     D1.2: Plot $e_i$ vs. $x_i$ [check to see if pattern exists]
     D1.3: Plot $Y_i$ vs. $x_i$ and superimpose plot of $\hat{y}_i$ vs. $x_i$
     D1.4: Large $R^2$ / signif. slope
   Remediation:
     Curvature? Polynomial regression model or nonlinear regression model
     Smooth regression? LOWESS
     Transformation? Log/square root

2. $V(\epsilon_i) = \sigma^2$ -> $V(Y_i) = \sigma^2$ -> constant variance -> scatter about the line is the same regardless of the value of x
   Diagnostic:
     D2.1: Plot $e_i$ vs. $\hat{y}_i$ [check to see if you have a constant band about zero]
   Remediation:
     Weighted Least Squares?
     Transformation

3. $\epsilon_i \sim$ Normal
   Diagnostic:
     D3.1: Normal probability plot of $e_i$ [see if linear]
     D3.2: Histogram of residuals [bell-shaped?]
   Remediation:
     Transformation?
     Generalized Linear Models (e.g. logistic/probit regression for dichotomous responses; Poisson regression for count responses)

4. $\epsilon_i$ independent
   Diagnostic:
     D4.1: Generally examining the design can suggest if this is true
     D4.2: Durbin-Watson test
   Remediation:
     Correlated regression models?
     Time series/spatial methods

5.* no important omitted variables {relates to pt. 1}
   Diagnostic:
     D5.1: Plot $e_i$ vs. omitted variables [see if pattern]
   Remediation:
     Add omitted variable to a model (multiple regression)

6.* no points exerting undue influence
   Diagnostic:
     D6.1: Look at statistics that quantify influence (e.g. DFBETAS, DFFITS, etc.)
     D6.2: Look for extreme X values (break in stemplots of X)
   Remediation:
     Smooth model-robust fitting procedure (e.g. Least Absolute Value regression)

7.* no extreme outliers impacting inference
   Diagnostic:
     D7.1: Large residual (e.g. standardized/studentized residual > 3, or > 2?)
     D7.2: Break in stemplot of residuals
   Remediation:
     Check to see if data sheet correct – fix? Don't simply omit. Report analysis both including/excluding point?

Example: Manatee Deaths predicted from Number of Boats Registered

    options ls=75;
    data example1;
      input year nboats manatees;
      cards;
    77 447 13
    78 460 21
    79 481 24
    80 498 16
    81 513 24
    82 512 20
    83 526 15
    84 559 34
    85 585 33
    86 614 33
    87 645 39
    88 675 43
    89 711 50
    90 719 47
    ;
    ODS RTF;
    *file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\linreg-output.rtf';
    proc reg;
      title 'Number of Manatees killed regressed on the number of boats registered in Florida';
      model manatees = nboats / p r cli clm;
      plot manatees*nboats p.*nboats / overlay;
      plot r.*nboats r.*p.;   * residuals vs x and yhat;
      plot r.*nqq.;           * normal qqplot;
    run;
    ODS RTF CLOSE;

Residuals plot – model adequate? Constant variance?
* now in Excel
[Figure: Plot of Residuals vs. Predicted. Horizontal axis: Yhat (predicted response); vertical axis: Residual.]
[Figure: SAS plot of manatees*nboats with PRED*nboats overlaid. Fitted line: manatees = -41.43 + 0.1249 nboats; N = 14, Rsq = 0.8864, AdjRsq = 0.8769, RMSE = 4.2764.]

* now in Excel
[Figure: Scatterplot of Manatee Deaths with superimposed fit. Horizontal axis: Number of Boats (1000s); vertical axis: Manatees Killed.]

Studentized Residuals – outliers?

Output Statistics
    Obs    -2-1 0 1 2       Cook's D
      1   |      |      |    0.017
      2   |      |**    |    0.178
      3   |      |**    |    0.149
      4   |    **|      |    0.091
      5   |      |      |    0.006
      6   |     *|      |    0.021
      7   |  ****|      |    0.244
      8   |      |**    |    0.073
      9   |      |      |    0.005
     10   |     *|      |    0.015
     11   |      |      |    0.000
     12   |      |      |    0.000
     13   |      |*     |    0.091
     14   |      |      |    0.027

[Figure: Studentized Residuals plotted by Observation (1 to 14).]

Normal errors? - Normal quantile-quantile plot
[Figure: Residual vs. Normal Quantile for the fit manatees = -41.43 + 0.1249 nboats; N = 14, Rsq = 0.8864, AdjRsq = 0.8769, RMSE = 4.2764.]

Multiple Regression (OL Chapter 12)
* More than one predictor variable

Example: Lung function in miners exposed to coal dust
$FEV_1 = \beta_0 + \beta_1 COAL + \beta_2 AGE + \beta_3 HT + \beta_4 SMOKING + \epsilon$

Example: Polynomial regression
$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \epsilon$
or
$Y = \beta_0 + \beta_1 (X - \bar{X}) + \beta_2 (X - \bar{X})^2 + \epsilon$

Example: Indicator variables – e.g. different lines in different groups
$Y = \beta_0 + \beta_1 I_{group2} + \beta_2 X + \beta_3 I_{group2} X + \epsilon$
where $I_{group2} = 1$ (group 2) and $I_{group2} = 0$ (group 1)

GROUP 1: $Y = \beta_0 + \beta_2 X + \epsilon$
GROUP 2: $Y = (\beta_0 + \beta_1) + (\beta_2 + \beta_3) X + \epsilon$
So, GROUP 2 INTERCEPT differs from GROUP 1 intercept by $\beta_1$
GROUP 2 SLOPE differs from GROUP 1 slope by $\beta_3$

GENERAL FORM:
$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik} + \epsilon_i$
$\epsilon_i \sim N(0, \sigma^2)$
$i = 1, \ldots, n$ (observations)
$n > k$ (variables)

Comments:
1. "LINEAR" model because the regression coefficients enter the model in a linear way – compare
$Y = \beta_0 + \beta_1 X^3 + \beta_2 \sin(X) + \epsilon$ (linear in the coefficients) and
$Y = \beta_0 X^{\beta_1} + \epsilon$ (not linear in $\beta_1$)

So, how does a multiple regression model (MR) differ from simple linear regression (SLR)?
i. SLR is the equation of a LINE; MR is the equation of a (hyper-)PLANE
ii. $\beta_0$ is the mean response when X=0 in SLR while $\beta_0$ is the mean response when ALL X's=0 in MR
iii. 2 regression coefficients in SLR; k+1 regression coefficients in MR
iv. interpretation of coefficients? Partial coefficients in MR
v. Model scope (space covered by the Xs)

Estimating regression coefficients
Least squares – minimize
$\sum_{i=1}^{n} [Y_i - E(Y_i)]^2 = \sum_{i=1}^{n} [Y_i - (\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik})]^2$

Estimate of $\sigma^2$
$\hat{\sigma}^2 = s^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - (k+1)}$ = (sum of squared residuals) / (number of observations - number of reg. parms)

F Test of any relationship between Y and set of predictor variables
H0: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$
Ha: at least one of $\beta_i \neq 0$
TS: $F_{obs}$ = [SS(Reg)/k] / [SS(Resid)/(n-k-1)] = MS(Reg)/MS(Resid)
RR: Reject H0 if $F_{obs} > F_{\alpha, k, \ldots}$
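
The quantities defined in these notes can be checked by hand on the manatee data. What follows is a minimal sketch in plain Python (not the SAS used in these notes) that computes the least-squares fit, the three kinds of residuals from Def 1 to Def 3, and the F statistic with k = 1. Two formulas are assumptions brought in from standard textbook treatments rather than from the notes: the one-predictor leverage $h_{ii} = 1/n + (x_i - \bar{x})^2 / S_{xx}$ and Cook's D written as $r_i^2 h_{ii} / [(k+1)(1 - h_{ii})]$, where $r_i$ is the studentized residual. All variable names are illustrative.

```python
import math

# Manatee data from the example: (year, boats registered, manatees killed)
data = [
    (77, 447, 13), (78, 460, 21), (79, 481, 24), (80, 498, 16),
    (81, 513, 24), (82, 512, 20), (83, 526, 15), (84, 559, 34),
    (85, 585, 33), (86, 614, 33), (87, 645, 39), (88, 675, 43),
    (89, 711, 50), (90, 719, 47),
]
x = [float(row[1]) for row in data]
y = [float(row[2]) for row in data]
n = len(x)
k = 1  # one predictor variable (nboats)

# Least-squares estimates for simple linear regression
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# Def 1: raw residuals = observed response - predicted response
yhat = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, yhat)]

# Sums of squares, MSE = s^2 with n - (k+1) degrees of freedom
ss_resid = sum(e ** 2 for e in resid)
ss_total = sum((yi - ybar) ** 2 for yi in y)
ss_reg = ss_total - ss_resid
mse = ss_resid / (n - k - 1)
rmse = math.sqrt(mse)
r_squared = ss_reg / ss_total

# F test statistic: MS(Reg) / MS(Resid)
f_obs = (ss_reg / k) / mse

# Assumed standard formula: leverage h_ii for a one-predictor model
lev = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]

# Def 2 and Def 3: standardized and studentized residuals
standardized = [e / rmse for e in resid]
studentized = [e / math.sqrt(mse * (1.0 - h)) for e, h in zip(resid, lev)]

# Assumed standard formula: Cook's D from studentized residual and leverage
cooks_d = [r ** 2 * h / ((k + 1) * (1.0 - h)) for r, h in zip(studentized, lev)]

print(f"manatees = {b0:.2f} + {b1:.4f} nboats")
print(f"Rsq = {r_squared:.4f}, RMSE = {rmse:.4f}, Fobs = {f_obs:.2f}")
for obs, (r, d) in enumerate(zip(studentized, cooks_d), start=1):
    print(f"obs {obs:2d}: studentized = {r:6.2f}, Cook's D = {d:.3f}")
```

The printed fit and per-observation Cook's D values can be compared against the SAS Output Statistics table; the observation with the largest Cook's D is the natural first candidate to inspect under diagnostics D6.1 and D7.1.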

