Statistics 203: Introduction to Regression and Analysis of Variance
Robust methods
Jonathan Taylor

Outline: Today's class · Heteroskedasticity · MLE for one sample problem · Weighted least squares · Estimating $\sigma^2$ · Weighted regression example · Robust methods · Example · M-estimators · Huber's $\psi$ · Hampel's $\psi$ · Tukey's $\psi$ · Solving for $\hat{\beta}$ · Iteratively reweighted least squares (IRLS) · Robust estimate of scale · Other resistant fitting methods · Why not always use robust regression?

Today's class
■ Weighted regression.
■ Robust methods.
■ Robust regression.

Heteroskedasticity
■ In our standard model, we have assumed that $\varepsilon \sim N(0, \sigma^2 I)$. That is, the errors are independent and have the same variance (homoskedastic).
■ We have discussed graphical checks for non-constant variance (heteroskedasticity), but not "remedies" for heteroskedasticity.
■ Suppose instead that $\varepsilon \sim N(0, \sigma^2 D)$ for some known diagonal matrix $D$.
■ Where does $D$ come from? If we see that the variance increases like $f(X_j)$, then we might choose $D_i = f(X_{ij})$.
■ What is the "maximum likelihood" thing to do?

MLE for one sample problem
■ Consider the simpler problem $Y_i \sim N(\mu, \sigma^2 D_i)$, with $\sigma^2$ and the $D_i$'s known.
■ The criterion to minimize is
  $-2 \log L(\mu \mid Y, \sigma) = \sum_{i=1}^{n} \frac{(Y_i - \mu)^2}{\sigma^2 D_i} + n \log(2\pi\sigma^2) + \sum_{i=1}^{n} \log(D_i).$
■ Differentiating and setting to zero,
  $-2 \sum_{i=1}^{n} \frac{Y_i - \hat{\mu}}{\sigma^2 D_i} = 0,$
  implying
  $\hat{\mu} = \sum_{i=1}^{n} \frac{Y_i}{D_i} \Big/ \sum_{i=1}^{n} \frac{1}{D_i}.$
■ Observations are weighted inversely proportionally to their variance.

Weighted least squares
■ In the regression model,
  $-2 \log L(\beta, \sigma \mid Y, D) = \sum_{i=1}^{n} \frac{\bigl(Y_i - \beta_0 - \sum_{j=1}^{p-1} \beta_j X_{ij}\bigr)^2}{\sigma^2 D_i} = \frac{1}{\sigma^2}(Y - X\beta)^T D^{-1} (Y - X\beta) = \frac{1}{\sigma^2}(Y - X\beta)^T W (Y - X\beta),$
  with $W = D^{-1}$.
■ Normal equations:
  $-2 X^T W (Y - X\hat{\beta}_W) = 0,$
  or
  $\hat{\beta}_W = (X^T W X)^{-1} X^T W Y.$
■ Distribution of $\hat{\beta}_W$:
  $\hat{\beta}_W \sim N\bigl(\beta, \sigma^2 (X^T W X)^{-1}\bigr).$
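To make the formulas above concrete, here is a minimal numpy sketch of the weighted least squares fit, using made-up data in which the error variance grows with the predictor (the variable names, sample size, and true coefficients are assumptions for illustration, not from the slides). It also checks that, for an intercept-only design, the WLS estimator reduces to the precision-weighted mean derived on the one-sample slide.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data (assumed, not from the slides): the error variance grows
# with x, so D_i = x_i and the weights are W_i = 1 / D_i (inverse variance).
n = 200
x = rng.uniform(1.0, 10.0, size=n)
beta_true = np.array([2.0, 0.5])                 # assumed intercept and slope
sigma = 1.5
y = beta_true[0] + beta_true[1] * x + rng.normal(0.0, sigma * np.sqrt(x))

X = np.column_stack([np.ones(n), x])             # design matrix with intercept
W = np.diag(1.0 / x)                             # W = D^{-1}

# Weighted least squares: beta_W = (X^T W X)^{-1} X^T W Y
beta_W = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print("WLS estimate of (beta_0, beta_1):", beta_W)

# Sanity check: with an intercept-only design, the WLS fit is exactly the
# precision-weighted mean mu_hat = sum(Y_i / D_i) / sum(1 / D_i) from the
# one-sample MLE slide.
X0 = np.ones((n, 1))
mu_wls = np.linalg.solve(X0.T @ W @ X0, X0.T @ W @ y).item()
mu_formula = np.sum(y / x) / np.sum(1.0 / x)
print(mu_wls, mu_formula)                        # the two values agree
```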
Estimating $\sigma^2$
■ What are the right residuals?
■ If we knew $\beta$ exactly, then
  $Y_i - \beta_0 - \sum_{j=1}^{p-1} X_{ij}\beta_j \sim N(0, \sigma^2 / W_i).$
■ This suggests that the natural residual is
  $e_{W,i} = \sqrt{W_i}\,\bigl(Y_i - \hat{Y}_{W,i}\bigr) = \sqrt{W_i}\, e_i,$
  where $\hat{Y}_W = X\hat{\beta}_W = X (X^T W X)^{-1} X^T W Y$.
■ Estimate of $\sigma^2$:
  $\hat{\sigma}^2_W = \frac{1}{n-p} \sum_{i=1}^{n} e_{W,i}^2 = \frac{1}{n-p} \sum_{i=1}^{n} W_i e_i^2 \sim \sigma^2 \frac{\chi^2_{n-p}}{n-p}.$

Weighted regression example
[Figure: weighted regression example.]

Robust methods
■ We have also discussed outlier detection, but no specific remedies.
■ One alternative is to discard potential outliers – not always a good idea.
■ Outliers can really mess up the sample mean, but have relatively little effect on the sample median.
■ We could also "downweight" outliers: this is the basis of robust techniques.
■ Another "cause" of outliers may be that the data are not really normally distributed.

Example
■ Suppose that we have a sample $(Y_i)_{1 \le i \le n}$ from
  $f(y \mid \mu, \sigma) = \frac{1}{2\sigma} e^{-|y-\mu|/\sigma}.$
  This (the double-exponential, or Laplace, distribution) has heavier tails than the normal distribution.
■ MLE for $\mu$:
  $\hat{\mu} = \operatorname{argmin}_\mu \sum_{i=1}^{n} |Y_i - \mu|.$
■ It can be shown that $\hat{\mu}$ is the sample median (an exercise in STATS 116); see the numerical sketch at the end of this excerpt.
■ Take-home message: if the errors are not really normally distributed, then least squares is not the MLE, and the MLE downweights large residuals relative to least squares.

M-estimators
■ Depending on the error distribution of $\varepsilon$ (assuming i.i.d. errors) we get different optimization problems.
■ Suppose that $f(y \mid \mu,$
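A small numerical check of the double-exponential example above, with made-up data (the location, scale, and sample size are assumptions for illustration): minimizing the sum of absolute deviations over a fine grid of candidate values for $\mu$ recovers the sample median, in line with the claim that the Laplace MLE for $\mu$ is the median.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sample (assumed parameters) from a double-exponential (Laplace)
# distribution, which has heavier tails than the normal.
y = rng.laplace(loc=3.0, scale=2.0, size=501)

# The MLE for mu minimizes sum_i |Y_i - mu|.  Evaluate this criterion on a
# fine grid of candidate values and take the minimizer.
grid = np.linspace(y.min(), y.max(), 5001)
sad = np.abs(y[:, None] - grid[None, :]).sum(axis=0)   # sum of absolute deviations
mu_hat = grid[np.argmin(sad)]

# The grid minimizer matches the sample median (up to grid resolution),
# consistent with the claim on the "Example" slide.
print(mu_hat, np.median(y))
```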