MIT HST 583 - PART 3: REGRESSION ANALYSIS AND THE GENERALIZED LINEAR MODEL

WORKSHOP ON THE ANALYSIS OF NEURAL DATA 2001
MARINE BIOLOGICAL LABORATORY
WOODS HOLE, MASSACHUSETTS

A REVIEW OF STATISTICS
PART 3: REGRESSION ANALYSIS AND THE GENERALIZED LINEAR MODEL

EMERY N. BROWN
NEUROSCIENCE STATISTICS RESEARCH LABORATORY
DEPARTMENT OF ANESTHESIA AND CRITICAL CARE
MASSACHUSETTS GENERAL HOSPITAL
DIVISION OF HEALTH SCIENCES AND TECHNOLOGY
HARVARD MEDICAL SCHOOL / MIT

Regression Analysis/GLM, WAND 2001, Emery N. Brown, M.D., Ph.D.

Regression Analysis
A. Simple Regression
B. Model Assumptions
C. Model Fitting
D. Properties of Parameter Estimates
E. Model Goodness-of-Fit
   F-test
   $R^2$
   Analysis of Residuals
F. The Geometry of Regression (Method of Least-Squares)

A. Simple Regression

Assume we have a data set consisting of pairs of two variables, which we denote as $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. For example, $x$ and $y$ might be measurements of height and weight for a set of individuals from a well-defined cohort. Let us assume that there is a linear relation between $x$ and $y$, which may be written as

$$y = \alpha + \beta x.$$

Example 1. Consider this example taken from Draper and Smith (1981), Applied Regression Analysis.

Figure 1. Relation between Monthly Steam Production and Mean Atmospheric Temperature.

The variable $y$ is the amount of steam produced per month in a plant and the variable $x$ is the mean atmospheric temperature. There is an obvious negative relation.

B. Model Assumptions

We assume:
i) $E[y \mid x] = \alpha + \beta x$;
ii) the $x$'s are fixed, non-random covariates;
iii) the $y$'s are independent Gaussian random variables with mean $\alpha + \beta x_i$ and variance $\sigma^2$.

C. Model Fitting

Our objective is to estimate the parameters $\alpha$, $\beta$, and $\sigma^2$. Because $y$ is assumed to have a Gaussian distribution conditional on $x$, a logical approach is to use maximum likelihood estimation. For these data the joint probability density (likelihood) is

$$f(y \mid x, \alpha, \beta, \sigma^2) = \prod_{i=1}^{N} (2\pi\sigma^2)^{-1/2} \exp\left\{ -\frac{(y_i - \alpha - \beta x_i)^2}{2\sigma^2} \right\}.$$

The log likelihood is

$$\log f(y \mid x, \alpha, \beta, \sigma^2) = -\frac{N}{2}\log(2\pi\sigma^2) - \sum_{i=1}^{N} \frac{(y_i - \alpha - \beta x_i)^2}{2\sigma^2}.$$

Differentiating with respect to $\alpha$ and $\beta$ yields

$$\frac{\partial \log f}{\partial \alpha} = \frac{1}{\sigma^2}\sum_{i=1}^{N}(y_i - \alpha - \beta x_i)$$

$$\frac{\partial \log f}{\partial \beta} = \frac{1}{\sigma^2}\sum_{i=1}^{N}(y_i - \alpha - \beta x_i)\,x_i.$$

Setting the derivatives equal to zero yields the normal equations

$$N\alpha + \beta\sum_{i=1}^{N} x_i = \sum_{i=1}^{N} y_i, \qquad \alpha\sum_{i=1}^{N} x_i + \beta\sum_{i=1}^{N} x_i^2 = \sum_{i=1}^{N} x_i y_i.$$

Or in matrix form

$$\begin{pmatrix} N & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix}\begin{pmatrix}\alpha\\ \beta\end{pmatrix} = \begin{pmatrix}\sum y_i\\ \sum x_i y_i\end{pmatrix}$$

or

$$\begin{pmatrix}\alpha\\ \beta\end{pmatrix} = \begin{pmatrix} N & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix}^{-1}\begin{pmatrix}\sum y_i\\ \sum x_i y_i\end{pmatrix}.$$

The solutions for $\beta$ and $\alpha$ are

$$\hat\beta = \frac{\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)/N}{\sum x_i^2 - \left(\sum x_i\right)^2/N} = \frac{\sum (x_i - \bar x)(y_i - \bar y)}{\sum (x_i - \bar x)^2}$$

$$\hat\alpha = \bar y - \hat\beta \bar x.$$

We may write any fitted value as

$$\hat y = \hat\alpha + \hat\beta x = \bar y - \hat\beta\bar x + \hat\beta x = \bar y + \hat\beta(x - \bar x).$$

If we go back and differentiate the log likelihood with respect to $\sigma^2$, we obtain the maximum likelihood estimate

$$\hat\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat\alpha - \hat\beta x_i)^2.$$

Remarks

1. Choosing $\alpha$ and $\beta$ by maximum likelihood is equivalent to the method of least squares in this case. By the method of least squares, we minimize the sum of the squared vertical deviations of the data from the regression line.

2. The estimate $\hat\alpha$ shows that every regression line goes through the point $(\bar x, \bar y)$.

3. The residuals are $y_i - \hat y_i$; they are the components of the data which the model does not explain. Since $\hat y_i = \bar y + \hat\beta(x_i - \bar x)$, we note that

$$\sum_{i=1}^{N}(y_i - \hat y_i) = \sum_{i=1}^{N}\left[(y_i - \bar y) - \hat\beta(x_i - \bar x)\right] = 0.$$

4. The Pythagorean Relation. Writing

$$y_i - \bar y = (\hat y_i - \bar y) + (y_i - \hat y_i),$$

we have

$$\sum_{i=1}^{N}(y_i - \bar y)^2 = \sum_{i=1}^{N}(\hat y_i - \bar y)^2 + \sum_{i=1}^{N}(y_i - \hat y_i)^2 + 2\sum_{i=1}^{N}(\hat y_i - \bar y)(y_i - \hat y_i).$$

N.B. The cross term vanishes:

$$\sum_{i=1}^{N}(\hat y_i - \bar y)(y_i - \hat y_i) = \hat\beta\sum_{i=1}^{N}(x_i - \bar x)(y_i - \hat y_i) = \hat\beta\left[\sum_{i=1}^{N}(x_i - \bar x)(y_i - \bar y) - \hat\beta\sum_{i=1}^{N}(x_i - \bar x)^2\right] = 0,$$

and therefore

$$\sum_{i=1}^{N}(y_i - \bar y)^2 = \sum_{i=1}^{N}(\hat y_i - \bar y)^2 + \sum_{i=1}^{N}(y_i - \hat y_i)^2$$

Total sum of squares (TSS) = Explained sum of squares (ESS) + Residual sum of squares (RSS).

5. Correlation and Regression. Note that the correlation coefficient for $x$ and $y$ is

$$r_{xy} = \frac{\sum_{i=1}^{N}(x_i-\bar x)(y_i-\bar y)}{\left[\sum_{i=1}^{N}(x_i-\bar x)^2\sum_{i=1}^{N}(y_i-\bar y)^2\right]^{1/2}}$$

and recall that

$$\hat\beta = \frac{\sum_{i=1}^{N}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{N}(x_i-\bar x)^2}.$$

Hence,

$$\hat\beta = r_{xy}\left[\frac{\sum_{i=1}^{N}(y_i-\bar y)^2}{\sum_{i=1}^{N}(x_i-\bar x)^2}\right]^{1/2}.$$

The regression coefficient is a scaled version of the correlation coefficient.

D. Properties of the Parameter Estimates

The variances of the parameter estimates are

$$\mathrm{Var}(\hat\beta) = \frac{\sigma^2}{\sum_{i=1}^{N}(x_i-\bar x)^2}, \qquad \mathrm{Var}(\hat\alpha) = \frac{\sigma^2\sum_{i=1}^{N} x_i^2}{N\sum_{i=1}^{N}(x_i-\bar x)^2}.$$

If we estimate $\sigma^2$ by $\hat\sigma^2$, then $1-\rho$ confidence intervals for the parameters, based on the t-distribution, are

$$\hat\beta \pm t_{N-2,\,1-\rho/2}\,\frac{\hat\sigma}{\left[\sum_{i=1}^{N}(x_i-\bar x)^2\right]^{1/2}}$$

$$\hat\alpha \pm t_{N-2,\,1-\rho/2}\,\hat\sigma\left[\frac{\sum_{i=1}^{N} x_i^2}{N\sum_{i=1}^{N}(x_i-\bar x)^2}\right]^{1/2}.$$

We may also invert the above statistics to test hypotheses about the regression coefficients by constructing a t-test.

To construct a confidence interval for a predicted value $\hat y_k$ at a given $x$ value $x_k$, we note that

$$\mathrm{var}(\hat y_k) = \sigma^2\left[\frac{1}{N} + \frac{(x_k-\bar x)^2}{\sum_{i=1}^{N}(x_i-\bar x)^2}\right]$$

and hence an approximate $1-\rho$ confidence interval is

$$\hat y_k \pm t_{N-2,\,1-\rho/2}\,\hat\sigma\left[\frac{1}{N} + \frac{(x_k-\bar x)^2}{\sum_{i=1}^{N}(x_i-\bar x)^2}\right]^{1/2}.$$

Example 1. (continued)

Figure 2. Fit and Confidence Intervals for a Simple Linear Regression Model.

E. Model Goodness-of-Fit
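The closed-form estimates of Section C are easy to check numerically. The following is a minimal sketch (not part of the original notes): the function name and toy data are invented, but the formulas are exactly the normal-equation solutions and the maximum likelihood estimate of the variance given above.

```python
# Maximum-likelihood / least-squares fit of y = alpha + beta*x.
# Illustrative sketch only; function name and toy data are invented.

def fit_simple_regression(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # beta-hat = sum((x_i - xbar)(y_i - ybar)) / sum((x_i - xbar)^2)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    beta = sxy / sxx
    # alpha-hat = ybar - beta-hat * xbar, so the line passes through (xbar, ybar)
    alpha = ybar - beta * xbar
    # ML estimate of sigma^2 divides the residual sum of squares by N (not N - 2)
    sigma2 = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y)) / n
    return alpha, beta, sigma2

# Exactly linear toy data y = 1 + 2x: the fit recovers alpha = 1, beta = 2
# with zero residual variance.
alpha, beta, sigma2 = fit_simple_regression([1.0, 2.0, 3.0, 4.0],
                                            [3.0, 5.0, 7.0, 9.0])
```

Note that the maximum likelihood estimate of $\sigma^2$ divides by $N$; the unbiased estimate used for the t-intervals in Section D instead divides the residual sum of squares by $N-2$.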

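The Pythagorean relation of Remark 4 can likewise be verified numerically. A sketch under invented data and names: fit the line, form TSS, ESS, and RSS as defined above, and confirm that TSS = ESS + RSS holds to round-off even when the fit is imperfect.

```python
# Verify the Pythagorean relation TSS = ESS + RSS for a least-squares fit.
# Illustrative sketch; the data and names are invented for the check.

def sum_of_squares_decomposition(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    beta = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
            / sum((xi - xbar) ** 2 for xi in x))
    alpha = ybar - beta * xbar
    yhat = [alpha + beta * xi for xi in x]
    tss = sum((yi - ybar) ** 2 for yi in y)               # total
    ess = sum((yh - ybar) ** 2 for yh in yhat)            # explained
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # residual
    return tss, ess, rss

# Noisy data: the decomposition holds exactly (up to round-off),
# even though the residuals are nonzero.
tss, ess, rss = sum_of_squares_decomposition(
    [0.0, 1.0, 2.0, 3.0, 4.0], [0.1, 0.9, 2.2, 2.8, 4.1])
assert abs(tss - (ess + rss)) < 1e-9
```

The goodness-of-fit statistic $R^2$ listed under Section E is then simply ESS/TSS.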

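Finally, Remark 5's identity $\hat\beta = r_{xy}\,[\sum(y_i-\bar y)^2/\sum(x_i-\bar x)^2]^{1/2}$ can be checked on a data set with a negative relation, mirroring the steam-production example; the data below are invented. The identity holds with either sign because $r_{xy}$ carries the sign of the cross-product sum.

```python
import math

# Check Remark 5: the regression slope is a scaled correlation coefficient,
# beta-hat = r_xy * sqrt(S_yy / S_xx). Sketch with invented data.

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.2, 3.9, 3.1, 2.2, 0.8]           # roughly linear, negative relation
xbar = sum(x) / len(x)
ybar = sum(y) / len(y)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)

beta = sxy / sxx                         # regression slope
r_xy = sxy / math.sqrt(sxx * syy)        # correlation coefficient

# Both are negative here, and the scaling identity holds to round-off.
assert abs(beta - r_xy * math.sqrt(syy / sxx)) < 1e-12
```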