POLYNOMIAL REGRESSION (Chapter 9)

We have discussed curvilinear regression from transformations and from polynomials.

1) Transformations are generally more interpretable, often more easily interpreted in terms of a possible functional relationship (extrapolation, interpretation of parameters). However, there are cases where the functional relationship is a polynomial. Parabolic shapes exist for a number of relationships, particularly in engineering.

2) Polynomials are very flexible, and useful where a model must be developed empirically. They fit a wide range of curvature. Use them when the functional relationship is overly complicated, but a repeatable pattern of the dependent variable is likely. When applied in this context, we generally fit a quadratic, cubic, maybe a quartic, and then see if we can reduce the model by a few terms. In this case, the polynomial may provide a good approximation of the relationship.

Basically, with polynomials we can

1) Determine if there is a curvilinear relationship between Y and X.
2) Determine if the curvature is quadratic, cubic, quartic, ...
3) Obtain the curvilinear predictive equation for Y on X.

Simplest polynomial: one independent variable, second order

    Yi = b0 + b1 Xi + b2 Xi^2 + ei

Next larger polynomial: one independent variable, third order

    Yi = b0 + b1 Xi + b2 Xi^2 + b3 Xi^3 + ei

Fourth order:

    Yi = b0 + b1 Xi + b2 Xi^2 + b3 Xi^3 + b4 Xi^4 + ei

NOTES on POLYNOMIAL REGRESSION

1) Polynomial regressions are fitted successively, starting with the linear term (a first order polynomial). These are tested in order, so Sequential SS are appropriate.

2) When the highest order term is determined, all lower order terms are also included. If, for instance, we fit a fifth order polynomial and only the CUBIC term is significant, then we would OMIT THE HIGHER ORDER NON-SIGNIFICANT TERMS, BUT RETAIN THOSE TERMS OF SMALLER ORDER THAN THE CUBIC.
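The successive-fitting idea in note 1 can be sketched numerically. This is a minimal sketch with hypothetical data (a quadratic trend plus small fixed deviations); the "sequential SS" for each term is computed as the drop in residual SS when that term is added:

```python
import numpy as np

# Hypothetical data: a quadratic trend with small fixed "noise".
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = (2.0 + 0.5 * x + 0.3 * x**2
     + np.array([0.1, -0.2, 0.05, 0.1, -0.1, 0.15, -0.05, 0.0]))

def rss(order):
    """Residual sum of squares after fitting a polynomial of the given order."""
    coefs = np.polyfit(x, y, order)
    return float(np.sum((y - np.polyval(coefs, x)) ** 2))

# Sequential SS for each added term: the reduction in residual SS when
# going from order k-1 to order k (linear, then quadratic, then cubic).
seq_ss = {k: rss(k - 1) - rss(k) for k in (1, 2, 3)}
```

Since the underlying trend is quadratic, the sequential SS for the quadratic term is large while the cubic term adds almost nothing, which is exactly the pattern used to decide where to stop.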
This does not mean that Y = b0 + b1 X + e is not a useful model, only that this is not a "polynomial".

3) If there are s different values of X, then s-1 polynomial terms (plus the intercept) will pass through every point (or the mean of every point if there is more than one observation per X value). It is often recommended that not more than 1/3 of the total number of points (different X values) be tied up in polynomial terms.

   eg. If we are fitting a polynomial to the 12 months of the year, don't use more than 4 polynomial terms (quartic).

4) All of the assumptions for regression apply to polynomials.

5) Polynomials are WORTHLESS outside the range of observed data; do not try to extend predictions beyond this range. Extrapolation is useless unless the functional relationship is actually a parabola.

INFLECTIONS for Polynomial Regression lines

   Linear      straight line, no curve or inflections
   Quadratic   one parabolic curve, no inflections
   Cubic       two parabolic rates of curvature with the possibility of an inflection point

Each additional term allows for another change in the rate of curvature, and allows for an additional inflection.

APPLICATIONS of Polynomial Regression lines

1) If enough polynomial terms are used, these curves will fit about anything. However, there is usually no good theoretical reason for using polynomial curves.

   eg. Suppose we have a model where we expect an exponential type growth curve to result. We could fit this with a quadratic or cubic or quartic polynomial, but the exponential curve would fit with two advantages:
   a) Good interpretation of the regression coefficient (proportional growth).
   b) Uses fewer d.f. in a simpler model.

2) Polynomials are useful for testing for the presence of curvature, and the nature of that curvature (inflections or no), and can be used to fit trends with complex curvature where no particular theoretical function is known to be applicable.
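The extrapolation warning in note 5 can be illustrated with a small sketch, reusing the exponential-growth example from application 1 (the data here are hypothetical): a quadratic tracks exp(x) well inside the fitted range but fails badly beyond it.

```python
import numpy as np

# Hypothetical data: exponential growth observed only on x in [0, 2].
x = np.linspace(0.0, 2.0, 9)
y = np.exp(x)

coefs = np.polyfit(x, y, 2)  # quadratic approximation to the curve

# Absolute error at a point inside the data range vs. far outside it.
inside = abs(np.polyval(coefs, 1.0) - np.exp(1.0))
outside = abs(np.polyval(coefs, 5.0) - np.exp(5.0))
```

The prediction error far outside the observed range is orders of magnitude larger than the error inside it, even though the fit within [0, 2] looks excellent.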
This is also a useful "covariable" in designs.

3) The successive terms in polynomials are highly correlated. This is not a problem when Sequential SS are used.

4) Recall Lack of Fit. Each individual X value has a mean, and the Pure error results from fitting this mean and calculating deviations from this mean. If a high order polynomial is used, such that the order is one less than the number of different X values, then Pure error is obtained.

   eg. Two different X values can be fitted to a line.
       Three different X values can be fitted to a line + quadratic, etc.

Therefore, fitting too high an order polynomial is no more meaningful as a "regression" than fitting Pure error (ie all X are categories). Conceptually, we GAIN understanding and interpretability when we fit, say, 12 levels with only 2 or 3 degrees of freedom (hopefully with no Lack of Fit) as opposed to 11. Therefore, a meaningful fit should be provided by a relatively low order polynomial (generally no more than 1/3 of the possible df, certainly no more than 1/2), and hopefully such that there is no Lack of Fit. Otherwise, you may as well go to ANOVA with X as a class variable (CLASS X;).

When fitting polynomials, particularly high order polynomials, there are a number of problems.

1) Variables can get very large. Imagine a model done over time (within a year), where we regress on the Julian date (from 1 to 365 days). If we fit only a 4th order polynomial, by mid-December we are up to 350^4 = 15,006,250,000. The corresponding regression coefficient will be very small. SAS warns when the regression coefficients get too small to ensure a minimum of precision. A simple solution is to rescale the regression: use months 1.0 to 12.9, or even year = days * (1/365), a decimal year.

2) There is a high correlation between the different variables which are powers of X. This causes problems with the regression coefficients.
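The scaling issue in problem 1 can be checked directly; this minimal sketch follows the Julian-date example in the text:

```python
import numpy as np

# Julian date runs 1..365; its fourth power explodes by mid-December.
days = np.arange(1, 366, dtype=float)
big = 350.0 ** 4          # the raw fourth-power term at day 350

# Rescaling to a decimal year (day/365) keeps every power in (0, 1].
x = days / 365.0
small = float(x.max() ** 4)
```

After rescaling, all powers of the regressor stay on a sensible scale, so the corresponding coefficients no longer have to be vanishingly small.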
The ability to get a good predictive equation is not impaired (unless the regression coefficients get too small or too large). Also, the actual values of the coefficients themselves are unstable, a common consequence of multicollinearity. Fortunately, the solution which helps with multicollinearity will also help with rounding error and with large or small coefficients.

a) One possibility is orthogonal polynomials. These are transformations of X and its powers to extract pure linear, quadratic, cubic, etc. components. These are uncorrelated, and especially useful for design problems.

   eg. Any three equally spaced X values can be represented as

       Linear     -1   0   1
       Quadratic  -1   2  -1

   Any four equally spaced X values can be replaced by

       Linear     -3  -1   1   3
       Quadratic   1  -1  -1   1
       Cubic      -1   3  -3   1
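The contrast coefficients above can be verified numerically. This minimal check confirms that the three-point linear and quadratic contrasts are orthogonal to each other and to the intercept (constant) term, which is why they avoid the multicollinearity of raw powers:

```python
import numpy as np

# Orthogonal polynomial contrasts for three equally spaced X values.
linear = np.array([-1.0, 0.0, 1.0])
quadratic = np.array([-1.0, 2.0, -1.0])

# Each contrast sums to zero (orthogonal to the intercept), and the
# two contrasts are orthogonal to each other (zero dot product).
lin_dot_quad = float(linear @ quadratic)
lin_sum = float(linear.sum())
quad_sum = float(quadratic.sum())
```

Regressing on these contrast variables instead of on X and X^2 gives coefficient estimates that do not change as further terms are added.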

