LSU EXST 7034 - Qualitative indicator variables

Qualitative indicator variables

An indicator variable is a variable that distinguishes between qualitative categories.

The easiest way to create an indicator variable is to
1) Choose the category to be singled out.
2) In a separate column of the X matrix, put a 1 wherever the observation belongs to the chosen category and a 0 otherwise.
3) Repeat once for each category of the qualitative variable.

We have also seen the value 1 used as the first column in an X matrix to fit the mean. This is one use of an indicator variable, but indicator variables can be used to fit other means as well.

Take for example the data set

Category  Value
A         3
A         4
A         5
B         2
B         3
B         4
C         5
C         6
C         7

There are three groups here, with means of 4, 3 and 6 respectively. Suppose we wish to distinguish between these in an X matrix.

$$X = \begin{bmatrix} 1&1&0&0 \\ 1&1&0&0 \\ 1&1&0&0 \\ 1&0&1&0 \\ 1&0&1&0 \\ 1&0&1&0 \\ 1&0&0&1 \\ 1&0&0&1 \\ 1&0&0&1 \end{bmatrix}, \qquad X'X = \begin{bmatrix} 9&3&3&3 \\ 3&3&0&0 \\ 3&0&3&0 \\ 3&0&0&3 \end{bmatrix}$$

but this matrix is singular: it has 4 columns for 3 groups.

There are 3 groups, so we can use 2 degrees of freedom after the mean, 3 altogether. How about the following options? Each row below is repeated three times in X, once for each observation in the category.

Category   SAS (drop last)   Drop 1 (means)   Contrasts   Orth. poly
A           1  1  0           1  0  0          1  1  0     1 -1  1
B           1  0  1           0  1  0          1 -1  1     1  0 -2
C           1  0  0           0  0  1          1  0 -1     1  1  1

So how do these means fit in with regression?

In regression we have quantitative variables as well as the qualitative ones.

$$X = \begin{bmatrix} 1&X_{11}&1&0 \\ 1&X_{21}&1&0 \\ 1&X_{31}&1&0 \\ 1&X_{41}&0&1 \\ 1&X_{51}&0&1 \\ 1&X_{61}&0&1 \\ 1&X_{71}&0&0 \\ 1&X_{81}&0&0 \\ 1&X_{91}&0&0 \end{bmatrix}, \qquad X'X = \begin{bmatrix} 9 & \sum_{i=1}^{9} X_{i1} & 3 & 3 \\ \sum_{i=1}^{9} X_{i1} & \sum_{i=1}^{9} X_{i1}^2 & \sum_{i=1}^{3} X_{i1} & \sum_{i=4}^{6} X_{i1} \\ 3 & \sum_{i=1}^{3} X_{i1} & 3 & 0 \\ 3 & \sum_{i=4}^{6} X_{i1} & 0 & 3 \end{bmatrix}$$

Without the quantitative variable, the indicator variables fit means, which are level adjustments.
With the quantitative variable, the indicator variables fit intercepts, which are also level adjustments.

SLR:
    $Y_i = \beta_0 + \beta_1 X_{i1} + \varepsilon_i$

Multiple regression with an indicator variable:
    $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i$
where $X_{i2}$ is an indicator variable.

When $X_{i2} = 0$, then
    $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 \cdot 0 + \varepsilon_i = \beta_0 + \beta_1 X_{i1} + \varepsilon_i$
which is the SLR.

When $X_{i2} = 1$, then
    $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 \cdot 1 + \varepsilon_i$
where both $\beta_0$ and $\beta_2$ are constants, so let $\beta_0' = \beta_0 + \beta_2$:
    $Y_i = (\beta_0 + \beta_2) + \beta_1 X_{i1} + \varepsilon_i = \beta_0' + \beta_1 X_{i1} + \varepsilon_i$
which is another SLR with a different intercept.

    First line:   $E(Y_i) = \beta_0 + \beta_1 X_{i1}$
    Second line:  $E(Y_i) = (\beta_0 + \beta_2) + \beta_1 X_{i1}$

Note that there is only 1 value for the slope, so both lines have the same slope and are parallel.

The effect of interactions with the indicator variable

SLR:
    $Y_i = \beta_0 + \beta_1 X_{i1} + \varepsilon_i$

Multiple regression with the added indicator variable:
    $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i$

Multiple regression with an indicator variable and interaction term:
    $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i1} X_{i2} + \varepsilon_i$

When $X_{i2} = 0$, then
    $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 \cdot 0 + \beta_3 X_{i1} \cdot 0 + \varepsilon_i$
    $Y_i = \beta_0 + \beta_1 X_{i1} + \varepsilon_i$

When $X_{i2} = 1$, then
    $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 \cdot 1 + \beta_3 X_{i1} \cdot 1 + \varepsilon_i$
    $Y_i = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) X_{i1} + \varepsilon_i$
    $Y_i = \beta_0' + \beta_1' X_{i1} + \varepsilon_i$
where $\beta_0' = \beta_0 + \beta_2$ and $\beta_1' = \beta_1 + \beta_3$.

NOTE that both the intercept and the slope are different (though not necessarily significantly so). This essentially fits two entirely different regression lines (2 slopes and 2 intercepts).

    First line:   $E(Y_i) = \beta_0 + \beta_1 X_{i1}$
    Second line:  $E(Y_i) = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) X_{i1}$

Now there are 2 slopes, so each line has its own slope and intercept. These are two separate lines.

Interpreting the estimated coefficients

When the indicator is 0, the coefficients $\beta_0$ and $\beta_1$ represent the intercept and slope of the group which is allocated the "0" indication. The model reduces to an SLR.

When the indicator is 1, the coefficients $\beta_0$ and $\beta_1$ are combined with $\beta_2$ and $\beta_3$ (respectively) to create a new intercept and slope:
    $Y_i = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) X_{i1} + \varepsilon_i$

Therefore, $\beta_2$ is a value which shows how much GREATER (or less, if $\beta_2$ is negative) the intercept for the second group (1) is than that of the first (0). A test of this value (against 0) is actually a test of the difference in the intercepts (i.e. if $\beta_2 = 0$ then the intercepts are not different).

Likewise, $\beta_3$ is a value which shows how different the slope for the second group (1) is from that of the first (0). A test of this value is actually a test of the difference in the slopes.

HANDOUTS
1) Raw data (see coding) and raw data plot
2) Overall SLR: see coefficients and overall fit
   Residual plot: not very good
3) Separate intercepts: improved fit, different coefficients
   Residual plot: better, but ...
4) Separate slopes and intercepts: improved fit, different coefficients
   Residual plot: pretty good
5) Separate fits: note sameness of regression coefficients; note differences in standard errors, and WHAT IS TESTED

NOTES
1) The approach can be extended to as many categories as necessary, each getting its own intercept and/or slope.
   The textbook examples are restricted primarily to two categories. However, in practice we can adjust for as many separate slopes and intercepts as desired.
2) A model fitted to indicator variables only, with no quantitative variables, fits an ANOVA model.
3) In using this approach to test for a constant relationship between two populations, we make all of the usual assumptions (NIDrv(0, $\sigma^2$)).
   We must also assume that the two populations have the same variance.
   Fitting the functions in this fashion produces the same regression coefficients as if the models were fitted separately. However, we have the advantage that all observations from both models contribute to the estimation of the variances.
4) The usual diagnostics apply to these approaches. However, one must have sufficient foresight to include some notation for the indicator variable in residual plots.
   Residual plots of diverging lines may appear as nonhomogeneous variance when fitted as an SLR.

Inferences about regression lines

Conceptually, there are several approaches to testing inferences about the various regression lines fitted. All can be expressed in the form of "Full versus reduced models" and as EXTRA SS.
1) Can fit directly as full and reduced models.
2) Can use the tests of $b_i$ provided by PROC REG for some tests.
3) Can
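
The coding ideas in these notes can be checked numerically. The following is a minimal sketch in Python with NumPy (an assumption; the course itself uses SAS). The first part uses the Category/Value data set from the notes. The second part uses a small HYPOTHETICAL two-group data set (`x1`, `x2`, `y2` are invented for illustration and do not come from the notes) to show that the interaction model gives the same coefficients as two separate fits.

```python
import numpy as np

# Data set from the notes: three categories, three observations each.
category = np.array(list("AAABBBCCC"))
y = np.array([3, 4, 5, 2, 3, 4, 5, 6, 7], dtype=float)

def indicator(cat, level):
    """Return a 0/1 column that is 1 where cat equals level."""
    return (cat == level).astype(float)

ones = np.ones_like(y)
a, b, c = (indicator(category, lvl) for lvl in "ABC")

# Column of ones plus one indicator per category: 4 columns for 3 groups.
X_full = np.column_stack([ones, a, b, c])
xtx = X_full.T @ X_full
rank_full = np.linalg.matrix_rank(xtx)  # less than 4, so X'X is singular

# "Drop last" coding: ones, A, B (category C is the reference level).
X_drop_last = np.column_stack([ones, a, b])
b_drop_last = np.linalg.lstsq(X_drop_last, y, rcond=None)[0]

# Means coding: no column of ones, one indicator per category.
X_means = np.column_stack([a, b, c])
b_means = np.linalg.lstsq(X_means, y, rcond=None)[0]

# HYPOTHETICAL data (not from the notes): one quantitative variable x1
# and a two-group indicator x2, used to illustrate the interaction model.
x1 = np.array([1, 2, 3, 4, 1, 2, 3, 4], dtype=float)
x2 = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
y2 = np.array([2.1, 2.9, 4.2, 4.8, 5.0, 7.1, 8.9, 11.2])

# Columns: ones, x1, indicator, interaction (x1 * indicator).
X_inter = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
b0, b1, b2, b3 = np.linalg.lstsq(X_inter, y2, rcond=None)[0]

def fit_line(x, yy):
    """Least-squares intercept and slope for a single group."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, yy, rcond=None)[0]

line0 = fit_line(x1[x2 == 0], y2[x2 == 0])  # group coded 0
line1 = fit_line(x1[x2 == 1], y2[x2 == 1])  # group coded 1

print("rank of X'X:", rank_full)
print("drop-last coefficients:", b_drop_last)
print("means coefficients:", b_means)
print("group 0 line:", (b0, b1), "separate fit:", line0)
print("group 1 line:", (b0 + b2, b1 + b3), "separate fit:", line1)
```

With the means coding the coefficients are the three group means (4, 3, 6). With the drop-last coding the first coefficient is the mean of the reference category C (6) and the other two are the differences of A and B from C (-2 and -3). The sketch compares coefficients only; it does not reproduce the pooled variance estimate or the tests discussed in the notes.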

