Slide 1
The Vision Thing
Power Thirteen
Bivariate Normal Distribution

Slide 2: Outline
• Circles around the origin
• Circles translated from the origin
• Horizontal ellipses around the (translated) origin
• Vertical ellipses around the (translated) origin
• Sloping ellipses

Slide 3
[Figure, axes x and y] $\mu_x = 0$, $\sigma_x^2 = 1$; $\mu_y = 0$, $\sigma_y^2 = 1$; $\rho_{x,y} = 0$

Slide 4
[Figure, axes x and y, with a and b marked] $\mu_x = a$, $\sigma_x^2 = 1$; $\mu_y = b$, $\sigma_y^2 = 1$; $\rho_{x,y} = 0$

Slide 5
[Figure, axes x and y] $\mu_x = 0$, $\mu_y = 0$; $\sigma_x^2 > \sigma_y^2$; $\rho_{x,y} = 0$

Slide 6
[Figure, axes x and y] $\mu_x = 0$, $\mu_y = 0$; $\sigma_x^2 < \sigma_y^2$; $\rho_{x,y} = 0$

Slide 7
[Figure, axes x and y, with a and b marked] $\mu_x = a$, $\mu_y = b$; $\sigma_x^2 > \sigma_y^2$; $\rho_{x,y} > 0$

Slide 8
[Figure, axes x and y, with a and b marked] $\mu_x = a$, $\mu_y = b$; $\sigma_x^2 > \sigma_y^2$; $\rho_{x,y} < 0$

Slide 9: Why? The Bivariate Normal Density and Circles
• $f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left( \left[\frac{x-\mu_x}{\sigma_x}\right]^2 - 2\rho \left[\frac{x-\mu_x}{\sigma_x}\right] \left[\frac{y-\mu_y}{\sigma_y}\right] + \left[\frac{y-\mu_y}{\sigma_y}\right]^2 \right) \right\}$
• If the means are zero, the variances are one, and there is no correlation, then
• $f(x, y) = \frac{1}{2\pi} \exp\left\{ -\frac{1}{2}(x^2 + y^2) \right\}$, where $f(x, y) = k$, a constant, along an isodensity line
• $\ln(2\pi k) = -\frac{1}{2}(x^2 + y^2)$, and $x^2 + y^2 = -2\ln(2\pi k) = r^2$, a circle of radius r

Slide 10: Ellipses
• If $\sigma_x^2 > \sigma_y^2$ and $\rho = 0$, $f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left\{ -\frac{1}{2} \left( \left[\frac{x-\mu_x}{\sigma_x}\right]^2 + \left[\frac{y-\mu_y}{\sigma_y}\right]^2 \right) \right\}$, and $x^* = x - \mu_x$, etc.
• $f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left\{ -\frac{1}{2} \left( [x^*/\sigma_x]^2 + [y^*/\sigma_y]^2 \right) \right\}$, where $f(x, y) = k$, a constant, so $\ln(2\pi\sigma_x\sigma_y k) = -\frac{1}{2} \left( [x^*/\sigma_x]^2 + [y^*/\sigma_y]^2 \right)$, and $x^{*2}/c^2 + y^{*2}/d^2 = 1$ is an ellipse, with $c^2 = -2\sigma_x^2 \ln(2\pi\sigma_x\sigma_y k)$ and $d^2 = -2\sigma_y^2 \ln(2\pi\sigma_x\sigma_y k)$

Slide 11
[Figure, axes x and y with rotated axes X' and Y': Correlation and Rotation of the Axes] $\mu_x = 0$, $\mu_y = 0$; $\sigma_x^2 < \sigma_y^2$; $\rho_{x,y} < 0$

Slide 12: Bivariate Normal: marginal & conditional
• If x and y are independent, then $f(x, y) = f(x) f(y)$, i.e. the product of the marginal distributions, $f(x)$ and $f(y)$
• The conditional density function, the density of y conditional on x, $f(y \mid x)$, is the joint density function divided by the marginal density function of x: $f(y \mid x) = f(x, y)/f(x)$

Slide 13: Conditional Distribution
• $f(y \mid x) = \frac{1}{\sigma_y\sqrt{2\pi(1-\rho^2)}} \exp\left\{ -\frac{1}{2\sigma_y^2(1-\rho^2)} \left[ y - \mu_y - \rho(x - \mu_x)(\sigma_y/\sigma_x) \right]^2 \right\}$
• The mean of the conditional distribution is $\mu_y + \rho(x - \mu_x)(\sigma_y/\sigma_x)$, i.e. this is the expected value of y for a given value of x, $x = x^*$:
• $E(y \mid x = x^*) = \mu_y + \rho(x^* - \mu_x)(\sigma_y/\sigma_x)$
• The variance of the conditional distribution is $VAR(y \mid x = x^*) = \sigma_y^2(1 - \rho^2)$

Slide 14
[Figure, axes x and y with $\mu_x$ and $\mu_y$ marked and the regression line drawn] $\mu_x = a$, $\mu_y = b$; $\sigma_x^2 > \sigma_y^2$; $\rho_{x,y} > 0$
Regression line
– intercept: $\mu_y - \rho\mu_x(\sigma_y/\sigma_x)$
– slope: $\rho(\sigma_y/\sigma_x)$

Slide 15: Bivariate Regression: Another Perspective
• The regression line is the $E(y \mid x)$ line if y and x are bivariate normal
– intercept: $\mu_y - \rho\mu_x(\sigma_y/\sigma_x)$
– slope: $\rho(\sigma_y/\sigma_x)$

Slide 16: Example: Lab Six
[Figure: histogram, Rate of Return to GE stock]
Series: GE
Sample 1993:01 1996:12
Observations 48
Mean         0.022218
Median       0.019524
Maximum      0.117833
Minimum     -0.058824
Std. Dev.    0.043669
Skewness     0.064629
Kurtosis     2.231861
Jarque-Bera  1.213490
Probability  0.545122

Slide 17: Example: Lab Six
[Figure: histogram, Rate of Return to S&P500 Index]
Series: INDEX
Sample 1993:01 1996:12
Observations 48
Mean         0.014361
Median       0.017553
Maximum      0.076412
Minimum     -0.044581
Std. Dev.    0.025430
Skewness    -0.453474
Kurtosis     3.222043
Jarque-Bera  1.743715
Probability  0.418174

Slide 18: Correlation Matrix
         GE        INDEX
GE       1.000000  0.636290
INDEX    0.636290  1.000000

Slide 19: Bivariate Regression: Another Perspective
• The regression line is the $E(y \mid x)$ line if y and x are bivariate normal
– intercept: $\mu_y - \rho\mu_x(\sigma_y/\sigma_x)$
– slope: $\rho(\sigma_y/\sigma_x)$
With y = GE and x = INDEX: $\mu_y = 0.022218$, $\mu_x = 0.014361$, $\rho = 0.636290$, $\sigma_y/\sigma_x = 0.043669/0.02543$
– intercept = 0.0064
– slope = 1.094

Slide 20
[Figure: scatter plot of GE (vertical axis) against INDEX (horizontal axis), Returns Generating Process For GE Stock and S&P 500 Index]

Slide 21
[Figure not recovered; annotations: vs. 0.0064 (intercept), vs. 1.094 (slope)]

Slide 22
Bivariate Normal Distribution and the Linear Probability Model

Slide 23
[Figure, axes income and education, showing the Players and Non-Players distributions with mean income and mean education marked for each group] $\mu_x = a$, $\mu_y = b$; $\sigma_x^2 > \sigma_y^2$; $\rho_{x,y} > 0$

Slide 24
[Figure: same as Slide 23]

Slide 25
[Figure: same as Slide 23, with the discriminating line added]

Slide 26: Discriminant Function, Linear Probability Function, and Decision Theory, Lab 6
• Expected Costs of Misclassification
– $E(C) = C(P \mid N)P(P \mid N)P(N) + C(N \mid P)P(N \mid P)P(P)$
• Assume $C(P \mid N) = C(N \mid P)$
• Relative frequencies: $P(N) = 23/100 \approx 1/4$, $P(P) = 77/100 \approx 3/4$
• Equalize the two costs of misclassification by setting the fitted value of $P(P \mid N)$, i.e. Bern, to 3/4
– $E(C) = C(P \mid N)(3/4)(1/4) + C(N \mid P)(1/4)(3/4)$

Slide 27
[Figure: same as Slide 25]
Note: $P(P \mid N)$ is the area of the non-players distribution below (southwest of) the line

Slide 28
Set Bern = 3/4 = 1.39 - 0.0216*education - 0.0105*income, solve for education as it depends on income, and plot

Slide 29
7 non-players misclassified, as well as 14 players misclassified

Slide 30
[Figure not recovered]

Slide 31: Decision Theory
• Moving the discriminant line, i.e. changing the cutoff value from 0.75 to 0.5, changes the numbers of those misclassified, favoring one population at the expense of another
• You need an implicit or explicit notion of the costs of misclassification, such as $C(P \mid N)$ and $C(N \mid P)$, to make the necessary judgement of where to draw the
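
The expected-cost formula and the "set Bern to the cutoff and solve for education" step from the Lab 6 slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients 1.39, -0.0216 and -0.0105 and the rounded frequencies 1/4 and 3/4 are the ones quoted on the slides, while the function names, the unit costs, and the sample income values are hypothetical and chosen only to show the arithmetic. The slides do not state the units of income or education, so none are assumed here.

```python
# Sketch: expected misclassification cost and the discriminating line.
# Coefficients and frequencies are quoted from the slides; function names,
# unit costs and the sample income values are hypothetical.

def expected_cost(c_pn, c_np, p_pn, p_np, p_n, p_p):
    """E(C) = C(P|N) P(P|N) P(N) + C(N|P) P(N|P) P(P)."""
    return c_pn * p_pn * p_n + c_np * p_np * p_p


def education_on_line(income, cutoff=0.75,
                      const=1.39, b_educ=-0.0216, b_income=-0.0105):
    """Solve cutoff = const + b_educ*education + b_income*income for education."""
    return (cutoff - const - b_income * income) / b_educ


# Equal (unit) costs with P(P|N) = 3/4, P(N|P) = 1/4, P(N) = 1/4, P(P) = 3/4.
print("E(C):", expected_cost(1.0, 1.0, 0.75, 0.25, 0.25, 0.75))

# Points on the discriminating line for two cutoffs (income values are made up).
for income in (0.0, 20.0, 40.0):
    print(income,
          education_on_line(income, cutoff=0.75),
          education_on_line(income, cutoff=0.5))
```

Lowering the cutoff from 0.75 to 0.5 shifts the line without changing its slope, which is the movement the Decision Theory slide describes; where to put it still depends on the relative costs of the two kinds of misclassification.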
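
As a check on the Lab Six regression example, the conditional-mean line E(y | x) can be computed directly from the reported summary statistics. The sketch below assumes y is the GE return and x is the S&P 500 index return, and uses the standard bivariate normal results (slope = ρ·σ_y/σ_x, intercept = μ_y − slope·μ_x, conditional variance = σ_y²(1 − ρ²)). The function names are hypothetical; the numbers are the means, standard deviations and correlation quoted on the slides.

```python
import math

# Sketch: regression (conditional mean) line from bivariate normal moments.
# Function names are hypothetical; inputs are the Lab Six summary statistics.

def conditional_line(mu_x, mu_y, sigma_x, sigma_y, rho):
    """Return (intercept, slope) of E(y | x) = intercept + slope * x."""
    slope = rho * sigma_y / sigma_x
    intercept = mu_y - slope * mu_x
    return intercept, slope


def conditional_variance(sigma_y, rho):
    """Variance of y given x: sigma_y^2 * (1 - rho^2); it does not depend on x."""
    return sigma_y ** 2 * (1.0 - rho ** 2)


# x = INDEX (S&P 500 return), y = GE return
intercept, slope = conditional_line(
    mu_x=0.014361, mu_y=0.022218,
    sigma_x=0.025430, sigma_y=0.043669,
    rho=0.636290,
)
print("slope:", round(slope, 4))
print("intercept:", round(intercept, 4))
print("conditional std. dev.:", math.sqrt(conditional_variance(0.043669, 0.636290)))
```

With these inputs the slope comes out near 1.09 and the intercept near 0.0065, close to the 1.094 and 0.0064 quoted on the slides; small differences are presumably due to rounding.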