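Correlation and ANOVA February 2, 2004
Political Science 552
Correlation and Analysis of Variance

Correlation

\[ \rho\{X,Y\} = \frac{\sigma\{X,Y\}}{\sigma\{X\}\,\sigma\{Y\}} \]

\[ r = \frac{\sum (X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum (X_i-\bar{X})^2 \sum (Y_i-\bar{Y})^2}} \]

\[ r = \frac{\sum X_iY_i - \frac{\sum X_i \sum Y_i}{n}}{\sqrt{\left(\sum X_i^2 - \frac{(\sum X_i)^2}{n}\right)\left(\sum Y_i^2 - \frac{(\sum Y_i)^2}{n}\right)}} = \frac{SCP}{\sqrt{SSD_X \, SSD_Y}} \]

Correlation-Text Notation

\[ \rho_{12} = \frac{\sigma\{Y_1,Y_2\}}{\sigma\{Y_1\}\,\sigma\{Y_2\}} \]

\[ r_{12} = \frac{\sum (Y_{i1}-\bar{Y}_1)(Y_{i2}-\bar{Y}_2)}{\sqrt{\sum (Y_{i1}-\bar{Y}_1)^2 \sum (Y_{i2}-\bar{Y}_2)^2}} \]

\[ r_{12} = \frac{\sum Y_{i1}Y_{i2} - \frac{\sum Y_{i1} \sum Y_{i2}}{n}}{\sqrt{\left(\sum Y_{i1}^2 - \frac{(\sum Y_{i1})^2}{n}\right)\left(\sum Y_{i2}^2 - \frac{(\sum Y_{i2})^2}{n}\right)}} \]

Computational Example

  X     Y     XY    X^2    Y^2
  0    10      0      0    100
  5    70    350     25   4900
  5    60    300     25   3600
  1    30     30      1    900
  2    35     70      4   1225
  2    20     40      4    400
  6    70    420     36   4900
  3    40    120      9   1600
  4    50    200     16   2500
  4    55    220     16   3025
  1    25     25      1    625
 33   465   1775    137  23775   (column sums)

\[ \sum X = 33 \quad \sum Y = 465 \quad \sum X^2 = 137 \quad \sum Y^2 = 23775 \quad \sum XY = 1775 \quad n = 11 \]

Example, continued

\[ r = \frac{\sum X_iY_i - \frac{\sum X_i \sum Y_i}{n}}{\sqrt{\left(\sum X_i^2 - \frac{(\sum X_i)^2}{n}\right)\left(\sum Y_i^2 - \frac{(\sum Y_i)^2}{n}\right)}} \]

\[ r = \frac{1775 - \frac{33 \times 465}{11}}{\sqrt{\left(137 - \frac{33^2}{11}\right)\left(23775 - \frac{465^2}{11}\right)}} \]

\[ r = \frac{380}{\sqrt{38 \times 4118.1}} = .961 \]

Bivariate Normal

[Figure: Bivariate Normal]

Bivariate Normal Cross-section

[Figure: Bivariate Normal Cross-section]

Fisher's Little z

\[ z' = 1.151 \log_{10}\!\left(\frac{1+r}{1-r}\right) = \tfrac{1}{2}\log_e\!\left(\frac{1+r}{1-r}\right) \]

\[ \sigma\{z'\} = \frac{1}{\sqrt{n-3}} \]

TABLE B.8 (excerpt; r corresponds to \rho and z' to \zeta)

   r     z'      r     z'      r     z'      r     z'
  .00   .000    .25   .255    .50   .549    .75   .973
  .01   .010    .26   .266    .51   .563    .76   .996
  .02   .020    .27   .277    .52   .576    .77  1.020
  .03   .030    .28   .288    .53   .590    .78  1.045
  .04   .040    .29   .298    .54   .604    .79  1.071
  .05   .050    .30   .309    .55   .618    .80  1.098
   .      .      .      .      .      .      .      .
  .15   .151    .40   .424    .65   .775    .90  1.472
  .16   .161    .41   .436    .66   .793    .91  1.527
  .17   .172    .42   .448    .67   .811    .92  1.589
  .18   .182    .43   .460    .68   .829    .93  1.658
  .19   .192    .44   .472    .69   .848    .94  1.738
  .20   .203    .45   .485    .70   .867    .95  1.831
  .21   .213    .46   .497    .71   .887    .96  1.945
  .22   .224    .47   .510    .72   .907    .97  2.092
  .23   .234    .48   .523    .73   .928    .98  2.297
  .24   .245    .49   .536    .74   .950    .99  2.646

Fisher's Little z Example

\[ r = .96 \qquad z' = 1.946 \]

\[ \sigma\{z'\} = \frac{1}{\sqrt{n-3}} = \frac{1}{\sqrt{11-3}} = .354 \]

\[ z' + 1.96\,\sigma\{z'\} = 1.946 + 1.96 \times .354 = 1.946 + .694 = 2.640 \]

\[ z' - 1.96\,\sigma\{z'\} = 1.946 - 1.96 \times .354 = 1.946 - .694 = 1.252 \]

Converting the two z' limits back to the r scale gives the interval for \rho:

\[ \Pr(.85 < \rho < .99) = .95 \]

Relationship between r and b1

\[ r = \frac{\sum (X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum (X_i-\bar{X})^2 \sum (Y_i-\bar{Y})^2}} \]

\[ b_1 = \frac{\sum (X_i-\bar{X})(Y_i-\bar{Y})}{\sum (X_i-\bar{X})^2} \]

\[ b_1 = r\,\frac{s\{Y\}}{s\{X\}} \qquad r = b_1\,\frac{s\{X\}}{s\{Y\}} \]

Coefficient of Determination (r2)

\[ r^2 = \frac{SSTO - SSE}{SSTO} = \frac{SSR}{SSTO} \]

\[ SSTO = SSR + SSE \]

\[ \sum (Y_i-\bar{Y})^2 = \sum (\hat{Y}_i-\bar{Y})^2 + \sum (Y_i-\hat{Y}_i)^2 \]

total = explained + error

Partitioning SSTO

\[ \sum (Y_i-\bar{Y})^2 = \sum \left[(\hat{Y}_i-\bar{Y}) + (Y_i-\hat{Y}_i)\right]^2 \]

\[ \sum (Y_i-\bar{Y})^2 = \sum \left[(\hat{Y}_i-\bar{Y})^2 + (Y_i-\hat{Y}_i)^2 + 2(\hat{Y}_i-\bar{Y})(Y_i-\hat{Y}_i)\right] \]

\[ \sum (Y_i-\bar{Y})^2 = \sum (\hat{Y}_i-\bar{Y})^2 + \sum (Y_i-\hat{Y}_i)^2 + 2\sum (\hat{Y}_i-\bar{Y})(Y_i-\hat{Y}_i) \]

\[ \sum (\hat{Y}_i-\bar{Y})(Y_i-\hat{Y}_i) = 0 \]

Partitioning SSTO, continued

\[ \sum (\hat{Y}_i-\bar{Y})(Y_i-\hat{Y}_i) = \sum \hat{Y}_i(Y_i-\hat{Y}_i) - \bar{Y}\sum (Y_i-\hat{Y}_i) \]

\[ \sum (Y_i-\hat{Y}_i) = \sum e_i = 0 \]

\[ \sum \hat{Y}_i(Y_i-\hat{Y}_i) = \sum \hat{Y}_i e_i = 0 \]

Computing Sum of Squared Errors

\[ SSTO = \sum (Y_i-\bar{Y})^2 = \sum Y_i^2 - \frac{(\sum Y_i)^2}{n} = (n-1) \times s^2\{Y\} \]

\[ SSE = \sum (Y_i-\hat{Y}_i)^2 = (1-r^2) \times SSTO = (1-r^2) \times (n-1) \times s^2\{Y\} \]

\[ SSR = \sum (\hat{Y}_i-\bar{Y})^2 = b_1^2 \sum (X_i-\bar{X})^2 = b_1^2 \times (n-1) \times s^2\{X\} \]

Sum of Squares Regression (SSR)

\[ SSR = \sum (\hat{Y}_i-\bar{Y})^2 \]

\[ \hat{Y}_i = b_0 + b_1X_i \]

\[ SSR = \sum (b_0 + b_1X_i - \bar{Y})^2 \]

\[ b_0 = \bar{Y} - b_1\bar{X} \]

\[ SSR = \sum (\bar{Y} - b_1\bar{X} + b_1X_i - \bar{Y})^2 \]

\[ SSR = \sum (b_1X_i - b_1\bar{X})^2 \]

\[ SSR = b_1^2 \sum (X_i-\bar{X})^2 \]

FT-Gore by Religion

Group                      Mean   Std. Dev.   (n)
Non-Catholic Christians    55.4     26.9      (843)
Catholics                  59.4     24.2      (396)
Jews                       69.4     25.8      (40)
Other                      60.0     19.2      (20)
None or Atheist            55.7     23.7      (211)
Total                      56.9     25.8      (1510)

Partitioning Variation

Data = Fit + Residual
Total Variation = Model + Residual
or
SSD_Total = SSD_BetweenGroup + SSD_WithinGroup
SSTO = SSG + SSE
SSD_Total = SSD_Group + SSD_Error

Steps in ANOVA

• Obtain the Sums of Squared Deviations (SSDs)
• Compute the mean squares (MSG and MSE) by dividing each SSD by its degrees of freedom
  – df for SSG is # of groups - 1
  – df for SSE is N - # of groups
• Take the ratio of MSG to MSE to get the F-ratio

SSD Baseline

\[ SSD_{Baseline} = SSTO = \sum_{j=1}^{K}\sum_{i=1}^{n_j} (X_{ij}-\bar{X})^2 = \sum_{j=1}^{K}\sum_{i=1}^{n_j} X_{ij}^2 - \frac{\left(\sum_{j=1}^{K}\sum_{i=1}^{n_j} X_{ij}\right)^2}{n} = (n-1) \cdot s_T^2, \qquad n = \sum_{j=1}^{K} n_j \]

SSD Remaining

\[ SSD_{remaining} = SSE = \sum_{j=1}^{K}\sum_{i=1}^{n_j} (X_{ij}-\bar{X}_j)^2 = \sum_{j=1}^{K} (n_j-1) \cdot s_j^2 = \sum_{j=1}^{K}\left[\sum_{i=1}^{n_j} X_{ij}^2 - \frac{\left(\sum_{i=1}^{n_j} X_{ij}\right)^2}{n_j}\right] \]

SSD Groups

\[ SSD_{Explained} = SSG = \sum_{j=1}^{K}\sum_{i=1}^{n_j} (\bar{X}_j-\bar{X})^2 = \sum_{j=1}^{K} n_j (\bar{X}_j-\bar{X})^2 = \sum_{j=1}^{K} n_j\bar{X}_j^2 - n\bar{X}^2 = SSD_{baseline} - SSD_{remaining} = SST - SSE \]

ANOVA Table

Source of Variation    SS      df       MS             F
Groups                 SSG     k        SSG/k          MSG/MSE
Error                  SSE     n-k-1    SSE/(n-k-1)
Total                  SSTO    n-1      SSTO/(n-1)

Here k is the number of groups minus 1, which matches the degrees of freedom listed under Steps in ANOVA.

ANOVA Table, Example

Source            SS            df     MS            F      Prob > F
Between groups    10703.4797       4   2675.86993    4.07   0.0028
Within groups     990229.156    1505   657.959572
Total             1000932.64    1509   663.308572

Summary of FTGore

religion             Mean   Std. Dev.   Freq.
NonCath Christian     55       27        849
Catholics             59       24        396
Jews                  69       26         40
Other                 59       21         14
Atheist/None          56       24        211
Total                 57       26       1510

Regression ANOVA

Source of Variation    SS      df     MS            F
Regression             SSR     1      SSR/1         MSR/MSE
Error                  SSE     n-2    SSE/(n-2)
Total                  SSTO    n-1    SSTO/(n-1)

Computational Elements

\[ \sum (Y_i-\bar{Y})^2 = 4118 \qquad \sum (X_i-\bar{X})^2 = 38 \qquad b_1 = 10 \qquad s\{e_i\} = 5.95 \qquad n = 11 \]

\[ SSTO = \sum (Y_i-\bar{Y})^2 = 4118 \]

\[ SSE = s^2\{e_i\} \times (n-2) = 5.95^2 \times (11-2) = 318 \]

\[ SSR = b_1^2 \sum (X_i-\bar{X})^2 = 10^2 \times 38 = 3800 \]

Regression ANOVA Example

Source of Variation    SS      df      MS                 F
Regression             3800    1       3800/1 = 3800      3800/35.35 = 107.49
Error                  318     11-2    318.2/9 = 35.35
Total                  4118    11-1

E{MSR} and E{MSE}

\[ E\{MSE\} = \sigma^2\{\varepsilon_i\} \]

\[ E\{MSR\} = \sigma^2\{\varepsilon_i\} + \beta_1^2 \sum (X_i-\bar{X})^2 \]

\[ E\{MSE\} = E\{MSR\} \quad \text{if } \beta_1 = 0 \]

\[ E\{F\} \approx 1 \quad \text{if } \beta_1 = 0 \]

Using r2

\[ SSR = r^2 \times SSTO \]

\[ SSE = (1-r^2) \times SSTO \]

\[ MSR = \frac{r^2 \times SSTO}{1} \]

\[ MSE = \frac{(1-r^2) \times SSTO}{n-2} \]

F and t tests for r2 and r

\[ F = \frac{MSR}{MSE} = \frac{(r^2 \times SSTO)/1}{\left((1-r^2) \times SSTO\right)/(n-2)} \]

\[ F = \frac{r^2}{(1-r^2)/(n-2)} = \frac{r^2 \times (n-2)}{1-r^2} \]

\[ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} \]

Spearman's r

\[ r_s = \frac{\sum (R_{i1}-\bar{R}_1)(R_{i2}-\bar{R}_2)}{\left[\sum (R_{i1}-\bar{R}_1)^2 \sum (R_{i2}-\bar{R}_2)^2\right]^{1/2}} \]

\[ \bar{R}_1 = \bar{R}_2 = \frac{n+1}{2} \]

\[ r_s = 1 - \frac{6 \times \sum D_i^2}{n(n^2-1)} \quad \text{where } D_i = R_{i1} - R_{i2} \]

\[ s\{r_s\} = \frac{1}{\sqrt{n-1}} \]

rs example

  X     Y     RX     RY      D     D^2
  0    10     1      1       0     0
  5    70     9.5   10.5    -1     1
  5    60     9.5    9       .5    .25
  1    30     2.5    4     -1.5   2.25
  2    35     4.5    5      -.5    .25
  2    20     4.5    2      2.5   6.25
  6    70    11     10.5     .5    .25
  3    40     6      6       0     0
  4    50     7.5    7       .5    .25
  4    55     7.5    8      -.5    .25
  1    25     2.5    3      -.5    .25
                          Sum of D^2 = 11

\[ r_s = 1 - \frac{6 \times \sum D_i^2}{n(n^2-1)} \]

\[ r_s = 1 - \frac{6 \times 11}{11 \times (11^2-1)} = .95 \]

\[ s\{r_s\} = \frac{1}{\sqrt{n-1}} = \frac{1}{\sqrt{11-1}} = .316 \]

\[ \frac{r_s}{s\{r_s\}} = \frac{.95}{.316} = 3.004 \]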
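The worked examples in these slides (Pearson's r from the column sums, the Fisher z' interval, the regression F ratio, and Spearman's r_s) can be checked with a short script. What follows is a minimal sketch using only the Python standard library, not the course's own code. The function names (`pearson_r`, `fisher_z_interval`, `f_from_r`, `average_ranks`, `spearman_rs`) are illustrative assumptions. It uses `math.atanh` and `math.tanh` in place of the Table B.8 lookup, and it applies the slides' D-squared shortcut for r_s even though the data contain tied ranks.

```python
import math

# Data from the Computational Example slide (n = 11).
X = [0, 5, 5, 1, 2, 2, 6, 3, 4, 4, 1]
Y = [10, 70, 60, 30, 35, 20, 70, 40, 50, 55, 25]


def pearson_r(x, y):
    """Pearson r via the computational formula SCP / sqrt(SSD_X * SSD_Y)."""
    n = len(x)
    scp = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    ssd_x = sum(a * a for a in x) - sum(x) ** 2 / n
    ssd_y = sum(b * b for b in y) - sum(y) ** 2 / n
    return scp / math.sqrt(ssd_x * ssd_y)


def fisher_z_interval(r, n, z_crit=1.96):
    """Approximate interval for rho: transform r to z', add and subtract
    z_crit * 1/sqrt(n - 3), then transform the limits back with tanh."""
    z_prime = math.atanh(r)          # equals 0.5 * ln((1 + r) / (1 - r))
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z_prime - z_crit * se), math.tanh(z_prime + z_crit * se)


def f_from_r(r, n):
    """F ratio for simple regression written in terms of r^2."""
    return r * r * (n - 2) / (1.0 - r * r)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average rank."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # lowest rank position held by v
        count = ordered.count(v)       # number of tied observations
        ranks.append(first + (count - 1) / 2)
    return ranks


def spearman_rs(x, y):
    """Spearman r_s via the shortcut 1 - 6 * sum(D^2) / (n (n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(average_ranks(x), average_ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


if __name__ == "__main__":
    r = pearson_r(X, Y)
    low, high = fisher_z_interval(0.96, len(X))  # slides round r to .96
    print(f"r = {r:.3f}")
    print(f"95% interval for rho: ({low:.2f}, {high:.2f})")
    print(f"F = {f_from_r(r, len(X)):.2f}")
    print(f"r_s = {spearman_rs(X, Y):.2f}")
```

Run as a script, this should agree with the slide values up to rounding: r near .961, an interval of roughly (.85, .99), F near 107.5, and r_s = .95. Because `f_from_r` uses the unrounded r, its result can differ in the second decimal from a hand calculation based on rounded sums of squares.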