Chapter 5
Statistical Analysis of Cross-Tabs
D. White and A. Korotayev
2 Jan 2004
Html links are live
Some new text added in Blue 30 Oct 2004

Contents
Nominal, Ordinal, Interval and Ratio Scale Variables
Functional Coefficients (interval or ordinal variables or 2 x 2 tables)
Relational Correlation Coefficients: Order- and Category-Based
Order-Based Correlation Coefficients (Ordinal variables and 2 x 2 tables)
Somers' symmetric D is an order-based measure ...
Somers' d is an order-based measure computed as ...
Gamma is a weak order-based measure computed as (C-D)/(C+D).
Categorical Correlation Coefficients (Nominal variables and 2 x 2 tables)
Phi²
Adjusted Phi² (Cramer's V)
Contingency coefficient
What does the Strength of a Correlation Mean?
Nominal Variables and the Laws of Probability
Expected Probability
Expected Frequencies given Independent Probabilities: the Null Hypothesis
The Chi-square (χ²) statistic for measuring ...
Table 5.6. Difference between Survey Frequencies and Expected Frequencies
Degrees of Freedom
The Phi-square correlation coefficient Φ² and ad...
Assigning a Positive or Negative sign to the Phi correlation coefficient Φ
Evaluating Cross Tabulations of Nominal Variables
Table 5.9. A hypothetical perfect correlation, wi...
Evaluating Cross Tabulations of Ordinal Variables
2 x 2 Cross Tabulations of Ordinal Variables with a sign for Φ
Evaluating Cross Tabulations of Categorical Variables
Fisher Exact Test for up to 6 x 6 Cross Tabulations
Fisher Exact one- and two-tailed Test and Data-Mining Errors
Section 5: Conclusion
Review of Concepts
Comparison of Correlation Coefficients
Appendix 1: Interpreting Gamma Coefficients

Introduction

Descriptive statistics includes collecting, organizing, summarizing and presenting descriptive data. We assume here, as with the Standard Sample, that the collection, data cleaning, and organization of data into variables have been done, and that the student has access to a database through a program such as SPSS, the Statistical Package for the Social Sciences. Further, we will assume that the student has instructions on using the software to create cross-tabulations of variables.

Inferential statistics includes determining relationships using correlation coefficients and testing hypotheses with significance tests. These may be used for evaluating predictions by comparison to the null hypothesis and for assessing the similarity between two predicted outcomes (comparison of a theoretical model to an expected outcome from the model, testing whether two sets of observations give the same result, and so forth). These will be the primary concern of this chapter; other issues of statistical inference will be taken up in Chapters 7 and 8.

Cross tabulation of qualitative data is a basic tool for empirical research. Cross tabulations (cross tabs for short) are also called contingency tables because they are used to test hypotheses about how some variables are contingent upon others, or how increases in one affect increases, decreases or curvilinear changes in others.
Inferences about causal influences or feedback relationships are difficult to make, of course, without experimental controls or data over time. Contingency analysis, however, is a good place to begin in testing theories or developing hypotheses to be tested with more rigorously collected data. The use of control variables in studying correlations can also help in replicating results and in identifying more complicated contingencies by which variables interact or influence one another.

Our goal is to help students obtain and understand the statistics they need for empirical research using cross tabulations of variables that are available for analysis of observational samples, notably in the social sciences, drawing our examples here from the Standard Cross-Cultural Sample.

Section 1 provides practical advice for contingency table analysis in SPSS. A more general introduction to statistical analysis follows in three sections that build on one another. The student is advised to study these sections because they provide the basis for statistical reasoning. To engage with your data analysis rather than simply apply correlational and significance tests mechanically without understanding them, it will be invaluable to study and understand the concepts of statistical reasoning, so that you can reason from them rather than from ad hoc interpretation of the mechanical procedures in Section 1.

Section 2 introduces measurement (nominal, ordinal, interval and ratio scales) and correlation, which are closely connected. Basic methods are presented for getting useful correlations from nominal and ordinal data.

Section 3 takes up those topics in statistics that derive their analytical power from the use of probability theory. We begin with probabilistic inference and the three laws of probability (independent events, sample spaces, and mutually exclusive events). From these we derive expected frequencies and the null hypothesis of statistical independence.
We then explain how, from a comparison of expected and actual frequencies for cross-tabulations of our data, we can derive two useful statistics: the chi-square measure of departure from statistical independence and the phi-square all-purpose correlation coefficient.

Section 4 unites the two previous sections. Interpreting correlations derived from cross-tables, and testing hypotheses from them, requires the concepts of statistical analysis that derive from probability theory, reviewed in the previous section. When strictly independent events having two characteristics that are independently defined are tabulated in a contingency table, the laws of probability can be used to model, from the marginal totals (rows, columns) of the table, what its cell values would be if the variables were statistically independent. The actual cell values of the frequency table can be used to measure the correlation between the variables
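
The comparison of expected and actual frequencies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's SPSS procedure: the function names and the 2 x 2 counts are hypothetical, and the sketch uses the standard textbook definitions (expected count = row total x column total / N; chi-square = sum over cells of (observed - expected)² / expected; phi-square = chi-square / N).

```python
def expected_frequencies(table):
    """Cell counts expected under statistical independence.

    Each expected count is (row total * column total) / N, computed
    from the marginal totals only. Assumes every row and column
    total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Departure from independence: sum of (observed - expected)^2 / expected."""
    expected = expected_frequencies(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


def phi_square(table):
    """Phi-square: chi-square divided by the total number of cases N."""
    n = sum(sum(row) for row in table)
    return chi_square(table) / n


# Hypothetical 2 x 2 table of counts, made up for illustration
# (not taken from the Standard Cross-Cultural Sample).
observed = [[30, 10],
            [10, 30]]

print(expected_frequencies(observed))
print(chi_square(observed))
print(phi_square(observed))
```

For this made-up table every expected count is 20, so chi-square is 20 and phi-square is 0.25. A table whose cells matched the expected counts exactly would give zero for both statistics.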