UF STA 6126 - Introduction to multivariate relationships


10. Introduction to Multivariate Relationships

Bivariate analyses are informative, but we usually need to take many variables into account.

• Many explanatory variables have an influence on any particular response variable.
• The effect of an explanatory variable on a response variable may change when we take other variables into account. (Picture such as on p. 305 for X = height, Y = achievement test score, taking into account grade level)

Example: Y = whether admitted into grad school at U. California, Berkeley (for the 6 largest departments), X = gender

          Whether admitted
Gender    Yes     No      Total   %yes
Female     550    1285    1835    30%
Male      1184    1507    2691    44%

The difference of sample proportions, 0.44 - 0.30 = 0.14, has se = 0.014, and Pearson χ² = 90.8 (df = 1), P-value = 0.00000.... There is very strong evidence of a higher probability of admission for men than for women.

• Now let X1 = gender and X2 = department to which the person applied. E.g., for Department A:

          Whether admitted
Gender    Yes     No      Total   %yes
Female      89      19     108    82%
Male       511     314     825    62%

Now χ² = 17.4 (df = 1), but the difference is 0.62 - 0.82 = -0.20. There is strong evidence of a higher probability of admission for women than for men.

What happens with other departments?

          Female              Male                Difference of
Dept.     Total   %admitted   Total   %admitted   proportions     χ²
A          108    82%          825    62%         -0.20           17.4
B           25    68%          560    63%         -0.05            0.25
C          593    34%          325    37%          0.03            0.75
D          375    35%          417    33%         -0.02            0.3
E          393    24%          191    28%          0.04            1.0
F          341     7%          373     6%         -0.01            0.4
Total     1835    30%         2691    44%          0.14           90.8

There are 6 “partial tables,” which, when summed, give the original “bivariate” table.
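The summary statistics quoted for the bivariate table and for Department A can be checked with a short Python sketch. It assumes the unpooled standard error sqrt(p1(1-p1)/n1 + p2(1-p2)/n2) for the difference of proportions and the usual Pearson statistic, the sum of (observed - expected)²/expected over the four cells; the slides do not state which formulas were used, and the function name `two_by_two_summary` is hypothetical.

```python
from math import sqrt


def two_by_two_summary(yes1, no1, yes2, no2):
    """Return (p1, p2, se of the difference, Pearson chi-squared) for a 2x2 table.

    Row 1 and row 2 are the two groups; columns are yes/no counts.
    The se is the unpooled standard error (an assumption, see lead-in).
    """
    n1, n2 = yes1 + no1, yes2 + no2
    p1, p2 = yes1 / n1, yes2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

    n = n1 + n2
    col_totals = (yes1 + yes2, no1 + no2)
    observed = ((yes1, no1), (yes2, no2))
    chi2 = 0.0
    for row_total, row in zip((n1, n2), observed):
        for col_total, obs in zip(col_totals, row):
            expected = row_total * col_total / n  # expected count under independence
            chi2 += (obs - expected) ** 2 / expected
    return p1, p2, se, chi2


# Counts from the slides: (female yes, female no, male yes, male no).
overall = two_by_two_summary(550, 1285, 1184, 1507)
dept_a = two_by_two_summary(89, 19, 511, 314)

for label, (pf, pm, se, chi2) in (("All six departments", overall),
                                  ("Department A", dept_a)):
    print(f"{label}: female {pf:.2f}, male {pm:.2f}, "
          f"difference (male - female) {pm - pf:+.2f}, se {se:.3f}, chi2 {chi2:.1f}")
```

Under these assumptions the sketch agrees with the values on the slides: a difference of 0.14 with se 0.014 and χ² of 90.8 overall, and a difference of -0.20 with χ² of 17.4 for Department A.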
How can the partial table results be so different from the bivariate table?

• Partial tables display the association between two variables at fixed levels of a “control variable.”

Example: The results above come from partial tables relating gender to whether admitted, controlling for (i.e., keeping constant) the level of department.

When the control variable X2 is kept constant, changes in Y when X1 changes are not due to changes in X2.

Note: When each pair of variables is associated, a bivariate association for two variables may differ from its partial association, controlling for the other variable.

Example: Y = whether admitted is associated with X1 = gender, but each of these is itself associated with X2 = department.

Department is associated with gender: Males tend to apply more to departments A and B, females to C, D, E, and F.
Department is associated with whether admitted: % admitted is higher for departments A and B, lower for C, D, E, and F.

Moral: Association does not imply causation! This is true for quantitative and categorical variables; e.g., a strong correlation between quantitative variables X and Y does not mean that changes in X cause changes in Y.

Why does association not imply causation?

• There may be some “alternative explanation” for the association.

Example: Suppose there is a negative association between X = whether use marijuana regularly and Y = student GPA. Could the association be explained by some other variables that have an effect on each of these, such as achievement motivation, degree of interest in school, or parental education?

With observational data, the effect of X on Y may be partly due to the association of X and Y with lurking variables: variables that were not observed in the study but that influence the association of interest.

• Unless there is appropriate time order, association is consistent with X causing Y, with Y causing X, or with something else causing both.
• Even when the time order is appropriate, there could still be some alternative explanation, such as a variable Z that has a causal influence on both X and Y.

It is especially tricky to measure cause and effect when both variables are measured over time; e.g., annual data for a nation show a negative association between the fertility level and the percentage of the nation’s population using the Internet.

Causation is difficult to assess with observational studies, unlike experimental studies, which can control potential lurking variables (by randomization, keeping different groups “balanced” on other variables).

In an observational study, when X1 and X2 both have effects on Y but are also associated with each other, there is said to be confounding. It is difficult to determine whether either truly causes Y, because a variable’s effect could be partly due to its association with the other variable. (Example in Exercise 10.32: for X1 = amount of exercise and Y = number of serious illnesses in past year, X2 = age is a possible confounding variable.)

Simpson’s paradox

• It is possible for the (bivariate) association between two variables to be positive, yet be negative at each fixed level of a third variable. (see scatterplot)

Example: Florida countywide data (Ch. 11, pp. 322-323)

There is a positive correlation between crime rate and education (% of county residents with at least a high school education)!
There is a negative correlation between crime rate and education at each level of urbanization (% living in an urban environment). (see scatterplot)

Types of Multivariate Relationships

• Spurious association: Y and X1 both depend on X2, and the association disappears after controlling for X2 (Karl Pearson 1897, one year after developing the sample estimate of Galton’s correlation, now called the “Pearson correlation”).

Example: For nations, percent owning a TV is negatively correlated with birth rate, but the association disappears after controlling for per capita gross domestic product (GDP).
Example: College GPA and income later in life?
Example: Math test score for a child and whether the family has an Internet connection?

• Chain relationship: The association disappears when we control for an intervening variable.

Example: Gender → Department → Whether admitted (at least, for
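The two links of the chain Gender → Department → Whether admitted can be traced with the department table given earlier. The Python sketch below rebuilds each gender's overall admission rate as an applicant-weighted average of the department rates. Two assumptions are worth flagging: the department percentages are the rounded values from the slides, so the rebuilt rates only approximate the 30% and 44% in the bivariate table, and the male applicant count for department F is taken as 373, the value that makes the six male totals sum to 2691. The name `DEPARTMENTS` is hypothetical.

```python
# (dept, female applicants, female share admitted, male applicants, male share admitted)
# Shares are the rounded percentages from the slides; the male count for F (373)
# is an assumption chosen so that the male totals sum to 2691.
DEPARTMENTS = [
    ("A", 108, 0.82, 825, 0.62),
    ("B", 25, 0.68, 560, 0.63),
    ("C", 593, 0.34, 325, 0.37),
    ("D", 375, 0.35, 417, 0.33),
    ("E", 393, 0.24, 191, 0.28),
    ("F", 341, 0.07, 373, 0.06),
]

female_total = sum(row[1] for row in DEPARTMENTS)
male_total = sum(row[3] for row in DEPARTMENTS)

# Link 1, Gender -> Department: share of each gender applying to A or B,
# the two departments with the highest admission rates.
female_share_ab = sum(row[1] for row in DEPARTMENTS if row[0] in ("A", "B")) / female_total
male_share_ab = sum(row[3] for row in DEPARTMENTS if row[0] in ("A", "B")) / male_total

# Link 2, Department -> Whether admitted: the overall rate for a gender is the
# average of the department rates, weighted by where that gender applied.
female_rate = sum(row[1] * row[2] for row in DEPARTMENTS) / female_total
male_rate = sum(row[3] * row[4] for row in DEPARTMENTS) / male_total

print(f"Applied to A or B: female {female_share_ab:.1%}, male {male_share_ab:.1%}")
print(f"Rebuilt overall admission rate: female {female_rate:.1%}, male {male_rate:.1%}")
```

Within each department the two genders' rates are close, or favor women, yet about half of the men and well under a tenth of the women applied to A or B, so the weighted averages come out near 44% and 30%. The overall gap is carried by where people applied rather than by gender within a department.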

