Chapter 2 - Looking at Data - Relationships
Anh Dao
8th July, 2009

- Relationships Between Two Variables
- Relationships Between Two Numeric Variables
- Relationships between Categorical and Numeric Variables
- Relationships Between Two Categorical Variables
- Scatterplot
- Positively Associated
- Negatively Associated
- Correlation Coefficients
- Regression Line
- Least-Squares Regression Line
- Equation of the least-squares regression line
- Compute b0 and b1
- Interpretation of b0 and b1
- r^2 in Regression
- Example
- Cautions about Correlation and Regression
- Lurking variable
- Outliers and Influential Observations in Regression
- Two-Way Tables
- Example: Gender and Highest Degree Obtained
- Simpson's Paradox

Relationships Between Two Variables

- More than one variable can be measured on each individual.
- Example: Gender and Height, Size and Cost, Grade and Major
- We want to look at the relationship among these variables.
- Is there an association between these two variables?
- Two variables measured on the same individuals are associated if knowing the value of one of the variables tells us something about the values of the other variable that we would not know without this information.

- If we expect one variable to influence another, we call it the explanatory variable.
  (Explains or influences changes in the response variable.)
- The variable that is influenced is called the response variable. (Measures an outcome of a study.)
- In each of the following examples, identify the explanatory and response variables:
  - Gender and blood pressure
  - Class attendance and course grade
  - Number of beers and BAC

- We may be interested in relationships between different types of variables.
  - Numeric and Numeric: scatterplots, correlation coefficients, least-squares regression
  - Categorical and Numeric: summary statistics or table of descriptive statistics
  - Categorical and Categorical: side-by-side bar graphs, side-by-side pie graphs, two-way tables

Relationships Between Two Numeric Variables

- Depending on the situation, one of the variables is the explanatory variable and the other is the response variable.
- There is not always an explanatory-response relationship.
- Examples:
  - Height and Weight
  - Income and Age
  - SAT scores on math exam and on verbal exam
  - Amount of time spent studying for an exam and exam score

Relationships between Categorical and Numeric Variables

- We are interested in comparing the numerical variable across each of the levels of the categorical variable.
- In this setup the categorical variable is always the explanatory variable and the numerical variable is always the response variable.
- Examples:
  - Compare high speeds for 4 different car brands
  - Compare sucrose levels for 4 different types of fruit
  - Compare GPR for 20 different majors
- For graphical comparison use side-by-side boxplots; for numerical comparison use a table of descriptive statistics.

Graphical Comparison: side-by-side boxplots
Example: Sucrose levels of fruits (fictitious data)

Numerical Comparison
We could also look at summary statistics for each group.

Relationships Between Two Categorical Variables

- Depending on the situation, one of the variables is the explanatory variable, or grouping variable, and the other is the response variable. In this case, we look at the percentages of one variable for each level of the other variable.
- Examples:
  - Gender and Soda Preference
  - Country of Origin and Marital Status
  - Smoking Habits and Socioeconomic Status

Graphical Comparison: Side-by-side bar charts

- Example: We asked n = 13 students whether they take vitamins or not. We classified people by gender. Are men and women equally likely to take vitamins?
- Answer: No, females take vitamins more often than males.
- Conclusion: there is an association between the probability of taking vitamins and gender.
- Note: When the number of individuals in each group is different, compare the relative frequencies, not the counts!
- We can analyze the side-by-side pie graphs in the same way!

Scatterplot

- Each pair of observations appears as a dot on the plot.
- Look for the overall pattern and any striking deviations from that pattern.
- Look for outliers, values falling outside the overall pattern of the relationship.
- You can describe the overall pattern of a scatterplot by the form, direction, and strength of the relationship.

- Response variable is represented by the y-axis.
  Explanatory variable is represented by the x-axis.
- Form: linear or clusters
- Direction
  - Two variables are positively associated when above-average values of one tend to accompany above-average values of the other, and likewise below-average values also tend to occur together.
  - Two variables are negatively associated when above-average values of one variable accompany below-average values of the other variable, and vice versa.
- Strength: how close the points lie to a line.

Positively Associated

Negatively Associated

Correlation Coefficients

- The correlation coefficient, denoted by r, measures the direction and strength of the linear relationship between two numeric variables.
- r is also called the Pearson correlation coefficient or sample correlation coefficient.

$$ r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right) $$

  Here \bar{x} and \bar{y} are the sample means, and s_x and s_y are the sample standard deviations of the two variables.

General Properties

- r must be between -1 and 1 (−1 ≤ r ≤ 1).
- If r is negative, the relationship is negative.
- If r = −1, there is a perfect negative linear relationship (extreme case).
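
The correlation formula above can be tried out with a short Python sketch. This is only an illustration, not part of the original slides: the function name `pearson_r` and the study-hours data are made up for the example, and s_x and s_y are taken to be the sample standard deviations (dividing by n − 1).

```python
import math


def pearson_r(xs, ys):
    """Sample correlation r = 1/(n-1) * sum of products of standardized values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sample standard deviations (divide by n - 1), assumed to match s_x, s_y.
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when a variable does not vary")

    # Standardize each observation, multiply pairwise, and average over n - 1.
    total = sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)


# Hypothetical data: hours studied (explanatory) and exam score (response).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 78]
r_example = pearson_r(hours, scores)
print("r for the hypothetical study data:", round(r_example, 3))

# Points lying exactly on a line give the extreme cases.
r_up = pearson_r([1, 2, 3], [2, 4, 6])
r_down = pearson_r([1, 2, 3], [6, 4, 2])
print("perfect increasing line:", r_up)
print("perfect decreasing line:", r_down)
```

In this sketch the sign of the result gives the direction of the association, and its closeness to −1 or 1 gives the strength of the linear relationship, in line with the general properties listed above.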