Introductory Statistics Lectures
Linear correlation
Testing two variables for a linear relationship

Anthony Tanbakuchi
Department of Mathematics
Pima Community College

Redistribution of this material is prohibited without written permission of the author
© 2009
(Compile date: Tue May 19 14:51:18 2009)

Contents
1 Linear correlation
1.1 Introduction
1.2 Linear Correlation
    Use
    Computation
    A complete example
    Cautions
    Confidence Interval Belt Graphs
1.3 Summary
1.4 Further examples

1 Linear correlation

1.1 Introduction

Motivation
Is there a relationship (correlation) between your height and ...
(1) your mother's height?
(2) your forearm height?
(3) your work hours per week?
(4) your commute distance?

[Figure: four scatter plots of height (y-axis) against height_mother, forearm, work_hours, and credits (x-axes).]

Example 1. How much of an individual's height is explained by their mother's height? Use our class data to determine whether there is a linear relationship between a mother's height and her child's height (your height), and how much of the variation in the child's height can be explained by the mother's height.

R: height = class.data$height
R: height_mother = class.data$height_mother

The first few data points are:

     height  height_mother
1        65             65
2        68             67
3        71             63
4        66             64
5        68             65
6        65             62

Use a scatter plot to see if a relationship exists:

R: plot(height_mother, height, main = "Height in inches")

[Figure: scatter plot titled "Height in inches", with height_mother on the x-axis and height on the y-axis.]

Anthony Tanbakuchi MAT167

Question 1.
Does it look like there is a linear relationship? Draw a best-fit line through the data.

Paired data. Definition 1.1
A set of (x_i, y_i) data where each pair is related. (Dependent samples.)
ex: mother height, child height.

1.2 Linear Correlation

Correlation. Definition 1.2
Exists when two variables have a relationship with one another.

[Figure: two scatter plots of y against x, titled "linear correlation" and "non-linear correlation".]

[Figure: four scatter plots of y against x, titled "perfect positive", "strong positive", "positive", and "no correlation".]

[Figure: four scatter plots of y against x, titled "perfect negative", "strong negative", "negative", and "no correlation".]

USE
Often used to help answer:
1. Is there a linear relationship between X and Y?
2. Can X be used to predict Y?
3. How much of the variation in Y can be predicted with X?

COMPUTATION
Linear correlation coefficient. Definition 1.3
The linear correlation coefficient for a population is denoted with ρ.
We can estimate ρ via a sample and calculate Pearson's linear correlation coefficient r:

$$ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(n - 1)\, s_x s_y} \qquad (1) $$

n is the number of pairs of data points (length of x or y).

• Measures the strength of the linear relationship between x and y.
• Larger values of |r| indicate stronger linear relationship. [1]
• Positive r indicates positive slope, negative r indicates negative slope.

[1] Larger |r| does not indicate a steeper slope. We will find the slope later using regression.

Examples of r

[Figure: four scatter plots of y against x, titled "r = 1", "r = 0.99", "r = 0.86", and "r = −0.1".]

[Figure: four scatter plots of y against x, titled "r = −1", "r = −0.99", "r = −0.88", and "r = −0.1".]

Properties of r (ρ for populations)
1. −1 ≤ r ≤ +1
2. r is scale invariant.
3. r is invariant if x and y are interchanged.
4. r only measures the strength of linear relationships.

Coefficient of determination (explained variation). Definition 1.4
r^2 is the proportion of linear variation in y that is explained by x.

• 0 ≤ r^2 ≤ 1
• The closer r^2 is to 1 the stronger the linear relationship and likewise the more variation in y that can be explained by x.

Example of r^2

[Figure: scatter plot of y against x, titled "r = −0.88 r^2 = 0.78".]

Hypothesis test for linear correlation. Definition 1.5
requirements: (1) simple paired (x, y) random samples, (2) pairs of (x, y) have a bivariate normal distribution, (3) correlation is linear.
null hypothesis: ρ = 0 (no linear correlation)
alternative hypothesis: ρ ≠ 0 (a linear correlation exists)

Always make a scatter plot first to see if the
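
The computations defined above can be sketched in R as follows. This is only an illustration, not the worked example from the notes: it uses just the six (height, height_mother) pairs listed in the data table rather than the full class.data, so the numbers it produces will not match the class result. The variable names r_manual, r_builtin, r_squared, r_scaled, r_swapped, and test are made up for this sketch; cor() and cor.test() are standard functions from R's stats package.

```r
# First six (height, height_mother) pairs from the table, in inches.
# Assumption: this is only an excerpt of the class data, used for illustration.
height        = c(65, 68, 71, 66, 68, 65)
height_mother = c(65, 67, 63, 64, 65, 62)

x = height_mother
y = height
n = length(x)  # number of (x, y) pairs

# Equation (1): Pearson's r computed directly from its definition.
# sd() uses the n - 1 denominator, matching s_x and s_y.
r_manual = sum((x - mean(x)) * (y - mean(y))) / ((n - 1) * sd(x) * sd(y))

# The built-in function should give the same value.
r_builtin = cor(x, y)

# Coefficient of determination: proportion of variation in y explained by x.
r_squared = r_manual^2

# Property 2 (scale invariance): converting x from inches to cm leaves r unchanged.
r_scaled = cor(2.54 * x, y)

# Property 3 (symmetry): swapping x and y leaves r unchanged.
r_swapped = cor(y, x)

# Two-sided hypothesis test of H0: rho = 0 against H1: rho != 0.
test = cor.test(x, y, alternative = "two.sided", method = "pearson")

print(c(r_manual = r_manual, r_builtin = r_builtin, r_squared = r_squared))
print(test$p.value)
```

With the full data set, the same calls would presumably take class.data$height_mother and class.data$height as x and y. A small p-value from cor.test() would be evidence against ρ = 0, provided the requirements in Definition 1.5 are reasonable for the data.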

