STAT 13, UCLA, Ivo Dinov

Slide 1
UCLA STAT 13
Introduction to Statistical Methods for the Life and Health Sciences
- Instructor: Ivo Dinov, Asst. Prof. in Statistics and Neurology
- Teaching Assistants: Tom Daula and Kaiding Zhu, UCLA Statistics
University of California, Los Angeles, Fall 2002
http://www.stat.ucla.edu/~dinov/

Slide 2
Chapter 3: Exploratory Tools for Relationships
Tools for assessing relationships between
- Two quantitative variables
- A quantitative and a qualitative variable
- Two qualitative variables

Slide 3
Use scatter plots to explore relationships between quantitative variables.
Figure 3.1.1 Scatter plot of SYSVOL versus DIAVOL for the heart-attack data in Table 2.1.1.
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.

Slide 4
Example: Deaths and radiation in milk after the Chernobyl accident

TABLE 3.1.1 Deaths and Radiation in Milk after Chernobyl
Region               Peak radioactivity in milk (picocuries/L)   Percentage increase in death rate
Middle Atlantic      23                                          2.2
South Atlantic       20                                          2.4
New England          22                                          1.9
East North-Central   29                                          3.9
West North-Central   32                                          3.6
East Southern        21                                          2.6
Central Southern     16                                          0
Mountain             37                                          4.2
Pacific              44                                          5

Figure 3.1.2 Chernobyl data (x-axis: Radiation (picocuries/L)).
From Chance Encounters by C.J. Wild and G.A.F.
Seber, © John Wiley & Sons, 2000.

Slide 5
Example: Computer timings data

TABLE 3.1.2 Computer Timings Data
Number of terminals:    40    50    60    45    40    10    30    20
Time per task (secs):   9.9   17.8  18.4  16.5  11.9  5.5   11    8.1
Number of terminals:    50    30    65    40    65    65
Time per task (secs):   15.1  13.3  21.8  13.8  18.6  19.8

Figure 3.1.3 Computer timings data (x-axis: Number of terminals).

Slide 6
Example: Car emissions

TABLE 3.1.3 Gaseous Emissions in Car Exhausts (grams per mile)
Car  HC    CO     NOX     Car  HC    CO     NOX     Car  HC    CO     NOX
 1   0.50   5.01  1.28    17   0.83  15.13  0.49    32   0.52   4.29  2.94
 2   0.65  14.67  0.72    18   0.57   5.04  1.49    33   0.56   5.36  1.26
 3   0.46   8.60  1.17    19   0.34   3.95  1.38    34   0.70  14.83  1.16
 4   0.41   4.42  1.31    20   0.41   3.38  1.33    35   0.51   5.69  1.73
 5   0.41   4.95  1.16    21   0.37   4.12  1.20    36   0.52   6.35  1.45
 6   0.39   7.24  1.45    22   1.02  23.53  0.86    37   0.57   6.02  1.31
 7   0.44   7.51  1.08    23   0.87  19.00  0.78    38   0.51   5.79  1.51
 8   0.55  12.30  1.22    24   1.10  22.92  0.57    39   0.36   2.03  1.80
 9   0.72  14.59  0.60    25   0.65  11.20  0.95    40   0.48   4.62  1.47
10   0.64   7.98  1.32    26   0.43   3.81  1.79    41   0.52   6.78  1.15
11   0.83  11.53  1.32    27   0.48   3.45  2.20    42   0.61   8.43  1.06
12   0.38   4.10  1.47    28   0.41   1.85  2.27    43   0.58   6.02  0.97
13   0.38   5.21  1.24    29   0.51   4.10  1.78    44   0.46   3.99  2.01
14   0.50  12.10  1.44    30   0.41   2.26  1.87    45   0.47   5.22  1.12
15   0.60   9.62  0.71    31   0.47   4.74  1.83    46   0.55   7.47  1.39
16   0.73  14.97  0.51
Source: Lorenzen [1980].

HC = hydrocarbons; CO = carbon monoxide; NOX = nitrogen oxides; measurements in grams/mile; 46 identical vehicles tested.

Slide 7
Figure 3.1.4 Gaseous emissions in car exhausts (x-axis: Carbon monoxide (g/m)).
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.

Slide 8
Figure 3.1.5 Olympic winning times for the men's 1500 meters (x-axis: Year).
From Chance Encounters by C.J. Wild and G.A.F.
Seber, © John Wiley & Sons, 2000.

TABLE 3.1.4 Olympic Winning Times (in secs) for the Men's 1500 Meters (1900-1996)
Year   1900   1904   1908   1912   1920   1924   1928   1932   1936   1948   1952
Time   246.0  245.4  243.4  236.8  241.9  233.6  233.2  231.2  227.8  229.8  225.2
Year   1956   1960   1964   1968   1972   1976   1980   1984   1988   1992   1996
Time   221.2  215.6  218.1  214.9  216.3  219.2  218.4  212.5  216.0  220.1  215.8

Are we moving faster now?

Slide 9
Quiz on Section 3.1.1
- What is a quantitative variable?
- What basic tool is used for exploring relationships between quantitative variables?
- What is a controlled variable? (A variable whose values are determined in the experimental design, as opposed to a random variable, whose values are observed only once the experiment is conducted; e.g., number of terminals vs. task completion time.)
- What is the difference between a random and a nonrandom variable? (Nonrandom variables are those whose values are not observed as random events during the experiment; they are controlled, deterministic, or predictable variables, e.g., year for the running-time experiment.)

Slide 10
Regression relationship = trend + residual scatter
(a) Sales/income (x-axis: Disposable income ($)).
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 1999.
- Regression is a way of studying relationships between variables (random/nonrandom) for predicting or explaining the behavior of one variable (the response) in terms of others (explanatory variables or predictors).

Slide 11
(b) Oxygen uptake (x-axis: Ventilation).
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 1999.
Trend (does not have to be linear) + scatter (could be of any type/distribution)

Slide 12
(c) Liver lengths (x-axis: Gestational age (wk)).
From Chance Encounters by C.J. Wild and G.A.F.
Seber, © John Wiley & Sons, 1999.
Trend + scatter (fetus liver length in mm)
Change of scatter with age

Slide 13
Figure 3.1.7 Displacement versus weight for 74 models of automobile (x-axis: Weight (lbs)). Panels: (a) Scatter plot, with outliers marked; (b) With trend plus scatter.
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.
Trend + scatter
Dotted curves (confidence intervals) represent the extent of the scatter.

Slide 14
Looking vertically
Figure 3.1.8 Educating the eye to look vertically. Panels: (a) Which line? (b) Flatter line gives better predictions.
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.
The flatter line gives better predictions, since it passes approximately through the middle of the y-range for each fixed x-value (vertical line).

Slide 15
Figure 3.1.9 Scatter plot from the heart attack data (x-axis: Diastolic volume; two points are labeled A and B).
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.
Outliers: odd, atypical observations (either errors, such as B, or real data, such as A)

Slide 17
Figure 3.1.10 Parent's rating
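
The decomposition from Slide 10 (regression relationship = trend + residual scatter) can be made concrete with the computer timings data of Table 3.1.2. The sketch below is only an illustration under two assumptions that the slides do not state: the trend is taken to be a straight line, and it is fitted by ordinary least squares. The names fit_line, trend, and residuals are hypothetical and chosen for this example.

```python
# Computer timings data from Table 3.1.2:
# number of terminals (controlled variable) and time per task in seconds (response).
terminals = [40, 50, 60, 45, 40, 10, 30, 20, 50, 30, 65, 40, 65, 65]
seconds = [9.9, 17.8, 18.4, 16.5, 11.9, 5.5, 11.0, 8.1, 15.1, 13.3, 21.8, 13.8, 18.6, 19.8]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x.

    Assumption: a straight-line trend; the slides note that a trend
    does not have to be linear.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(terminals, seconds)

# Trend: the fitted value at each x. Residual: the vertical distance
# from the observation to the trend line ("looking vertically").
trend = [intercept + slope * x for x in terminals]
residuals = [y - t for y, t in zip(seconds, trend)]

print(f"trend: time = {intercept:.2f} + {slope:.2f} * terminals")
for x, y, t, r in zip(terminals, seconds, trend, residuals):
    print(f"{x:3d} terminals: observed {y:5.1f} = trend {t:5.2f} + residual {r:+5.2f}")
```

Each printed row splits an observed time into its trend value and its residual. Residuals are measured vertically, at a fixed number of terminals, which matches the "looking vertically" advice on Slide 14. An unusually large residual would mark a candidate outlier of the kind discussed on Slide 15.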