Cal Poly STAT 252 - Scatterplots, Association, Correlation

STAT 252 Handout 12 Winter 2010

Scatterplots, Association, Correlation

We will study relationships between two quantitative variables this week.

• A scatterplot is a graphical display of the relationship between two quantitative variables.
   o The explanatory variable goes on the horizontal (x-) axis, the response on the vertical (y-) axis.
• We examine a scatterplot for evidence of association between the variables.

Three aspects of association to look for are:

• Form
   o The form of the association is linear if a straight line appears to summarize the relationship between the variables.
• Direction
   o Positive association means that larger values of one variable tend to appear with larger values of the other, and smaller values of one variable tend to appear with smaller values of the other.
   o Negative association means that larger values of one variable tend to appear with smaller values of the other, and smaller values of one variable tend to appear with larger values of the other.
• Strength
   o Strength refers to the degree to which the data points follow a recognizable form.

Example 1: House prices

The following table reports the price and size (in square feet) for a sample of houses in Arroyo Grande, California. These data were obtained from the website zillow.com on February 7, 2007, for a random sample of houses listed on that site as recently sold.

Address               Price ($)   Size (sq ft)     Address               Price ($)   Size (sq ft)
2130 Beach St         311,000      460             1030 Sycamore Dr      490,000     1664
2545 Lancaster Dr     344,720     1030             620 Eman Ct           492,000     1160
415 Golden West Pl    359,500      883             529 Adler St          500,000     1545
990 Fair Oaks Ave     414,000      728             646 Cerro Vista Cir   510,000     1567
845 Pearl Dr          459,000     1242             926 Sycamore Dr       520,000     1176
1115 Rogers Ct        470,000     1499             227 S Alpine St       541,000     1120
579 Halcyon Rd        470,000     1419             654 Woodland Ct       567,500     1549
1285 Poplar St        470,000      952             2230 Paso Robles St   575,000     1540
1080 Fair Oaks Ave    474,000     1014             2461 Ocean St         580,000     1755
690 Garfield Pl       475,000     1615             833 Creekside Dr      625,000     1844

(a) What are the observational units here?

(b) How many variables are reported in the table for each observational unit? What type (categorical or quantitative) is each variable?

(c) Do the data suggest that bigger houses tend to cost more than smaller ones? Hint: Notice that the houses are presented in order by price, from least to most expensive.

Consider the following scatterplot of price vs. size (our convention is to say y vs. x, with the first variable (y) on the vertical axis):

[Scatterplot: Price (vertical axis, $300,000 to $650,000) vs. Size in sq ft (horizontal axis, 500 to 2000)]

(d) What is the address for the house represented by the point in the bottom left of the graph? How about the point in the top right?
Bottom left:
Top right:

(e) Circle the point corresponding to the house at 845 Pearl Drive (1242 square feet, $459,000).

(f) Does the scatterplot reveal any relationship between a house’s size and its price? In other words, does knowing a house’s size provide any useful information about its price? Write a sentence describing the relationship between the two variables.

(g) Is house size positively or negatively associated with price? Would you describe the association as strong, moderate, or weak? Is the association roughly linear?
Direction:
Strength:
Form:

(h) Find an example of a pair of houses, where one house is larger than the other but costs less. Provide their addresses and circle the pair of points on the scatterplot above.

Example 2: Birth rates and death rates

The following scatterplot displays the relationship between the death rate and the birth rate of the 50 states, both measured per 1000 residents, as of 1997:

[Scatterplot: death rate (vertical axis, 4 to 12) vs. birth rate (horizontal axis, 10 to 20), with three states labeled A, B, and C]

(a) Identify the observational units in this study.

(b) Identify the variables in this study. Also classify each as categorical or quantitative.

(c) Describe the overall pattern in this scatterplot. Specifically, do states with higher birth rates tend to have lower, higher, or the same death rates as states with lower birth rates?

(d) For each of the lettered states, describe how its birth and death rates compare to the others.

(e) These three states are Alaska, Florida, and Utah. Guess which state goes with which letter. Explain the reasoning behind your guesses.

Example 3: New car data

The following nine scatterplots pertain to variables measured on models of new cars in 1999.

[Nine scatterplots, each listed as vertical-axis variable vs. horizontal-axis variable:
A: 1/4 mile vs. weight
B: 1/4 mile vs. acc 0-60
C: fuel cap vs. page num
D: city mpg vs. weight
E: frontwgt vs. city mpg
F: hwy mpg vs. city mpg
G: hwy mpg vs. fuel cap
H: frontwgt vs. fuel cap
I: 1/4 mile vs. city mpg]

Arrange these plots from the most strongly negative to the most strongly positive association:

Strong negative | Moderate negative | Virtually none | Moderate positive | Strong positive

Example 4: Space shuttle Challenger disaster

The following scatterplots display the number of O-ring seals showing evidence of thermal distress vs. the air temperature at launch for the 23 space shuttle missions preceding the fatal launch of Challenger in January 1986:

[Two scatterplots of O-ring failures (vertical axis) vs. temperature (horizontal axis). Left: temperature 50 to 80, failures 0 to 3. Right: temperature 55 to 75, failures 1 to 3.]

(a) Explain the difference in how the two graphs were constructed.

(b) Would you say that the graph on the left reveals an association between number of O-ring failures and launch temperature? If so, describe its form, direction, and strength.

(c) Repeat (b) for the graph on the right.

(d) Which graph is more informative? Explain.

Example 5: Televisions and life expectancy

The following table provides information on life expectancy and number of televisions per thousand people in a sample of 22 countries, as reported by the 2006 World Almanac and Book of Facts:

Country       Life Expectancy   TVs per 1000 People     Country           Life Expectancy   TVs per 1000 People
Angola        38.45              15                     Mexico            75.25             272
Australia     79.95             716                     Morocco           70.75             165
Cambodia      59.00               9                     Pakistan          63.00             105
Canada        80.15             709                     Russia            67.30             421
China         72.40             291                     South Africa      43.30             138
Egypt         71.05             170                     Sri Lanka         73.25             102
France        79.70             620                     Uganda            51.60              28
Haiti         52.95               5                     United Kingdom    78.45             661
Iraq
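
Returning to Example 1: the title of this handout mentions correlation, but the text shown here stops before defining it. As a rough sketch, assuming the intended measure is the usual Pearson correlation coefficient r, the Python below computes r for size and price from the Example 1 table, finds the houses at the two ends of the size axis (useful for checking part (d)), and lists pairs in which the larger house costs less (part (h)). The names `houses`, `pearson_r`, and `larger_but_cheaper` are illustrative and do not come from the handout.

```python
from math import sqrt

# (address, price in dollars, size in square feet), copied from the Example 1 table
houses = [
    ("2130 Beach St", 311_000, 460),
    ("2545 Lancaster Dr", 344_720, 1030),
    ("415 Golden West Pl", 359_500, 883),
    ("990 Fair Oaks Ave", 414_000, 728),
    ("845 Pearl Dr", 459_000, 1242),
    ("1115 Rogers Ct", 470_000, 1499),
    ("579 Halcyon Rd", 470_000, 1419),
    ("1285 Poplar St", 470_000, 952),
    ("1080 Fair Oaks Ave", 474_000, 1014),
    ("690 Garfield Pl", 475_000, 1615),
    ("1030 Sycamore Dr", 490_000, 1664),
    ("620 Eman Ct", 492_000, 1160),
    ("529 Adler St", 500_000, 1545),
    ("646 Cerro Vista Cir", 510_000, 1567),
    ("926 Sycamore Dr", 520_000, 1176),
    ("227 S Alpine St", 541_000, 1120),
    ("654 Woodland Ct", 567_500, 1549),
    ("2230 Paso Robles St", 575_000, 1540),
    ("2461 Ocean St", 580_000, 1755),
    ("833 Creekside Dr", 625_000, 1844),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists (assumed measure)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


sizes = [size for _, _, size in houses]     # explanatory variable (x)
prices = [price for _, price, _ in houses]  # response variable (y)
r = pearson_r(sizes, prices)

# Houses with the smallest and largest size (left-most and right-most points)
smallest = min(houses, key=lambda h: h[2])
largest = max(houses, key=lambda h: h[2])

# Ordered pairs (a, b) where house a is larger than house b but costs less
larger_but_cheaper = [
    (a[0], b[0])
    for a in houses
    for b in houses
    if a[2] > b[2] and a[1] < b[1]
]

print(f"r = {r:.3f}")
print("Smallest house:", smallest)
print("Largest house:", largest)
print("Pairs where the larger house costs less:", len(larger_but_cheaper))
```

A value of r near +1 indicates a strong positive linear association, a value near -1 a strong negative one, and a value near 0 little linear association. The number alone says nothing about form, so it complements the scatterplot rather than replacing it.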

