UCLA STATS 10 - slides_chapters4


Chapter 4: Regression Analysis: Exploring Associations Between Variables

Scatterplots
• Scatterplots are the best way to start observing the relationship between two quantitative variables and the ideal way to picture their association.
• In a scatterplot, you can see patterns, trends, relationships, and even the occasional extraordinary value sitting apart from the others.
• The variable on the x-axis is called the explanatory variable (also independent), and the variable on the y-axis is called the response variable (also dependent).
• When looking at scatterplots, we look for direction, form, strength, and unusual features.

Direction
• A pattern that runs from lower left to upper right is said to have a positive direction (as x increases, y increases as well).
• A trend running from upper left to lower right has a negative direction (as x increases, y decreases).

Form
• If the points appear as a cloud or swarm of points stretched out in a generally consistent, straight form, the form of the relationship is linear.
• Otherwise we categorize the relationship as non-linear.

Strength
• If there does not appear to be a lot of scatter, there is a strong relationship between the two variables.
• If there appears to be some scatter, there is a weak relationship between the two variables.
• If there appears to be lots of scatter, there is no relationship between the two variables.

Unusual Features
• Look for the unexpected; what you never thought to look for might be an interesting feature.
• One example of such a surprise is an outlier standing away from the overall pattern of the scatterplot.
• Clusters or subgroups should also raise questions.

Celebrity Couples
• According to internet lore, there's a mathematical equation that governs the lower bound for the socially acceptable age of a potential dating partner: half your age plus 7. In mathematical terms, if x is your age, then the lower bound is x/2 + 7.

Correlation Coefficient
• The correlation coefficient (r) (also called the Pearson correlation coefficient) gives us a numerical measurement of the strength of the linear relationship between the explanatory and response variables.
Note: You will not be asked to calculate the correlation coefficient, but you may be asked to estimate it and interpret it.

Correlation Properties
• The sign of a correlation coefficient gives the direction of the association.
• Correlation is always between -1 and +1.
• Correlation can be exactly equal to -1 or +1, but these values are unusual in real data because they mean that all the data fall exactly on a single straight line.
• A correlation near zero corresponds to a weak linear association.
• Correlation treats x and y symmetrically: the correlation of x with y is the same as the correlation of y with x.
• Rough guide to strength, by the size of r: strong = 0.7 to 1, moderate = 0.3 to 0.7, weak = 0.1 to 0.3.
• Correlation has no units.
• Whether arm length and height are measured in centimeters or in inches, the correlation coefficient is still 0.88.
• Correlation measures the strength of the linear association between the two numerical variables.
• Variables can have a strong association but still have a small correlation if the association is not linear.
• Correlation is sensitive to outliers. A single outlying value can make a small correlation large or make a large one small.

Correlation not Causation
• A high correlation between two variables does NOT imply causation.

Ice Cream Sales and Shark Attacks
• As ice cream sales increase, so do shark attacks. Does that mean increased ice cream sales cause an increase in shark attacks? No.

Which of the following show a strong correlation? [numbered scatterplots not included in this preview]
a) 1, 2, 4, 5
b) 3, 6
c) 2, 5, 6
d) 2, 5

Which of the following has a correlation of r = 0.93? [numbered scatterplots not included in this preview]
a) 1
b) 2
c) 3
d) 4

Fat and Sodium
• Fast food is often considered unhealthy because much of it is high in both fat and sodium. But are the two related?
• There does not appear to be a relationship between sodium and fat content in burgers. The correlation of 0.199 shows a weak relationship between the two.

What about Fat and Calories?
• There appears to be a strong, positive linear relationship between fat and calories. The correlation of 0.962 supports the conclusion of a strong relationship. But there appears to be an outlier at 410 calories and 19 grams of fat.
• Even without the outlier at 410 calories and 19 grams of fat, the correlation is 0.837, still strong.

Fat and Protein?
• Now we have a scatterplot of total fat versus total protein for 32 items on the Burger King menu.

Linear Model
• Correlation says "there seems to be a linear association between these two variables," but it doesn't tell us what that association is.
• We can say more about the linear relationship between two quantitative variables with a model.
• A model simplifies reality to help us understand underlying patterns and relationships.
• The linear model is just an equation of a straight line through the data.
• The points in the scatterplot don't all line up, but a straight line can summarize the general pattern.
• The linear model can help us understand how the explanatory variable and the response variable are associated.

Residuals
• The model won't be perfect, regardless of the line we draw.
• Some points will be above the line and some will be below the line.
• The estimate made from a model is the predicted value, denoted as ŷ.
• The difference between the observed value and its associated predicted value is called the residual.
• To find the residuals, we always subtract the predicted value from the observed one: residual = observed - predicted.
• The residual is the vertical distance between the line and the data point: error = observed y - predicted y.
• A negative residual means the predicted value is too big (an overestimate).
• A positive residual means the predicted value is too small (an underestimate).

"Best Fit" Means Least Squares
• Some residuals are positive, others are negative, and, on average, they cancel each other out.
• So, we can't assess how well the line fits by adding up all the residuals.
• Similar to what we did with deviations, we square the residuals and add the squares.
• The smaller the sum, the better the fit.
• The line of best fit is the line for which the sum of the squared residuals is smallest.

Fathom Example of Best Fit Line
[figure not included in this preview]

The Least Squares Line
• We write the linear model as: [equation not included in this preview]
• This model says
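
The correlation properties listed earlier (no units, symmetry in x and y, sensitivity to outliers, and blindness to non-linear association) can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `pearson_r` and all data values are hypothetical, chosen only for illustration, and the formula used is the standard definition of the Pearson correlation coefficient.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements (not from the slides), in centimeters.
arm_cm = [62.0, 65.0, 68.0, 70.0, 73.0, 75.0]
height_cm = [158.0, 165.0, 170.0, 172.0, 180.0, 183.0]

# No units: converting both variables to inches leaves r unchanged.
r_cm = pearson_r(arm_cm, height_cm)
r_in = pearson_r([a / 2.54 for a in arm_cm], [h / 2.54 for h in height_cm])

# Symmetry: swapping the roles of x and y gives the same r.
r_swapped = pearson_r(height_cm, arm_cm)

# Outlier sensitivity: one unusual point changes r dramatically.
r_outlier = pearson_r(arm_cm + [95.0], height_cm + [150.0])

# Non-linear association: y = x^2 is a perfect relationship, yet r is 0 here.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
r_curve = pearson_r(curve_x, [x ** 2 for x in curve_x])

print(f"r (cm) = {r_cm:.3f}, r (inches) = {r_in:.3f}, r (swapped) = {r_swapped:.3f}")
print(f"r with one outlier = {r_outlier:.3f}")
print(f"r for y = x^2 = {r_curve:.3f}")
```

In this made-up data set a single added point is enough to turn a strong positive correlation into a negative one, which is why it is worth looking at the scatterplot before trusting r.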

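
The residual rule from the Residuals slides (residual = observed - predicted) can be sketched as follows. The line `calories = 210 + 11 * fat` and the three observations are hypothetical values invented for this illustration; they are not the model or the data behind the slides.

```python
def predict_calories(fat_g, intercept=210.0, slope=11.0):
    """Predicted calories from a hypothetical linear model (assumed coefficients)."""
    return intercept + slope * fat_g


# Hypothetical (fat in grams, observed calories) pairs, not from the slides.
observations = [(20.0, 420.0), (30.0, 560.0), (40.0, 650.0)]

residuals = []
labels = []
for fat_g, observed in observations:
    predicted = predict_calories(fat_g)
    residual = observed - predicted  # always observed minus predicted
    if residual < 0:
        label = "overestimate"   # prediction was too big
    elif residual > 0:
        label = "underestimate"  # prediction was too small
    else:
        label = "exact"
    residuals.append(residual)
    labels.append(label)
    print(f"fat={fat_g:g} g  observed={observed:g}  predicted={predicted:g}  "
          f"residual={residual:+g}  ({label})")
```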

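
The least squares idea (pick the line with the smallest sum of squared residuals) can be sketched as below. Assumptions: the names `b0` (intercept) and `b1` (slope) and the data are hypothetical, and the slope and intercept formulas are the standard closed-form least squares solution rather than anything taken from the slides, whose own equation is missing from this preview.

```python
def fit_least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line for the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the point of means
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of (observed - predicted)^2 for the line intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_least_squares(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)

# Any nearby line with a shifted intercept or slope fits worse.
nearby = [
    sum_squared_residuals(xs, ys, b0 + d0, b1 + d1)
    for d0 in (-0.5, 0.0, 0.5)
    for d1 in (-0.2, 0.0, 0.2)
    if (d0, d1) != (0.0, 0.0)
]

# Raw residuals cancel out, which is why they are squared before summing.
residual_total = sum(y - (b0 + b1 * x) for x, y in zip(xs, ys))

print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
print(f"sum of squared residuals = {best:.4f}")
print(f"smallest sum among nearby lines = {min(nearby):.4f}")
print(f"sum of raw residuals = {residual_total:.2e}")
```

Comparing against a handful of nearby lines is only a spot check, not a proof, but it shows the pattern the slides describe: the smaller the sum of squares, the better the fit.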