UVA STAT 2120 - Topic_02

This preview shows page 1-2-15-16-17-32-33 out of 33 pages.


Examining Relationships
Least-squares regression (Section 2.3)

The regression line
  A regression line describes a one-way linear relationship between variables.
  An explanatory variable, x, “explains” variability in a response variable, y.
  Often one wants to predict y from a given x.

The least-squares regression line
  The “least-squares” regression line is:
      ŷ = b0 + b1 x
  with “slope”
      b1 = r (s_y / s_x)
  and “intercept”
      b0 = ȳ - b1 x̄
  where s_y is the standard deviation in y, s_x is the standard deviation in x, r is the correlation of x and y, and x̄ and ȳ are the means of x and y.
  A prediction, ŷ, is made by plugging in a value of x.

Interpretations
  The least-squares line minimizes the sum of squared prediction errors.
  Note: these are vertical prediction errors; interchanging x and y would modify the formulation.
  Slope, b1, is the amount of change in ŷ when x increases by one unit.
  Intercept, b0, is the prediction, ŷ, at x = 0.
  Example: BAC data.
    Each beer increases predicted BAC by 0.0180.
    Predicted BAC after no beers is -0.0127 ≈ 0.

Coefficient of determination, r^2
  The coefficient of determination, r^2, measures the proportion of variability in y explained by the regression line.
  [Figure: variability in y compared with variability in ŷ, the remainder being random variability; r^2 = 0.76]
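As a minimal sketch of the calculation described above, the following Python computes a least-squares line from the correlation and standard deviations, using the standard formulas b1 = r (s_y / s_x) and b0 = ȳ - b1 x̄, and then checks that r^2 equals the share of variability in y reproduced by the predictions. The function names and the five data points are made up for illustration; they are not the BAC data from the slides. Only the two coefficients in `predicted_bac` (0.0180 and -0.0127) come from the slides.

```python
import math


def least_squares_line(xs, ys):
    """Return (b0, b1, r) for the least-squares line y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divisor n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Correlation of x and y.
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b1 = r * s_y / s_x        # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1, r


def variance(values):
    """Sample variance (divisor n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


# Hypothetical illustration data, NOT taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

b0, b1, r = least_squares_line(xs, ys)
y_hat = [b0 + b1 * x for x in xs]

# Proportion of variability explained: variance of y-hat over variance of y.
explained = variance(y_hat) / variance(ys)
print(f"slope={b1:.4f} intercept={b0:.4f} r^2={r ** 2:.4f} explained={explained:.4f}")


def predicted_bac(beers):
    """Prediction from the slide's rounded BAC coefficients."""
    return -0.0127 + 0.0180 * beers
```

Calling `predicted_bac(0)` returns the intercept, and each additional beer raises the prediction by the slope, which matches the interpretations of b0 and b1 given above.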

Residuals
  Analysis of residuals, y - ŷ, helps to assess the suitability of a linear relationship.

Residual plots
  The ideal plot of residuals (y - ŷ against x) would exhibit no systematic pattern.

Problem indicators
  Systematic patterns suggest complications and possible invalidity in the use of linear regression.
  Curved pattern: deviations from the linear form.
  Trends in spread: less prediction accuracy in some regions of x.

Influential observations
  An influential observation is an observation whose deletion would drastically change the regression line.
  [Figure: regression lines fitted to the complete data and with point #3 deleted]
  Often an outlier in x, but may not be an outlier in y.

Examining Relationships
Cautions about correlation and regression (Section 2.4)

Basic cautions
  Correlation is for two-way relationships; regression is for one-way relationships.
  Both are only relevant for linear relationships.
  Neither is resistant.

Extrapolation
  Extrapolation is when predictions are made outside the range of data.
  The linear relationship may be untrustworthy outside the range of data.
  Example: BAC data.
    Predicted BAC after x = 24 beers: ŷ = 0.4184.
    Predicted BAC after x = 36 beers: ŷ = 0.6340.
    [Figure annotation: unconsciousness or death]

Lurking variables
  A lurking variable may influence the relationship between variables.
  An unobserved lurking variable may explain puzzling associations.
  Example: Higher rates of red-wine drinking ⇔ Better levels of overall health
  Possible lurking variables: income, other lifestyle tendencies, etc.

Association is not causation
  An observed association may reflect the influence of a causal lurking variable.
  An experiment that controls lurking variables is best for establishing causation.
  Example: BAC: control weight, gender, etc.
  Causation may be established in other ways, but with weaker
  evidence.

Examining Relationships
Relationships in categorical data (Section 2.5)

Two-way tables
  Relationships in categorical data may be explored by compiling variables in two-way tables.
  Row variable: Education. Column variable: Age group.

  (cnt./1000)              Age group
  Education           25-34    35-54      55+
  < High school        4459     9174    14226
  High school         11562    26455    20060
  College 1-3 yrs.    10693    22647    11125
  College 4+ yrs.     11071    23160    10597

Marginal distributions
  The marginal distributions are the individual distributions of the row and column variables.
  (They appear in the margins of the two-way table.)

  (cnt./1000)              Age group               Row
  Education           25-34    35-54      55+   totals
  < High school        4459     9174    14226    27859
  High school         11562    26455    20060    58077
  College 1-3 yrs.    10693    22647    11125    44465
  College 4+ yrs.     11071    23160    10597    44828
  Column totals       37785    81436    56008   175229

Conditional distributions
  A conditional distribution is calculated from the counts of one variable limited to a given category of the other variable.

Visualizing relationships
  Describe relationships with conditional distributions.

  (% of row)               Age group               Row
  Education           25-34    35-54      55+   totals
  < High school         16%      33%      51%     100%
  High school           20%      45%      35%     100%
  College 1-3 yrs.      24%      51%      25%     100%
  College 4+ yrs.       25%      52%      24%     100%

  [Figure: four bar charts, one per education level (< High school; High school; College, 1-3 yrs.; College, 4+ yrs.), each showing the percentage in age groups 25-34, 35-54 and 55+ on a 0% to 60% axis]

Producing data
Introduction (Chapter 3)

Observational studies and experiments
  Central issue: the (undesirable) possibility of “confounding” between an explanatory variable and a lurking variable.
  In an observational study, individuals are observed, but no attempt is made to control the conditions of data-production.
    Often plagued by confounding with lurking variables.
  In an experiment, the conditions of data-production are controlled by applying treatments to individuals.
    Avoids all types of confounding.

Producing data
Designing samples (Section 3.1)

Key elements of a sampling study
  Population: a collection of individuals about which the conclusions of statistical inference are to be relevant.
  Sample: the subset of a population on which data are measured and put to analysis.
  Sampling design: the method used to select the sample from the population.

Biased sampling designs
  Biased sampling: favors some portions of the population over others.
  Examples:
    Voluntary sampling: individuals are self-selected by responding to an incentive.
    Convenience sampling: selection is determined by the
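
Returning to the two-way table of education by age group in Section 2.5, the marginal and conditional distributions can be recomputed directly from the counts. The sketch below is one way to do it in plain Python; the variable names are illustrative assumptions, while the counts (in thousands) are copied from the slides.

```python
# Counts in thousands, copied from the two-way table:
# rows are education levels, columns are age groups.
age_groups = ["25-34", "35-54", "55+"]
counts = {
    "< High school":    [4459, 9174, 14226],
    "High school":      [11562, 26455, 20060],
    "College 1-3 yrs.": [10693, 22647, 11125],
    "College 4+ yrs.":  [11071, 23160, 10597],
}

# Margins of the table.
row_totals = {edu: sum(row) for edu, row in counts.items()}
column_totals = [
    sum(row[j] for row in counts.values()) for j in range(len(age_groups))
]
grand_total = sum(row_totals.values())

# Marginal distributions: each margin divided by the grand total.
marginal_education = {edu: t / grand_total for edu, t in row_totals.items()}
marginal_age = [t / grand_total for t in column_totals]

# Conditional distribution of age group within each education level:
# each row of counts divided by its own row total.
conditional_age_given_education = {
    edu: [c / row_totals[edu] for c in row] for edu, row in counts.items()
}

for edu, dist in conditional_age_given_education.items():
    shares = "  ".join(
        f"{group}: {100 * p:.1f}%" for group, p in zip(age_groups, dist)
    )
    print(f"{edu:<18}{shares}")
```

The printed shares should agree with the percentage table in the slides to within a percentage point; the slides show whole percentages, so small rounding differences are expected.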

