Stat 13, UCLA, Ivo Dinov

Slide 1
UCLA STAT 13: Introduction to Statistical Methods for the Life and Health Sciences
Instructor: Ivo Dinov, Asst. Prof. of Statistics and Neurology
Teaching Assistants: Brandi Shanata & Tiffany Head
University of California, Los Angeles, Fall 2007
http://www.stat.ucla.edu/~dinov/courses_students.html

Slide 2
Chapter 13: Regression & Correlation

Slide 3: Linear Relationships
- Analyze the relationship, if any, between variables x and y by fitting a straight line to the data.
  - If a relationship exists, we can use our analysis to make predictions.
- Data for regression consist of (x, y) pairs for each observation.
  - For example: the height and weight of individuals.

Slide 4: Lines in 2D (Regression and Correlation)
- Vertical lines
- Horizontal lines
- Oblique lines
- Increasing/decreasing
- Slope of a line
- Intercept
- Math equation for the line? In general, Y = αX + β.

Slide 5: Lines in 2D (Regression and Correlation)
- Draw the following lines: Y = 2X + 1 and Y = -3X - 5.
- Math equation for the line through (X1, Y1) and (X2, Y2)?
  (Y - Y1)/(Y2 - Y1) = (X - X1)/(X2 - X1).

Slide 6: Correlation Coefficient
- Correlation coefficient (-1 <= R <= 1): a measure of linear association, or clustering around a line, of multivariate data.
- The relationship between two variables (X, Y) can be summarized by (μX, σX), (μY, σY) and the correlation coefficient, R.
- R = 1: perfect positive correlation (straight-line relationship). R = 0: no correlation (random cloud scatter). R = -1: perfect negative correlation.
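The limiting cases R = 1, R = 0 and R = -1 can be checked numerically. The sketch below is only an illustration: the helper name `sample_correlation` and the five x values are made up here, and it assumes the sample SD (divisor N - 1) convention. It applies the standardize, multiply, average recipe to points lying exactly on the two lines from Slide 5 and to a symmetric pattern with no linear trend.

```python
import math


def sample_correlation(xs, ys):
    """Correlation R: standardize each variable, multiply, then average.

    Assumes sample SDs and an averaging divisor of N - 1.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


xs = [0, 1, 2, 3, 4]  # hypothetical x values, chosen only for illustration

r_up = sample_correlation(xs, [2 * x + 1 for x in xs])     # points on Y = 2X + 1
r_down = sample_correlation(xs, [-3 * x - 5 for x in xs])  # points on Y = -3X - 5
r_none = sample_correlation(xs, [1, -1, 0, -1, 1])         # no linear trend

print(r_up, r_down, r_none)
```

Up to floating-point rounding, the increasing line gives R = 1, the decreasing line gives R = -1, and the symmetric pattern gives R = 0.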
- Computing R(X,Y) (standardize, multiply, average):

$$R(X,Y) = \frac{1}{N-1}\sum_{k=1}^{N}\left(\frac{x_k-\mu_X}{\sigma_X}\right)\left(\frac{y_k-\mu_Y}{\sigma_Y}\right)$$

  where X = {x1, x2, …, xN}, Y = {y1, y2, …, yN}, and (μX, σX), (μY, σY) are the sample means / SDs.

Slide 7: Correlation Coefficient
Example: compute R(X,Y) with the formula from Slide 6.

Slide 8: Correlation Coefficient
Example (X in cm, Y in kg):

$$\mu_X = \frac{966}{6} = 161\ \text{cm}, \qquad \mu_Y = \frac{332}{6} \approx 55\ \text{kg}$$

$$\sigma_X = \sqrt{\frac{216}{5}} \approx 6.573, \qquad \sigma_Y = \sqrt{\frac{215.3}{5}} \approx 6.563$$

$$\mathrm{Corr}(X,Y) = R(X,Y) = 0.904$$

Slide 9: Correlation Coefficient - Properties
Correlation is invariant w.r.t. linear transformations of X or Y:

$$R(aX+b,\ cY+d) = R(X,Y),$$

since, for a > 0,

$$\frac{(a x_k + b) - \mu_{aX+b}}{\sigma_{aX+b}} = \frac{a x_k + b - (a\mu_X + b)}{|a|\,\sigma_X} = \frac{a\,(x_k-\mu_X)}{|a|\,\sigma_X} = \frac{x_k-\mu_X}{\sigma_X}$$

and likewise for cY + d with c > 0. A negative a or c flips the sign of the corresponding standardized term, so in general R(aX+b, cY+d) = sign(ac) R(X,Y).

Slide 10: Correlation Coefficient - Properties
Correlation is Associative (that is, symmetric in its two arguments):

$$R(X,Y) = \frac{1}{N-1}\sum_{k=1}^{N}\left(\frac{x_k-\mu_X}{\sigma_X}\right)\left(\frac{y_k-\mu_Y}{\sigma_Y}\right) = R(Y,X)$$

Correlation measures linear association, NOT association in general! So Corr(X,Y) can be misleading for X and Y related in a non-linear fashion.

Slide 11: Correlation Coefficient - Properties
1. R measures the extent of linear association between two continuous variables.
2. Association does not imply causation: both variables may be affected by a third variable (age was a confounding variable).

Slide 12: Linear Relationships

Destination    Distance  Airfare
Atlanta             576      178
Boston              370      138
Chicago             612       94
Dallas             1216      278
Detroit             409      158
Denver             1502      258
Miami               946      198
New Orleans         998      188
New York            189       98
Orlando             787      179
Pittsburgh          210      138
St. Louis           737       98

Slide 13: Linear Relationships
- Until now we have described data using statistics such as the sample mean.
- What seems to be missing from this one-sample view of the data?

Descriptive Statistics: Distance, Airfare

Variable   N   N*  Mean   SE Mean  StDev  Minimum  Q1     Median  Q3     Maximum
Distance   12  0   713    116      403    189      380    675     985    1502
Airfare    12  0   166.9  17.2     59.5   94.0     108.0  168.0   195.5  278.0

Slide 14: Linear Relationships
- This scatterplot gives us a view of how the dependent variable airfare (y) changes with the independent variable distance (x).
- From these data there appears to be a linear trend, but the data do not fall on an exact straight line.
  - It may still be reasonable to fit a line to these data.

[Figure: Scatterplot of Airfare vs Distance; x-axis: Distance, y-axis: Airfare]

Slide 15: Linear Relationships
SOCR SLR: socr.ucla.edu/htmls/SOCR_Analyses.html

Sample Size = 12
Dependent Variable = X
Independent Variable = Y

Simple Linear Regression Results:
Mean of Y = 166.917
Mean of X = 712.667
Regression Line: X = -186.090 + 5.384464042692271 Y
Correlation(Y, X) = .795
R-Square = .632

Intercept:
  Parameter Estimate: -186.090
  Standard Error: 229.137
  T-Statistics: -.812
  P-Value: .436
Slope:
  Parameter Estimate: 5.384
  Standard Error: 1.299
  T-Statistics: 4.144
  P-Value: .002

Slide 16: Linear Relationships
SOCR SLR: socr.ucla.edu/htmls/SOCR_Analyses.html

Slide 17: Linear Relationships
- Two contexts for regression:
  1. Y is an observed variable and X is specified by the researcher.
     Ex.: Y is hair growth after 2 months, for individuals at certain dose levels of hair growth cream (X).
  2. X and Y are both observed variables.
     Ex.: Height (Y) and weight (X) for 20 randomly selected individuals.

Slide 18: The Fitted Regression Line
- Suppose we have n pairs (x, y).
- If a scatterplot of the data suggests a general linear trend, it is reasonable to fit a line to the data.
- The question is: which is the best line?
  Example, Airfare (cont'd): We can see from the scatterplot that greater distance is associated with higher airfare. In other words, airports that are farther from Baltimore tend to have more expensive airfares.
- To decide on the best-fitting line, we use the least-squares method to fit the least-squares (regression) line.

Slide 19: Equation of the Regression Line
- RECALL: y = mx + b.
- In statistics we call this Y = b0 + b1 X, where
  Y is the dependent variable,
  X is the independent variable,
  b0 is the y-intercept,
  b1 is the slope of the line.

$$b_1 = \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sum_i (x_i-\bar{x})^2}, \qquad b_0 = \bar{y} - b_1\bar{x}$$

Slide 20: LS Estimates for the Linear Parameters
1. The least-squares line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ passes through the points $(x = 0,\ \hat{y} = ?)$ and $(x = \bar{x},\ \hat{y} = ?)$. Supply the missing values.

[The formulas for $\hat{\beta}_0$ and $\hat{\beta}_1$ on this slide are truncated in the source.]
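The least-squares formulas for the slope and intercept can be tried on the airfare table from Slide 12. The following is a minimal sketch, not the SOCR implementation: the helper names `least_squares` and `correlation` are invented for illustration, and the correlation is written in the algebraically equivalent form Sxy / sqrt(Sxx · Syy). The SOCR output on Slide 15 treats distance as the dependent variable, so the sketch fits the line in both directions.

```python
import math

# (destination, distance, airfare) rows from the Slide 12 table
data = [
    ("Atlanta", 576, 178), ("Boston", 370, 138), ("Chicago", 612, 94),
    ("Dallas", 1216, 278), ("Detroit", 409, 158), ("Denver", 1502, 258),
    ("Miami", 946, 198), ("New Orleans", 998, 188), ("New York", 189, 98),
    ("Orlando", 787, 179), ("Pittsburgh", 210, 138), ("St. Louis", 737, 98),
]
distance = [row[1] for row in data]
airfare = [row[2] for row in data]


def mean(values):
    return sum(values) / len(values)


def least_squares(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx           # slope
    b0 = y_bar - b1 * x_bar  # intercept
    return b0, b1


def correlation(xs, ys):
    """Correlation coefficient R, computed as Sxy / sqrt(Sxx * Syy)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Airfare (y) regressed on distance (x), as described on Slide 14
b0, b1 = least_squares(distance, airfare)

# Distance regressed on airfare, the direction used in the SOCR output
c0, c1 = least_squares(airfare, distance)

r = correlation(distance, airfare)

print(f"airfare  = {b0:.3f} + {b1:.4f} * distance")
print(f"distance = {c0:.3f} + {c1:.3f} * airfare")
print(f"R = {r:.3f}, R-square = {r * r:.3f}")
```

The distance-on-airfare fit should agree with the Slide 15 output to the precision reported there (intercept -186.090, slope 5.384, R = .795, R-Square = .632). Which direction to fit depends on which variable is being predicted. The two slopes are not reciprocals of each other unless R = ±1.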

