
UIUC PSYC 235 - Studying Relationships between Variables


Slide titles:
Psyc 235: Introduction to Statistics
End of Term Business
Questions?
Studying Relationships between Variables
Correlation & Regression: What's the difference?
Difference between correlation & regression
Last Week: Scatter Plots
Consider Data:
Correlation
Correlation Coefficient: r
Example for discussion:
What do we always do first in a study?
Covariance
Think about it…
Pearson Correlation Coefficient (r)
Last week you learned:
Looking at the data…
Optimal values of a & b
Standard Error of Estimate
Data Example
Confidence Limits on Y
CI on Y
Hypothesis Testing
Testing r
Testing b
CI on b
ICES Forms

Psyc 235: Introduction to Statistics
DON'T FORGET TO SIGN IN FOR CREDIT!
http://www.psych.uiuc.edu/~jrfinley/p235/

End of Term Business
• 3rd Graded Assessment next M & W
    Same procedure as the last two
    If you want to sign up for a specific time, please email one of us.
• Extra Credit Final Requirements
    84 hours on ALEKS by 11:59 pm April 30th
    You can check how many in-class hours you have during the assessment
    You must schedule an extra credit final (no walk-ins) by 11:59 May 4th (email is fine)
    Assessment can be scheduled anytime 9-5 on May 8th or May 9th

Questions?
You master      You receive
98% or more     A+
92% – 97%       A
84% – 91%       B
75% – 83%       C
65% – 74%       D
< 65%           F

Studying Relationships between Variables
• When considering relationships,
    An independent variable (X) usually has many quantitative levels
    Want to show that the dependent variable is some function of the independent variable
Correlation & Regression

Correlation & Regression: What's the difference?
• Usually, 2 observations for each of N subjects
    Consider these 2 examples:
    Running speed in a maze (Y) & number of trials to reach criterion (X)
    Running speed (Y) & the number of food pellets per reinforcement (X)
• What's the difference here?
• In both, Y is a random variable.
• In one, X is fixed; in the other, it is random.

Difference between correlation & regression
• When X is a fixed variable -- regression
• When X is a random variable -- correlation
• Note: in practice they are very similar, often differentiated by purpose of research
• Goal is to predict Y from X: regression
• Goal is to know degree of relationship: correlation

Last Week: Scatter Plots
[Figure not available in the preview]
X-axis: predictor variable
Y-axis: criterion variable

Consider Data:
[Figure: scatter plot, "Infant mortality as a function of # physicians". X-axis: physicians per 10,000 population. Y-axis: adjusted infant mortality.]
• What do you see?
• Unexpected?
• As we continue, think about why we might see this pattern
What is this line?
Regression Line: Y predicted on X
Given a value of X, our prediction of what Y is likely to be

Correlation
• Correlation is a measure of the degree to which the data cluster around that line. If all the data fall on the line, r = ±1 (+1 when the line slopes upward, -1 when it slopes downward).

Correlation Coefficient: r
• sign: direction of relationship
• magnitude (number): strength of relationship
• -1 ≤ r ≤ 1
• r = 0 is no linear relationship
• r = -1 is "perfect" negative correlation
• r = 1 is "perfect" positive correlation
• Notes:
    Symmetric measure (you can exchange X and Y and get the same value)
    Measures linear relationship only

Example for discussion:
• Howell (1988) investigated the relationship between stress and health.
    Measures:
    Stress: a scale to measure frequency, perceived importance, and desirability of life events
    Health: Hopkins Symptom Checklist for 57 psychological symptoms

What do we always do first in a study?
• Look at the raw data!
What do you notice?
    unimodal
    positively skewed
    few outliers
    good variability

Covariance
• The correlation coefficient is based on covariance
• Covariance: a number that reflects the degree to which two variables vary together
• $Cov_{XY} = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$

Think about it…
• If there were a positive correlation between these two variables, what would we expect to find?

Participant   Stress (X)   Symptoms (Y)
1             30           99
2             27           94
3             9            80
4             20           70
5             3            100
6             15           109
7             5            62
8             10           81
9             23           74
10            34           121
. . .         . . .        . . .

∑X = 2297      ∑X² = 67489     X̄ = 21.467    S_X = 13.096
∑Y = 9705      ∑Y² = 923787    Ȳ = 90.701    S_Y = 20.266
∑XY = 222576   N = 107

$Cov_{XY} = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$
or
$Cov_{XY} = \dfrac{\sum XY - \frac{\sum X \sum Y}{N}}{N - 1}$
$Cov_{XY} = \dfrac{222576 - \frac{(2297)(9705)}{107}}{106} = 134.301$

Pearson Correlation Coefficient (r)
• The way we were talking about the covariance suggests that it might be a measure of the relationship between 2 variables
• But: covariance is also a function of the standard deviations of X and Y
• To resolve, divide the covariance by both standard deviations:
$r = \dfrac{Cov_{XY}}{S_X S_Y} = \dfrac{134.301}{(13.096)(20.266)} = .506$

Last week you learned:
• Formula:
$r = \dfrac{1}{n - 1} \sum_{i=1}^{n} \left( \dfrac{x_i - \bar{x}}{s_x} \right) \left( \dfrac{y_i - \bar{y}}{s_y} \right)$
• alt. formula (ALEKS):
$r = \dfrac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{(n - 1) s_x s_y}$
• So this is the same formula for r, just in a slightly different form.
• r is the covariance, adjusted by the standard deviations
• Also note: this r is not an unbiased estimate of the correlation coefficient in the population, so there is an adjusted formula for small sample sizes (not common)

Looking at the data…
[Figure: scatter plot, "Infant mortality as a function of # physicians", with fitted line]
Equation for straight line: $\hat{Y} = bX + a$
    $\hat{Y}$ = predicted Y value
    b = slope
    a = intercept
    X = value of predictor variable
Our task is to solve for a & b to find the best-fitting linear function.
A logical way to do this is in terms of errors of prediction…
So, how far apart are the predicted Ys ($\hat{Y}$) and the actual Ys?

Optimal values of a & b
• $a = \bar{Y} - b\bar{X}$
• $b = \dfrac{Cov_{XY}}{s_X^2}$
• In our stress data set:
• b = .7831
• a = 73.891
• $\hat{Y} = .78X + 73.9$
So, we could find a predicted value of Y for any X…
And then find the difference between predicted and actual.

Estimated Regression Line
Find $(Y - \hat{Y})^2$
Error of prediction: $e_i = y_i - \hat{y}_i$
Equation of the regression line: $\hat{Y} = a + bX$
$\hat{y}_i$ = "y hat": predicted value of Y for $X_i$

b1 (slope):
$b_1 = \dfrac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}$
ALEKS:
$b_1 = \dfrac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{(n - 1) s_x^2}$
Using the correlation coefficient:
$b_1 = r \dfrac{s_y}{s_x}$
b0 (Y intercept):
$b_0 = \bar{Y} - b_1 \bar{X}$
$\hat{Y} = b_0 + b_1 X$

Standard Error of Estimate
• So if we wanted to predict a value of Y from X, we could just plug it
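
The stress-data calculations in the slides (covariance, r, slope b, and intercept a) can be checked from the summary statistics alone. The following Python sketch is not part of the lecture. It takes ∑XY = 222576, the value consistent with the reported CovXY = 134.301, and the variable names and the `predict` helper are illustrative assumptions.

```python
import math

# Summary statistics from the stress (X) / symptoms (Y) slides.
n = 107
sum_x, sum_x2 = 2297, 67489
sum_y, sum_y2 = 9705, 923787
sum_xy = 222576  # assumption: the value consistent with CovXY = 134.301

mean_x = sum_x / n
mean_y = sum_y / n

# Computational formulas: sample variances and covariance (divide by N - 1).
var_x = (sum_x2 - sum_x**2 / n) / (n - 1)
var_y = (sum_y2 - sum_y**2 / n) / (n - 1)
cov_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)

s_x = math.sqrt(var_x)
s_y = math.sqrt(var_y)

# Pearson r: covariance scaled by both standard deviations.
r = cov_xy / (s_x * s_y)

# Least-squares slope and intercept for predicting Y from X.
b = cov_xy / var_x
a = mean_y - b * mean_x


def predict(x):
    """Hypothetical helper: predicted symptoms score for a stress score x."""
    return a + b * x


print(f"cov = {cov_xy:.3f}, r = {r:.3f}, b = {b:.4f}, a = {a:.3f}")
```

Calling `predict(x)` with any stress score then gives the point on the fitted line. The slope can equally be obtained as `r * s_y / s_x`, which is the ALEKS form of the same quantity.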

