UIUC PSYC 235 - Studying Relationships between Variables - D1175927



School name University of Illinois at Urbana, Champaign

Course Psyc 235- Intro to Statistics

Pages 32


Psyc 235: Introduction to Statistics

Slide titles: End of Term Business; Questions?; Studying Relationships between Variables; Correlation & Regression: What's the difference?; Difference between correlation & regression; Last Week: Scatter Plots; Consider Data; Correlation; Correlation Coefficient: r; Example for discussion; What do we always do first in a study?; Covariance; Think about it…; Pearson Correlation Coefficient (r); Last week you learned; Looking at the data…; Optimal values of a & b; Standard Error of Estimate; Data Example; Confidence Limits on Y; CI on Y; Hypothesis Testing; Testing r; Testing b; CI on b; ICES Forms

DON'T FORGET TO SIGN IN FOR CREDIT!
http://www.psych.uiuc.edu/~jrfinley/p235/

End of Term Business
• 3rd Graded Assessment next M & W
   Same procedure as the last two
   If you want to sign up for a specific time, please email one of us.
• Extra Credit Final Requirements
   84 hours on ALEKS by 11:59 pm April 30th
   You can check how many in-class hours you have during the assessment
   You must schedule an extra credit final (no walk-ins) by 11:59 May 4th (email is fine)
   Assessment can be scheduled anytime 9-5 on May 8th or May 9th

Questions?
You master      You receive
98% or more     A+
92% – 97%       A
84% – 91%       B
75% – 83%       C
65% – 74%       D
< 65%           F

Studying Relationships between Variables
• When considering relationships,
   An independent variable (X) usually has many quantitative levels
   Want to show that the dependent variable is some function of the independent variable
   Correlation & Regression

Correlation & Regression: What's the difference?
• Usually, 2 observations for each of N subjects
   Consider these 2 examples:
   Running speed in a maze (Y) & number of trials to reach criterion (X)
   Running speed (Y) & the number of food pellets per reinforcement (X)
• What's the difference here?
• In both, Y is a random variable.
• In one, X is fixed; in the other, it is random.

Difference between correlation & regression
• When X is a fixed variable: regression
• When X is a random variable: correlation
• Note: in practice they are very similar, often differentiated by the purpose of the research
• Goal is to predict Y from X: regression
• Goal is to know the degree of relationship: correlation

Last Week: Scatter Plots
[Figure: scatter plot]
X-axis: predictor variable
Y-axis: criterion variable

Consider Data:
[Figure: Infant mortality as a function of # physicians. X-axis: physicians per 10,000 population; Y-axis: adjusted infant mortality.]
• What do you see?
• Unexpected?
• As we continue, think about why we might see this pattern
What is this line?
Regression line: Y predicted on X. Given a value of X, it is our prediction of what Y is likely to be.

Correlation
• Correlation is a measure of the degree to which the data cluster around that line. If all the data fall on the line, |r| = 1 (r = +1 when the line slopes upward, r = -1 when it slopes downward).
[Figure: Infant mortality as a function of # physicians, as above]

Correlation Coefficient: r
• sign: direction of relationship
• magnitude (number): strength of relationship
• -1 ≤ r ≤ 1
• r = 0 is no linear relationship
• r = -1 is "perfect" negative correlation
• r = 1 is "perfect" positive correlation
• Notes:
   Symmetric measure (you can exchange X and Y and get the same value)
   Measures linear relationship only

Example for discussion:
• Howell (1988) investigated the relationship between stress and health.
   Measures:
   Stress: a scale to measure frequency, perceived importance, and desirability of life events
   Health: Hopkins Symptom Checklist for 57 psychological symptoms

What do we always do first in a study?
• Look at the raw data!
What do you notice?
   unimodal
   positively skewed
   few outliers
   good variability

Covariance
• The correlation coefficient is based on covariance
• Covariance: a number that reflects the degree to which two variables vary together
• $\mathrm{cov}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$

Think about it…
• If there were a positive correlation between these two variables, what would we expect to find?

Participant   Stress (X)   Symptoms (Y)
1             30           99
2             27           94
3             9            80
4             20           70
5             3            100
6             15           109
7             5            62
8             10           81
9             23           74
10            34           121
...           ...          ...

$\sum X = 2297$, $\sum X^2 = 67489$, $\bar{X} = 21.467$, $s_X = 13.096$
$\sum Y = 9705$, $\sum Y^2 = 923787$, $\bar{Y} = 90.701$, $s_Y = 20.266$
$\sum XY = 222576$, $N = 107$

$\mathrm{cov}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$
or
$\mathrm{cov}_{XY} = \frac{\sum XY - \frac{\sum X \sum Y}{N}}{N - 1}$
$\mathrm{cov}_{XY} = 134.301$

Pearson Correlation Coefficient (r)
• The way we were talking about the covariance suggests that it might be a measure of the relationship between 2 variables
• But: covariance is also a function of the standard deviations of X and Y
• To resolve:
   $r = \frac{\mathrm{cov}_{XY}}{s_X s_Y}$
   $r = \frac{134.301}{(13.096)(20.266)} = .506$

Last week you learned:
• Formula: $r = \frac{1}{n - 1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)$
• Alt. formula (ALEKS): $r = \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{(n - 1) s_x s_y}$
• So this is the same formula for r, just in a slightly different form.
• r is the covariance, adjusted by the standard deviations
• Also note: this r is not an unbiased estimate of the correlation coefficient in the population, so there is an adjusted formula for small sample sizes (not common)

Looking at the data…
[Figure: Infant mortality as a function of # physicians, as above]
Equation for a straight line: $\hat{Y} = bX + a$
   $\hat{Y}$ = predicted Y value
   b = slope
   a = intercept
   X = value of the predictor variable
Our task is to solve for a & b to find the best-fitting linear function.
A logical way to do this is in terms of errors of prediction…
So, how far apart are the predicted Ys and the actual Ys?

Optimal values of a & b
• $a = \bar{Y} - b\bar{X}$
• $b = \frac{\mathrm{cov}_{XY}}{s_X^2}$
• In our stress data set:
• b = .7831
• a = 73.891
• $\hat{Y} = .78X + 73.9$
So, we could find a predicted value of Y for any X…
And then find the difference between predicted and actual.
[Figure: estimated regression line, Y against X]
Find $(Y - \hat{Y})^2$
[Figure: estimated regression line, Y against X, showing $y_i$, $\hat{y}_i$, and the error $e_i$]
$e_i = y_i - \hat{y}_i$
Equation of the regression line: $\hat{Y} = a + bX$
$\hat{y}_i$ = "y hat": predicted value of Y for $X_i$

$b_1 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}$

ALEKS:
$b_1$ (slope), using the correlation coefficient: $b_1 = r \frac{s_y}{s_x}$
$b_1$ (slope): $b_1 = \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{(n - 1) s_x^2}$
$b_0$ (Y intercept): $b_0 = \bar{Y} - b_1 \bar{X}$
$\hat{Y} = b_0 + b_1 X$

Standard Error of Estimate
• So if we wanted to predict a value of Y from X, we could just plug it…
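
The covariance, correlation, and regression formulas in these slides can be checked numerically from the summary statistics of the stress example. The sketch below is a minimal illustration in Python, not part of the original slides. The variable names and the `predict` helper are hypothetical. It also assumes ∑XY = 222576, the value that is consistent with the reported covariance of 134.301 and N = 107.

```python
import math

# Summary statistics for the stress (X) / symptoms (Y) example, N = 107.
n = 107
sum_x, sum_x2 = 2297.0, 67489.0
sum_y, sum_y2 = 9705.0, 923787.0
sum_xy = 222576.0  # ASSUMPTION: value consistent with cov_XY = 134.301


def sample_sd(total, total_sq, count):
    """Sample standard deviation from a sum and a sum of squares (N - 1 denominator)."""
    return math.sqrt((total_sq - total ** 2 / count) / (count - 1))


mean_x = sum_x / n
mean_y = sum_y / n
s_x = sample_sd(sum_x, sum_x2, n)
s_y = sample_sd(sum_y, sum_y2, n)

# Computational form of the covariance: (sum XY - sum X * sum Y / N) / (N - 1)
cov_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)

# Pearson r: covariance scaled by both standard deviations
r = cov_xy / (s_x * s_y)

# Least-squares slope and intercept
b = cov_xy / s_x ** 2
a = mean_y - b * mean_x


def predict(x):
    """Predicted symptom score (Y hat) for a given stress score x (hypothetical helper)."""
    return a + b * x


print(f"cov_XY = {cov_xy:.3f}, r = {r:.3f}, b = {b:.4f}, a = {a:.3f}")
```

Under that assumption, the printed values should land close to the figures on the slides (covariance near 134.3, r near .506, slope near .783, intercept near 73.9). Small differences in the last digit come from rounding of intermediate values on the slides.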
