PSY 200: Chapter 8 and 9 Quiz 3
61 Cards in this Set
Front | Back |
---|---|
Correlation
|
Relationship between 2 variables
Linear relationship
|
Correlation Coefficient
|
Measure used to express extent or strength of relationship
Represented by r
|
Positive Correlation
|
0 < r <= 1
Score high on 1 variable and score high on the other
Score low on 1 variable and score low on the other
Positive Slope
1.0= perfect correlation
|
Negative Correlation
|
-1 <= r < 0
Score high on 1 variable and score low on the other
Negative Slope
-1= perfect correlation
|
Zero
|
0= no correlation
No linear relationship
|
Linear Relationship
|
Looking for a linear relationship; others may exist (e.g., U-shaped)
Correlation only measures linear
|
Correlation RULE
|
Correlation does not = Causation!!!
Just means that there was a relationship
|
r < 0.3
|
Small correlation/ weak relationship
|
r 0.3-0.49
|
Medium correlation/ relationship
|
r 0.5 - 1.0
|
Large correlation/ strong relationship
|
Scatter diagram
|
Graphic means to show data points and correlation and (later) regression
|
Centroid
|
(Mean of X, Mean of Y) This will be the central point (X,Y) of 2 variables
|
When to use Pearson r
|
Interval and ratio data
|
Pearson r Z Score method
|
r = sum (ZxZy)/N
Good if you already have Z scores
Answer must be between -1 and 1
|
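A minimal sketch of the Z score method, assuming Z scores are computed with the population standard deviation (dividing by N), which is what makes r = sum(ZxZy)/N work out. The `hours` and `grade` scores are made up for illustration.

```python
from math import sqrt

def z_scores(values):
    """Convert raw scores to Z scores using the population SD (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r_from_z(x, y):
    """Pearson r by the Z score method: r = sum(Zx * Zy) / N."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Hypothetical scores, for illustration only.
hours = [1, 2, 3, 4, 5]
grade = [2, 4, 5, 4, 5]
r = pearson_r_from_z(hours, grade)
print(round(r, 3))  # about 0.775, a large positive correlation
```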
Correlation Coefficient Pearson r Raw score method
|
Will have to find 8 sums
|
Covariance
|
Numerator
Degree to which 2 variables share common variance
Can be a negative number on top but not on bottom
|
High Covariance
|
More linear
r closer to +/- 1
|
Low Covariance
|
Less linear
Closer to 0
|
If r = +/- 1
|
All data fall in a line
|
If |r| < 1
|
Data are scattered
|
3 types of Variation
|
Total= explained (r2) + unexplained (k2)
|
Total Variation
|
Graph with all arrows pointing at the mean line (middle horizontal line)
|
Explained Variation
|
Double arrows pointing from Mean line to regression line in 2 spots on either side of the centroid
|
Unexplained Variation
|
All arrows point toward regression line
Weighs how far away data is from where the regression line is
|
If r = +/- 1
|
All is explained
|
If r= 0
|
All is unexplained
|
r2 Coefficient of Determination
|
The proportion of variance in 1 variable explained by the other
|
k2 Coefficient of non-determination
|
The proportion of variance in 1 variable not explained by the other
|
Total = 1 or 100%
|
1= r2 + k2
k2= 1-r2
|
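A quick sketch of the split into explained and unexplained variation. The function name `variation_split` is hypothetical; the arithmetic is just k2 = 1 - r2.

```python
def variation_split(r):
    """Return (r2, k2): proportions of explained and unexplained variation."""
    r2 = r ** 2       # coefficient of determination
    k2 = 1 - r2       # coefficient of non-determination
    return r2, k2

r2, k2 = variation_split(0.5)
print(r2, k2)  # 0.25 0.75
```

Note that a "large" r of 0.5 still leaves 75% of the variation unexplained.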
Cautions with pearson r
|
Measures linearity, so a low r means not linear; a non-linear relationship could still exist
Distributions need not be normal but must be unimodal and roughly symmetrical (not badly skewed)
If the range is truncated you will usually get a spuriously low r
|
Spearman r
|
Used with ordinal data; symbol rs
Both variables must be rank ordered
|
Non parametric test
|
looks at ranks only
|
Parametric Test
|
Uses actual numbers
|
D
|
rank X - rank Y
|
Sum of D
|
0
|
Tied Scores and Spearman r
|
Tied scores must be taken into account to be fair
Take the mean of the tied ranks and assign that mean rank to each tied score
If more than 2 scores are tied, still use the mean of their ranks (the middle rank when an odd number are tied)
|
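A minimal sketch of ranking with ties, assuming rank 1 goes to the highest score. The helper `mean_ranks` and the scores are hypothetical. It also shows that the D values (rank X - rank Y) sum to 0.

```python
def mean_ranks(scores):
    """Rank from high to low (rank 1 = highest); tied scores share the mean of their ranks."""
    ordered = sorted(scores, reverse=True)
    ranks = {}
    for value in set(ordered):
        first = ordered.index(value) + 1       # first rank position this value occupies
        count = ordered.count(value)           # how many scores are tied at this value
        ranks[value] = first + (count - 1) / 2 # mean of positions first..first+count-1
    return [ranks[s] for s in scores]

# Hypothetical scores; the two 85s are tied for ranks 2 and 3.
x = [90, 85, 85, 70, 60]
y = [3, 5, 4, 2, 1]

rank_x = mean_ranks(x)
rank_y = mean_ranks(y)
d = [rx - ry for rx, ry in zip(rank_x, rank_y)]

print(rank_x)  # [1.0, 2.5, 2.5, 4.0, 5.0]
print(rank_y)  # [3.0, 1.0, 2.0, 4.0, 5.0]
print(sum(d))  # 0.0
```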
Correlation Matrix
|
Table to visualize many correlations
Correlate the most= Number closest to 1 or -1
Correlate the least= Number closest to 0
|
If r is 0 or very low
|
Does not mean no relationship at all; it means there is no linear relationship
|
Share common variance
|
If high relationship = more linear and r closer to +/- 1
If low relationship = less linear and r closer to 0
|
X Rank
|
High to low
|
Regression
|
Allows you to predict one variable from the other
|
Regression Analysis Equation
|
Y= a + by X
X, Y= Variables
by= slope, (m), (tilt)
a= y intercept (b) (where it hits y-axis)
|
r = +/- 1
|
It's easy to predict and draw the line
|
|r| < 1
|
You must draw a "best fit" line
|
Properties of the regression line
|
Squared deviations around the line are minimal
Sum of deviations around the line = 0
New symbols X' and Y' are for predictions
|
To find the regression line equation
|
Use the formula with 3 formulas in it
|
To draw the regression line for Y= a + by X
|
1.) Pick 2 reasonable values for X
2.) Put in the equation and solve for Y
3.) plot the 2 pairs of X,Y points
4.) Connect the dots with a line
|
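The drawing steps above can be sketched as follows. The slope and intercept here use the standard least-squares formulas, which is an assumption about which formulas the course uses. `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (a, by) for the least-squares line Y' = a + by * X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    by = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    # Intercept: chosen so the line passes through the centroid (mean X, mean Y).
    a = mean_y - by * mean_x
    return a, by

# Hypothetical scores, for illustration only.
hours = [1, 2, 3, 4, 5]
grade = [2, 4, 5, 4, 5]
a, by = fit_line(hours, grade)

# Steps 1-3: pick two reasonable X values and solve for Y'.
points = [(x_value, a + by * x_value) for x_value in (1, 5)]
for x_value, y_pred in points:
    print(x_value, round(y_pred, 2))  # plot these two points, then connect them
```

With these numbers the line is roughly Y' = 2.2 + 0.6 X, so the two points to plot are about (1, 2.8) and (5, 5.2).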
X= a +bx Y
|
In regression analysis you can also find X' = a + bx Y, giving 2 regression lines whose angle depends on r
r = +/- 1: the 2 lines coincide
r = 0.75: the lines cross at a narrow angle
r = 0.25: the lines cross at a wide angle
r = 0: the lines are perpendicular
|
r = +/- 1
|
Superimposed
|
r = 0
|
Perpendicular
|
Intersection point
|
Mean of x, Mean of Y = Centroid
|
Standard Error of the Estimate
|
sesty
Estimate of the standard deviation of data around the regression line
k2 was a version of this but not really in terms of standard deviation
|
r = +/- 1 and sesty
|
sesty = 0 means no errors/ deviation
|
r = 0 and sesty
|
sesty is maximal; a lot of deviation
|
Larger sesty means
|
Less accurate predictions
|
Y' and Y true
|
Recall Y' was a prediction not a fact
Using sesty we can find an interval where we are 68% sure the true Y will be (Y' +/- 1 sesty)
|
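A rough sketch of the 68% interval, assuming sesty is the root mean squared deviation of Y around the regression line (dividing by N). Some textbooks divide by N - 2 instead, so check which version the course uses. The 68% figure also assumes the errors are roughly normal. The function name and data are hypothetical.

```python
from math import sqrt

def standard_error_of_estimate(x, y):
    """Fit Y' = a + by * X by least squares and return (a, by, sesty)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    by = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    a = mean_y - by * mean_x
    # Deviations of each actual Y from its predicted Y'.
    residuals = [yi - (a + by * xi) for xi, yi in zip(x, y)]
    sesty = sqrt(sum(e ** 2 for e in residuals) / n)  # assumption: divide by N
    return a, by, sesty

# Hypothetical scores, for illustration only.
hours = [1, 2, 3, 4, 5]
grade = [2, 4, 5, 4, 5]
a, by, sesty = standard_error_of_estimate(hours, grade)

y_pred = a + by * 3                          # Y' for X = 3
interval = (y_pred - sesty, y_pred + sesty)  # Y' +/- 1 sesty
print(round(sesty, 3), [round(v, 2) for v in interval])
```

Here sesty comes out near 0.69, so for X = 3 the interval runs from about 3.31 to 4.69.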
Sesty and Y true
|
are influenced by magnitude of X and Y
|
Variance and sesty
|
Low variance > lower (better) sesty > better estimate of Y true
|
Homoscedasticity
|
Where variance of 1 variable is constant at all levels of the other variable
|
Heteroscedasticity
|
Where variance of 1 variable is not constant at all levels of the other variable
|
Post Hoc Fallacy
|
Assuming a cause and effect relationship from correlational data
|