Chapter 5: Identifying Good Measurement

Types of Measurement

1. Self-report: anything that the subjects themselves give the answer to (GPA, etc.).
Issues:
- People are not always accurate; they may overestimate their abilities.
- People might answer randomly ("Christmas tree" responding).
- They might not be truthful, or they may try to give the answers they deem correct or desired.
- Questions can be biased or loaded (confirmatory hypothesis testing).

2. Behavioral: how a person acts. Example: response time, as in chess playing. It is easy to measure, chess has a good rating system, and researchers can be interested in how chess players can be so good with memory.
Issues:
- You may misinterpret the behavior.
- You may not be able to observe all of the factors that are contributing to performance.
- There are limitations to what you can study.
- Some people act differently when they are observed: faking good or bad (Hawthorne effect).

3. Physiological: heart rate, sweat, brain imaging.
Issues:
- Be very careful to get a baseline level; everybody's body works differently.
- Lie detectors are a physiological measure. You can get false positives and false negatives, because people might be more anxious or have different body responses simply because they are being measured, or they might be able to control certain responses and cheat the system.

Assessing Reliability (getting consistent results)

1. Test-Retest Reliability: is your answer the first time able to predict your answer the second time? In other words, does performance at Time A predict performance at Time B? Correlation coefficients can quantify reliability. A high correlation means the measure is reliable, because one measurement can predict a future measurement.

2. Interrater Reliability: Ex: two people measure the same person's aggression on different days. This measures how much the two observers agree. It should be high; if it is not, then the measure is not reliable. You do not want negative correlations on this or any reliability. Here it would mean the researchers are strongly disagreeing (see slide 10: A is good interrater reliability, B is bad interrater reliability). If you get bad
reliability here, you can define more specifically what the raters are supposed to be measuring, or send out new rater(s).

3. Internal Reliability: do different items on a test correlate with each other?
Cronbach's alpha: a measure of internal reliability. It measures how well every item on a test correlates with every other item on the test. Roughly, it reflects the average inter-item correlation (together with the number of items). Typically between 0 and 1. Realistically, different questions could correlate negatively with each other (people who get one right would get another wrong), but two items that are both negatively correlated with the same item would themselves tend to be positively correlated.
Example of Cronbach's alpha: point-biserial correlation (how every item on a test correlates with all other items on the test).
Why ask so many questions if they are all correlated? So you can establish an average. You take more measurements so you can get closer to the true value: the measurements form a distribution around the true value, and their average gives you the best estimate.

Validity

Is a reliable measure always a good measure? NO. You need something else: VALIDITY. The measure has to be VALID and RELIABLE. Reliability is consistency over time; validity is how well your measure actually measures what it's supposed to measure.

1. Face Validity: the extent to which your test APPEARS to measure what it says it's measuring (superficial validity). For example, answering math questions as an intelligence test might have face validity. A test does NOT NEED to have face validity to be valid, and having face validity does NOT MEAN it is valid overall.

2. Content Validity: the extent to which your measure encompasses everything about a certain construct (the big picture of the construct). If you are measuring depression and you only measure people's feelings, not also their behaviors, then you do not have content validity, because you are measuring one aspect of depression but not all of them.

Scales

Categorical Scales: data is in categories. Also called Nominal Scales. Examples: gender; driving experiment with Cell Phone or No Cell
Phone; ethnicity; level of education (high school vs. college vs. post-grad).

Quantitative Scales: different subtypes of this.

1. Ordinal: a rank order (1st, 2nd, 3rd, etc.). The problem is that the distance between ordinal values is not consistent: maybe 1st and 2nd are miles apart (1st is way better than 2nd), but 2nd is just barely better than 3rd. Example: rankings in athletics.

2. Interval: equal distances between values, but a non-meaningful 0 point, so ratios between values are not meaningful.
Most common example is temperature (Fahrenheit and Celsius). 0 degrees Celsius and 0 degrees Fahrenheit are not "No Temperature" (not a meaningful 0). The distance between 0 and 1 degrees is the same as 1 to 2 degrees. Is 6 degrees three times the temperature of 2 degrees? NO.
IQ tests: 100 is the average, the standard deviation is 15 in either direction, and scores are normally distributed. If you have a 0, it does not mean you have NO intelligence; there is no meaningful 0. If someone scores a 60, they are not ACTUALLY half as smart as someone with a 120.
Therefore, even though the distance between values is the same, there is no meaningful true zero, so values cannot be thought of as ratios of each other.

3. Ratio: has a true and meaningful 0. Same distance between values, and there is a meaningful ratio between values.
Kelvin scale of temperature: 0 = no temperature (ABSOLUTE ZERO).
Weight: 0 grams means NO WEIGHT. Relative terms can be used: the distance between 1 pound and 2 pounds is the same as 2 to 3 pounds, AND 2 pounds can be said to be half as much as 4 pounds.

More Types of Validity

Predictive and Concurrent Validity: similar, often used interchangeably, but technically not the same.

Concurrent Validity: correlating two things at relatively the same time. Ex: your boss gives you an employee exam that asks personality questions, then correlates it at the same time with your past month's sales.

Predictive Validity: using one measure to predict the performance of another measure at a different
date. The extent to which a measure predicts future outcomes. Ex: tests for applying to jobs (those stupid exams you hate taking) are supposed to predict your performance as an employee later on.

On a timeline, at some point predictive validity will become concurrent, and vice versa. (Slide 18: right is good predictive validity, left is poor predictive validity.)

Convergent and Discriminant Validity

Convergent Validity: predicting other measures of the same construct; a test that correlates with other tests of the same
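
Worked sketch (not from the lecture): the reliability section above says correlation coefficients can quantify test-retest reliability and that Cronbach's alpha summarizes how well the items on a test hang together. The Python below is one minimal way to compute both, assuming a Pearson correlation for test-retest and the standard variance-based formula for alpha. The function names and all the scores are made up for illustration.

```python
from statistics import mean, pvariance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(items):
    """Cronbach's alpha from per-item score lists.

    items: one inner list per test item, each holding the scores of the
    same respondents in the same order (hypothetical data layout).
    """
    k = len(items)
    # Sum of the individual item variances.
    item_var_sum = sum(pvariance(scores) for scores in items)
    # Variance of each respondent's total score across all items.
    totals = [sum(person) for person in zip(*items)]
    return (k / (k - 1)) * (1 - item_var_sum / pvariance(totals))


# Made-up test-retest data: the same five people measured at Time A and Time B.
time_a = [10, 12, 15, 18, 20]
time_b = [11, 13, 14, 19, 21]
print("test-retest r:", round(pearson_r(time_a, time_b), 3))

# Made-up item data: three items answered by the same five people.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print("Cronbach's alpha:", round(cronbach_alpha(items), 3))
```

A correlation near 1 between Time A and Time B is what good test-retest reliability looks like. The same pearson_r function could be applied to two raters' scores as a rough check of interrater reliability.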