Test 2 10/02/2013

Reliability of Measurement
Degree to which a measuring instrument is consistent; that is, the same individual obtains similar scores in similar situations.
3 ways to assess reliability (the first thing you look at before you go on and look at something else)
  o Test-retest
      Administer the same test twice to the same individuals; see how much the results on the first test match the results on the second test
      Example: ACT
  o Alternate form
      Administer alternate forms of the test to the same individuals
      Examples: ACT, SAT
  o Split-half
      Half of the items are correlated with the other half, from a single testing session
      Weakest form: look at scores on one half compared to the scores on the other half
  o You have to have reliability in measurement
Factors that affect reliability
  o Within a session
      Participant becomes upset
      Participant becomes ill
      Participant misreads a question
      Participant guesses
      A test can't be accurate if it changes with mood or something along those lines
  o Between sessions
      Alternate forms may not be equivalent
      Participant changes
      Things that happen between the tests
      If people drop out or don't come and take the next test, it will affect reliability
Video on Reliability
  o Crow sledding down roof with snow
  o Anthropomorphize: when we give animals or objects human abilities, like emotion
  o Personality questionnaires are so vague they can make anyone believe it is true about them
  o Reliability is like the foundation to your house; you have to have it
  o You should get about the same score on a test about every time you take it
  o Split-half: you should score about the same on even-numbered questions as you would on the odd

Validity of Measurement
Does the defined measurement process actually measure the intended concept? Is the test measuring what it says it's measuring?
Several ways to examine validity
  o Content validity
      Performance assessment based on information or skills to which participants were previously exposed (something we talked about in class)
      E.g., test questions based on information actually discussed in class
      Making questions based off the areas on the study guide and no other areas
  o Face validity
      Performance assessment appears to be based on the general nature of the information or skills to which participants were previously exposed
      E.g., test questions based on the general category of information discussed in class
      Were the questions about what you were expecting? If the questions are so weird and surprising, that isn't face validity
  o Concurrent validity
      Performance assessment provides scores similar to established assessment instruments
      E.g., scores on a new depression scale are similar to those from the Beck Depression Inventory
  o Predictive validity (criterion)
      Performance assessment provides scores that predict behavior in the future
      E.g., scores on the ACT predict measures of success in college
      In general: high ACT score, high GPA; low ACT score, low GPA
      High school GPA is a better predictor of how well someone will do in college than the ACT
  o Construct validity
      Measurement of a construct (concept) provides scores that can reliably identify individual differences and can successfully predict future performance
      Requires converging evidence from several studies
      E.g., intelligence as measured by IQ scores appears to differentiate individuals and appears to predict performance in a variety of arenas
      Give a test and then accurately be able to identify who's extroverted and who's introverted, or a test to accurately define high and low self-esteem personality
Summary
  o Good measurement is essential in behavioral research
  o Good measurement requires specific operational definitions of constructs, identification of a particular scale of measurement, validity, and reliability

Methods of Data Collection: Observation
Activity
  o Switching the remote between hands
The Nature of Observations
  o The importance of visual observations in behavioral research
  o Element of subjectivity
  o Two people can look at the same behavior and see two different observations
Ways of observing
  o Participant vs. nonparticipant observations
  o Scheduling observations
  o Defining the behavior to be observed
      Operational definitions
      Violence on playground: what is your definition of violence?
  o Specific techniques for recording behaviors
      Frequency method
        How often did something happen? How many times did she hit him?
      Duration method
        How long did something happen? Exercise: how long did they run, or how long could they run?
      Interval method
        How many times in a 15 min period? In the library, people walk around on the quiet floor; see if there is a difference between the quiet floor and a normal floor in how many times they get up and take a break
  o Recording more than one response
      You want to make sure your operational definition is good enough so your observations can't be more than one thing
  o We attend to things that fit into a pattern that reinforces things we already think
Reliability of Observations
  o Foundation of our measurement
  o Would the behavior be observed the same way every time?
  o Importance of concurrent and independent observers
      Normally have more than one person observing someone's behavior
  o Can measure inter-observer agreement
      If one person looks at violent behavior on a playground and sees like 8 acts of violent behavior, and someone else observing the same behavior saw 20 acts, there is no inter-observer agreement
Inter-Observer Agreement
  o High inter-observer agreement creates confidence that the behavior is well defined
  o High inter-observer agreement makes it more likely that you will observe an effect of your independent variable
  o More confident in your results if you have high inter-observer agreement, so you can get the same results from different people
  o Steps for maintaining observer reliability
      Establish objective criteria
      Pilot test procedures and assess inter-observer reliability prior to beginning the actual study
      If reliability is low, reassess definitions, criteria, and training
        Lots of training when observing
      If reliability is high, begin study and use periodic checks
      Periodic retraining of observers
      If possible, use blind observers
Measuring the Reliability of Observational Data
  o Percentage agreement among observers
  o The reliability coefficient
      How strongly are these two associated?
Recordings by Equipment
  o Common for physiological measures
  o Often increases precision for behavioral measures
  o Need to calibrate equipment
  o Sometimes they video the behavior and go back over it until they get reliable observations
Public Records
  o Examples
      Census data
      Crime
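
Worked example for Measuring the Reliability of Observational Data: a minimal Python sketch of the two measures listed there, percentage agreement among observers and a reliability coefficient. It assumes the reliability coefficient is computed as a Pearson correlation, which is one common choice but is not stated in the notes. The function names, the interval records, and the session counts are all made up for illustration.

```python
from math import sqrt


def percent_agreement(obs_a, obs_b):
    """Percentage of intervals on which two observers recorded the same thing."""
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("both observers need the same, non-zero number of intervals")
    matches = sum(1 for a, b in zip(obs_a, obs_b) if a == b)
    return 100.0 * matches / len(obs_a)


def pearson_r(x, y):
    """Pearson correlation, used here as an assumed form of the reliability coefficient."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return cov / sqrt(var_x * var_y)


# Hypothetical interval records: 1 = behavior seen in that interval, 0 = not seen
observer_1 = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]
observer_2 = [1, 0, 1, 1, 1, 0, 0, 0, 0, 1]
agreement = percent_agreement(observer_1, observer_2)

# Hypothetical frequency counts from the same two observers over five sessions
counts_1 = [8, 5, 12, 3, 9]
counts_2 = [7, 6, 11, 3, 10]
reliability = pearson_r(counts_1, counts_2)

print(f"Percentage agreement: {agreement:.1f}%")
print(f"Reliability coefficient (Pearson r): {reliability:.2f}")
```

Percentage agreement needs interval-by-interval records from both observers, not just totals. The same correlation could also be used for split-half reliability, by correlating each person's score on the odd-numbered items with their score on the even-numbered items.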