UMass Amherst COMM-DIS 416 - Test Measurement

This preview shows page 1-2 out of 6 pages.

COMMDIS 416 1st Edition Lecture 6

Outline of Last Lecture
I. Static and Dynamic Assessments
II. Disorder vs. Delay vs. Difference
III. SLP's Role in Assessment
IV. 3 Methods of Assessment
V. Performance-Based Assessment
VI. Static Assessment
VII. Dynamic Assessment
VIII. Zone of Proximal Development
IX. Testing the Limits
X. Clinical Interview
XI. Graduated Prompting
XII. Test-Teach-Retest
XIII. Information Gained through Dynamic Assessment
XIV. Static vs. Dynamic Assessments
XV. Additional Impressions

Outline of Current Lecture
I. Psychometric Test Properties
II. Standardization Process
III. Six Recommended Criteria
IV. Bell Curve Theory
V. Types of Scores
VI. In Summary
VII. PPVT-4 Results
VIII. Consider the Following Scores

Current Lecture

I. Psychometric Test Properties
- A process by which a test or measurement is put under scrutiny with respect to its reliability and validity
  o Validity: the extent to which a test measures what it sets out to measure
  o Reliability: whether the same results are achieved when the test is administered to the same individual on different occasions

II. Standardization Process
- Test Development:
  o Administered to a sample group
  o Sample group must represent the population for which the test will be used
  o The bigger the sample group the better
- Sample Group:
  o Should be composed of individuals with specific attributes, such as:
    * Physical attributes: e.g., age, gender, typical development, etc.
    * Non-physical attributes: e.g., I.Q., language exposure
    * These attributes establish homogeneity of group
III. Six Recommended Criteria
- Criterion 1: Sample Size
  o A test should include a sample size of at least 100-200
- Criterion 2: Description of normative sample
  o Geographic region: the broader the geographic sampling the better
  o Mother’s education level
  o Subjects: typical speech and language development
  o Race/ethnicity [African American, white, Hispanic, others (American Indians, Asians, Pacific Islanders)]
- Criterion 3: Validity
  o The validity of a test is how well the test measures what it says it is measuring
  o Content Validity: the test must adequately sample the domain(s) it purports to measure
    * E.g., in the case of the GFTA, the articulation of consonant sounds
    * Completeness of the item sample (23 of 25 SAE consonants); 2 low sounds, low intervention priority
    * Way in which the items assess the content
      o E.g., GFTA: initial, medial, final position?
  o Construct Validity: the degree to which a test measures the trait that it is intended to measure
    * E.g., GFTA: the ability to articulate sounds begins in infancy and continues through early childhood. Most children can articulate all consonants by 8 years of age.
    * An assessment that purports to measure consonant articulation ability should demonstrate age differentiation
  o Concurrent Validity: compare the test to an already established valid test
    * Used to validate scores and test
- Criterion 4: Reliability
  o Test-Retest Reliability: measure of temporal stability
    * A test score should not fluctuate over time, assuming no treatment is given
    * Administer the same test to the same sample on 2 different occasions
    * Time between measures is critical: must wait at least 18 months
  o Parallel Forms Reliability: alternate or equivalent forms
    * Two forms measure the same thing
      o PPVT-4: Form A and Form B
    * Administer both instruments (Form A/B) to the sample of people
    * Correlation Coefficient (CC): value between 0.00 and 1.00
    * Used to assess the consistency of the results of two tests (Form A/B)
    * Preferable for a standardized test to have CC > 0.9
    * How reliable should tests be? Some reliability guidelines:
      o 0.9 = high reliability
      o 0.8 = moderate reliability
      o 0.7 = low reliability
  o Inter-Examiner/Inter-Judge Reliability: the agreement of two independent judges (i.e., two clinicians) on the types of responses performed by a client
    * The client should obtain the same score if tested by a different examiner
    * Factors that influence inter-judge reliability:
      o Training
      o Practice
      o Ambiguous directions
      o Response complexity
      o Live scoring vs. tape analysis
- Criterion 5: Description of Test Procedure
  o Test administration should be described in enough detail that the test giver will be able to duplicate the procedures as reported
- Criterion 6: Description of Tester Qualifications
  o The test manual needs to describe any general or specialized training required for test administrators and test scorers, if any
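
As an illustration of the parallel-forms check under Criterion 4, the sketch below computes a Pearson correlation coefficient between scores on two forms and labels it using the guideline values listed in the notes. It is a minimal sketch, not a procedure from any test manual: the function names, the sample scores, and the choice to read the guidelines as "at least 0.9 / 0.8 / 0.7" cutoffs are all assumptions made for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from each mean
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores on each form must vary")
    return cross / sqrt(ss_x * ss_y)


def reliability_label(r):
    """Label a coefficient using the guideline values from the notes.

    Assumption: each guideline value is treated as a lower cutoff.
    """
    if r >= 0.9:
        return "high"
    if r >= 0.8:
        return "moderate"
    if r >= 0.7:
        return "low"
    return "below the listed guidelines"


# Hypothetical scores for six examinees on Form A and Form B (made-up data)
form_a = [85, 92, 100, 104, 110, 118]
form_b = [88, 90, 101, 103, 112, 115]

if __name__ == "__main__":
    r = pearson_r(form_a, form_b)
    print(f"CC = {r:.2f} ({reliability_label(r)} reliability)")
```

The same calculation would apply to test-retest reliability by passing the scores from the first and second administrations instead of the two forms.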
IV. Bell Curve Theory
- In order for a test score to be interpreted, the clinician must:
  o Compare the results obtained (raw scores) to some reference point
  o Note: raw scores are the original data (i.e., number of correctly answered items)
  o Determine if observed behavior is within normal limits (typical vs. atypical)
- Normal Distribution = Bell Curve Theory:
  o A normal distribution of data means that most of the data are close to the “average” (central tendency), while relatively few examples tend to one extreme or the other
  o The larger the sample, the more “normal” the distribution
  o Useful tool for describing client behavior and how that behavior compares to others
  o Test subjects should belong to the population to which you are comparing them under the normal distribution
- Mean/Central Tendency: highest point in the distribution; the mean or average score (M, X̄)
- Standard Deviation: spread of values from the mean; a unit of measurement away from the mean (SD, StDev, σ)
- Scores within +/- 1 standard deviation of the mean are within normal limits (average range)
- The Empirical Rule:
  o 68% of the “normal” population falls within 1 SD above the mean and 1 SD below the mean. This is the average range.
  o 95% of the data will fall within 2 standard deviations of the mean
  o Almost all (99.7%) of the data will fall within 3 standard deviations of the mean

V. Types of Scores
- Standard Score:
  o Raw score that has been converted into a measurable and comparable score
  o The raw score is converted to a Standard Score using charts and tables that are provided with each test
  o The standard score has a mean of 100 and a standard deviation of 15
  o Standard score: 85-115 is WNL (within normal limits)
- T-Score:
  o T-scores have a mean (or average) score of 50 with a standard deviation of 10. Thus, a T-score between 40 and 60 is WNL
- Scaled Score:
  o Scaled scores have a mean (average) score of 10 with a standard deviation of 3. A scaled score of 7-13 is WNL
- Percentile Ranking:
  o Percentile ranking
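
The three score scales above share one pattern: the WNL range is the scale mean plus or minus 1 SD. The sketch below shows that relationship and checks the Empirical Rule percentages against a normal curve. It is only an illustration under stated assumptions: real raw scores are converted with the tables supplied with each test, not with a formula, and the names `SCALES`, `from_z`, `to_z`, `within_normal_limits`, and `share_within` are hypothetical helpers written for this example. Treating the boundary values (e.g., 85 and 115) as inside the normal range is also an assumption, consistent with the ranges given in the notes.

```python
from statistics import NormalDist

# (mean, standard deviation) for each scale, as given in the notes
SCALES = {
    "standard": (100, 15),
    "t_score": (50, 10),
    "scaled": (10, 3),
}


def from_z(z, scale):
    """Score on the given scale that sits z SDs from the scale mean."""
    mean, sd = SCALES[scale]
    return mean + z * sd


def to_z(score, scale):
    """Number of SDs a score lies above (+) or below (-) the scale mean."""
    mean, sd = SCALES[scale]
    return (score - mean) / sd


def within_normal_limits(score, scale):
    """True when the score lies within 1 SD of the scale mean (inclusive)."""
    return abs(to_z(score, scale)) <= 1.0


def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    unit = NormalDist()  # standard normal: mean 0, SD 1
    return unit.cdf(k) - unit.cdf(-k)


if __name__ == "__main__":
    # WNL range on each scale: the scores at -1 SD and +1 SD
    for name in SCALES:
        print(name, from_z(-1.0, name), from_z(1.0, name))
    # Empirical Rule: share of scores within 1, 2, and 3 SDs
    for k in (1, 2, 3):
        print(k, round(share_within(k) * 100, 1))
```

Expressing every scale as "SDs from the mean" is what makes a standard score of 85, a T-score of 40, and a scaled score of 7 comparable: each sits exactly 1 SD below its mean.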

