7. Testing
Testing: Big Questions • How do teachers construct tests? • How are teacher-made tests like/unlike standardized tests? • What information comes from test results?
7.1 Instructional Objectives
7.2 Teacher-Developed Tests in the Classroom
7.3 Formative Evaluation
7.4 Classroom Grading Approaches
7.5 Criterion-Referenced Testing
7.6 Norm-Referenced Testing
7.7 Interpreting Norm-Referenced Test Scores
7.8 Validity
7.9 Reliability
7.10 Test Bias
7.11 Using Tests Appropriately
7.12 Summary
7.1 Instructional Objectives
Objectives: Checklist for learning • More specific than goals • What students should know or be able to do by end of lesson ➔ descriptive verbs! • Taxonomies provide hierarchies of increasing sophistication • Bloom: Cognitive, affective, psychomotor
Bloom’s taxonomies • Cognitive domain most used • 6 levels: remember, understand (comprehend), apply, analyze, evaluate, create • Objective: “Students will compare and contrast yurts and tipis on 3 key features.” • Note the task, level (analyze), and criteria ➔ basis for a “mastery learning” system
7.2 Teacher-Developed Tests in the Classroom
Classroom assessment Backward planning as a “best practice” 1. Write the objective with a taxonomy-level verb and criteria for mastery 2. Create an assessment/test that fits the objective 3. Plan learning activities that support and prepare students for mastery
Classroom tests • Essay: for comprehension, analysis; needs scoring criteria • Multiple choice, matching: for recognition • True/false, fill-in-the-blank: for recall • Problem-solving: for application/analysis ➔ Consider pros/cons and which kinds of students each format benefits
Performance-based or authentic assessment 1 • Portfolio showing progress • Exhibition, e.g. posters • Demonstration, e.g. slide shows, videos • For individual or group assessment
Authentic assessment 2 Rubric with criteria for scoring (posted for all to see):

Criterion    10 points    5 points
Sources      Over 5       Under 5
Facts        Over 10      Under 10
Format       Correct      Errors
Graphics     Over 5       Under 5
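Posting the rubric also makes scoring mechanical. Below is a minimal sketch of that scoring in code; the thresholds and point values mirror the sample table above, while the dictionary layout and names are illustrative assumptions, not part of the slides.

```python
# A minimal sketch (not from the slides) of scoring a project against the rubric above.

RUBRIC = {
    # criterion: (test for meeting the 10-point standard, full points, partial points)
    "sources":  (lambda n: n > 5,  10, 5),
    "facts":    (lambda n: n > 10, 10, 5),
    "format":   (lambda ok: ok,    10, 5),   # correct format vs. errors
    "graphics": (lambda n: n > 5,  10, 5),
}

def score_project(evidence):
    """Total the points earned on each rubric criterion."""
    total = 0
    for criterion, (meets_standard, full, partial) in RUBRIC.items():
        total += full if meets_standard(evidence[criterion]) else partial
    return total

# 7 sources, 12 facts, correct format, only 4 graphics -> 10 + 10 + 10 + 5 = 35
print(score_project({"sources": 7, "facts": 12, "format": True, "graphics": 4}))
```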
7.3 Formative Evaluation
Formative assessments 1 • Assess/evaluate learning needs before instruction (aka “pretest”) • Determine previous knowledge on topic or skill • Determine readiness for skill or topic
Formative assessments 2 • Check understanding, monitor progress during learning cycle • Spot errors for re-teaching • Give feedback and suggestions • Check readiness for final (summative) assessment (aka “posttest”)
7.4 Classroom Grading Approaches
Assigning grades 1 When a student gets a grade for work, what does he/she think it means? • This is what I am worth • This is how I compare with classmates • This is what the teacher thinks of me • This is how well I learned
Assigning grades 2 • Letter grades: A, B, C, D, F • Absolute: 10 points per letter • Curve (relative): comparative scaling (force a bell curve?) • Descriptive (short or long) • Performance rating (with rubric/criteria) • Mastery checklist (# of attempts not important)
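A small sketch of the “absolute” scale above, assuming the common 90/80/70/60 percentage cutoffs (10 points per letter); the exact cutoffs are an assumption, not stated on the slide.

```python
# Illustrative sketch: absolute letter grading with 10 percentage points per letter.
# The 90/80/70/60 cutoffs are an assumed convention, not taken from the slide.

def absolute_grade(percent):
    if percent >= 90:
        return "A"
    if percent >= 80:
        return "B"
    if percent >= 70:
        return "C"
    if percent >= 60:
        return "D"
    return "F"

print([absolute_grade(p) for p in (95, 84, 73, 61, 48)])  # ['A', 'B', 'C', 'D', 'F']
```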
7.5 Criterion-Referenced Testing
Criterion referencing • Emphasis on mastery of specific skills/objectives • Good for topics that can be broken into small objectives • Good for topics that have hierarchy of skills (e.g. math) • Must master skill A before you can understand and master skill B
Criterion referencing • Set-up: objective and performance criteria to prove mastery for each skill (e.g. 80% correct answers) • No comparisons (and no time constraints?) ➔ move to next level at own pace
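A tiny sketch of the mastery gate this describes, using the slide’s example criterion of 80% correct; the function name and sample numbers are made up for illustration.

```python
# Criterion-referenced mastery check: advance to the next skill only after the
# performance criterion (here the slide's example of 80% correct) is met.

MASTERY_CUTOFF = 0.80

def mastered(correct, total, cutoff=MASTERY_CUTOFF):
    return correct / total >= cutoff

print(mastered(17, 20))  # 85% correct -> True: move on to the next skill
print(mastered(14, 20))  # 70% correct -> False: reteach and retest, at the student's own pace
```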
7.6 Norm-Referenced Testing
Norm referencing • “Standardized” • Comparative with other students • Achievement tests (what has been learned, e.g. state/graduation test) • Aptitude tests (predict future success, e.g. IQ, SAT, GRE)
7.7 Interpreting Norm-Referenced Test Scores
Analyzing test results (1) • Raw scores ➔ derived (comparative) scores • “Normed” with large samples of test-takers • Norming = scores fitted onto a normal distribution (bell curve) • In a normal distribution the mean/average (elsewhere skewed by extremes), median (middle value), and mode (most frequent value) are the same
Analyzing test results (2) Statistical descriptors • Areas of the distribution marked off by standard deviations = distances from the average • Example: IQ tests: 100 = average; about 34% of scores fall within one standard deviation on either side of the average • Z-scores: number of standard deviations above (+) or below (-) the average • Stanines: 5 in the center; 1-4 below, 6-9 above
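The z-score and stanine conversions can be made concrete with a short sketch. The IQ mean of 100 comes from the slide; the standard deviation of 15 is the value conventionally used for IQ tests, and the helper names are illustrative.

```python
# Sketch: raw score -> z-score -> stanine. Mean 100 is from the slide; SD 15 is
# the conventional IQ standard deviation (an assumption here, not on the slide).

def z_score(raw, mean, sd):
    """Number of standard deviations the raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

def stanine(z):
    """'Standard nine' scale: centered on 5, two stanines per standard deviation, clipped to 1-9."""
    return min(9, max(1, round(z * 2 + 5)))

iq = 115
z = z_score(iq, mean=100, sd=15)
print(z, stanine(z))  # 1.0 7  (one SD above average lands in stanine 7)
```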
Analyzing test results (3) More statistical descriptors • Percentile rank = % of test-takers scoring the same or below • Example: 80th percentile = scored as well as or better than 80% of others • Grade-level equivalents • Example: 3.4 = 3rd grade, 4th month
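A hedged sketch of the percentile rank as defined on this slide (percentage of test-takers scoring the same or below); the norm-group scores are invented sample data.

```python
# Percentile rank = percentage of scores in the norm group at or below a given score.

def percentile_rank(score, norm_group):
    at_or_below = sum(1 for s in norm_group if s <= score)
    return 100 * at_or_below / len(norm_group)

norm_group = [62, 70, 74, 74, 78, 81, 85, 88, 90, 95]  # invented norm-group scores
print(percentile_rank(88, norm_group))  # 80.0 -> the 80th percentile
```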
7.8 Validity
How is a test valid? • Validity = accuracy: does the test measure what it is meant to measure? • Content: matches what was taught in the curriculum • Face: format looks appropriate for the purpose • Criterion-related: items match the stated objectives/criteria • Predictive: scores match future performance • Construct: results match other tests of the same construct
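One way to check predictive (criterion-related) validity in practice is to correlate test scores with a later outcome. The sketch below uses invented data and the standard-library statistics.correlation function (Python 3.10+); it is an illustration, not a procedure from the slides.

```python
# Predictive validity sketch: correlate test scores with a later criterion
# (e.g., an aptitude test vs. first-year GPA). All data are invented.
from statistics import correlation  # available in Python 3.10+

test_scores   = [45, 52, 60, 61, 70, 75, 80, 88]
later_outcome = [2.1, 2.4, 2.8, 2.7, 3.1, 3.3, 3.5, 3.8]

r = correlation(test_scores, later_outcome)
print(f"validity coefficient r = {r:.2f}")  # closer to 1.0 = stronger prediction
```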
7.9 Reliability
How is a test reliable? • Reliability = consistency • Test-retest (same test, two occasions) • Alternate/parallel forms (two versions) • Split-half = odd items vs. even items • Kuder-Richardson (e.g., KR-20) = internal consistency from a single administration
• Perfect = 1.0, but .80 is acceptable • 0 = no correlation • Negative value = as one factor goes up, the other goes down
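The split-half and Kuder-Richardson estimates named above can be sketched in a few lines. The 0/1 answer matrix is invented, the split-half value is stepped up with the Spearman-Brown formula, and KR-20 stands in as the Kuder-Richardson example; those details go beyond what the slide states.

```python
# Reliability sketch on an invented 0/1 answer matrix (rows = students,
# columns = items). Uses only the standard library (Python 3.10+ for correlation).
from statistics import correlation, pvariance

answers = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]

def split_half(rows):
    """Correlate odd-item and even-item half scores, then apply Spearman-Brown."""
    odd  = [sum(r[0::2]) for r in rows]
    even = [sum(r[1::2]) for r in rows]
    r = correlation(odd, even)
    return 2 * r / (1 + r)

def kr20(rows):
    """Kuder-Richardson 20: internal consistency from a single administration."""
    k = len(rows[0])
    totals = [sum(r) for r in rows]
    var_total = pvariance(totals)
    pq = 0.0
    for item in zip(*rows):            # one column per item
        p = sum(item) / len(rows)      # proportion answering the item correctly
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_total)

print(round(split_half(answers), 2), round(kr20(answers), 2))
```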
7.10 Test Bias
Can a test be biased? • Yes, if content or format favors one SES, race, culture, gender, or learning style • Shows up in the form/content of test questions or answer choices • Partial solution: test in students’ native language • Not bias: males vary more than females in achievement scores
7.11 Using Tests Appropriately
Testing: Use wisely • Check validity and the standard error of estimate (how far a predicted score may be off) • Check reliability and the standard error of measurement (the confidence interval around a score caused by the degree of unreliability) • Consider how scores and results will be used
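The standard error of measurement mentioned above has a simple formula: SEM = SD x sqrt(1 - reliability). The sketch below applies it with illustrative numbers (SD 15, reliability .90), which are assumptions rather than values from the slide.

```python
# Standard error of measurement: the band of uncertainty around an observed
# score that widens as reliability drops. SD and reliability values are assumed.
import math

def sem(sd, reliability):
    return sd * math.sqrt(1 - reliability)

score, sd, rel = 110, 15, 0.90
band = sem(sd, rel)  # about 4.7 points
print(f"68% confidence band: {score - band:.1f} to {score + band:.1f}")
```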
7.12 Summary
Testing the test • What are you trying to find out, and at what point in learning cycle? • Does a test report skill achievement or compare students? • Does a test measure what it should, consistently and without bias to any learner?