Decisions to be made in developing an adaptive testing system for K–12 education
G. Gage Kingsbury, March 9, 2012 (PowerPoint PPT presentation)


  1. Decisions to be made in developing an adaptive testing system for K–12 education. G. Gage Kingsbury, March 9, 2012

  2. Welcome and Introduction

  3. Presenter: G. Gage Kingsbury, Vice President for the International Association for Computerized Adaptive Testing (IACAT) and Senior Research Fellow at the Northwest Evaluation Association (NWEA)

  4. Decisions to be made in developing an adaptive testing system for K–12 education

  5. The Idea: An adaptive test is a test that adjusts its characteristics based on the performance of a test taker.
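That adjustment can be sketched as a simple up/down rule on the difficulty of the next item (a minimal illustration of the idea only, not NWEA's actual algorithm; the function name and the 10-point step are invented):

```python
def next_difficulty(current: float, correct: bool, step: float = 10.0) -> float:
    """Raise the target difficulty of the next item after a correct
    answer and lower it after an incorrect one (RIT-like scale)."""
    return current + step if correct else current - step

# Two correct answers followed by a miss: 200 -> 210 -> 220 -> 210
difficulty = 200.0
for correct in (True, True, False):
    difficulty = next_difficulty(difficulty, correct)
```

Operational systems select items by IRT information rather than a fixed step, but the adapt-on-response principle is the same.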

  6. Questions and Answers

  7. Computerized Adaptive Testing: [chart of a 20-item test showing the achievement score (RIT scale, 150 to 250) converging on the student's level question by question, with Advanced, Proficient, and Basic cut bands marked]

  8. Pioneers of adaptive testing • Alfred Binet • Frederick Lord • David J. Weiss • Fumiko Samejima • Mark Reckase

  9. First implementers • David Foster • Jim McBride • Tony Zara • Gage Kingsbury

  10. You have chosen to use an adaptive test because … • It can be more efficient than a fixed-form test • It provides good information across a broader spectrum of student performance • It can provide immediate scoring and reporting • It can provide better security than a fixed-form test • It can be designed to measure growth

  11. Since the first implementations • We have seen international growth in the use of CAT for – Educational testing – Medical outcomes assessment – Certification and licensure

  12. Accuracy of adaptive tests • Compared to a fixed-form test • As a function of test length • Depending on termination procedure

  13. Relationship between Spring and Fall Reading Scores: [scatter plot of Fall RIT vs. Spring RIT, both 150 to 250, comparing paper-to-CAT with paper-to-paper testing]

  14. Test Information Functions for Grade 4 Mathematics: [plot of information (.00 to .12) against RIT (165 to 245); students' mean = 211.7, s.d. = 11.11; Proficient cut = 205, Basic cut = 192]
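The information function on this slide can be computed directly from item parameters. A sketch under the Rasch (one-parameter logistic) model, where an item contributes p(1 - p) information at ability theta (the difficulty values below are invented for illustration):

```python
import math

def rasch_test_information(theta: float, difficulties: list[float]) -> float:
    """Test information at ability theta under the Rasch model:
    I(theta) = sum over items of p * (1 - p)."""
    info = 0.0
    for b in difficulties:
        p = 1.0 / (1.0 + math.exp(-(theta - b)))  # probability of a correct answer
        info += p * (1.0 - p)
    return info

pool = [-1.0, -0.5, 0.0, 0.5, 1.0]
info = rasch_test_information(0.0, pool)
sem = 1.0 / math.sqrt(info)  # standard error of measurement at theta = 0
```

Information peaks where item difficulties cluster near theta, which is why a well-built test's function tops out near the score region that matters most.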

  15. Choosing to use an adaptive test requires making a series of decisions in the areas of… • Psychometrics • Interface (including accommodations) • Item designs • Test designs • Test distribution • Item usage • Item and test security • Proctor training • Reporting

  16. Basics of a theoretical CAT • IRT model • Item pool • Select first item • Select next item • Terminate test • Score
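Those pieces can be wired together in a toy simulation. This is a hedged sketch, not a production design: it assumes a Rasch model, picks the unused item nearest the current estimate (where Rasch information peaks), uses a crude shrinking-step update in place of real scoring, and stops after a fixed 10 items.

```python
import math
import random

def rasch_p(theta: float, b: float) -> float:
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def run_cat(pool: list[float], true_theta: float, n_items: int = 10,
            seed: int = 0) -> float:
    """Minimal CAT loop: select, administer (simulated), update, stop."""
    rng = random.Random(seed)
    theta, used = 0.0, set()
    for n in range(1, n_items + 1):
        # Select the unused item whose difficulty is closest to theta.
        idx, b = min(((i, d) for i, d in enumerate(pool) if i not in used),
                     key=lambda item: abs(item[1] - theta))
        used.add(idx)
        correct = rng.random() < rasch_p(true_theta, b)  # simulated response
        theta += (1.0 / n) if correct else -(1.0 / n)    # crude step update
    return theta

pool = [i / 10.0 for i in range(-30, 31)]  # difficulties from -3.0 to 3.0
estimate = run_cat(pool, true_theta=1.0)
```

Operational systems replace the step update with maximum-likelihood or Bayesian scoring and layer blueprint constraints on selection, but the select/respond/update/terminate skeleton is the same.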

  17. Decision areas for an operational CAT for measuring student achievement • Before the test (Test stuff) – How will we develop the measurement scale? – What mix of item styles will we need? – Which IRT model is appropriate? – What depth do we need in our item bank? – How will we choose an operational item pool? – What will our test blueprint include? – How will we QA everything involved?

  18. Questions and Answers

  19. Decision areas for an operational CAT for measuring student achievement • Before the test (School stuff) – School, teacher, and student identification – Establishing a testing environment – Teacher training – Software/hardware setup – Proctor training – Student familiarization – Student scheduling – QA

  20. Decision areas for an operational CAT for measuring student achievement • Test administration – Student verification process – Test selection – Proctor throughout – Identify previously used items

  21. Decision areas for an operational CAT for measuring student achievement • Test event – Apply test blueprint – Select first item or set of items – Check for effort – Update item-selection theta-hat – Update constraints – Select next item – Terminate test
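The "apply test blueprint" and "update constraints" steps amount to bookkeeping over content areas during item selection. A sketch of one simple greedy rule (the area names and target shares are invented; real blueprints carry many more constraint types):

```python
def next_content_area(blueprint: dict[str, float],
                      counts: dict[str, int]) -> str:
    """Return the content area whose administered share lags its
    blueprint target the most, so the next item is drawn from it."""
    total = sum(counts.values()) or 1  # avoid dividing by zero at the start
    return max(blueprint,
               key=lambda area: blueprint[area] - counts.get(area, 0) / total)

blueprint = {"number": 0.4, "algebra": 0.3, "geometry": 0.3}  # target shares
counts = {"number": 5, "algebra": 2, "geometry": 1}           # items given so far
area = next_content_area(blueprint, counts)  # geometry is furthest behind
```

Constraining selection this way trades a little statistical information for blueprint fidelity, which is the central tension in operational CAT design.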

  22. Decision areas for an operational CAT for measuring student achievement • After the test – Calculate final score – Calculate growth – Terminate test session – Store data – Identify student as completing test – Compare to norms, growth norms, content, etc. – Create individual student report – Add information to teacher/administrator reports

  23. Measuring growth and adaptive testing • Measuring at multiple points in time • The standard deviation of growth • The standard error of growth • Reduction of uncertainty • Growth and instruction
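Because the two test events are scored independently, the standard error of a growth score follows from the fall and spring standard errors by adding in quadrature. A small worked sketch (the scores and SEs below are invented):

```python
import math

def growth_with_se(fall: float, fall_se: float,
                   spring: float, spring_se: float) -> tuple[float, float]:
    """Observed growth and its standard error:
    SE_growth = sqrt(SE_fall^2 + SE_spring^2) for independent errors."""
    return spring - fall, math.sqrt(fall_se ** 2 + spring_se ** 2)

growth, se = growth_with_se(fall=200.0, fall_se=3.0, spring=208.0, spring_se=3.0)
# 8 points of growth with SE = sqrt(18), about 4.24: a difference score is
# always less certain than either single score
```

This is why "reduction of uncertainty" matters: shrinking the SEM of each test event shrinks the growth SE directly.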

  24. Adaptive testing and idiosyncratic knowledge patterns • Can there be multiple thetas without multidimensionality? • Selecting items to reveal knowledge patterns • A simple algorithm • The impact on instruction

  25. Field testing within an adaptive testing system • Calibration differences from paper to CAT • Random sampling for calibration in CAT • Using provisional calibrations in CAT field tests

  26. Cautionary notes • Adaptive testing needs to be well tuned to avoid bad tests. • The item pool must support the stakes. • Adaptive testing changes, but doesn’t eliminate, security issues. – Brain dump sites • Limit desire. No test can do everything. • Adaptive test development is never done.

  27. Have fun • The decisions to be made should consider the good of the students for whom the test is designed. • Don’t try to build the perfect test; it won’t be. • Consider a “dry eye” policy: making kids cry isn’t the purpose of the test.

  28. Thank you. Gage Kingsbury, gagekingsbury@comcast.net
