

  1. Assessing the Reliability of a Problem-Solving Rubric when using Multiple Raters T. Ryan Duckett, Uchenna Asogwa, Matthew W. Liberatore, Gale Mentzer, & Amanda Malefyt timothy.duckett@utoledo.edu http://www.utoledo.edu/engineering/chemical-engineering/liberatore/ June 2019

  2. Conceptual framework: [diagram linking the construct (student knowledge of engineering concepts) to rater-mediated assessment of student ability through the instrument facets: items, domains, rating scales, and rubric fidelity]

  3. Study design: iterative, inter-rater reliability study
     Setting: 2 Midwestern schools, undergraduate MEB (material and energy balances) course
     Phases: data collected, inter-rater reliability rounds, full rating plan
     # Raters: N/A, 4, 5, 20
     Participant N: 113 (39% female), 70
     Problem type: traditional in every phase; innovative problems added in the later rounds

  4. Typical homework problem

  5. Established rating tool: PROCESS
     Problem-solving domains: Problem definition, Represent the problem, Organize information, Calculations, Solution completion, Solution accuracy
     Grigg, S. J. & Benson, L. European Journal of Engineering Education, 2014, 39(6): 617-635.

  6. Detailed PROCESS rubric
     Each problem-solving domain lists the tasks performed and the source of error, and is scored by level of completion: Missing (0 points), Inadequate (1 point), Adequate (2 points), Accurate (3 points).
     Example row, Problem definition (task: identify unknown):
       Missing (0): did not explicitly identify the problem/system
       Inadequate (1): completed few problem/system tasks, with many errors
       Adequate (2): completed most problem/system tasks, with few errors
       Accurate (3): clearly and correctly identified and defined the problem/system
     Grigg, S. J. & Benson, L. European Journal of Engineering Education, 2014, 39(6): 617-635.

  7. Accuracy in assessment: the many-facet Rasch model
     $$\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k$$
     B_n is the ability of student n. D_i is the difficulty of item i. C_j is the severity of judge j. F_k is the extra difficulty overcome in being observed at the level of category k, relative to category k-1.
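  To make the model concrete, the sketch below (Python) turns the adjacent-category logits above into probabilities for each rating category of the 0-3 PROCESS scale. The function name, helper structure, and numeric values are illustrative assumptions, not parameters estimated in this study.

```python
import numpy as np

def category_probabilities(B, D, C, F):
    """Facets-model probabilities of each rating category (0..m) for one
    student x item x rater combination.

    B : student ability (logits)     D : item difficulty (logits)
    C : rater severity (logits)      F : thresholds F_1..F_m, where
    log(P_k / P_(k-1)) = B - D - C - F_k for adjacent categories.
    """
    F = np.asarray(F, dtype=float)
    # Cumulative adjacent-category logits; category 0 is the reference (logit 0).
    logits = np.concatenate(([0.0], np.cumsum(B - D - C - F)))
    expv = np.exp(logits - logits.max())   # subtract max for numerical stability
    return expv / expv.sum()

# Illustrative (made-up) measures: an able student, a mid-difficulty item,
# a slightly severe rater, and thresholds for the 0-3 scale.
print(category_probabilities(B=1.0, D=0.2, C=0.3, F=[-1.5, 0.0, 1.5]))
```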

  8. Rasch fundamentals of measurement: principle of invariance, monotonicity, unidimensionality, local independence
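  Monotonicity, one of the principles listed above, can be illustrated numerically: under the model on the previous slide, a student's expected rating never decreases as ability increases. A minimal check, again with made-up parameter values rather than estimates from the study:

```python
import numpy as np

def expected_rating(B, D=0.0, C=0.0, F=(-1.5, 0.0, 1.5)):
    """Expected 0-3 rating for ability B under the facets model (illustrative)."""
    logits = np.concatenate(([0.0], np.cumsum(B - D - C - np.asarray(F))))
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return float(np.dot(np.arange(p.size), p))

abilities = np.linspace(-4.0, 4.0, 9)
ratings = [expected_rating(b) for b in abilities]
# Higher ability never produces a lower expected rating.
assert all(r2 >= r1 for r1, r2 in zip(ratings, ratings[1:]))
print(np.round(ratings, 2))
```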

  9. Rasch creates a common measure
     [Wright (variable) map placing PROCESS item difficulty, rater severity, student ability, and rating-category difficulty on one 0-100 scale: Solution Accuracy is the most difficult item and Solution Completion the least, Raters 3 and 5 are more severe than Raters 1, 2, and 4, and students A-T span the scale between the (0) and (3) category thresholds]

  10. Measuring rater bias

  11. Raters discuss similar scores
      Scores for Student E (problem definition, represent problem, organize knowledge, calculate, solution completion, solution accuracy):
      Rater 1: 3, 3, 3, 3, 3, 1
      Rater 2: 3, 2, 3, 3, 3, 1
      Rater 3: 3, 3, 3, 3, 3, 3
      Rater 4: 3, 3, 3, 3, 3, 1
      Rater 5: 3, 3, 3, 2, 2, 1

  12. Raters identify differences
      Scores for Student M (problem definition, represent problem, organize knowledge, calculate, solution completion, solution accuracy):
      Rater 1: 3, 3, 3, 2, 3, 1
      Rater 2: 3, 2, 1, 1, 2, 1
      Rater 3: 3, 3, 3, 3, 2, 3
      Rater 4: 2, 3, 3, 2, 1, 1
      Rater 5: 3, 3, 1, 3, 2, 1
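  The two score tables can be summarized numerically. The sketch below (Python; the `exact_agreement` helper is illustrative, not a statistic reported in the deck) computes pairwise exact agreement and per-rater mean scores for Students E and M, showing near-unanimity for E and a much wider spread for M.

```python
import numpy as np

# Rows = Raters 1-5; columns = problem definition, represent problem,
# organize knowledge, calculate, solution completion, solution accuracy
# (scores copied from slides 11 and 12).
student_E = np.array([[3, 3, 3, 3, 3, 1],
                      [3, 2, 3, 3, 3, 1],
                      [3, 3, 3, 3, 3, 3],
                      [3, 3, 3, 3, 3, 1],
                      [3, 3, 3, 2, 2, 1]])
student_M = np.array([[3, 3, 3, 2, 3, 1],
                      [3, 2, 1, 1, 2, 1],
                      [3, 3, 3, 3, 2, 3],
                      [2, 3, 3, 2, 1, 1],
                      [3, 3, 1, 3, 2, 1]])

def exact_agreement(scores):
    """Fraction of rater pairs assigning identical scores, averaged over domains."""
    n_raters, n_domains = scores.shape
    n_pairs = n_raters * (n_raters - 1) / 2
    agree = sum(scores[i, d] == scores[j, d]
                for d in range(n_domains)
                for i in range(n_raters) for j in range(i + 1, n_raters))
    return agree / (n_pairs * n_domains)

for name, scores in [("Student E", student_E), ("Student M", student_M)]:
    print(name, "exact agreement:", round(exact_agreement(scores), 2),
          "| rater means:", scores.mean(axis=1).round(2))
```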

  13. Improving rater agreement
      [figure: level of agreement, ranging from weak to moderate]
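  The deck does not say which agreement statistic underlies the weak and moderate labels; one common choice for ordinal rubric scores is quadratic-weighted Cohen's kappa, shown here as an illustrative sketch using two of the Student M rating vectors from slide 12 (scikit-learn is an assumed dependency, not one named in the presentation).

```python
from sklearn.metrics import cohen_kappa_score

# Student M ratings by Rater 2 and Rater 3 (slide 12), across the six PROCESS
# domains: problem definition, represent, organize, calculate, completion, accuracy.
rater2 = [3, 2, 1, 1, 2, 1]
rater3 = [3, 3, 3, 3, 2, 3]

# Quadratic weights penalize large disagreements (1 vs 3) more than adjacent ones.
kappa = cohen_kappa_score(rater2, rater3, weights="quadratic")
print(f"Quadratic-weighted kappa, Rater 2 vs Rater 3: {kappa:.2f}")
```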

  14. Conclusions: iterative reliability evaluation; accuracy of assessment; identification of the sources of measurement error; greater adherence to measurement principles

  15. Thank you and… Katherine Roach, Caleb Sims, Lindsey Stevens, and countless TAs. NSF award DUE-1712186. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. University of Toledo IRB protocol 202214.
