
The Role of Item Models in Automatic Item Generation, Mark J. Gierl



  1. The Role of Item Models in Automatic Item Generation Mark J. Gierl Hollis Lai Centre for Research in Applied Measurement and Evaluation University of Alberta CCSSO Symposium—Orlando, FL June 22, 2011

  2. CHANGING TIMES • Developments in cognitive science, mathematical statistics, the learning sciences, computer technology, and educational psychology are creating profound changes in educational measurement • Assessment engineering (AE; Luecht, 2006a, 2006b, 2011) is an innovative approach to measurement where engineering-like principles are used to direct the design and analysis of assessments as well as the scoring and reporting of the results • “Our vision of a 21st-century testing program capitalizes on modern technology and takes advantage of recent innovations in testing. Using an analogy from engineering, we envision a modern testing program as an integrated system of systems.” (Drasgow, Luecht, & Bennett, 2006)

  3. CHANGING TIMES • Developing a test using AE requires three explicit steps: STEP #1: An assessment begins with a specific, empirically derived cognitive model of task performance; STEP #2: Item models are then created to produce replicable assessment tasks; STEP #3: Psychometric methods are applied to the examinee response data, typically in a confirmatory mode, to produce scores that are both replicable and interpretable

  4. CONVENTIONAL ITEM DEVELOPMENT

  5. AUTOMATIC ITEM GENERATION • By way of contrast, the idea of automatic item generation is seen as a dream come true by many testing agencies, given that large item banks are required for continuous testing and that item development with humans is both time consuming and expensive • The first requirement is that an item class can be described sufficiently for a computer to create instances of that class automatically; the purpose of our study is to describe how item models can be used to specify the item class • The second requirement is that the determinants of item difficulty be understood well enough so that each of the generated instances need not be calibrated individually

  6. AUTOMATIC ITEM GENERATION • STRONG THEORY: The goal of automatic item generation from strong theory is to generate calibrated items automatically from design principles using a theory of difficulty based on a cognitive model • The theory needs to describe the cognitive mechanism required to solve the items and the features of items that cause difficulty levels to vary • Item generation from strong theory, at least right now, is best suited to specific domains where cognitive analysis is more feasible and where well-developed theories are more likely to exist

  7. AUTOMATIC ITEM GENERATION • WEAK THEORY: The goal of automatic item generation from weak theory is to generate calibrated items automatically from design guidelines using a theory of invariance • Often, the starting point is to use a parent item whose psychometric characteristics are known; then, through experience, intuition, theory, and luck, create an item model by identifying characteristics of the parent item that affect item difficulty; finally, vary those characteristics that affect difficulty to generate new items • Weak theory has resulted in many operational examples of item generation; however, because the determinants of difficulty are not well understood, fewer item characteristics can be varied simultaneously and items, as a result, may be more visibly similar than those generated by strong theory

  8. ITEM MODELS • BASIC MATH ITEM AND ITEM MODEL • Ann has paid $1525 for planting her lawn. The cost of lawn is $45/m². Given the shape of her lawn is square, what is the side length of Ann’s lawn? A. 5.8 B. 6.8 C. 4.8 D. 7.3

  9. ITEM MODELS
     STEM: Ann has paid $I1 for planting her lawn. The cost of lawn is $I2/m². Given the shape of her lawn is S1, what is the S2 of Ann’s lawn?
     ELEMENTS:
     I1 Value Range: 1525 to 1675 by 75
     I2 Value Range: 45 or 30
     (Manipulating the integers can increase or decrease the range of generated items)
     S1 Range: “square” or “round”
     S2 Range: “side length” or “radius”
     (Any geometric concept could be added for our string variables)
     OPTIONS:
     S1=“square”, S2=“side length”: A=sqrt(I1/I2), B=A+1, C=A-1, D=A+1.5
     S1=“round”, S2=“radius”: A=sqrt(I1/(I2*3.14)), B=A+1, C=A-1, D=A+1.5
     KEY: A
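The item model above can be expanded mechanically. The following is a minimal Python sketch, not the authors' software. It assumes that S1 and S2 vary together (square with side length, round with radius), that the keyed option is sqrt(I1/I2) for the square lawn and sqrt(I1/(I2*3.14)) for the round lawn (consistent with the parent item's key of 5.8), and that options are rounded to one decimal place. The names `generate_items` and `keyed_answer` are hypothetical.

```python
import math
from itertools import product

# Element ranges copied from the item model on the slide.
I1_VALUES = range(1525, 1676, 75)   # 1525, 1600, 1675
I2_VALUES = (45, 30)
# Assumption: S1 and S2 vary together rather than independently.
SHAPES = (("square", "side length"), ("round", "radius"))

STEM = ("Ann has paid ${i1} for planting her lawn. The cost of lawn is "
        "${i2}/m². Given the shape of her lawn is {s1}, what is the {s2} "
        "of Ann's lawn?")


def keyed_answer(i1, i2, s1):
    """Return the unrounded key (assumed formulas, see lead-in)."""
    area = i1 / i2                      # lawn area in square metres
    if s1 == "square":
        return math.sqrt(area)          # side length of a square
    return math.sqrt(area / 3.14)       # radius of a circle, pi taken as 3.14


def generate_items():
    """Expand the item model into a list of concrete items."""
    items = []
    for i1, i2, (s1, s2) in product(I1_VALUES, I2_VALUES, SHAPES):
        a = round(keyed_answer(i1, i2, s1), 1)   # one decimal is an assumption
        options = {
            "A": a,
            "B": round(a + 1, 1),
            "C": round(a - 1, 1),
            "D": round(a + 1.5, 1),
        }
        items.append({
            "stem": STEM.format(i1=i1, i2=i2, s1=s1, s2=s2),
            "options": options,
            "key": "A",                 # the model always keys option A
        })
    return items


if __name__ == "__main__":
    generated = generate_items()
    print(len(generated))
    print(generated[0]["stem"])
    print(generated[0]["options"])
```

With three I1 values, two I2 values, and two shape pairings, the sketch yields 12 items, and the first one reproduces the parent item with its options. An operational generator would presumably also shuffle option positions instead of always keying A.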

  10. ITEM MODELS Model-based item development has many practical advantages: • More strategic test construction where the purpose of development is, first, to create item models, and then to generate content for the models to populate an item bank • Test assembly becomes model based, meaning that tests are composed of instances from the item bank • The logic behind model-based item development can lead to more efficient test construction (i.e., a larger number of items and fewer discarded items after field testing) because it treats items as classes rather than as isolated entities that are individually authored, reviewed, and formatted

  11. ITEM MODELS • To create item models systematically and strategically, an item model taxonomy is required (Gierl, Zhou, & Alves, 2008) • This type of taxonomy is a prerequisite for automatic item generation because it provides the guiding principles necessary for designing a large number of diverse item models by outlining their structure, function, similarities, differences, and limitations (i.e., a taxonomy helps us avoid creating item models whose generated items all look the same) • A taxonomy for item model development must manipulate three variables: the stem, options, and auxiliary information

  12. ITEM MODELS • The stem is the section of the model used to formulate context, content, and/or questions • Independent indicates that the n_i element(s) (n_i ≥ 1) in the stem are independent of or unrelated to one another (that is, a change in one element will have no effect on the other stem elements) • Dependent indicates that the n_d element(s) (n_d ≥ 2) in the stem are dependent on or directly related to one another • Mixed includes both independent (n_i ≥ 1) and dependent (n_d ≥ 1) elements in the stem • Fixed represents a constant stem format with no variation or change
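One way to picture the difference between the stem types is how element values get combined. The sketch below is a hypothetical illustration, not part of the taxonomy itself: it assumes independent elements are crossed freely (a Cartesian product) while dependent elements are only allowed to take listed joint values. The function names are invented, and treating S1 and S2 from the lawn item model as a dependent pair is an assumption.

```python
from itertools import product


def expand_independent(elements):
    """Independent elements: every value of one combines with every value of the others."""
    names = list(elements)
    return [dict(zip(names, combo)) for combo in product(*elements.values())]


def expand_dependent(names, linked_rows):
    """Dependent elements: only the listed joint values are allowed."""
    return [dict(zip(names, row)) for row in linked_rows]


def expand_mixed(independent, dep_names, dep_rows):
    """Mixed stem: cross the independent combinations with the dependent rows."""
    return [
        {**ind, **dep}
        for ind in expand_independent(independent)
        for dep in expand_dependent(dep_names, dep_rows)
    ]


if __name__ == "__main__":
    integers = {"I1": [1525, 1600, 1675], "I2": [45, 30]}
    shape_rows = [("square", "side length"), ("round", "radius")]
    for assignment in expand_mixed(integers, ("S1", "S2"), shape_rows):
        print(assignment)
```

Under these assumptions a change to I1 never restricts I2, whereas choosing "round" for S1 forces "radius" for S2, so a pairing such as a square lawn with a radius is never produced.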

  13. ITEM MODELS • The options contain the alternatives for the item model • Randomly selected options refers to the manner in which the distractors are selected from their corresponding content pools (the distractors are selected randomly) • Constrained options means that the keyed option and the distractors are generated according to specific constraints, such as formulas, calculations, and/or context • Fixed options occur when both the keyed option and distractors are invariant or unchanged in the item model • Auxiliary information includes any additional material, in either the stem or option, required to generate an item, including texts, images, tables, and/or diagrams

  14. ITEM MODELS • By crossing the 4 stem and 3 options categories, a matrix of 12 item model types can be produced • 10 functional combinations can be created from the matrix of 12 (the two remaining combinations are not applicable)
      Table 1. Plausible Stem-by-Option Combinations in the Item Model Taxonomy
      Options              Stem
                           Independent   Dependent   Mixed   Fixed
      Randomly Selected    √             √           √       √
      Constrained          N/A           √           √       √
      Fixed                N/A           √           √       √

  15. ITEM MODELS • We have also developed software for item generation, IGOR (Item GeneratOR), which is now operational • IGOR was programmed using Sun Microsystems’ Java SE 6 and is available either as a desktop program or as a web-based application

  16. IGOR GRADE 3 Stem: Independent; Options: Constrained; Auxiliary Information: None I have 13 tens, 2 hundreds, and 21 ones. What number am I? A. 351 B. 324 C. 234 D. 213

  17. IGOR
      STEM: I have I1 tens, I2 hundreds, and I3 ones. What number am I?
      ELEMENTS:
      I1 Value range: 11 to 19 by 1
      I2 Value range: 1 to 9 by 1
      I3 Value range: 11 to 49 by 1
      OPTIONS:
      A. I1*10+I2*100+I3
      B. I2*100+I3
      C. I1+I2*100+I3
      D. I1*10+I2*100
      KEY: A
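To get a feel for how many items a single model like this can yield, the element ranges and option formulas above can be enumerated directly. This is a rough Python sketch under the stated ranges only. It is not IGOR, which the slides describe as a Java program, and the function name `generate_place_value_items` is hypothetical.

```python
from itertools import product

STEM = "I have {i1} tens, {i2} hundreds, and {i3} ones. What number am I?"

I1_RANGE = range(11, 20)   # 11 to 19 by 1
I2_RANGE = range(1, 10)    # 1 to 9 by 1
I3_RANGE = range(11, 50)   # 11 to 49 by 1


def generate_place_value_items():
    """Enumerate every combination of I1, I2, and I3 from the item model."""
    items = []
    for i1, i2, i3 in product(I1_RANGE, I2_RANGE, I3_RANGE):
        options = {
            "A": i1 * 10 + i2 * 100 + i3,   # keyed option
            "B": i2 * 100 + i3,             # distractor: tens left out
            "C": i1 + i2 * 100 + i3,        # distractor: tens counted as ones
            "D": i1 * 10 + i2 * 100,        # distractor: ones left out
        }
        items.append({
            "stem": STEM.format(i1=i1, i2=i2, i3=i3),
            "options": options,
            "key": "A",
        })
    return items


if __name__ == "__main__":
    generated = generate_place_value_items()
    print(len(generated))
    print(generated[0])
```

Crossing the three ranges gives 9 × 9 × 39 = 3,159 distinct stems. Within these ranges the four option formulas never produce the same value for one item, so no generated item has a duplicate option or a distractor equal to the key.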

  18. IGOR

  19. IGOR • When IGOR was used with 10 item models in math, which represented each cell in our taxonomy, 331,371 unique items were generated • We have also applied IGOR to 31 different item models using Grade 3, 6, and 9 content from Mathematics, Social Studies, Science, and Language Arts to generate tens of thousands of test items • We have applied IGOR to 6 different item models in the College Board’s AP Biology program, producing 2,263 unique items • Finally, we have archived our work over the past 4 years and, in the process, created an item model bank which currently contains 182 different item models across an array of content areas, grade levels, and testing programs (e.g., achievement and licensure testing)

  20. CONCLUSION Model-based item development has many practical advantages: • More strategic test construction where the purpose of development is, first, to create item models, and then to generate items • Test assembly becomes model based, meaning that tests are composed of generated items from the bank • The logic behind model-based item development can lead to more efficient test construction because it treats items as classes rather than as isolated entities that are individually authored, reviewed, and formatted; as a result, you have the potential to get more item development *bang* for your valuable item development dollars

  21. THANK YOU If you have questions or comments, please contact me: Dr. Mark J. Gierl (mark.gierl@ualberta.ca)
