
Alignment in Validity Evaluation and Education Policy, Ellen Forte (presentation transcript)



  1. Alignment in Validity Evaluation and Education Policy
     Ellen Forte, CEO & Chief Scientist, edCount, LLC
     CCSSO 2018 National Conference on Student Assessment, San Diego, CA

  2. Since 1994, US educational policy has been based on Systemic Reform, which is the foundation of standards-based assessment and accountability.
     [Diagram: Content Standards, Curriculum and Instruction, Assessment, Performance Standards, Evaluation and Accountability]
     • Standards define expectations for student learning
     • Curricula and assessments are interpretations of the standards
     • Evaluation and accountability rely on the meaning of scores
     • Without clear alignment among standards, curricula, and assessment, the model falls apart

  3. Webb Alignment Criteria, history
     (Webb, 1997, p. 4) Alignment is “the degree to which expectations and assessments are in agreement and serve in conjunction with one another to guide the system toward students learning what they are expected to know and do.”

  4. Webb Alignment Criteria, history
     Webb (1997) introduced a comprehensive framework for evaluating the alignment of a state’s assessment with its standards or curriculum.
     The original framework included five sets of criteria:
     ◦ Content focus
     ◦ Articulation across grades and ages
     ◦ Equity and fairness
     ◦ Pedagogical implications
     ◦ System applicability
     Webb (1999) used four criteria from the content focus set in an exploratory study. Those four criteria became the de facto definition of alignment that has been driving large-scale educational test design and evaluation in the US ever since.

  5. [Diagram linking Standards, depth of knowledge (DOK), Items, and the set of items that contribute to scores]

  6. Stop rating DOK. Stop analyzing DOK as an independent indicator. The “thinking” and the content should not be separated. Stop.

  7. Shifting our expectations for alignment evaluation…
     Common attributes of current alignment studies:
     ◦ Post hoc
     ◦ Link items directly to content standards
     ◦ Ignore blueprints
     ◦ Ignore achievement/performance standards
     ◦ Ignore principled-design philosophy and components
     ◦ Ignore scores
     ◦ Ignore interpretations and uses of scores
     Alignment evaluation should:
     ◦ Provide formative information
     ◦ Consider all axis points in the path from standards to scores
     ◦ Address scores as they are reported and meant to be interpreted
     ◦ Yield critical information to support a validity argument

  8. An Updated View of Alignment
     (Webb, 1997, p. 4) Alignment is “the degree to which expectations and assessments are in agreement and serve in conjunction with one another to guide the system toward students learning what they are expected to know and do.”
     (Forte, 2017, p. 3) “Alignment is about coherent connections across various aspects within and across a system and relates not simply to an assessment, but to the scores that assessment yields and their interpretations.”

  9. Some Standards Relevant to Alignment
     1.0 – Clear articulation of each intended test score interpretation for a specified use should be set forth, and appropriate validity evidence in support of each intended interpretation should be provided.
     4.0 – “Tests and testing programs should be designed and developed in a way that supports the validity of interpretations of the test scores for their intended uses. Test developers and publishers should document steps taken during the design and development process to provide evidence of fairness, reliability, and validity for intended uses for individuals in the intended examinee population” (p. 85).
     4.12 – “Test developers should document the extent to which the content domain of a test represents the domain defined in the test specifications” (p. 89).
     12.4 – “When a test is used as an indicator of achievement in an instructional domain or with respect to specified content standards, evidence of the extent to which the test samples the range of knowledge and elicits the processes reflected in the target domain should be provided” (p. 196).

  10. [Figure: Forte, 2013]

  11. Six Key Relationships
     1. The relationship between the measurement targets and the state’s academic content standards;
     2. The relationship between the measurement targets and the item specifications and development guidelines;
     3. The relationship between the measurement targets and the assessment blueprints;
     4. The relationship between the measurement targets and the performance level descriptors (PLDs/ALDs);
     5. The relationship between the measurement targets (via task models and item templates) and the assessment items; and
     6. The relationship between the measurement targets and the items that contribute to students’ test scores.

  12. A. System Design
     1. How were the claims and measurement targets established to reflect the full depth and breadth of the standards? Is this method reasonable and sound?
     2. How were the task models and item templates developed to reflect the claims and measurement targets? Is this method reasonable and sound?
     3. How were the blueprints developed to reflect the claims and measurement targets? Is this method reasonable and sound?
     4. How were the PLDs developed to reflect the claims and measurement targets? Is this method reasonable and sound?
     5. How were the items developed to reflect the claims and measurement targets (via task models and item templates)? Is this method reasonable and sound?
     6. How were the forms and scoring rules developed to reflect the claims and measurement targets? Is this system reasonable and sound?

  13. B. System Implementation
     1. How well do claims and measurement targets address the full depth and breadth of the standards?
     2. How well do the task models and item templates reflect the claims and measurement targets?
     3. How well do the blueprints reflect the claims and measurement targets?
     4. How well do the PLDs reflect the claims and measurement targets?
     5. How well do the items reflect the claims and measurement targets?
     6. How well do the sets of items that contribute to students’ scores reflect the claims and measurement targets?
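  To make question 6 concrete, here is a minimal, purely illustrative sketch of one way such a check might be operationalized: comparing the measurement targets tagged on the items that actually contribute to a reported score against the blueprint's expectations. The target codes, item counts, and blueprint below are hypothetical examples, not data or methods from the presentation.

```python
"""Illustrative only: a rough coverage check of the items that contribute to a
score against a blueprint of measurement targets. All data are hypothetical."""
from collections import Counter

# Hypothetical blueprint: measurement target -> number of contributing items expected
blueprint = {"MT.1": 4, "MT.2": 3, "MT.3": 3, "MT.4": 2}

# Hypothetical operational form: the measurement target tagged on each item
# that contributes to the reported score
scored_item_targets = ["MT.1", "MT.1", "MT.1", "MT.2", "MT.2",
                       "MT.3", "MT.3", "MT.3", "MT.3"]

def coverage_report(blueprint, scored_item_targets):
    """Compare the targets represented among the scored items with the blueprint."""
    observed = Counter(scored_item_targets)
    report = {}
    for target, expected in blueprint.items():
        actual = observed.get(target, 0)
        report[target] = {
            "expected_items": expected,
            "scored_items": actual,
            "proportion_of_blueprint": round(actual / expected, 2),
        }
    # Items tagged to targets that do not appear in the blueprint at all
    off_blueprint = sorted(set(observed) - set(blueprint))
    return report, off_blueprint

if __name__ == "__main__":
    report, off_blueprint = coverage_report(blueprint, scored_item_targets)
    for target, stats in report.items():
        print(target, stats)
    if off_blueprint:
        print("Items tagged to targets not in the blueprint:", off_blueprint)
```

  In this toy example the output would show MT.4 with no contributing items and MT.3 exceeding its blueprint count, the kind of formative signal the questions above are meant to surface; a real evaluation would also weigh score points, item types, and reporting categories rather than simple item counts.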
