

  1. Growth in Student Achievement: Issues of Measurement, Longitudinal Analyses & Accountability
     Damian W. Betebenner, NCIEA
     CCSSO NCSA, June 23, 2010

  2. Discussions of student growth lie at the intersection of three topics:
     - Longitudinal Data Analysis/Applied Statistics
     - Accountability/Education Policy/Data Use
     - Measurement/Psychometrics

  3. Measurement/Psychometrics
     Examining student growth requires multiple measurements of the same individual.
     - Growth in what?
     - How much growth? (How is scaling involved in answering this question?)
     - Is it enough growth?

  4. Longitudinal Data Analysis/Applied Statistics
     There are many methods for the analysis of longitudinal data.
     - What are the relevant questions?
     - Are the analytic techniques capable of answering those questions?
     - Do the data possess properties (e.g., a vertical scale) sufficient for the analytic techniques employed?
     - Does the analysis sustain the inferences made from the data?

  5. Accountability/Education Policy/Data Use
     Education policy and accountability have many goals and purposes.
     - Why growth in accountability?
     - What are the goals and purposes of accountability?
     - What is the theory of action behind accountability?
     - How can we judge the validity of the accountability system?
     - What about the current policy context?

  6. Measurement/Psychometric Issues: Technical Considerations

  7. Measurement/Psychometric Issues
     - Growth in what?
     - How much growth?
       - Scales for measuring growth: ordinal (within-year, across-year), interval (within-year, across-year), vertical
       - Growth magnitude versus growth norm
     - Is it enough growth? Norm- versus criterion-referencing (the intersection of Accountability and Measurement)

  8. Growth in what?
     - Beneath any notion of change (i.e., growth) is a construct that is changing over time.
     - Height and weight are common points of reference.
     - Constructs in education are "slippery."
     - At a minimum, an underlying semantic referent (e.g., reading or math) is needed.

  9. How much growth?
     - Are growth magnitudes possible in education?
     - If calculable, are they interpretable absent some norm?
     - Approaches to growth magnitudes: performance standards; a vertical scale with interval properties; learning progressions (qualitative growth)

  10. How much growth? Performance Standards
      Strengths:
      - Anchor reference points for discussions about performance
      - Growth is embedded in the accountability metric
      Limitations:
      - Few levels mask substantial range within levels, thus masking student growth within a level
      - Vary greatly in stringency from state to state, so that "proficient" performance lacks meaning

  11. How much growth? Scale Scores
      Strengths:
      - Semi-continuous scores (many score points)
      - Can be used to create vertical scales across grade levels
      - Give the appearance of the interval scales needed by some analytical models
      Limitations:
      - Difficult to interpret or explain to users
      - Vertical scales are hard to defend
      - Claims of interval measurement properties don't hold up to close scrutiny

  12. How much growth? Vertical Scale
      Vertical and interval scales are required for some analytic techniques (a minimal sketch of both follows below):
      - Gain score calculation (magnitude of growth)
      - Growth curve analysis (rate of growth) (e.g., Willett & Singer, 2003)
      Vertical and interval scales are required for some questions:
      - Matthew effects: Do higher achievers grow faster than lower achievers?
      - Growth rates relative to student age: Do students grow more in later grades than in earlier grades?
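The following minimal Python sketch (not part of the original presentation) illustrates both techniques under the assumption of a vertical, interval scale; the pandas/statsmodels usage, column names, and toy data are assumptions made here for illustration.

```python
# A minimal sketch, not from the presentation, of the two techniques named on
# slide 12. The toy data, column names, and growth parameters are assumptions;
# scores are treated as lying on a vertical, interval scale so that
# differences are meaningful.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for student in range(50):
    start = rng.normal(420, 25)      # hypothetical grade-3 status
    slope = rng.normal(22, 6)        # hypothetical annual growth on the scale
    for grade in (3, 4, 5):
        rows.append({"student": student, "grade": grade,
                     "scale_score": start + slope * (grade - 3) + rng.normal(0, 8)})
scores = pd.DataFrame(rows)

# Gain scores: the magnitude of growth between adjacent grades.
scores["gain"] = scores.groupby("student")["scale_score"].diff()

# Growth-curve analysis: a random-intercept, random-slope model in which the
# fixed effect for grade estimates the average rate of growth.
fit = smf.mixedlm("scale_score ~ grade", data=scores,
                  groups="student", re_formula="~grade").fit()
print(fit.summary())
```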

  13. How much growth? Vertical Scale
      Vertical and/or interval scales are NOT required for some analytic techniques:
      - Value-added analyses: most require an interval, but not a vertical, scale. See Ballou (2008), Briggs & Betebenner (2009).
      - Auto-regressive analyses and growth norms (a simplified growth-norm sketch follows below)
      Vertical and/or interval scales are NOT required for some questions:
      - Is a student's progress (ab)normal?
      - Is a student's growth sufficient to put them on track to reach/maintain proficiency?
      - See Yen (2007) for an excellent list of questions
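To make the growth-norm point concrete, here is a simplified, hypothetical Python sketch that ranks a student's current score among peers with similar prior scores, which requires neither a vertical nor a strictly interval scale. It is only an illustration of the normative idea, not the presenter's Student Growth Percentile methodology (which uses quantile regression); all data and column names are invented.

```python
# A simplified, hypothetical growth norm: rank each student's current score
# among peers with similar prior scores. Illustration only; not the SGP
# methodology, and all data are invented.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 2000
prior = rng.normal(500, 40, n)
current = 0.8 * prior + 120 + rng.normal(0, 25, n)   # no vertical scale needed
df = pd.DataFrame({"prior": prior, "current": current})

# Group students into prior-achievement bins, then express each student's
# current score as a percentile rank within their bin.
df["prior_bin"] = pd.qcut(df["prior"], q=20, labels=False)
df["growth_percentile"] = (df.groupby("prior_bin")["current"]
                             .rank(pct=True) * 100).round()
print(df.head())
```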

  14. How much growth? Magnitudes versus Norms
      Two growth quantities:
      - Magnitude of growth
      - Relative amount of growth
      Physical growth:
      - A 9-year-old boy grew 5 inches in the past year
      - The average increase in height for boys between years 8 and 9 is 4 inches
      Achievement growth:
      - A 4th grader grew 25 scale score points since 3rd grade
      - The average 4th grade scale score is 21 points higher than the average 3rd grade score
      How much growth? People expect an answer of magnitude, but they need that magnitude embedded within a norm.

  15. How much growth? Growth Norms
      Although normative comparisons are spurned by criterion-referenced and standards-based measurement advocates, norms can provide a useful interpretive framework, especially in the interpretation of student growth.
      "Scratch a criterion and you find a norm." (W. H. Angoff, 1974)

  16. Longitudinal Data Analysis Issues: Technical Considerations

  17. Many Questions
      - How much annual growth did this (these) student(s) make in reading?
      - Is (are) this (these) student(s) making sufficient growth to reach/maintain desired achievement targets? (Growth-to-standard & the Growth Model Pilot Program)
      - Are students in particular subgroups (e.g., minority students) making as much progress as other students?
      - How much did this teacher/school contribute to students' growth over the last year? (Value-added)
      - Again, see Yen (2007) for an excellent list of questions

  18. Many Techniques
      Numerous data analysis techniques are available for longitudinal data:
      - Gain scores (a suitable scale is required)
      - Cross-tabulation based upon prior and current categorical achievement level attainment (e.g., value tables, transition matrices); a sketch follows below
      - Regression-based approaches: growth-curve analysis (HLM), fixed/mixed-effects models, growth norms
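As an illustration of the cross-tabulation technique, here is a minimal pandas sketch (not from the presentation) that builds a prior-by-current transition matrix; the achievement-level labels and toy records are invented.

```python
# A minimal sketch of a transition matrix of prior vs. current achievement
# levels. The level labels and records are assumptions made for illustration.
import pandas as pd

levels = ["Below Basic", "Basic", "Proficient", "Advanced"]
df = pd.DataFrame({
    "prior_level":   ["Basic", "Basic", "Proficient", "Below Basic",
                      "Proficient", "Advanced", "Basic", "Below Basic"],
    "current_level": ["Proficient", "Basic", "Proficient", "Basic",
                      "Advanced", "Advanced", "Proficient", "Below Basic"],
})
for col in ("prior_level", "current_level"):
    df[col] = pd.Categorical(df[col], categories=levels, ordered=True)

# Rows are last year's level, columns are this year's level; cell counts show
# how many students made each transition (the basis of a value table).
print(pd.crosstab(df["prior_level"], df["current_level"]))
```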

  19. Questions 1st, Analyses 2nd
      - Different growth analysis techniques often address different questions.
      - Different questions lead to different conversations, which lead to different uses and outcomes.
      "It is better to have an approximate answer to the right question than a precise answer to the wrong question." (J. W. Tukey)

  20. Model Purpose
      Three general uses are associated with statistical models (Berk, 2004):
      - Description: an account of the data. The model is true to the extent that it is useful; model quality is judged by craftsmanship (de Leeuw, 2004).
      - Inference: sample to population. The model is true to the extent that the assumed chance process reflects reality (the super-population fallacy).
      - Causality: A causes B to happen. The model is true to the extent that a plausible causal theory exists and design criteria are met.
      - Models are rarely descriptive despite the minimal requirements.
      - Inference and causality require information external to the data and cannot be validated solely from the data.
      - Models are often causal in nature but rarely meet the rigorous criteria necessary for such inferences.

  21. Value-Added Models: Causality
      - Value-added models (e.g., EVAAS) are a frequently discussed type of growth model
      - Value-added models attempt to quantify the portion of student progress attributable to an educational unit, usually a teacher or school (a simplified sketch follows below)
      - Value-added is about the inferences made, not the actual model
      - Causal attributions make value-added models well suited for accountability discussions
      - In the absence of random assignment, causal attributions are always suspect and subject to challenges (see, for example, Raudenbush, 2004; Rubin, Stuart & Zanutto, 2004)
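For concreteness, the following is a deliberately simplified Python sketch of the covariate-adjustment idea behind such models; it is not EVAAS or any operational value-added model, and the simulated data and model choices are assumptions made here.

```python
# A deliberately simplified sketch of a value-added-style analysis. Not EVAAS
# or any operational VAM; data, effect sizes, and model are assumptions. A
# school random intercept in a model of current score given prior score serves
# as the "effectiveness" quantity (a deviation from the average school).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n_schools, per_school = 30, 60
true_effects = rng.normal(0, 6, n_schools)      # unknown in practice
frames = []
for s in range(n_schools):
    prior = rng.normal(500, 40, per_school)
    current = 0.85 * prior + 90 + true_effects[s] + rng.normal(0, 20, per_school)
    frames.append(pd.DataFrame({"school": s, "prior": prior, "current": current}))
df = pd.concat(frames, ignore_index=True)

fit = smf.mixedlm("current ~ prior", data=df, groups="school").fit()
estimated = pd.Series({school: re.iloc[0] for school, re in fit.random_effects.items()})
print(estimated.sort_values().tail())    # the apparently "most effective" schools
```

Note that the estimated school effects are deviations from the average school, i.e., norm-referenced quantities, which is the point taken up on the next slide.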

  22. Value-Added Models: Causality
      - Value-added models return norm-referenced effectiveness quantities
      - With regard to schools, the quantities indicate whether a school is significantly more or less effective than the mean school effectiveness in the district or state
      - In a standards-based assessment environment, how much effectiveness is enough?
      - This is especially important in light of universal-proficiency policy mandates
      - Growth-to-standard models were created to provide criterion-referenced growth models

  23. Growth Model Pilot Program: Growth-to-Standard
      - In response to requests for growth model use as part of AYP, USED allowed states to apply to use growth models
      - Fifteen states had models accepted
      - Models were required to adhere to the "bright line principle" of universal proficiency (growth-to-standard; a minimal on-track sketch follows below)
      - Yen (2009) provides an excellent overview of the models
      - Growth-to-standard models returned, in general, results that closely aligned with AYP status results
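As an illustration of the growth-to-standard logic (not any state's accepted pilot model), here is a minimal Python sketch that projects a student's recent annual gain forward and compares it with a proficiency cut score; all numbers are hypothetical.

```python
# A minimal sketch of growth-to-standard logic, not any state's accepted pilot
# model: project a student's recent annual gain forward and ask whether it
# reaches a proficiency cut score by a target grade. All numbers are hypothetical.
def on_track(current_score: float, annual_gain: float,
             years_to_target: int, proficiency_cut: float) -> bool:
    """True if a linear projection of current growth reaches the cut score."""
    projected = current_score + annual_gain * years_to_target
    return projected >= proficiency_cut

# Example: a student at 470, gaining 12 points per year, with 3 years until the
# target grade and a cut score of 500 (470 + 36 = 506, so on track).
print(on_track(current_score=470, annual_gain=12,
               years_to_target=3, proficiency_cut=500))
```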

  24. Growth versus Value-Added Models: Description & Causality
      - Growth measures are descriptive
      - Accountability has skewed discussions of growth from description toward responsibility (i.e., causality)
      - All measures (even VAM) are potentially descriptive; however, some measures are specially crafted for causal inference/attribution
      - Good descriptive measures are interpretable, informative, and capable of multiple uses
