

  1. EMPIRICAL ANALYSIS OF THE NYS APPR SYSTEM

  2. Brenda Myers, Superintendent of Schools, Valhalla UFSD

  3. Education Analytics’ Work with the Lower Hudson Council • Analyzed state growth model methods and policy • Acquired data from many districts in the council • Analyzed results for these members • A unique cross-district data collaboration • Allows for a better understanding of how state policy is affecting local decisions • Individual district data alone are not enough for a broad picture

  4. Goals for Today • Present a high-level discussion of data findings • Examine how APPR rating policy may affect measurement • Look at where this all fits in with other states

  5. Andrew Rice, Vice President of Research and Operations, Education Analytics

  6. EA Mission • Founded in 2012 by Dr. Robert Meyer, director of the Value-Added Research Center (VARC) at the University of Wisconsin-Madison • “Conducting research and developing policy and management analytics to support reform and continuous improvement in American education” • Developing and implementing analytic tools for systems of education based on research innovations developed in academia

  7. What Are Our Biases? • Support research- and data-based policy • Scientific perspective on decision making • Respect for (not expertise in) the political process • If the data say: • the emperor has no hat • the emperor has no shoes • the emperor has no robe • We would conclude: • it may be the case that the emperor has no clothes

  8. Who We Work With • Districts • States • Foundations (Walton, Gates, Dell) • Unions (NEA, AFT) • Understanding the data is useful to everyone

  9. Measures vs. Ratings • A Measure • Has technical validity • Can be evaluated by scientific inquiry • SGP, the Charlotte Danielson rubric, survey results, etc. • A Rating • Is a policy judgment • Cannot be evaluated without policy judgment • APPR categories: “Effective”, “Developing”, etc.

  10. Measure to Rating Conversion

  11. HEDI Scales: State Growth, Comparable, Locally Selected • State Growth Model: a 0-20 point scale divided into Ineffective, Developing, Effective, and Highly Effective bands • Comparable Growth & Locally Selected Measures: the same 0-20 point scale with the same four HEDI bands

  12. If HEDI Scales were Consistent • State Growth Model • Comparable Growth & Locally Selected Measures • Observation Rubrics and Practice Measures • Hypothetically Aligned Composite Rating • Each scale shown with aligned Ineffective, Developing, Effective, and Highly Effective bands

  13. The Actual Composite Rating • Hypothetically Aligned Composite Rating: a 0-100 composite scale (shown in bands of 0 to 4, 5 to 9, ... 95 to 100) with evenly spread Ineffective, Developing, Effective, and Highly Effective bands • Actual Composite Rating: the same 0-100 scale, but with the Ineffective, Developing, Effective, and Highly Effective cut points compressed toward the upper end of the scale
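To make the contrast between the hypothetically aligned and the actual composite scales concrete, here is a minimal Python sketch. The cut points are illustrative assumptions (the exact band edges are not legible from the slide graphic), chosen only to show how the same composite score can land in different HEDI categories depending on whether the 0-100 scale is divided evenly or compressed toward the top.

```python
# A minimal sketch (assumed cut points, not NYSED's published values) of how
# the same 0-100 composite score maps to different HEDI categories under an
# evenly spread band layout versus a compressed one.

def hedi_category(composite, cuts):
    """Return the HEDI label for a 0-100 composite, given (lower bound, label) cuts."""
    for lower, label in cuts:
        if composite >= lower:
            return label
    return "Ineffective"

# Evenly spread bands (the "hypothetically aligned" scale on slide 13).
ALIGNED = [(75, "Highly Effective"), (50, "Effective"), (25, "Developing")]

# Bands compressed toward the top of the scale (the "actual" composite scale);
# these particular numbers are assumptions for illustration.
COMPRESSED = [(91, "Highly Effective"), (75, "Effective"), (65, "Developing")]

for score in (60, 80, 92):
    print(score, hedi_category(score, ALIGNED), "vs", hedi_category(score, COMPRESSED))
# e.g. a composite of 80 is Highly Effective on the evenly spread scale but
# only Effective once the bands are compressed.
```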

  14. Impact on Observation and Practice Measures Rating Scale • State Growth Model: Ineffective, Developing, Effective, Highly Effective bands • Comparable Growth & Locally Selected Measures: Ineffective, Developing, Effective, Highly Effective bands • Observation Rubrics and Practice Measures • Actual Composite Rating: compressed Ineffective, Developing, Effective, Highly Effective bands

  15. Alignment of Actual Lower Hudson Scores to Compressed Scale

  16. Observation Rubrics and Practice Measures Scores • Districts are responding to a particular set of rules that requires them to abandon almost all of the rating scale • It would be optimal if they did not have to • Does your district retain the “measures” for decision making? • Report “ratings” as compressed • Effort is not wasted as long as the information retains value

  17. State Growth Model Study Findings

  18. SGP Model • The NY SGP model is rigorous and attempts to deal with many growth-modeling issues • In phase 1 (2011/2012) we were concerned about strong relationships between incoming test performance and SGP • In phase 2 (2012/2013) we note that this has been addressed through the addition of classroom-average characteristics, or “peer effect,” variables
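As a rough illustration of what adding classroom-average ("peer effect") covariates means, here is a minimal sketch of a growth model in that spirit. It is not the NYSED SGP specification; the simulated data, variable names, and the simple OLS-plus-percentile step are all assumptions for illustration.

```python
# Minimal sketch of a growth model with a classroom-average ("peer effect")
# covariate, in the spirit of the phase 2 change described above.
# Not the NYSED SGP specification; data and model form are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 1000

# Simulated student-level records: current score, prior score, classroom id.
df = pd.DataFrame({
    "score":       rng.normal(650, 30, n),
    "prior_score": rng.normal(640, 30, n),
    "classroom":   rng.integers(0, 50, n),
})

# Classroom-average prior achievement: the kind of peer-effect variable
# added in phase 2 (2012/2013).
df["class_mean_prior"] = df.groupby("classroom")["prior_score"].transform("mean")

# Condition current scores on prior achievement and the classroom average.
model = smf.ols("score ~ prior_score + class_mean_prior", data=df).fit()

# Each student's residual growth, expressed as a percentile (an SGP-like number).
df["growth_percentile"] = model.resid.rank(pct=True) * 100

# A classroom-level growth score is then an aggregate of its students' growth.
print(df.groupby("classroom")["growth_percentile"].mean().round(1).head())
```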

  19. Distribution of State Growth Scores

  20. State Growth Model • Distributions are largely spread out over the scale • High-growth and low-growth teachers in almost every district • Peer effects have evened out scores between high- and low-proficiency regions • Class size has some impact, but it is mitigated by the translation from measurement to rating

  21. Comparable Measures Study Findings

  22. Distribution of Comparable Measures Scores

  23. Comparable Measures • Substantial differences between districts in • the way their policies measure effectiveness with Comparable Measures ratings, OR • their teachers’ ability to score highly on these metrics • NYSED policy allows variance in implementation of Comparable Measures • Rating comparability between districts is therefore suspect • It does not appear possible for a teacher to attain every score from 0-20, as required by regulation

  24. Local Measures Study Findings

  25. Distribution of Local Measures Scores

  26. Local Measures • Substantial differences between districts in • the way their policies measure effectiveness with Local Measures ratings, OR • their teachers’ ability to score highly on these metrics • NYSED policy allows variance in implementation of Local Measures • Rating comparability between districts is therefore suspect • It does not appear possible for a teacher to attain every score from 0-20, as required by regulation

  27. Student Outcome Measures: Comparability Across Districts • Flexibility at the local level seems to have produced ratings that are not comparable across districts

  28. State Growth or Comparable Growth • Local Measure • Observation / Practice

  29. Overall System

  30. What’s Driving Differentiation of Scores? • Two charts: “Each Measure Distributed” and “State Growth Drives Differences” • Each chart has a 0-100 vertical scale, ticks at 75, 83, 92, and 100, and series for Observation and Practice, Local Measure, and State Growth or Comparable Growth

  31. In Lower Hudson • Only 10% of variance driven by Observations
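One way to arrive at a share-of-variance figure like this is to decompose the variance of the composite into each component's covariance with it. The sketch below uses simulated numbers, not Lower Hudson data; the component point ranges (20/20/60) and their spreads are assumptions chosen only to mimic a system in which observation scores vary little.

```python
# Hypothetical variance decomposition of a composite score into the share
# contributed by each component (simulated data, not Lower Hudson results).
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500

# Illustrative component scores on assumed APPR-style point ranges.
scores = pd.DataFrame({
    "growth":      rng.integers(0, 21, n),                  # 0-20 points, spread out
    "local":       rng.integers(13, 19, n),                 # 0-20 points, compressed range
    "observation": np.clip(rng.normal(56, 1.5, n), 0, 60),  # 0-60 points, very little spread
})

composite = scores.sum(axis=1)

# Var(composite) equals the sum over components of Cov(component, composite),
# so each covariance divided by the total variance is that component's share.
shares = scores.apply(lambda col: np.cov(col, composite)[0, 1]) / composite.var()
print(shares.round(2))  # shares sum to 1; observation's share is small by construction
```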

  32. Summary of Findings • Strong variation between district implementations • SLOs 3 points higher than MGP • Local measures 4 points higher than MGP • Observation ratings show almost no differentiation • Likely driven by the rules set forth in the composite-score HEDI bands

  33. Nationwide Context

  34. Total system issues • Two big rating systems • Index: weighted points • Matrix: categories based on multiple positions of measures • A visual (rows = Measure 1, columns = Measure 2):
                 Measure 2:   1   2   3   4
      Measure 1 = 3:          E   H   H   H
      Measure 1 = 2:          I   D   E   H
      Measure 1 = 1:          I   I   D   H
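The sketch below contrasts the two designs in code. The point weights and matrix cells are illustrative assumptions (the matrix simply mirrors the 3x4 visual above), not NYSED's actual rules.

```python
# Hypothetical sketch of the two rating-system designs: an index (weighted
# points) versus a matrix (category looked up from two measures' positions).
# Weights and cells are illustrative assumptions, not NYSED's actual rules.

def index_composite(growth_pts, local_pts, observation_pts):
    """Index approach: one weighted point total, later banded into a HEDI
    category (as in the earlier composite-scale sketch)."""
    return growth_pts + local_pts + observation_pts  # e.g. a 20 + 20 + 60 point index

# Matrix approach: the rating comes from the joint position of two measures,
# mirroring the 3x4 visual on this slide (I/D/E/H = the HEDI categories).
MATRIX = {
    3: {1: "E", 2: "H", 3: "H", 4: "H"},
    2: {1: "I", 2: "D", 3: "E", 4: "H"},
    1: {1: "I", 2: "I", 3: "D", 4: "H"},
}

def matrix_rating(measure_1, measure_2):
    return MATRIX[measure_1][measure_2]

print(index_composite(14, 13, 56))  # 83 points on the index
print(matrix_rating(2, 3))          # "E" (Effective)
```

Under the matrix design, a very low score on one measure cannot simply be bought back by a high score on another, which is the compensatory-model concern raised on the next slide.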

  35. Pro/Con of Rating Systems • Index • Pro: easy to calculate • Pro: easy to communicate • Con: compensatory model may incent cross-component influence • Matrix • Pro: more flexible • Con: more difficult to explain • Pro: may allow disagreeing measures to be dealt with in a different way than an index

  36. What About the Other 49 States? • Much experimentation • High weights on growth • High weights on observations • Student surveys • Assessment system redesigns • Index and Matrix approaches • Who gets it right?

  37. Exemplars • A developing field • No state has it right • Some components work • Growth on assessment measurement in NY is good • No state has gotten SLOs right (RI is getting there) • Observations are coming under fire for poor implementation and possible bias (some great work on the measures, not yet on ratings) • Total system scoring and policy is all over the map

  38. Louis Wool, Superintendent of Schools, Harrison Central School District
