Measurement in the context of formative classroom assessment Mark Wilson UC Berkeley Presented at the University of California, Berkeley March 7, 2017
Abstract This talk will survey the last 15 years of work that we have been doing at the BEAR Center on the BEAR Assessment System (BAS). It will begin by noting the initial motivations for developing this approach to measurement/assessment, focusing on the question of what the measurement demands are in the context of formative classroom assessment. This will be followed by a brief description of the BAS, accompanied by discussion of how it reflects a response to this question. It will then explore the following questions. (a) What are the most important developments in the BAS since 2000? (b) What are some important examples of BAS assessments? (c) How should BAS interface with state testing? (d) What are the challenges and opportunities?
Outline • Why should we think of classroom assessment as “measurement”? • What should we want from an Assessment System? • How is the BEAR Assessment System (BAS) a response to the question? • What are the most important developments in the BAS since 2000? • What are some important examples of BAS assessments? • How should BAS interface with state testing? • What are the challenges and opportunities?
A bit of my own background • Q1. How can educational measurement help classroom assessment? • Have worried about this since the 1970s – Saw the need to give teachers better assessment information in the classroom – Saw the need for the assessment to track the “logic behind the curriculum” – Because that way teachers can understand what to do with that information.
A bit of my own background • Q2. Where does the scale (aka the continuum, the construct, the learning progression) in tests come from? • Worked on this in the 1980s and early 1990s – Discovering and interpreting effects on item difficulty for “wild” items – Looking for a consistent story about why items are ordered the way they are • Published 10 papers on these topics during that period …
• Wilson, M., & Bock, R.D. (1985). Spellability: A linearly-ordered content domain. American Educational Research Journal, 22(2), 297-307.
• Wilson, M. (1989). Empirical examination of a learning hierarchy using an item response theory model. Journal of Experimental Education, 57(4), 357-371.
• Wilson, M. (1989). Saltus: A psychometric model of discontinuity in cognitive development. Psychological Bulletin, 105(2), 276-289.
• Wilson, M. (1990). Measuring a van Hiele geometry sequence: A reanalysis. Journal for Research in Mathematics Education, 21(3), 230-237.
• Wilson, M. (1990). Investigation of structured problem solving items. In G. Kulm (Ed.), Assessing higher order thinking in mathematics. Washington, DC: American Association for the Advancement of Science.
• Wilson, M. (1992). The ordered partition model: An extension of the partial credit model. Applied Psychological Measurement, 16(3), 309-325.
• Masters, G.N., Adams, R.J., & Wilson, M. (1990). Charting of student progress. In T. Husen & T.N. Postlethwaite (Eds.), International Encyclopedia of Education: Research and Studies, Supplementary Volume 2 (pp. 628-634). Oxford: Pergamon Press. Reprinted in: T. Husen & T.N. Postlethwaite (Eds.), (1994). International Encyclopedia of Education (2nd ed.) (pp. 5783-5791). Oxford: Pergamon Press.
• Wilson, M. (1990). Measurement of developmental levels. In T. Husen & T.N. Postlethwaite (Eds.), International Encyclopedia of Education: Research and Studies, Supplementary Volume 2. Oxford: Pergamon Press.
• Wilson, M. (1992). Measurement models for new forms of assessment in mathematics education. In J.F. Izard & M. Stephens (Eds.), Reshaping assessment practices: Assessment in the mathematical sciences under challenge. Hawthorn, Australia: ACER.
• Wilson, M. (1992). Measuring levels of mathematical understanding. In T. Romberg (Ed.), Mathematics assessment and evaluation: Imperatives for mathematics educators. New York: SUNY Press.
From: Wilson, 2010.
More of my own background • Concluded that these analyses shed little light on learning progressions – The item sets were too diverse, hence too sparse in what they conveyed about student learning, – The items differed in idiosyncratic ways. • Gave up doing that – needed to find another way ...
Things I learned … • Maxim No. 1. Assessment content specialists (“item writers”) do not know why they build items the way they do. Corollary: They cannot predict the difficulty of their items. • Maxim No. 2. “Wild” items confound deeper cognitive/structuralist/diagnostic underpinnings with surface features (such as item wording, etc.).
Things I learned … • But Good curriculum developers create their curricula using a developmental way of thinking about learning. • Hence We need to engage with curriculum developers, not item writers. • But Curriculum developers do not know how to write items.