Audrey Amrein-Beardsley reports here on new research by Steven Klees of the University of Maryland, which concludes that the contribution of individual teachers to student learning cannot be isolated or quantified as “value-added modeling” claims to do.

Accumulating evidence continues to demonstrate that the teacher evaluation systems imposed by Arne Duncan in Race to the Top are invalid, inaccurate, and unreliable. How many teachers and principals have been fired because of these flawed metrics?

VAMs Are Never “Accurate, Reliable, and Valid”

Open the article for her many links.

She writes:

Educational Researcher (ER) is the highly esteemed flagship journal of the American Educational Research Association. It may sound familiar because many of what I view to be the best research articles about value-added models (VAMs) were published in ER (see my full reading list on this topic here). More specific to this post, the recent “AERA Statement on Use of Value-Added Models (VAM) for the Evaluation of Educators and Educator Preparation Programs” was also published in this journal (see also a prior post about this position statement here).

After this position statement was published, however, many critiqued AERA and the authors of this piece for going too easy on VAMs, as well as on VAM proponents and users, and for not taking a firmer stance against VAMs given the current research. The lightest of the critiques, for example, authored by Brookings Institution affiliate Michael Hansen and University of Washington Bothell’s Dan Goldhaber, was highlighted here, after which Boston College’s Dr. Henry Braun responded, also here. Some even believed this response to be too, let’s say, collegial or symbiotic.

Just this month, however, ER released a critique of this same position statement, authored by Steven Klees, a professor at the University of Maryland. Klees wrote, essentially, that the AERA Statement “only alludes to the principal problem with [VAMs]…misspecification.” To isolate the contributions of teachers to student learning is not only “very difficult,” but “it is impossible—even if all the technical requirements in the [AERA] Statement [see here] are met.”

Rather, Klees wrote, “[f]or proper specification of any form of regression analysis…All confounding variables must be in the equation, all must be measured correctly, and the correct functional form must be used. As the 40-year literature on input-output functions that use student test scores as the dependent variable make clear, we never even come close to meeting these conditions…[Hence, simply] adding relevant variables to the model, changing how you measure them, or using alternative functional forms will always yield significant differences in the rank ordering of teachers’…contributions.”

Therefore, Klees argues “that with any VAM process that made its data available to competent researchers, those researchers would find that reasonable alternative specifications would yield major differences in rank ordering. Misclassification is not simply a ‘significant risk’— major misclassification is rampant and inherent in the use of VAM.”
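To make the rank-ordering point concrete, here is a minimal sketch in Python. It is not Klees's analysis and uses no real data: the simulated students, the single "background" variable, the effect sizes, and every function name are hypothetical choices made only for illustration. It estimates each teacher's "contribution" two ways, first as a raw class mean and then as a class mean after adjusting for background with a one-variable regression, and counts how many teachers land in a different rank position.

```python
import random
import statistics


def simulate(num_teachers=20, students_per_teacher=25, seed=0):
    """Generate hypothetical records: (teacher_id, background, score)."""
    rng = random.Random(seed)
    records = []
    for t in range(num_teachers):
        true_effect = rng.gauss(0, 1)    # assumed "true" teacher contribution
        class_mean_bg = rng.gauss(0, 1)  # students are not randomly assigned
        for _ in range(students_per_teacher):
            background = class_mean_bg + rng.gauss(0, 1)
            score = true_effect + 2.0 * background + rng.gauss(0, 1)
            records.append((t, background, score))
    return records


def teacher_means(records, values):
    """Average one value per student within each teacher's class."""
    by_teacher = {}
    for (t, _, _), v in zip(records, values):
        by_teacher.setdefault(t, []).append(v)
    return {t: statistics.fmean(vs) for t, vs in by_teacher.items()}


def spec_a(records):
    """Specification A: raw class mean score, no controls."""
    return teacher_means(records, [s for _, _, s in records])


def spec_b(records):
    """Specification B: class mean of residuals from an OLS fit on background."""
    xs = [b for _, b, _ in records]
    ys = [s for _, _, s in records]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return teacher_means(records, residuals)


def rank_order(estimates):
    """Teacher ids sorted from highest to lowest estimate."""
    return sorted(estimates, key=estimates.get, reverse=True)


def rank_changes(order_a, order_b):
    """Count teachers whose rank position differs between two orderings."""
    pos_b = {t: i for i, t in enumerate(order_b)}
    return sum(1 for i, t in enumerate(order_a) if pos_b[t] != i)


records = simulate()
order_a = rank_order(spec_a(records))
order_b = rank_order(spec_b(records))
moved = rank_changes(order_a, order_b)
print(f"{moved} of {len(order_a)} teachers change rank between specifications")
```

Neither specification here should be read as the "right" one. The sketch only shows that two defensible modeling choices applied to the same data can order the same teachers differently, which is the kind of sensitivity Klees describes.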

Klees concludes: “The bottom line is that regardless of technical sophistication, the use of VAM is never [and, perhaps, never will be] ‘accurate, reliable, and valid’ and will never yield ‘rigorously supported inferences’” as expected and desired.