Some weeks back, the media reported that the District of Columbia’s infamous teacher evaluation program, known as IMPACT, was successful, based on a paper by researchers Thomas Dee and James Wyckoff. The alleged takeaway was that VAM (value-added measurement) works and that DC is right to judge teacher quality by student test scores.

But Audrey Amrein-Beardsley, one of the pre-eminent national experts on VAM, says “not so fast. Don’t believe the hype.”

In this post, she dissects the DC research, which she says was never peer-reviewed and is deeply flawed.

She identifies many errors, but this one is the most egregious:

“Teacher Performance:” Probably the largest fatal flaw, or the study’s most major limitation was that only 17% of the teachers included in this study (i.e., teachers of reading and mathematics in grades 4 through 8) were actually evaluated under the IMPACT system for their “teacher performance,” or for that which they contributed to the system’s most valued indicator: student achievement. Rather, 83% of the teachers did not have student test scores available to determine if they were indeed effective (or not) using individual value-added scores. It is implied throughout the paper, as well as the media reports covering this study post release, that “teacher performance” was what was investigated when in fact for four out of five DC teachers their “performance” was evaluated only as per what they were observed doing or self-reported doing all the while. These teachers were evaluated on their “performance” using almost exclusively (except for the 5% school-level value-added indicator) the same subjective measures integral to many traditional evaluation systems as well as student achievement/growth on teacher-developed and administrator-approved classroom-based tests, instead.

Thus, it is wrong to say that the paper vindicates IMPACT or its reliance on VAM when more than four of every five teachers in the study did not have individual value-added scores available.