Audrey Amrein-Beardsley of Arizona State University is one
of the nation’s leading authorities on teacher evaluation. She has
the advantage of having taught middle school math for several
years. She understands better than almost any other researcher just
how flawed value-added measurement is.
Next year, her book on the
limitations of test-based accountability will be published.
I invited her to contribute to the blog so you would become familiar
with her valuable work.
She writes:
Stock Your Bunkers with VAMmunition
While “Top Ten Lists” have become a recurrent feature of
periodicals, magazines, blogs, and the like, the “Top Ten List”
presented here should give readers of this blog, and hopefully
other educators, the VAMmunition, or rather the ammunition,
practitioners need to protect themselves against the unfair
implementation and use of VAMs (i.e., value-added models).
Likewise, because “Top Ten Lists” typically serve reductionist
purposes, reducing highly complex phenomena into easy-to-understand,
easy-to-interpret, and easy-to-use strings of information, the
approach is well suited here: those trying to ward off the unfair
implementation and use of VAMs often do not have the VAMmunition
they need to defend themselves in research-based ways.
Hopefully this list will satisfy at least some of these needs. Accordingly, I
present here the “Top Ten Bits of VAMmunition”: research-based
reasons, listed in no particular order, that all public school
educators should be able to use to defend themselves against VAMs.
1. VAM estimates should not be used to assess teacher
effectiveness. The standardized achievement tests on which VAM
estimates are based have always been, and continue to be,
developed to assess levels of student achievement, not growth
in student achievement, and certainly not growth in achievement
that can be attributed to teacher effectiveness. The tests on
which VAM estimates are based (among other issues) were never
designed to estimate teachers’ causal effects.
2. VAM estimates are often
unreliable. Teachers who should be (more or less) consistently
effective are being classified in sometimes highly inconsistent
ways over time. A teacher classified as “adding value” has a 25 to
50% chance of being classified as “subtracting value” the following
year(s), and vice versa. This sometimes makes the probability of a
teacher being identified as effective no different than the flip of
a coin.
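The coin-flip point can be illustrated with a toy simulation (the numbers here are hypothetical and chosen only for illustration, not drawn from any actual VAM): when a teacher's true effect is small relative to the noise in each year's estimate, the "adding value" vs. "subtracting value" label flips from one year to the next nearly half the time.

```python
# Toy sketch, NOT an actual VAM: a teacher has a modest positive "true"
# effect, but each year's estimate adds large measurement noise.
import random

random.seed(1)

TRUE_EFFECT = 0.05   # assumed small true effect (arbitrary units)
NOISE_SD = 0.20      # assumed estimate noise, large relative to the effect
YEARS = 10_000       # many simulated years, to estimate the flip rate

flips = 0
prev_positive = None
for _ in range(YEARS):
    estimate = random.gauss(TRUE_EFFECT, NOISE_SD)
    positive = estimate > 0  # "adding value" vs. "subtracting value"
    if prev_positive is not None and positive != prev_positive:
        flips += 1
    prev_positive = positive

print(f"Year-to-year flip rate: {flips / (YEARS - 1):.0%}")
```

Under these assumed numbers the label flips in roughly 40-50% of year-to-year transitions, which is essentially the coin flip described above, even though the simulated teacher's true effectiveness never changed at all.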
3. VAM estimates are often invalid. Without adequate
reliability, which is a qualifying condition for validity,
valid VAM-based interpretations are even more difficult to defend.
Likewise, very limited evidence exists that teachers who
post high or low value-added scores are rated similarly on at least
one other correlated criterion (e.g., teacher observational scores,
teacher satisfaction surveys). The correlations demonstrated
across studies are not nearly high enough to support valid
interpretation or use.
4. VAM estimates can be biased. Teachers of
certain students who are almost never randomly assigned to
classrooms have more difficulties demonstrating value-added than
their comparably effective peers. Estimates for teachers who teach
inordinate proportions of English Language Learners (ELLs), special
education students, students who receive free or reduced lunches,
and students retained in grade, are more adversely impacted by
bias. While bias can present itself in terms of reliability (e.g.,
when teachers post consistently high or low levels of value-added
over time), the illusion of consistency can sometimes be due,
rather, to teachers being consistently assigned more homogenous
sets of students.
5. Related, VAM estimates are fraught with measurement errors
that undermine their reliability and validity and contribute to
issues of bias. These errors are caused by inordinate amounts of
inaccurate or missing data that cannot easily be replaced or
disregarded; variables that cannot be statistically “controlled
for”; differential summer learning gains and losses, and prior
teachers’ residual effects, which also cannot be “controlled for”;
the effects of teaching in non-traditional, non-isolated, and
non-insular classrooms; and the like.
6. VAM estimates are unfair. Issues of fairness arise when test-based
indicators, and the inferences drawn from them, impact some more than
others in consequential ways. With VAMs, only teachers of
mathematics and reading/language arts with pre- and post-test data
in certain grade levels (e.g., grades 3-8) are typically being held
accountable. Across the nation, this leaves approximately
60-70% of teachers, including entire campuses of teachers (e.g.,
early elementary and high school teachers), VAM-ineligible.
7. VAM estimates are non-transparent. Estimates must be made
transparent in order to be understood, so that they can ultimately
be used to “inform” change and progress in “[in]formative” ways.
However, the teachers and administrators who are to use VAM
estimates typically do not understand the VAMs or the VAM
estimates being used to evaluate them, certainly not well enough to
promote such change.
8. Related, VAM estimates are typically of no
informative, formative, or instructional value. No research to date
suggests that VAM-use has improved teachers’ instruction or student
learning and achievement.
9. VAM estimates are being used inappropriately to make
consequential decisions. VAM estimates do not have enough
consistency, accuracy, or depth to support the purposes for which
VAMs are increasingly being used, for example, helping to make
high-stakes decisions about whether teachers receive merit pay, are
awarded or denied tenure, or are retained or, conversely,
terminated. While proponents argue that, because of VAMs’
imperfections, VAM estimates should not be used in isolation from
other indicators, the fact of the matter is that VAMs are so
imperfect they should not be used for much of anything, unless
largely imperfect decisions are desired.
10. The unintended consequences of VAM use continually go
unrecognized, although research suggests they persist. For example,
teachers are choosing not to teach certain students, including
those they deem most likely to hinder their potential to
demonstrate value-added. Principals are stacking classes to make
certain teachers more, or less, likely to demonstrate
“value-added,” to protect or penalize those teachers, respectively.
Teachers are leaving or refusing assignments to grades in which
VAM-based estimates matter most, and some are leaving teaching
altogether out of discontent or in protest.
About the seriousness of these and
other unintended consequences, weighed against VAMs’ intended
consequences or the lack thereof, proponents and others simply do
not seem to give a VAM.
The author misses one huge negative impact. Given the unreliable estimates, a poor teacher, one who probably should go, could receive a “highly effective” rating, and with that in hand, would be almost impossible to deal with.
There’s much that is objectionable about VAM, apart from its statistical lack of validity, having to do with the values and worldview embedded in the term itself.
According to Investopedia, a value-added metric seeks to measure “the enhancement a company gives its product or service before offering the product to its customers.”
Do we really want our students and children to be seen as “products” offered to “customers” (which, in the eyes of so-called reformers, means employers)?
I doubt that the teachers, administrators or parents at Sidwell Friends (Obama), The Lakeview School (Gates) and Spence (Bloomberg), among other elite private schools, see their students and children as “products” to be “sold.”
However, in the eyes of so-called reformers, that’s apparently fine for the children of the riff-raff, meaning 99% of the children in the country.
The emergence of VAM as a method of evaluating teachers corresponds to the rise of personal data as a fungible commodity. In the eyes of the education privateers, the kids are data, and we all know data is to be monetized. Just ask Eric Schmidt or Mark Zuckerberg about that. Thus the rise of InBloom, Inc., and other similar schemes.
VAM should be opposed not just because it’s a political project masquerading as pseudo-science, but because the mindset that underlies it is grotesquely at odds with a rich, humanistic education, one which all children deserve.
Here in NY, under APPR, EVERY TEACHER is VAM eligible. ELA and math teachers in grades 3-8 are stuck with CCSS summative exams, while the rest of us are being evaluated with a variety of different formative and summative assessments. Some are teacher constructed; 8th grade science teachers must use a four-year (5-8) cumulative
state science test, and many high school teachers must now use their Regents exams as summative assessments as well. In fact, we need a combination of local and “state” tests to see if we produce student growth and/or achievement. The inequity, unreliability, and lack of validity of the VAM portion of our APPR evaluation should have many a lawyer drooling with anticipation. “Send lawyers, guns, and money, the shxxt has hit the fan” WZ
I’ve been recommending much the same to my local union leadership. I think we should challenge the DEformers of our education system, and we would be able to win. Looking at the few public records available, I can see that VAM is unreliable, in that it gives hugely different results for the same teacher over time, and it has no a priori validity and has never been subjected to any trial runs.
Teachers and their unions need to lawyer up, to stats up (i.e., hire forensic statisticians), and to gear up for struggle in the streets too.
I hate to complain about this post, because I think it’s very, very important to point out the (numerous and varied) flaws with VAM being used for teacher evaluation.
But I don’t see how this post provides much in the way of actual “ammunition.” First, there are really maybe two main points (the data is biased, its use in evaluations unfair), but somehow it’s put in a top ten list for effect?
There are a lot of things claimed, but nothing cited to back up the claims. Look, I worked in a terrible, terrible charter school where the administration was at odds with many teachers in a pretty major way. And even I find the claim about principals in #10 difficult to believe. Without any proof, this just comes off as paranoid and discredits much of the rest of the (uncited) claims.
How exactly do you game a model that no one understands? Isn’t that kind of stacking of the classroom some kind of slow-burn punishment that may not even work? And if we went to a system where we didn’t have these “objective” measures like the high-stakes tests, wouldn’t we be relying even more on the principals’ evaluations? Wouldn’t the teachers be even more subject to their whims via more subjective evaluations?
There are real problems with VAM, and they can be explained in plain English. Cathy O’Neil, a mathematician who blogs over at mathbabe.org, has done a good job of this over the years.
The research is most certainly out there, so much so that citing it all would not have fit in such a blog post — it is, however, in the forthcoming book. Please email me if you would like more information about any of the 10 pieces of VAMmunition above: audrey.beardsley@asu.edu
Thank you for this excellent and clear eyed assessment of the very flawed VAM measures. I take a humorous look at the controversy here. I think teachers and teacher leaders can use these talking points to battle back against the implementation of the VAM.
http://russonreading.blogspot.com/2013/10/the-vam-moose-coming-to-school-near-you.html
I went to an Orange County, Florida school board meeting and afterwards told a couple of members, “I am the world expert on the VAM, potentially,” and smiled. They had not even looked at the equation. No one in that room, or probably in the administrative building for 180,000 students, had even looked at it. What had they almost passed at the last meeting? Flunking, then firing, all the new teachers in a high-poverty school. I told the school lawyer. The attitude? They may care, but they have been hoodwinked. They have all bought into voodoo school management. Who would ever want to say about the VAM, “My boss is an algorithm”?