Kevin Welner, director of the National Education Policy Center, wrote this commentary in response to the complaints of teachers who are evaluated by the scores of students they never taught. Few people can understand the complex algorithms underlying VAM scores, and the people who wrote these formulae can’t explain them in plain English. Yet teachers are fired or get a bonus depending on whether their incomprehensible rating is low or high. Bear in mind that few, if any, states would have adopted these measures without the financial and political pressure exerted by Arne Duncan, Race to the Top, and the Obama administration, which demanded them.
Welner writes:
“As you probably know, Diane, my biggest concerns about high-stakes accountability systems tied to measures of academic growth aren’t technical—they’re about perverse incentives. Yes, the technical problems are very real, but even if they were all somehow overcome, we’d be left with a much poorer system of education that’s narrowly focused on what’s being measured.
“Having said that, I do want to add to your earlier post concerning the Florida VAM. I think the post makes three good points but overlooks the most important one.
“As you point out, the model is nonsense when applied to educators who don’t teach the tested subjects. And as you point out, application of the model results in misclassifications—as do all such models. Finally, as you point out, very few readers can understand the model.
“But that leads to a somewhat different point that I think is very important. Florida’s legislators, its Commissioner of Education, and the members of the State Board of Education almost surely are among those who cannot understand the model. My hunch is that the AIR experts who developed FL’s model have walked through it, possibly multiple times, with these policy makers. But the math is just too complex. (Note that the excerpt you pasted from page 6 of the AIR report is just the general form of the model; if expanded it would be much more overwhelming—see the next 10 pages of the report.)
“This is not a criticism of the model or its developers; simple regression models that could be relatively easily understood have well-documented flaws. But adding vectors capturing the effect of lagged scores, mathematical descriptions of Bayesian estimates, and within-student covariance matrices—while all justified in the report—has the obvious effect of placing policy makers at the mercy of whichever experts they choose to listen to.
“This sort of problem does come up in other contexts; to some extent it’s unavoidable. When Congress votes to fund a NASA mission, the underlying math, physics and engineering are similarly beyond normal understanding. When judges hear expert testimony in a pharmaceutical case, etc., they also must confront their own limitations. But at least in those instances, there’s a procedure in place to take oppositional testimony.
“The best analogy here is probably to the defense industry, which works with people in the defense department to design a new weapons system and then helps to market it to Congress. The result is often something technically sophisticated and, for most members of Congress, well beyond their ability to understand strengths and weaknesses.
“Perhaps that’s why the non-technical evidence is so important. We can all understand the problem when a teacher explains that her evaluation is based on the academic growth of students in areas she doesn’t teach. We can all also, to some extent, understand the problem of unreliable evaluations that result in misclassifications.
“But we should, at the very least, recognize and acknowledge the reality that these policies are being adopted by policy makers who pretty much have no clue what it is that they’re putting in place.”

Hard to believe that one of W’s lines in debates with Al Gore was “the man is practicing fuzzy math again.”
Fuzzy math rules the day now.
All math was fuzzy to W.
And yet, he was President not one but two terms. Go figure.
I think this leaves out that, by law, poverty cannot be factored into the formula. And despite all its flaws, AIR was just awarded $220 million to create our next round of standardized tests, which they plan to field test in Utah. You know, because Utah and Florida are so similar.
While proponents likely do not grasp the calculus of VAM or any other aspect of the accountability metrics, they do know well that the demonization of public schools and their educators is a function of the metrics machine. The metrics are all based upon standardized test performance, as everyone knows. But because these tests reflect only the lived experience of the students, they are not psychometrically valid tests; they reflect only the socio-economic conditions the students are reared in. They are also passing legislation that totally misuses VAM, even against the recommendations of EVAAS’s highly talented statisticians. The tests are false proxies for performance, and the system represents pseudo accountability: holding educators responsible for forces and factors over which they have no professional control.
Not only “proponents likely do not grasp the calculus” but more likely than not the opponents. But it’s not that hard:
You multiply this or that factor by this or that constant (found in the thin air of the brain of the formula maker), add it together with this or that other factor that may or may not have been multiplied by a constant, subtract other constants that may or may not have been multiplied by some other constants, and run it through your silicon crystal ball. The VAM score for each teacher magically appears in the center of the ball, but only if you are wearing the right outfit (see below).
If you take time to read about the Florida legislature’s shenanigans while it is currently in session you will see quite clearly that they don’t really care at all if they don’t understand the underlying formulae for VAM. Jeb Bush says it has to be done so they do it. It is all in service to Bush’s presidential campaign aspirations and appeasing ALEC, the Business Roundtable, and the Chamber of Commerce, along with the Tea Party for some.
I object to giving the legislators and members of the State Board of Education a pass for not understanding the complex math. They most likely saw that as a feature rather than a bug because it would make it easier to dupe the public. That’s how they roll.
I’ll ask a dumb question, what’s lagged score mean? Is that commonly termed improvement? Or, measuring change in score?
Notice how we all jumped in to answer your question. 🙂
In this case, the model creators were using multiple prior test scores of the individual students. The vector approach was used to incorporate into the model those past scores.
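To make the term concrete, here is a minimal Python sketch of what a "lagged score" is. It is a toy illustration only and is not the AIR/Florida model: the data layout (a dict mapping year to score), the function names, and the single-lag straight-line fit are all assumptions made up for this example. The real model uses several prior scores at once plus many other terms.

```python
from statistics import mean


def lagged_scores(history, year, n_lags=2):
    """Return the scores from the n_lags years before `year`, most recent first.

    `history` maps year -> score (a hypothetical layout). Years with no
    test on record come back as None.
    """
    return [history.get(year - k) for k in range(1, n_lags + 1)]


def fit_one_lag(pairs):
    """Least-squares fit of current = a + b * prior from (prior, current) pairs."""
    priors = [p for p, _ in pairs]
    currents = [c for _, c in pairs]
    p_bar, c_bar = mean(priors), mean(currents)
    sxx = sum((p - p_bar) ** 2 for p in priors)
    sxy = sum((p - p_bar) * (c - c_bar) for p, c in pairs)
    b = sxy / sxx
    a = c_bar - b * p_bar
    return a, b


def residual(prior, current, a, b):
    """Actual score minus the score predicted from the lagged score."""
    return current - (a + b * prior)


# Made-up score histories, for illustration only.
student = {2011: 300, 2012: 310, 2013: 325}
kindergartener = {2013: 320}

print(lagged_scores(student, 2013))         # the two prior-year scores
print(lagged_scores(kindergartener, 2013))  # no prior scores exist
```

So a lagged score is simply an earlier score used as a predictor of the current one. It is not the improvement itself; a plain gain would be the current score minus the prior score, whereas models of this kind look at how far the current score lands from what the prior scores predicted.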
Except when the students don’t have prior test scores, like Kindergarteners, 1st and 2nd graders, or students in art, music, physical education, etc.
Then they just made it up out of whole cloth by using the scores of students who never entered your classroom.
But teachers are encouraged to use the data collected throughout the year. It’s the norm now, and everyone knows their evaluations are tied to it (at least to the performance of certain grade levels). Our entire school will have VAM based on the scores of two grade levels: every teacher, every subject. This is why administrators can easily push the idea that we are all part of preparing the students (for the test). It is the daily reality for teachers. I think opting out by parents is the only way around it.
Teachers are coached as follows:
All teachers in grades K-3 are to drill down into their data. Make sure that you are looking at the numbers for the goals in Dibels and which goals need to be mastered. This means students reading below grade level probably need to be Progress Monitored on Dibels Assessments that have been administered before your grade level. Assess on the lowest level Dibels reading assessment until the student obtains the goal. Please look at “What’s Next” in the upper right hand corner of the screen. Go to your leveled groups and pick the name of a student. You will see possible interventions for that child on the right hand side of the screen. Some of these interventions could really help your students. These interventions need to be reflected in your Form C updates.
You are assuming legislators understand any of the legislation they vote on. Is there any proof of this contention?
Of course there is the expression garbage in, garbage out. But in this case the data going in is fine, but as per many examples, there is garbage out. So the logical conclusion is that the equation is garbage.
No, TC, the data going in is not “fine” but rather manipulated junk. Assessing teachers on the test scores of students they don’t currently teach (and may never have taught) is not fine. Raising and lowering the arbitrary cut scores on the FCAT test (here in Florida) as an act of political expedience is not “fine”. Disallowing factors such as poverty, ELL status, and ESE status is not “fine”.
The data going in is garbage.
The equation is garbage.
The VAM scores are garbage.
The VAM scores are not garbage . . .
. . . they’re weapons.
The tests themselves are the spear point of the invasion.
Yes, I agree, Chris, the whole concept is misguided. The way to extract the best results from teachers is to pay them sufficiently and give them autonomy, mastery, and purpose.
Rating, ranking, and micromanaging work in a field where people are supposed to be respected professionals has been shown to be counterproductive to motivation and to the best end results.
“When judges hear expert testimony in a pharmaceutical case, etc., they also must confront their own limitations.”
They do, but as he notes it’s adversarial, so they hear “dueling experts”. They also have a process to certify the expert as an expert.
I think judges would be helpful in the ed reform debate. They hear tons of BS over the course of a career, just heaps of it, so many of them (not all) are really, really good at sniffing it out.
For example, if the Common Core were on trial and the defense started with this, from Arne Duncan:
“Let’s lower standards and go back to lying to ourselves and our children, so that our community can feel better.”
Not even the most rookie municipal court judge would consider that an “argument” 🙂
“Stupid is as stupid does.” – Forrest Gump
All we need is lawyers willing to go to court and we can all start filing cases. Imagine a million court cases all focused on NCLB, Race to the Top, Common Core, standardized testing, President Obama, Bill Gates, etc.
Okay you older teachers, dust off the old rosters and start looking for former students who are now lawyers and make those calls.
You can’t make this stuff up.
President George W. Bush Jr., Mr. NCLB himself: “A reading comprehension test is a reading comprehension test. And a math test in the fourth grade—there’s not many ways you can foul up a test… It pretty easy to ‘norm the results.’” [from early 2001, quoted in Daniel Koretz, MEASURING UP: WHAT EDUCATIONAL TESTING REALLY TELLS US, 2009, p. 7]
The lack of even an elementary understanding of standardized testing from the “Education President” is stunning.
From the Land of VAM (i.e., Tennessee) during hearings in 2004 on the Tennessee Value Added Assessment System [TVAAS]:
[start quote]
… Vernon Coffey, Director of Grainger County Schools and former Tennessee Commissioner of Education, stated that he believed TVAAS to be as reliable and valid as SAT and ACT, while admitting that “I don’t understand all the numbers, but I’m not supposed to.”
[end quote]
[Jim Horn and Denise Wilburn, MISMEASURING EDUCATION, 2013, p. 105]
“I’m not supposed to.” This is what passes for intellectual and moral leadership among the leaders of the “new civil rights movement of our time.”
You don’t have to understand the sometimes arcane mathematics behind VAM and standardized testing to grasp the basic conceptual framework and intuition behind them. Nor is it difficult to understand the practical consequences and outcomes.
But what can we do when the folks in the education establishment have this sort of attitude towards numbers and stats of all kinds:
“I told Dr. Steve Perry and Dr. John Deasy I broke my leg in two places. They told me to quit going to those places.” [with all apologies to Henny Youngman]
😎
Great post.
“This is not a criticism of the model or its developers; simple regression models that could be relatively easily understood have well-documented flaws. But adding vectors capturing the effect of lagged scores, mathematical descriptions of Bayesian estimates, and within-student covariance matrices—while all justified in the report—has the obvious effect of placing policy makers at the mercy of whichever experts they choose to listen to.”
I’ve sat through a presentation on Florida VAM although everyone in that particular room did understand the technical aspects. Nevertheless, the speaker still used the terms “good” and “bad” teachers and no one questioned the underlying logic of such evaluations even though they knew better. They were impressed by the statistical difference in test scores between the “bad” and “good” teachers.
No one considered looking at the issue like Pasi Sahlberg does.
Here is a link to the Ohio Teacher Evaluation System, with other links on how it is implemented. 50% is tied to testing!
I am so glad I retired from this madness.
Our principal is the most hateful evaluator. And, as I have said before, this is with teachers who have been delivering top county schools for 20 years…before NCLB. But this less than kind principal has made life a living cesspool since she arrived.
Anyway, this is what ODE has put into place.
http://education.ohio.gov/Topics/Teaching/Educator-Evaluation-System/District-Educator-Evaluation-Systems/eTPES-Help
This just in … they tore up the OTES bill and have inserted another …via Kasich and henchmen.
http://www.plunderbund.com/2014/03/26/house-education-committee-hijacks-bill/
VAM: The Scarlet Letter. A talk given to the School Board of Palm Beach County, FL.
http://youtu.be/dfMymU86Bjo