Cathy O’Neil is a mathematician who wrote a wonderful book called “Weapons of Math Destruction,” in which she showed how math and big data can be misused to reach very bad decisions.
She recently wrote an article for Bloomberg News in which she explained why VAM is a terrible way to evaluate teachers.
She writes:
“For more than a decade, a glitchy and unaccountable algorithm has been making life difficult for America’s teachers. The good news is that its reign of terror might finally be drawing to a close.
“I first became acquainted with the Value-Added Model in 2011, when a friend of mine, a high school principal in Brooklyn, told me that a complex mathematical system was being used to assess her teachers — and to help decide such important matters as tenure. I offered to explain the formula to her if she could get it. She said she had tried, but had been told “it’s math, you wouldn’t understand it.”
“This was the first sign that something very weird was going on, and that somebody was avoiding scrutiny by invoking the authority and trustworthiness of mathematics. Not cool. The results have actually been terrible, and may be partly to blame for a national teacher shortage.
“The VAM — actually a family of algorithms — purports to determine how much “value” an individual teacher adds to a classroom. It goes by standardized test scores, and holds teachers accountable for what’s called student growth, which comes down to the difference between how well students performed on a test and how well a predictive model “expected” them to do…
“Fundamental problems immediately arose. Inconsistency was the most notable, statistically speaking: The same person teaching the same course in the same way to similar students could get wildly different scores from year to year. Teachers sometimes received scores for classes they hadn’t taught, or lost their jobs due to mistakes in code. Some cheated to raise their students’ test scores, creating false baselines that could lead to the firing of subsequent teachers (assuming they didn’t cheat, too).
“Perhaps most galling was the sheer lack of accountability. The code was proprietary, which meant administrators didn’t really understand the scores and appealing the model’s conclusions was next to impossible. Although economists studied such things as the effects of high-scoring teachers on students’ longer-term income, nobody paid adequate attention to the system’s effect on the quality and motivation of teachers overall.”
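To make the “growth” idea in the description above concrete, here is a minimal Python sketch. It is not any vendor’s actual model (those are proprietary, as the article notes); it assumes the simplest possible predictive model, a straight line fitted from last year’s score to this year’s score, and the student data and function names are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def value_added(students):
    """students: list of (teacher, prior_score, current_score) tuples.

    A teacher's score is the average gap between each student's actual
    result and the result the fitted line "expected" for that student.
    """
    a, b = fit_line([s[1] for s in students], [s[2] for s in students])
    gaps = {}
    for teacher, prior, current in students:
        expected = a + b * prior
        gaps.setdefault(teacher, []).append(current - expected)
    return {teacher: mean(g) for teacher, g in gaps.items()}

# Hypothetical data: (teacher, last year's score, this year's score)
students = [
    ("A", 60, 65), ("A", 70, 72), ("A", 80, 85),
    ("B", 60, 62), ("B", 70, 70), ("B", 80, 78),
]
scores = value_added(students)
```

With this made-up data, teacher A comes out about 2 points above expectation and teacher B about 2 points below. Everything the teacher is held accountable for sits in that gap, so anything the predictive model leaves out lands on the teacher.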
Happily, she says, VAM is on the way out. The new federal law does not require it, and courts have been ruling against it. She cites Sheri Lederman’s court victory in New York and the recent court victory in Houston, where the judge said the algorithm was so opaque that it should not be used at all.
VAM has ruined the careers, the reputations, and the lives of many educators. One teacher, Rigoberto Ruelas, committed suicide soon after his VAM rating was published online by the Los Angeles Times. This is an example of a deadly use of math to damage real people, not just a game played by economists. Whoever participated willingly in this sham exercise should do penance.
VAM is an abomination. But let us not forget that evaluating teachers or students on the basis of test scores is an abomination even if the algorithms are elegant and the process is impeccable and transparent.
YES: The habit of those who “lead” education in turning a blind eye to the repeatedly proven fact that higher test scores and economic stability go hand in hand makes the blaming, harassing and test-score labeling of those schools and teachers working with our nation’s poorest students an abomination even more unforgivable.
Thanks Diane.
Mark Eichenlaub
On Wed, May 17, 2017 at 10:01 AM, Diane Ravitch’s blog wrote:
> dianeravitch posted: “Cathy O’Neil is a mathematician who wrote a wonderful book called “Weapons of Math Destruction,” in which she showed how math and big data can be misused to reach very bad decisions. She recently wrote an article for Bloomberg News in which she expl”
Which reminds me of a short and very informative piece that has been mentioned before on this blog.
“Mathematical Intimidation: Driven by the Data” by John Ewing. Available at—
Link: http://www.ams.org/notices/201105/rtx110500667p.pdf
😎
The first question people should always ask about the output from any mathematical model is “What are the underlying assumptions on which the model is based?”
The math is merely a way of getting from the assumptions to the conclusions, so if the assumptions are wrong, the math (whether done correctly or not) is actually completely irrelevant.
For example, if a value added model assumes that students in a teacher’s class are randomly assigned to teachers, but in reality, are NOT randomly assigned, there is no need to even look at the details of the VAM calculation.
Once you know that the model is based on a bogus assumption that does not adhere to the reality of the situation, you know the result can not possibly be correct.
Doing math based on bogus assumptions is called mathturbation and there is a great deal of it going on all over the place these days.
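The random-assignment point can be shown with a toy calculation. This is a minimal Python sketch under stated assumptions, not a real VAM: the “model” expects every student to gain the average amount, both hypothetical teachers add exactly the same 5 points, and each student also carries a “support” factor (help at home, say) that the model never sees.

```python
from statistics import mean

def vam_scores(rows):
    """rows: (teacher, prior, current). Expected score = prior + average gain."""
    avg_gain = mean(cur - prior for _, prior, cur in rows)
    by_teacher = {}
    for teacher, prior, cur in rows:
        by_teacher.setdefault(teacher, []).append(cur - (prior + avg_gain))
    return {teacher: mean(r) for teacher, r in by_teacher.items()}

def simulate(assignment):
    """assignment: (teacher, prior, support). Every teacher adds the same 5 points."""
    return vam_scores([(t, p, p + 5 + s) for t, p, s in assignment])

# Non-random: teacher A gets the well-supported students, teacher B the others.
sorted_classes = [("A", 70, 3), ("A", 80, 3), ("B", 70, -3), ("B", 80, -3)]
# Mixed: each teacher gets one student of each kind.
mixed_classes = [("A", 70, 3), ("A", 80, -3), ("B", 70, -3), ("B", 80, 3)]

sorted_result = simulate(sorted_classes)
mixed_result = simulate(mixed_classes)
```

With the sorted classes, A scores +3 and B scores -3 even though the two teachers are identical by construction. With the mixed classes, both score 0. The arithmetic is done correctly both times; only the assumption changed.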
“Once you know that the model is based on a bogus assumption that does not adhere to the reality of the situation, you know the result can not possibly be correct.”
And that bogus assumption in education is that one can measure the teaching and learning process including what a student learns and supposedly knows. The most misleading concept/term in education is “measuring student achievement” or “measuring student learning”. The concept has been misleading educators into deluding themselves that the teaching and learning process can be analyzed/assessed using “scientific” methods which are actually pseudo-scientific at best and at worst a complete bastardization of rationo-logical thinking and language usage.
There never has been and never will be any “measuring” of the teaching and learning process and what each individual student learns in their schooling. There is and always has been assessing, evaluating, judging of what students learn but never a true “measuring” of it.
But, but, but, you’re trying to tell me that the supposedly august and venerable APA, AERA and/or the NCME have been wrong for more than the last 50 years, disseminating falsehoods and chimeras??
Who are you to question the authorities in testing???
Yes, they have been wrong and I (and many others, Wilson, Hoffman etc. . . ) question those authorities and challenge them (or any of you other advocates of the malpractices that are standards and testing) to answer to the following onto-epistemological analysis:
The TESTS MEASURE NOTHING, quite literally when you realize what is actually happening with them. Richard Phelps, a staunch standardized test proponent (he has written at least two books defending the standardized testing malpractices) in the introduction to “Correcting Fallacies About Educational and Psychological Testing” unwittingly lets the cat out of the bag with this statement:
“Physical tests, such as those conducted by engineers, can be standardized, of course [why of course of course], but in this volume, we focus on the measurement of latent (i.e., nonobservable) mental, and not physical, traits.” [my addition]
Notice how he is trying to assert by proximity that educational standardized testing and the testing done by engineers are basically the same, in other words a “truly scientific endeavor”. The same by proximity is not a good rhetorical/debating technique.
Since there is no agreement on a standard unit of learning, there is no exemplar of that standard unit and there is no measuring device calibrated against said non-existent standard unit, how is it possible to “measure the nonobservable”?
THE TESTS MEASURE NOTHING for how is it possible to “measure” the nonobservable with a non-existing measuring device that is not calibrated against a non-existing standard unit of learning?????
PURE LOGICAL INSANITY!
The basic fallacy here is confusing and conflating metrological measuring (metrology is the scientific study of measurement) with measuring that connotes assessing, evaluating and judging. The two meanings are not the same, and confusing and conflating them is a very easy way to make it appear that standards and standardized testing are “scientific endeavors”: objective, and not subjective like assessing, evaluating and judging.
Those supposedly objective results are then used to justify discrimination against many students for their life circumstances and inherent intellectual traits.
C’mon test supporters, have at the analysis, poke holes in it, tell me where I’m wrong!
I’m expecting that I’ll still be hearing the crickets and cicadas of tinnitus instead of reading any rebuttal or refutation.
Because there is no rebuttal/refutation!
And thanks for that opening, SDP!
Duane.
You are right that the fundamental assumption is that the standardized tests measure learning.
But the funny thing about VAM is that there is a whole list of assumptions that go into the model and if any one of them is false, the whole thing falls apart.
Audrey Amrein-Beardsley listed all the assumptions on her blog and pointed out that most of them were actually not met and could probably never be met, in practice.
So, it’s basically garbage in garbage out.
Not that it’s surprising that economists are behind this stuff. They are infamous for basing their models on completely bogus assumptions. Mainstream economics is a veritable tower of bogus assumptions followed by copious and furious mathturbation.
The only really surprising thing is that anyone is surprised that economists get things so wrong so often.
I doubt that Gates, one of the big promoters of VAM, will ever do any “penance” for all the harm caused to teachers, students, schools and districts. He is too busy with his next assault on public education: competency-based education or personalized learning, which is neither competent nor personal. CBE is death by infiltration. Gates knows how this works. When he dangles cash in front of cash-strapped school districts, he gets to use our young people as guinea pigs and sell more products. It won’t matter that it will fail; Gates will get to exploit everyone with his latest “big idea.” This is what our country is about, feeding the hungry “market” looking for miracles and allowing billionaires to decide our future. http://curmudgucation.blogspot.com/2017/05/not-new-conversation.html
“It’s math. You wouldn’t understand it”
Translation: “It’s fraud and you WOULD understand that, unfortunately, so that’s why we hide behind a facade of fake math and won’t let anyone see our computer code”
You got it.
Anyone grading teachers should first have taught a class of 30 that includes many second-language learners, many low-income students, and many living in overcrowded conditions.
My guess is that the people making up these “algorithms” not only have never taught k-12 but also know little if anything about teaching and learning.
In fact, many of these folks (mostly economists) actually pride themselves on not knowing anything about the subjects that they “crunch numbers” on.
In their bizarro world, it is actually advantageous NOT to know. I have read actual comments from economists to the effect that knowing something about a field “biases” one’s conclusions.
Well, yes, but one would hope as much. Knowing that energy is conserved in physical processes biases one’s conclusions in the interpretation of results, but that is hardly a bad thing. It allows one to reject clearly bogus “effects” that seem to violate energy conservation: Perpetual motion machines and the like.
The primary reason that some of these economists actually believe their own nonsense is precisely that they don’t know enough about the fields they are supposedly “analyzing” to recognize bogus results.
Me gusta esto.
Hell, don’t grade teachers at all!
Assess, evaluate, and/or judge their effectiveness which by definition takes a lot more of a narrative than just A, B, C etc. . . .
And let’s quit grading students also.
The absurdities and insanities that abound in grading practices are so hard to counteract because those grades are ingrained into, hammered into our brains from a very early age through to adulthood that grades seem “natural and normal”. It’s just the way things are.
And so was the Soviet system for almost 70 years, harming and killing many until it imploded. When will we stop the harms and injustices caused to all children by stopping the sorting, separating and ranking grade system? Which by the way was enshrined for elementary and secondary schools about the same time as the start of the Soviet system. Historical coincidences can be interesting, eh!
You can keep up with the amazing work of Cathy O’Neil at her blog: mathbabe.
I applaud Cathy for what she is doing, but I don’t understand why more mathematicians have not spoken out about stuff like VAM.
Why was the statistics department at Harvard virtually silent when their colleague Raj Chetty was peddling his Chetty picked nonsense?
I can understand that mathematicians probably don’t wish to weigh in every time an economist says something dumb (or they would not have time to do any math!), but I don’t understand how they can simply sit back and watch while teachers and others are being abused by “math”.
While the ASA did weigh in on VAM with their position paper, I believe they were far too weak in their criticisms.
Given their conclusions about the problems of VAM (eg about the unreliability of VAM) they could have been much more forceful in their statements: eg, “VAM should not be used to evaluate individual teachers — period.”
Instead, they left enough room for some proponents of VAM to claim that ASA found no problems with VAM that could not be addressed with more data (eg, over several years).
Is this algorithm derived from genetics and the cattle industry? I’ve read that somewhere but don’t really know if it’s true or not. Really not a way to grade anyone IMHO… and teachers shouldn’t be graded anyway (evaluated yearly… yes).
“Arne Duncan (@arneduncan): Congrats to @KellyGonez & @nickmelvoin for winning the chance to serve all the schoolchildren in LA! Thanks for having the courage to lead!”
except the 84% who attend public schools. They weren’t mentioned at all in the charter campaign.
As usual in ed reform, public schools were treated as an inconvenient barrier to privatization. If they’re mentioned at all it’s in an unfavorable comparison to the totally awesome charter sector.
Congratulations to the 16% who attend the schools we support! Tough luck to those other kids who are in the yucky “government” schools we disdain.
Congrats to @KellyGonez & @nickmelvoin for winning the chance to serve all the billionaires… I mean schoolchildren in LA. Thanks for having the dollars… I mean courage to lead.
If the academics who promoted VAM believe in it, why don’t they apply it to professors?
We could have students take the SAT or ACT every year in college and see which professors “add value”.
Why apply it to only K-12? They should be lining up to be measured.
Good idea.
And college tenure decisions should also be based on VAM
If their students fail, it’s “No tenure! No tenure for you!” (The Tenure Nazi)
Of course, the reason that the vast majority of academics have said not one word about VAM is precisely that it does not affect them.
As long as it is just K-12 teachers, they don’t give a damn.
“Whoever participated willingly in this sham exercise should do penance.”
No, they should be jailed for fraud and pay restitution to those harmed.
Agreed
Penance is for those who did something unintentionally and are truly sorry.
The people behind VAM did it quite purposefully and are not at all sorry for the havoc they wreaked. In fact, havoc and disruption were the intended outcomes. Firing teachers was the whole point.
I’d like to see all the teachers fired by VAM get together and go after Bill Gates with a billion dollar class action suit.
At the very least, it would be a way of forcing Gates to release his and the Gates Foundation’s emails and other correspondence with Arne Duncan regarding VAM.
Unfortunately, accountability is for us little folks. People like Gates can use people and then toss them away. I would love to read those e-mails, but I doubt we ever will.
SDP,
I remember the NY Times article about the Chetty study, in which one of the authors said the lesson of the study was “Fire teachers sooner, rather than later.”
I don’t think anyone was looking at the unintentional (or maybe it was intentional) results. Forcing probationary teachers out because their score was so low they could not be saved. What happened to evaluations to make you a BETTER teacher? Sorry, this one is not gonna make it, let’s cut them loose now.
Nor was anyone thinking the top 20% were going to walk out the door in protest because they could see it was all BS. So much for the career teachers. Teachers are leaving because of low pay. Well yes, but VAM is the straw that broke the camel’s back.
It was Chetty’s colleague Friedman who made the comment that “The message is to fire people sooner rather than later.”
That from a fellow who has tenure at Harvard and can never be fired no matter how dumb his statements are.
Lucky for him. Otherwise, he would have been fired long ago.
When VAM was first used, I remember saying to my friends, “Teachers will just go to court” but for a long time, no one did. And then the New York teacher and her husband took action and that seemed to be the beginning of the end. When people speak out against a great injustice, as those two did, they help so many other people.
The really evil part of this whole thing is that a person doesn’t need to know math to know how ridiculous VAM is. Anyone who has children or reads a newspaper should know that standardized test scores reflect the socio-economic background of the student. In order to evaluate the influence of the teacher, a skilled professional would have to evaluate each child individually in the fall, and then follow that child’s progress on school-based instruction throughout the year. Even that might not get accurate results because we all know that some students get much help from parents and tutors on schoolwork, while others get nothing. In the end, a teacher should be evaluated just as other professionals are, by their peers.
Why haven’t university professors spoken out more about this? Once, when I pointed out to a researcher that the standardized tests scores he used weren’t reliable for his study, he said, “Oh, we know that but it’s all we have.”
Did the journalists of the Los Angeles Times know that those test scores did not reflect the competence of the teacher? If so, I hope they carry the guilt of that poor teacher Rigoberto, who killed himself after he was publicly labeled as a low-performing teacher by the newspaper. Ironically, he was one of those teachers who devoted his life to his students. It still makes me so angry to think about it and I wish his family had sued.
Sad.
During the early days of VAM, I also thought there would be a ton of lawsuits. Teachers and their unions seemed to be caught in some type of paralysis in the early VAM days. I thank the Ledermans for their courage to be the first to expose the lies of VAM. The sad part is that the unions were complicit with Gates’ CCSS and VAM. The unions were getting paid by Gates as they sat at the table and gave credence to Gates’ mechanism for destroying teachers’ careers. I found an old article that Mercedes Schneider wrote in 2013-14. Read it and weep!
http://www.huffingtonpost.com/mercedes-schneider/nea-aft-common-core-and-v_b_4252679.html
I first realized the unions were “bought off” when they deserted the DC teachers.
“It’s stupid, but it’s all we’ve got”
We know that VAM’s a sham
But that is all we’ve got
Our work is simply spam
And science it is NOT
But that is all you’ll get
Till better comes along
It’s stupid, you can bet
But Stupid is our song
I knew William Sanders, one of the purveyors of VAM. I did not know him well, but my conversation with him convinced me that he thought his system foolproof. Political leaders are to blame for foisting this system upon the public, not statisticians like Sanders. I believe he erred in his assumptions, as SDP pointed out above.
I once asked him how his system accounted for differences in opinion about what was appropriate for a child to learn, what we now call standards. He was completely unable to even process the idea that two history teachers might disagree about what needed to be taught. So SDP is right. Beginning with improper postulates leads ultimately to silly conclusions.
No matter whether you are honest about your intentions or committed to all manner of crime, reasonable discourse in government is supposed to weed out the stupid and dishonest ideas. But there has not been reasonable discourse for a long time where education is concerned. Where are we going?
If I may add a qualifier to your statement, Roy: “for differences in opinion about what was appropriate for a child to learn, what we now WRONGLY call standards.”
The word standard is in the top 1000 most used words in American English, and the Merriam-Webster online dictionary gives the following definitions:
Standard
1: a conspicuous object (as in a banner) formerly carried at the top of a pole and used to mark a rallying point especially in battle or to serve as an emblem
2a: a long narrow tapering flag that is personal to an individual or corporation and bears heraldic devices b: the personal flag of the head of state or of a member of a royal family c: an organization flag carried by a mounted or motorized military unit d: banner
3: something established by authority, custom, or general consent as a model or example: criterion <quite slow by today’s standards>
4: something set up and established by authority as a rule for the measure of a quantity, weight, extent, value, or quality
5a: the fineness and legally fixed weight of the metal used in coins b: the basis of value in a monetary system
6: a structure built for or serving as a base of support
7a: a shrub or herb grown with an erect main stem so that it forms or resembles a tree b: a fruit tree grafted on the stock that does not induce dwarfing
8a: the large odd upper petal of a papilionaceous flower (as of the pea) b: one of the three inner usually erect and incurved petals of an iris
9: a musical composition (as a song) that has become a part of the standard repertoire
For the purposes of this discussion, obviously definitions 1, 2, 5, 6, 7, 8, and 9 do not concern us. It is the somewhat similar and perhaps inter-confusing definitions of 3 and 4 that interest us.
As mentioned above, before NCLB the term standard, as used in the individual states’ curriculum standards and even today in curriculum standards promulgated and promoted by subject-area organizations such as the National Council of Teachers of Mathematics or the American Council on the Teaching of Foreign Languages, fell (and falls) under definition three. Those standards were never meant to be used as “a rule for the measure of a quantity, weight, extent, value, or quality,” as in definition four, but as a model for teachers to use. Confusing indeed!
Another way to look at the concept of standards is that there are two accepted types of standards, metrological and documentary.
Metrology is the science of measurement and a metrological standard “is an object, system, or experiment that bears a defined relationship to a unit of measurement of a physical quantity. Standards are the fundamental reference for a system of weights and measures, against which all other measuring devices are compared. Measurements are defined in relationship to internationally-standardized reference objects, which are used under carefully controlled laboratory conditions to define the units of length, mass, electrical potential, and other physical quantities.”
A documentary standard “is a document established by consensus and approved by a recognized body, that provides, for common and repeated use, rules, guidelines or characteristics for activities or their results, aimed at the achievement of the optimum degree of order in a given context.” Many governmental departments promulgate documentary standards, for example the Food and Drug Administration (FDA) or the Environmental Protection Agency (EPA), while at the same time being the certifying agent to ensure that the standards are followed. The ISO promulgates international standards but is not the certifying agency; other agencies certify companies’ compliance with its standards. The ISO has strict rules for making and issuing standards. The key principles in standard(s) development:
1. ISO standards respond to a need in the market.
ISO does not decide when to develop a new standard, but responds to a request from industry or other stakeholders such as consumer groups. Typically, an industry sector or group communicates the need for a standard to its national member who then contacts ISO.
2. ISO standards are based on global expert opinion.
ISO standards are developed by groups of experts from all over the world that are part of larger groups called technical committees. These experts negotiate all aspects of the standard, including its scope, key definitions and content.
3. ISO standards are developed through a multi-stakeholder process.
The technical committees are made up of experts from the relevant industry, but also from consumer associations, academia, NGOs and government.
4. ISO standards are based on a consensus.
Developing ISO standards is a consensus-based approach and comments from all stakeholders are taken into account.
The Common Core State Standards (CCSS) and all other state educational standards might be considered documentary standards, but their development did not follow the formal protocols and processes outlined by the ISO or by government agencies.
In addition to that, and perhaps even worse, the proponents of these standards claim that the CCSS are standards against which ‘student achievement’ can be measured. In doing so, educational standards proponents claim the documentary standard (definition three) as a metrological standard (definition four). In doing so they are falsely claiming a meaning of standard that should not be given credence.
I would agree with all of this and have written such sentences myself, though not with your clarity or reason. I can recall chafing at the idea of “behavioral objectives” when I was young and thought that teaching had to do with ideas. I also had a friend who told of being introduced to Bloom’s taxonomy by Bloom himself, although he did not remember who Bloom was. He did remember a crusty old teacher storming out of the meeting with comments about Bloom’s canine heritage. I spotted the subject matter as Bloom, and since the friend recalled the professor was from U of Chicago, it seems that Bloom was called out for these ideas in his day.
It’s from Ch. 6 Standards and Measurement of my forthcoming book. Will let all know when I get it from the printer.
As far as Bloom, first it is Bloom et al., not just him. I have also read/heard that Bloom himself condemned the way his taxonomy was being used by most educators.
“7a: a shrub or herb grown with an erect main stem so that it forms or resembles a tree”
Undoubtedly the definition of standard used by George Bush for No Child Left Behind
But, on a less serious note, I look forward to your book, Duane.
“Foolproof or Proof of fool?”
“Foolproof” is a foolish goal
When people are involved
The proof of fool is that he holds
That “foolproof” can be solved
“That foolproof is resolved”
Is VAM really on the way out? I believe it. Junk science can’t last forever. But if VAM disintegrates, what ever are we going to call you, DAM?
DAM I am
And DAM I’ll be
If VAM be damned
I’ll still be me
Any other questions?
Inspired by Kristina’s comment on Mathbabe’s site
“The VAM behind the curtain”
The man behind the curtain
Is faking it, for certain
He uses VAM
A mathy sham
And Toto is alertin’