Daniel Katz of Seton Hall University explores the meaning of Sheri Lederman’s victory in court over New York State’s teacher evaluation system, the one promoted by former Commissioner John King (now Secretary of Education). He shows the complicated statistical calculations that produce “VAM” ratings and growth scores. Bruce Lederman, the attorney representing his wife in the proceedings, called them “a statistical black box.” It is not clear that anyone understands these models or can claim that they accurately measure teacher quality. This case is probably the first in the nation where a teacher has successfully overturned her rating.
Katz writes:
Not only are these models difficult to impossible for teachers and most administrators to understand, they simply do not perform as advertised. Schochet and Chiang, in a 2010 report for Mathematica, found that in trying to classify teachers via growth models, error rates as high as 26% were possible when using three years of data, meaning one in four teachers could easily be misclassified in any given evaluation even if the evaluation used multiple years of data. Dr. Bruce Baker of Rutgers wanted to test the often floated talking point that some teachers are “irreplaceable” because they demonstrate a very high value added using student test scores. What he found, using New York City data, was an unstable mess where teachers were much more likely to ping around from the top 20% to below that and back up again over a five year stretch……
Equally important as the court’s recognition of arguments against value-added models in teacher evaluation, is the ground that was broken with the ruling. Ms. Lederman’s attorney (and husband), Bruce Lederman, sent out a message reported by New York City education activist Leonie Haimson which said, in part, “…To my knowledge, this is the first time a judge has set aside an individual teacher’s VAM rating based upon a presentation like we made.” The significance of this cannot be overstated. For years now, teachers have been on the defensive and largely powerless, subjected to poorly thought out policies which, nevertheless, had force of policy and law on their side. Lederman v. King begins the process of flipping that script, giving New York teachers an effective argument to make on their behalf and challenging policy makers to find some means of defending their desire to use evaluation tools that are “capricious and arbitrary.” While this case will not overturn whatever system NYSED thinks up next, it should force Albany to think really long and hard about how many times they want to defend themselves in court from wave after wave of teachers challenging their test-based ratings.
Great result here based on the Ledermans’ courage, intelligence, tenacity, and principled actions. Is there a case to be made seeking damages for the pain and injury SED has caused victims of the VAM?
An additional question about VAM that hasn’t received much attention is how chronic absenteeism affects scores. When a teacher never has the same group of kids in a particular class each time they meet due to high chronic absenteeism in the school, the kids will do poorly on any and all tests because they have not been taught the material. An additional burden is placed on the teacher, who has to somehow accommodate the absentees when they do come to class while continuing instruction with the rest of her students, which also takes away from their instruction. The inquiries I’ve made on this topic revealed that no one really knows how bad the effect is, due to the black-box nature of VAMs. Even VAM experts don’t know, because they have been unable to see exactly what any adjustments for absenteeism might be, or whether they have any basis in reality or any validity or accuracy as adjustments.
By 8th grade testing season, it is not uncommon for a student to have missed 100 to 300 periods of math instruction. No amount of remediation, extra help, or extra time can make up for this loss.
Other factors excluded from VAM formulas:
Student teachers, maternity leave, and long term teacher illness.
The main factors missing in VAM formulas:
validity and reliability.
In a faculty meeting about 8 years ago, we were all told to skim through a 15-16 page article which extolled the wonders of the higher scores being recorded at this or that mostly non-dominant-culture school. It wasn’t until the very last page that there was a short caveat stating that ALL statistical information had been presented based upon a student’s 80% attendance rate. No one in the room brought that up as we were all simply exhorted to go out there and get our scores up, attendance rates be damned.
Every teacher who is evaluated on the basis of students’ standardized test scores should vigorously oppose the evaluation, citing the authoritative “Statement on Using Value-Added Models (VAM) for Educational Assessment” made by the American Statistical Association (ASA) that, quoting The Washington Post, “slammed” VAM. Teachers should be vigorously backed up in this opposition by their unions because the ASA Statement completely shreds the phony foundations of VAM.
A copy of the seven-page ASA Statement should be posted on the union bulletin board at every school site and should be explained to every teacher by their union at individual site faculty meetings so that teachers are aware of what it says about how invalid it is to use standardized test results to evaluate teachers.
Even the anti-public school, anti-union Washington Post newspaper said this about the ASA Statement: “You can be certain that members of the American Statistical Association, the largest organization in the United States representing statisticians and related professionals, know a thing or two about data and measurement. The ASA just slammed the high-stakes ‘value-added method’ (VAM) of evaluating teachers that has been increasingly embraced in states as part of school-reform efforts. VAM purports to be able to take student standardized test scores and measure the ‘value’ a teacher adds to student learning through complicated formulas that can supposedly factor out all of the other influences and emerge with a valid assessment of how effective a particular teacher has been. THESE FORMULAS CAN’T ACTUALLY DO THIS (emphasis added) with sufficient reliability and validity, but school reformers have pushed this approach and now most states use VAM as part of teacher evaluations.”
The ASA Statement points out the following and many other failings of testing-based VAM:
> “VAMs typically measure correlation, not causation: Effects – positive or negative – attributed to a teacher may actually be caused by other factors that are not captured in the model.”
> “Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions.”
“System-level conditions” include everything from overcrowded and underfunded classrooms to district- and site-level management of the schools and to student poverty.
Fight back! Never, never, never give up!
VAM will single-handedly cause a shortage of Math and Language Arts teachers, along with any subjects of selected grades which use VAM. VAM is brutal. VAM reminds me of a vampire. VAM has drained the life out of the joy of teaching. I’m relieved and beyond thankful to be able to walk away from education next year. The VAMpire will be looking hard for me, but it will not be able to find me. This profession does not deserve the wonderful young teachers entering into it. Their sad, bewildered, discouraged faces break my heart.
There is a dark side to this victory. I think that the psychopathic billionaire oligarchs funding the autocratic, for-profit, opaque and often fraudulent corporate public education reform movement will start spending more money to get their own judges appointed or elected until they control as many courtrooms and verdicts as possible, like the judge in the first Vergara decision, for instance.
Did any mainstream news organizations report on this landmark case? I certainly did not read about it in the LA Times.
One might think that at least NPR would be interested in a story like this.
But then one would be wrong.
Because Bill Gates undermines…I mean underwrites them.
The case was reported in the NY Times, Wall Street Journal, Washington Post and several other places. Sheri also cosigned a letter with Randi Weingarten to help get out word. You can Google Sheri Lederman and you will see many articles.
National Propaganda Radio will continue to ignore truth in education.
They will continue to ignore the voices of experienced teachers. And they will continue to happily cash his checks.
The formula F=ma of Newton was the result of (hence verified by) hundreds of years of experiments. It’s a much simpler formula than VAM’s, and it’s in a field (physics) where we have about 2000 years of evidence that math can be used to accurately describe what’s happening.
So now, we are supposed to trust the VAM formula, the result of a few months of pondering, so much that we immediately dare to use it for all the 2 million teachers in this country. Show me just a single other formula that plays such a sweeping role in any society. Better yet, show me the formula Microsoft uses in evaluating its employees.
The input of the VAM formula is test scores. So even before the effectiveness of the formula is discussed, the validity of the input data needs to be discussed.
A validity check means answering the questions “Do test scores reflect knowledge? Do they have a causal relationship to teachers’ work in the classroom?”
Thinking about the first question is way overdue, anyway. We have been using written tests in schools for 100 years. Why? Because they really reflect knowledge, or because as class sizes grew, it became impossible to develop enough of a relationship with kids to evaluate their knowledge satisfactorily?
“The Fudge Factory”
A factory for fudge
Of measures and of weights
Reality won’t budge
To VAM and Billy Gates
“Do test scores reflect knowledge?”
I took some CLEP tests and other tests which were allowed in lieu of courses that I had to take to get certified to teach in NY. I took Educational Psy. and Human Growth and Development. I never had a class in either one of these subjects and I didn’t study AT ALL. I mean AT ALL. And I scored extremely high on each test. Why? Do my test scores reflect my knowledge of the subject area? In my opinion, no. I had read some psychology books previously but nothing about human growth and development. I guess the closest I came to any formal training in that was high school biology class. I may be lucky or a good test-taker or able to access background knowledge, but I was certainly NOT very knowledgeable about those subjects in my estimation. I also took a Reading in the Elementary School test and scored an “A” without any studying at all. I don’t think I know the first thing about teaching reading in the elementary school!
Test scores must reflect innate intelligence combined with life experience plus a dash of the “test taking” gene.
You are describing how and why the level and success of parent education (not necessarily income) correlates so closely to test scores of their children.
There is something to be said about a student who is flat out “smart”.
Hi Rage,
You got it. My father was a doctor and my mother started out as a teacher and then became a lawyer later in life.
MKA
I really enjoy your contributions to this blog. You make your points so cogently using clear examples and well written plain-speak. Keep it going!
Some of the people who do very well on tests could not apply what they know in a real world situation if their life depended on it, particularly if the situation is one that was never talked about in their classes or in the texts they have read.
On the other hand, there are people who do very poorly on tests but are very good at sizing up and dealing with new situations.
I suspect that many folks from TFA fall into the former category. They undoubtedly have a good background in the subjects they studied, but that does not mean they can apply the information in a classroom setting. It takes much more than 5 weeks of summer camp to acquire the latter expertise.
Assuming otherwise is like expecting that someone who got a high score on the med school admissions test is ready to be a doctor with just a few weeks of summer camp. They may know what a “myocardial infarction” is but in all likelihood, have no clue how to deal with it when it happens.
So true, SDP. So standardized tests are excellent instruments for identifying how good one is at taking standardized tests. For certain technical fields, standardized test results must have some value.
I would guess that a high SAT math score is somewhat useful for screening applicants into an engineering program. No?
“I would guess that a high SAT math score is somewhat useful for screening applicants into an engineering program.”
I doubt it, unless all you want from an engineer is to follow directions patiently and precisely.
Thank you, Rage. And I do it all without the “benefit” of a Common Core “education.” Sheesh! I wonder how that happened!
“First they standardize you, then they test you, then they VAM you, then you win” — Mahatma Lederman
This is not related, but it really is. Zuckerberg is in the middle of a controversy about the perceived liberal bias of Facebook. Zuckerberg blames an algorithm which is biased. He claims algorithms are like recipes. You put in the elements you want to get the desired results. That’s why VAM is designed to cause turmoil for teachers. http://thinkprogress.org/politics/2016/05/13/3777772/facebook-media-bias-conspiracy/