In New Mexico, District Judge David K. Thomson issued a preliminary injunction against the use of the state’s teacher evaluation system, which tied consequences for teachers to student test scores. Unfortunately for the state, the research, the facts, and the evidence were not on their side.
Audrey Amrein-Beardsley was the expert witness against the New Mexico Public Education Department’s value-added teacher evaluation system, and she explains here what happened in court. Her account includes a link to the judge’s full ruling.
She writes:
Late yesterday [Tuesday], state District Judge David K. Thomson, who presided over the ongoing teacher-evaluation lawsuit in New Mexico, granted a preliminary injunction preventing consequences from being attached to the state’s teacher evaluation data. More specifically, Judge Thomson ruled that the state can proceed with “developing” and “improving” its teacher evaluation system, but the state is not to make any consequential decisions about New Mexico’s teachers using the data the state collects until the state (and/or others external to the state) can evidence to the court during another trial (set for now, for April) that the system is reliable, valid, fair, uniform, and the like.
As you all likely recall, the American Federation of Teachers (AFT), joined by the Albuquerque Teachers Federation (ATF), last year, filed a “Lawsuit in New Mexico Challenging [the] State’s Teacher Evaluation System.” Plaintiffs charged that the state’s teacher evaluation system, imposed on the state in 2012 by the state’s current Public Education Department (PED) Secretary Hanna Skandera (with value-added counting for 50% of teachers’ evaluation scores), is unfair, error-ridden, spurious, harming teachers, and depriving students of high-quality educators, among other claims (see the actual lawsuit here).
Thereafter, one scheduled day of testimonies turned into five, in Santa Fe, that ran from the end of September through the beginning of October (each of which I covered here, here, here, here, and here). I served as the expert witness for the plaintiff’s side, along with other witnesses including lawmakers (e.g., a state senator) and educators (e.g., teachers, superintendents) who made various (and very articulate) claims about the state’s teacher evaluation system on the stand. Thomas Kane served as the expert witness for the defendant’s side, along with other witnesses including lawmakers and educators who made counter claims about the state’s teacher evaluation system, some of which backfired unfortunately for the defense, primarily during cross-examination. [Kane, an economist, has been the chief research advisor to the Gates Foundation on teacher evaluation.]
Open the post to see her many links, her analysis of the decision, and the many local articles about it.
The state, not surprisingly, called the decision “frivolous” and “a legal PR stunt.” It claimed that it would continue doing what the judge said it was not allowed to do. I think a judge’s order trumps the will of the New Mexico PED.

What students really need is their teachers fighting for their jobs in courts of law. (snark alert)
VAM the politicians.
Theory on why Cuomo’s (and Duncan’s) teacher ranking schemes aren’t popular with the public:
“One of my theories is that the governor overestimated public dissatisfaction with teachers and schools.
We’ve all had bad teachers, but for many of us, the good teachers outnumber the bad. Most of us also know teachers personally, and count them among our friends and relatives.
We’re probably more inclined to listen to their opinions on education than, say, Bill Gates or Arne Duncan — two big proponents of education reform.”
I wonder if it’s because they have so surrounded themselves with ed reform “movement” people that they (literally) don’t hear anything else, so assume everyone shares the opinions of that group. It seems to take so long for public disapproval to penetrate, which is odd since we’re talking about public schools- the actual people who use them.
http://www.dailygazette.com/weblogs/foss/2015/dec/02/1202_foss/
Your wondering is interesting in light of a person I heard recently on the radio suggesting that about 150 is the average limit on an individual’s circle of friends. I was driving, and do not remember the details, but I think they called that the monkeysphere. Given that, it is no surprise that any public official would have a limited group from which he or she takes advice. Thus the natural isolation of humanity “doth make nearsighted cowards of us all”. Perhaps that perverts the idea I heard about, and Shakespeare too. At least he would forgive me.
Yvonne, I agree! What is amazing is that police officers and teachers get all the scrutiny of being public servants, yet the highest-ranking public servants (politicians) are under no scrutiny or accountability. Yes, it is possible to say that politicians’ every move is watched, but the only criticism they get is in the op-ed sections. It is extremely rare that anyone takes their representatives to court to sue them for being ineffective.
To the politicians and the Gates Foundations and the Waltons and the like, the biggest hurdle to reform was the teachers. Setting up models to get teachers out of the way seems to have been the entire plan all along. The thinking was that “those who can’t, teach” meant that teachers are unintelligent, and that it would be an easy process to get those stupid, unintelligent teachers out of the way. Be careful in judging the intelligence of a profession based on the rate of pay. Teachers can light any profession up with rational discourse that can put anyone in their place.
You are right. VAM was intended to keep teachers in their place.
And folks like Gates, Duncan, Obama, Chetty and Kane and other (Hee) Haaaavid types completely misunderestimated teachers.
“Achilles’ VAM Boot”
VAM’s a steel-heeled boot
To guard against an arrow
From teachers who won’t root
For testing straight and narrow
“the state is not to make any consequential decisions about New Mexico’s teachers using the data the state collects until the state (and/or others external to the state) can evidence to the court during another trial (set for now, for April) that the system is reliable, valid, fair, uniform, and the like.”
In other words, “until hell freezes over”.
A toast to Audrey Amrein-Beardsley and her colleagues for a job well done. Their accomplishment will be sung for ages to come:
“The night they drove Statricksy down” (parody of “The Night They Drove Old Dixie Down”)
Thomas Kane is my name and I drove on the VAMville train
‘Til Audrey Beardsley came and tore up the tracks again.
After the ASA* paper knife, we were hungry, just barely alive.**
By twenty-fourteen, Rich man had fell.
It’s a time I remember, oh so well.
(*American Statistical Association, **only had $ 45 million from the “Rich man”, Bill Gates)
The night they drove statricksy down
And all the bells were ringing,
The night they drove statricksy down
And all the people were singing
They went, “Na,na,na.na,
Na na na na na na na na na.”
Back with my colleague, Raj Chet-ty, when one day he called to me,
“Thomas, quick, come see, there goes the Gastesly Billee!”
Now I don’t mind I’m choppin’ stats, and I don’t care if I’m paid by the brats
You take what you need and leave the rest,
But they should never have taken the VAMmy best.
The night they drove statricksy down
And all the bells were ringing,
The night they drove statricksy down
And all the people were singing
They went, “Na,na,na.na,
Na na na na na na na na na.”
Like my “Father”*** before me, I will work the VAM
And like my colleague before me, I took a junk-stat stand.
He was just 34, proud and brave,
but the ASA put him in his grave.
I swear by the mud below my feet
You can’t raise a Kane back up when he’s in defeat
(***Eric Hanushek, Father of VAM)
The night they drove statricksy down
And all the bells were ringing,
The night they drove statricksy down
And all the people were singing
They went, “Na,na,na.na,
Na na na na na na na na na.”
The night they drove statricksy down
And all the bells were ringing,
The night they drove statricksy down
And all the people were singing
They went, “Na,na,na.na,
Na na na na na na na na na.”
Wow. A judge who has common sense. I’m not sure why it takes a judge to tell the State that their accountability standards are unfounded.
This line from the posting struck me:
“Thomas Kane served as the expert witness for the defendant’s side, along with other witnesses including lawmakers and educators who made counter claims about the state’s teacher evaluation system, some of which backfired unfortunately for the defense, primarily during cross-examination.”
An excellent example of why rheephormsters don’t want to engage in actual (and public) civil conversation about their weapon of choice: VAM & its close relations.
Genuine give-and-take is toxic to their drive to garner as much $tudent $ucce$$ as possible. Reminder: many times in the past the owner of this blog appeared on “panels” with a host and others where she was outnumbered by rheephormistas not just two or three to one but sometimes four to one. *The host was there to facilitate rheephorm talking points and functioned, in effect, as a pro-rheephorm salesperson.*
Coincidence? Unintended consequence? Now what are the odds of that?
Go figure…
😎
“Whipped VAM”
Can’t debate a sham
Best to keep to script
Can’t revive a VAM
After it’s been whipt
The PED (Skandera) “claimed that it would continue doing what the judge said it was not allowed to do.” Skandera has NEVER been willing to listen to Teachers, Superintendents, Parents, or Legislators when it comes to education. Who in their right mind thinks Skandera gives a flip what any Judge says, or about any injunction placed upon her and the PED? Skandera fully believes that she and only she has all the answers to fixing the educational troubles in New Mexico. Skandera believes she is god’s gift to education and her god is Jeb Bush, who placed his blessing on Skandera when she worked for him in Florida. Note I did not use a capital letter “G” for “god”. I can’t wait for April to get here.
Skandera worked for Jeb Bush and before that for Arnold Schwarzenegger.
Skandera was New Mexico’s Secretary of Education, DESIGNATE for almost 4 years. The NM State Senate would not confirm her because the NM Constitution states that this position must be filled by a true educator and Skandera has never taught. Her experience as a volunteer at a Catholic School teaching abstinence does not count. Her Koch-funded boss, Gov. Susana Martinez campaigned BOTH times promising the Florida Model of education reform. This includes third grade retention, teacher evaluations using VAM and school grades.
To be fair, Skandera is not only the darling of Jeb and Arnold but she has been praised often by Duncan and Obama.
I have no idea why Chomsky doesn’t like lawsuits. This post shows that some could help civil rights issues. We do want laws that protect the public.
Since the basic problem was that VAM, a scientific method, was used to evaluate education, which is not science, the best win in this direction would be if a judge declared
“Science cannot be used to evaluate non-science”.
VAM is ‘scientific’? No, actually science takes into account the margin of error and doesn’t claim demonstrable results unless that margin is small. Science also defines terms very clearly, and doesn’t claim any relevance beyond those definitions. The only thing ‘scientific’ about VAM is that it uses math. Math isn’t science, it’s a tool. Like a hammer, it can drive a nail or bash a skull.
Feyman is spinning out of control in his grave like a nuclear reaction gone rogue upon hearing that “VAM is a scientific method”.
Feynman not Feyman
I am sure Virginia SGP will pipe up any minute now to explain how the judge got it wrong and must be just like the one that fined him… VAM may be hemorrhaging and starting its death throes. The fight is not over but we know we can win. Science can identify junk science. When theories do not match the evidence, science changes the theory; the ideologue persists and ignores the evidence.
VAM isn’t scientifically rock solid? Sporting all those numbers and stats?
Next thing you know, you’ll be telling us that you can’t turn base metals into gold with a few well chosen abracadabras…
😎
KrazyTA,
I could be wrong, but I believe that in the case of VAM, the proper spell is “VAMbracaDAMbra”
But we’d have to consult with the Wizard of VAM (Raj Chetty) to verify that.
SomeDAM Poet:
There you go again obliquely referring to that long-discredited Campbell’s Conjecture business again!
😏
Just ask Dr. Raj Chetty. Viagra Trial. Er, Vergara Trial. Well, in any case, just remember to call his fanboys and fangirls the next time you are in Los Angeles and in need of police assistance in a neighborhood patrolled by the phantom “ghost cars” of the LAPD.
Because just like the stability and validity and reliability of VAM—you are gonna be waiting until pigs can fly and a certain hot spot freezes over before someone comes to give you a hand.
Because when it comes to rheephorm “thinking” and the mad dog pursuit of $tudent $ucce$$, you’re on your own, good buddy.
😎
“Viagra trials”
Ha ha ha ha!
You are one KrazyTA.
If your VAM excitement lasts for more than 4 hours, consult a medical doctor.
Audrey Amrein-Beardsley has been an incredible expert witness in this lawsuit. AFT-NM president, Stephanie Ly along with my local, Albuquerque Teacher’s Federation, President Ellen Bernstein have worked brilliantly and tirelessly to bring justice to public education in New Mexico. Skandera’s punitive evaluation scam has taken a toll on teachers and students in NM. Skandera announced just today, one day after the Judge’s injunction, that the Public Education Department was going to continue with the flawed evaluation and referred to the lawsuit as a PR stunt of AFT-NM and ATF.
Diane, thanks for the links. Very helpful.
First, I will never support VAM systems that are not implemented well. In the opinion letter, it’s clear that teachers are being evaluated on subjects they did not teach (p29-32). The court basically said that VAMs are “sound” but must be implemented in a fashion that is consistent with the sound modelling. In other words, you can’t have a perfect theory and then rate teachers on 6 students here or 27 students’ scores from a class you didn’t teach. I’m not sure anybody would disagree with that.
In fact, New Mexico had voluntarily halted its VAM rating system prior to the judge’s ruling to ensure that it was being applied consistently and appropriately. The injunction merely put that halt into law until the full trial can be heard. Based on the transcript, I think they need to have very clear specifications of when and how VAMs should be applied.
But here’s the key point. Look at the transcript beginning on page 27 at the testimony of William Soules who had recently retired as a teacher. He states that he was rated ineffective in 2013-2014 because there was NO student test data. In other words, the observation ratings gave him bottom-of-the-barrel marks. But when the AP scores came back over the summer, his students’ scores were so high, he was awarded $5000 as a bonus! Only then would the administrator relent and slightly increase his observation scores (remember, there was no student data for VAMs in 2013-2014) so that he had a lower level “effective” rating. Mind you, the administrator refused to look at the objective test data and give him an outstanding rating. The admins trusted their clearly inaccurate “observations” over the obvious results from the AP Psychology exams.
This is why VAMs are needed. In the case that your esteemed union used in the trial, the data vindicated a teacher over (likely retaliatory) poor ratings by an administrator. Where in the world does Audrey talk about that!!!!
vsgp,
“. . . to look at the objective test data. . . ”
There is no such thing as “objective test data” in regards to the teaching and learning process. Those standardized tests which you so idolize are false “objective” gods. It would be wise of you to realize that itty bitty fact.
Virginia, your point is well-taken.
There are indeed vindictive admins who’d love to get rid of a squeaky wheel or oust a teacher so that they can install a mousy (though ineffective) hireling to be loyal and beholden for years regardless of whether or not the children learn. I know this, I’ve seen this in the field spanning three decades.
That said, we must be careful of creating a false dichotomy:
While bad administrators indeed hurt the teaching profession AND children by making poor hires, kicking out powerful teachers who wield too much sway with the community, and filling positions with relationship prospects and politically connected sloths, and yes, bad admins must be kept in check, VAM still may not be the answer.
Certainly, the standardized tests are gamed like a boxing match.
Boxing is the “sweet science”, it’s extremely athletic, brutal but it’s not realistic. One can study the other fighter, see his/her weaknesses and strengths and prepare for a match months in advance.
It’s not real fighting though, and it’s often worthless on the street versus a wrestler or an MMA stylist. In short, it’s not a great metric.
Here’s an argument against VAMs: We can all teach to a test, but that doesn’t mean students will be prepared for the creative challenges of life.
p.s. If you haven’t figured it out I’ll spell it out for you, as when Cuomo noted that fewer than 2% of teachers failed according to his APPR rigamarole: It’s not the teachers!
VirginiaSGP. You REALLY need to do your homework before you reply to really anything anymore…. to also try to preserve what little credibility you might have left. Why don’t you take a look at the full testimony of Soules before you make one comment about one piece on one page of the Judge’s final 77 page ruling, before you also attempt to generalize to the hilt from one interpretation of one claim in support of your a priori claims/beliefs. I mean, seriously. Email me for the transcript if you actually want to make a more defensible claim, in this case or even someday for that matter. It is public after all.
Dr. Amrein-Beardsley, if you have the full transcript, I would appreciate that (freethatdata@gmail.com). Thanks.
As to the case, maybe you can explain the following:
1. The judge referred to a “cafeteria” like plan. By that, I understood him to mean different teachers were evaluated on different components. If a teacher taught a subject with a statewide test, then those scores were used. If not, they were sometimes evaluated on related subjects that had statewide tests. Others who taught non-tested subjects were evaluated on completely different components altogether. Overall, the judge said VAMs were “sound” but just the implementation was questionable. Can you point to a page/line in the judge’s order that contradicts that assessment? Do you not agree that the “cafeteria” style implementation played a large role in his decision?
2. Do you dispute that Soules AP test scores saved his evaluation from being “ineffective”? Are you suggesting the previous system based on observations (which would have resulted in an ineffective rating for him) is “fair”, “consistent” and “superior” to that of the VAM-based model you criticize? I’m sure you have counter points but in an argument, generally one tries to refute the points made by the other side before espousing their own argument. You just ignored mine.
Btw, your affidavits in the NY Lederman case should be quite helpful in my Virginia case coming up. Thanks.
I will email the transcript asap. I have all transcripts for all days, but I opened that day for Soules to send it along and it only consists of one page. I’ve emailed the lawyer for the full copy and will email it as soon as I get it.
1. “Cafeteria” is situated in a state with a mandate that the system must be objective and uniform. There are many teachers throughout the state for whom the state is literally pulling whatever test they can “off of the shelf” to get them each pre/post test scores. Some are state tests, which is okay but not “valid” in the sense that these tests have never been validated to measure student growth upwards, not to mention teachers’ “causal” attributions to that growth over time. Setting that aside, the state is also using norm-referenced tests as well as district developed and “state approved” tests that require nothing more than somebody developing a test and submitting a statistic to justify its use. This is the “cafeteria” style system to which the judge is referring. To hold all teachers accountable, the state has grabbed at literally everything it can, despite educational measurement/validity standards, to get a pre/post test for calculations. And yes, teachers were also evaluated on other subject areas using other subject area tests if tests also weren’t available, as related. They all had/have test-based components, unlike other states.
On the other note, yes, the judge did write that he believes that “VAMs [are] ‘sound’ but just the implementation was questionable” in this state. I interpret that to mean, as both Kane and I testified, that the methodology behind them is sound, but not in terms of use in this state, especially given its implementation issues. It’s the most sound it’s going to get – this does not mean this is going to ever work IF one is to tie “defensible” consequences to it.
So, yes, the “cafeteria” style issue played an important role, as did all of the other five points listed in this piece – see again here: http://vamboozled.com/victory-in-court-consequences-attached-to-vams-suspended-throughout-new-mexico/ …in my opinion and as per my read of the judge’s Order. What also played a role that I did not mention in the post was the feds recent announcements re: their pending Student Success Act and its teacher evaluation revisions: http://edworkforce.house.gov/studentsuccessact/
2. I believe (after revisiting the judge’s Order and not Soules’s full testimony) that you have this preliminary assertion incorrect: “In other words, the observation ratings gave him bottom-of-the-barrel marks.” Him not having test scores, from what I recall, did not revert his evaluation scores to observational scores only. His VAM scores were never positioned to have saved him, and his AP scores were not based on VAMs as this, too, was an issue with him (e.g., being evaluated for teaching AP statistics using algebra tests). From what I recall, again, Soules’s testimony was more about the cafeteria menu issue above, the tests on which he was evaluated for subjects he did not teach (being primarily an AP teacher), and data errors evident throughout his evaluations. Again, I will email you the full transcript as soon as the lawyer gets back to me.
Yes, on your final note, using a similar observational system across teachers would be MUCH more in line with the objective requirement than all else that is happening on the VAM side of things in this state. Using a student/parent survey would do the same (with other methodological issues noted and guarantees needed before done well). It is the VAM system, and the state forcing the eligibility of 100% of its teachers that is going awry.
Thank you, VirginiaSGP, for being more professional this time in your response(s) than in the past. I am more than happy to discuss all of this if “we” can continue to discuss these issues in such a manner.
audreybeardsley writes: “the judge did write that he believes that ‘VAMs [are] “sound” but just the implementation was questionable.’”
How does a judge know that VAM is sound? Is he a statistician who knows better than the American Statistical Association?
And of course, the problem is always with the application of VAM in education, or, more generally, with the application of science to non-science.
Using VAM in education is like applying science to music: you certainly are free to tinker with it, and you may find interesting connections, but when you think you have a formula that shows that Mozart was a greater composer than Beethoven, you need to check your sense of reality and ambition.
Dr. Beardsley,
Let me first say that I always support:
1. Transparent publishing of margins of error, confidence intervals, etc. A little uncertainty is nothing to be scared of and both the public and scholars alike should be provided with that information.
2. Publishing the reliability factors of a given methodology. For example, how many teachers who were rated in the top 20% in one year were rated in each grouping the following year. Or of the bottom 20%. The metrics are not perfect.
3. Evaluating teachers only on subjects they taught. I do not support Florida’s implementation where history teachers appear to be measured on math/ELA scores.
4. Having a longitudinal exam to measure growth. In math and ELA, it’s pretty clear that students progress linearly through certain skills until the end of middle school. Not so with other subjects. If either the subjects do not lend themselves to growth or the tests can’t measure growth, then VAMs should not be used to evaluate that teacher.
5. Never using a “homegrown” assessment to generate evaluations. In Virginia, the teachers basically create their own student growth assessments since they completely blow off the requirement to use the statewide measures. The teacher might give a simple quiz at the beginning of the year and again at the end of the year to show “growth”. This quiz may never be evaluated by another teacher/admin nor is it required to be used by other teachers. Sounds like some of that was going on in New Mexico as well.
Many of these problems come down to the politicians not understanding the issues. It’s easier to pass a bill saying we will be “consistent” in evaluations than to tell the truth. And to be fair, the union supporters (of which I would respectfully say you are one) intentionally put in poison pills to these laws knowing that no scheme can ever measure every teacher of every subject objectively and consistently. Diane has basically admitted as much in that she supports laws/regulations that undermine the very nature of standardized testing.
My solution would be to admit that some subjects are more important than others and we need a completely different “market” for those teachers. One cannot survive in today’s world without being able to read/write or understand math. History/art/PE are all important but do not rise to the level of math and ELA. In addition, we can consistently and objectively measure student growth in K-8 in math and ELA. So we should be honest about objectively evaluating teachers of ELA and math on reliable VAMs. Poor teachers will then eschew those subjects. That’s a good thing. In return, teachers of math and ELA should receive higher wages since it’s a different “market” than that of history teachers. You would see self-selection.
The best teachers would rise to the challenge and teach math/ELA both because of its importance and the higher pay (e.g. $10K/year). The less effective teachers (remember that Diane doesn’t believe there are any of these out there so my definition of less effective equals “great” for many readers of this blog) would gravitate to subjects that can’t be objectively measured. They would be paid less and whine incessantly but such is the nature of the free market.
Do you disagree with this approach?
Thanks for the transcripts btw, whenever you can send them.
I said that “the judge did write that he believes that ‘VAMs [are] “sound.”’” I agree that the methods behind the madness, for lack of a better term, are “sound,” but that doesn’t mean that they work, ESPECIALLY in this social science application. That is what ASA and AERA both say; hence, the calls for transparent reporting of errors, confidence intervals, etc. to evidence how UNSOUND their application as “sound” can be.
I understand what you are saying, audreybeardsley.
I again looked into the VAM report by the Am. Stat. Assoc.,
and it does seem that their main criticism of VAM is not on statistical (scientific) grounds but in their application.
What stands out from the ASA report is
“We know from experience with other quality improvement undertakings that changes in evaluation strategy have unintended consequences. A decision to use VAMs for teacher evaluations might change the way the tests are viewed and lead to changes in the school environment. For example, more classroom time might be spent on test preparation and on specific content from the test at the exclusion of content that may lead to better long-term learning gains or motivation for students.”
I claim that this opinion can be generalized to be applicable to any situation where a (quantitative) scientific method is used to evaluate something which is non-scientific. So I claim the following is true.
“If a scientific method is used in the high-stakes evaluation of a non-scientific endeavor, the endeavor will get distorted to fit the method evaluating it.”
There can be only one conclusion: “science shouldn’t be used for the (high-stakes) evaluation of non-science.”
Can we imagine what kind of music we would have to listen to if statistical methods were used by critics and listeners to evaluate musicians?
For some reason I cannot reply to your post VirginiaSGP, so I’ve replied here. First, see the Soules transcripts coming via email. I received them, again, this morning.
Otherwise, I am not a “union supporter.” I am, however, an educational researcher who takes issues of educational measurement very seriously. I am also a former teacher who happened to never join the union for a plethora of reasons I will save for another day. Hence, I’d also ask that you be a bit more judicious in terms of the claims you’ve also made about me that (in my opinion) you’ve simply deduced in your own mind. Related, I also reject your blanket and potentially polarizing critique of unions being deliberately “poisonous” when it comes to public policy.
As for your claims about “a little uncertainty” and the metrics not being “perfect”: do you think “we” are after perfection here? We are not even close to perfection as per the current research, nor do I think many of “us” are disillusioned by this potential. Are you reading the research?? No state and no study and no researcher has ever evidenced “little uncertainty.” THIS is the problem, especially when the data are treated as if “little uncertainty” existed.
Dr. Beardsley,
Yes, I am saying that on most of the VAMs for ELA and math, the uncertainty is rather small. But note these caveats:
1. I am not saying that a teacher judged to be at the 20% mark is definitely at the 20% mark. There is variation in his/her range. But when you want to identify teachers in the bottom 20% by only including teachers in the bottom 5%, you have a relatively high certainty that any teacher placing in the bottom 5% on multiple years of data and a significant number of scores (e.g. 40+), is truly in the bottom fifth. I’ll have to dig up the gov’t study (national center for education stats I think), but the probability was around 95%+ that a bottom 5% teacher was in the bottom 20% or so.
2. I am not suggesting that a teacher be fired on one year of data. The scores allow identification of teachers who are likely ineffective. A teacher who scores at the 20% mark is most likely to actually be at the bottom 20%. But I understand that his/her range could reasonably be much wider. Absent other information, that teacher should be monitored for additional training and assistance. And if that teacher consistently scores at the 20% range, then we may want to move him/her out of the core classes.
3. Prof Friedman showed me a lot of the key studies/figures during the prep for my VDOE trial. One of their key findings related to whether VAMs were inherently biased (school-switching data). Their research showed that the bias from their methodology was < 5%. I understand that Rothstein duplicated their work and has suggested that potential biases could be present, but there doesn't appear to be any systematic bias in their methodology alone.
New Mexico and Florida are a warning about how theory is not always implemented as purely in practice as one would hope. And we should be careful to limit the consequences when we can't have solid controls. I'll be happy to reference the figures/papers if you take exception but I'm a little busy preparing for a hearing tomorrow right now so I'll have to ask for some extra time.
Is it accurate to say that you don't support VAMs being used for any evaluation purposes whatsoever?
NM teacher weighs in – last year I experienced a process that vaguely resembled the already unsupportable VAM that Hanna and her crew will push, injunction or not. Long story short, the principal who did this to me (and to others) worked in Santa Fe and was removed from his office by police in October of this year. No one knows why. But in light of evaluations and processes that defy logic – How credible can the evaluations, hiring decisions and day-to-day working directives of a person who is removed from his office by police possibly be? NM is not a rich state, has a known problem with corruption and cannot find teachers to fill the jobs they have here. The calibre of administrators and evaluators is uneven at best. In ten years I have worked for 8 different principals. Only one is still in an administrative position. All the others are gone, cannot-locate-for-reference-checking gone. Faked credentials, outlandish personal proclivities, on and on it goes. What system can we find to fix human frailty? When the evaluators are unable to grow teachers, how can the teachers grow the students?
Again, I cannot respond to you VirginiaSGP directly under your link, but I am responding here. And this will be my last response for now as I definitely have other duties to which I must attend, like students’ finals.
First, have you read my book? I ask that you do so, so that you can get a better overall sense of me – and we can engage in a more informed discussion. All of your recent questions, and most of those beforehand and covering the last X months, would be answered then. Then you would not have such questions re: whether I’m an absolutist in terms of my VAM support or lack thereof. That’s all clarified there, AND I’m not making any royalties off of it, just in case you were wondering.
As for your claims re: Friedman. Two points: Why do you think the confidence intervals are so large, so that “they” can only get at the bottom 20% (or often the bottom and top 16% to be more specific)? It’s because there is so much error in these estimates, that only those teachers in the extremes, over 3 or so years, are those whom “they” can “accurately” identify. Take this into consideration, though, especially when you’re talking about the bottom 16%. You’re really talking about the bottom 16% of the approximately 30% of all teachers who are VAM eligible, which gives you “accurate” estimates for only about 5% of elementary (not high or early childhood teachers) IF consistent over time, which due to error is often not the case. AND this says nothing of the bias that can come into play with these 5% (and others)…which was the source of a prior lawsuit in Houston…a teacher who redlined across the board for more than three years, and of the set of terminated teachers, was the one who got her job back, as the court found that her extreme set of students non-randomly assigned to her classroom indeed biased her scores downward. So…no, you really don’t “have a relatively high certainty that any teacher placing in the bottom 5% on multiple years of data and a significant number of scores (e.g. 40+), is truly in the bottom fifth.”
For whatever gov’t study for which you’re looking, take a look at the USDOE study conducted by Schochet and Chiang and see if what they found evidences what you are claiming is “little uncertainty….” Same thing when you write “his/her range could reasonably be much wider.” That IS some of the uncertainty about which most VAM scholars speak.
Again, this is all explained in the book…as is my take on Friedman and Chetty’s work, and their “takes” on bias v. Rothstein. Rothstein did replicate their work, which is nothing new these days in that replicating model output using the same sets of data has become quite saturated, BUT this says nothing of bias which is yet another highly related/controversial issue. Just because Friedman says this or that, though, does not mean it is true, NOR does it mean it’s representative of the current knowledge in this area. Same thing with Kane – his, in fact, is probably the most interesting transcript from the New Mexico case to read as he did not know how to answer questions about such error. I would venture to say Friedman would not have fared much better on the stand, for their general lack of training in measurement (v. econometrics) would likely come through…
Finally, as for whether “it [is] accurate to say that [I] don’t support VAMs being used for any evaluation purposes whatsoever?” Read the book. I align with what the other approximately 90% of scholars in this area argue, but you should know this already IF you claim some level of knowledge in this area….