Archives for category: VAM (value-added modeling)

This practice of evaluating teachers by the scores of students they never taught in subjects they don’t teach is absurd. In New York City, more than half of all teachers are judged based on work they didn’t do.

Teachers in Florida went to court to challenge it; the state court judge said it was unfair but not unconstitutional. That judgment was as nonsensical as the evaluation method.

“Just over half of New York City teachers were evaluated in the 2015–16 school year, in part, by tests in subjects or of students they didn’t teach, according to data obtained by Chalkbeat through a public records request.

“At 53 percent of city teachers, it’s a significant number, but substantially lower than in previous years, possibly thanks to a moratorium placed on using state tests, instituted mid-year.

“That figure also highlights a key tension in evaluating all teachers by student achievement, even teachers who work with young students or in subjects like physical education. Being judged by other teachers’ students or subjects has long annoyed some educators and relieved others, who otherwise might have had to administer additional tests.

“Supporters say evaluating teachers by group measures — often school-wide scores on standardized tests — helps create a sense of shared mission in a school. But the approach could also push teachers away from working in struggling schools.”

Ironically, evaluating teachers based on the scores of students they did teach is also invalid, because factors beyond the teacher’s control, such as home life and family income, affect test scores more than teachers do.

Evaluations based on test scores have proven to be unreliable and invalid. They have been thrown out in several courts. They have been found wanting by major scholarly associations.

VAM is a zombie policy.

Nick Melvoin beat Steve Zimmer for the LAUSD school board in the most expensive school board race in history.

The LA Times says he has fresh ideas.

Here they are.

Most of what he says is intended to enable the normalization of charter schools. Or is trite.

But get this:

“About 40% of a teacher’s evaluation should be based on measurable academic growth, such as standardized test scores, Melvoin said.”

Melvoin obviously is in the dark about the total failure of VAM.

But what would you expect from a puppet of Eli Broad?

The blog poet asserts that economists are like weathermen, but in this case, I respectfully disagree. Economists of education express a certainty that no weatherman would. Weathermen (meteorologists) give you different scenarios, warn you that the track of the storm might shift, recognize that there are many uncertainties. Economists of education (not all of them! Think Helen Ladd of Duke) claim that they can use test scores to measure teacher effectiveness, and that they have taken all contingencies into account. Hundreds, probably thousands, of educators have lost their careers because of the specious claims of economists like Raj Chetty, who assert that they can judge the value-added of teachers by the test scores of their students, that third-grade teachers affect the lifetime earnings of their students, and that teachers who don’t hit arbitrary test-score marks should be fired, sooner rather than later.

“Economists are like weathermen”

Economists are like weathermen
This cannot be denied
Cuz if, by chance, they get it right
It’s greatly AMPLIFIED!!

But mostly, they just get it wrong
And utter not a word
For them to actually point this out
Would really be unheard

And when their goof’s so blatant
They really can’t ignore it
They simply claim they found a “flaw”
But “markets” will restore it

Today, teachers in Houston won a major court victory against the discredited teacher evaluation method called VAM, or “value-added measurement.” The court battle was led by the AFT and the Houston Federation of Teachers.

VAM was originally developed by an agricultural statistician, William Sanders, who believed that the rise or fall of student test scores can be attributed to the students’ teachers. This theory was incorporated into the Race to the Top program, which led many states to adopt it, despite the fact that it had never been proven to work in a real-world situation. Seventy percent of teachers do not teach tested subjects, which led to bizarre strategies of evaluating teachers by scores of students they never taught in subjects they never taught.

Here is the press release from the AFT about the decision:

May 4, 2017

AFT, Houston Federation of Teachers Hail Court Ruling
on Flawed Evaluation System

Statements by American Federation of Teachers President Randi Weingarten and Houston Federation of Teachers President Zeph Capo on U.S. District Court decision on Houston’s Education Value-Added Assessment System (EVAAS), known elsewhere as VAM or value-added measures:

AFT President Randi Weingarten: “Houston developed an incomprehensible, unfair and secret algorithm to evaluate teachers that had no rational meaning. This is the algebraic formula: $y_{ijkl} = \mu_{jkl} + \left( \sum_{k^* \le k} \sum_{t=1}^{T_{ijk^*l^*}} w_{ijk^*l^*t} \times \tau_{ijk^*l^*t} \right) + \epsilon_{ijkl}$

“U.S. Magistrate Judge Stephen Smith saw that it was seriously flawed and posed a threat to teachers’ employment rights; he rejected it. This is a huge victory for Houston teachers, their students and educators’ deeply held contention that VAM is a sham.

“The judge said teachers had no way to ensure that EVAAS was correctly calculating their performance score, nor was there a way to promptly correct a mistake. Judge Smith added that the proper remedy is to overturn the policy; we wholeheartedly agree. Teaching must be about helping kids develop the skills and knowledge they need to be prepared for college, career and life—not be about focusing on test scores for punitive purposes.”

HFT President Zeph Capo: “With this decision, Houston should wipe clean the record of every teacher who was negatively evaluated. From here on, teacher evaluation systems should be developed with educators to ensure that they are fair, transparent and help inform instruction, not be used as a punitive tool.”

###

Van Schoales is part of the corporate reformer group that has controlled public education in Colorado for most of the past decade. When I visited Denver in 2010 to talk about my recently published book “The Death and Life of the Great American School System: How Testing and Choice Are Undermining Education,” Van was running Education Reform Now on behalf of Democrats for Education Reform, the hedge-fund managers’ organization that lobbies for charters and high-stakes testing. I recall what a very nice guy he was and how generous he was in introducing me, even though we disagreed.

At the very time I arrived in Denver, the state legislature was nearing a vote on a teacher and principal evaluation plan devised by a young state senator named Michael Johnston, whose background was in Teach for America and New Leaders for New Schools. Several members of the legislature, who were former teachers, showed up for my lecture in Boulder and spoke to me afterwards about their concerns about this fast-moving bill. Johnston’s legislation, known as Senate Bill 10-191, promised to evaluate teachers and principals based on the test scores of their students. Fifty percent of their evaluation would be tied to test scores. I was scheduled to debate Johnston on the day of the vote, but he did not enter the room until the minute I finished speaking, so he never heard my side of the debate. Johnston, however, was flushed with excitement about his legislation. He said that if every educator was evaluated by test scores, then Colorado would have “great schools, great principals, and great teachers.” I tried my best to dissuade him and the audience from their obsession with the value of standardized testing, but it was too late. The legislature passed 10-191, and Johnston was considered a rising star.

Except, as Van Schoales now admits in this article in Education Week, the corporate reformers were wrong. SB 10-191 did not work out as planned, even though the framers relied on the very best Ivy League prognosticators.

He writes:

Back in May 2010, hundreds of the nation’s education foundation, policy, and practice elites were gathered for the NewSchools Venture Fund meeting in Washington to celebrate and learn from the most recent education reform policy victories in my home state of Colorado and across the country.

The opening speeches highlighted the recent passage of Colorado Senate Bill 10-191—a dramatic law which required that 50 percent of a teacher evaluation be based upon student academic growth. This offered a bold new vision for how teachers would be evaluated and whether they would gain or lose tenure based on the merits of their impact on student achievement.

Colorado would be one of several “ground zeros” for reforming teacher evaluation in the country. Many, including myself, thought these new state policies would allow our best teachers to shine. They would finally have useful feedback, be differentiated on an objective scale of effectiveness, and lose tenure if they weren’t performing. Teachers would be treated like other professionals and less like interchangeable widgets.

Colorado’s law and similar ones in other states appeared to be sound, research-backed policy formulated by education reform’s own “whiz kids.” We could point to Ivy League research that made a clear case for dramatic changes to the current system. There were large federal incentives, in addition to private philanthropy fueled by the Bill & Melinda Gates Foundation, encouraging such changes. And to pass these teacher-evaluation laws, we built a coalition of reform-minded Democrats and Republicans that also included the American Federation of Teachers. Reformers were confident we had a clear mandate.

And yet. Implementation did not live up to the promises.

Ah, implementation! The Soviet experiment might have worked had it been implemented the right way. When allegedly great ideas don’t work out in reality, then something is wrong with the idea. For one thing, it never had the support of educators, who were expected to make Michael Johnston’s big idea work. It didn’t work.

What went wrong? Almost everything.

Most teachers don’t teach tested subjects. The majority of teachers teach in states’ untested subject areas. This meant processes for measuring student growth outside of literacy or math were often thoughtlessly slapped together to meet the new evaluation law. For example, some elementary school art-teacher evaluations were linked to student performance on multiple-choice district art tests, while Spanish-teacher evaluations were tied to how the school did on the state’s math and literacy tests. Even for those who teach the grades and subjects with state tests, some debate remains on how much growth should be weighted for high-stakes decisions on teacher ratings.

Few educators “embraced” the new evaluation system. They complied, but they never believed.

Teacher evaluators were giving teachers higher scores than they allegedly deserved. This, of course, was a problem with the district and school culture, not the model, which was supposedly flawless.

Last, every one of the state’s charter schools waived themselves out of the teacher evaluation system.

Van Schoales doesn’t mention that test-based accountability has been criticized by leading scholarly organizations, like the American Statistical Association and the American Educational Research Association.

Value-added measurement, or VAM, has fallen into disrepute for two reasons. First, it has not produced positive results anywhere. Second, a solid body of research has shown that it doesn’t work and will never work, because students are not randomly assigned, because home influences outweigh teacher influences on student test scores, and because most teachers do not teach the tested subjects.

Colorado had the perfect teacher evaluation plan, in theory, perfect enough to excite the corporate reformers, Arne Duncan, Bill Gates, et al. Except it didn’t work. I salute Van Schoales for admitting that the experiment failed.

Unfortunately it is still the law in Colorado. Educators are still evaluated by flawed and invalid measures. Seven years after passage of SB 10-191, Colorado does not have “great schools, great principals, great teachers.” Actually, it does have great schools, great principals, and great teachers in affluent districts, as it did in 2010. It even has great educators and schools in urban districts, but only if they are not measured by their students’ test scores. Don’t blame the victims of this effort to turn educators into widgets. The best evaluation of professionals is done by human judgment, taking multiple factors into account, not by standardized test scores.

Due to term limits, Michael Johnston is no longer in the State Senate. In January, he announced that he is running for Governor of Colorado. On his Wikipedia page, he still boasts about SB 10-191. He owes an apology to the thousands of dedicated educators who were subjected to his invalid teacher evaluation plan, many of whom were unjustly terminated and lost their careers.

Audrey Amrein-Beardsley has devoted a large portion of her professional life to criticizing the work of William Sanders, creator of the value-added model for measuring teacher effectiveness. She reports in this post that he passed away at his Tennessee home at the age of 74.

Sanders’ model was tried out first in Tennessee in the late 1980s and then widely disseminated to other states. Many teachers lost their jobs because they didn’t get the gain scores that Sanders’ model predicted.

Sanders was trained in animal genetics. His hometown obituary reports that:

He received a bachelors of Science degree in animal science in 1964, and a doctorate in Statistics and Quantitative Genetics in 1968….

Beginning in 1972, Sanders created and led a statistical and consulting group for the Institute of Agricultural Research for The University of Tennessee system. Over the next 28 years, Sanders worked with scientists to plan experiments and analyze the resulting data on research projects ranging from agronomy to physics. One of his research projects modeled the nutrient flow in the Peace River system in Florida, which proved that the environmental degradation in the Gulf was not a function of the development along the west coast as was previously thought but rather was a result of the phosphate mining activity in Central Florida. Other projects were as wide ranging as developing a forecasting system for Bike Athletic to improve forecasts for over 2,500 different inventory units to working with a longtime friend to develop a process that improved calibration of an invention that measures fiber properties.

He achieved lasting fame by transferring his attention from agricultural assessments to teacher assessments. He seemed never to recognize that assessing the effectiveness of teachers was far more complicated than studying animals and plants, which may be raised in a controlled environment.

Although his methodology was critiqued by scholars like Amrein-Beardsley, the American Statistical Association, and the American Educational Research Association, Sanders continued to peddle it to credulous school board members looking for a simple formula to determine teacher “effectiveness.”

Amrein-Beardsley writes:

Sanders thought that educators struggling with student achievement in the state should “simply” use more advanced statistics, similar to those used when modeling genetic and reproductive trends among cattle, to measure growth, hold teachers accountable for that growth, and solve the educational measurement woes facing the state of Tennessee at the time. It was to be as simple as that…. I should also mention that given this history, not surprisingly, Tennessee was one of the first states to receive Race to the Top funds to the tune of $502 million to further advance this model; hence, this has also contributed to this model’s popularity across the nation.

Arne Duncan, too, bought the horse manure that Sanders was selling and had patented. Because Sanders kept his methodology secret and proprietary, scholars had difficulty scrutinizing it. But there were always buyers, no matter what the scholars said. States that wanted to be eligible for Race to the Top funding had to adopt a test-based evaluation system, which was a boon for Sanders’ model.

Thoughts on the recent events in education reform, by our blog poet:

“The Maestro”
(A brief historical recap for those who have already forgotten — or perhaps never knew)

Chetty played the VAMdolin
At Nobel-chasing speed
Arne played the basket-rim
And Rhee, she played the rheed

Coleman played his Core-o-net
Eva played the lyre
Billy Gates played tete-a-tete
With Duncan and with higher

Sanders* beat his cattle drum
Devalue added model
Pseudo-science weighted sum
Mathturbated twaddle

John King played the slide VAMbone
But Maestro was Obama
Who hired the band and set the tone
For current grizzly drama

*William Sanders, who tweaked his algorithm for modeling cattle growth to model the intellectual growth of students and evaluate teachers.

Bruce Baker of Rutgers University is frustrated. He and colleagues have published study after study about the uses and misuses of standardized test scores to measure teachers and schools. The evidence is clear, he writes. Yet states remain devoted to failed, erroneous methods that lack any evidence!

“It blows my mind, however, that states and local school districts continue to use the most absurdly inappropriate measures to determine which schools stay open, or close, and as a result which school employees are targeted for dismissal/replacement or at the very least disruption and displacement. Policymakers continue to use measures, indicators, matrices, and other total bu!!$#!+ distortions of measures they don’t comprehend, to disproportionately disrupt the schools and lives of low income and minority children, and the disproportionately minority teachers who serve those children. THIS HAS TO STOP!”

Today, you should be relaxing and having some fun. Here is a good way to begin.

I plan to post good thoughts, silly, happy thoughts today.

That means I will not mention Little Hands, the Orange One. All day. (Unless it is absolutely necessary.)

So here goes.

W. James Popham is an international expert on educational assessment.

He has a great sense of humor.

In this video, he explains value-added assessment.

Sit back and enjoy!

State Senator Michael Johnston, architect of Colorado’s failed, punitive teacher evaluation law, may run for governor.

Johnston, an alumnus of Teach for America, is a devout believer in standardized testing. His law, passed over the objection of the state’s teachers, makes test scores 50% of teacher evaluations.

I happened to be in Denver the day that his bill came to a vote. We were scheduled to debate at lunchtime, but young Senator Johnston showed up after I finished speaking. I got to hear him, but he never heard me. He told the audience that his bill would produce great teachers, great principals, great schools. All by basing evaluations on test scores.

Senator Johnston’s fantabulous claim never came true. Six years after passage of his law, Colorado has the harshest teacher evaluation statute in the nation and apparently no will to change it.

What are the results? As measured by the National Assessment of Educational Progress, Colorado has stagnated since passage of Senate Bill 191. Scores in fourth- and eighth-grade math and English are flat or have declined. None have gone up.

His greatest achievement was a bust. Since its passage, the theory that teachers can be evaluated by the test scores of their students has repeatedly been debunked by scholarly associations like the American Statistical Association, but Mr. Johnston is unable or unwilling to admit his ruinous error or to take steps to repeal it.