Archives for category: Teacher Evaluations

It takes a comedian or a cartoonist to explain the nutty world of education reform.

Check out this great Dilbert cartoon, which gives a fast explanation of the idiocy of VAM.


PS: Thanks to KrazyTA for sending me the cartoon and also giving the correct link!

Cathy O’Neil has written a new book called “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.” I haven’t read it yet, but I will.

In this article, she explains that VAM is a failure and a fraud. The VAM fanatics in the federal Department of Education and state officials could not admit they were wrong, could not admit that Bill Gates had suckered the nation’s education leaders into buying his goofy data-based evaluation mania, and could not abandon the stupidity they inflicted on the nation’s teachers and schools. So they say now that VAM will be one of many measures. But why include an invalid measure at all?

As she is out on book tour, people ask questions, and the most common objection she hears is that VAM is only one of multiple measures.

She writes:

“Here’s an example of an argument I’ve seen consistently when it comes to the defense of the teacher value-added model (VAM) scores, and sometimes the recidivism risk scores as well. Namely, that the teacher’s VAM scores were “one of many considerations” taken to establish an overall teacher’s score. The use of something that is unfair is less unfair, in other words, if you also use other things which balance it out and are fair.

“If you don’t know what a VAM is, or what my critique about it is, take a look at this post, or read my book. The very short version is that it’s little better than a random number generator.

“The obvious irony of the “one of many” argument is, besides the mathematical one I will make below, that the VAM was supposed to actually have a real effect on teachers assessments, and that effect was meant to be valuable and objective. So any argument about it which basically implies that it’s okay to use it because it has very little power seems odd and self-defeating.

“Sometimes it’s true that a single inconsistent or badly conceived ingredient in an overall score is diluted by the other stronger and fairer assessment constituents. But I’d argue that this is not the case for how teachers’ VAM scores work in their overall teacher evaluations.

“Here’s what I learned by researching and talking to people who build teacher scores. That most of the other things they use – primarily scores derived from categorical evaluations by principals, teachers, and outsider observers – have very little variance. Almost all teachers are considered “acceptable” or “excellent” by those measurements, so they all turn into the same number or numbers when scored. That’s not a lot to work with, if the bottom 60% of teachers have essentially the same score, and you’re trying to locate the worst 2% of teachers.

“The VAM was brought in precisely to introduce variance to the overall mix. You introduce numeric VAM scores so that there’s more “spread” between teachers, so you can rank them and you’ll be sure to get teachers at the bottom.

“But if those VAM scores are actually meaningless, or at least extremely noisy, then what you have is “spread” without accuracy. And it doesn’t help to mix in the other scores.”
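Her argument can be checked with a toy simulation. What follows is only a rough sketch under invented assumptions, not anyone's real evaluation formula: the 50% weight, the two-category observation rubric, the cutoff that gives the bottom 60% of teachers the same observation score, and the names `simulate`, `vam_signal`, and `noise_sd` are all hypothetical. It flags the bottom 2% of teachers by composite score and counts how many of them are truly among the weakest 2%.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_teachers=5000, vam_weight=0.5, vam_signal=0.0,
             noise_sd=1.0, seed=0):
    """Return (hit_rate, corr) for a hypothetical composite score.

    hit_rate: share of teachers flagged as the bottom 2% by composite
              score who really are in the bottom 2% by true quality.
    corr:     correlation between composite score and VAM score among
              teachers who received the same observation score.
    """
    rng = random.Random(seed)
    quality = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]

    # Assumed rubric: roughly the bottom 60% all receive the same score.
    observation = [4.0 if q > 0.25 else 3.0 for q in quality]

    # Assumed VAM: a little signal (possibly none) plus noise.
    vam = [vam_signal * q + rng.gauss(0.0, noise_sd) for q in quality]

    composite = [(1.0 - vam_weight) * o + vam_weight * v
                 for o, v in zip(observation, vam)]

    n_flag = max(1, n_teachers // 50)  # bottom 2%
    order_by_score = sorted(range(n_teachers), key=composite.__getitem__)
    order_by_quality = sorted(range(n_teachers), key=quality.__getitem__)
    flagged = set(order_by_score[:n_flag])
    truly_worst = set(order_by_quality[:n_flag])
    hit_rate = len(flagged & truly_worst) / n_flag

    same_obs = [i for i in range(n_teachers) if observation[i] == 3.0]
    corr = pearson([composite[i] for i in same_obs],
                   [vam[i] for i in same_obs])
    return hit_rate, corr


if __name__ == "__main__":
    for signal, noise in [(0.0, 1.0), (0.3, 1.0), (1.0, 0.0)]:
        hit_rate, corr = simulate(vam_signal=signal, noise_sd=noise)
        print(f"signal={signal} noise={noise} "
              f"hit_rate={hit_rate:.2f} corr={corr:.3f}")
```

Among teachers with the same observation score, the composite is just a constant plus the weighted VAM score, so their ranking is decided entirely by VAM. If VAM is mostly noise, so is the list of teachers at the bottom.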

This is a book I want to read. Bill Gates should read it too. Send it to him and John King too. Would they read it? Not likely.

When this statement first appeared in 2014, I said that it should be on the bulletin board of every public school.

The American Statistical Association explains here why the evaluations of individual teachers should not be based on their students’ test scores.

Here is an excerpt. Read the whole statement, which is only 8 pages long:

It is unknown how full implementation of an accountability system incorporating test-based indicators, such as those derived from VAMs, will affect the actions and dispositions of teachers, principals and other educators. Perceptions of transparency, fairness and credibility will be crucial in determining the degree of success of the system as a whole in achieving its goals of improving the quality of teaching. Given the unpredictability of such complex interacting forces, it is difficult to anticipate how the education system as a whole will be affected and how the educator labor market will respond. We know from experience with other quality improvement undertakings that changes in evaluation strategy have unintended consequences. A decision to use VAMs for teacher evaluations might change the way the tests are viewed and lead to changes in the school environment. For example, more classroom time might be spent on test preparation and on specific content from the test at the exclusion of content that may lead to better long-term learning gains or motivation for students. Certain schools may be hard to staff if there is a perception that it is harder for teachers to achieve good VAM scores when working in them. Overreliance on VAM scores may foster a competitive environment, discouraging collaboration and efforts to improve the educational system as a whole.

Research on VAMs has been fairly consistent that aspects of educational effectiveness that are measurable and within teacher control represent a small part of the total variation in student test scores or growth; most estimates in the literature attribute between 1% and 14% of the total variability to teachers. This is not saying that teachers have little effect on students, but that variation among teachers accounts for a small part of the variation in scores. The majority of the variation in test scores is attributable to factors outside of the teacher’s control such as student and family background, poverty, curriculum, and unmeasured influences.

The VAM scores themselves have large standard errors, even when calculated using several years of data. These large standard errors make rankings unstable, even under the best scenarios for modeling. Combining VAMs across multiple years decreases the standard error of VAM scores. Multiple years of data, however, do not help problems caused when a model systematically undervalues teachers who work in specific contexts or with specific types of students, since that systematic undervaluation would be present in every year of data.
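The last point in the excerpt, that more years of data shrink the standard error but do nothing about systematic undervaluation, is easy to illustrate. This is a minimal sketch with made-up numbers, not the ASA's model: the bias of -0.5, the noise level, and the function name `averaged_estimates` are assumptions chosen only for illustration. A teacher whose true effect is zero is scored each year with the same bias plus fresh random noise, and the yearly scores are averaged.

```python
import random
import statistics


def averaged_estimates(years, bias=-0.5, noise_sd=1.0, trials=4000, seed=0):
    """Simulate multi-year average scores for one hypothetical teacher.

    The teacher's true effect is 0. Each yearly score equals the true
    effect plus a fixed systematic bias plus independent random noise.
    Returns one multi-year average per simulated trial.
    """
    rng = random.Random(seed)
    true_effect = 0.0
    averages = []
    for _ in range(trials):
        yearly = [true_effect + bias + rng.gauss(0.0, noise_sd)
                  for _ in range(years)]
        averages.append(statistics.fmean(yearly))
    return averages


if __name__ == "__main__":
    for years in (1, 2, 4, 8):
        est = averaged_estimates(years)
        print(f"years={years} "
              f"mean={statistics.fmean(est):+.2f} "
              f"spread={statistics.pstdev(est):.2f}")
```

In this setup the spread of the averaged score narrows as years are added, while its center stays near the assumed bias rather than the true effect of zero. Averaging makes the wrong answer more stable, not more correct.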

Despite the warning from ASA, which has no special interest and does not represent teachers or public school administrators, many states continue to use this method (called VAM, or value-added measurement or value-added modeling).

States were coerced into adopting this unproven method by the U.S. Department of Education, which said that states had to adopt it if they wanted to be eligible to compete for nearly $5 billion in federal funds in 2009, as every state was undergoing a budget crisis caused by the economic meltdown of fall 2008.

Many states adopted it, and it has not had positive effects in any state.

In Colorado and New York, among others, VAM scores count for as much as 50% of a teacher’s evaluation.

A state court in New York ruled this method “arbitrary and capricious” when challenged by fourth grade teacher Sheri Lederman and her lawyer-husband Bruce Lederman.

Some states assign VAM scores to teachers based on students they never taught in subjects they don’t teach.

This is an example of federal and state policy that has no basis in evidence and that has harmed the lives of many teachers. It very likely has caused teachers to leave the profession and contributed to teacher shortages.

This post, with an anonymous author, reviews the research on value-added measurement, with frequent references to those who claim that the rise or fall of test scores is the best way to judge teacher quality.


The basic question he or she addresses is whether the actions of your kindergarten teacher or your third-grade teacher can affect your lifetime earnings, as Raj Chetty and his team asserted in a study a few years back.


The author goes into a lengthy back-and-forth about whether such claims make sense.


But the one essential fact missing from the post is that 70% of teachers do not teach tested subjects. A district or school can evaluate teachers with VAM only when there are enough years of test scores to document the effects of the teacher over several years. Teachers of subjects other than reading and mathematics in grades 3-8 will never get VAM ratings.


But many states have solved this problem by assigning VAM ratings to the 70%. Their ratings are based on the scores of students they never met in subjects they never taught. This is called an “attributed rating.”


That makes sense, said no one ever.


That may be why Hawaii and Oklahoma have dropped VAM. It is expensive and gives false positives and false negatives. Expect more states to join these two states.

Earlier today, I posted a piece by Andy Jones, a high school language arts teacher in Hawaii. In the introduction, I mistakenly said that Hawaii had eliminated the teacher evaluation system that was created in response to winning Race to the Top funding. I wrote: “This past spring, Hawaii dropped the test-based teacher evaluation that Race to the Top had forced on the state as a condition of winning RTTT funds. The money was all gone, and so was this bad idea.” That was not completely accurate.

Andy Jones sent this note to correct me:

“I so hate to ask, but…the HSTA secretary, who along with everyone else here is a great admirer of your work, just asked me if I could mention to you that your intro to my article is somewhat inaccurate, in that the teacher evaluation system (EES) is still in place. We were merely successful in eliminating test scores from the evaluation. We are, however, almost equally angry at present, because in the months since that announcement was made, they have insisted on maintaining the excoriated SLO component (Student Learning Outcomes) – the mindbogglingly ridiculous quasi-action-research template we have to go through in our evaluations in which we choose learning objectives, administer pre-assessments, make predictions about student learning, collect evidence of student growth, etc., and are only rated “distinguished” if 90% of our predictions are accurate. This is now worth 50% of our evaluation, along with the standard Danielson-based observation cycle and Core Professionalism (basically a portfolio on Danielson Domain 4) comprising the other 50%.

“HSTA secretary Amy Perruso wanted me to post you on this, because she’s afraid that HSTA will now be bombarded with emails and telephone calls asking “How did you get rid of the teacher evaluations?” – which, unfortunately we have not and which we’re going to be attempting very aggressively this year to get rid of.

– Andy”

Andy Jones is a high school language arts teacher in Hawaii. The public school teachers in Hawaii have been energetic in stopping the worst extremes of the Race to the Top grant that the state received. This past spring, Hawaii dropped the test-based teacher evaluation that Race to the Top had forced on the state as a condition of winning RTTT funds. The money was all gone, and so was this bad idea.

Andy knows the history of American education, the cycles of criticism that follow one another, and he believes that the new federal law will give the state the ability to chart a better course than the one imposed by the Bush-Obama agenda of test and punish. I regret to say that his article is behind a pay wall, but I will share with you what he shared with me. In this time of doom-and-gloom, it is great to hear an optimistic forecast!

Andy writes:

Alarm, overhaul, stagnation, denial, recognition. Repeat ad infinitum.

Students of educational history will recognize in these five words the cycle that has defined American school culture for decades.

Thirty-three years ago, the “A Nation at Risk” report rang the alarm of educational decline. Its clarion call for improved public education resounded for the next generation and led eventually to initiatives such as No Child Left Behind and Race to the Top – programs that inaugurated sweeping, possibly irreversible changes to schools and school communities across the country.

A consensus has now emerged that these changes have led to dismal failure — a consensus signaled by the Every Student Succeeds Act (ESSA), which emphatically seeks to reverse the damage done, in part by giving states back the freedom to define and enact their own vision of 21st-century education.

On the heels of ESSA and the widespread discussion it has initiated, the National Conference of State Legislatures (NCSL) has released a report that may prove as decisive as “A Nation at Risk.”

The title — “No Time to Lose: How to Build a World-Class Education System State by State” — is misleading in that it seems to announce a mere repeat of the alarmist tone of the “Nation” report, perhaps to be followed by a new round of dubious policy suggestions from non-educators.

However, in what must come as a welcome shock to educators accustomed to routine governmental denial of policy failure, “No Time to Lose” fully acknowledges the mistakes of the past 15 years and seconds the sustained criticisms of prominent researchers such as Diane Ravitch and Pasi Sahlberg. These and many others have analyzed the extensive Organisation for Economic Co-operation and Development reports on international education and have concluded that the misguided “reforms” of the past years have had an overwhelmingly negative impact on American schools, leading to ever further decline internationally.

They have also highlighted an additional, sinister aspect of these “reforms,” which have involved the gradual removal of educational decisions from the purview of teachers and educators and the corresponding enrichment of educational corporations profiting from the proliferation of mediocre materials and programs that schools are forced to use.

We are fortunate to be living in a state led by a governor who recognizes what is at stake and who has created a robust task force that is working to establish grassroots consensus as to what is best for Hawaii schools and the students they serve.

We are also fortunate to have an increasingly dynamic teachers union that has sponsored a teacher-written report, “Schools Our Keiki Deserve,” which echoes the advice of our top educational researchers as well as the urgent tone of “No Time to Lose.”

Hawaii Department of Education (DOE) officials have shown signs recently that they are beginning to veer away from the pattern of denial that for years has characterized state and district education departments across the country. They have, for instance, conceded the unhealthy aspects of standardized testing, and they have also begun to embrace the idea of whole-child education as practiced in the world’s top-performing school systems.

As the DOE continues revising the Strategic Plan which will guide Hawaii education over the course of the coming years, teachers and citizens should encourage DOE officials to fully embrace the sobering findings of “No Time to Lose,” the tremendous energy and wealth of ideas emerging from Gov. David Ige’s task force, and the “Schools Our Keiki Deserve” report.

The report outlines a plan that is fully in accordance with the best educational research — one that can and should be integrated into the blueprint of the document that will determine much of what happens in our schools.

Karin Klein wrote education editorials for the Los Angeles Times for years. She now writes freelance, and she wrote this sensible article for the LA Times.

So-called reformers have advocated their view that the way to improve schools is to fire “bad” teachers. The way they would identify “bad” teachers is by whether the test scores of students went up or down or stayed flat. Reformers seldom acknowledged that test scores reflect family income far more than teacher quality.

This hunt for bad teachers has proved fruitless: scores have misidentified good and bad teachers, good teachers are demoralized by an idiotic way of evaluating their work, and many districts now face teacher shortages as good teachers leave and the pipeline of new teachers shrinks.

Linda Darling-Hammond once memorably said, “You can’t fire your way to Finland.”

Karin Klein agrees.

One day, when the current era of test-based evaluation is evaluated, reformers will be held accountable for the damage they have done to teachers, students, and public education. That day will come.

Teachers need help and support to become better teachers.

There is no waiting line of great teachers searching for a job.

School districts must work with the teachers they have, making sure they are encouraged and mentored. And paid well.

New Jersey has decided that teachers are now fully familiar with the Common Core and PARCC testing, even though most kids “fail” it, and henceforward the rise or fall of test scores on PARCC will count for 30% of teacher evaluations. Previously they had counted for only 10%.

This method has been debunked by the American Statistical Association and the American Educational Research Association. It has been in use in Colorado and in many states for five years without producing any results.

This is faith-based policy.

The only sensible aspect of this change is that it counts only for teachers who teach the tested subjects in the tested grades. In neighboring New York and in other states, this discredited method applies to all teachers, and they are judged by the scores of students they didn’t teach in subjects they don’t teach.

In New York, an outstanding fourth grade teacher, Sheri Lederman, sued the state after receiving a low rating. The judge ruled that the rating system was “arbitrary and capricious.” For now, the rating system is in abeyance. At some point the Regents and Legislature will have to clarify how this ruling affects state law.

This is an unusual political campaign. Matthew Fitzpatrick, an educator in Orange County, Florida, is running for a seat on the district school board on a platform opposed to the evaluation methods of Robert Marzano. Now, I have no views for or against Mr. Marzano since I am not a classroom teacher and I am not familiar with his method, but I have seen remarkable pushback on this blog from teachers. Since I too oppose the reduction of teaching to numerical measurements, I am sympathetic to his arguments.

He gives 40 reasons to oppose the Marzano method. I am posting only four of them. Read his post if you want to see the other 36.

My name is Matthew J. Fitzpatrick, and I am running for the District 7 Seat on the Orange County School Board. I am currently an Assistant Director at Orange Technical College, Westside Campus in Winter Garden. I’ve been in education for 23 years — 12 years as a teacher, and 11 years as a school and district administrator. In all my years of being involved in education, in my opinion, I have never seen a more demoralizing and destructive system than the OCPS implementation of the Marzano Teacher Evaluation system. I believe the Marzano system, more than anything else, is driving teachers out of education…and thus, OCPS has long lists of teacher vacancies. I believe this enough that I am willing to set aside my own administrative career and take a 50% pay cut in order to bring common sense back to the classroom. We must turn things around now.

Here are my first 40 Reasons to Replace the Marzano Teacher Evaluation System…splitting hairs on a system designed to split hairs on the art of teaching…

1. Dr. Marzano himself said on page 4 of his famous book, The Art and Science of Teaching, that, “It is certainly true that research provides us with guidance as to the nature of effective teaching, and yet I strongly believe that there is not (nor will there ever be) a formula for effective teaching.” If Dr. Robert J. Marzano says there is not a formula for effective instruction, who am I to argue with him? Why have we settled for a cookie-cutter approach to teaching?

2. Non-educators may not completely understand all of this “teacherese” jargon about teacher evaluations, but simply mention the name Marzano to an Orange County Public School teacher and take note of how they react…watch what happens to their face…feel the emotions of their words. Anything that causes such disdain among the very lifeblood of education–the teachers–surely is not good for education…no matter how much the sanitized research is quoted in support of it.

3. Where are the amazing results from using the “research-proven” Marzano strategies? Our District’s test scores and grades went down in many areas and schools. Why haven’t 6 years of Marzano transformed our District? If something is not delivering results, and at the same time it is driving great teachers out of the profession, we must make a data-driven decision and move in another direction…for the sake of our students and teachers.

4. Teaching should not be reduced to the numerical measurements of individual instructional strategies. Just as Mr. Keating (Robin Williams), in Dead Poets Society instructed his students to resist the armies of academics who want to reduce poetry to a passionless score that misses its true beauty and purpose, so, too, must students, parents, teachers and administrators stand against such a heartless, nitpicking view of the art of instruction. We must “Rip It Out” as an evaluation tool in our District.

A reader who works for a software company explains why it is so difficult to teach the standards effectively and so unfair to judge teachers by an impossible task: It takes 300 days to teach them, but there are only 180 days in a school year. Oops!



Here is the main problem with these tests. The FLDOE has absolutely no clue on how long it takes to teach each standard effectively. So the question is, “can a teacher teach the standards in the allotted time during the year?” As an educational software company we looked at the standards that a fifth grade teacher is required to teach effectively and stopped counting when we found it would take a minimum of at least 300 school days to teach the standards to an effective level. This does not include teaching a child how to type effectively if the state required typing on the writing portion of the test. The problem is, it’s impossible for an elementary school teacher or for that matter anybody including the testing companies to teach the standards that are on the test in a school year. In order for a teacher or school to score effectively on these tests you have to hope that the students that are coming into your classroom have at least some prior knowledge of the standards.



You have to understand that these tests are not built to test your child’s learning knowledge, they are built to evaluate the schools and teachers on their effectiveness on teaching the standards. Finally, ask yourself this question… “Who benefits if the teachers and schools FAIL teaching the standards effectively?” Teachers? Schools? Children? No benefit here!… Private Charter Schools? Testing Companies? Publishers? ED Tech Companies? Lobbyists and the list goes on and on and on…..