Arthur Goldstein teaches English as a Second Language at Francis Lewis High School in Queens, New York. He blogs as NYC Educator. In his letter, Goldstein refers to a meeting that Chancellor Fariña had with a local superintendent, at which she acknowledged that highly rated teachers were likely to get lower ratings in high-poverty schools. The blogger Perdido Street School wrote: “The dirty secret of education reform is that the problems in schools and districts with high poverty/high homelessness demographics are NOT caused by ‘bad teachers’ – they’re caused by all the effects that poverty has on the psychological, emotional, physical and social development of the children in those schools and districts.”
Arthur Goldstein writes:
Dear Chancellor Fariña:
First of all, I applaud you for acknowledging that a teacher rated highly effective who enters a troubled school may suffer a reduced rating as a result of changing schools. I very much appreciate that you’ve taken a personal interest in this teacher and plan to attach an asterisk and follow her ratings. It’s inspirational not only to me, but also to teachers nationwide, that the leader of the largest school district in the country would acknowledge that a school’s population is a major factor in teacher ratings.
This, in fact, has been a major objection that many of us, including experts like Diane Ravitch and Carol Burris, have had toward value-added evaluation programs. The American Statistical Association has found that teachers account for only 1% to 14% of the variability in student test scores. It has also determined that rating teachers by such scores may have detrimental effects on education.
I am struck by the implications of your statement. If it’s possible that a highly-rated teacher may suffer from moving to a school with low test scores, isn’t it just as likely that a poorly-rated teacher would benefit from being moved from a low-rated school to a more highly-rated one? And if, as you say, the teachers are using the same assessments in either locale, doesn’t that indicate that the test scores are determined more by students themselves as opposed to teachers?
For example, I teach beginning ESL students. Teaching these kids is one of the very best things I’ve ever done, but I now consider it a very risky business. Kids who don’t speak English tend not to achieve high scores on standardized tests. I’m sure you also know that acquiring English takes a few years, varies wildly by individual, and that acquiring academic English can take 5-7 years. The new NYSESLAT test seems to focus on academic English rather than language acquisition. Still, it would be irresponsible of me to neglect offering basic conversation and survival skills. (In fact, New York State now requires that we offer less standalone ESL, which is neither helpful to my students nor supported by research.)
Special education children also have specific needs and disabilities that can inhibit their ability to do well on tests. It doesn’t take an expert to determine that teachers in schools with high concentrations of students with disabilities are already more likely to incur adverse ratings. Who is going to want to teach in these schools? Who will want to teach special education or ESL?
Attaching high stakes to test scores places undue pressure on high-needs kids to pass tests for which they are unsuited. For years I’ve been hearing about differentiation in instruction. I fail to see how this approach can be effectively utilized when there is no differentiation whatsoever in assessment. It’s as though we’re determined to punish both the highest needs children and their teachers.
Since the advent of high-stakes evaluations, the morale of teachers I know and represent has taken a nosedive. Teachers, regardless of how they’ve been rated, constantly ask me about their ratings and live in fear of them, as though the Sword of Damocles were balanced over their heads. Though the Danielson rubric is heralded as objective, in practice it’s very much in the eye of the beholder. As if that were not enough, those ratings are frequently altered by test scores. Diane Ravitch characterizes them as junk science. (I concur, and having music teachers rated by their students’ English Regents scores pushes it into the realm of the ridiculous.)
Personally, I found the older evaluation reports to be much more thorough and helpful. Supervisors used to be able to give detailed reports of what they saw, along with specific suggestions on what could be improved. Though I can’t speak for everyone in this, I found them easier to read than the checklists we currently receive. Just like our kids, we are not widgets. We are all different, and are good or not so good on our own merits.
Of course no one wants bad teachers in front of children. The current system, though, seems to focus on student test scores rather than teacher quality. It seems to minimize teacher voice in favor of some idealized classroom that may or may not exist.
It’s a fact that test scores are directly correlated with family income and level of special needs. There is no reliable evidence that test scores are indicative of teacher quality or lack thereof. Teachers are the second-best role models for children. It’s quite difficult for us to show children that life is a thing to be treasured when we have virtual guns placed to our heads demanding higher test scores or else. Just like our kids, we are more than test scores.
On behalf of children and teachers all over New York State, I ask that you join us in demanding a research and practice-based system of evaluating not only teachers, but our students as well.
Sincerely,
Arthur Goldstein, ESL teacher, UFT chapter leader
Francis Lewis High School

Closing the achievement gap requires rewarding the best teachers for taking the hardest jobs. Currently they are penalized for teaching challenged students and rewarded for teaching high socio-economic students at the other end of the gap.
VAM is the perfect tool to widen the achievement gap. Who thought that up?
To Arthur,
Your letter to Fariña is compelling because of its simple yet cogent logic. Wish I could teach at FLHS just so I could be represented by you instead of in the administratively incompetent hellhole in which I currently work.
You guys might like this:
http://www.vox.com/2015/7/29/9034235/teacher-common-core
I was surprised it was published at all in DC-based media, quite frankly.
Well-written critique of VAM. Thank you.
Arthur should be leading the UFT. Really.
Dear Chancellor Farina,
I would like to add that the Danielson Framework for evaluating teachers is limiting creativity and inhibiting risk-taking, and that is when it’s not being used as a robotic checklist by your incompetent administrators, who couldn’t tell a gifted teacher from a horse’s patootie.
VAM is more of the carrot-and-stick attempt to micro-manage teachers. It is, however, a false measurement. It is an attempt to use a failed business practice to determine the “value” a teacher adds to a student’s learning. However, this is impossible to measure, because there are so many variables in students’ lives. While the mathematical trappings of VAM provide the illusion of objectivity, it is a subjective measure that can be manipulated to achieve desired results. VAM is a tool of complicit governors and mayors to achieve political objectives. The goal is to throw the lowest-performing schools into receivership status, and the unfortunate teachers who work with the most vulnerable students out of work. Schools are then ripe for privatization plucking. Thus, VAM is a political tool and is not useful in improving public education or teaching.
The skewing phenomenon is not unique to K-12, nor to quantitative measures. Research found that student opinion surveys, used to rate professors, showed lower ratings in colleges with open enrollment (no admissions requirements). When researchers broadened the study to include colleges with admissions requirements, they found a greater expectation by students that female teachers “nurture,” a societal bias that had a greater adverse impact on women. Class size influenced scores. Whether or not the class was an elective had an impact. The attractiveness and attire of the instructor influenced ratings. Even the physical environment of the classroom had an effect.
Handicapping teacher scores based on known demographic, societal, and environmental influences can’t begin to account for the number of variables.
The use of cumulative test scores to contrast the effectiveness of individual teachers, with no control for variables, is similar to measuring the physical properties of the globe and its inhabitants using only a wooden yardstick, and then pronouncing with certainty that it has been accurately described.
Thanks to all for the kind words, and thanks to Diane for posting.
Such an awesome article! VAM is a scarlet letter that selected teachers are made to wear. Each school year is unique, with different classes. There are so many variables out of the teacher’s control. One of my students missed 58 days of school, and I still had to take full VAM responsibility when he missed the passing score on his test by 3 points. I’m beyond relieved to almost be retired. The stress is too much.
I proposed several years ago that the AFT should issue every teacher a red T-shirt that says “I am a ‘developing’ teacher,” and that every teacher wear it on the first day of school at every school across the country.
I agree that teachers’ ratings are also based on the type of class you have. If you have an extremely difficult class or a very low-performing class, your rating will drop significantly. Or if you have been teaching the same grade for 15-18 years, you are teacher of the year or highly rated. They don’t take into consideration that you can’t achieve the effective engagement Danielson demands if half your class is learning English or you have language-delayed children in your class. The observation requirements are written for average-functioning children, not the children who are actually in our classrooms. Remember, we have special education reform without the support.