Archives for category: Teacher Evaluations

Adam Urbanski, president of the Rochester, New York, teachers’ union, is struggling to make sense of the state’s teacher and principal evaluation system, which varies wildly from district to district. Scarsdale, perhaps the most affluent and high-scoring district in the state, had no “highly effective” teachers. But Rochester, one of the districts with high poverty and low scores, had many. The reality is that none of the formulas for reducing teaching to a number make any sense. Teaching is an art, a craft, and a bit of science. A great teacher may be great one year, not the next, or great with this class but not another. (APPR in New York is the Annual Professional Performance Review.)

The ratings in New York are referred to as HEDI: Highly Effective, Effective, Developing, Ineffective. A commenter on the blog recently said that “Developing” is considered a low grade but she hoped that she was “developing” every day as a teacher.

This is what Adam wrote to his members:

“The Rochester Miracle?”

“Each year, we re-negotiate our APPR agreement with the District to do all we can to make it less damaging to our students and more fair to teachers.

“We are making progress in reducing the number of Rochester teachers (be)rated as Developing or Ineffective (40% in 2012-2013 but 11% in 2013-2014) and increasing the number rated as Effective or Highly Effective (60% in 2012-2013 but 89% in 2013-2014). Just one year ago, only 2% of Rochester teachers were rated as Highly Effective. This year, that number increased to 46%.

“Why such a huge fluctuation? Maybe it’s because we re-negotiated the agreement; or because teachers set more realistic SLO targets; or because the NYS Education Department adjusted the cut scores in ELA and Math; or because huge fluctuations are typical of invalid and unreliable evaluation schemes. Who knows? In any event, we continue to press for the total abolishment of APPR.

“Meanwhile, we are negotiating a successor agreement that would further diminish excessive testing of students and wrongful rating of teachers.”

Our regular reader and commenter Laura Chapman offers us another nugget of informed analysis and wisdom:

She writes:

A press release dated NEW YORK, Oct. 28, 2013 /PRNewswire/ announced that The Leona M. and Harry B. Helmsley Charitable Trust was investing $3 million “to establish a rigorous research project to modify and align the Framework for Teaching with the Common Core State Standards (CCSS).”

This project will happen in four districts. One of these (unnamed) is in NY state.

You can find the application to market the 2013 Danielson Framework in NY state at http://usny.nysed.gov/rttt/teachers-leaders/practicerubrics/Docs/danielson-application.pdf

There you will see that the application required empirical evidence in support of “each rubric.” Whatever that “each rubric” meant, the application was approved with very brief references to eight “empirical” studies, three with more elaborate descriptions of the methodology.

In addition to the questions I asked about the full spectrum applicability of the Danielson protocol, I should have asked about studies that paid attention to the “demographics” in the classrooms observed—the proportional composition of students who qualify for lunch programs, those in gifted programs, special education, students still learning English, recent transfers, and so on.

Every teacher knows how these distributions shift from class to class and make a huge daily difference in what is taught, how, and so on.

For a recent summary of the many problems with this and related high stakes evaluation schemes see “Leading via Teacher Evaluation: The Case of the Missing Clothes?” (July 2013) by Joseph Murphy, Philip Hallinger, and Ronald H. Heck.

http://216.78.200.159/RandD/Teacher%20Evaluation/Teacher%20Eval%20-%20Case%20of%20Missing%20Clothes%20-%20Murphy.pdf

See also a 2014 VIP article by David C. Berliner in Teachers College Record. His online summary of the craze to evaluate teachers by flawed methods closes with this great sentence:

“In fact, the belief that there are thousands of consistently inadequate teachers may be like the search for welfare queens and disability scam artists—more sensationalism than it is reality.” http://www.tcrecord.org/Content.asp?ContentId=17293

Audrey Amrein-Beardsley here summarizes and comments on a very enlightening interview with Jesse Rothstein in the Washington Post. Rothstein, an economist, conducts research on teacher evaluation and accountability.

Rothstein, on teacher evaluation:

“In terms of evaluating teachers, “[t]here’s no perfect method. I think there are lots of methods that give you some information, and there are lots of problems with any method. I think there’s been a tendency in thinking about methods to prioritize cheap methods over methods that might be more expensive. In particular, there’s been a tendency to prioritize statistical computations based on student test scores, because all you need is one statistician and the test score data….

“Why the interest in value-added? “I think that’s a complicated question. It seems scientific, in a way that other methods don’t. Partly it has to do with the fact that it’s cheap, and it seems like an easy answer.”

“What about the fantabulous study Raj Chetty and his Harvard colleagues (Friedman and Rockoff) conducted about teachers’ value-added (which has been the source of many prior posts herein)? “I don’t think anybody disputes that good teachers are important, that teachers matter. I have some methodological concerns about that study, but in any case, even if you take it at face value, what it tells you is that higher value-added teachers’ students earn more on average.”

“What are the alternatives? “We could double teachers’ salaries. I’m not joking about that. The standard way that you make a profession a prestigious, desirable profession, is you pay people enough to make it attractive. The fact that that doesn’t even enter the conversation tells you something about what’s wrong with the conversation around these topics. I could see an argument that says it’s just not worth it, that it would cost too much. The fact that nobody even asks the question tells me that people are only willing to consider cheap solutions.”

“Rothstein, on teacher tenure:

“Even if you give the principal the freedom to fire lots of teachers, they won’t do it very often, because they know the alternative is worse.” The alternative being replacing an ineffective teacher by an even less effective teacher. Contrary to what is oft-assumed, highly qualified teachers are not knocking down the doors to teach in such schools.

“Teacher tenure is “really a red herring” in the sense that debating tenure ultimately misleads and distracts others from the more relevant and important issues at hand (e.g., recruiting strong teachers into such schools). Tenure “just doesn’t matter that much. If you got rid of tenure, you would find that the principals don’t really fire very many people anyway” (see also point above).

Ken Futernick, a wise educator who has written about the improvement of the teaching profession for many years, has a brilliant article in the Los Angeles Times about a “grand bargain” post-Vergara. Futernick testified for the state in the Vergara trial. He has long understood that schools in urban districts with low scores often have poor working conditions, inadequate resources, and high teacher turnover.

The term “grand bargain” typically refers to compromises by warring parties. In this case, he has laid out a program that all states can learn from.

He writes:

“Unless it’s overturned on appeal, the Los Angeles Superior Court’s June decision in Vergara vs. California making it much easier to fire teachers will hurt students if lawmakers, unions and other state education leaders don’t move beyond its limited focus and address the many factors that adversely affect student learning and teacher performance.

“Stakeholders must come together around a “grand bargain” that would address not only teacher incompetence but all the obstacles educators face that, in the end, prevent many students from learning.”

Making it easier to fire “bad teachers” won’t make it easier to hire good ones.

“To be sure, many of those who teach in poor neighborhoods don’t have the same effect on test scores as those who teach in wealthier schools. But most schools that serve poor and minority students — those with high concentrations of English learners, transient students, students with health problems and so on — have fewer resources to meet students’ many needs, larger class sizes and inadequate materials and facilities. In addition, they are staffed with many beginning teachers who turn over at high rates. Not surprisingly, student achievement suffers.

“Also, schools that serve poor students routinely assign teachers to subjects in which they have no expertise. For instance, a 2008 study showed that 27% of math courses in schools serving poor students were taught by teachers who were not qualified to teach math.

“Why are schools that serve poor and minority students overstaffed with inexperienced and out-of-field teachers? Most teachers seek to make a difference and are eager to teach disadvantaged students. But many don’t want to teach in such schools because most of them are extraordinarily difficult, dysfunctional places to work. The teachers there suffer from poor professional support, low morale, run-down facilities, a revolving door of principals and unrelenting accountability pressures.

“Ineffectiveness in the classroom often does not derive from incompetence.

“Consequently, administrators in these schools can’t attract and keep enough well-qualified, experienced teachers. That, in turn, highlights another critical flaw in the judge’s decision — the assumption that these schools can find suitable replacements for fired teachers. Quite the contrary, and administrators’ power to fire teachers without real due process will only exacerbate the teacher recruitment problem….

“For starters, the state should develop a new teacher dismissal process that is fair and efficient. It should not take years and hundreds of thousands of dollars to fire an ineffective teacher if he or she has been given a reasonable chance to improve, has been carefully evaluated and hasn’t done better.

“[Governor Jerry] Brown signed legislation this year that provides a fair and efficient way to adjudicate cases of gross teacher misconduct. Education leaders should develop a similar way to handle cases of teacher incompetence. They also should develop solutions for the other statutes that the court struck down, such as the one that allowed teachers with more seniority to keep their jobs during layoffs. California could do what other states have done, recognize experience along with other factors in making layoff decisions.

“But California must have a solid due process system for teachers, and contrary to popular belief, that’s all that tenure provides. Without a reliable way to determine whether a teacher is truly incompetent, the state will return to an era when employment decisions were fraught with abuse that included higher-salaried, experienced teachers replaced with less-expensive beginners and competent teachers fired because of their political or religious views.”

Here is the framework Futernick suggests for a “grand bargain”:

“*The state must develop a robust teacher evaluation framework designed to help all teachers improve, not just to identify low performers. Such systems would ensure that principals and other evaluators have the time and training needed to conduct meaningful evaluations.

“*The state should build on the successful peer assistance and review programs that exist in places such as Poway Unified and San Juan Unified. These programs provide high-quality support to struggling teachers. Most participating teachers improve; those who don’t either leave voluntarily or are dismissed without grievances and expensive lawsuits.

“*The state and school districts must improve the conditions in hard-to-staff schools to attract and retain the best teaching candidates and the strongest principals. Among other things, these schools need high-quality professional development, time for teachers to plan and collaborate, and the authority to make professional decisions.”

Without adequate resources, changes in the law will be a hollow promise.

The following was reported at politico.com:

“AMERICANS CALL FOR STEPPING UP THE TEACHING PROFESSION: Americans want better prepared teachers in the classroom – and a vast majority think educators should be required to pass board certification and submit to licensure standards like doctors and lawyers. Those views come from a PDK/Gallup poll, released today. Seventy percent of respondents said new teachers should spend at least a year teaching under the guidance of a certified colleague. And 60 percent said the entrance requirements for teacher training programs need to be more rigorous. The results come as the Obama administration plans to resurrect an effort to regulate teacher prep programs. They also reflect public attitudes about whether the standardized testing regime ushered in by No Child Left Behind has improved education, said William Bushaw, who until recently served as executive director of PDK. Segun Eubanks, director of teacher quality at the National Education Association, says it’s clear all that testing hasn’t boosted student learning. So naturally, the focus is now swinging to improving the teaching profession. I have the story: http://politico.pro/1uEZjLi

- In another intriguing finding, 61 percent of the 1,001 adults surveyed opposed using student test scores in teacher evaluations. On a related note, researchers at The Brookings Institution are out with a study today that argues improving teacher observations is the key to upgrading evaluations. Observations are often biased by student ability and background, the authors say; they urge districts to adjust their observation scores accordingly. The study, published in Education Next: http://bit.ly/1wxBpUV

Educators in New York are trying to make sense of the state’s evaluation system. The formula is supposed to consist of observations (60%); state scores (20%); and local assessments (20%). Yet the results don’t line up with common sense or common knowledge.
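To make the weighting concrete, here is a minimal sketch in Python of how a 60/20/20 composite could be computed. The function name, the 0-100 scale for each component, and the sample numbers are illustrative assumptions, not the state's actual scoring rules or cut scores.

```python
# Weights taken from the formula described above: observations 60%,
# state scores 20%, local assessments 20%.
WEIGHTS = {"observation": 60, "state": 20, "local": 20}


def composite_score(observation, state, local):
    """Return a weighted composite on a 0-100 scale.

    Assumption: each component is expressed on a 0-100 scale.
    This is a hypothetical helper, not the official APPR calculation.
    """
    parts = {"observation": observation, "state": state, "local": local}
    for name, value in parts.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * value for name, value in parts.items()) / 100


# Two hypothetical teachers with identical test-based components
# but different observation scores.
strict_district = composite_score(observation=70, state=50, local=50)
lenient_district = composite_score(observation=100, state=50, local=50)
print(strict_district, lenient_district)
```

Under these assumed inputs, a 30-point difference in the observation score alone moves the composite by 18 points, which suggests why district-to-district differences in how observations are scored can swamp the test-based components.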

Some principals seem to be giving higher observation scores to teachers they want to protect, because they believe those teachers are valuable and don’t want to lose them.

“In Scarsdale, regarded as one of the best school systems in the country, no teacher has been rated “highly effective” in classroom observations. It is the only district in the Lower Hudson Valley with that strict an evaluation. In Pleasantville, 99 percent of the teachers are rated as “highly effective” in the same category.”

Charlotte Danielson, whose rubric is the basis for most teacher evaluation systems, called these results “laughable.”

“Pleasantville schools Superintendent Mary Fox-Alter defended her district’s classroom observation scores, which use the Danielson model — saying the state’s “flawed” model had forced districts to scale or bump up the scores so “effective” teachers don’t end up with a rating of “developing.”

What is truly laughable is the effort to turn the art and craft of teaching into a scaled metric, like weighing apples at the supermarket. What is essentially a matter of human judgment, based on experience and wisdom, cannot be measured and graded. Its results will always be flawed, and the very act of measuring the unmeasurable will change teacher behavior to conform to the scale. If all we want is higher scores, this might be a good way to get them. If we want inspired teaching, it is not.

Audrey Amrein-Beardsley posted a guest blog by a rising star in the Academy, Jimmy Scherrer of North Carolina State University, who previously taught in LAUSD.

Scherrer wrote:

“As someone who works with students in poverty [see also a recent article Scherrer wrote in the highly esteemed, peer-reviewed Educational Researcher], I am deeply troubled by the use of status measures—the raw scores of standardized assessments—for accountability purposes. The relationship between SES and standardized assessment scores is well known. Thus, using status measures for accountability purposes incentivizes teachers to work in the most advantaged schools.

“So, I am pleased with the increasing number of accountability systems that are moving away from status measures. In their place, systems seem to be favoring value-added estimates. In theory, this is a significant improvement. However, the manner in which the models are currently being used and how the estimates are currently being interpreted is intellectually criminal. The models’ limitations are obvious. But, as a learning scientist, what’s most alarming is the increasing use of the estimates generated by value-added models as a proxy for “effective” teaching…..”

“Typically, research studies on teaching and learning are framed using one of three perspectives: the behaviorist, the cognitivist, and the situative. Each perspective is associated with a different grain size. The behaviorist perspective focuses on basic skills, such as arithmetic. The cognitivist perspective focuses on conceptual understanding, such as making connections between addition and multiplication. The situative perspective focuses on practices, such as the ability to make and test conjectures. Effective teaching includes providing opportunities for students to strengthen each focus. However, traditional standardized assessments mainly contain questions that are crafted from a behaviorist perspective. The conceptual understanding that is highlighted in the cognitivist perspective and the participation in practices that is highlighted in the situative perspective are not captured on traditional standardized assessments. Thus, the only valid inference that can be made from a value-added estimate is about a teacher’s ability to teach the basic skills and knowledge associated with the behaviorist perspective.”

This, he writes, is “intellectually criminal” and “intellectually lazy.”

Tell it! VAM is Junk Science.

A teacher in Texas wrote this comment, which depicts (to me) a system where data matters more than teachers or learning or children. Either the system is on autopilot or it is run by people who confuse numbers with learning.

“They recruited from NC and from Spain (for bilingual teachers) this year because they did expect vacancies. I think it’s important to mention that all are not based on EVAAS because not everyone has those standardized scores. They are also based on Stanford testing in 1st and 2nd grade and for classes like PE, a district made assessment. I teach Kinder and am still waiting to find out what growth they calculated for my scores last year (and yes, they were bubble-in multiple choice tests). No one could explain to me how it was going to work, what percentage growth was required to be considered effective and how that was going to be calculated– so I’m very anxious about it. I was rated highly effective in the professional and instructional areas but who knows. We are supposed to use 2 different assessments for more validity but that doesn’t happen-they end up using the reading and math versions of the same test given the same week. I did wonder how many vacancies they had to start the new school year yesterday?”

Laura Chapman, a regular contributor to the blog, has worked in arts education for many years.

She writes:

This desire to churn the teaching workforce is not just a push from Bill Gates and lawsuits to dismantle unions.
Six economists/statisticians brought together at the Brookings Institution offered a similar plan. These number crunchers said that district-wide VAM (value-added) scores should be used to determine the most effective teachers, irrespective of the subjects and grade-levels they teach.

This proposal is efficient and absurd. It is based on the assumption that a district’s value added scores are so highly correlated with “non-value added” measures that employment decisions for all teachers can be based on the performance of teachers with value added scores.

Under this system, all teachers would also have a composite evaluation based on multiple measures such as end of course test scores, observations, and student surveys. Even so, the teachers with VAM scores would determine the employment fate of all teachers. How is this conclusion reached?

Here is the magical thinking: “For example, we would assume that the correlation between observationally-based ratings of teachers and value-added (scores) in math would be the same in history, where value-added measures are not available.”

In other words, the statisticians freely invent (impute) a missing metric for the history teacher by assuming a math teacher’s rating on a classroom observation protocol can be used as a substitute for the history teacher’s missing value added score.

Those inferential leaps are just the beginning of a larger plan that would make all teacher evaluations “comparable” without any distinctions in grade level, or subject, or conditions under which teachers work.

The Brookings policy articulates principles for dismissing up to 25% of teachers in a district, on the assumption that this action plan would increase test scores and be “fair” to every teacher. The only exception to this formula might be for teachers of exceptional children. This case of econometric thinking ignores the educational, ecological, and substantive importance of different job assignments. See Correlation, Para 5 in http://www.brookings.edu/research/reports/2011/04/26-evaluating-teachers

The Brookings paper is not radically different (except for the 25% churn) from a USDE plan to rate all teachers by a collective VAM for a school, but limited to one of the “priority” subtests such as reading or mathematics. In Florida, for example, the school wide VAM in reading or math is assigned to art and other teachers of nontested subjects. In other words, the curriculum and instruction that really matters is narrowed to the three R’s.

The use of a collective VAM focused on reading or math is a rapid and cost-effective way to meet federal or state requirements for teacher evaluation. Moreover, in 2014, a U.S. district judge ruled that evaluators in Florida are allowed to disregard a teacher’s job assignment in rating performance. The judge ruled that this practice is legal, even if it is unfair.

Teacher ratings based on a collective value-added score are likely to increase in states where Common Core State Standards (CCSS) are adopted and tested. The CCSS call for all teachers to improve student proficiency in English Language Arts and mathematics.

Although the American Statistical Association has denounced the practice of using VAM for rating individuals, that measure is unlikely to disappear as a tool for churning the workforce.

In the Obama/Duncan/McKinsey & Co. “RESPECT” project, for example, a teacher can only be judged “highly qualified” by producing more than a year’s worth of growth (gain in test scores) in three out of every five years. Teachers without that designation have shorter up-or-out criteria to meet.

This stack-ranking system, like the Brookings plan, banishes job security and churns the teaching workforce by insisting on one-size-fits-all criteria for “effective” teachers. http://www.ed.gov/blog/2012/02/launching-project-respect/

Zak Jason wrote a fascinating interview in “Boston” magazine with Barbara Madeloni, the recently elected president of the Massachusetts Teachers Association, the largest union in the state with 110,000 members.

I first learned of Madeloni when she was preparing teachers at the University of Massachusetts, Amherst, and she refused to give the Pearson test to evaluate new teachers. Michael Winerip wrote a story about her defiance in the New York Times, and within a matter of days, her contract was not renewed. Now all teacher candidates across the university are required to take the Pearson exam.

I learned many things from this article. I learned that Barbara was a psychotherapist before she became a high school English teacher. I learned that when she ran for union president, she was considered a very long shot. Some people thought she had no chance at all.

I learned that the State Commissioner of Education, Mitchell Chester, is also chair of the governing board of PARCC, one of the two federally-funded Common Core tests. Some in the state say he has a conflict of interest.

Madeloni has called for a three-year moratorium on all testing and teacher evaluations:

“We’ve been trying to do scale, instead of human beings. We need to do human beings,” she says. She lambasts the Common Core, a national set of curriculum standards that the state adopted in 2010, as “corporate deform,” and described its architects to CommonWealth magazine as “rich white men who are deciding the course of public education for black and brown children.”

“The past and present heads of the state’s top education offices I talked to dismiss Madeloni’s rhetoric as naive, absurd, and, in the case of the moratorium, illegal. Mitchell Chester, the commissioner of the state’s Department of Elementary and Secondary Education (DESE), says he’s concerned that her “hyperbolic” vision may force the DESE to tune out the entire union.”

Chester may dismiss her, but teachers view her as a savior. “She’s the first MTA leader willing to listen to their agony, and to tell the truth about how teaching in the age of accountability can be, as Holyoke teacher Cheri Cluff puts it, “like waiting tables at a busy restaurant; you’re running and running and running, and you’ve lost your head.” Whereas past presidents and her opponent, MTA vice president Tim Sullivan, were willing to compromise with state administrators, Madeloni is combative, unapologetic, and, as Agustin Morales, another Holyoke teacher, says, “unafraid to make her life uncomfortable.”

Morales, the article notes, was elected president of his local in Holyoke with a 70% majority; he complained about the data walls, where students’ names and test scores are publicly posted. He was fired.

Madeloni is a fighter. She is outspoken and unafraid. Will she be marginalized by the state? Can the state alienate its largest union? Watch for the battles ahead. Madeloni was elected to stand up for teachers. Richard Stutman of the Boston Teachers Union has agreed to collaborate with her.

Zak Jason concluded:

“When I first talked to Madeloni soon after her election, she agreed to have me follow her throughout her first week. But just before her presidency began, she told me, “As a psychotherapist, I know the presence of someone else in the room can affect how the room behaves,” and said she would only be available for an interview, and her communications director James Sacks would join.

“As I’m about to leave her office, Madeloni turns to Sacks and asks, half-joking, “Is there anything I didn’t say that I was supposed to say?”

“What’s your vision?” he says.

“That we reclaim the vision of public education as a space for democracy, for joy, for hope, for a better future for all of our children. All of our children.”
