Archives for category: Teacher Evaluations

Barbara Madeloni, who led the fight against outsourcing teacher credentialing to Pearson, was elected president of the Massachusetts Teachers Association and will take charge of a union of 110,000 educators.

“Until last August, Madeloni directed the Secondary Teacher Education Program at the University of Massachusetts.

“While UMass said her employment ended as part of a move to reduce the use of adjunct professors, Madeloni stated in interviews that the school was punishing her for opposing a project in which UMass tested a teacher assessment program for the for-profit company Pearson.

“Madeloni, 57, said in an interview Sunday she plans as MTA president to “amplify the voice of educators and be a leader at the national level.”

“She noted that her victory comes amid efforts in Los Angeles, Seattle and Chicago to shift the debate back to supporting high-quality public education and the people who provide it over the interests of for-profit companies in the field.

“It should be national news,” Madeloni said of her win in Massachusetts. “It’s a message to everybody that teachers will not be silent and compliant as this assault on public education continues — and undermines public education. This is foundational to democracy and we need to defend it.”

Paul Karrer, who teaches fifth grade in Castroville, California, takes a look at our policymakers’ obsession with bad teachers. Who are they? How can they be found out and fired?

Here is one example of a bad teacher:

“The Low Score bad teacher — Education reformers want high-stakes testing to be a prime determinant in teacher evaluation. But if one looks under the tests, interesting facts pop up. Often, teachers who were Teacher of the Year find they are considered bad teachers in the following years. How can this be? Because class composition changes. Teaching assignments (grade-level) change. And unlike charter schools, which expel obstructive, destructive and obnoxious kiddos — those in the public realm must teach all the kids. Just one grade-A-whack-a-mole angry student can destroy a classroom. Many teachers have many more than one. Such a child’s presence is subtractive to the learning environment of others.”

There are more.

But who is behind this pursuit and does it make sense?

“My point in all this is to show that variables — normal life variables — impact classroom outcomes. When pregnant teachers, their compassionate spouses and ill teachers are labeled as bad teachers, something is very, very wrong.

“The profit-oriented talking heads of education reform want to monetize public education. Ed reformers would have us believe poverty, trauma, parental drug use, violence, incarceration and homelessness have no impact. Teachers are losing their profession. Kids are losing their teachers. And communities are losing the democratic concept of public schooling.

“In the end, it is the wealthy profiteers and captains of privatization who are pointing fingers at hard-pressed teachers who work in communities of failure. It is a much easier political fix to scream “fire” than it is to acknowledge the conditions of poverty. And it makes money for a few too.”

Audrey Amrein-Beardsley noticed an interesting pattern among the states that won Race to the Top funding.

Most were states with highly inequitable school finance systems, as noted by the Education Law Center of New Jersey.

But Beardsley saw other correlations.

She writes:

“In this case, correlational analyses reveal that state-level policies that rely at least in part on VAMs are indeed more common in states that allocate less money than the national average for schooling as compared to the nation. More specifically, they are more likely found in states in which yearly per pupil expenditures are lower than the national average (as demonstrated in the aforementioned post). They are more likely found in states that have more centralized governments, rather than those with more powerful counties and districts as per local control. They are more likely to be found in more highly populated states and states with relatively larger populations of poor and racial and language minority students. And they are more likely to be found in red states in which residents predominantly vote for the Republican Party.”

These were the states most willing to evaluate teachers by test scores (VAM), despite the absence of evidence for doing so.

In Florida, teachers are given ratings based on the scores of students they never taught.

 

Teachers in several counties challenged the law in court.

 

The judge agreed that the system was unfair, but refused to overturn it.

 

Where teachers are concerned, Junk Science is just fine.

 

It is okay to rate a teacher based on the performance on tests of students the teacher never met, never taught.

 

This is “reform.” Thanks, Arne Duncan, for Race to the Top.

 

Thanks for introducing this insane, stupid policy into our nation’s schools.

 

The true education miracle will be if American public education can survive eight years of stupid policies like this one in Florida.

A federal judge in Florida dismissed a lawsuit against the state evaluation system, declaring that rating teachers based on the scores of students they never taught is unfair but not unconstitutional.

The evaluation system may be stupid; it may be irrational; it may be unfair; but it does not violate the Constitution. So says the judge.

An article in the Florida Education Association newsletter described the ruling:

“The federal lawsuit, known as Cook v. Stewart, was filed last year by the FEA, the National Education Association and seven accomplished teachers and the local education associations in Alachua, Escambia and Hernando counties. The lawsuit challenged the evaluation of teachers based on the standardized test scores of students they do not teach or from subjects they do not teach. They brought suit against the Florida commissioner of education, the State Board of Education and the school boards of those three counties, who have implemented the evaluation system to comply with 2011’s Senate Bill 736.

“On Tuesday afternoon, U.S. District Judge Mark Walker dismissed FEA’s challenges to the portions of SB 736 that call for teachers to be evaluated based upon students and/or subjects the teachers do not teach, though he expressed reservations on the practice.

We are disheartened by the judge’s ruling. Judge Walker acknowledged the many problems with this evaluation system, but he ruled that they did not meet the standard to be declared unconstitutional. We are evaluating what further steps we might take in this legal process.

Judge Walker indicated his discomfort with the evaluation process in his order.

“The unfairness of the evaluation system as implemented is not lost on this Court,” he wrote. “We have a teacher evaluation system in Florida that is supposed to measure the individual effectiveness of each teacher. But as the Plaintiffs have shown, the standards for evaluation differ significantly. FCAT teachers are being evaluated using an FCAT VAM that provides an individual measurement of a teacher’s contribution to student improvement in the subjects they teach.” He noted that the FCAT VAM has been applied to teachers whose students are tested in a subject that teacher does not teach and to teachers who are measured on students they have never taught, writing that “the FCAT VAM has been applied as a school-wide composite score that is the same for every teacher in the school. It does not contain any measure of student learning growth of the … teacher’s own students.”

In his ruling, Judge Walker indicated there were other problems.

“To make matters worse, the Legislature has mandated that teacher ratings be used to make important employment decisions such as pay, promotion, assignment, and retention,” he wrote. “Ratings affect a teacher’s professional reputation as well because they are made public — they have even been printed in the newspaper. Needless to say, this Court would be hard-pressed to find anyone who would find this evaluation system fair to non-FCAT teachers, let alone be willing to submit to a similar evaluation system.”

“This case, however, is not about the fairness of the evaluation system,” Walker wrote. “The standard of review is not whether the evaluation policies are good or bad, wise or unwise; but whether the evaluation policies are rational within the meaning of the law. The legal standard for invalidating legislative acts on substantive due process and equal protection grounds looks only to whether there is a conceivable rational basis to support them,” even though this basis might be “unsupported by evidence or empirical data.”

Audrey Amrein-Beardsley has been consulting with the seven Houston teachers who filed a lawsuit in federal court against the use of value-added metrics in their evaluations.

 

She has conducted extensive VAM research in Houston and concluded it was arbitrary and inaccurate. “Houston, the 7th largest urban district in the country, is widely recognized for its (inappropriate) using of the EVAAS for more consequential decision-making purposes (e.g., teacher merit pay and in the case of this article, teacher termination) more than anywhere else in the nation.”

 

She believes that this is the lawsuit that has the potential to bring down VAM as a valid way of measuring teacher quality.

 

Read here to learn why.

 

If VAM goes down, as it should, it would be yet one more piece of evidence that Race to the Top is a $5 billion flop, as if any more evidence were needed.

 

Of course, even a court victory against inappropriate teacher evaluation would not deter our Secretary of Education from claiming victory. If he were on the basketball court, he would claim victory if his team were beaten 152-18; we would never hear the end of those heroic, astonishing, incredible 18 points.

 

Jonathan Pelto here reports on a great new piece by civil rights lawyer Wendy Lecker.

 

He writes:

“In her latest MUST READ commentary piece, fellow public education advocate, Wendy Lecker, lays out the facts about Governor Malloy’s unfair, inappropriate and fatally flawed teacher evaluation system. Like the junk bonds that helped take down Wall Street, Connecticut’s teacher evaluation system is based on junk science and false assumptions.

 

The question is not whether the state should have a comprehensive teacher evaluation system, but whether the corporate education reform industry will continue to stand in the way of developing one.

 

Lecker says that Governor Dannel Malloy’s teacher evaluation system is fundamentally flawed.

 

Lecker writes that the solution to failed tests is not more tests.

 

From her article:

 

Fact: Connecticut’s teacher evaluation plan, because it relies on student standardized test scores, is fundamentally flawed. Student test scores cannot measure a teacher’s contribution to student learning. In fact, the president of the Educational Testing Service recently called evaluation systems based on student test scores “bad science.”

 

Rather than admit failure, the Malloy administration is trying futilely to “fix” the fatal flaw. Last week, PEAC, the panel charged with developing Connecticut’s teacher evaluation system, working under the direction of Commissioner Stefan Pryor, approved a change which calls for more standardized tests to be included in a teacher’s evaluation.

 

The commissioner’s “solution” is to add interim tests to a teacher’s rating. Determining what tests will be used, how they will be aligned to the standardized tests, and how all the test scores will be rolled into one “score” for teachers, will likely render this change completely unworkable.

 

She adds:

 

A recent comprehensive study by Northwestern Professor Kirabo Jackson found that children with teachers who help them develop non-cognitive skills have much better outcomes than those who have teachers who may help them raise test scores. Jackson found that every standard deviation increase in non-cognitive skills corresponds to a significant decrease in the drop-out risk and increased rates of high school graduation. By contrast, one standard deviation increase in standardized test scores has a very weak, often non-existent, relationship to these outcomes. Test scores also predict less than 2 percent of the variability in absences and suspensions, and under 10 percent of the variability in on-time grade progression, for example.

Increases in non-cognitive abilities are also strongly correlated with other adult outcomes, such as a lower likelihood of arrest, a higher rate of employment and higher earnings. Increased test scores are not.

In short, focusing on non-cognitive abilities, those not measured by test scores, are more important in predicting success in high school and beyond.

 

Why are the corporate reformers so wedded to standardized tests that they themselves probably could not pass? They love data. They want Big Data. They believe that every problem can be solved by measurement and manipulation of Big Data. They also believe that they can create the appearance of “failing public schools” by generating data showing how many kids are not meeting an artificial benchmark. This enables them to argue for more charter schools that are free to exclude the children who did not meet the artificial benchmark. Big Data is now part of the tool kit of privatization. It is not about helping kids or improving education, but about finding a rationale for turning public dollars over to private managers. If we really wanted to help kids and improve education, we would take the billions now going into testing and use them to reduce class sizes, to increase the arts, and to provide every child the medical care they need.

 

President Obama chose Robert Gordon, who served in key roles in the first Obama administration, as assistant secretary for planning, evaluation, and policy development in the U.S. Department of Education. This is a very important position in the Education Department; he will be the person in charge of the agency that basically decides what is working, what is not, and which way to go next with policy.

 

When he worked in the Office of Management and Budget, Gordon helped to develop the priorities for the controversial Race to the Top program. Before joining the Obama administration, he worked for Joel Klein in the New York City Department of Education.

 

An economist, Gordon was lead author of an influential paper in 2006 that helped to put value-added measurement at the top of the “reformers’” policy agenda. That paper, called “Identifying Effective Teachers Using Performance on the Job,” was co-authored by Thomas J. Kane and Douglas O. Staiger. Kane became the lead adviser to the Gates Foundation in developing its “Measures of Effective Teaching,” which has spent hundreds of millions of dollars trying to develop the formula for the teacher who can raise test scores consistently. Gordon went on to Obama’s Office of Management and Budget, which is the U.S. government’s lead agency for determining budget priorities.

 

The paper co-authored by this triumvirate championed VAM (value-added measurement, i.e., the use of student test scores to judge teacher “effectiveness”) as one of the key policy levers of reform. Here is the abstract:

 

Traditionally, policymakers have attempted to improve the quality of the teaching force by raising minimum credentials for entering teachers. Recent research, however, suggests that such paper qualifications have little predictive power in identifying effective teachers. We propose federal support to help states measure the effectiveness of individual teachers—based on their impact on student achievement, subjective evaluations by principals and peers, and parental evaluations. States would be given considerable discretion to develop their own measures, as long as student achievement impacts (using so-called “value-added” measures) are a key component. The federal government would pay for bonuses to highly rated teachers willing to teach in high-poverty schools. In return for federal support, schools would not be able to offer tenure to new teachers who receive poor evaluations during their first two years on the job without obtaining district approval and informing parents in the schools. States would open further the door to teaching for those who lack traditional certification but can demonstrate success on the job. This approach would facilitate entry into teaching by those pursuing other careers. The new measures of teacher performance would also provide key data for teachers and schools to use in their efforts to improve their performance.

 

This paper, based on economists’ speculation about what works, became a justification often cited for the importance of minimizing teacher certification (“paper qualifications”) and factoring student test scores into teachers’ evaluations, which are a major–if not THE major–component of Race to the Top. The paper’s advocacy of opening the door to uncertified teachers has become a government priority, as shown by Arne Duncan’s award of $50 million to Teach for America (Gordon’s wife worked for TFA), although there is no evidence that TFA can replace the nation’s 3 million teachers and a growing body of evidence that TFA teachers are not more effective than other new teachers or veteran teachers. And since they are usually gone in two years, they have little lasting impact except to increase churn in the teaching staff.

 

Much has happened since Gordon, Kane, and Staiger speculated about how to identify effective teachers by performance measures such as student test scores. We now have evidence that these measures are fraught with error and instability. We now have numerous examples where teachers are evaluated based on the scores of students they never taught. We have numerous examples of teachers rated highly effective one year, but ineffective the next year, showing that what mattered most was the composition of their class, not their quality or effectiveness. Just recently, the American Statistical Association said: “Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. Ranking teachers by their VAM scores can have unintended consequences that reduce quality.” In a joint statement, the National Academy of Education and the American Educational Research Association warned about the defects and limitations of VAM and showed that most of the factors that determine test scores are beyond the control of teachers. Numerous individual scholars have taken issue with the naive belief that teacher quality can be established by the test scores of their students, even when the computer matches as many variables as it can find.

 

 

What we don’t know is this: Has Robert Gordon changed his mind in light of evidence undermining his belief in VAM?

 

Or will the Obama administration continue on its now well-established course, demoralizing veteran teachers, lowering standards for entry-level teachers, dismissing the professional preparation of teachers, and creating new opportunities for the inexperienced, ill-trained recruits of TFA?

 

Having met Robert Gordon and knowing him to be a very smart person, I am betting that he will help the Obama administration change course and inject the wisdom of experience into its policies. That’s my hope.

 

 

Secretary Arne Duncan recently announced his plan to judge teacher education programs by their “results,” including the test scores of the students taught by their graduates. If the Ed Schools can’t produce teachers who can raise test scores, Duncan said, they should go out of business. Spoken like a true businessman.

Mike Rose, celebrated author and professor emeritus at UCLA, has six questions for Arne.

He writes:

Six Questions for Secretary Duncan

1. Will you be evaluating with the same metrics all teacher preparation programs, alternative as well as traditional, Teach for America as well as California State University at Northridge or UCLA?

2. If the Department of Education will use close to $100 million per year on grants to forward its agenda, where will that money come from? From other educational programs that serve needy populations? If so, what services or funding will be cut or discontinued because of this reallocation?

3. Policy formation emerges out of staff research, consultation with experts, and political deliberation. What research and consultation leads you to the current project? I ask because your statement about teacher preparation programs needing to improve “or go out of business” as well as your general approach echoes last year’s report from the National Council on Teacher Quality, a report that has been roundly criticized by a wide range of experts.

4. The National Academy of Education recently issued a comprehensive report on evaluating teacher education programs that recommends an approach very different from yours. Have you read it or consulted its authors?

5. There is an increasing number of respected scholarly organizations—the National Academies Board on Testing and Assessment, AERA, the National Academy of Education, the American Statistical Association—that are advising caution in the use of procedures like value-added to evaluate teacher effectiveness. These organizations point to technical, logistical, and conceptual problems in doing so. One conceptual problem is imputing causality between teachers’ activity and a test score, for so many other variables come into play. Your stated plan will use student test scores to not only judge teachers, but also the institutions from which they come, introducing another level of questionable causal attribution in your model. You will have a putative causal chain that goes from the student test score to the teacher to the teacher’s training institution. How do you plan to address this basic conceptual problem?

6. The implication in your plan that bad schools will go out of business assumes that all prospective teachers are the economist’s idealized free agents who can go wherever a highly rated program exists. But a number of prospective teachers from lower income backgrounds do not have the finances to travel—or cannot travel because of family obligations and expectations. How will you address the possible unintended consequence of your program placing burdens on this segment of the population?

Thanks, Mike. If I hear from Secretary Duncan, I will post his answers.

Audrey Amrein-Beardsley has just published a new book that explains value-added measurement (VAM). It is now available for pre-order.

Having been a classroom teacher and now a university scholar, Beardsley takes a highly critical view of simplistic approaches to teacher evaluation.

Here are her chapter headings.

“Paperback: 256 pages; Chapters: 8 and titled as follows:

Socially Engineering the Road to Utopia
Value-Added Models (VAMs) and the Human Factor
A VAMoramic View of the Nation
Assumptions Used as Rationales and Justifications
Test-Based, Statistical, and Methodological Assumptions
Reliability and Validity
Bias and the Random Assignment of Students into Classrooms
Alternatives, Solutions, and Conclusions”