Archives for category: Teacher Evaluations

Here is Arne Duncan’s statement on the Vergara decision that tenure and seniority are unconstitutional. Not a word about the real causes of unequal opportunity: poverty and segregation. Who would have believed that a Democratic administration would stand by silently as collective bargaining rights are rolled back (Wisconsin) and would hail a court decision removing due process from public school teachers? Mitt Romney’s Secretary of Education (had he won) could have issued this press release:

Statement from U.S. Secretary of Education Arne Duncan Regarding the Decision in Vergara v. California:
JUNE 10, 2014

Contact:
Press Office, (202) 401-1576, press@ed.gov

“For students in California and every other state, equal opportunities for learning must include the equal opportunity to be taught by a great teacher. The students who brought this lawsuit are, unfortunately, just nine out of millions of young people in America who are disadvantaged by laws, practices and systems that fail to identify and support our best teachers and match them with our neediest students. Today’s court decision is a mandate to fix these problems. Together, we must work to increase public confidence in public education. This decision presents an opportunity for a progressive state with a tradition of innovation to build a new framework for the teaching profession that protects students’ rights to equal educational opportunities while providing teachers the support, respect and rewarding careers they deserve. My hope is that today’s decision moves from the courtroom toward a collaborative process in California that is fair, thoughtful, practical and swift. Every state, every school district needs to have that kind of conversation. At the federal level, we are committed to encouraging and supporting that dialogue in partnership with states. At the same time, we all need to continue to address other inequities in education–including school funding, access to quality early childhood programs and school discipline.”

Realcleareducation.com reports that the Gates Foundation favors a moratorium on the consequences of Common Core testing. Since the standards were bought and paid for by the Gates Foundation, it is only right that it should call the shots. Now we know who is in charge of American education. Perhaps the foundation hopes that a delay will defuse the growing movement against Common Core.

Realcleareducation writes:

“Good morning, it’s Tuesday June 10. This morning at RealClearEducation we have news, commentary, analysis, and reports from the education world. This morning Vicki Phillips, Director of Education, College Ready at the Gates Foundation, will call for delaying the attachment of any consequences to the new Common Core State Standards, bolstering the position of those calling for an accountability moratorium. Depending on your perspective that will help or hinder implementation of the new standards more than 40 states are adopting.”

Peter Greene asks: if you had your choice, which head of the hydra-headed reform monster would you lop off first?

Hint: one of those heads is essential for all the others. I agree with his choice.

Audrey Amrein-Beardsley invited an economist to review Raj Chetty & Co’s effort to take down the statement of the American Statistical Association.

Chetty and friends are the leading advocates for using test scores to rank teachers and fire those whose students have the lowest scores. The ASA report was inconvenient for their thesis, as it pointed out that teachers account for a small percentage of the variance in test scores, that what is observed is likely to be correlation, not causation, and that putting so much weight on test scores was likely to cause bad effects.

There is also the inconvenient fact that VAM is no longer a neat theory, but has moved into the realm of reality. Some districts have used it, often with unfortunate results. One would think that academics would feel some obligation to see how their theory is working in practice rather than continue to market it on grounds of its theoretical elegance.

Bill Gates has loomed large in education for the past decade. The reason is obvious: his foundation is the largest in the world, and districts are more than willing to accept his conditions in return for his money.

When anyone asks Gates whether it is right that one man and one foundation should have so much influence, he says that the money he gives is minuscule compared to the hundreds of billions spent annually by American schools. But he is being disingenuous, and he knows it. Almost all of those billions are fixed costs, whereas his money is discretionary. A district with a huge budget, often facing budget cuts, will dance to Bill Gates’ tune. All he need do is dangle $50-100 million, and district leaders will do as he asks.

But what happens when he is wrong? In the first decade of this century, he said that small high schools were THE answer, and districts lined up to get money and break up large high schools. It wasn’t a bad idea, but it wasn’t THE idea: in 2008, he decided it wasn’t producing the miraculous results he wanted (ROI, return on investment), and he dropped it.

Since he can’t tolerate being without answers, he next placed his bets on raising teacher quality. A good idea poorly executed. Instead of changing working conditions or coming up with other ways to make teaching a rewarding profession, Gates chose to go the punitive route. He decided that all of American education was broken, and that teacher evaluation was the most broken part of it. For whatever reason, administrators were not weeding out the incompetents, and he decided to make that his mission. He never stopped to ask why 40% or so of new teachers left teaching within five years of starting.

How to evaluate millions of teachers? Gates had the answer. Use the test scores of their students to a significant degree to find out who was best and worst.

Given Gates’ unusual power, the U.S. Department of Education decided that he must be right, even though the research was thin and speculative. No need to conduct experiments to see if Bill was right. He is so rich, he must be right. So, Race to the Top required states to include Bill’s idea (judging teachers by their students’ test scores to a significant degree) if they wanted to be eligible for any part of the $4.35 billion prize, or later, if they wanted a waiver from NCLB’s punishments for failing to make 100% of their students proficient by 2014.

Some districts have now experimented with “value-added assessment” for four years, and no miracle is in sight. Most researchers say the methodology is so flawed that it will never work. The most recent study, conducted by Andy Porter, dean at the University of Pennsylvania, and Morgan Polikoff of the University of Southern California, found little or no correlation between teacher quality and VAM ratings. This study was funded, ironically, by the Gates Foundation.

The question now is, will Bill Gates have the courage to admit he was wrong, as he did in 2008?

The American Statistical Association released a brief report on value-added assessment that was devastating to its advocates.

ASA said it was not taking sides, but then set out some caveats that left VAM with no credibility.

Can a school district judge a teacher’s quality by the test scores of his or her students?

ASA wrote this:

“VAMs are generally based on standardized test scores, and do not directly measure potential teacher contributions toward other student outcomes.

• VAMs typically measure correlation, not causation: Effects – positive or negative – attributed to a teacher may actually be caused by other factors that are not captured in the model.

• Under some conditions, VAM scores and rankings can change substantially when a different model or test is used, and a thorough analysis should be undertaken to evaluate the sensitivity of estimates to different models.

• VAMs should be viewed within the context of quality improvement, which distinguishes aspects of quality that can be attributed to the system from those that can be attributed to individual teachers, teacher preparation programs, or schools. Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. Ranking teachers by their VAM scores can have unintended consequences that reduce quality.”

Now, if teachers account for only 1%-14% of the variability in test scores; and if the majority of opportunities for quality improvement are found in the system, not individuals; and if VAM ranking “can have unintended consequences that reduce quality,” then it is hard to read this statement as anything other than a warning about the danger of relying on VAM to rank teachers.

But our intrepid team of Harvard economists is unfazed!

What do Chetty, Friedman, and Rockoff say about the ASA statement? Do they modify their conclusions? No. Did it weaken their arguments in favor of VAM? Apparently not. They agree with all of the ASA cautions but remain stubbornly attached to their original conclusion that one “high-value added (top 5%) rather than an average teacher for a single grade raises a student’s lifetime earnings by more than $50,000.” How is that teacher identified? By the ability to raise test scores. So, again, we are offered the speculation that one tippy-top fourth-grade teacher boosts a student’s lifetime earnings, even though the ASA says that teachers account for “about 1% to 14% of the variability in test scores…”

On May 3, I received an email from Professor Raj Chetty of Harvard University, informing me that his famous paper on value-added assessment of teachers was being published by the American Economic Review. The paper has three authors: Chetty, John Friedman, and Jonah Rockoff, also at Harvard. When the paper was first released, it was reported on the front page of the New York Times, one of the authors discussed it on the PBS Newshour, and President Obama referred to it in his 2012 State of the Union address.

The New York Times story appeared on January 6, 2012. It began thus:

“WASHINGTON — Elementary- and middle-school teachers who help raise their students’ standardized-test scores seem to have a wide-ranging, lasting positive effect on those students’ lives beyond academics, including lower teenage-pregnancy rates and greater college matriculation and adult earnings, according to a new study that tracked 2.5 million students over 20 years.”

The reporter noted that the effect of a single “high-value” teacher was actually quite modest: “The average effect of one teacher on a single student is modest. All else equal, a student with one excellent teacher for one year between fourth and eighth grade would gain $4,600 in lifetime income, compared to a student of similar demographics who has an average teacher. The student with the excellent teacher would also be 0.5 percent more likely to attend college.” But think of the aggregate effect on an entire classroom: “Replacing a poor teacher with an average one would raise a single classroom’s lifetime earnings by about $266,000, the economists estimate. Multiply that by a career’s worth of classrooms.” President Obama cited the aggregate income gain for a classroom in his State of the Union address 18 days later.

This was the takeaway from the authors, as reported in the New York Times:

“The authors argue that school districts should use value-added measures in evaluations, and to remove the lowest performers, despite the disruption and uncertainty involved.

“The message is to fire people sooner rather than later,” Professor Friedman said.

“Professor Chetty acknowledged, “Of course there are going to be mistakes — teachers who get fired who do not deserve to get fired.” But he said that using value-added scores would lead to fewer mistakes, not more.

“Still, translating value-added scores into policy is fraught with problems. Judging teachers by their students’ test scores might encourage cheating, teaching to the test or lobbying to have certain students in class, for instance.”

The Chetty, et al, study supported VAM, which was the central feature of Race to the Top. Fire teachers sooner rather than later. One great teacher can produce lifetime gains.

Over the past few years, as more districts have implemented VAM, it has turned out to be far more complicated than the economists predicted to determine which teachers would produce great scores year after year, and which would not. Teachers were rated effective one year, ineffective the next year. Those who taught English learners, the gifted, and students with disabilities were less likely to get big gains. It turned out that VAM is affected by the composition of the classroom, since students are not randomly assigned.

But their paper continues to be the lodestar of VAM research.

Whereas it had originally appeared as a single paper published by the National Bureau of Economic Research, the editors suggested the paper was so important that it should be split into two papers and published separately. The last time this had happened was in 1971, for papers on taxation that had won two Nobel Prizes.

Here are the papers.

w19423.pdf

w19424.pdf

Professor Chetty’s email was addressed to me and Audrey Amrein-Beardsley, who has written extensively and critically about value-added assessment. In addition to her recently published book on VAM—Rethinking Value-Added Models in Education: Critical Perspectives on Tests and Assessment-Based Accountability—she writes a blog called VAMboozled that I often cite.

For the record, I have never met Raj Chetty, and I have met Beardsley once, when she interviewed me for an oral history archive.

I asked Beardsley if she would be willing to review the latest iteration of this now famous study of VAM, and she did, here on her blog.

Beardsley notes that there is a divide between econometricians, like Chetty, Friedman, and Rockoff, and educational researchers, who often feel some obligation to visit classrooms and see the effects of policies, not just analyze data from a great distance, without reference to context or something like reality.

Professor Chetty and I exchanged several emails. I asked for his permission to post our exchange. He said that he preferred that I not post his comments, which were invariably polite, but of course I was free to post my comments to him.

So here goes. This was my first response:

Dear Professor Chetty,

I certainly agree that teachers are valuable. I had some wonderful teachers
as I was growing up, also some mediocre ones, and a few really bad ones. I
went to an ordinary public school system in Houston, not an elite private
school.

I wish that this sentiment about the value of teachers was all that came
from your vast publicity machine.

Instead, we get more high-stakes testing, more test prep, more phony claims
that the work of my fourth-grade or fifth-grade teacher was responsible for
my not getting pregnant when I was 15. Maybe my lifetime income was
increased by my sixth-grade teacher, though I doubt it. Funny, I was one of
eight children. We all had the same teachers, and we all turned out
differently. Some of us did well in school, others nearly flunked out. Was
it the fault of our teachers?

I know you love your celebrity–and hobnobbing with Obama and Duncan and
supporting their emphasis on testing and firing teachers sooner rather than
later—but think of the harm that you do to millions of children and their
teachers by the way you publicize your work. Do you feel good every time you
read about a teacher who is graded based on the work of children she never
taught? Or the “highly effective” teacher who was rated ineffective the next
year based on test scores? Or the precipitous decline in the number of
people who want to be teachers because of the non-stop attacks on teachers?
I don’t think your positive message is getting through. All people hear is
that you want those lousy teachers whose kids get low scores to be fired.
Now.

Diane Ravitch

On May 5, I wrote to both Raj and Audrey (we had reached a first-name basis):

Raj and Audrey,

I don’t know whether my thoughts advance or retard this informed discussion.

I look at the Chetty et al. study as comparable to a pilot in a bomber
dropping a bomb on a city 30,000 feet below. He didn’t construct the bomb,
he doesn’t know how it hurts the people below, he can’t be held responsible
if his good intentions went wrong.

I invite you to read this blog by a teacher in Oklahoma:
http://bluecerealeducation.blogspot.com/2014/05/ms-bullens-data-rich-year.html

The odds are that he never heard of Raj Chetty. But look what Raj Chetty has
done to the quality of education, the students, and the teachers in
Oklahoma. Is this something to be proud of?

Your work–not yours alone, of course–has encouraged a technocratic
approach to education that would never be tolerated in our nation’s elite
private schools.

The pursuit of higher test scores on stupid multiple-choice standardized
tests does not improve education: it corrupts it.

Those who care deeply about humanistic education, about the life of the
mind, about deep learning, find your work–no matter how technically
perfect–utterly appalling. It drains education of joy and discovery and
makes everyone a slave to Pearson.

I would love to discuss this further with you over a glass of wine. I can’t
believe you do not understand the pernicious effects of your famous study,
featured on the first page of the New York Times, on the PBS Newshour, and
in President Obama’s State of the Union Address.

It seems to be my life work to insist that education is far, far more than a
score on a standardized test. Somehow, I suspect you agree. You are far too
intelligent not to.

Diane

Later on the same day, May 5, Raj responded, and I wrote:

Thanks, Raj,

A question and a comment.

My question: Could I publish our exchange on my blog? I get about 25,000-40,000 readers daily. But I would publish nothing without your permission.

My comment: Race to the Top has incentivized the use of VAM in most states. Your study has been cited by Obama and Duncan as evidence that they are on the right track, that it is “bad teachers,” not poverty, that cause low test scores.

Based on the real-world effects of VAM on real children and real teachers, I conclude that VAM has limited use, perhaps informative in looking at the effects of policies and programs (faithfully enacted, which they seldom are) in a school or a district, but of zero value in assessing individual teacher quality. As you must know by now, the ratings for individual teachers are unstable, and may change if a different test is used or for no apparent reason at all. Teachers intuitively know that their ratings reflect the composition of the class, not their “quality” or efficacy as teachers. Even if VAM did work–and it does not–it would keep every teacher singularly focused on standardized tests, which narrow the curriculum, encourage schools and teachers to avoid the neediest students, promote test prep and cheating, and have other perverse effects.

At the end of the day, I as a mother and grandmother would not want my offspring to be enrolled in a school where standardized tests dominate teaching and learning. And that is precisely what VAM is doing to our nation’s public schools.

My third grandson enters third grade in a New York City public school next September. I hope by then that the opt out movement has grown so strong that teachers cannot be subjected to unfair and inaccurate VAMs. I will do whatever I can to encourage parents in every school district in the U.S. to keep their children home on testing day. That seems to be the only way that the giant standardized testing machine can be stopped.

Your work has been crucial in promoting standardized testing as the measure of teacher quality, even though major scholarly organizations disagree (the American Educational Research Association, the National Academy of Education, the American Statistical Association).

If you have modified your views (message: “fire teachers sooner rather than later”); if you have learned anything new since you first introduced your findings, I would love to know about it.

I repeat that I do not have the technical ability to argue algorithms with you. Your study may be technically brilliant. But its consequences for the quality of education and the lives of children and teachers have been disastrous. In its current application, it is Junk Science. Since I feel certain you don’t want to be remembered in history as the economist who sponsored Junk Science and treated children as data points, I hope you will give me reason to believe that you have rethought the conclusions of your study and provided clear warnings about the limitations and misuses of VAM.

Diane

We ended with the understanding that I would not quote his words or paraphrase them. I think I was true to that understanding.

Now, as I told him, I am not an economist, and I lack the technical proficiency to critique his paper. Maybe it will win two or three Nobel prizes. If all it says is that teachers are valuable, I agree. If it says that teachers affect eternity, I agree.

But if he really expects me to believe that my fifth-grade teacher (or was it my fourth-grade teacher?) caused me to get higher test scores, and that because of her and my higher test scores, I did not get pregnant when I was 15, I think this is just plain silly.

This strikes me as the kind of study that brings huzzahs from economists for its technical precision, but is unrelated to the messiness of real life. The numbers may all add up, but there are no living, breathing students or teachers here, just data.

It is so incredibly frustrating to me to see economists and policymakers playing with the lives of children and teachers as if they were ants seen from a far distance or merely data points. I recommend to my new friend Raj a book by Yale Professor James C. Scott titled “Seeing Like a State: How Certain Schemes to Improve the Human Condition Have Failed.” It changed my life. Maybe it will change his too.

Bob Shepherd writes on the absurd demands now placed on teachers and principals by politicians, who expect to see higher test scores every year. Step back and you realize that the politicians, the policy wonks, the economists, and the ideologues are ruining education, not improving it. They are doing their best to demoralize professional educators. What are they thinking? Are they thinking? Or is it just their love of disruption, let loose on children, families, communities, and educators?

Bob Shepherd writes:

OK, you are sitting in your year-end evaluation session, and you’ve heard from every other teacher in your school that his or her scores were a full level lower this year than last, and so you know that the central office has leaned on the principal to give fewer exemplary ratings even though your school actually doesn’t have a problem with its test scores and people are doing what they did last year but a bit better, of course, because one grows each year as a teacher–one refines what one did before, and one never stops learning.

But you know that this ritual doesn’t have anything, really, to do with improvement. It has to do with everyone, all along the line, covering his or her tushy and playing the game and doing exactly what he or she is told. And, at any rate, everyone knows that the tests are not particularly valid and that’s not really the issue at your school because the test scores are pretty good because this is a suburban school with affluent parents, and the kids always, year after year, do quite well.

So whether the kids are learning isn’t really the issue. The issue is that by some sort of magic formula, each cohort of kids is supposed to perform better than the last–significantly better–on the tests, though they come into your classes in exactly the same shape they’ve always come into them in because, you know, they are kids and they are just learning and teaching ISN’T magic. It’s a lot of hard work. It’s magical, sometimes, of course, but it’s not magic. There’s no magic formula.

So, the stuff you’ve been told to do in your “trainings” (“Bark. Roll over. Sit. Good Boy”) is pretty transparently teaching-to-the-test because that’s the only way the insane demand that each cohort will be magically superior to the last as measured by these tests can be met, but you feel in your heart of hearts that doing that would be JUST WRONG–it would short-change your students to start teaching InstaWriting-for-the-Test, Grade 5, instead of, say, teaching writing. And despite all the demeaning crap you are subjected to, you still give a damn.

And you sit there and you actually feel sorry for this principal because she, too, is squirming like a fly in treacle in the muck that is Education Deform, and she knows she has fantastic teachers who knock it out of the park year after year, but her life has become a living hell of accountability reports and data chats to the point that she doesn’t have time for anything else anymore (she has said this many times), and now she has to sit there and tell her amazing veteran teachers who have worked so hard all these years and who care so much and give so much and are so learned and caring that they are just satisfactory, and she feels like hell doing this and is wondering when she can retire.

And the fact that you BOTH know this hangs there in the room–the big, ugly, unspoken thing. And the politicians and the plutocrats and the policy wonks at the Thomas B. Fordham Institute and the Secretary of the Department for the Standardization of US Education, formerly the USDE, and the Vichy education guru collaborators with these people barrel ahead, like so many drunks in a car plowing through a crowd of pedestrians.

New Mexico recently released teacher ratings, 50% based on standardized test scores. The teachers are hopping mad, because they know that the evaluations do not truly measure their quality, and the tests are not good measures of what students know and can do.

In Taos, teachers burned their evaluation reports. Teachers in Albuquerque also burned their evaluations as a sign of protest.

During the Vietnam war, anti-war protestors burned their draft cards. Feminists burned their bras in protest at the Miss America contest in 1968.

This is a venerable protest activity against injustice.

Audrey Amrein-Beardsley, one of our nation’s pre-eminent experts on value-added assessment, here reviews a TEDx talk by Tennessee Commissioner of Education Kevin Huffman, boasting of the tremendous growth in test scores as a result of his policies. Beardsley points out the curious fact that Tennessee started using VAM in the 1990s with little to show for it. But there were those Tennessee NAEP scores, proof positive, according to both Huffman and Secretary of Education Arne Duncan, that Race to the Top–or Huffman’s personal presence–was creating strong results. And in the end, results (test scores) are what matter most, right?

But what about those NAEP results that Huffman and Duncan tout?

Beardsley writes:

“While [William] Sanders (the TVAAS developer who first convinced the state legislature to adopt his model for high-stakes accountability purposes in the 1990s) and others (including U.S. Secretary of Education Arne Duncan) also claimed that Tennessee’s use of accountability instruments caused Tennessee’s NAEP gains (besides the fact that the purported gains were over two decades delayed), others have since spoiled the celebration because 1) the results also demonstrated an expanding achievement gap in Tennessee; 2) the state’s lowest socioeconomic students continue to perform poorly, despite Huffman’s claims; 3) Tennessee didn’t make gains significantly different than many other states; and 4) other states with similar accountability instruments and policies (e.g., Colorado, Louisiana) did not make similar gains, while states without such instruments and policies (e.g., Kentucky, Iowa, Washington) did. I should add that Kentucky’s achievement gap is also narrowing and their lowest socioeconomic students have made significant gains. This is important to note as Huffman repeatedly compares his state to theirs.”

Read the post. It is a very good demonstration of how data get used and misused for political purposes.