While cleaning up my files, I discovered this excellent article by Alan Ehrenhalt, a contributing editor at Governing magazine (and formerly its executive editor for 19 years). It was written in 2013 but remains pertinent today.
Ehrenhalt sees through the fraud in the high-stakes testing obsession of our day, in which scores on standardized tests are used to label children, rate teachers, and close schools.
He begins by writing about the Tony Bennett grade-rigging scandal in Indiana, then moves on to Florida, where Jeb Bush launched measurement mania:
The Tampa Bay Times newspaper lamented that “after grading schools for 15 years, Florida’s education leaders still cannot get it right.”
One might easily go further and argue that changing the results to make the picture look brighter, whether it involves outright cheating or not, is cause for embarrassment all by itself. If new test questions can have that much effect on a school’s overall performance grade, then why should anybody believe in the integrity of the system?
What’s especially humiliating is that Florida is the birthplace of the school testing movement, the state where former Gov. Jeb Bush decided in 1999 to begin awarding overall letter grades to individual schools to provide information for parents and help assess statewide educational performance. More than a dozen states have begun using a similar system since then, several of them just in the current year. Now they are being told that the Florida model they dutifully copied is too full of flaws to be trusted.
That matters a great deal because a lot more is riding on FCAT test scores than just local bragging rights. If a school receives repeated grades of D or F, it can be required by the state to take a variety of drastic measures, such as making the entire faculty reapply for their jobs, converting the school to a charter or closing it down altogether. So public confidence in the grading process is essential if the state is to have any credibility as a dispenser of draconian educational remedies.
States applying or adapting the Florida model have learned that changing the questions on the test, or switching to a new type of test altogether, can result in wildly fluctuating school grades. School officials in New Mexico this year were delighted to find out that the number of schools receiving A grades had more than doubled in comparison with the results from the year before. Was this the product of innovative new pedagogical techniques? Well, no. It was because the state had abandoned the federally designed No Child Left Behind test and switched to a new one designed by state education experts. Mississippi had a similar experience. Its school test scores went up dramatically because state officials took the expedient step of removing high school graduation rates from the list of test criteria for some schools.
The dramatically higher scores that resulted were a cause for initial state elation. But on further review, they raised another serious question. If the testing process is based on solid educational research, then the results from different tests ought to be reasonably congruent. If the results are dramatically disparate, there is a disturbing suggestion that the people writing the tests aren’t sure what it is they are supposed to be measuring.
Then he shifts his focus to Maine:
Maine is another state that has endured a season of controversy based on the introduction of its new school grading procedures. Gov. Paul LePage, a tireless advocate of school measurement, pushed through a new system this year based largely on the Florida model. Schools were evaluated on student test scores in reading and math; the percentage of students who had shown improvement in their scores during the past year, especially among the bottom 25 percent; graduation rates among upper-level students; and percentage of students who take the national SAT exam.
When the statewide results were tallied, Maine’s schools averaged a C grade—a reasonable enough sounding score. But when researchers in the state began looking at the results in greater detail, they found something that disturbed them. What the tests were really tracking was demographics. Schools in poorer communities around the state nearly all finished lower than their counterparts in affluent suburbs, regardless of academic methods. High schools that were graded A had an average of 9 percent of their students on free or reduced price lunch. Schools that got an F had 61 percent of their students receiving subsidized lunches. To a great extent, the test was simply a measure of poverty, not school quality.
He recognizes that testing has become a problem in itself:
It is hard not to conclude in the end that the school testing movement represents a popular fad in educational policy that is desperately lacking in either substantive methodology or common sense. Its fundamental assumption, underneath all the jargon, is that schools fail because they just aren’t trying hard enough, not because they are being asked to educate pupils who are culturally and socially unprepared to learn. Cooking the books on the tests won’t do anything to solve this problem. All it will do, when the extent of the mischief is revealed, is undermine public confidence in the entire enterprise of school testing.
We have gotten into the business of measuring school performance with precise testing numbers because it’s something we know how to measure. In doing so, we leave aside the subtler and more personal things that teachers and principals do all the time to make their schools function in an orderly way and disseminate as much learning as they possibly can. In the words of Roger Jones, a professor at Lynchburg College in Virginia, one of the states that enacted an A-F grading system this year: “We have gotten so caught up in testing that we have lost sight of a true education.”