This article on “The Costs of Accountability” appeared in The American Interest. It was written by Jerry Z. Muller, a professor of history at Catholic University of America in Washington, D.C. It is a long and thoughtful article, and I can offer just a few snippets. I urge you to read it. It is a five-star article that explains how much money and energy are wasted in pursuit of the Golden Fleece of “accountability.” It has become an industry unto itself.
He begins:
The Google Ngram Viewer, which instantly searches through thousands of scanned books and other publications, provides a rough but telling portrait of changes in our culture. Set the parameters by years, type in a term or phrase, and up pops a graph showing the incidence of the words selected from 1800 to the present. Look up “gender”, for example, and you will see a line that curves upward around 1972; the slope becomes steeper around 1980, reaches its peak in 2000, and afterwards declines gently. Type in “accountability” and behold a line that begins to curve upward around 1965, with an increasingly steep upward slope after 1985. So too with “metrics”, whose steep increase starts around 1985. “Benchmarks” follows the same pattern, as does “performance indicators.” But unlike “gender”, the lines for “accountability”, “metrics”, “benchmarks”, and “performance indicators” are all still on the upswing.
Today, “accountability” and its kissing cousins “metrics” and “performance indicators” seem to be, if not on every lip, then on every piece of legislation, and certainly on every policy memo in the Western world. In business, government, non-profit organizations, and education, “accountability” has become a ubiquitous meme—a pattern that repeats itself endlessly, albeit with thousands of localized variations.
The characteristic feature of the culture of accountability is the aspiration to replace judgment with standardized measurement. Judgment is understood as personal, subjective, and self-interested; metrics are supposed to provide information that is hard and objective. The strategy behind the numbers is to improve institutional efficiency by offering rewards to those whose metrics are highest or whose benchmarks have been reached, and by punishing those who fall behind relative to them. Policies based on these assumptions have been on the march for decades, hugely enabled in recent years by dramatic technological advances, and as the ever-rising slope of the Ngram graphs indicate, their assumed truth goes marching on.
The attractions of accountability metrics are apparent. Yet like every culture, the culture of accountability has carved out its own unquestioned sacred space and, as with all arguments from presumed authority, possesses its characteristic blind spots. In this case, the virtues of accountability metrics have been oversold and their costs are underappreciated. It is high time to call accountability and metrics to account.
That might seem a quixotic, if not also a perverse, aspiration. What, after all, could be objectionable about accountability? Should not individuals, departments, divisions, be held to account? And how to do that without counting what they are doing in some standardized, numerical form? How can they be held to firm standards and expectations without providing specific achievement goals, that is, “benchmarks”? And how are people and institutions to be motivated unless rewards are tied to measurable performance? To those in thrall to the culture of accountability, to call its virtues into question is tantamount to championing secrecy, irresponsibility, and, worst of all, imprecision. It is to mark oneself as an enemy of democratic transparency.
To be sure, decision-making based on standardized measurement is often superior to judgment based on personal experience and expertise. Decisions based on big data are useful when the experience of any single practitioner is likely to be too limited to develop an intuitive feel for or reliable measure of efficacy. When a physician confronts the symptoms of a rare disorder, for example, she is better advised to rely on standardized criteria based on the aggregation of many cases. Data-based checklists—standardized procedures for how to proceed under routine or sometimes emergency conditions—have proven valuable in fields as varied as airline operation, rescue squad work, urban policing, and nuclear power plant safety, among a great many.
Clearly, the attempt to measure performance, however difficult it can be, is intrinsically desirable if what is actually measured is a reasonable proxy for what is intended to be measured. But that is not always the case, and between the two is where the blind spots form.
Measurement schemes are deceptively attractive because they often “prove” themselves through low-hanging fruit. They may indeed identify and help to remedy specific problems: It’s good to know which hospitals have the highest rates of infections, which airlines have the best on-time arrival records, and so on, because it can energize and improve performance. But, in many cases, the extension of standardized measurement may suffer diminished utility and even become counterproductive if sensible pragmatism gives way to metric madness. Measurement can readily become counterproductive when it tries to measure the unmeasurable and quantify the unquantifiable, whether to determine rewards or for other purposes. This tends to be the case as the scale of what is being measured grows while the activity itself becomes functionally differentiated, and when those tasked with doing the measuring are detached organizationally from the activity being measured.
He writes specifically about education:
No Child, Doctor, or Cop Left Behind
In the public sector, the show horse of accountability became “No Child Left Behind” (NCLB), an educational act signed into law with bipartisan support by George W. Bush in 2001 whose formal title was, “An act to close the achievement gap with accountability, flexibility, and choice, so that no child is left behind.”
The NCLB legislation grew out of more than a decade of heavy lobbying by business groups concerned about the quality of the workforce, civil rights groups worried about differential group achievement, and educational reformers who demanded national standards, tests, and assessment. The benefit of such measures was oversold, in terms little short of utopian.
Thus William Kolberg of the National Alliance of Business asserted that, “the establishment of a system of national standards, coupled with assessment, would ensure that every student leaves compulsory school with a demonstrated ability to read, write, compute and perform at world-class levels in general school subjects.” The first fruit of this effort, on the Federal level, was the “Improving America’s Schools Act” adopted under President Clinton in 1994. Meanwhile, in Texas, Governor George W. Bush became a champion of mandated testing and educational accountability, a stance that presaged his support for NCLB.
Under NCLB states were to test every student in grades 3–8 each year in math, reading, and science. The act was meant to bring all students to “academic proficiency” by 2014, and to ensure that each group of students (including blacks and Hispanics) within each school made “adequate yearly progress” toward proficiency each year. It imposed an escalating series of penalties and sanctions for schools in which the designated groups of students did not make adequate progress. Despite opposition from conservative Republicans antipathetic to the spread of Federal power over education, and of some liberal Democrats, the act was co-sponsored by Senator Edward Kennedy and passed both houses of Congress with majority Republican and Democratic support. Advocates of the reforms maintained that the act would create incentives for improved outcomes by aligning the behavior of teachers, students, and schools with “the performance goals of the system.”
Yet more than a decade after its implementation, the benefits of the accountability provisions of NCLB remain elusive. Its advocates grasp at any evidence of improvement on any test at any grade in any demographic group for proof of NCLB’s efficacy. But test scores for primary school students have gone up only slightly, and no more quickly than before the legislation was enacted. Its impact on the test scores of high school students has been more limited still.
The unintended consequences of NCLB’s testing-and-accountability regime are more tangible, however, and exemplify many of the characteristic pitfalls of the culture of accountability. Under NCLB, scores on standardized tests are the numerical metric by which success and failure are judged. And the stakes are high for teachers and principals, whose salaries and very jobs depend on this performance indicator. It is no wonder, then, that teachers (encouraged by their principals) divert class time toward the subjects tested—mathematics and English—and away from history, social studies, art, and music. Instruction in math and English is narrowly focused on the skills required by the test rather than broader cognitive processes: Students learn test-taking strategies rather than substantive knowledge. Much class time is devoted to practicing for tests, hardly a source of stimulation for pupils.
Even worse than the perverse incentives involved in “teaching to the test” is the technique of improving average achievement levels by reclassifying weaker students as disabled, thus removing them from the assessment pool. Then there is out-and-out cheating, as teachers alter student answers or toss out tests by students likely to be low scorers, phenomena well documented in Atlanta, Chicago, Cleveland, Houston, Dallas, and other cities. Mayors and governors have diminished the difficulty of tests, or lowered the grades required to pass the test, in order to raise the pass rate and thus demonstrate the success of their educational reforms—and get more Federal money by so doing.
Another effect of NCLB is the demoralization of teachers. Many teachers perceive the regimen created by the culture of accountability as robbing them of their autonomy, and of the ability to use their discretion and creativity in designing and implementing the curriculum. The result has been a wave of early retirements by experienced teachers, and the movement of the more creative ones away from public and toward private schools, which are not bound by NCLB.
Despite the pitfalls of NCLB, the Obama Administration doubled down on accountability and metrics in K-12 education. In 2009, it introduced “Race to the Top”, which used funds from the American Recovery and Reinvestment Act to induce states “to adopt college- and career-ready standards and assessments; build data systems that measure student growth and success; and link student achievement to teachers and administrators.” This shows what happens these days when accountability metrics do not yield the result desired: Measure more, but differently, until you get the result you want.
Metric madness is not limited to education. Some of the problems evident in NCLB pop up in fields from medicine to policing.

post by Connie. You write: “Even worse than the perverse incentives involved in ‘teaching to the test’ is the technique of improving average achievement levels by reclassifying weaker students as disabled, thus removing them from the assessment pool.”
I would submit that far worse than what you describe is the policy being pushed by the administration that Special Education students be LEFT IN the testing pool except for the 1% of students who are most seriously disabled. This leaves countless disabled students IN the testing pool even though they may need more meaningful accommodations than are offered, and even though they may need an alternative assessment accommodating their disability. In states like Virginia, where passing high-stakes tests is tied DIRECTLY to graduation, this will result in many Special Education students being denied a diploma because they can’t pass the more rigorous high-stakes tests. This is totally at odds with the letter and spirit of laws designed to protect disabled students that have been in place for decades. It is a travesty.
Likewise, many arbitrary and educationally unsound decisions have been made regarding ELLs in this frenzy to collect data and stack rank everyone and every place.
“Measurement can readily become counterproductive when it tries to measure the unmeasurable and quantify the unquantifiable, whether to determine rewards or for other purposes. This tends to be the case as the scale of what is being measured grows while the activity itself becomes functionally differentiated, and when those tasked with doing the measuring are detached organizationally from the activity being measured.”
Hellooooooooooo, Wilson!
That is indeed a five-star must-read. A good reminder that we in education are suffering from a disease that is widespread in the culture.
One of our favorite quotes: Einstein: Not everything that counts can be counted and not everything that can be counted counts.
That which counts that cannot be counted is the historical process by which young human beings act to shape their own potential. Genuine teaching is the task of guiding these young people to complete this fundamental task of becoming an adult with the most care, grace, intelligence and wisdom.
I would bet that every one of us who learned from a great teacher knows exactly what I’m talking about. Count what you will, but this uncountable experience is essential in real education of the young.
Do we really want hedge fund overlords in charge of this basic part of childhood and adolescence?
I’ll also bet that these hedge fund folks send their kids to schools that embrace what I’m saying to the fullest. Their kids’ education shall not be subjected to foolish accountability, which wastes kids’ precious time growing up.
In other words, psychopaths don’t like to lose; they don’t want to be seen as wrong, so they keep on running down the path of destruction even if it ends with the death of a democracy. And even after they have destroyed the United States and turned its public schools into some sort of vampire academies, they will still blame their failure on teachers, police, and doctors.
If we take a closer look at the details of history, I think we might discover what causes bloody, violent revolutions and the downfall of civilizations. To save our current civilization, what we really need is standardized tests that reveal who the psychopaths are and then block them from positions of leadership, influence and wealth. I understand that there are already DNA tests and brain scans that do this. We could borrow from the RheeFormers and label those tests BAMs because they will be used to save the rest of us from THEM.
It’s the same old story, with all of its glory…
The establishment rests on established voices. These voices “white knight” oppression through a historically dominant and “officially” sanctioned fantasy. Problem is, sustaining a country on the fantasy of wishful thinking is wishful thinking, not reality, as defined by the results.
We may repeat the fantasy till the cows come home. We can “measure” the primary function of establishing consciousness (pedagogy) through faux metrics. Meanwhile, back at the ranch, the level of consciousness is exposed by the effectiveness of marketing.
No doubt, poverty interrupts the sequence. However, more than just the poor are “captured” by the marketing.
I have bookmarked this article. There is a wealth of ideas and evidence that can be used, and expanded upon, to help us to save our schools and fight back against the bad policies coming down from on high.
Followed by Matrix Mania.
The organizational mania of Danielson and rubrics galore.
Here is another reference, one of my favorites, on the problem with big data.
“We are more susceptible than we may think to the ‘dictatorship of data’—that is, letting the data govern us in ways that may do as much harm as good. The threat is that we will let ourselves be mindlessly bound by the output of our analyses even when we have reasonable grounds for suspecting something is amiss. Or that we will attribute a degree of truth to data which it does not deserve.” Viktor Mayer-Schönberger & Kenneth Cukier. (2013). Big Data: A Revolution That Will Transform How We Live, Work, and Think. Boston: Houghton Mifflin. p. 166.
In Ohio, the metrics summarized in district report cards for schools are set up to forward the demolition of school boards and conventional districts in favor of CEO-managed districts.
A new state law intended to restore funding for librarians, counselors, studies in the arts and other “services” was modified at the last minute to change that whole intent. The last-minute substitute language enables the abolishment of school boards and districts in favor of takeovers similar to those in New York, New Jersey, Philadelphia, New Orleans, Detroit.
This process will be aided and abetted by the Ohio State Department of Education through the use of new data reporting requirements and a host of unreasonable “performance” criteria beyond those required by the federal regime. The calculations of teacher, school, and district performance are significantly influenced by value-added measures and stack ratings against “comparable districts” as well as state-wide norms.
Here is a link to a 22-page Cincinnati Public School “report card” with not much discussion of how the letter grades are determined, but in full color with charts and graphs. The report card will be modified in 2016.
The takeover of Youngstown schools appears to be underway. Dayton could be next, so could Cincinnati and Columbus. In Dayton, the district has been seriously damaged by charters. These have gutted the budget and raided schools of the students who are most eager and able to learn, along with their parents’ support. The pressure to raise test scores among the remaining students has also drained the budget and made a lot of external trainers very rich, all promising silver bullets while making teachers numb to an endless stream of “dog and pony shows.”
I have said more than once that the scores of standardized tests are the weapons of choice for demolishing public schools. The scores enable seemingly “objective” decisions.
This article is especially valuable for putting a corporate culture mania in perspective. Of course, the idea that measures of productivity, year-to-year “growth,” continuous improvement, and quests for “best value” should be the basis for judging nearly everything is a myth perpetuated in no small degree by economists.
043752_2012-2013_DIST.pdf
If you read the linked article, which is beyond excellent, you should look at the comment by wigwag at the end.