Daniel Koretz is one of the leading authorities on testing in the United States. A professor at Harvard University, he has written two important books about testing–its uses and misuses.
The first was Measuring Up: What Educational Testing Really Tells Us.
His latest is The Testing Charade: Pretending to Make Schools Better.
He wrote:
In December, we received more bad news about the achievement of American students: Our 15-year-olds made no significant progress in math and reading on PISA, the largest of the international tests. This followed on the heels of a new report from our National Assessment of Educational Progress (NAEP), which showed no real progress in reading or math for fourth or eighth grade students for the past decade, and longer for reading.
The routine debate is underway about how bad this news is, but such arguments mostly miss a core lesson: America’s school reform movement has plainly failed. It’s time to face up to this failure and think about new approaches for improving education.
There have been numerous reforms over the past two decades, but at the heart of them are efforts to pressure educators to raise test scores. The idea is deceptively simple. Tests measure important things we want students to learn. Hold educators accountable for raising scores, and they will teach kids more. And by focusing accountability on low-scoring groups — most often by setting uniform targets via state or federal laws, such as No Child Left Behind and the Every Student Succeeds Act — we will close achievement gaps.
Unfortunately, this concept has turned out to be more simplistic than simple, and it hasn’t worked. Even though the primary focus has been reading and math tests, reading hasn’t improved. Test-based accountability has contributed to math gains among younger students, but these improvements ended a decade ago, were achieved in part by taking time away from other subjects, and don’t persist until students graduate from school, making them of questionable value. The effort to improve equity has also failed.
As I showed decades ago, the gap between racial and ethnic minorities and non-minority students started to narrow before the rush into test-based accountability, but that progress has ground to a halt in recent years. At the same time, as Sean Reardon at Stanford University has shown, the gap between rich and poor students has widened on a variety of independent tests. The gap between high- and low-scoring students overall has also recently grown larger.
“Reform” that involves mandating high-stakes testing is a farce. Some politicians have claimed that the way to make kids smarter is to test them more frequently and to make the tests harder and harder. Koretz says this is nonsense. I say that the politicians should be required to take and pass the tests they mandate for helpless students.
It is good to have our concerns and doubts about the pernicious effects of high-stakes testing confirmed by one of the nation’s leading testing experts.
I still find the expectation that American students’ performance on student tasks (aka tests) should get better quite bizarre. What is the basis for this expectation? It seems to be some sort of desire for there to be “progress.” But, even in an industrial process there are limitations: the quality of the starting materials, the equipment/process used to transform the starting materials into products and the abilities of the workers involved.
Improve the production process, improve the workers, improve the quality of the starting materials and we would all expect an improvement in the process, but not necessarily the quality of the product. The product might be made more economically, providing more profits or made more quickly, requiring less labor or … but an expectation that the product would be “better” is not necessarily built in.
So, in order for American students to get better and better, we should need for American students to be improved by, what, evolution? Better public health? Maybe we expect that teachers will somehow magically become more effective. How much effort/progress has been made on this front? (Little to none.) How about the educational process? Has that been demonstrably improved?
Yet politicians expect schools to get better and better and better at what they do, often with less and less being spent to do it. This is bizarre. They don’t even have a definition of “better.”
They also assume that the test “measures” progress toward or away from “better”.
The whole thing is an undefined farce masquerading as science.
There can never be consistent test score improvement because as soon as test scores improve, it is assumed that the test is faulty (too easy) if that many students are doing so well. So they change up the test and, voila!, test scores go back where they “should” be.
The test makers can make the scores do anything they want.
Want to make students look dumb? Drop the scaled scores.
Want to make them look smart or like they are “improving”? Raise the scores.
Such games are played all the time. Massachusetts played it with the MCAS and New York and other states played it with the Common Core “aligned” tests.
The key to raising or lowering test results is the “cut score,” i.e., the passing mark.
The cut score is easily manipulated, and no one is the wiser.
Raise the passing mark, and you manufacture failure.
Lower the passing mark, and the kids suddenly get higher scores.
This happened in New York State in 2009, the year that Bloomberg faced re-election. The State Education Department (and Regents) lowered the cut scores, and Bloomberg bragged about dramatic improvements during his time in office. After the election, the public learned that the tests got “easier” and the score increase was manufactured.
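The effect of moving the cut score can be sketched in a few lines. This is only an illustration under stated assumptions: the ten scaled scores and both cut scores are made up, and `pass_rate` is a hypothetical helper, not any state's actual procedure. The same students, with the same answers, "improve" simply because the passing mark drops.

```python
def pass_rate(scaled_scores, cut_score):
    """Return the fraction of students scoring at or above the cut score."""
    passing = sum(1 for s in scaled_scores if s >= cut_score)
    return passing / len(scaled_scores)


# Hypothetical scaled scores for ten students (invented for illustration).
scores = [55, 60, 62, 65, 68, 70, 72, 75, 80, 90]

# Same students, same scores; only the passing mark changes.
print(pass_rate(scores, 70))  # higher cut score: fewer students "pass"
print(pass_rate(scores, 65))  # lower cut score: more students "pass"
```

Nothing about what the students know differs between the two calls; only the threshold does.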
The biggest lie associated with “standardized” tests is that they are standardized because they are almost constantly being changed, which makes any comparison of scores over time suspect.
What they really are is normalized, which, though related, is not the same.
The whole test making and test scoring process is actually quite comical.
Perhaps the most comical example of all is the “process” used by College Board. They completely remake their tests on an almost yearly basis and then have the audassity (not misspelled) to claim that they can compare scores over a period of years or even decades.
One does not have to be a test expert to see through such malarkey. In fact, one would have to be a complete idiot to believe their claim.
Diane Ravitch. To understand most test results you have to ignore the cut score(s). You have to look at both the scaled score and the range of the scale. All scales can be converted to a zero to 100 scale by calculating the percent of a perfect score. As a former teacher, you may have used the same math to calculate a student’s grade. Or you used a pre-made chart. And there are exceptions like the SAT, and every other assessment where zero does not start at zero. There you have to zero the scale range. Or write a formula. I use simple subtraction. And yes it is just that damn simple. Is it a perfect process, no. There are a few flaws, yes. But the translated scores better reflect student growth, and how many are left behind and how far groups of those students are left behind each year during birth through high school graduation years.
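A minimal sketch of the conversion described above, assuming it amounts to subtracting the bottom of the scale and dividing by the scale range. The function name `percent_of_scale` and the example inputs are illustrative assumptions, not the commenter's actual worksheet: the first call treats a 36-point scale as starting at zero, and the second uses a hypothetical score of 1000 on a 400 to 1600 scale to show the "zeroing" step.

```python
def percent_of_scale(score, scale_min, scale_max):
    """Convert a scaled score to a 0-100 value: the share of the scale range earned."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    # Subtracting scale_min "zeroes" scales that do not start at zero.
    return 100.0 * (score - scale_min) / (scale_max - scale_min)


# A scale assumed to start at zero: plain percent of a perfect score.
print(round(percent_of_scale(32, 0, 36)))

# A scale that does not start at zero (hypothetical score of 1000 on 400-1600).
print(percent_of_scale(1000, 400, 1600))
```

Whether the resulting number means what a classroom grade of the same value means is a separate question that the arithmetic alone does not settle.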
If you knew as much about standardized tests as I do, you would understand that standardized tests are crappy, not our education, nor our teachers, nor our schools. The fact that federal law has forced all of the above to live or die by the results of these ill-designed, useless, intrinsically biased instruments is what’s really crappy.
I apologize if I failed to make myself clear. The total group scale scores are (on every assessment I have studied so far, back to the 1967 national SAT assessment) below what a 70 would be as a percent of a perfect score. Maybe the test is the reason students’ average total group scores from 1967 through 2019 range from the high 40’s to the low 60’s as a percent of a perfect score. Or maybe not. I just know what the results are using the assessments from 1967 through 2019. And the PISA results are not much different. I translate results, and any score below a 70 is crappy as far as I am concerned. And if you know some expert that disagrees with me I would be happy to discuss my work and studies with them. And yes my work is so old school that it is unconventional.
bkendall527,
I served on the national testing board (that governs NAEP) for seven years. I reviewed questions. I found many that had been approved by multiple reviewers that were simply wrong: the questions were ambiguous and there was more than one “right answer.” I realized over time that the “Standardized Testing Mindset” is itself deeply damaging to young people–and to the adults who cite the results.
Most questions of importance do not have a single right answer.
Simple questions do, but we expect young people to think and make decisions, not to pick the one right answer to the questions they will confront in life.
Let me repeat: Because of the mindset that standardized testing encourages, promotes, generates, and rewards, such tests are crap.
I wish I had a nicer word with which to describe them. Try “garbage.” “Pointless.” “Damaging.” “Stupid.”
I will not argue technicalities with you. That is akin to a detailed discussion of whether the moon is made of green cheese (yellow cheese?) and how many angels can dance on the head of a pin.
THE great lesson of the quality control movement in business was that when you empower employees on the line to improve quality, quality improves. The secret sauce, there, is worker autonomy–the individual worker seeing that he or she can make a difference because the systems in place allow that to happen. The opposite theory of management is that of top-down command, coercion, and control based upon mandates issued by the few, often based, in turn, on flawed “data.” (Think of the quotas issued by authoritarian, fascist regimes for production of pig iron and pork bellies.)
Perhaps the most important article on business written in the 20th century was “The Balanced Scorecard: Measures That Drive Performance,” by Kaplan and Norton, which appeared in the Harvard Business Review in 1992. Here it is: https://hbr.org/1992/01/the-balanced-scorecard-measures-that-drive-performance-2
The great lesson from Kaplan and Norton is that if management works from incomplete data (in the case of U.S. business, from financial measures only), it will fail. But it’s worse than that if the data is itself invalid and corrupted, as in test-based education “Reform.”
The Deformer theory was that “you get what you measure” and that if you measure performance, you will get it. But there are two major problems with this theory. The first: the measurements have to be accurate. You can’t work from flawed data from invalid tests. The second, even more important: that’s not how people actually tick. A great deal of research into human motivation has taught us that with regard to cognitive tasks, external punishments and rewards (e.g., test scores) are actually DEMOTIVATING.
Deformers claim to understand business and to want to apply lessons from business to education, but they missed entirely the lessons of the quality control movement. You get continuous improvement by BOTTOM UP means. Return control to teachers. Give them smaller classes. Give them a LOT more free time to devote to Japanese-style Lesson Study, in which they meet with one another to recap the week’s work, plan lessons, share materials, choose materials, discuss what is working and what isn’t. Give them autonomy because that’s how people tick.
The Prime Directive in education should be to create life-long, intrinsically motivated learners. Intrinsic motivation is THE KEY AND THE GOAL. And that comes by giving people a general goal and the autonomy to work toward it AS THEY SEE FIT.
You remind me of a comment I heard when I spoke at a Texas School Board Association meeting a few years ago.
A man in the crowd responded to my critique of the current testing regime this way. He said, “I’m an engineer. The factory I work for has a quality control system that spot-checks the items we produce. If we inspected every single item, we would spend all our time and energy inspecting and have no time to manufacture anything or work on improvements.”
The Japanese learned from Deming, Juran, Shewhart, and others that you have to empower the workers to decide when there is a problem that has to be addressed. Very different from instituting a top-down inspection system, or worse yet, a fully automated one. The power has to lie with teachers. What madness, this suggestion, huh? That we work, for a change, the way people are built to work. Intrinsically motivated people with autonomy achieve amazing things. The opposite of that: the beatings will continue until morale improves.
Let me be absolutely clear that what I am NOT suggesting here is that we substitute for the current top-down, test-and-punish model one based upon a top-down “balanced scorecard” model. That, too, will fail as miserably as test-and-punish did. Any top-down system will.
So, what’s the alternative? Return control to teachers. Give them the time and autonomy that they need to choose materials and plan. Give them a national wiki, completely open and free, to which researchers and scholars and classroom practitioners can post assessments and lesson plans and vocabulary lists and reading lists and curriculum maps and from which teachers can freely choose. DO NOT PUT THIS INTO THE HANDS OF ANY CENTRALIZED GATEKEEPER. The last thing that education needs is some kind of centralized Thought Police. It needs a free forum for ideas and free teachers, ones who actually run things at the school level, to choose from the materials and ideas in that national forum.
The quality control movement was designed for manufacturing, for which the goal is standardization and the elimination of all variation in the process and product.
Deming’s observations regarding workers are certainly important, but the problem is that the whole business frame for education is fundamentally flawed.
Students are not products, but people.
I fully agree, SomeDAM. My point was only that we should take from that movement the idea that IS applicable–that one gets continuous improvement from bottom-up methods, ones that increase worker autonomy, ones that empower those on the line. Yes, it is another form of DEFORMY insanity to attempt to standardize. The last thing we need in education is a whole bunch of autocratic “black belt” Six Sigma types. Or some top-down “balanced scorecard.” Been there. Done that. Utter failure.
The argument against what I am suggesting–that teachers be given power over what and how they teach–is an old one: this would be the inmates running the asylum. But people who think that don’t understand the power of intrinsic motivation resulting from autonomy. And they don’t understand how social sanction generates stability with improvement over time in actual communities. We need teacher-run schools paired with the freedom to choose from among curricular and pedagogical ideas and methods that are continually added to and refined, in democratic fora, by researchers, scholars, and classroom practitioners.
I understand that this is a radical idea. And it has a name: democracy.
What you get when you have power at the school level, with teachers, is something like the Common Law (as opposed to Statutory law). You get stability with innovation to meet particular, changed circumstances. Social sanction within empowered communities of practice lends stability. Autonomy in those communities of practice allows for innovation and continuous but nondisruptive improvement. Back before we moved power from the school to the district then to the state and then to the feds, that’s the situation that prevailed. Walk into any 9th-grade English classroom in the US, and the teacher would be teaching Romeo and Juliet. Why? Social sanction, established practice, the habits of the tribe. But publish an article in the English Journal about the value of the writing process (of prewriting and revision) or of sentence combining for increasing students’ syntactic fluency, and millions of teachers would adopt it. Why? Because they had the power to do that. They had autonomy. They could meet as a department and make this decision, and they didn’t have an administrator telling them, that’s not on Lord Coleman’s bullet list.
There are proper roles for administration–facilities maintenance, funding, ensuring compliance with federal and state laws requiring equity with regard to race, socioeconomic status, and disability, ensuring that teachers be properly credentialed. But the power to determine what should be taught and how has to rest with teachers. That autonomy is the real source of continuous improvement because of the power of intrinsic motivation. Intrinsically motivated, empowered people make positive change. Scripted people bots do not.
My nominee for most Orwellian statement ever made over the course of Education Disruption and Deform has to be Bill Gates, not only because he funded all this destructive nonsense but for this gem: his INSANE claim that you get innovation by standardization. Spoken like the monopolist he is. Ideas matter. Lord, what damage that idea has done!!!
The thesis of the Common Core is that standardization is necessary to create a national marketplace for new software and teaching materials, which in turn leads to innovation because of the large market.
The way you get innovation is from having small competitors, including new entrants, in the marketplace. What makes that possible? Well, when you have competing state standards or no state standards at all, that’s what happens. Let me give you a real-life example. Years ago, I was working as a baby editor at McDougal, Littell. The company was small enough, in those days, to sit down at a card game together, if we had wanted to. At the time, I was working on the big, national Literature program we were doing. Well, Texas did a peculiar call for a unique Literature program for challenged students. They wanted one that integrated literature study with everything else in the curriculum–writing, grammar and usage, spelling, speaking and listening, etc. The big publishers weren’t interested. Too small a market–one state. But McDougal was small and saw an opportunity. They did an Integrated Language Arts Literature program for that one state call. It was headed up by a very capable young woman named Julie Schumacher. Well, as it turned out, people in other states liked that program too. The weird program for the odd call from the little company BECAME THE BESTSELLING LITERATURE PROGRAM IN THE COUNTRY. National standards allow behemoths to monopolize. Innovation through standardization is like smoking as a cancer cure. That ought to be obvious enough. Gates knew
I’ve lived through an effective school improvement plan. It was a deliberate effort by both teachers and administrators to work together for the good of the students. It was an evolutionary process that involved training, research and mutual respect from all members of the team. Our result had nothing to do with testing or outside interests. We changed curricula and provided more supports and services to students.
Empowered people make positive change. EVERYONE gets up in the morning and wants to do that, unless he or she is so beaten down by external control as to be angry, frustrated, tired, resigned, over it.
Sticking to the industrial analogy, yes the factory needs materials to arrive in production quality form, not as a somewhat inconsistent processed raw material. It also needs consistent support from the material suppliers during the manufacturing process. The end goal is for the finished product to become a supplier of quality production material, and be better skilled at supporting the material during the manufacturing process. This needs to become a continuous forward cycle. Improving the process with each cycle. Just one answer to your questions.
One of the profound idiocies of education deform is the notion of a single track of college and career readiness for all students. Students differ. So do the needs of a profoundly diverse, pluralistic economy. Students aren’t widgets to be identically milled. We need cosmologists and cosmetologists, not a Procrustean bed where students are maimed in the name of standardization and uniformity.
Irony
“Kids ain’t learning”
Says the test
Testing’s burning
Up the best
Testing more
Is testing bane
Testing lore
Is most to blame
LORE. yes.
Such books are extremely important. It’s time to end this expensive boondoggle that has so trivialized and distorted U.S. curricula and pedagogy, stolen humane education from a generation of students, and driven those students to depression, despair, and to unprecedented levels of suicide.
Stopping by School on a Disruptive Afternoon | Bob Shepherd
after decades of test-driven education “reform”
Whose schools these are, I think I know.
His house is near Seattle though.
He will not see me stopping here
to watch what kids now undergo.
My better angels think it queer
to see a place so void of cheer
what with the tests and data chats,
the data walls with children’s stats.
Where are the joys of yesterday—
when kids would draw and sing and play?
The only sound I hear’s defeat
and pencils on the bubble sheets.
Disrupters say, unflappable,
“We’re building Human Capital!”
Such word goes out from their think tanks,
as they their profits build and bank.
“Music, stories, art, and play
won’t teach Prole children to obey
with servile, certain, gritful grace
and know their rightful, lowly place.”
The fog is heavy, dark and deep.
Where thinking tanks, Deformers creep
and from our children childhood steal
and grind them underneath the wheel.
Postscript:
Disruption of the Commonweal
is that in which Deformers deal
that they might thereby crises fake
as cover whereby they might take
(the smiling villains!) take and take
and take and take and take and take.
This poem may be shared freely. (Please do!) But please include the attribution. Thanks!
Thanks, Bob!
Thank you, RT. One posts these pieces and wonders, will anyone ever read them?
Outstanding!
Thank you, SomeDAM. That means a lot coming from you, with your incomparable skill at this.
So-called reform has distorted curricular content and tested our young people ad nauseam. While the same misguided individuals continue to blame our teachers and schools, our problems are far more complex than raising scores. Our resulting narrowed curricula have denied too many students a well-rounded, comprehensive education. The education of our young people has become a political and economic football that has frustrated teachers, parents and students. To move forward we must make investments in our schools and return autonomy to our teachers, who understand far better than politicians what our young people need. We must free our schools from political and corporate influences. The “gap” cannot be erased by testing. The gap is the result of political and economic factors that public schools have no ability to change.
From reading the article, it’s hard to be certain Koretz is really on the right side for testing. Among the reasons he doesn’t like testing (including some absolutely correct ones that Diane highlights) is that teachers are able to game the system and increase test scores without increasing actual learning. He suggests that better tests would be good. He also gives the impression of being an accountability hawk. While he doesn’t agree with accountability by testing (partially because of the faulty tests), he does seem to believe that higher accountability for teachers is a key to future education improvement. Koretz creates cognitive dissonance for me, because he says some great things, but he also gives me an anti-teacher vibe that makes me think he’s also trying to appeal to an Ed Reform crowd.
Maybe because he IS a Professor at HAAHVAD and is well above the rank and file of “regular” teachers that he gives that impression? (sarcasm!)
How would we know whether or not students are “getting smarter”? PISA scores? Please. The fact that record numbers of young people are organizing and joining activist organizations like DSA and actually fighting back against the 1% tells me that there are a hell of a lot of smart young people these days.
I’d say that as a group, the young people are much smarter than many (if not most) of the adults, many of whom basically stick their heads in the sand and pretend that most problems like climate change don’t exist.
It’s very ironic that the idiots are judging who is smart.
It is, indeed. As you doubtless know, even the oh-so-limited IQ tests have had to be recalibrated, over the course of a century, many times, to account for the overall increases in IQ in the population. The Flynn Effect: https://en.wikipedia.org/wiki/Flynn_effect
It is doubly ironic that the laws about testing are passed by people who couldn’t pass the tests they mandate.
Here is the link to the article:
https://www.nbcnews.com/think/opinion/american-students-aren-t-getting-smarter-test-based-reform-initiatives-ncna1103366
H/T: Stephen Ronan
I can’t help thinking that what tests mostly show is how angry students are about the whole process. “Why are you pushing me through this garbage again and again? I’m smarter than you are” may well be what students are revealing in their response to tests. From my point of view as a retired and successful school principal, education is really the opportunity for students to become involved with work that is part of their future; difficult but also meaningful and enjoyable. What they need are classrooms and outside activities that focus on their interests and needs as emerging adults. They are not idiots who must submit to textbook garbage, meaningless assignments, and the endless torture of tests.
I saw this every time we gave a standardized test. Almost the universal reaction. Anger and frustration at the idiocy of the adults giving this garbage.
Give me the student who writes across the answer sheet, “Sorry, but my mind is too deliciously nonstandard for your standardized test to even begin to measure its capabilities and too uncommon and original for your oh-so common Common Core.”
Somewhere in my effects there is a picture one of my artistic students drew on a practice PARCC test we gave several years ago. It was a beautiful self portrait looking wistfully out of the pages of the test booklet with something written about the difficulty of teaching a fish to fly.
That’s quite the story you told about the T-shirt salesperson, RT! So terribly sad.
Like most responders I wrote before I read. But fortunately I agreed with all of them. Can’t we replace the idiots wasting everybody’s time and money on testing? We’d do a much better job of teaching and also be cheaper.
While the average scale scores of students have remained relatively stagnant since 1967, the numbers, and possibly the percentages, of students testing have increased. And while the average scale scores are crappy (failing [below 70] as a percent of a perfect score) more students are being tested. That is evidence of both growth and a better measure of where we are.
One example of positive change and growth is my youngest. Evaluated as autistic before he was two, it was recommended that he be institutionalized. My wife and I disagreed. Currently, he is in graduate school. He scored a 32 out of a possible 36 in reading on the ACT. As a percent of a perfect score that would be an 89 out of a possible 100 in reading.
And yes, one example out of millions may not be meaningful. But I also know he is not alone.
More students are getting a half-assed birth through high school education today than ever before. The problem is that very few people in the U.S. know (or admit it publicly) that it has been half-assed for a supermajority of students for more than a half-century.
I suspect more folks would know and discuss the crappy education if Education Wonks did a little mathematical thinking and used their elementary and middle school math skills to figure it out for themselves.
Sounds like your son did not get a “crappy education.”
How did the U.S. become a world power if 90% of us had a “crappy education”?
Do you think the kids who get vouchers to go to Evangelical schools where they learn that humans and dinosaurs co-existed get a better education?
Or do you like the no-excuses charters where kids are punished if they don’t walk in a straight line or dare to speak to a friend without permission in the hallway?
Considering PISA results for as far back as they go, education has been crappy for most of the planet for longer than most of us have been alive. And yes I know this is hard to believe. When I figured out how to translate the various scaled scoring systems to a common scale of zero to 100 I did not believe it either. At first. What made it even harder to believe was that converting the scoring systems to a zero to 100 scale only required a little Elementary and Middle school math, and some applied mathematical thinking. When one realizes that every scale score is also a fraction of the whole scale range. And all you have to do to find the percent of the whole is divide it by the whole. Teachers do the same thing to calculate grades for their students when 100 is a perfect score.
Diane.
We became a world power in part because our infrastructure was not repeatedly destroyed since the mid-1860’s, unlike other countries around the world and again during WWI, and especially WWII. And we became an attractive mostly peaceful place for people to move to, bringing skills and knowledge with them, effectively draining from other countries’ pool of human resources.
The benefit of an incomplete military education and 20 years of practice (U.S. Army, Infantry) where one learns to defeat the enemy by eliminating their capacity to fight by reducing their resources, human and otherwise, and to diminish their motivation to continue through attrition and reduced morale. And at home, your country continues to build upon its infrastructure, before, during, and after. I give you the Industrial Military Complex as an example. And we (U.S.) have had a penchant for sending Americans abroad to places seldom thought of so Americans can continue their World Geography lessons for the rest of their lives.
My wife and I attended public schools, and all of our children attended both public and Department of Defense schools.
As to the types of schools you asked about, I have reservations and concerns. And, I am no fan of them.
I always appreciate it when I am asked questions.
It certainly benefitted the US, as compared to Europe and Japan, that our country’s infrastructure was not destroyed during the world wars of the 20th Century. However, your claim that our public education system has crippled our nation simply doesn’t hold up to scrutiny. First of all, this country has an incredibly productive workforce. Second, our public investments in low-cost higher education (until recently) created a thriving middle class. Third, our public investment in basic and applied research in science (until recently) created a very successful and advanced technological and medical sector. Fourth, our vigorous private sector wisely used these government investments to generate medical, scientific, and technological advances. Fifth, the countries that were crushed by wars in the 20th Century have risen to become our equals because we paid to help them rebuild. Most of them have learned from us to invest in science, education, and technology and to build a strong middle class. Meanwhile, we have a government that is attacking science, underfunding education, putting higher education out of reach of many striving youth, and allowing predatory entrepreneurs to get fat while the middle class is hollowed out.
“Many young teachers — who are trying to do right by their students — are told explicitly that bad test prep is good teaching. Having never seen any other way, they believe it, and their students suffer.” –From the Koretz article toward the end.
I have had conversations with young teachers that validate this point. If you think that good teaching consists of tailoring your instruction to specific lists of ideas (which are maliciously termed “standards”), then you have been victimized by a false narrative. Teaching far more resembles a cornfield, which sprouts good corn after years of good soil management and fertilization techniques. Education is not widget production.
Some time ago I posted a challenge: show me a written test that shows how valuable a person is to society. Someone wrote about a Chinese test that did indeed help society, but that observation was about finances exclusively. Important, of course, but there are MANY things which have shaped society for the better. The great moral leaders have led the way.
Again, show me a written test or test score which shows the value to society of a student or adult.
“It is good to have our concerns and doubts about the pernicious effects of high-stakes testing confirmed by one of the nation’s leading testing experts.”
It was confirmed by THE world’s leading test expert even before the high-stakes testing in the US devolved into what it is now. See Noel Wilson’s never refuted nor rebutted 1997 dissertation “Educational Standards and the Problem of Error” found at https://epaa.asu.edu/ojs/article/viewFile/577/700
Brief outline of Wilson’s “Educational Standards and the Problem of Error” and some comments of mine. (updated 6/24/13 per Wilson email)
1. A description of a quality can only be partially quantified. Quantity is almost always a very small aspect of quality. It is illogical to judge/assess a whole category only by a part of the whole. The assessment is, by definition, lacking in the sense that “assessments are always of multidimensional qualities. To quantify them as unidimensional quantities (numbers or grades) is to perpetuate a fundamental logical error” (per Wilson). The teaching and learning process falls in the logical realm of aesthetics/qualities of human interactions. In attempting to quantify educational standards and standardized testing the descriptive information about said interactions is inadequate, insufficient and inferior to the point of invalidity and unacceptability.
A major epistemological mistake is that we attach, with great importance, the “score” of the student, not only onto the student but also, by extension, the teacher, school and district. Any description of a testing event is only a description of an interaction, that of the student and the testing device at a given time and place. The only correct logical thing that we can attempt to do is to describe that interaction (how accurately or not is a whole other story). That description cannot, by logical thought, be “assigned/attached” to the student as it cannot be a description of the student but the interaction. And this error is probably one of the most egregious “errors” that occur with standardized testing (and even the “grading” of students by a teacher).
Wilson identifies four “frames of reference”, each with distinct assumptions (epistemological basis) about the assessment process from which the “assessor” views the interactions of the teaching and learning process: the Judge (think college professor who “knows” the students’ capabilities and grades them accordingly), the General Frame (think standardized testing that claims to have a “scientific” basis), the Specific Frame (think of learning by objective, like computer-based learning, getting a correct answer before moving on to the next screen), and the Responsive Frame (think of an apprenticeship in a trade or a medical residency program where the learner interacts with the “teacher” with constant feedback). Each category has its own sources of error, and more error in the process is caused when the assessor confuses and conflates the categories.
Wilson elucidates the notion of “error”: “Error is predicated on a notion of perfection; to allocate error is to imply what is without error; to know error it is necessary to determine what is true. And what is true is determined by what we define as true, theoretically by the assumptions of our epistemology, practically by the events and non-events, the discourses and silences, the world of surfaces and their interactions and interpretations; in short, the practices that permeate the field. . . Error is the uncertainty dimension of the statement; error is the band within which chaos reigns, in which anything can happen. Error comprises all of those eventful circumstances which make the assessment statement less than perfectly precise, the measure less than perfectly accurate, the rank order less than perfectly stable, the standard and its measurement less than absolute, and the communication of its truth less than impeccable.”
In other words all the logical errors involved in the process render any conclusions invalid.
The test makers/psychometricians, through all sorts of mathematical machinations, attempt to “prove” that these tests (based on standards) are valid, that is, errorless or supposedly at least with minimal error [they aren’t]. Wilson turns the concept of validity on its head and focuses on just how invalid the machinations and the test and results are. He is an advocate for the test taker, not the test maker. In doing so he identifies thirteen sources of “error”, any one of which renders the test making/giving/disseminating of results invalid. And a basic logical premise is that once something is shown to be invalid it is just that, invalid, and no amount of “fudging” by the psychometricians/test makers can alleviate that invalidity.
Having shown the invalidity, and therefore the unreliability, of the whole process, Wilson concludes, rightly so, that any result/information gleaned from the process is “vain and illusory”. In other words, start with an invalidity, end with an invalidity (except by sheer chance every once in a while, like a blind and anosmic squirrel who finds the occasional acorn, a result may be “true”) or, to put it in more mundane terms, crap in, crap out.
And so what does this all mean? I’ll let Wilson have the second to last word: “So what does a test measure in our world? It measures what the person with the power to pay for the test says it measures. And the person who sets the test will name the test what the person who pays for the test wants the test to be named.”
In other words it attempts to measure “’something’ and we can specify some of the ‘errors’ in that ‘something’ but still don’t know [precisely] what the ‘something’ is.” The whole process harms many students as the social rewards for some are not available to others who “don’t make the grade (sic).” Should American public education have the function of sorting and separating students so that some may receive greater benefits than others, especially considering that the sorting and separating devices, educational standards and standardized testing, are so flawed not only in concept but in execution?
My answer is NO!!!!!
One final note with Wilson channeling Foucault and his concept of subjectivization:
“So the mark [grade/test score] becomes part of the story about yourself and with sufficient repetitions becomes true: true because those who know, those in authority, say it is true; true because the society in which you live legitimates this authority; true because your cultural habitus makes it difficult for you to perceive, conceive and integrate those aspects of your experience that contradict the story; true because in acting out your story, which now includes the mark and its meaning, the social truth that created it is confirmed; true because if your mark is high you are consistently rewarded, so that your voice becomes a voice of authority in the power-knowledge discourses that reproduce the structure that helped to produce you; true because if your mark is low your voice becomes muted and confirms your lower position in the social hierarchy; true finally because that success or failure confirms that mark that implicitly predicted the now self-evident consequences. And so the circle is complete.”
In other words students “internalize” what those “marks” (grades/test scores) mean, and since the vast majority of the students have not developed the mental skills to counteract what the “authorities” say, they accept as “natural and normal” that “story/description” of them. Although paradoxical in a sense, the “I’m an ‘A’ student” is almost as harmful as “I’m an ‘F’ student” in hindering students from becoming independent, critical and free thinkers. And having independent, critical and free thinkers is a threat to the current socio-economic structure of society.
Don’t know why my paragraph breaks don’t come through when I post here.
Thanks for this re-posting. It reminded me that one of the things that made me keep coming here for the ideas was this exact post. Thanks Duane.
Yay, Duane! & keep posting this for the many new readers Diane always has (& will certainly have even more after they’ve read Slaying Goliath).
Thanks for the link. Downloaded and put in my current reading file. Since I have my own list of things that can be wrong on any assessment, it should be an interesting read.
& bkendall–if you’re still reading here (This comment is late to the party)–read some of the info. about the gross inaccuracies of “standardized” test scoring–Todd Farley’s 2009 book (unfortunately, what he wrote about is still the case), Making the Grades: My Misadventures in the Standardized Testing Industry.
Diane posted an interview w/Todd on Nov. 27, 2012. Finally, Mr. Farley has posted on Huffington, esp. about computer-scored tests, so you might want to read that.
Welcome to the blog!
Thanks for the suggestion. And I will add it to my growing reading list. I have been building a list of the things that are detrimental to assessing academic knowledge in an academic environment, and hope to add to the list.
I am not anti-assessment, even though I accept all academic testing as potentially flawed, even those assessments created by classroom teachers. Yet, what can replace it, that itself is not flawed? It is a question for which I do not know the answer.
My work, studies, and niche are focused on making published standardized assessment results understandable for anyone interested. And the general reaction is disbelief or anger, or both. And it is not unexpected.
Since standardized tests are accurate barometers of wealth and poverty, why don’t you publicize that fact?
I would like to hear your thoughts on Wilson’s work after you read it. Please feel free to contact me at dswacker@centurytel.net. Please put Wilson’s work in the subject. Gracias.
For a simpler read on invalidity see Wilson’s review of the standards and testing bible: A Little Less than Valid: An Essay Review
http://edrev.asu.edu/index.php/ER/article/view/1372/43
Thanks, Duane.