Peter Greene thinks you should know the truth. There is no objective way to set passing marks (known as “cut scores”) for standardized tests. The line between “excellence,” “proficient,” “basic,” and “failing” is arbitrary.
Peter writes:
If you are imagining that cut scores for the high-stakes accountability tests are derived through some rigorous study of exactly what students need to know and what level of proficiency they should have achieved by a certain age– well, first, take a look at what you’re assuming. Did you really think we have some sort of master list, some scholastic Mean Sea Level that tells us exactly what a human being of a certain age should know and be able to do as agreed upon by some wise council of experty experts? Because if you do, you might as well imagine that those experts fly to their meetings on pink pegasi, a flock of winged horsies that dance on rainbows and take minutes of the Wise Expert meetings by dictating to secretarial armadillos clothed in shimmering mink stoles.
Anyway, it doesn’t matter because there are no signs that any of these people associated with The Test are trying to work with a hypothetical set of academic standards anyway. Instead, what we see over and over (even back in the days of NCLB), is educational amateurs setting cut scores for political purposes. So SBAC sets a cut score so that almost two thirds of the students will fail. John King in New York famously predicted the percentage of test failure before the test was even out the door– but the actual cut scores were set after the test was taken.
That is not how you measure a test result against a standard. That’s how you set a test standard based on the results you want to see. It’s how you make your failure predictions come true. According to Carol Burris, King also attempted to find some connection between SAT results and college success prediction, and then somehow graft that onto a cut score for the NY tests, while Kentucky and other CCSS states played similar games with the ACT.
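The mechanics Peter describes are easy to see in a toy example. The sketch below is only an illustration under made-up assumptions: the scores, the function names, and the two-thirds target are all hypothetical, and no state or consortium is claimed to use this exact procedure. It simply shows that once the results are in hand, a cut score can be chosen to produce whatever failure rate was announced in advance.

```python
# Hypothetical illustration: every name and number here is invented.

def cut_score_for_target_failure(scores, target_fail_rate):
    """Pick a cut score AFTER seeing the results so that roughly
    target_fail_rate of the test takers land below it."""
    ranked = sorted(scores)
    # Number of students we want to fall below the cut.
    n_fail = round(target_fail_rate * len(ranked))
    # Keep at least one student above the cut.
    n_fail = min(n_fail, len(ranked) - 1)
    # The first "passing" score becomes the cut score.
    return ranked[n_fail]


def failure_rate(scores, cut):
    """Fraction of scores strictly below the cut score."""
    return sum(s < cut for s in scores) / len(scores)


# Twelve invented raw scores from an imaginary test.
scores = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35, 38, 40]

# Decide in advance that two thirds should fail, then find the cut.
cut = cut_score_for_target_failure(scores, 2 / 3)
print(cut, failure_rate(scores, cut))
```

Change the target to 0.30 and the same scores yield a much lower cut. Nothing about the students changed, only the result someone wanted to see. (With tied scores the achieved rate can drift from the target; this sketch ignores that.)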
Actually, both of the federally-funded testing consortia (PARCC and SBAC) agreed to align their cut scores with those of the National Assessment of Educational Progress, so that “proficient” would be the same as NAEP proficient. The problem there is that Massachusetts is the only state in the nation where as many as 50% of students reached NAEP Proficient. In fact, NAEP proficient is a very high standard. NAEP has four achievement levels: advanced (about 8-10% of students, truly superior performance); proficient (“solid academic achievement,” which I consider to be akin to an A or A-); basic (akin to the range of B or C); and below basic (failing).
In Education Week, Catherine Gewertz wrote:
It’s one thing for all but a few states to agree on one shared set of academic standards. It’s quite another for them to agree on when students are “college ready” and to set that test score at a dauntingly high place. Yet that’s what two state assessment groups are doing.
The two common-assessment consortia are taking early steps to align the “college readiness” achievement levels on their tests with the rigorous proficiency standard of the National Assessment of Educational Progress, a move that is expected to set many states up for a steep drop in scores.
After all, fewer than four in 10 children reached the “proficient” level on the 2013 NAEP in reading and math.
By aligning with NAEP proficient, the two consortia assured that the majority of students would not pass either test. That is entirely predictable, because fewer than 40% of students typically reach NAEP proficient. Although the U.S. has seen significant improvement over the past 20 years, especially from “below basic” to “basic,” most students have not reached NAEP proficient. Thus, unless the two consortia change their cut scores, we can anticipate that most students in the U.S. will “fail” the Common Core tests for the foreseeable future. “Reformers” think this will set off popular demand for charter schools and voucher schools, but in New York, the charter schools performed no better on the Common Core tests than the public schools, and voucher schools have never outperformed public schools in any city. The likely outcome of this absurd decision about cut scores will be to encourage the anti-testing movement.

“There is no objective way to set passing marks (known as ‘cut scores’) for standardized tests. The line between ‘excellence,’ ‘proficient,’ ‘basic,’ and ‘failing’ is arbitrary.”
Bingo, Bangle, Boingo!!
As Noel Wilson states in his never refuted nor rebutted complete destruction of educational standards and standardized testing, “So what does a test measure in our world? It measures what the person with the power to pay for the test says it measures. And the person who sets the test will name the test what the person who pays for the test wants the test to be named.”
“Educational Standards and the Problem of Error” found at: http://epaa.asu.edu/ojs/article/view/577/700
Brief outline of Wilson’s “Educational Standards and the Problem of Error” and some comments of mine. (updated 6/24/13 per Wilson email)
1. A description of a quality can only be partially quantified. Quantity is almost always a very small aspect of quality. It is illogical to judge/assess a whole category only by a part of the whole. The assessment is, by definition, lacking in the sense that “assessments are always of multidimensional qualities. To quantify them as unidimensional quantities (numbers or grades) is to perpetuate a fundamental logical error” (per Wilson). The teaching and learning process falls in the logical realm of aesthetics/qualities of human interactions. In attempting to quantify educational standards and standardized testing the descriptive information about said interactions is inadequate, insufficient and inferior to the point of invalidity and unacceptability.
2. A major epistemological mistake is that we attach, with great importance, the “score” of the student, not only onto the student but also, by extension, the teacher, school and district. Any description of a testing event is only a description of an interaction, that of the student and the testing device at a given time and place. The only correct logical thing that we can attempt to do is to describe that interaction (how accurately or not is a whole other story). That description cannot, by logical thought, be “assigned/attached” to the student as it cannot be a description of the student but the interaction. And this error is probably one of the most egregious “errors” that occur with standardized testing (and even the “grading” of students by a teacher).
3. Wilson identifies four “frames of reference,” each with distinct assumptions (epistemological basis) about the assessment process from which the “assessor” views the interactions of the teaching and learning process: the Judge (think of a college professor who “knows” the students’ capabilities and grades them accordingly); the General Frame (think of standardized testing that claims to have a “scientific” basis); the Specific Frame (think of learning by objective, like computer-based learning, getting a correct answer before moving on to the next screen); and the Responsive Frame (think of an apprenticeship in a trade or a medical residency program where the learner interacts with the “teacher” with constant feedback). Each category has its own sources of error, and more error in the process is caused when the assessor confuses and conflates the categories.
4. Wilson elucidates the notion of “error”: “Error is predicated on a notion of perfection; to allocate error is to imply what is without error; to know error it is necessary to determine what is true. And what is true is determined by what we define as true, theoretically by the assumptions of our epistemology, practically by the events and non-events, the discourses and silences, the world of surfaces and their interactions and interpretations; in short, the practices that permeate the field. . . Error is the uncertainty dimension of the statement; error is the band within which chaos reigns, in which anything can happen. Error comprises all of those eventful circumstances which make the assessment statement less than perfectly precise, the measure less than perfectly accurate, the rank order less than perfectly stable, the standard and its measurement less than absolute, and the communication of its truth less than impeccable.”
In other words, all the logical errors involved in the process render any conclusions invalid.
5. The test makers/psychometricians, through all sorts of mathematical machinations, attempt to “prove” that these tests (based on standards) are valid: errorless, or supposedly at least with minimal error [they aren’t]. Wilson turns the concept of validity on its head and focuses on just how invalid the machinations and the test and results are. He is an advocate for the test taker, not the test maker. In doing so he identifies thirteen sources of “error,” any one of which renders the test making/giving/disseminating of results invalid. And a basic logical premise is that once something is shown to be invalid it is just that, invalid, and no amount of “fudging” by the psychometricians/test makers can alleviate that invalidity.
6. Having shown the invalidity, and therefore the unreliability, of the whole process, Wilson concludes, rightly so, that any result/information gleaned from the process is “vain and illusory.” In other words, start with an invalidity, end with an invalidity (except by sheer chance every once in a while, like a blind and anosmic squirrel who finds the occasional acorn, a result may be “true”), or, to put it in more mundane terms, crap in, crap out.
7. And so what does this all mean? I’ll let Wilson have the second to last word: “So what does a test measure in our world? It measures what the person with the power to pay for the test says it measures. And the person who sets the test will name the test what the person who pays for the test wants the test to be named.”
In other words, it attempts to measure “‘something’ and we can specify some of the ‘errors’ in that ‘something’ but still don’t know [precisely] what the ‘something’ is.” The whole process harms many students, as the social rewards for some are not available to others who “don’t make the grade (sic).” Should American public education have the function of sorting and separating students so that some may receive greater benefits than others, especially considering that the sorting and separating devices, educational standards and standardized testing, are so flawed not only in concept but in execution?
My answer is NO!!!!!
One final note with Wilson channeling Foucault and his concept of subjectivization:
“So the mark [grade/test score] becomes part of the story about yourself and with sufficient repetitions becomes true: true because those who know, those in authority, say it is true; true because the society in which you live legitimates this authority; true because your cultural habitus makes it difficult for you to perceive, conceive and integrate those aspects of your experience that contradict the story; true because in acting out your story, which now includes the mark and its meaning, the social truth that created it is confirmed; true because if your mark is high you are consistently rewarded, so that your voice becomes a voice of authority in the power-knowledge discourses that reproduce the structure that helped to produce you; true because if your mark is low your voice becomes muted and confirms your lower position in the social hierarchy; true finally because that success or failure confirms that mark that implicitly predicted the now self evident consequences. And so the circle is complete.”
In other words, students “internalize” what those “marks” (grades/test scores) mean, and since the vast majority of the students have not developed the mental skills to counteract what the “authorities” say, they accept as “natural and normal” that “story/description” of them. Although paradoxical in a sense, the “I’m an ‘A’ student” is almost as harmful as “I’m an ‘F’ student” in hindering students from becoming independent, critical and free thinkers. And having independent, critical and free thinkers is a threat to the current socio-economic structure of society.
By Duane E. Swacker
http://www.commondreams.org/views/2015/01/12/more-evidence-public-beats-private-education
Good article. The public should learn that “public schools are failing” is a false narrative.
Here’s how cut scores are usually set by professional test people – psychometricians. Its basis is called the Angoff Method. It generally involves teachers.
Link: passing_scores.pdf
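For readers who have not seen it, the basic Angoff arithmetic, as it is commonly described, is simple: each panelist estimates, item by item, the probability that a borderline (“minimally competent”) student would answer correctly; the estimates are averaged per item and summed to give the cut score. The sketch below is a hedged illustration only. The function name and all the ratings are invented, and real standard-setting panels typically use modified versions with discussion rounds and other adjustments.

```python
from statistics import mean

# Hypothetical illustration of the basic Angoff arithmetic.
# ratings[j][i] is panelist j's judgment of the probability that a
# borderline student answers item i correctly. All numbers are invented.

def angoff_cut_score(ratings):
    """Average the panelists' probability judgments for each item,
    then sum the item averages to get a raw cut score."""
    n_items = len(ratings[0])
    item_means = [
        mean(panelist[i] for panelist in ratings)
        for i in range(n_items)
    ]
    return sum(item_means)


# Two imaginary two-person panels rating the same four-item test.
panel_a = [[0.9, 0.7, 0.5, 0.4], [0.8, 0.6, 0.6, 0.3]]
panel_b = [[0.7, 0.5, 0.3, 0.2], [0.6, 0.4, 0.4, 0.2]]

print(round(angoff_cut_score(panel_a), 2))
print(round(angoff_cut_score(panel_b), 2))
```

The two invented panels land on different cut scores for the same test, which is the point: every number that goes into the calculation is a human judgment.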
And it doesn’t make any difference what method is used; the cut-off scores are still arbitrary, and the whole process has so many epistemological and ontological errors that any results are COMPLETELY INVALID.
YES, IT’S THAT SIMPLE, COMPLETELY INVALID.
Daniel Koretz, MEASURING UP: WHAT EDUCATIONAL TESTING REALLY TELLS US (2009 paperback edition).
Pages 185-188 and 191 explain the Angoff/modified Angoff method.
There is nothing objective and error-free about it. It involves lots of human judgment and subjectivity, albeit one may have more confidence in it when “experts” are involved.
Or one may not have any confidence in it at all.
Use a coin. Flip it. Heads, confidence; tails, no confidence.
I think y’all get my drift…
😎
No setting of cut scores can be totally “objective”, but some are far less subjective.
After all, most teachers set cut scores on every teacher-made test they give; even if they stick to some grading scale, they usually choose the questions.
Searching for an objective way to determine grades or whatever is a fool’s errand.
From the ETS guide to setting cut scores:
It is important to list the reasons why cut scores are being set and to obtain consensus among stakeholders that the reasons are appropriate. An extremely useful exercise is to attempt to describe exactly how the cut scores will bring about each of the desired outcomes. It may be the case that some of the expected benefits of cut scores are unlikely to be achieved unless major educational reforms are accomplished. It will become apparent that cut scores, by themselves, have very little power to improve education. Simply measuring a child and classifying the child’s growth as adequate or inadequate will not help the child grow.
For example, if one of the reasons for setting cut scores is to improve the quality of instruction in the schools, it should become clear that cut scores by themselves would not have the desired effect. The use of cut scores may point out that certain schools, curricular areas, demographic groups, regions, and so forth are more in need of improvement than others, but the cut scores alone will not improve education. Unless the infrastructure needed to improve education is put in place, setting cut scores will be futile for that purpose.
It is also necessary to consider the potential negative effects of setting cut scores. What will happen to students who fail? Will they be stigmatized and ignored or will they be helped? What will happen to schools with large proportions of failing students? Will the institutions be punished or assisted? What will happen to teachers with large numbers of failing students? Will the teachers be punished or will they receive additional help?
Link: Cut_Scores_Primer.pdf
And right from the mouths of the same crowd that is designing college-and-career ready standardized tests for kindergartners!
Thank you for extracting a note of reality from those that usually project Rheeality Distortion Fields.
😎
NAEP is as guilty as PARCC in being arbitrary. Convening teachers to help set the arbitrary scores has some wisdom, but which teachers and how and why they make their decisions doesn’t make them “objective”. Mission Hill School tried another approach–tape recording (or whatever we call it these days) youngsters reading text at various levels of difficulty and then collectively agreeing to a scale of measurement. Even then we stopped with “fluent at material age appropriate” – and simply noted if they were also fluent at more difficult material. The interviewer asked them some questions about what they had read as well. We did it for every child every year. It might be a good exercise for all schools–maybe on a sampled basis, and for psychometricians, and PARCC and NAEP.
Deb,
NAEP achievement levels are not set by panels of teachers but by panels of people from various backgrounds including teachers. They are as arbitrary as other standard setters. Why would a business person know what a student in 4th grade should know?
Because when it comes to self-styled “education reform” the less expertise you have, the more expertise you have.
😳
Case in point: John Deasy, former LAUSD Superintendent. Came to the job waving his rheephorm credentials.
Today’s LATIMES, even if very politely, outlines what an “expert” amateur does with the nation’s second largest public school district when armed with all the latest ideas and tools he secured from the Broad Academy and the Gates Foundation.
Link: http://www.latimes.com/local/education/la-me-ipad-report-20150113-story.html
Hint: it ain’t pretty except if, like denizens of Bizarro World, you think that “education reform” means “pretty [awful].”
😎
“the less expertise you have, the more expertise you have.” Missouri has the lowest tax on cigarettes, on gasoline, and after the state board of education chose a white woman (the runners-up were four non-urban white men) whose only teaching experience was in private schools, the board vice president, Michael Jones, explained…..“As long as the state capital is in Jefferson City,” he said in an interview, “and the state pays what it pays, attracting top-flight minority talent is always going to be a problem for Missouri.”
What is assured is a “failure by design” model. I predict that when middle class parents understand this there will be an extreme backlash. This will put the corrupt politicians in the difficult position of deciding whether ticking off voters is worse or better than ticking off corporate overseers.
I think that this is one of the most important posts I have seen…..unfortunately…education media in most areas will either refuse to read it at all, or label it hogwash…..I am not talking about parents…..I am talking about major newspaper education writers. My niece-in-law, Sarah Reckhow, is writing about school takeovers…..my nephew is a professor who writes about politics. I do not think she will be focusing on St. Louis, but if she did, she would recognize a pattern in 2007 St. Louis, 2012 Normandy and Kansas City….politicians stepping in or up for charter schools….sometimes it is loss of accreditation, sometimes it is restoration of accreditation, based on the handiest stats available to use. In St. Louis, an elected board was standing in the way of charter business deals, but the appointed board that replaced them had accreditation restored five years later…..to protect them from the transfer law that caused Normandy to go bankrupt….and be taken over. As a bonus…Commissioner Nicastro invented special accreditation after the takeover, to stop the transfers……it was ruled unconstitutional by the courts. She is now spending more time with her family. Specific schools in SLPS were targeted by something called CEE-Trust a year ago…..there seems to be reluctance, even by the state-appointed board, to go along with it…..be careful guys. Do not make the politicians angry.
How is this different from mandating that every race horse in the U.S. has to match or beat the performance of Secretariat, probably the greatest race horse in U.S. history?
And when all those other thoroughbreds fail to beat Secretariat’s record, all the trainers must be fired and barred from ever training a Thoroughbred to race again.
NCLB started out with the mandate that 100% of all American children (about 75 million)—even children with severe disabilities, both mental and physical, in addition to children who live in poverty (and every country faces challenges teaching children who live in poverty. See quotes below from Stanford on this)—had to all be proficient by 2014 based on the NAEP that was designed to only recognize students who were the highest achievers.
Here’s what The Stanford Graduate School of Education has to say about children who live in poverty:
“There is an achievement gap between more and less disadvantaged students in every country; surprisingly, that gap is smaller in the United States than in similar post-industrial countries, and not much larger than in the very highest scoring countries.
“Achievement of U.S. disadvantaged students has been rising rapidly over time, while achievement of disadvantaged students in countries to which the United States is frequently unfavorably compared – Canada, Finland and Korea, for example – has been falling rapidly.”
https://ed.stanford.edu/news/poor-ranking-international-tests-misleading-about-us-performance-new-report-finds
What the Bill Gates-funded Common Core agenda that uses VAM to rank and fire teachers demands is that U.S. teachers must be 100% successful with every student at the highest levels or be labeled failures. Public schools that work with high rates of children who live in poverty are punished by being closed, and then those same children are turned over to corporate charters that then often get rid of the children who are the most difficult to teach.
U.S. public school teachers are being told that they have to turn out children to be all Secretariats or teachers who fail to achieve this impossible mandate will be labeled a failure, lose their teaching job, and maybe see their school closed for good.
“Cutting to the Quick”
Let’s cut to the quick:
The cut score is slick
Purporting to pick
The prepped with a trick
It’s set after test
At governor’s behest
To pass just the best
And fail all the rest
It’s really for nailing
For teacher VAMpaling
And generally railing
‘Bout schools that are “failing”
So schools can be closed
And teachers deposed
And charters imposed
While public is dozed
So markets can bloom
And business can boom
And profits can zoom
On public-school doom
Between 30% and 40% of students score as “proficient” on the NAEP exams, and you’re saying that’s equivalent to an A or A-?
FLERP, I have reviewed NAEP test questions. Proficient represents “solid academic achievement.” Advanced is super-duper smart. We will never have 100% at proficient. Yes, I consider proficient on NAEP to be equivalent to an A or A-. At worst, a strong B+.
If our legislators took the NAEP 8th grade math test, I expect very few would score proficient.
Have you always had this view of what proficient on NAEP means? proficiency? I ask because I know that when you were in the ed reform camp, you made some harsh assessments of US education based in part on NAEP scores. If I had a class in which 40% of the students were getting As or A minuses, I would conclude either that I had a very talented class or that I was a real softie on grading.
Excuse the inexplicable “proficiency?” in the middle of that comment. I may have lost consciousness for a moment there.
In case you missed the question I asked yesterday, I’ll pose it again: Has your view of what it means to score as “proficient” on NAEP changed in recent years? Has the test become more difficult? If not, why did you believe students were performing horribly when 30%-40% were getting As and A-minuses on “America’s report card”?
dese.mo.gov
The Department has issued the following news releases:
Achievement Cut Scores Set for Missouri Assessment Program
http://goo.gl/l28G9W
thank you so much….I incorporated it into my Post Dispatch thread….http://interact.stltoday.com/forums/viewtopic.php?f=6&t=1088448&p=14525817#p14525817