Howard Schwach taught for more than 25 years, developed
test items for the state, and worked on curriculum development for
special education students. He
recently reviewed sample items from New York’s Common Core tests
and professed astonishment.
He wrote:
From the
first moment that I looked at some practice tests for the English
Language Arts tests that were given recently, I knew that the kids
and their teachers were in trouble. In his long
experience as a teacher and test writer and curriculum developer,
he said, “there was one guiding principle: never test
students on skills or material that you have not taught and
practiced. To do so not only would have been
unfair to the students, but it would have made the tests unreliable
and downright useless as a measure of student ability and
knowledge. That is why, when I looked at the
practice test, my first thought was that the questions were in the
deep end of the pool when the kids were just learning how to
swim. One that stuck in my mind was a passage
from a 1920’s magazine about aspirin. Because
the source article was written nearly 100 years ago, it contained
some archaic language and syntax that would have been confusing to
today’s adults, let alone eleven-year-olds.
So the kids were at a disadvantage right away, trying to
figure out the words they had never seen before, working them out
through context. Then, the question called for skills that have
never been tested before, nor taught by the teacher who showed me
the sample questions. She admitted that she had been “teaching to
the old test” for the past several years, trying to keep her kids’
all-important test scores up while trying to keep her
job. “Education has nothing to do with what we
have been doing for the past couple of years,” the teacher admitted
with a nervous laugh. “It has been all about the
numbers.”
He found questions that had two right answers.
He found questions that would send the kids into tears. And he
wondered, “What in the world was the state thinking?” Indeed, what
were state officials thinking when they tested students on material they had
not been taught, using unfamiliar vocabulary and ambiguous
answers, with the certainty that most students would fail? Was it
John King’s inexperience that led him to align the state cut scores
with NAEP’s proficiency levels? Did he not understand that NAEP
proficiency is not a “passing” mark but a measure that connotes
“solid academic performance”?
What were they thinking?
You ask, “What were they (state officials) thinking” when they approved the new CCS assessments?
They were thinking: “What questions can be placed on the assessments so we have only 30% of NYS students attain a level 3 or 4?”
What were/are they thinking?
Smash and grab…
I have to wonder, what ARE they “measuring”? Are a few self-selected Mensa members trying to force their unique learning abilities and styles on the students in order to “prove” that the rest of us are stupid, lazy, unworthy idiots? That is how it feels, anyway. They seem to have no recognition of learning styles or child development. They seem to dismiss educational research as pure bunk. They have no respect for teachers, professors, students, or, for that matter, human beings. There seems to be an obsession to prove that educators are failing the students and to do so, they create “instruments” to “measure” information out of students’ and teachers’ realms of understanding in order to prove their point. It is a “win” for their bank accounts only. This is not making America better for Americans. We should be singing “God Save America” since our blessings are being stolen.
Deb, I have wondered the same. How does a standardized test (intended to make us standard) attempt to measure affective domains like perseverance (which varies by person)?
Maybe all those who pick choice (a) are concrete-sequential and those who pick (b) are abstract-random. And if the real goal is to measure these new skills being taught to our children, is the test merely to sort them out, and every now and then test a fact or two?
“. . . is the test merely to sort them out (?). . .”
That’s all it is for and it does a piss poor job at that. See below comment to the next post for why. And it’s all part of the “Doing the wrong thing righter” problem that the edudeformers can’t get around.
The proliferation of educational assessments, evaluations and canned programs belongs in the category of what systems theorist Russ Ackoff describes as “doing the wrong thing righter. The righter we do the wrong thing,” he explains, “the wronger we become. When we make a mistake doing the wrong thing and correct it, we become wronger. When we make a mistake doing the right thing and correct it, we become righter. Therefore, it is better to do the right thing wrong than the wrong thing right.”
Our current neglect of instructional issues is the result of assessment policies that waste resources to do the wrong things, e.g., canned curriculum and standardized testing, right. Instructional central planning and student control doesn’t – can’t – work. But that never stops people trying.
The result is that each effort to control the uncontrollable does further damage, provoking more efforts to get things in order. So the function of management/administration becomes control rather than creation of resources. When Peter Drucker lamented that so much of management consists in making it difficult for people to work, he meant it literally. Inherent in obsessive command and control is the assumption that human beings can’t be trusted on their own to do what’s needed. Hierarchy and tight supervision are required to tell them what to do. So, fear-driven, hierarchical organizations turn people into untrustworthy opportunists. Doing the right thing instructionally requires less centralized assessment, less emphasis on evaluation and less fussy interference, not more. The way to improve controls is to eliminate most and reduce all.
Former Green Beret Master Sergeant Donald Duncan (Viet Nam) made a similar point when he noted in Sir! No Sir! that:
“I was doing it right but I wasn’t doing right.”
And from one of America’s premier writers:
“The mass of men [and women] serves the state [education powers that be] thus, not as men mainly, but as machines, with their bodies. They are the standing army, and the militia, jailors, constables, posse comitatus, [administrators and teachers], etc. In most cases there is no free exercise whatever of the judgment or of the moral sense; but they put themselves on a level with wood and earth and stones; and wooden men can perhaps be manufactured that will serve the purpose as well. Such command no more respect than men of straw or a lump of dirt.”- Henry David Thoreau [1817-1862], American author and philosopher
Will PARCC, Smarter Balanced or ACT Aspire tests be any better?
NO! By virtue of being standardized tests, they have all the inherent errors in construction, in administration and in the dissemination of the supposed “results” that Wilson has identified, which render the whole process invalid. Using any “results” from this educational malpractice is “vain and illusory” and UNETHICAL. To completely understand why, see Noel Wilson’s “Educational Standards and the Problem of Error” found at: http://epaa.asu.edu/ojs/article/view/577/700
There has been no rebuttal whatsoever that I have found of Wilson’s work. NONE! See below for summary.
Brief outline of Wilson’s “Educational Standards and the Problem of Error” and some comments of mine. (updated 6/24/13 per Wilson email)
1. A quality cannot be quantified. Quantity is a sub-category of quality. It is illogical to judge/assess a whole category by only a part (sub-category) of the whole. The assessment is, by definition, lacking in the sense that “assessments are always of multidimensional qualities. To quantify them as one dimensional quantities (numbers or grades) is to perpetuate a fundamental logical error” (per Wilson). The teaching and learning process falls in the logical realm of aesthetics/qualities of human interactions. In attempting to quantify educational standards and standardized testing we are lacking much information about said interactions.
2. A major epistemological mistake is that we attach, with great importance, the “score” of the student, not only onto the student but also, by extension, the teacher, school and district. Any description of a testing event is only a description of an interaction, that of the student and the testing device at a given time and place. The only correct logical thing that we can attempt to do is to describe that interaction (how accurately or not is a whole other story). That description cannot, by logical thought, be “assigned/attached” to the student as it cannot be a description of the student but the interaction. And this error is probably one of the most egregious “errors” that occur with standardized testing (and even the “grading” of students by a teacher).
3. Wilson identifies four “frames of reference,” each with distinct assumptions (epistemological basis) about the assessment process from which the “assessor” views the interactions of the teaching and learning process: the Judge (think of a college professor who “knows” the students’ capabilities and grades them accordingly); the General Frame (think of standardized testing that claims to have a “scientific” basis); the Specific Frame (think of learning by objective, like computer-based learning, getting a correct answer before moving on to the next screen); and the Responsive Frame (think of an apprenticeship in a trade or a medical residency program where the learner interacts with the “teacher” with constant feedback). Each category has its own sources of error, and more error in the process is caused when the assessor confuses and conflates the categories.
4. Wilson elucidates the notion of “error”: “Error is predicated on a notion of perfection; to allocate error is to imply what is without error; to know error it is necessary to determine what is true. And what is true is determined by what we define as true, theoretically by the assumptions of our epistemology, practically by the events and non-events, the discourses and silences, the world of surfaces and their interactions and interpretations; in short, the practices that permeate the field. . . Error is the uncertainty dimension of the statement; error is the band within which chaos reigns, in which anything can happen. Error comprises all of those eventful circumstances which make the assessment statement less than perfectly precise, the measure less than perfectly accurate, the rank order less than perfectly stable, the standard and its measurement less than absolute, and the communication of its truth less than impeccable.”
In other words, all the logical errors involved in the process render any conclusions invalid.
5. The test makers/psychometricians, through all sorts of mathematical machinations, attempt to “prove” that these tests (based on standards) are valid: errorless, or supposedly at least with minimal error [they aren’t]. Wilson turns the concept of validity on its head and focuses on just how invalid the machinations and the test and results are. He is an advocate for the test taker, not the test maker. In doing so he identifies thirteen sources of “error”, any one of which renders the test making/giving/disseminating of results invalid. A basic logical premise is that once something is shown to be invalid it is just that, invalid, and no amount of “fudging” by the psychometricians/test makers can alleviate that invalidity.
6. Having shown the invalidity, and therefore the unreliability, of the whole process, Wilson concludes, rightly so, that any result/information gleaned from the process is “vain and illusory”. In other words, start with an invalidity, end with an invalidity (except by sheer chance every once in a while, like a blind and anosmic squirrel who finds the occasional acorn, a result may be “true”), or to put it in more mundane terms: crap in, crap out.
7. And so what does this all mean? I’ll let Wilson have the second to last word: “So what does a test measure in our world? It measures what the person with the power to pay for the test says it measures. And the person who sets the test will name the test what the person who pays for the test wants the test to be named.”
In other words it measures “‘something’ and we can specify some of the ‘errors’ in that ‘something’ but still don’t know [precisely] what the ‘something’ is.” The whole process harms many students, as the social rewards for some are not available to others who “don’t make the grade (sic).” Should American public education have the function of sorting and separating students so that some may receive greater benefits than others, especially considering that the sorting and separating devices, educational standards and standardized testing, are so flawed not only in concept but in execution?
My answer is NO!!!!!
One final note with Wilson channeling Foucault and his concept of subjectivization:
“So the mark [grade/test score] becomes part of the story about yourself and with sufficient repetitions becomes true: true because those who know, those in authority, say it is true; true because the society in which you live legitimates this authority; true because your cultural habitus makes it difficult for you to perceive, conceive and integrate those aspects of your experience that contradict the story; true because in acting out your story, which now includes the mark and its meaning, the social truth that created it is confirmed; true because if your mark is high you are consistently rewarded, so that your voice becomes a voice of authority in the power-knowledge discourses that reproduce the structure that helped to produce you; true because if your mark is low your voice becomes muted and confirms your lower position in the social hierarchy; true finally because that success or failure confirms that mark that implicitly predicted the now self evident consequences. And so the circle is complete.”
In other words, students “internalize” what those “marks” (grades/test scores) mean, and since the vast majority of the students have not developed the mental skills to counteract what the “authorities” say, they accept as “natural and normal” that “story/description” of them. Although paradoxical in a sense, the “I’m an ‘A’ student” is almost as harmful as “I’m an ‘F’ student” in hindering students from becoming independent, critical and free thinkers. And having independent, critical and free thinkers is a threat to the current socio-economic structure of society.
Well, I’ve never agreed with all the high stakes testing and I certainly don’t think that students’ results on such tests should play a part in teacher evaluations. I will simply continue to do what I can to spread the word about the Opt Out option.
“They,” meaning NYSED, the Regents, the Governor and the Commissioner, were thinking “How can we be sure that our prediction comes true?” Then, they were thinking, now that the public BELIEVES that most students are failures, we can convince them that the teachers and principals are failures too. THEN, we can swoop in with our “death penalty” comment and start taking over all the schools in NYS!
Notice that nowhere in “what they were thinking” is what’s appropriate for students, how to deal with childhood poverty, or the effect of teacher cuts and larger class sizes. It doesn’t occur to them because their children don’t suffer through this, because, as Commissioner King said, “The school my children attend already are doing these things.” So, if you are wealthy, your children aren’t subjected to the very things that are killing public schools – how very noble of them…….NOT!
It feels like a “mob mentality” …running over anyone who stands in the way of achieving their goals. But in the end how can this make things better? Scary.
It’s very scary………and I have to say from this NYer’s perspective everything the Governor has done is his attempt to get to the White House. He bullies his way over anyone and everyone in his path with only HIS end goal in mind.
Rest assured that education crazies are not only in NY.
Schwach mentions the “archaic aspirin ad” and it reminded me of observing a second grade classroom in Nacogdoches, Texas decades ago for a nationwide study on reading, where they were using phonics methods and read a book written in the 1930s. This story was about a girl’s “frocks,” which they all shouted together in a community read. I asked a boy near me what a frock was, and he said, “I don’t know.” But he did know how to sound it out.
Will Common Core offer that kind of standardized language arts methodology? But after my university class last term, which used Diane’s book as its core and actually did study the new math, reading and comprehending old aspirin ads is easy in comparison.
That 90 year old article on aspirin suggests that the test question writers are following in David Coleman’s mold: hold every student to the highest imaginable bar.
Implicitly, this is a callous attitude to all others.
How is that article “holding every student to the highest imaginable bar”?
I reject that idea of curriculum that they are pushing and I will tell all my friends in Albany about it. The method they are using for cut scores is not appropriate, fair or valid.
Working with classroom teachers who know the curriculum and who know their students, we have performed standard setting in the past.
For example, standard setting done with the Stanford Reading Test at Grade 3 (using standard scores, not NCE):
Contrasting Groups Method: 12% of students fall below standard
B G Method: 35% fall below
Teacher Rating Method: 12% fall below
Report Card Method (using average “D” score): 6% fall below
—————-
On a higher grade level there would be more students falling below the standard because more D’s are given as students know more of the curriculum.
Working with teachers in a Rhode Island workshop on social studies, the advice we gave was “treat them all as gifted” and go on from there. It is good to be realistic, but we also have the Rosenthal-Jacobson effect when teachers are told “these students are failing” and it sets up a self-fulfilling prophecy.
——————————
This is done within the school building level at various grades and the leaders are expected to put the information together and then report what their standard is. That is wholly different from scientists, engineers, or corporate business types telling the school what a “standard” should be.
—————————-
The quote below is from an ETS standard setting document used for training in Massachusetts.
quote: “Standard-setting study is an official research study conducted to determine a cutscore for the test. To be legally defensible in the USA and meet the Standards for Educational and Psychological Testing, a cutscore cannot be arbitrarily determined, it must be empirically justified. For example, the organization cannot merely decide that the cutscore will be 70% correct. Instead, a study is conducted to determine what score best differentiates the classifications of examinees, such as competent vs. incompetent.”
Another example at the community college level: Middlesex C.C. in Bedford, MA asked for a review of their curriculum based on the reading levels of their students. Giving a test like the Nelson Denny reading test can begin the process. However, the teachers who know their curriculum and who know their students are involved in the next steps to verify any “standards” that are set. For example: reading levels of CC students (average age can be as high as 25 but all have completed GED or diploma). The data are not from the entire college population but those selected to participate in the labs for intense instruction. These data are used by the faculty to set standards and to look at the retention of students until they graduate. I don’t think that NAEP studies on the national or state level provide any additional useful data and they are just being used to discourage students and faculty.
Critical Need: 6%
Severe Need: 7%
Moderate/Developmental Needs: 13%
At or Above Standard: about 75%
Sorry to take up so much space, but if anyone knows of better resources I would appreciate any comments. The NAEP and test scores at the state level need to provide ITEM analysis for the schools and teachers.
These data need to be at the individual school level.
For example, in reading or math: Here is a listing of Objectives where students show under 50% mastery…. priorities can be set. (End of Grade 8)
Vocabulary (multimeaning of words): 33% mastery
Meaning Affixes: 42% mastery
Writing Techniques: 33% mastery
Decimals/Fractions: 25% mastery
Integers: 0% mastery
Problem solving: 33% mastery
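For anyone who wants to see how a school might turn a listing like this into priorities, here is a minimal sketch. It assumes the item-analysis results arrive as objective/percent pairs; the function name `priorities` and its `threshold` parameter are hypothetical, and only the six percentages come from the listing above.

```python
# Mastery percentages copied from the End of Grade 8 listing above.
mastery = {
    "Vocabulary (multimeaning of words)": 33,
    "Meaning Affixes": 42,
    "Writing Techniques": 33,
    "Decimals/Fractions": 25,
    "Integers": 0,
    "Problem solving": 33,
}


def priorities(results, threshold=50):
    """Return (objective, percent) pairs below the threshold, lowest mastery first.

    Hypothetical helper: the 50% default mirrors the "under 50% mastery"
    rule mentioned in the comment, not any official procedure.
    """
    below = [(name, pct) for name, pct in results.items() if pct < threshold]
    # sorted() is stable, so objectives with equal mastery keep their listed order.
    return sorted(below, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, pct in priorities(mastery):
        print(f"{pct:3d}%  {name}")
```

Sorting from lowest mastery upward is only one possible way to rank the objectives; teachers who know the curriculum might well weigh them differently.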
————–
For one entire year a middle school used computers in science and the students were expected to develop study skills and problem solving. The progress made in study skills was insufficient with only about 4 new skills learned in a year by the middle school students. This is what the teachers would need to examine to improve curriculum. (TABS — technology and basic skills study Massachusetts).