Many people have wondered how the New York State Education Department permitted the nonsensical story about the pineapple and the hare to get onto the state test.
This is not the first time a really bad reading passage got onto the test and it won’t be the last.
State Commissioner John King was quick to issue a defensive statement saying that people were reading the story “out of context,” as if the full story made sense (it didn’t). And he was quick to pin the blame on teachers, who supposedly had reviewed all the test items. It was the teachers’ fault, not his. In an era when accountability is the hallmark of education policy, King was quick to refuse any accountability for what happened on his watch. These days, the ones at the top never accept accountability for what goes wrong; that’s for the “little people” like teachers and students, not for the bigwigs. No one holds them accountable, and they never accept any. None of them ever says, as President Harry S Truman did, “the buck stops here.”
So here is the reason that even a stupid, pointless story like the pineapple story–so thoroughly bowdlerized that it was disowned by Daniel Pinkwater, its original author–got past the review panel. I know about this process because I spent seven years as a member of the National Assessment Governing Board and served on a committee where we read every single question that would appear on a national test. When the review committee gets the items, with questions and answers, you are told that this particular item has been thoroughly field-tested. It has appeared in a children’s magazine; it has been used in a state assessment. Here are the results with all the accompanying statistics for this item. You are also told that the publisher’s own technical reviewers approved the item; so did the publisher’s bias and sensitivity reviewers.
By the time the item reaches the teachers or external panel, it has been vetted, you are told, by many others. There is tremendous implicit pressure to go along with the judgment of others who you assume are very professional. They all agreed it was fine. Who are you to raise a question or complaint?
Since I am by nature a skeptic, I always read test passages and their questions and answers as if no one else had. And on more occasions than I can count, I said, “Stop. Wait. This doesn’t make sense. The question isn’t clear. None of the answers fits the question. There are two good answers,” or words to that effect.
But I understand the social pressure, the social consensus, that discourages questioning and criticism.
And that is how bad questions get onto standardized tests, and why the Pineapple question was not the first and will certainly not be the last to slip past the review panels.
The best remedy for this problem is to publish the questions and answers when the tests are finished. That way, everyone can see them. After all, as Mayor Bloomberg and Governor Cuomo and Secretary Duncan often remind us, when speaking of teacher evaluation ratings, “The public has a right to know.”
Since the tests are the linchpin of every national education policy today, the public has a right to know if the tests are fair, valid, reliable and reasonable ways to assess student learning.
Diane

Congratulations on your new blog! Thanks for providing this inside perspective. It certainly helps put King’s comments about the “teacher review” in broader context. It also sounds like it could inspire a terrific project for some enterprising journalist to fully document the life of a test question. Good luck with this new endeavor!
I’m curious about their item review process. I’ve heard/seen some processes that start with/give priority to item statistics (difficulty/discrimination scores). If that’s the starting place, it may make more sense how some items without “face validity” end up on large scale tests? Some psychometricians (in my opinion) focus too much on items’ ability to “separate” students (spread the scores out) and not enough on “does this item really measure anything real?” Jim Popham has written effectively (and passionately!) about this. Maybe test authors should start with a philosophy/vision about learning, rather than a psychometric theory, and the philosophy/vision about learning is the 1st “filter” items have to pass through even before the item statistics are examined?
1) Items don’t make it to field testing to generate the psychometric statistics until after they’ve already gone through layers of review and revision. Items that are bad on their face should not even get to the point that stats can be used to argue for them.
If that happens, the item writers, reviewers, editors and committees have all fallen down on the job.
2) Ravitch keeps saying that all items should be released after the test.
a) How do you link tests across years if you don’t have anchor items to try to equate the scales?
b) Item development is expensive. Releasing all the items every time would significantly increase costs. Are you willing to devote more money to test development to pay for that?
Congratulations on the new blog. I look forward to reading Bridging Differences, and I’m sure I will read this blog as well. I just read what I think is an excellent commentary on what we say we need our children to learn vs. what we test for; it is accessible at http://www.kentuckyteacher.org/kentucky-teacher-of-the-year/2012/04/21st-century-skills-need-21st-century-assessment/ .
Again, congratulations and thank you for letting so many of us know we are not alone.
@rmcenta Until this year, the item analysis process was very transparent. CTB McGraw had a posted process that was available and a call to field was sent out from SED looking for teachers to participate in each stage. It wasn’t until last week that I realized I didn’t see a single one of those calls this year. Hopefully, once this is over, Pearson will announce their process and engage teachers more openly in the conversation and development.
@ceolaf I suspect the response to b), given the climate today, will be a call to eliminate the tests entirely.
@DataDiva But we know that these tests are NOT going away. There are complaints about test length and bad passages and items. There are complaints about high-stakes decision making. Yes.
But policy-makers want the data. Politicians and education officials want the data. Data-based decision-making for building leaders and those above them — which is all the rage in a million ways and backed by B-schools and those who have attended them — is not going away.
The tests are not going away.
If they did, then Ravitch’s repeated call to release all items every year wouldn’t make any sense. So, we know we are talking about a world in which the tests persist.
And so, my questions for Ravitch and others who would have all items released every year remain. There are reasons why they are not released, and consequences to their release. How do they propose to deal with those consequences? Are they ready for them? Have they even considered them?
Wow! Thanks for the information, Jennifer.
I’m not convinced that teachers are particularly good at reviewing items. They are notoriously bad at construct isolation. Too often, they are — in my experience — locked in how THEY think about the construct, how THEY teach the lesson and what THEY look for in student work.
In their own practice, teachers need to think broadly about context, about multiple lessons/issues that interrelate and appear in student work together. Teachers need to make connections and give students credit for the work they do and the learning they show.
That is a VERY different view than the kind of focused and precise thinking that standardized assessment usually calls for. Student work should be more authentic, but standard based assessment works quite differently.
Moreover, much of the item development process is about politics and constituencies. It is often about giving the appearance of many voices having contributed.
I am all for including in the test development process classroom teachers who are capable of thinking through the lens of standardized assessment. They would likely have the best ability to understand how students will think about the items and thereby can be the best judges of validity.
But I’ve seen so many bad teacher-made assessments that I do not think it is necessarily a bad thing that SED did not include teachers in every step this past year.
Don’t you see the contradictions in your own reply?
ceolaf said:
“In their own practice, teachers need to think broadly about context, about multiple lessons/issues that interrelate and appear in student work together. Teachers need to make connections and give students credit for the work they do and the learning they show.
That is a VERY different view than the kind of focused and precise thinking that standardized assessment usually calls for. Student work should be more authentic, but standard based assessment works quite differently.”
Why teach kids one way then test another? Do you have any idea how much instructional time is lost to these tests?
FYI, teachers are in-serviced left and right about cross-curricular instruction, differentiated instruction, following IEPs and 504s, positive behavior intervention and increasing rigor and relevance. Yet every year we are given fewer resources and more paperwork.
Testing will not be around forever. It is hurting our kids and killing creativity. They need to know how to THINK, not choose the best answer or find information in a passage.
@ceolaf: I’m curious how you know teachers are “notoriously bad at construct isolation.” You talk a lot about your experience, and while I don’t discount personal experience, it is limited. My own personal experience is quite different. I teach in a school where we regularly meet as grade-level and content area teams (and recently as vertical teams) to discuss common assessments – those that are district-created and those we create. We assess writing and projects using rubrics – again some that are district-created and some that we create. We analyze data to the question level, to determine what an answer, particularly a wrong answer, might tell us about a student’s thinking and understanding. We compare our results and our lessons, analyzing what worked and what didn’t. This is an example of how testing, and the data it provides, can be useful in improving both instruction and student achievement.
You seem to imply that there is an inherent difference between “authentic” and “standard based” assessments. It sounds like you’re equating standards-based assessment with standardized testing, and while today’s standardized tests purport to test the standards required at a grade level, they are certainly not the only way to assess attainment of those standards. I reject the premise that standardized tests require “focused and precise” thinking, unless that’s the new euphemism for convergent thinking. If what we want to test is convergent thinking, well, okay, I guess. But it can’t be the most important thing we do.
Maryanne,
1) I know they are notoriously bad at it because I hear people talking about it. That is what notoriously means. As DataDiva said, James Popham talks about this a lot. Dan Koretz talks about it, too. Those are big time assessment people. But most people with decent amounts of experience in the assessment industry speak of this, too.
I know they are notoriously bad because the folks at teacher prep programs that I talk to say that they know that they need to do a better job preparing their students for assessment, that they and the field have done a poor job, and that classroom assessment has therefore been generally lower quality than it should be.
In those ways, I am not talking about my personal experience with teachers. Rather, I am depending on the experience of greater experts than me.
2) But I also have my own experience with my colleagues. I have my own experience when I visit other schools and I look at the tests students take. When I see a teacher grading a stack of something, I ask if I can take a look. When I hear kids complaining about tests, I ask if I can see them. That’s just my own experience, of course. But I would not use the word “notoriously” just to describe my own experience.
3) I do not think that TAKING standardized tests requires more focused and precise thinking than other tests. But CREATING high quality standardized tests *does* require more focused and precise thinking. As a teacher, I could create a pretty good test in less time than I expected my students to need to take the test. But a high quality multiple choice item? One where all the distractors are not only plausible-but-wrong, but also require ONLY the indicator to be differentiated from the key? That takes a lot more time. Those questions can be answered in VERY little time, but they require far more time to create than you realize. I prided myself on my students walking out of my tests with a greater understanding than they walked in with. But standardized assessment is not at all about teaching or supporting further learning. It is entirely focused on assessing current knowledge and/or skill.
4) I really wish divergent thinking was more rewarded in school. But it quite rarely is. All too often what is thought to be inclusive of divergent thinking is really just a well-defined range of acceptable answers. Really challenging assumptions and understanding implications? Not bloody common. We don’t do experiments in science class, we do demonstrations. Until we understand that those are NOT experiments, we will not even approach understanding how much we limit real learning.
5) I don’t see anything in your story about construct isolation. Can you tell me a bit more about how you assure that, or even if you care about it?
Rougeau,
What’s my contradiction? The needs of assessment are different than the needs of teaching?
1) There’s a move afoot to separate the role of evaluating teachers from that of supporting teachers. They are different tasks, and they require different things. There are any number of areas in which we recognize that teaching is a very different thing than performing or judging. Why are you so certain that k12 ed is so different?
2) Why teach one way and test another? Well, because teaching and testing are different tasks with different goals. That could be a reason. It could be because we are not willing to devote the resources to testing that we are willing to devote to teaching. There are plenty of reasons.
3) Yes, I know exactly how much instructional time these tests directly take up. And I know how much more time many teachers and schools devote to test prep. But I can also acknowledge that a great deal of time is taken up with other assessments. At least at the high school level — where I am far more expert — I am comfortable saying that state-mandated standardized assessment constitutes a minority of testing time in school. If each class gives just one full-period test a month, that’s two full weeks of tests. If there are little quizzes along the way in addition to that, that’s even more time. When we think about testing, we need to think about all of this.
4) State standardized assessment is killing kids’ creativity? It’s the assessments that are doing it? It’s not the convergent questions that too many teachers ask? It’s not the lack of connections between the curricular areas? It’s not seating kids in rows and lecturing to them? It’s not making kids copy down what’s on the board and regurgitate it aloud, in writing and on tests? It’s not notebook checks? It’s not seat time requirements? It’s just the state assessments? That’s THE culprit? Come on! Look around you. Our schooling does a TON of things to kill creativity, and state mandated testing is probably not even in the top ten.
5) The state tests are lousy. No question. Too often they confuse construct isolation with simply being reductive. They distort learning standards to make them cheaply assessable. Too often, they are designed and implemented to report without providing educational benefit. They are a major contributor to the dumbing down of education, when it should be focusing on things like Ted Sizer’s Habits of Mind.
But I can voice those criticisms while also acknowledging that they are here, they are a fact of our lives, and we need to deal with them at least for the medium term. They are not going away this year, next year or the year after.
Teachers and parents can call to eliminate the standardized tests until the cows come home, and they will be ignored.
The testing industry is a multibillion-dollar enterprise, and its lobbyists never sleep. They are in DC and in major state capitals.
They are leeches. Or zombies.
Isn’t it amazing how many of the current crop of reformers went to independent schools where they never took a standardized test or send their own children to such schools? Isn’t it amazing that our country became the most powerful in the world before the age of NCLB? Why are we so dependent on these ways of measuring children? Why so distrustful of the professionals who see them every day? Why does Pearson know more about their abilities than their teachers?
Ms. Ravitch, I’m sure you know the answer to all the questions you’ve posed in this last comment. It all boils down to money. I’m convinced that the purpose of the testing is not to test what the children have learned, but what they have been taught. In other words, the instructors are being measured. In this crumbling economy, citizens want to know that they are getting their money’s worth. In some distorted way, the general public (in some cases) thinks that teachers are no more than glorified, overpaid babysitters. It is a way for those not in the classroom to get teachers out. We’ve all read the teacher-bashing comments on many blogs and in the papers. It’s all connected.
If that were the purpose of testing, wouldn’t they ask for tests that are designed to do that?
What’s really going on is that they have decided to add an additional purpose for the testing that already exists.
Thanks, Diane, for the perspective. Glad to hear you have a regular blog here!
Congrats Diane for hitting the nail on the head, again! Releasing the test items would be one way to keep the test makers accountable. Note to ceolaf: SAT, ACT and AP release test items after a reasonable period of time. In fact, many of their test prep books are composed of old test items. Their tests are widely considered to be both valid and reliable and the companies are profitable. Just a thought: how many years are anchor items kept in order to keep the tests consistent? Wouldn’t they have to be changed occasionally due to linguistic, cultural, and technological changes that might make them obsolete?
Tests are tedious. I hated them. Felt they were unfair. Yet my country, Israel, has the second-most companies traded on NASDAQ and I ended at Tufts at Harvard. In the US I work with gangs and am beyond worried about school failure. It literally causes death. I don’t want to blame kids, parents, teachers or Republicans.
I want us to be ambitious, relentless, and successful or we are at great peril. Teny. Providence, RI
Israel has an incredibly successful economy but does not do well on international tests. So?
I DO NOT “understand the social pressure, the social consensus, that discourages questioning and criticism.” Any teacher who doesn’t have the backbone to question and criticize should resign. First and foremost we are advocates for children. Any teacher who can’t summon the courage to do so should be ashamed of herself/himself and get out of the profession immediately.
Making the least-bad choice from a limited range of options that have little or nothing to do with an absurd situation is a real-life, job oriented skill. Looking at a situation and saying “What the hell is this all about?” is an everyday experience in the working world. Questions such as the pineapple race help prepare our students for adult life.
This is a beautiful letter from an 8th grade teacher to her students…let’s face it, the test is the test and we can and will all work to improve it, but this is what teachers should be telling their wonderful students.
‘A Test You Need to Fail’: A Teacher’s Open Letter to Her 8th Grade Students
by Ruth Ann Dandrea
Dear 8th Graders,
I’m sorry.
(Illustration: David McLiman)
I didn’t know.
I spent last night perusing the 150-plus pages of grading materials
provided by the state in anticipation of reading and evaluating your
English Language Arts Exams this morning. I knew the test was
pointless—that it has never fulfilled its stated purpose as a
predictor of who would succeed and who would fail the English Regents
in 11th grade. Any thinking person would’ve ditched it years ago.
Instead, rather than simply give a test in 8th grade that doesn’t get
kids ready for the test in 11th grade, the state opted to also give a
test in 7th grade to get you ready for your 8th-grade test.
But we already knew all of that.
What I learned is that the test is also criminal.
Because what I hadn’t known—this is my first time grading this
exam—was that it doesn’t matter how well you write, or what you think.
Here we spent the year reading books and emulating great writers,
constructing leads that would make everyone want to read our work,
developing a voice that would engage our readers, using our
imaginations to make our work unique and important, and, most of all,
being honest. And none of that matters. All that matters, it turns
out, is that you cite two facts from the reading material in every
answer. That gives you full credit. You can compose a “Gettysburg
Address” for the 21st century on the apportioned lines in your test
booklet, but if you’ve provided only one fact from the text you read
in preparation, then you will earn only half credit. In your
constructed response—no matter how well written, correct, intelligent,
noble, beautiful, and meaningful it is—if you’ve not collected any
specific facts from the provided readings (even if you happen to know
more information about the chosen topic than the readings provide),
then you will get a zero.
And here’s the really scary part, kids: The questions you were asked
were written to elicit a personal response, which, if provided, earns
you no credit. You were tricked; we were tricked. I wish I could
believe that this paradox (you know what that literary term means
because we have spent the year noting these kinds of tightropings of
language) was simply the stupidity of the test-makers, that it was not
some more insidious and deliberate machination. I wish I could believe
that. But I don’t.
I told you, didn’t I, about hearing Noam Chomsky speak recently? When
the great man was asked about the chaos in public education, he
responded quickly, decisively, and to the point: “Public education in
this country is under attack.” The words, though chilling, comforted
me in a weird way. I’d been feeling, the past few years of my
30-plus-year tenure in public education, that there was something or
somebody out there, a power of a sort, that doesn’t really want you
kids to be educated. I felt a force that wants you ignorant and
pliable, and that needs you able to fill in the boxes and follow
instructions. Now I’m sure.
It’s not that I oppose rigorous testing. I don’t. I understand the
purpose of evaluation. A good test can measure achievement and even
inspire. But this English Language Arts Exam I so unknowingly
inflicted on you does neither. It represents exactly what I am opposed
to, the perpetual and petty testing that has become a fungus on the
foot of public education. You understand that metaphor, I know,
because we have spent the year learning to appreciate the differences
between figurative and literal language. The test-makers have not.
So what should you do, my beautiful, my bright, my intelligent, my
talented? Continue. Continue to question. I applaud you, sample
writer: When asked the either/or question, you began your response,
“Honestly, I think it is both.” You were right, and you were brave,
and the test you were taking was neither. And I applaud you, wildest
8th grader of my own, who—when asked how a quote applied to the two
characters from the two passages provided—wrote, “I don’t think it
applies to either one of them.” Wear your zeroes proudly, kids. This
is a test you need to fail.
I wondered whether giving more than 10 minutes of every class period
to reading books of our own choosing was a good idea or not. But you
loved it so. You asked for more time. Ask again; I will give you
whatever you need. I will also give you the best advice I can, advice
from the Nobel Prize-winning writer, Juan Ramón Jiménez. Ray Bradbury
thought this was so important, he used it as the epigraph at the
beginning of Fahrenheit 451: “When they give you lined paper, write
the other way.”
It is the best I have to offer, beyond my apologies for having taken
part in an exercise that hurt you, and of which I am mightily ashamed.
© 2012 Rethink Schools