The link was left off. It is here.

Valerie Strauss reports on an important new study by a group at Stanford University led by historian Sam Wineburg.

NAEP supporters say that the tests are able to measure skills that other standardized tests can’t: problem solving, critical thinking, etc. But this post takes issue with that notion. It was written by three Stanford University academics who are part of the Stanford History Education Group: Sam Wineburg, Mark Smith and Joel Breakstone.

Wineburg, an education and history professor in the Graduate School of Education, is the founder and executive director of the Stanford History Education Group and of Stanford’s PhD program in history education. His research interests include assessment, civic education and literacy. Smith, a former high school social studies teacher in Iowa, Texas and California, is the group’s director of assessment; his research focuses on K-12 history assessment, particularly on issues of validity and generalizability. And Breakstone, a former high school history teacher in Vermont, directs the Stanford History Education Group. His research focuses on how teachers use assessment data to inform instruction.

The National Assessment of Educational Progress is considered the “gold standard” of education testing because it is the only national longitudinal measure of student achievement, with results going back to 1970: no one can practice for it; no one knows which students will take the test; no single student takes the entire test; and samples of students in every state take portions of the tests.

But when it comes to standardized testing, there is no gold standard. It is all dross, especially now that almost all standardized tests are delivered online. Online testing is popular because it is cheap and supposedly fast. But online testing by its nature leaves no room for demonstrating thoughtfulness, divergent thinking or creative responses. It is the enemy of critical thinking.

Wineburg’s group set out to determine whether NAEP actually tests critical thinking, and found that it does not.

But what would happen [they asked] if instead of grading the kids, we graded the test makers? How? By evaluating the claims they make about what their tests actually measure.

For example, in history, NAEP claims to test not only names and dates, but critical thinking — what it calls “Historical Analysis and Interpretation.” Such questions require students to “explain points of view,” “weigh and judge different views of the past,” and “develop sound generalizations and defend these generalizations with persuasive arguments.” In college, students demonstrate these skills by writing analytical essays in which they have to put facts into context. NAEP, however, claims it can measure such skills using traditional multiple-choice questions.

We wanted to test this claim. We administered a set of Historical Analysis and Interpretation questions from NAEP’s 2010 12th-grade exam to high school students who had passed the Advanced Placement (AP) exam in U.S. History (with a score of 3 or above). We tracked students’ thinking by having them verbalize their thoughts as they solved the questions.

What we learned shocked us.

In a study that appears in the forthcoming American Educational Research Journal, we show that in 108 cases (27 students answering four different items), there was not a single instance in which students’ thinking resembled anything close to “Historical Analysis and Interpretation.” Instead, drawing on canny test-taking strategies, students typically did an end run around historical content to arrive at their answers.

Their analysis is fascinating.

It is past time that we relinquished our obsession with standardized testing.