John Thompson: The Misinterpretation of NAEP Data

John Thompson writes here about the negative consequences of shallow reporting on NAEP data. Reporters are sensitive to whether scores are up or down, but tend to ignore contextual factors that may play a role in student performance.

He writes:

Despite the problems with education metrics, the decline in the nation’s 2022 math and reading scores on the National Assessment of Educational Progress (NAEP) test is worrisome – if we look at the big picture.

As Diane Ravitch explained, the decline in scores during the pandemic was a “duh” moment. Rather than prompting panicky headlines, these predictable drops in scores should be seen in the broader context of the decade of declines which followed the implementation of rushed and simplistic corporate school reforms. And, as we should have done previously, we must acknowledge what reformers should have understood long ago – meaningful increases in learning require inter-connected, holistic team efforts, as opposed to metric-driven instructional shortcuts.

And we should also listen to Peggy Carr, commissioner of the National Center for Education Statistics (NCES), which administers the tests. The new data, she said, “reinforces the fact that recovery is going to take some time.” Carr and other experts also warn that the “academic decline is part of a broader picture that includes worsening school climate and student mental health.”

For example, “Oklahoma NAEP results reflect pandemic-fueled decline in math and reading scores.” Eighth grade reading in Oklahoma (which reopened schools more quickly than most states) declined by 7 points, compared to a three-point average national decline. Our Eighth grade math scores declined by 12 points, compared to a nationwide decline of eight points. And the state’s and the nation’s “plunge” in history scores has been worse.

But the story behind those numbers is complicated. So, before we can understand the mixed messages of short- and long-term NAEP findings, we must look at how they have often been misrepresented by the non-education press.

Chalkbeat properly quoted Peggy Carr, “There is nothing in this data that tells us there is a measurable difference between states and districts based solely on how long schools were closed.” And Education Week appropriately explained that all but the top-performing students saw declines, but the biggest drops were for the lowest-performing students, who were more likely to have parents who were “essential workers” who were disproportionately exposed to Covid, who were more likely to live in multi-generational households, and had the least access to medical care. Moreover, it further explained, “Reading scores for students in cities (where schools tended to be slower to reopen) stayed constant, as did reading scores for students in the West of the country.”

Yes, Covid closures led to an unprecedented decline in test scores, but many commentators should look more deeply at public relations spin dating back to the Reagan administration that inappropriately used NAEP test scores when arguing that public schools are broken. They stressed low levels of “proficiency” claiming that it correlated with grade level. And Jan Resseger explained:

A common error among journalists, critics, and pundits who misunderstand the achievement levels of the National Assessment of Educational Progress (NAEP). “Proficient” on NAEP is not grade level. “Proficient” on NAEP represents A level work, at worst an A-. Would you be upset to learn that “only” 40% of 8th graders are at A level in math and “only” 1/3 scored an A in reading?

On the other hand, the admittedly unprecedented (but expected) fall in NAEP scores during Covid followed a decade of stagnating or declining NAEP scores. Moreover, the recent release of falling history scores should lead to an open discussion about why the U.S. History scores have declined by 9 points since 2014.

And Chalkbeat stresses the need for conversations about the last two years, when “nearly every state has considered a bill that would limit how teachers can discuss racism and sexism in their classrooms, and 18 states have bans or other restrictions in place, according to a tracker compiled by Education Week.” For reasons I explain later, I’m especially impressed with its recommendation regarding the need for “weaving the (historical) material into other places in their (classrooms’) schedule.”

I began teaching History at John Marshall H.S. in the early 1990s during the crack and gangs crisis and after the standardized testing of the 1980s peaked. For the next 1-1/2 decades, outcomes improved at Marshall and in the nation as a whole. Marshall had serious problems, but I couldn’t believe how many great teachers it had. We had the autonomy necessary to teach in a holistic, inter-connected, cross-disciplinary manner. When I saw students carrying copies of Ralph Ellison’s Invisible Man, I had the freedom to deviate from the curriculum schedule, and teach about Ellison’s childhood in Oklahoma City, and how it informed his novel. We took fieldtrips to the Capitol, and had regular classroom visits by legislators and local leaders. And we watched excellent programs on OETA (which our Gov. Kevin Stitt recently tried to defund).

Rather than teach to the test, I’d post the day’s State Standards, and History in the News topic. Students would drop by before class to peek at the day’s History in the News question. They quickly learned how to “weave” historical narratives into contemporary issues.

Marshall improved more than any other OKCPS neighborhood high school until the No Child Left Behind Act of 2001’s and Race to the Top’s test-driven mandates became dominant. By the time I retired in 2010, my students who came from the poorest neighborhoods complained that they had been robbed of an education. When guest teaching up to 2020, I saw young teachers who wanted to offer culturally meaningful instruction, but it was hard for educators and students to do something that they rarely saw in a 21st-century classroom.

Getting back to the type of solutions discussed in Chalkbeat and Education Week, Education Watch’s Jennifer Palmer wrote a hopeful piece about a pilot program at F.D. Moon Middle School. It uses “a social studies curriculum built on encouraging students to engage in civil discourse and celebrate American ideals while also examining darker chapters of history.” The program was created by iCivics, founded by retired U.S. Supreme Court Justice Sandra Day O’Connor. Its U.S. History curriculum is “based on the Roadmap to Educating for American Democracy, a joint project with iCivics, Harvard, Tufts and Arizona State universities.”

Palmer witnessed the energy displayed by Beatrice Mitchell’s 8th grade social studies class. All of them “passed the U.S. naturalization test, a new graduation requirement starting this school year.” This stands in contrast to a recent survey which “found just 1 in 3 adults can pass the exam … Oklahoma’s passing rate was even lower at 1 in 4 adults.”

It is unclear whether this nonpartisan program will clash with the Oklahoma Board of Education’s special report on “diversity, equity and inclusion programs at the request of State Superintendent Ryan Walters.” As Palmer noted, “Walters, a former history teacher, claimed such programs are ‘Marxist at its core.’” At any rate, it’s not just history that must be woven into other subjects. If we hope to teach critical thinking and 21st-century skills, schools must abandon their test-driven silos, and teach students to be independent thinkers who listen, and learn how to learn. And, holistic instruction must be restored, as one part of serving each whole child. A first step, however, should be the non-education press shifting from alarmist headlines to meaningful solutions reported in the education press.

rcharvet says:

June 27, 2023 at 11:40 am

“And, holistic instruction must be restored, as one part of serving each whole child.” I know this works; I did it with my students (going rogue when the doors were locked) and they learned in their “whole child way.” I was always a “how’s and why’s” teacher, but was always told, “…you have to teach like the others. You need to follow the standards. You need to double up on your standards…” Blah, blah, blah CRAP. I have stated this before, but I said, “We are the eyes and ears of the child; we LISTEN to their needs…” Once again, I was told “not going to happen.” Yes, school needs to be a magical and meaningful place where WE ALL LEARN TOGETHER. I just saw a few more of my “lost soul” former students and all are doing well despite going to the “crap school for losers with the crap teachers.” Many have taken their unique talents and are in construction, repairing my car, installed my water heater, entrepreneurs, poets, and artists. They are healthy, happy, and communicate well because we “did it our way.” It was conceptual and applicable instruction to help students sustain their lives. I digress.

SomeDAM Poet says:

June 27, 2023 at 12:04 pm

I find all the discussion of NAEP scores puzzling.

What’s the point?

If NAEP scores actually went down as a result of the pandemic, so what?

What is supposed to be done about it?

Can anything actually be done about it?

And if we (and more precisely our teachers) can’t do anything about it, what’s the point of even discussing it?

Many (though obviously not all) of us get how the NAEP scores should not be used.

But I think the much more important question (one that I have never seen discussed here or anywhere else) is how, precisely, should they be used?

For example, What difference would there be (if any) if there were no NAEP test and scores?

I’d like to see a detailed description of why the scores are important and specifically how they should be used (e.g., to set education policy, curriculum, etc.).

Perhaps someone at NAEP would be so kind as to write something to be posted here that describes the detailed practical reasons for and proper uses of the scores — which teachers could use to inform their teaching.

RageAgainstTheEdumeddlers says:

June 27, 2023 at 12:22 pm

All of that and much more is just a click away:

https://nces.ed.gov/nationsreportcard/

Unfortunately, most education parrots, I mean reporters, fail to do their homework.

- SomeDAM Poet says:
  
  June 27, 2023 at 3:58 pm
  
  I already visited that site.
  
  That’s why I asked the questions.😊
  
  I’m looking for a straightforward explanation of why the test is as important as some believe and why it is useful.
  
  I think it was Einstein who said “if you can’t explain something simply, you don’t really understand it”
  
  Maybe Diane could provide a simple explanation that people could read here without having to wade through a mountain of statistical mumbo jumbo
- RageAgainstTheEdumeddlers says:
  
  June 27, 2023 at 4:55 pm
  
  The NAEP is perceived to be important by some thanks in large part to self promotion; it is billed as the “Nation’s Report Card” after all. It is the only common standardized test administered nationally at the elementary/secondary levels, which implies that it provides an apples to apples comparison of states and between various sub-groups in any tested year (despite variations in standards). Comparisons across the decades do not factor in the big switch to Common Core standards in 2012. NAEP also administers tests in 10 different subject areas, a level of thoroughness that may also imply (self) importance.
  
  “Useful” to parents, administrators, teachers, BOEs, or (god forbid) students? Ha! Less than useless, thanks to the negative press produced by incompetent reporting. Useless to all of these stakeholders because 97.5% of students in any one of the tested cohorts – are not tested. And for the 2.5% who are tested, individual scores are never revealed. And the testing of seniors in their last semester is an absolute fool’s errand. Note they don’t even bother reporting these ridiculously useless scores.
  
  Useful to the media for generating attention grabbing headlines and sound bites that misrepresent, mislead, and gaslight the general public.
  
  Just my dos centavos
- dianeravitch says:
  
  June 27, 2023 at 6:06 pm
  
  There are no individual scores on NAEP. No student takes the entire test.
- RageAgainstTheEdumeddlers says:
  
  June 27, 2023 at 6:26 pm
  
  NAEP testing should be important (and useful) because it provides a more acceptable model for standardized testing that would be a significant improvement over federal ESSA testing requirements.
  
  NAEP uses grade span testing at the 4th, 8th, and (foolishly) the 12th grades – and it uses representative sampling methods that only affect/inconvenience 2.5% of each cohort tested.
  
  NAEP also administers tests in 10 different subject areas which would encourage a more well-rounded curricula and theoretically alleviates any testing pressure on those same two subjects (math and ELA) that are pounded relentlessly.
  
  NAEP remains a no-stakes, non-threatening, non-punitive battery of tests that are relatively innocuous compared to the disaster of RTTT and the CC waiver programs of the Obama/Duncan era.
  
  Congress would do us all a favor by adopting the NAEP model of standardized testing in the next ESSA re-write.
  Ha!
- bethree5 says:
  
  June 29, 2023 at 6:37 pm
  
  Rage– agree 100% with your reasoning. Bottom line: Congress should remove NCLB/ ESSA testing entirely; NAEP is already in place.
Ohio Algebra II Teacher says:

June 27, 2023 at 12:37 pm

And how would these always anti-schools/anti-teachers commentators have responded if the scores didn’t go down? Perhaps something like: “See, the schools are worthless. Scores aren’t any lower during the pandemic than under perfect conditions.” They are so disingenuous, and the one thing you can count on from them is dishonest, one-sided rhetoric.

- YVonne says:
  
  June 27, 2023 at 12:48 pm
  
  TRUE! Thank you, Ohio Algebra II Teacher.
NoBrick says:

June 27, 2023 at 12:45 pm

No tickey, no washey.

No scores,
no score based avatars.

No avatars, no reason
to capitulate to badges
and names, to large
societies and dead
institutions.

Puzzling…
Former overseers, undoing
the magic, that happened
during/under their watch.

NOW you’ve got a
brake pedal to stop
the shit???

bethree5 says:

June 29, 2023 at 7:57 pm

SDP—you have a lot of good posts on this subject, but I keep coming back to this one. I don’t think anyone has answered your question, so I’ll give it a stab.

My understanding of the raison d’être for NAEP testing [started 1969] goes back to the original mission of the Dept of Ed in 1867. [It was promptly demoted to Office of Ed, and sometimes name got changed; part of Dept of Interior, then HEW before becoming a separate Cabinet agency.] Its mandate in the 1860’s was “education fact-finding,” i.e., collecting data on the status of US K12 schooling (both public and private).

The data was collected as fodder for developing federal education policy. Data-collection designed for purposes of comparing US K12 ed in various swaths of the country. NAEP fits right into that picture: always used to compare regions, now used for state-to-state comparisons as well. For the last 20+ yrs, they also collect data on large urban schools via TUDA, comparing their scores to national averages.

My take is, NAEP data was always intended to be broad-brush, giving general indication of trends, highlighting geographical areas &/or social strata that might need attention/ intervention from fed govt in this individual-state-provided enterprise.

Very much like the American Statistical Association’s description of what stdzd tests are (& are not) good for (see its seminal 4/8/14 criticism of VAM), the data is not suited for indiv vs indiv comparison, nor district vs district nor school vs school.

How is the data used? All states that receive Title I funding are reqd to participate in the NAEP testing sample, so it’s apparently used in determining where to direct that funding. Data results trigger various studies and projects at the fed level. It is also used at the state level to develop policies/ innovations, and more. See https://nces.ed.gov/nationsreportcard/about/policy_practice.aspx

You’ve said “if we (and more precisely our teachers) can’t do anything about it, what’s the point of even discussing it?” So, fed & state ed projects/ funding et al are developed, in part, based on NAEP stats. Obviously nothing teachers can do about it; they’re on the receiving end. I think it’s good for us (& parents, & taxpayers) as background info. To understand the system we’re swimming in, & be alert to its misuse.

RageAgainstTheEdumeddlers says:

June 27, 2023 at 12:17 pm

A thought experiment, if you will. You spent your 6th and 7th grade school years without the all important structure provided by formal, everyday in-school instruction. You’re plunked into 8th grade, spending several months trying to simply reset and get back to your normal school rhythm as best you can remember from your 5th grade year. January of 2022 rolls around, you’re just coming off holiday break and you are asked to sit for the NAEP math test. Before it is administered you ask if it “counts” and you are told that no it does not, but it is important that you do your best. You then ask when you will get your score and are politely informed by your teacher that you will never get your score. But, you are reminded, it is an important test and are urged to do your best. If you teach 8th grade, you know the rest.

Threatened Out West says:

As a social studies teacher, I can definitely agree with Mr. Thompson. Right now it is terrifying to attempt to teach history and civics and geography. We cannot teach current events in most cases, and we certainly cannot teach racism, LGBTQ issues, and ethnocentrism. Just before COVID, I showed students a political cartoon that had a distorted map of the world on “how Americans view the world.” It had some stereotypes that Americans sometimes use. It was a great discussion point and students loved discussing it. This was a cartoon I had used for more than five years with no issues. I would preface the discussion that these are not true statements but that these stereotypes are present in our national discourse. A grandparent (not even a guardian) saw the cartoon, as we had to post notes on our Google Classroom, and freaked out, accusing me of hating America, and teaching the kids distorted geography, not getting it was not intended for that purpose. The principal made me get rid of it and now the discussion of ethnocentrism is barely a minute long discussion. I don’t dare do more.

Bob Shepherd says:

June 27, 2023 at 12:39 pm

This is obscene, and your principal is a moron.

retired teacher says:

June 27, 2023 at 1:05 pm

That’s the problem with having non-educators looking over teachers’ shoulders and plans and totally missing the point. The same can be said about knee jerk responses to a single personal complaint that can pull important literature from the shelves and deny other students access to the book. Right wing extremists are doing everything they can to put public schools and teachers on the defensive.

Bob Shepherd says:

June 27, 2023 at 2:12 pm

Truly horrifying, TOW! I wish you were telling this to the United States Senate.

- Threatened Out West says:
  
  June 27, 2023 at 4:52 pm
  
  I would if the US Senate would listen.
- Bob Shepherd says:
  
  June 27, 2023 at 4:53 pm
  
  I know. But this is exactly the kind of thing that I wish they knew.

Bob Shepherd says:

June 27, 2023 at 12:32 pm

About “Misinterpretation of NAEP Scores”

Let’s assume just for giggles that these tests actually measure what they purport to measure—reading and math ability.

Possible scores on a NAEP test range from 0 to 500. These are 500-point tests. Average NAEP scores for age 9 students in 2022 declined 5 points in reading and 7 points in mathematics compared to 2020. So, we’re talking declines of 1 percent and 1.4 percent.

1 to 1.4 percent. Barely a tick off the dial. A TINY blip. I mean, declines SO SMALL that they might be well within the margin of error of the testing.

MUCH ADO ABOUT ALMOST NOTHING.

To put this into perspective, suppose that you had a grading scale like this for a classroom test:

A+ (97–100), A (93–96), A- (90–92), B+ (87–89), B (83–86), B- (80–82), C+ (77–79), C (73–76), C- (70–72), D+ (67–69), D (65–66), D- (below 65)

A decline of 1 percent would not even, typically, move you down a portion of a letter grade. Oh, gosh, I dropped from a 99 to a 98 (from an A+ to a slightly lower A+), from an 88 to an 87 (from a B+ to a slightly lower B+). Or, worst case, from an 87 to an 86 (from a B+ to a B).
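The classroom analogy above can be checked with a short sketch. It assumes only the grading scale listed in this comment; the names `GRADE_BANDS` and `letter_grade` are made up for illustration and do not come from NAEP or any gradebook software.

```python
# Grade bands copied from the scale above: (lowest score in the band, letter).
# GRADE_BANDS and letter_grade are hypothetical names used only for this sketch.
GRADE_BANDS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (65, "D"),
]


def letter_grade(score):
    """Return the letter for a 0-100 score; anything below 65 is a D-."""
    for floor, letter in GRADE_BANDS:
        if score >= floor:
            return letter
    return "D-"


# The three one-point drops mentioned in the comment.
for before in (99, 88, 87):
    after = before - 1
    print(before, letter_grade(before), "->", after, letter_grade(after))
```

Only the drop from 87 to 86 crosses a band boundary (B+ to B), which matches the "worst case" described above.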

Oh, the horror!!! Where are the smelling salts? The sky is falling! This is the end!!!! Quick, call Bill Gates! He has the solution to every problem, even ones this dire!!! Maybe ChatGPT can solve this biggie? Or Clippy the Paperclip! It’s surely going to take a long time to recover! The sky is falling! The sky is falling! Aie yie yie. Ridiculous.

Yet journalists and pundits keep on talking about this as though this “decline” (oh, the horror!) were significant.

It’s not.

Duane E Swacker says:

June 27, 2023 at 7:08 pm

Correction:

“MUCH ADO ABOUT ABSOLUTELY NOTHING.”

- SomeDAM Poet says:
  
  June 28, 2023 at 7:31 am
  
  There are a few things we can be certain of: time will keep passing. The Earth will keep going around the sun. And mathturbators will continue to mathturbate.
SomeDAM Poet says:

June 28, 2023 at 7:24 am

Even if the scores actually mean something with regard to educational achievement, comparing scores over the short term is a fool’s errand because of what amounts to “noise” on any possible signal.

It’s like trying to track climate change by looking at changes in the global mean temperature from one year to the next (which can swing wildly due to “weather noise” — eg, from el Nino and volcanic eruptions — that have nothing to do with the climate signal one is interested in). It’s just dumb. But lots of otherwise intelligent people engage in such mathturbation.

If there is any meaningful information to be had from NAEP, it can only be gleaned from long term trends — certainly over a period longer than from one test administration to the next.

- Bob Shepherd says:
  
  June 28, 2023 at 2:51 pm
  
  Well observed, SomeDAM

Bob Shepherd says:

June 27, 2023 at 12:47 pm

From the article:

“Eighth grade reading in Oklahoma (which reopened schools more quickly than most states) declined by 7 points, compared to a three-point average national decline. Our Eighth grade math scores declined by 12 points, compared to a nationwide decline of eight points.” Again, these are 500-point tests. So,

7/500 = 1.4%
3/500 = 0.6%
12/500 = 2.4%

The 12-point decline is a LITTLE BIT meaningful, PERHAPS, but the others simply are not large enough to be significant.
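For anyone who wants to redo this arithmetic, here is a minimal sketch. It assumes the 500-point scale stated in this thread, and `percent_of_scale` is a made-up helper name. It only restates each decline as a share of the scale; it does not test statistical significance, which would need standard errors that are not given here.

```python
SCALE_MAX = 500  # assumed top of the reporting scale, as stated in the thread


def percent_of_scale(points, scale_max=SCALE_MAX):
    """Express a point change as a percentage of the full score scale."""
    return round(100 * points / scale_max, 1)


# Point declines quoted from the article (hypothetical labels for illustration).
declines = {
    "Oklahoma 8th grade reading": 7,
    "National 8th grade reading": 3,
    "Oklahoma 8th grade math": 12,
    "National 8th grade math": 8,
}

for label, points in declines.items():
    print(f"{label}: {points}/{SCALE_MAX} = {percent_of_scale(points)}%")
```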

Bob Shepherd says:

June 27, 2023 at 1:03 pm

8/500 = 1.6%

Susan L Osberg says:

June 27, 2023 at 4:23 pm

Thank you, Bob, for doing the math here. You are correct that so often even the slightest “blip” is made to be a giant abyss that must be overcome. This is especially true with a 500-point test. Most folks think of a test as 100 points so the 12-point drop would be 12% and a “grade” of 88; as you show, on the 500-point test it is a smaller percentage.
Had I had the time today, the mathematics teacher in me would have done the math as well.
Thank you again.

- Bob Shepherd says:
  
  June 27, 2023 at 4:23 pm
  
  Thanks, Susan

John Thompson says:

June 27, 2023 at 12:52 pm

Bob, when you look at the decade of decline with seeing how test-driven corporate reforms undermined meaningful instruction, isn’t that important? Otherwise, aren’t you saying teach-to-the-test caused minimal damage?

Bob Shepherd says:

June 27, 2023 at 1:19 pm

I think that teaching to the test caused enormous damage. I put little stock in these tests as valid measures of reading and writing ability. But even if I did, NAEP scores have remained pretty much flat, barely changed (very tiny blips up and down), for decades now. Immediately after NCLB, people started doing lots of test prep, and scores improved INITIALLY because kids had learned the formats of the test questions. When kids are familiar with the formats of test questions, they do better than if they are not familiar with them. That’s why you can always improve scores by doing practice tests. But after that initial improvement, NAEP scores, as you know, pretty much went flat, which meant that the standards-and-testing-based “reforms” had NO EFFECT by the “reformers'” own preferred measure–the test scores.

There is no question that the decades of test and punish had dramatic negative effects on K-12 learning. Because the puerile Gates/Coleman “standards” bullet list and the tests that purportedly (but could not possibly) tested for achievement of those became so important–because school grades and educator bonuses and jobs became dependent upon them–they became all people cared about. All other subject areas except math and ELA started to be pretty much ignored, certainly deemphasized. And in ELA, coherent curricula were replaced by curricula that strung together random exercises on random “standards” from the CC$$. And, since those ELA “standards” were so vague and broad, they could not be validly tested by the state tests (or the NAEP), as in this was an impossibility, like a square circle or squaring the circle or a perpetual motion machine. And, of course, much of attainment in ELA involves learning of content, including both descriptive content (what is the genre of Frankenstein) and procedural content (how to format a Works Cited page), and since the tested “standards” are almost content free (they are a list of vaguely and broadly stated skills so vaguely and broadly stated that they are untestable by the means used), our curricula and pedagogy became so as well. In other words, there was a DRAMATIC DEVOLUTION of curricula in ELA as a result of the testing regime, which I witnessed up close and personal as a textbook writer and editor.

- Bob Shepherd says:
  
  June 27, 2023 at 1:22 pm
  
  I detail these problems with the testing regime and its horrific effects on ELA curricula and pedagogy here:
  
  Combating Standardized Testing Derangement Syndrome (STDs) in the English Language Arts
Bob Shepherd says:

June 27, 2023 at 1:26 pm

So, no, I’m not saying that teach-to-the-test caused minimal damage. It caused a dramatic devolution in ELA curricula and pedagogy. It led to breathtakingly superficial and incoherent ELA curricula because educational publishers started taking the test questions and the Gates/Coleman bullet list as the de facto curriculum outline!!!!! And, the tests are extremely poor measures of achievement in ELA.

Bob Shepherd says:

June 27, 2023 at 1:34 pm

John, I have enormous respect for you. I always look forward to your articles. They are typically really insightful. But these NAEP scores aren’t significant. They are barely a blip down.

- Catherine Blanche King says:
  
  June 27, 2023 at 1:43 pm
  
  John FWIW, I also endorse everything Bob says about the “turn” that occurred in education in ELA and other related curricula (history, the arts, philosophy, even the social sciences). It’s nothing less than a tragic turn of events in education, especially in a democracy where the education of persons as knowledgeable, critical, thoughtful, and respectful people, and not as replicating copies of Bill Gates or (argh!) Elon Musk, and their absence of depth (Plato’s “flat souls”), is essential for living with others in civilized and vibrant communities, from local to world. CBK
- Bob Shepherd says:
  
  June 27, 2023 at 1:52 pm
  
  I was working as an Executive VP at a well-known educational publishing house. We were told by our head office that we had to start every project, going forward, by making a list of the CC$$ in a spreadsheet and noting in other columns where that was “covered” in the program. And exercises and activities in the books and online materials were redone to make them look like the test questions. Where before, we would sit down and plan a coherent unit on some topic, now we were to turn every bit of instruction and every exercise into something that treated some content-free CC$$ “standard.” Many of the finest editors and writers I knew became so angry about this that they quit. They literally left the profession rather than be part of its destruction. And the morons who foisted the testing regime and those ridiculous “standards” on us, have NO CLUE, NO IDEA WHATSOEVER, that this even happened. Idiots.
- Catherine Blanche King says:
  
  June 27, 2023 at 3:51 pm
  
  Bob About the “standards,” you get to the point where there are no words (pun intended).
  However, I think this is where Diane has been a shining star . . . for decades. CBK
- Bob Shepherd says:
  
  June 27, 2023 at 3:52 pm
  
  true that
- Bob Shepherd says:
  
  June 27, 2023 at 1:58 pm
  
  The CC$$ is barely distinguishable from the state “standards” that preceded it. Here’s the difference: Before the CC$$, educational publishers pretended to correlate to all those state standards. This was marketing smoke and mirrors. But when CC$$ appeared, it became THE DEFAULT CURRICULUM OUTLINE IN ELA, with profoundly deleterious effects. And Gates and Coleman, ofc, are clueless about this.
leftcoastteacher says:

June 27, 2023 at 3:09 pm

I agree, John, and I’m sure Bob does too, that it’s important we tell edu-corporatists: Hey, before you go tearing other people’s hair out over any pandemic score drop, insignificant though it is, don’t forget that any decline in the quality of public education is your fault in the first place, not the fault of teachers unions. Pay no attention to statistically insignificant changes in irrelevant test scores, but if you insist on freaking out over them, at least take ownership of your responsibility, and stop blaming teachers unions for anything and everything. Over-testing, budget cutting, and Competency-Based Education don’t work, so quit it, and get out of the blasted way so that teachers can do what needs be done to restore the steady improvement that was associated with the ESEA before it was destroyed by the NCLB.

The NAEP says the scores are intended to improve education. Well, the scores suggest that we should stop collecting scores and instead fight a war on poverty.

SomeDAM Poet says:

June 28, 2023 at 7:46 am

The problem with the whole testing regime is that “increased test scores” are the be all and end all.

If scores go up, everyone pats themselves on the back and says “what a good job we are doing”.

But of course, if scores don’t go up, if they remain flat or even (gasp) decline, everyone has a Hindenburg moment (“oh, the humanity!”) and pulls their hair out trying desperately to figure out what can be done to boost the scores.

Boosting the scores is the focus because high scores equal educational achievement.

What do the tests measure? Educational achievement

What is educational achievement? What the tests measure

Any other questions?

- bethree5 says:
  
  June 29, 2023 at 3:52 pm
  
  BINGO, SDP!
  
  NAEP scores in math, reading, history all move by increments of a few points [say, 3 – 5 pts on a 500-pt test (which = 0.6% – 1% on a 100-pt test)], either up or down, resulting in a roughly flat line over two to three decades of testing. Yet uninformed score observers apparently expect to see a consistently upward trend– even though the test-takers are in the same grade level every time.
  
  Just imagine what THAT would look like. We DEMAND a 0.8% increase annually! Hey, we’ve been measuring these scores since the early ‘90s, and you mean to tell me we’re still getting the same results? Why in 30 years those scores should have increased by 24% — by more than two letter grades! In another 30 yrs those 8th-graders should be doing college-level work while they’re finishing middle school!

SteveA says:

June 27, 2023 at 2:30 pm

Ohio Algebra II teacher asks, “And how would these always anti-schools/anti-teachers commentators have responded if the scores didn’t go down?”

They would have said, “See, computer-based learning works. Students at home computers during covid-19 lockdowns learned just as much as students in classrooms. There is no loss of learning. And (possibly unsaid), it demonstrates that we should let Gates and Big IT run our education system.”

Lloyd Lofthouse says:

June 27, 2023 at 3:10 pm

Don’t shoot the messengers, the reporters. Shoot the CEOs and department editors, if you must.

If you do not know how a newsroom works, then it is understandable to blame the journalists, the reporters.

Still, most if not all stories are assigned to reporters/journalists by department editors and approved by those editors before publication. Those editors may also cut the size of a story, revise it, etc., changing what the grunt wrote.

The grunt is the reporter, the journalist that gathers the data, does the research, for a story. Most if not all of those reporters work on what they were assigned and have deadlines they must meet or risk being fired.

Most reporters are also not paid a living wage for the areas where they live. They are like classroom teachers, doing a job they thought they’d love. Those reporters, like teachers, started out with passion and energy until they were burned out by a flawed system run by CEOs and managers who have never taught a day in their lives or been a newsroom reporter.

Many newsrooms also get their stories from the Associated Press (AP) or its competition – editors pick what they want to run and may also revise and edit the original: “The Associated Press is an American not-for-profit news agency headquartered in New York City. Founded in 1846, it operates as a cooperative, unincorporated association, and produces news reports that are distributed to its members, U.S. newspapers and broadcasters.”

The AP also has competition:

https://craft.co/associated-press/competitors

To understand the news media, one must also know that 90% of the traditional media is owned and controlled by seven corporations. At the top of each corporation is a CEO and those CEOs may set the agenda for an entire news organization, hiring and firing, until they have a staff of department editors that will do what they want.

What happens when a reporter is assigned to a story like this one and they don’t know much of anything about the NAEP or public education? Does that reporter turn down the assignment and risk losing their job?

How much time does each reporter have to write a story? Not much. Most stories come with short time periods before the deadline, and reporters may have more than one story to work on at the same time.

Organizations like ALEC and M4L, which surely understand how the media works, use that knowledge to manipulate the system, so their cherry-picked, misleading message gets reported first. Trump, for all his shortcomings and malignant narcissism, clearly understands how to manipulate the media, and that helped him win the 2016 election. That has also helped him keep his BIG LIE alive and spread, for free.

Lloyd Lofthouse says:

June 27, 2023 at 3:14 pm

How do you beat manipulating, cherry-picking, lying organizations like ALEC and M4L when it comes to getting their message out first?

Do you know anyone with a crystal ball that reveals the future?

GregB says:

June 27, 2023 at 3:20 pm

If there’s one thing I’ve learned here over the years, it’s to be skeptical, but not suspicious, of data until proven otherwise. When we apply data to human activity and foibles, it’s a matter of perception and interpretation, and that, as we see, is debatable to many even if they experience and see it. After the fact they can lie about it.

On the other hand, scientifically observable data needs to be interpreted by a citizenry that respects and understands both the processes and conclusions one can draw from application of the scientific method. We obviously have anywhere from 70-90% of the population for whom that is asking too much. It includes all Republicans and a vast majority of everyone else. They interpret “data” however they want as long as it fits the conclusions of their bigotries and ignorances. These are the people who have never been out of the country, have never read beyond the t.v. listings and sports pages, and think they live in the greatest place on earth, which, according to them, is the freest country in the world…as long as you think like they do. Which brings us back to data.

June 27, 2023 at 4:01 pm

I think we can agree that state high-stakes test scores aren’t reliable guesstimates of student learning. Often increases in them occur when they drove learning down. And now it should be clear that they did more harm than good. NAEP is the best guesstimate and it shows, I believe, what I saw throughout my career: we’d be making incremental improvements and some top-down CYA mandates would wipe them out. The worst unforced errors came when stakes were attached to test scores. That’s why this week’s New Yorker article by Alec MacGillis upset me so much. He only cites four ideology-driven corporate reformers and no balanced social scientists, passes on falsehoods, and pushes quick fixes for covid learning loss, ignoring balanced appraisals on how to get our poorest children of color back on track. Of course, they blame teachers and unions. This may be the time when it’s most important to tell educators to ignore test results and focus on the whole child.
https://www.newyorker.com/magazine/2023/06/26/what-can-we-do-about-pandemic-related-learning-loss

Bob Shepherd says:

June 27, 2023 at 4:26 pm

Well said, John!

Duane E Swacker says:

June 27, 2023 at 7:17 pm

More mental masturbation about standardized test scores. What is new?

What would be new is if everyone here and in the public schools would refuse to implement the standards and testing malpractice regime. . . rescuing all students from the harms inherent in that malpractice regime.

RageAgainstTheEdumeddlers says:

June 27, 2023 at 7:38 pm

What if we held a test and nobody agreed to administer it?

Back in the early years of Common Core testing I tried to get my state legislators to sponsor a bill that would allow teachers to claim a “Conscientious Objector” status that would excuse them from administering the federally mandated tests. It fell on deaf ears as the push for teacher and school district accountability through test scores was all the rage, and no politician wanted to open that can of worms.

- John Thompson says:
  
  June 27, 2023 at 8:56 pm
  
  Great idea
- Bob Shepherd says:
  
  June 27, 2023 at 11:06 pm
  
  GREAT IDEA. We need this at the school level.
- SomeDAM Poet says:
  
  June 28, 2023 at 8:46 am
  
  Testientious Objector
John Thompson says:

June 27, 2023 at 8:56 pm

As testing and choice drove our mid-high to the bottom of the state, I urged our excellent English dept to help us resist. A great Gen Xer said, “You’re just like my parents. You Boomers could have unions and autonomy. We can’t.” Sure enough, even as the system got angrier and angrier at Boomers, they looked the other way when Boomers like me refused to comply and doubled down on coercing younger teachers to teach to the test.

democracy says:

June 28, 2023 at 5:09 am

Here’s Dana Goldstein, NY Times reporter, just a week ago, on NAEP scores:

“The math and reading performance of 13-year-olds in the United States has hit the lowest level in decades, according to test scores released today from the National Assessment of Educational Progress, the gold-standard federal exam…The last time math performance was this low for 13-year-olds was in 1990. In reading, 2004…the downward trends reported today began years before the health crisis, raising questions about a decade of disappointing results for American students…The federal standardized test, known as NAEP, was given last fall, and focused on basic skills. The 13-year-olds scored an average of 256 out of 500 in reading, and 271 out of 500 in math, down from average scores of 260 in reading and 280 in math three years ago.”

Nowhere in her article does she note the many problems with NAEP proficiency levels.

For example, a General Accounting Office study of NAEP standards reached this conclusion:

“We conclude that the NAEP scores selected through NAGB’s procedures are incomplete and somewhat misleading representations of the achievement levels. Student performance at the NAEP scores selected cannot validly be interpreted in terms of NAGB’s definitions or of the item judgments. The NAEP scores cannot be used to find the percentage of students who have met the content mastery and readiness criteria defined for each level.”

The National Academy of Sciences called the NAEP proficiency standards “fundamentally flawed.” The makers of the ACT test say a score of 22 on the math portion indicates readiness for college math, yet only 21 percent of those students who score between 21-25 on the ACT math are deemed “proficient” on NAEP.

One thing NAEP seems to measure fairly well is income inequality. Or, to put it a bit more precisely, research has found that between half and two-thirds of the variance in student academic performance on NAEP is explained by a cumulative family risk factor, which includes family income, the educational attainment of parents, family and neighborhood housing conditions, and the ability to speak and read English.

RageAgainstTheEdumeddlers says:

June 28, 2023 at 8:24 am

Another education parrot, I mean reporter, who didn’t do their homework.

“The federal standardized test, known as NAEP, was given last fall, and focused on basic skills.”

Not quite . . .

The scores reported here were from the Long-Term Trend assessments, not the main NAEP assessments.

It should be noted that the LTT assessments are different from the main NAEP assessments in reading and mathematics. Because the instruments and methodologies of the two assessment programs—LTT and main NAEP—are different, direct comparisons between the long-term trend results presented in this report and the main assessment results presented in other main NAEP reports are not possible.

“The LTT assessments were last updated in 2004 to better align the assessments with more *current practices.”

Hmmmmm . . . ?

*Common Core standards were implemented on a national scale in 2012.

Duane Edward Swacker says:

June 28, 2023 at 3:51 pm

“One thing NAEP seems to measure fairly well is income inequality.”

No, no, no, ten thousand times NO!

NAEP is not designed to “measure . . . income inequality.” One of the primary caveats of standardized testing is that one should not, cannot use the results of a test as indicators of something it wasn’t designed to assess. It’s a major falsehood to claim otherwise.

NAEP doesn’t “measure” anything!

The most misleading concept/term in education is “measuring student achievement” or “measuring student learning”. The concept has been misleading educators into deluding themselves that the teaching and learning process can be analyzed/assessed using “scientific” methods which are actually pseudo-scientific at best and at worst a complete bastardization of rationo-logical thinking and language usage.

There never has been and never will be any “measuring” of the teaching and learning process and what each individual student learns in their schooling. There is and always has been assessing, evaluating, judging of what students learn but never a true “measuring” of it.

But, but, but, you’re trying to tell me that the supposedly august and venerable APA, AERA and/or the NCME have been wrong for more than the last 50 years, disseminating falsehoods and chimeras??

Who are you to question the authorities in testing???

Yes, they have been wrong and I (and many others, Wilson, Hoffman etc. . . ) question those authorities and challenge them (or any of you other advocates of the malpractices that are standards and testing) to answer to the following onto-epistemological analysis:

The TESTS MEASURE NOTHING, quite literally when you realize what is actually happening with them. Richard Phelps, a staunch standardized test proponent (he has written at least two books defending the standardized testing malpractices) in the introduction to “Correcting Fallacies About Educational and Psychological Testing” unwittingly lets the cat out of the bag with this statement:

“Physical tests, such as those conducted by engineers, can be standardized, of course [why of course of course], but in this volume, we focus on the measurement of latent (i.e., nonobservable) mental, and not physical, traits.” [my addition]

Notice how he is trying to assert by proximity that educational standardized testing and the testing done by engineers are basically the same, in other words a “truly scientific endeavor”. The same by proximity is not a good rhetorical/debating technique.

Since there is no agreement on a standard unit of learning, there is no exemplar of that standard unit and there is no measuring device calibrated against said non-existent standard unit, how is it possible to “measure the nonobservable”?

THE TESTS MEASURE NOTHING for how is it possible to “measure” the nonobservable with a non-existing measuring device that is not calibrated against a non-existing standard unit of learning?????

PURE LOGICAL INSANITY!

The basic fallacy of this is the confusing and conflating of metrological measuring (metrology is the scientific study of measurement) with measuring that connotes assessing, evaluating and judging. The two meanings are not the same, and confusing and conflating them is a very easy way to make it appear that standards and standardized testing are “scientific endeavors”: objective and not subjective like assessing, evaluating and judging.

Those supposedly objective results are used to justify discrimination against many students for their life circumstances and inherent intellectual traits.

- democracy says:
  
  June 29, 2023 at 4:40 am
  
  Psst.
  
  “…research has found that between half and two-thirds of the variance in student academic performance on NAEP is explained by a cumulative family risk factor, which includes family income, the educational attainment of parents, family and neighborhood housing conditions, and the ability to speak and read English.”

June 28, 2023 at 12:57 pm

I don’t think it’s fair to harshly criticize Dana Goldstein, who has a long history of excellence. I’d also say that I don’t know enough about the differences in the LTT and the main NAEP, but I always try to put the results in context. The value of NAEP is that it doesn’t focus on the basic skills that states’ high-stakes tests focus on, or on inappropriate Common Core tests. And that brings me back to something I wish Goldstein had followed up on. Over the decades, which tests have maintained a better focus on more meaningful learning: NAEP or state tests measuring the worksheet-style instruction that was incentivized by corporate reforms?

democracy says:

June 29, 2023 at 4:39 am

John, I don’t think I “harshly” criticized Dana Goldstein.

But your comment begs the question:

If Goldstein has such a “long history of excellence,” then why hasn’t she taken the time to investigate and report on what NAEP really is, and isn’t, and WHY does she accentuate the “doom and gloom” approach to NAEP scores when that is completely unwarranted?

- John Thompson says:
  
  June 29, 2023 at 3:14 pm
  
  I wasn’t referring to you. I was referring to the statement, “Another education parrot, I mean reporter, who didn’t do their homework.” As to why writers don’t do a better job explaining NAEP, I don’t know and I’m frustrated. I’ve talked with several top local reporters and they have been more concerned about the misuse of the term “Proficient.” But because they need to report state and local results, which also use the term proficiency, they don’t know how to provide data in a way that doesn’t help the rightwingers and the corporate reformers. I think they’re doing better, but this is the result of decades of think tank propaganda which is entrenched.

June 28, 2023 at 3:59 pm

“Yes, Covid closures led to an unprecedented decline in test scores”

And we know that how?

What is the evidence?

First, correlation doesn’t imply causation, so it’s not even valid to claim from correlation alone that “some factor(s) related to the pandemic caused the score declines.”
That something is plausible does not make it so.

Second, even if some factor related to the pandemic actually caused the decline in scores, it does not necessarily follow that that factor had to be (or even include) “Covid closures” in particular.

The pandemic caused a lot of disruption to the lives of students and there are many factors that could be quite unrelated to “Covid closures” that might have resulted in a decline in the NAEP scores.

An obvious example of one such factor would be the “attitude” of the students taking the NAEP test toward the test. If a student has a “who really cares?” attitude toward the test because they have far more pressing things on their mind (like whether their mother in the hospital with Covid will be OK) it is not hard to see how this alone could have a significant impact on their performance on the test. I would guess that young children might be particularly susceptible to such a factor.

I’m not claiming the latter was the cause of the score declines, of course (and there might have been multiple factors, at any rate) but merely suggesting it as one plausible cause that does not necessarily have anything to do with “Covid closures”.

The main problem I see with all the discussion about the recent NAEP score declines is that it involves assumptions that simply might not be true.

SomeDAM Poet says:

June 28, 2023 at 4:05 pm

It’s important to note that at least some of the factors that might plausibly have resulted in score declines need not have anything to do with so called “learning loss”.

The child who is worried about her mother in the hospital at the time of the NAEP test administration and hence doesn’t do her best on the test is a perfect example of this.

- democracy says:
  
  June 29, 2023 at 4:35 am
  
  Exactly.
SomeDAM Poet says:

June 28, 2023 at 4:17 pm

For those who would say “But it was the average of thousands of student scores that declined”, I would just say that “yes, and a significant fraction of all the students who took the NAEP test may have had a family member who was sick with Covid at the time of the test, which could potentially have had a significant impact on the average scores.”

Unless one knows this was not the case, one can’t rule out the possibility.

It turns out that deciding which factor or factors are important from among many possibilities is not an easy task and requires sophisticated statistics (and people who know what they are doing).
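The point about averages can be illustrated with a minimal sketch. Every number below is hypothetical and chosen only to show the mechanism; none of it is NAEP data, and the function names are made up for this example. If a fraction of test-takers each score lower for a reason unrelated to what they learned, the overall mean drops by roughly that fraction times the per-student drop.

```python
# Illustrative sketch only: all inputs are hypothetical, not NAEP data.

def expected_mean_shift(fraction_affected, drop_per_student):
    """Drop in the overall mean if a fraction of students each lose a fixed number of points."""
    return fraction_affected * drop_per_student

def mean_with_affected_group(baseline, n_students, fraction_affected, drop_per_student):
    """Build a toy score list and average it, to confirm the formula above."""
    n_affected = round(n_students * fraction_affected)
    scores = [baseline - drop_per_student] * n_affected
    scores += [baseline] * (n_students - n_affected)
    return sum(scores) / len(scores)

# Hypothetical scenario: 20% of 10,000 students each score 10 points lower
# than they otherwise would, on a baseline of 260.
shift = expected_mean_shift(0.2, 10)
new_mean = mean_with_affected_group(260, 10_000, 0.2, 10)

print(f"expected drop in mean: {shift:.1f} points")
print(f"mean of toy scores: {new_mean:.1f}")
```

The sketch says nothing about which factor actually drove the decline. It only shows that a drop in the average is compatible with a subset of students being affected by something other than lost learning, which is why separating the candidate causes requires the detailed data and careful statistics mentioned above.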

- SomeDAM Poet says:
  
  June 28, 2023 at 4:35 pm
  
  And of course, one also has to have all the detailed data (e.g., on sicknesses within the families of students taking the test) on which to perform the factor analysis.
  
  Is the NAEP organization in possession of the necessary data to determine what might be behind the score declines?
  
  I seriously doubt it.
  
  So they and others are left with pure speculation about what is behind the score declines.
  
  Carry on.
- SomeDAM Poet says:
  
  June 28, 2023 at 4:39 pm
  
  But as I suggested above, trying to draw conclusions from short term changes in scores is a fool’s errand at any rate.
  
  It’s a complete waste of time.
bethree5 says:

June 29, 2023 at 6:33 pm

SDP– This bit of data floats through my mind as I read your posts: TX was an early re-opener, with 90% of schools fully in-person by early September 2020. CA lagged most states on reopening. Over 6 months later (by 3rd week of March 2021), 80% of their schools were not yet fully in-person. [Most got there between April and June of 2021].

Yet the two states’ 2022 NAEP results are in the same ball park as far as % difference pre- & post-covid, i.e., not terrible. CA is a few spots ahead of TX in that state ranking—but that could be explained by TX’s several-% higher child poverty [both states have high child poverty]. I’d call it a wash: early vs late reopening, at least with these 2 states, had no comparative effect on “covid learning loss.”

bethree5 says:

June 29, 2023 at 8:27 pm

Mr Thompson, I wonder if you’ve seen this study released by Education Next [the publishing arm of the very conservative Hoover Institution] last summer: https://www.educationnext.org/half-century-of-student-progress-nationwide-first-comprehensive-analysis-finds-gains-test-scores/ It is a much-needed counterpoint to the typical sky-is-falling media coverage of NAEP stats and other contorted, misrepresented ed achievement stats, particularly as regards assumptions about minority progress [which has outpaced that of whites]. It’s a comprehensive meta-analysis of millions of test results, 1971-2017 [kids born 1954-2007].

Most notable is progress in mathematics. When folks complain about our mediocre PISA math scores, I like to point out that on the 1st 5 intl tests in the ‘60s [all in math], US scored right at the bottom. But between 1971-2017, according to this study, we have added 4 yrs’ math content to the K12 ed required for high school graduation [and are now scoring “average” among OECD nations!].

Yet we only advanced 1 yr in reading achievement? The article includes an interesting analysis of why that makes sense scientifically.

Meanwhile—PISA-wise— we score top-tier in reading (#13), and good in Science (#18), despite that very middling math score.

June 30, 2023 at 11:26 pm

Interesting. I don’t have as much experience in other tests or how the data is combined. When I read it more carefully, I MIGHT have questions about Peterson’s outspoken conservative perspective. I also may have questions about the PISA exception. My question may come from the way growth in NAEP shows a pattern consistent with the period after 2001. Before, growth was increasing and continued as NCLB was implemented, but was the growth a continuance of pre-NCLB trends, and/or does it match other studies of test-driven reforms where the increase soon stops or reverts to earlier levels? I’m confident that most inner city teachers who have an opinion on it say that test increases don’t mean more real learning, so maybe that may help explain the PISA exception. (I have to admit, though, that the highest-poverty schools have great success in “jukin the stats,” learning tricks from Texas for fabricating gains on state tests.) There’s very little in the databases in the reports that is about what I was writing about. My point was that gains slowed and then peaked in 2013, then dropped or stagnated. There are limits to my knowledge of the nuance in those studies, and I don’t have much experience in elementary schools. But in my field, the highest-challenge secondary schools, I’ve talked with thousands of teachers. I don’t recall a single one who denied that the poorest children of color in neighborhood schools (that became more segregated in the 21st century) were damaged the most by high-stakes testing. My kids all complained they had been robbed of an education by test, test, test. They first echoed what I had been told by students who endured the testing under Reagan, but it was Race to the Top when the decline in learning really surged. So, I’ve always enjoyed talking with researchers who I disagreed with, but I was consistently appalled by their refusal to listen to teachers and students who disagreed with them.
