The “Mississippi Miracle” seems to be too good to be true. The scores of Mississippi fourth-graders have risen sharply on NAEP (the National Assessment of Educational Progress). Supporters of the Miracle attribute the dramatic increase to the state’s adoption of the “science of reading” curriculum, teacher training in the “science of reading,” and holding back third-grade students who aren’t reading well enough.
This formula is especially appealing to Republicans because nothing need be done to reduce the children’s poverty or improve their living conditions. Conservative states have hailed the “Mississippi Miracle” because it relieves them of any responsibility to create jobs or change the conditions in which the poor live. It’s a low-cost cure: Just raise reading scores and prosperity will follow.
The story of the Mississippi Miracle also appeals to blue states because they are convinced there is a quick and easy way to end the perennial “reading crisis.” So they too have passed legislation to require all reading teachers to adopt the “science of reading.”
Critics of the “Miracle” say that the practice of holding back low-performing third-graders artificially inflates the fourth grade scores. They also point to eighth grade scores to say that there was no miracle. Eighth grade scores are more important than fourth grade scores because they show longer-term effects of reading instruction.
Paul Thomas is a critic of the “Miracle.”
He begins a recent post with a quote from scholar Bruce Baker:
On NAEP Grade 8 Scores: “a better indicator of the cumulative effects of a system on student learning than 4th grade assessments.” Bruce Baker, February 11, 2026
He writes:
The media, political leaders, and education reformers are making a mistake about reading reform well explained in the parable of the blind men and the elephant.
In this case, many are rushing to make over-stated claims about reading reform in Mississippi by hyper-focusing on limited and distorted data—grade 4 NAEP scores on reading.
First, research details that states implementing reading reform have achieved some short-term test score increases in grade 4; however, those gains disappear by grade 8. And more damning, the determining factor in successful reform is exclusively grade retention policies (not teacher training, reading programs, direct instruction, etc.).
Next, grade retention in Mississippi has been analyzed revealing that retention distorts those scores, resulting in a statistical manipulation of the data and not higher student achievement. In short:

Yet, a new story has emerged claiming that Black students in MS are outperforming Black students in other states, notably California:

This sort of state comparison is grounded in political/ideological bickering that is challenged when grade 8 NAEP reading scores are analyzed instead of grade 4:

Suggesting that Black students are being better served in MS than CA is at least misleading. In fact, Black students in CA, GA, LA, MA, and notably the Department of Defense (DoDEA) schools outperform Black students in MS at grade 8.
Key here is that grade 8 NAEP scores are better data because of the distorting impact of grade retention (usually grade 3) on grade 4 data.
But an even better story is that student achievement among Black and Hispanic students is very complicated, especially when you consider that states have dramatically different percentages of these populations of students.
Further, if we return to the parable from the opening, even better data at grade 8 is not the full picture.
In MS and throughout the US, Black students are still suffering the consequences of the persistent race gap in achievement (most states have the same gap as 1998, including MS).
And Black Americans remain trapped in the burden of racial inequity both in schools and in their communities.
The misleading stories about MS using grade 4 NAEP data are designed to promote a “beating the odds” story—one that isn’t true—but all students in the US would be better served if we chose not to seek those who beat the odds, but to change the odds so that no one—especially children—would have to overcome those inequities in the first place.

It is easier to game the system in 4th grade test scores than 8th grade, particularly when retention is part of the strategy. As students move into middle school, students are no longer learning to read. They are reading to learn, and comprehension is the priority. Reading comprehension is far more complex than phonics. It depends on vocabulary, prior knowledge and higher order thinking, all of which are a greater challenge for disadvantaged students. The 8th grade scores of a school district more likely reflect students’ poverty and lack of opportunity than 4th grade scores because the cognitive demand of the test is much greater on middle school standardized tests.
Deficiencies in Paul Thomas’ analysis were discussed in the comments here: https://dianeravitch.net/2026/01/29/paul-thomas-why-you-should-never-believe-in-education-miracles/ For example, its absence of evidence that cumulative MS grade retention had actually significantly increased as of 4th grade along with the MS reading improvements, and the reality that Mississippi has been rated 4th in the nation on demographically adjusted 8th grade reading (far above CA) while 1st in 8th grade math, and the fact that Jiee Jong’s research focused on retention without accompanying intensive remediation efforts, while citing research showing that the combination could produce positive results…
Stephen Ronan on March 16, 2026 at 11:37 am is right on target. There are many issues with Thomas’ analysis.
In addition to those Ronan cites, Thomas consistently ignores the statistical sampling errors in all NAEP scores. Once those errors are considered, the assertion that Black students in a number of states outscored those in Mississippi for NAEP Grade 8 Reading in 2024 is simply not true. Among the states (forget DoDEA schools, they are nothing like the states), only the Black students in CO and MA statistically significantly outscored those in MS in 2024. Back in 2013, the year the MS literacy legislation began, Black students in 27 other states outscored those in MS. Not a miracle, but certainly very noteworthy and quite counter to Thomas’ claims.
As a note, white Grade 8 students in MS also made progress. In 2013, they were statistically significantly outscored by whites in 43 other states. In 2024 only the whites in 7 states can make the same claim.
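To make the sampling-error point concrete, here is a minimal sketch of how a gap between two states' average scores can be checked against its standard error. The scores and standard errors below are hypothetical, chosen only for illustration; they are not NAEP results, and NAEP's own significance procedure may differ from this simple two-sample z-test.

```python
import math

def difference_is_significant(mean_a, se_a, mean_b, se_b, z_crit=1.96):
    """Compare two independent sample means using their standard errors.

    Returns (difference, z statistic, True if |z| exceeds z_crit).
    z_crit=1.96 corresponds to a two-sided test at the 5% level.
    """
    diff = mean_a - mean_b
    # Standard error of a difference between independent estimates
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    return diff, z, abs(z) > z_crit

# Hypothetical example: a 5-point gap with standard errors of 3.0 and 2.5
diff_small, z_small_gap, sig_small = difference_is_significant(250.0, 3.0, 245.0, 2.5)
# z is about 1.28, so this 5-point gap is not statistically significant

# Hypothetical example: a 10-point gap with the same standard errors
diff_large, z_large_gap, sig_large = difference_is_significant(255.0, 3.0, 245.0, 2.5)
# z is about 2.56, so this 10-point gap is statistically significant
```

In this sketch, one state "outscoring" another by five points does not clear the threshold once sampling error is taken into account, while a ten-point gap does.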
That Royal Statistical Society paper, “On education miracles in general (and those in Mississippi in particular),” has some issues of its own. In fact, the authors have already posted one correction, but more are needed. This paper shares a misconception found in a number of other papers claiming retention numbers didn’t start to rise in Mississippi until after the 2013 legislation was enacted. Actually, age data captured in all NAEP Grade 4 Reading Assessments dating back to 1992 show that, just as Ronan mentions, retention in Grades K to 3 in the Magnolia State was always high. Very consistent percentages of Mississippi’s tested samples were above the modal age for Grade 4 (9 years old). This didn’t just start with the 2013 legislation. Once the rather consistent percentages in overage test takers are considered, if retention was the key to Mississippi’s improvement after 2013, that improvement actually should have started way back at the beginning of State NAEP. But that’s not how things actually happened.
One of Ronan’s comments needs repeating and amplification. He says, “Jiee Jong’s research focused on retention without accompanying intensive remediation efforts, while citing research showing that the combination could produce positive results…”.
This touches on a problem I suspect is found in many, if not all, papers about the supposed bad impacts of retention. I asked a Southern California-based scholar about this a little while back and he indicated that he was not aware of any papers that had paid attention to the quality of remedial programs given to the retained students. Certainly, if the students got either more of the same approaches that already had failed them or something different that really lacked effectiveness, the findings of problems for those students later on (absenteeism, dropping out, etc.) would not be a surprise. But this would not provide any indication that retention coupled with effective remediation would also be a failure.
Certainly, research on retentions in Texas just after the turn of the century, when impacts from the National Reading Panel Report (NRPP) had yet to show in classrooms, would not provide very useful insight about the programs in Mississippi. As Dr. Carey Wright told Chalkbeat, those post 2013 programs in MS featured special small group instruction and were certainly built around what had been learned since the NRPP came out.
Unfortunately, I have not been able to post here some supporting graphics I have created with NAEP data, but I have a number posted in X (the former Twitter). Here are just a few links:
https://x.com/Innes434/status/2032976401616318936
https://x.com/Innes434/status/2032977583667658992
https://x.com/Innes434/status/1885134907157995888
Please forgive me for asking, but why are you so focused on statistical significance instead of focusing on effect size?
Too much emphasis tends to be given to statistical significance when it doesn’t really tell us if our findings are meaningful. This problem is exacerbated by many introductory textbooks that fail to correctly explain statistical significance.
The ASA’s Statement on Statistical Significance and P-Values (see p. 131 after the editorial) is a reminder of the principles we should consider.
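The distinction between statistical significance and effect size can be sketched in a few lines. The numbers below are hypothetical, not NAEP data, and the sketch assumes simple random samples of equal size in each group, which is simpler than NAEP's actual sampling design. It shows the same one-point gap, with the same small effect size, failing a significance test at a modest sample size and passing it at a very large one.

```python
import math

def cohens_d(mean_a, mean_b, sd_a, sd_b):
    """Standardized mean difference, assuming equal group sizes."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

def z_statistic(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """z statistic for the difference between two independent means."""
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff

# Hypothetical: a 1-point gap on a scale with a standard deviation of 35
d = cohens_d(251.0, 250.0, 35.0, 35.0)  # about 0.03, a very small effect

# The effect size is the same in both cases; only the sample size changes
z_small = z_statistic(251.0, 250.0, 35.0, 35.0, 500, 500)      # about 0.45
z_large = z_statistic(251.0, 250.0, 35.0, 35.0, 50000, 50000)  # about 4.52
```

Under these assumptions, the one-point gap is "significant" only in the larger sample, yet it is equally small in practical terms in both.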
I promise I’m not trying to dump on your work, and I think these are important considerations that would add to it.
The links on X lead me to Richard Innes. Innes is associated with the Bluegrass Institute in Kentucky. Bluegrass is a rightwing think tank, one of many that have been planted in almost every state to advocate for free enterprise and school choice.
As I said in the post, rightwingers love the “Mississippi Miracle” because it claims that a few policy changes (esp the “science of reading”) will lift reading scores, esp those of black students. No need to undertake expensive interventions that would improve incomes, housing, healthcare, eg, the quality of life.
See, Bruce Baker, https://schoolfinance101.com/2026/02/11/on-miracles-in-mississippi/
When initially passed by the Mississippi legislature, the Literacy-Based Promotion Act (LBPA) enjoyed broad bipartisan support, as has also been the case with attempts to emulate its successes elsewhere. I hope you will remain open to the possibility, Diane, that it has indeed been a successful effort, one worthy of replication.
Initial attempts to debunk it have been proven erroneous or retracted.
I am not aware of any well-researched adverse opinions of the program and its results at this time, but would be curious to know of any critique you still find persuasive.
I would understand, and likely concur with, your skepticism if it were the case that major improvement in reading skills were a substitute for deeper, broader reforms, but have seen no evidence yet of that being the case.
Hi, Stephen,
Paul Thomas did a good job rounding up the evidence about the “science of reading.”
I will probably post this.
https://open.substack.com/pub/paulthomas701128/p/research-contradicts-science-of-reading?r=rls8&utm_medium=ios
I am neither for nor against the science of reading. I am against miracle tales. There are no panaceas. I also oppose legislative mandates about how to teach. Mandating whole language or balanced literacy or the science of reading are all terrible. Not all children learn to read the same way.
Diane
It is an interesting list of research articles, worth sharing, but without celebrating it as effectively supporting his critiques of the MS model.
Some of the summaries certainly sound reasonable, e.g.:
“Special needs students and multilingual learners are not monolithic populations of students; they need a wide variety of instruction approaches to address their reading needs. (See Special Issues in RRQ; Seidenberg, 2026)”
I wonder whether Thomas realizes that the text of that includes, “once this became broadly known, states passed “science of reading” legislation that allowed the adoption of research-based practices; the admirable Mississippi reading project shows that this approach is effective.(1,2)”
with its Footnote 1:
“1. The Mississippi project was a remarkable effort that seems to have yielded real, lasting gains in literacy. It was a multi-pronged approach involving cooperation among numerous stakeholders as well as a significant infusion of funding from private philanthropy. We don’t know which aspects of this effort were crucial; I think it very likely that having a cadre of well-trained coaches advising teachers on site, in the schools, was a huge factor. Whether other states are providing this level of teacher support is unclear. In any case, the successes in Mississippi do not confer validity on Structured Literacy, which was not codified until well after that effort was underway.”
Others of Thomas’ examples are misleading, like his continued reliance on studies that show harmful results of grade level retention in the absence of intensive accompanying remediation, while ignoring findings of positive effects when strong remediation accompanies both the risk of retention and its occurrence. Far more relevant to Mississippi than the largely irrelevant Zhong study he keeps citing is this: https://wheelockpolicycenter.org/wp-content/uploads/2023/02/WEPC-MS-Retention-Policy-Brief-02-03-2023.pdf
Stephen,
I will continue to criticize any instruction model that is declared a “miracle” and mandated by state legislatures.
I will applaud any state intervention and mandate that feeds children and provides regular health care.
Diane