Since the passage of No Child Left Behind, test scores have been defined by federal law as the goal of education. Schools and teachers that “produce” higher scores are good; schools and teachers that don’t are “bad” and likely to suffer termination. The assumption is that higher test scores produce better life outcomes, and that is that.
In late 2016, Jay P. Greene produced a short and brilliant paper that challenged that assumption. I have fallen into the habit of asking myself whether the young people who are superstars in many non-academic fields had high scores, and guessing they did not. Fortunately, it is only in schools that students get branded with numbers, like Jean Valjean of “Les Misérables.” Outside school, they can dazzle the world as athletes, musicians, inventors, or mechanics, without a brand.
Greene writes:
“If increasing test scores is a good indicator of improving later life outcomes, we should see roughly the same direction and magnitude in changes of scores and later outcomes in most rigorously identified studies. We do not. I’m not saying we never see a connection between changing test scores and changing later life outcomes (e.g. Chetty, et al); I’m just saying that we do not regularly see that relationship. For an indicator to be reliable, it should yield accurate predictions nearly all, or at least most, of the time.
“To illustrate the un-reliability of test score changes, I’m going to focus on rigorously identified research on school choice programs where we have later life outcomes. We could find plenty of examples of disconnect from other policy interventions, such as pre-school programs, but I am focusing on school choice because I know this literature best. The fact that we can find a disconnect between test score changes and later life outcomes in any literature, let alone in several, should undermine our confidence in test scores as a reliable indicator.
“I should also emphasize that by looking at rigorous research I am rigging things in favor of test scores. If we explored the most common use of test scores — examining the level of proficiency — there are no credible researchers who believe that is a reliable indicator of school or program quality. Even measures of growth in test scores or VAM are not rigorously identified indicators of school or program quality as they do not reveal what the growth would have been in the absence of that school or program. So, I think almost every credible researcher would agree that the vast majority of ways in which test scores are used by policymakers, regulators, portfolio managers, foundation officials, and other policy elites cannot be reliable indicators of the ability of schools or programs to improve later life outcomes.”
I would add that Chetty et al. did not establish a causal relationship between teacher VAM and later life outcomes, only a correlation. The claim that my fourth-grade teacher “caused” me not to become pregnant a decade later strains credulity. At least mine.
Greene’s essay includes an excellent reading list of studies showing high test scores but no change in high school graduation rate or college attendance.
The Milwaukee and D.C. voucher studies that show a gain in high school graduation rate should note the high attrition rate from these programs, which inflates the graduation rate.
Imagine saying to a governor, I have a policy intervention that will raise test scores but will have little or no effect on life outcomes. Would they jump at the offer? Based on the political activity of the past 15 years, the answer is yes.
Overall, however, a seminal essay from a prominent pro-choice scholar.

Jay P. Greene wrote this? That Jay P. Greene? The rephormster’s rephormer? I’m confused. What’s his angle?
Yes. THAT Jay Greene.
Actually, he’s a nice enough chap. Met him at the “Failures to Fixes” conference put on by the libertarian Show Me Institute in which he was one of the moderators. The conference participants were lamenting that their edudeforms weren’t working, mainly due to implementation problems and now they needed to “fix the failures”.
We briefly spoke and he seemed open to my way of thinking. He was one of two people who actually had K-12 teaching experience, the other being the KC Public School District superintendent (who had to be a Broadie or from a similar background). Greene had 4 years and the supe 3 1/2 years (basically nowhere near enough to become an administrator, but somehow he did). Again, a nice enough fellow who is misguided in his thinking on education policy. Perhaps he is changing. One can only hope so.
Should be on everyone’s reading list: Cathy O’Neil, Weapons of MATH Destruction.
The idea that schools needed corporate management included tests as a measure of “productivity,” and the notion that “underperforming” schools, like underperforming stock in a portfolio, should be ditched. NCLB and ESSA perpetuate this nonsense.
Meh. The tests are just the killing squads that eliminate those who were not blown up by horrendous math textbooks. I don’t know of a single math program in high school that is coherent, clear and concise. Some middle schools do not use textbooks at all, which may be better than using crappy books.
YES.
I’m not familiar with Greene so I’m not applying this to him, but there’s a consistent pattern among Ohio ed reformers where they change the measure when the data doesn’t support the ideology.
Public schools in Ohio complained for 20 years that the ed reform measures were too narrow, too reductive, over-determined to the point that they were misleading.
They were not just ignored, they were scolded- the claim was they were avoiding accountability.
Then, lo and behold, when Ohio charters were turning in the same test scores as Ohio publics, ed reformers discovered a whole new appreciation for nuance and began to clamor for “growth scores” or insist that it didn’t matter how the school performed, as long as it was a “choice” school.
You see it with vouchers too. Vouchers aren’t living up to the hype so all of a sudden ed reformers changed the goal- now it’s not BETTER schools, it’s CHOSEN schools.
It’s kind of amusing. They now agree with you-all 🙂
Generalizations about test scores do not work for all populations. In my own work with ELLs I have seen that decisions based on scores, particularly when scores are used for selective purposes, shortchange students who are culturally and/or linguistically different. When scores are used to sort students, poor and language-minority students are underrepresented in gifted and talented and other more rigorous options. These same students tend to be overrepresented in special education and other lower academic programs. As a result, my district rarely classified ELLs unless they had been in this country at least three years. In the high school it also ran a summer program to prepare minority students to handle the demands of more challenging academic programming.
Correlation is not causation, and many of the low scores attributed to ELLs and many other poor students are often the result not of a lack of intelligence, but of a lack of exposure and opportunity.
DeVos’s defense of the lousy outcomes of ed reforms in Michigan is more ed reform.
I am hoping people in Michigan hire a new state government. They better do something. The echo chamber is just doubling down.
It’ll be too late to go back when they privatize all the schools in that state. She has no earthly idea how this experiment she’s conducting will turn out and all available evidence is it’s not going well, but it’s full speed ahead!
She’s an absolute ideologue. She’ll destroy that whole public education system unless they hire a dissenter, and quick.
I don’t believe DeVos cares about public education or scores at all. All the evidence shows that vouchers produce worse results. DeVos couldn’t care less. She simply wants to get students out of the government schools and into religious ones using public money, and the sham of a government keeps accommodating these ridiculous ideas, including circumventing state charters and laws when they can get away with it.