Gene V. Glass is one of our nation’s most distinguished education researchers.
This post is an important analysis of the failure of the U.S. Department of Education’s Institute of Education Sciences, which was reorganized during the George W. Bush administration.
As a new administration moves into the US Department of Education, the opportunity arises to review and assess the Department’s past practices. A recent publication goes to the heart of how US DOE has been attempting to influence public education. Unfortunately, in an effort to justify millions of dollars spent on research and development, bureaucrats pushed a favorite instructional program that teachers flatly rejected.
The Gold Standard
There is a widespread belief that the best way to improve education is to get practitioners to adopt practices that “scientific” methods have proven to be effective. These increasingly sophisticated methods are required by top research journals and by federal government improvement initiatives, such as the Investing in Innovation (i3) initiative, to fund further research or dissemination efforts. The US DOE established the What Works Clearinghouse (WWC) to identify the scientific gold standards and apply them to certify for practitioners which programs “work.” The Fed’s “gold standard” is the Randomized Comparative Trial (RCT). In addition, there have been periodic implementations of US DOE policies that require practitioners to use government funds only for practices that the US DOE has certified to be effective.
However, an important new article published by Education Policy Analysis Archives concludes that these gold-standard methods misrepresent the actual effectiveness of interventions and thereby mislead practitioners by advocating or requiring their use. The article is entitled “The Failure of the U.S. Education Research Establishment to Identify Effective Practices: Beware Effective Practices Policies.”
The Fool’s Gold
Earlier published work by the author, Professor Stanley Pogrow of San Francisco State University, found that the most research-validated program, Success for All, was not actually effective. Quite the contrary! Pogrow goes further and analyzes why these gold-standard methods cannot be relied on to guide educators to more effective practice.
Researchers have told us that we need “randomized comparative trials” to reach “research-based conclusions.”
In fact, say Glass and the article he cites, this is not what happens. And the results of these trials turn out to be easily manipulated and falsified.
He writes:
Key problems with the Randomized Comparative Trial include (1) the RCT almost never tells you how the experimental students actually performed, (2) that the difference between groups that researchers use to consider a program to be effective is typically so small that it is “difficult to detect” in the real world, and (3) statistically manipulating the data to the point that the numbers that are being compared are mathematical abstractions that have no real-world meaning, and then trying to make them intelligible with hypothetical extrapolations such as the difference favoring the experimental students is the equivalent of increasing results from the 50th to the 58th percentile, or an additional month of learning. The problem is that we do not know if the experimental students actually scored at the 58th or 28th percentile. So in the end, we end up not knowing how students in the intervention actually performed, and any benefits that are found are highly exaggerated.
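To make the “50th to the 58th percentile” extrapolation concrete, here is a minimal Python sketch of how such a figure is commonly derived from a standardized effect size. It is an illustration, not Pogrow's own calculation: the effect size of 0.2, the function names, and the assumption of normally distributed scores with equal spread in both groups are all hypothetical choices made for this example.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def percentile_equivalent(effect_size: float) -> float:
    """Percentile rank, WITHIN THE CONTROL GROUP, of the average treated student.

    Assumes normal scores with equal spread in both groups (an assumption
    of this sketch, not something the source states).
    """
    return 100 * STANDARD_NORMAL.cdf(effect_size)


def absolute_percentile(control_mean_z: float, effect_size: float) -> float:
    """Percentile of the average treated student against an outside norm.

    control_mean_z is where the control group's average sits on that norm,
    in standard-deviation units. Most RCT reports do not supply it.
    """
    return 100 * STANDARD_NORMAL.cdf(control_mean_z + effect_size)


d = 0.2  # hypothetical standardized effect size

# Relative to the control group, the gain reads as "50th to 58th percentile".
print(round(percentile_equivalent(d)))

# The same d for a sample whose control group sits well below the norm:
# the treated students are still far below average in absolute terms.
low_performing_control = STANDARD_NORMAL.inv_cdf(0.22)
print(round(absolute_percentile(low_performing_control, d)))
```

The percentile figure describes only the gap between the two groups. Without knowing where the control group itself stands, the same reported effect is compatible with treated students who score above average or well below it, which is the point the passage makes.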
The sad part of the story is that we now have a new administration that is both ignorant of research and indifferent to it. DeVos has seen the failure of school choice in her own state, which has plummeted in the national rankings since 2003, and it has had no impact on her ideology. Ideology is not subject to testing or research. It is a deep-seated belief system that cannot be dislodged by evidence.
Leaving aside all of the substantial problems with RCT as the preferred evidentiary standard in education, what is noteworthy is its hypocritical application. Under both Bush and Obama, educators were held to the RCT standard for research and program funding. However, few if any of the big programs that the Department of Education put in place could come close to that burden of proof. There is no evidence for test-based accountability, more “rigorous” standards, charter schools, or hiring and firing to improve education. By their accounting, progress was made when their favored policies were enacted, not when educational outcomes actually improved.
More alarming is that while prior administrations have given lip service to the idea of evidence, Trump has rejected the idea altogether. Under Trump, the evidence is in. Evidence is out. http://www.arthurcamins.com/?p=441.
Dump the DUMP.
I have read the blog post of Gene Glass, and the article in Education Policy Analysis Archives. I have received IES alerts and a lot of nonsense from the What Works Clearinghouse for a long time.
Some of the rationales for IES reviews of research were explicit: Researchers could submit their work for review. But IES could also initiate a review if a study had received a lot of publicity.
I suppose that the IES was envisioned as a more authoritative, high profile, and accessible “peer review process” than could be had through the traditional scheme of publishing in peer reviewed journals.
The IES demonstrated that the wars over “what counts” as valid and useful research in education were won by the hard-nosed folks who wanted the vintage standards for experimental research to count as “educational science.” The “gold standard” of the Randomized Comparative Trial assumed the study of education could be conducted in the manner of research in sterile laboratories, under highly controlled conditions, with informed consent of the participants, and the rest.
The IES has also been pre-occupied with definitions of “what works” and “best practices” based on a limited vision of educational success and “impacts” — mainly test scores in math and ELA at the center of everything.
Unless I am mistaken, the IES is a legacy from the No Child Left Behind Act. NCLB was “based on four basic principles: stronger accountability for results, increased flexibility and local control, expanded options for parents, and an emphasis on methods that have been proven to work.”
NCLB actually worked in tandem with less publicized legislation of the same era, the Education Sciences Reform Act of 2002 (ESRA). The two laws were closely related.
In funding educational research, ESRA said that federal officials were required to seek scientific proofs of effective, low-cost, user-friendly, and replicable “best practices” in education. The best practices, identified by ESRA’s criteria, were supposed to be used for school improvements undertaken with funds from NCLB.
If NCLB stressed back-to-basics with a vengeance under the guise of excellence, ESRA offered an image of scientific research as “secular, neutral, and non-ideological.” That language was actually in ESRA. Of course that definition was at odds with the uses of research in human affairs and particularly at odds with No Child Left Behind.
IES is unlikely to survive the era of Trump and DeVos. The program supporting regional R&D labs (set up in the 1960s) is also likely to be killed.
Several Republicans from Kentucky have determined that the Department of Education can be abolished within two years by putting student loans into the Department of Treasury, and all IDEA issues in Health and Human Services.
All of this falls under the category of mathturbation.
VAM is the poster child.
The blame lies as much with the fake scientists (Hanushek, Chetty and others) who push this stuff as it does with the policymakers.
To help understand that mathturbation, or as I call it, mental masturbation, one need only understand what is supposedly being measured in the standards and testing regime: nothing, absolutely nothing. My response from sometime along the way to another posting somewhere:
The most misleading concept/term in education is “measuring student achievement” or “measuring student learning”. The concept has been misleading educators into deluding themselves that the teaching and learning process can be analyzed/assessed using “scientific” methods which are actually pseudo-scientific at best and at worst a complete bastardization of rationo-logical thinking and language usage.
There never has been and never will be any “measuring” of the teaching and learning process and what each individual student learns in their schooling. There is and always has been assessing, evaluating, judging of what students learn but never a true “measuring” of it.
But, but, but, you’re trying to tell me that the supposedly august and venerable APA, AERA and/or the NCME have been wrong for more than the last 50 years, disseminating falsehoods and chimeras??
Who are you to question the authorities in testing???
Yes, they have been wrong, and I (and many others: Wilson, Hoffman, etc.) question those authorities and challenge them (or any of you other advocates of the malpractices that are standards and testing) to answer the following onto-epistemological analysis:
The TESTS MEASURE NOTHING, quite literally when you realize what is actually happening with them. Richard Phelps, a staunch standardized test proponent (he has written at least two books defending the standardized testing malpractices) in the introduction to “Correcting Fallacies About Educational and Psychological Testing” unwittingly lets the cat out of the bag with this statement:
“Physical tests, such as those conducted by engineers, can be standardized, of course [why of course of course], but in this volume, we focus on the measurement of latent (i.e., nonobservable) mental, and not physical, traits.” [my addition]
Notice how he is trying to assert by proximity that educational standardized testing and the testing done by engineers are basically the same, in other words a “truly scientific endeavor”. The “same by proximity” is not a good rhetorical/debating technique.
Since there is no agreement on a standard unit of learning, there is no exemplar of that standard unit and there is no measuring device calibrated against said non-existent standard unit, how is it possible to “measure the nonobservable”?
THE TESTS MEASURE NOTHING for how is it possible to “measure” the nonobservable with a non-existing measuring device that is not calibrated against a non-existing standard unit of learning?????
PURE LOGICAL INSANITY! (or mental masturbation/mathturbation).
Once I consulted the What Works Clearinghouse to find a better alternative to the crappy character education curriculum our school was using. To my shock, it rated the one we were using as the best. To me, the WWC has zero credibility.
Has anybody else choked on the expression “teach with fidelity”? So a company, interest group,… conducts an RCT study for one of its pet products, systems, instructional methods,… and pronounces results/statistical analysis that shows positive results. These results filter down into the marketing material which admonishes all good administrators to make sure that their teachers use the materials “with fidelity” to ensure the promised gains. The program is so rigidly controlled that any deviation is met with expressions of horror by all those tasked with making sure teachers follow the protocol. Naturally all those little wooden children must be taught with uniform procedures that will routinely spit out little scholars. Does any of this sound like what most of us were probably taught about the art (and science) of teaching, the development of children, or the nature of learning?
As used by marketeers and adminimals “teach with fidelity” actually means “Do it as we say or else!”
Now to be a true teacher one must teach with a “fidelity to truth” attitude in all aspects of the teaching and learning process. Being “true to truth” and not allowing falsehoods, e.g., standards and standardized testing for just one example, to influence the process must be the ONLY attitude, frame of mind and/or philosophy (heaven forbid American teachers might think and act in a philosophical fashion) that should be paramount.
FRAUD IN FRACTALS: So this is/was an agency funded by our tax dollars that purported to validate third-party education programs using statistical analysis even though we could all see they provided no valuable data? I can just picture someone in the Bush administration saying “we need someone with big fancy words to certify that this horsesh:t works”.
This reminds me of the bond rating agencies that were coerced to prop up mortgage backed securities containing time bomb mortgages, where they knew or should have known they would fail, but they kept everyone fooled long enough for the bandits to make off.
It’s ironic that, now that ESSA funding has to go to “evidence-based” programs, the USDOE is run by someone with no qualifications. But this may be an area where the DOE needs a healthy downsizing. Teachers have always wondered why the government ever bothered to try to formularize or mass-verticalize teaching practice in the first place when you never know what the next batch of kids coming through the door is going to be like.
ART VS. SCIENCE: Because the corporate reformers who research teaching methods can never seem to capture successful teaching and put it in a bottle to replicate and sell, they instead came up with a protocol to game the ratings, a Success Certification Analysis Metric, otherwise known as “SCAM”.
This commonly develops when one useless oversized federal agency relies on a second one to convince people the first one is worthwhile, and it just grows fractally from there, diverting education dollars out of the classroom and into the hands of people spouting mumbo jumbo in conferences as teachers teach real kids in classrooms.
“. . . that this horsesh:t works”.
Getting to be that time of year to make horseshit work. . .
. . . in the garden that is!!
Test scores = fool’s gold for lazy ed thinkers (or simply non-ed-thinkers).
YEP!
This constant tinkering always eventually fails. We know what works, and that was established in the early 20th century: dialogue between teacher and student, with higher-order questioning in order to achieve understanding; individuals, not class lessons and multiple-guess testing. That is what the upper social classes receive in their schools. Please see my web site, Red Queen (http://redqueen.me).