Walter Stroup is chair of the department of STEM education and teacher development and an associate professor at the University of Massachusetts Dartmouth. In 2014, as a professor at the University of Texas, he publicly testified that the state was wasting hundreds of millions of dollars on standardized testing because the only thing that was measured was skill at passing standardized tests. This was hugely embarrassing to Pearson, which had a $500 million contract with the state of Texas. Recently Professor Stroup sent a letter to the Houston Chronicle, supporting its editorial calling for a pause in standardized testing for 2020-21.
I asked if I could post his response here.
He wrote:
[Response to July 22, 2020 “Editorial: What Gov. Abbott should do about STAAR testing this year for Texas schools.”]
As researchers and longtime education advocates, we support the conclusions of the July 22, 2020 “Editorial: What Gov. Abbott should do about STAAR testing this year for Texas schools.” Before our school system can run as normal, it will need to learn to walk again. And we shouldn’t keep objects in its way that may make it stumble.
We agree that state-mandated standardized exams should be the “last thing” students and teachers need to worry about. But that’s not enough. To support our schools and teachers, the next question has to be: if not STAAR, then what?
There is indeed a substantial body of research showing that current tests are “invalid indicators of student progress and ineffective in closing the so-called educational achievement gap.” We also agree with Commissioner Morath that we need shared measures of student progress if we are all to be held accountable for the educational outcomes in our schools.
To start our thinking about what might come next, we should ask whether STAAR tests are useful to teachers – the first responders of our school system. For that matter, are the products from one of the largest non-high-stakes test vendors in Texas, Northwest Evaluation Association (NWEA), useful to teachers?
We believe the answer is a resounding no.
Although well intended, these tests measure the wrong kind of growth. Not only does this make them the wrong kind of tool to evaluate student achievement and institutional quality, it also means the tests themselves have become an instrument in preserving inequities in students’ educational outcomes.
When it comes to test development and scoring, two kinds of growth can be assessed.
“Growth” can be evaluated relative to achievement – how much students have learned. Or “growth” can be evaluated on a scale similar to measurements of height. Just as children get taller with age, they also get generally better at certain kinds of problem-solving tasks.
It makes a world of difference which kind we use if we want to help schools recover.
The first kind of growth – in achievement – is the only kind for which schools can, and should, be held accountable. We send children to school because we know that’s where we learned to read, write and do mathematics and we want the same for our children. Tests, to be useful in improving student outcomes, must be highly sensitive to differences in what schools do – sensitive to good teaching.
Unfortunately, current test development methodologies give us tests that behave, in almost every significant sense, like measures of biological growth, not measures of achievement.
If we buy a thermometer to measure temperature, put it in a pot of hot water, and the numbers barely change, that’s a problem. If we buy a box of these thermometers that all do the same thing, then that makes it a bigger problem. Our current box of tests has been shown to have very little sensitivity to temperature change — to differences in the quality of instruction.
When it comes to the issue of what kind of growth is being assessed by current tests, the evidence is equally clear. The grade-related growth curves the test vendor NWEA shares on its web site are remarkably similar to curves pediatricians use to chart children’s height.
Age-related or grade-related mental growth metrics can’t be used to improve educational outcomes – they simply aren’t meant to help us become mentally “taller.” Compounding the problem, they have a long history of lending support to oppressive ideologies and practices. In effect, tests fully intended to help address structural inequalities in our educational system end up having the opposite effect: keeping groups of students in the same relative position year-after-year, and across subject areas.
What are the alternatives?
Here are just some of the possibilities. Pattern-based items (PBIs) provide up to eight times more achievement-specific information per question than current items and have been deployed at scale across Texas. Performance-based assessments are being used in New Hampshire. “Badges” are being used in a number of industries as part of digital credentialing programs. Portfolio-based assessment has a long history of use in a wide array of educational settings.
The last time our legislators gathered in Austin, they passed a bill, HB-3906, directing the Texas Education Agency to “establish a pilot program” in which participating school districts would “administer to students integrated formative assessment instruments for subjects or courses for a grade level subject to assessment.” Now is the time to pilot alternative assessments that will help schools and teachers do what they do best – educate our children.
Walter Stroup has his home in Austin, Texas and is chair of the department of STEM education and teacher development and an associate professor at the University of Massachusetts Dartmouth.
Anthony Petrosino is associate dean for research and outreach in Southern Methodist University’s Simmons School.
Link to Editorial we were responding to:
Related links (links are also in the text above):
What was published in the Houston Chronicle
An Op-Ed in Dallas Morning News discussing research on current tests
CDC growth curves used by pediatricians
Thank you, Professor Stroup! The current standardized tests are a costly scam. Here are some of the reasons why: https://bobshepherdonline.wordpress.com/2020/03/19/why-we-need-to-end-high-stakes-standardized-testing-now/
What’s the alternative? Not some new variety of universal testing but, rather, teacher-made tests. A radical proposal? No. We used teacher-made tests almost exclusively in K-12 for most of the history of public schooling in the U.S., and that public schooling made us the most prosperous and powerful country in the history of the world.
Diane Below, I’m posting a note to Chiara from a different blog heading: about the conflict between reform movements and public institutions.
It’s important here because it shares the same philosophical roots as the STEM movements that, from my reading of THIS blog note, are struggling to broaden the parameters of their own now-formalized philosophical roots . . . STEM . . . by omission. The shared roots are various expressions of mechanist determinism and, personally, of naive realism and empiricism. <–not for our discussion here, but nevertheless factual.
Concretely in education, however, it means, for over 50 years, the slow but consistent diminishing of formal studies that, if present and consistent, tend to stoke creativity and to humanize us and our students; and when omitted, to leave students to rely on family and the surrounding culture (such as these are–and good sometimes, but not exactly systematic) for their history, psychological, social, etc., aka human development. In any case, even if good, the sources are not supported by the educational establishment and so too-easily become outliers in students’ minds.
The democratic experiment goes into failure mode when those who are involved in it, fail to understand it. It’s as simple as that.
Here is my previous note:
Chiara “Why don’t public school students deserve advocates? Ed reform advocates for charter and voucher students. Why can’t our kids have the same thing?”
MY RESPONSE: Reformers would counter that public schools have the advantage of NOT COMPETING in a marketplace situation (duh). From THAT distorted view, public schools as a PUBLIC SERVICE (like the USPO) are not “playing fair.” <–again, this speaks of a massive misunderstanding of the import and meaning of the role of PUBLIC INSTITUTIONS in a democracy.
Also, Diane’s question about “inadvertence or ignorance” speaks to an educational background that, for whatever cause, didn’t inform them of their own political roots (democracy) or how those roots differ from other political roots . . . and how a democracy is NOT an equivalent of market-based capitalism. As I see it, too many people in our culture really don’t understand that difference.
The moral aspect comes into play when those folks consciously embrace class and other kinds of bias as a “reason” for their obvious arrogance.
But the whole movement of political ignorance is rooted in our 50-year movement to emphasize tech/math/science (STEM) and to DE-emphasize history, political and social sciences, the arts and humanities. In my view, we are paying for that huge oversight as we speak.
Some of these people (like in my own family) cannot understand the distinction between issues of health (mask-wearing) and issues of political manipulation. That’s someone who hasn’t thought one bit about their freedoms or the political ground that they walk on. CBK
The complexity of trying to be fair in our evaluation of students’ work should give us pause. As Duane is always pointing out, these numbers become self-fulfilling prophecies. As I near retirement, I get less sure of my ability to do more than make a passing guess as to the performance of a student.
Given my own growing insecurities regarding my own evaluations, I perceive any suggestions that may be remotely generated, testing wise or otherwise, to be evidence of a hubris undeserving of any man.
First, let us agree to do no harm.
Roy A cautionary tale, for sure. But I also think teachers need to de-center our own influence and realize that students are, at once, and to some degree, autonomous thinkers; and second, students have many other sources of influence, and long before they entered the classroom and came under teacher-influence.
We cannot let fear be our guide, but rather our door to reflection, because what we do or don’t do WILL have its influence. Remember the story of the two children who each were told they were to receive a donkey. The takeaway for me: know our students well, and aim at their well-being, whatever form that might take. CBK
All you can assume, Roy, is that your students do or don’t understand the material as measured by your test of that material. I vividly remember waiting to see a teacher’s first test to get a feel for how they evaluated performance. Then I could focus my energies toward certain types of response as I got to know that teacher. It was a lucky student who had a teacher who spent more than a few minutes getting to know a student. Just like us, their understanding of us was based on what only scratched the surface–our behavior that they could see. Years ago, one of my sons had an English teacher who required students to come to him personally to get their grades on papers and tests. Initially I found the process annoying and time consuming. The kids already had no free time other than lunch. As time went on, though, I began to understand that he used that time to get to know his students and really talk to them about their work. My son learned a lot through those sessions, and no kid could ever glance at the grade and dismiss an assignment.
speduktr Another tacit endorsement for smaller classes . . . but when Devos took over, public schools lost whatever political pull they had before, such as it was. CBK
I don’t know how he did it, but relative to some people’s classes, his could be considered small, probably around 25 with, I think, 4 classes. I believe it was freshman English in this case. He taught other grades/courses as well.
speduktr An embarrassment of riches . . . teachers. CBK
Beautifully said, Roy. This sort of humility about evaluation is necessary.
Check out our blog for more information on a real alternative to current standardized testing models: http://www.genedcorp.com/blog/