As we saw in the previous post, Tom Sgouros explained in detail why it was wrong for Rhode Island to use the NECAP as a graduation requirement. It was not designed for that purpose, and many students will fail who should have passed.
State Commissioner of Education Deborah Gist said Sgouros was wrong because he is not a psychometrician. She did not explain why he was wrong, nor did she understand that psychometricians would likely agree with Sgouros. The cardinal rule of testing is that tests should be used only for the purpose for which they were designed.
Here is Sgouros’ account (if I hear from Gist, I will print hers):
Gist Offers Logical Fallacies On NECAP Value
By Tom Sgouros on March 20, 2013
I was on the radio ever so briefly this afternoon, on Buddy Cianci’s show with Deborah Gist. Unfortunately, the show’s producer hadn’t actually invited me so I had no idea until it had been underway for an hour. I gather they had a lively conversation that involved belittling the concerns about the NECAP test that I expressed here.
While I was on hold, I had to get on a bus in order not to leave my daughter waiting for me in the snow. Then Buddy said the bus was too loud but he’d invite me back on. So I was only on for about five minutes, long enough to hear Gist say I may be good at math, but I’m no psychometrician.
Guilty as charged, but somewhat beside the point.
I’ve heard the commissioner speak in public in a few different ways since I published my letter last week. She tweeted about it a couple of times last week and over the weekend. She was quoted in the paper this morning about how it was an “outrageous act of irresponsibility” for adults to take the NECAP 11th grade math test at the Providence Student Union event on Saturday. And today she spent a while on the WPRO airwaves insulting me.
But I have yet to hear any of the points I’ve made taken on directly.
Only what is called the argument from authority: I’m education commissioner and you’re not. Or in this case: I’m education commissioner, and you’re not a psychometrician.
As a style of public argument, this is highly effective, especially if salted with a pinch of condescension. It typically has the effect of shutting down debate right there because after all, who are you to question authority so?
The problem is if you believe, as I do, that policy actually matters, this is a dangerous course to take.
After all, the real point of any policy discussion is not scoring debate points, but finding solutions to the problems that beset us. This is a highly imperfect world we live in, filled with awful problems, some of which we can only address collectively. If you don’t get the policy right, here’s what happens: the problems don’t get solved. Frequently, bad policy makes the problems worse, no matter how many debate points you scored, or how effectively you shut up your opponent.
So, do I care that Deborah Gist thinks I’m an inadequate excuse for a psychometrician? It turns out that, upon deep and lingering introspection, I can say with confidence that I do not. But I do care about the state of math education in Rhode Island, and I believe she has us on a course that will only damage the goal she claims to share with me.
Now I may be wrong about my NECAP concerns, but nothing I’ve learned in the past week has made me less confident in my assessment. On the one hand, I’ve seen vigorous denunciations of the PSU efforts, and mine, none of which have actually addressed the points I’ve raised. These are specific points, easily addressed. On the flip side, I’ve quietly heard from current and former RIDE employees that my concerns are theirs, but the policy is or was not in their hands.
Those points again: there are a few different ways to design a test. You can make a test to determine whether a student has mastered a body of knowledge; you can make a test to rank students against each other; you can make a test to rank students against each other referenced to a particular body of knowledge. I imagine there are lots of other ways to think about testing, but those are the ones in wide use. The first is a subject-matter test, like the French Baccalaureate or the New York State Regents exams. The second is a norm-referenced test like the SAT or GRE, where there are no absolute scores and all students are simply graded against each other on a fairly abstract standard. NECAP is in a third category, where it ranks students, but against a more concrete standard. The Massachusetts MCAS is pretty much the same deal, though it seems to range more widely over subject matter.
The problem comes when you imagine that these are pretty much interchangeable. After all, they all have questions, they all make students sweat, and they all require a number two pencil. How different could they be?
Answer: pretty different. If your goal is ranking students, you choose questions that separate one student from another. You design the test so that the resulting distribution of test scores is wide, which is another way to say that lots of students will flunk such a test. If your goal is assessing whether students have mastered a body of knowledge, the test designer won’t care nearly so much about the resulting distribution of scores, only that the knowledge tested be representative of the field. (The teacher will care about the distribution, of course, since it’s a measure of how well the subject has been taught.) The rest was explained in my post last week.
The real question is, if you don’t know what the NECAP is measuring, why exactly might you think that it’s a good thing to rely on it so heavily as a graduation requirement?
Deborah Gist is hardly the first person to call me wrong about something. That happens all the time, as it does for anybody who writes for the public about policy. But like so many others who claim I am wrong, she refuses to say — or cannot say — why.
It’s just the old saying, if you can’t dazzle them with brilliance, baffle them with bulls—. The rheephormers certainly don’t have any brilliance to dazzle with, but they have definitely perfected that bulls— part.
I am ALWAYS saying that–it’s the unspoken motto of the reform movement and every other lousy thing/perpetrator (Jindal, Bloomberg, White, King, Jeb Bush, Mike Milken, and on & on…ad nauseam). Unfortunately for them, we’re all just too smart, and the tide IS turning.
Tom, you make excellent points and for some reason I thought of this Burroughs quote after reading your post.
“I don’t care if people hate my guts; I assume most of them do. The important question is whether they are in a position to do anything about it.”
Reblogged this on Kmareka.com and commented:
Some national coverage of the dust-up between Sgouros and Gist. Moving the conversation forward, I hope!
Tom,
I do not hear Gist claiming that a psychometrician has said that you are wrong! I know that in NJ the designers (I would assume some level of a psychometrician) of SGP have stated that it is not designed to evaluate individual teachers, but the powers that be are determined to use it for that purpose anyway. In recent CREDO reports, psychometricians have said that you cannot determine years of growth based on the analysis but they do it anyway.
As we have seen in many States, the powers that be, backed by the corporate reformers, are not listening to the Public or the Educators; they are doing whatever they want!
This and the immediately preceding posting illustrate a very important aspect of the so-called education debates: the self-proclaimed “education reformers” literally don’t know what they are talking about.
For example, this one line from Diane is a humdrum commonplace among test designers: “The cardinal rule of testing is that tests should be used only for the purpose for which they were designed.”
Commissioner Gist obviously doesn’t have a clue what this means. Tom Sgouros politely states that she is using the “argument from authority” to dismiss this and other points he makes but I would rephrase this to state that she is using the “argument from ignorance and self-interest.”
Could I be making this up? Isn’t this an isolated incident? Take a look at the recent tweets by Wendy Kopp lauding the expertise of TFA novice teachers based on a particular set of test scores. Gary Rubinstein knocks down her house of cards just as effectively as Tom Sgouros does Gist’s. Link: http://garyrubinstein.teachforus.org/2013/03/09/is-a-half-year-of-learning-equivalent-to-one-question-on-a-multiple-choice-test/
For those interested in getting more informed about standardized testing, I recommend: Todd Farley MAKING THE GRADES: MY MISADVENTURES IN THE STANDARDIZED TESTING INDUSTRY (2009), Phillip Harris & Bruce M. Smith & Joan Harris THE MYTHS OF STANDARDIZED TESTS (2011), and Daniel Koretz MEASURING UP: WHAT EDUCATIONAL TESTING REALLY TELLS US (2008).
I am sure other titles could be added as well.
🙂
My response to Gist’s question about whether he was a psychometrician would be “are you?” Yes, she has a doctorate in administration and a master’s in policy, neither of which qualifies her to be a psychometrician. And the concerns Sgouros expresses are reflective of the joint statement of the three professional associations involved with educational testing and of such prominent psychometricians as W. James Popham.
So perhaps Gist would like to sit in a debate on the merits of what she is doing with a psychometrician? If she does, I know a few who would be more than happy to take her on and take her apart.
Oh, and btw, while I make no claim to be a psychometrician, having had only one course in the subject (with an acknowledged expert, Bob Lissitz at U of Maryland College Park), I have had multiple courses in statistics, in research design, and read and written about the relevant research for more than a decade. As far as I can tell, Sgouros is right and Gist – like far too many who came through the Broad Academy – does not know about what she bloviates.
Part of what’s wrong with testing as we know it in the 21st century is that it takes a psychometrician to understand? Bah humbug. Even my medical doctor takes the trouble to explain his diagnosis, the tests, etc. so that I can–in the end–make up my own mind, with the guidance of expert opinions. Note the “s”.
While acknowledging the power of statistical analysis and advances in mapping the brain, I’d still contend that “metering” the mind is fundamentally preposterous.
And even if I’m wrong, Gist’s politically motivated attacks are fallacious and motivated by fear. That she and her ilk are running scared about people’s awakening to the purposes high stakes exams are being put to is good news for students, teachers and the public schools.
In a 3/20/2013 PROJO article titled, “Most Adults don’t Make the Grade in Mock NECAP,” Rep. Teresa Tanzi, D-South Kingstown, is quoted as saying, “I don’t see how cramming for this test and earning a better score will in any way make me a better person or help me be more effective in my career.”
On the opinion page of the same paper there is a reprint of “Standard Tests Do Reveal Which Teachers are Best.” Since I’m not a psychometrician, can someone pick apart the methodology or validity of the Gates study referenced here? http://www.bloomberg.com/news/2013-03-12/standard-tests-do-reveal-which-teachers-are-best.html