A few years ago, teachers at Garfield High School in Seattle launched a boycott to protest the use of the MAP test. They believed it was a waste of time. Teacher Jesse Hagopian wrote about the successful protest in his book More Than a Score. The test was canceled in high school, but unfortunately not in middle schools or elementary schools. It’s typically offered (required) three times a year so teachers can measure student progress in the skill of taking standardized tests.
Steven Singer writes that the MAP test is junk.
He was required to attend training to give the MAP test and wrote the following:
This is an assessment made by Northwest Evaluation Association (NWEA), a so-called non-profit organization out of Portland, Oregon.
The company claims its assessments are used by over 9,500 schools and districts in 145 countries – but none is more popular than the MAP.
Some states even require the MAP as part of their standardized testing machinery. However, in the Commonwealth, the MAP is used as a pre-test or practice assessment by districts that elect to pay for it.
My building – the middle school – used a variety of different assessments throughout the years for this purpose – IXL, CDT, etc.
However, things are changing this year. No, we’re not getting rid of these pretests altogether – why enact sane policy now after a decade of wrongheadedness!?
My district had used the MAP consistently for years at the elementary schools, so someone in administration thought it made sense to bring it to the middle school now and eventually institute it in the high school, as well.
Do we really need an assessment BEFORE the state-mandated assessments?
Heck no!
Classroom teachers give enough assignments and tests of their own to know where their students are academically throughout the year. We grade them after all. What do you think that’s based on – guessing?
But certain administrators just love these pre-tests. They love looking at spreadsheets of student data and comparing one grading period to another. They think if the numbers go higher, it will be proof they’re good principals and functionaries.
It’s pathetic to be honest. What a waste of taxpayer dollars that could be used for actual learning! What a waste of class time that could be used for actual teaching!
And what a negative impact these assessments actually have on students and their learning!
For instance, at the MAP training, teachers were told the assessment’s job was to show how our students were doing in Reading, Math and Science compared with an average test taker.
How is that useful?
I don’t teach average test takers. I don’t even teach average students.
How is constantly comparing them to a norm going to help them improve?
If I went on a diet and stepped on the scale, learning that my weight loss wasn’t as high as an average dieter would not help me stay away from sweets. If anything, it would inspire me to go on a binge in the snack drawer.
It’s the same with my students. Constantly pounding into them how below average their scores are does not inspire them to do better. It teaches them that they cannot do what is being asked of them so they stop trying.
When learning a skill, it doesn’t help to know how well others are or are not learning that same skill. It matters how much you are learning in comparison to yourself. Yesterday I knew THIS. Today I know a bit MORE. Who cares what the so-called average learner can do!?
Students learn at their own rates – sometimes faster, sometimes slower. We don’t quicken the timescale with needless comparisons.
But no matter how many times I say such things to administrators or paid trainers from NWEA, they just don’t get it.
At this training, the instructor actually wanted to know what “elevator speech” teachers were going to give to parents about why the MAP was important!
It’s bad enough we’re being forced to give this crappy assessment, but now you want us to spout propaganda to the very people paying our salaries!?
Why not invite us to the school board meeting and ask us what we really think of this initiative? Why not have us submit comments anonymously and have them read publicly to the school board?
But of course not! That would be actually valuing the opinion of the people you’ve hired to teach!
It’s no wonder the trainer was anticipating blowback. Many parent and teacher groups across the country have opposed the MAP test. Most famously in 2013, teachers at several Seattle schools led by Garfield High School actually refused to give the MAP test.
Having trusted teachers soothe community worry with corporate propaganda would be a big win for the testing company.
However, I’ll give the trainer one thing – she understood that the MAP assessment scores would not be useful unless students could be encouraged to take the test seriously. Nobody tries their best at something they think is unimportant.
Her solution was two-fold. First, NWEA has produced several propaganda videos to show students why the test is important.
I can imagine how much they’ll love that!
Second, the MAP is an adaptive test taken on a computer or iPad. And it actively monitors the students taking the test.
Teachers are supposed to monitor all this on a screen and intervene when it occurs. We’re supposed to counsel kids not to just guess and then allow them back on the test. If the algorithm still thinks students are guessing, we’re supposed to suspend their test and make them take it all over again.
You know, I did not get a masters in education to become a policeman for a standardized testing organization.
Open the link and read the post in full.
WAPO/Donna St. George fired up 9/1 headlines.
“National test scores plunge during the pandemic.”
Producing the largest dip in 30 years on the National Assessment of Educational Progress. The falloff is “historic” and left little doubt about the pandemic’s toll.
https://www.washingtonpost.com/education/2022/09/01/student-test-scores-plunged-pandemic/
Don’t give a damn about the completely invalid NAEP scores. Any discussion of those scores, which is like a Manx cat chasing its tail, is mental masturbation. . . it may feel good for a little bit but it ain’t the real thing.
Expensive garbage is the new math for public schools as corporate predators circle the schools like sharks that never sleep. The main objective of public schools is to educate, not to collect endless amounts of meaningless data. Singer is correct in stating that, if used at all, data should be used by students and teachers to compare a student to him or herself. The elusive “norm” is what is being used to take over schools as generally schools with lots of poor and poor ELLs are routinely going to be below the norm. Interestingly, the private charter schools tend to reject the ELLs because it takes a few years before they can show their potential, and they are seeking a magic bullet that will show instant results. They also do not want to hire specialists that can teach this population. Unfortunately, we are in a time in which data collection appears to be more important than real learning, and the amount of time spent on this fool’s errand actually impedes genuine education.
“I did not get a masters in education to become a policeman for a standardized testing organization.”
This is why I retired in 2021. There are far too many assessments and new curriculums in our public schools. I don’t even want to sub because it is far too stressful to enter into the schools for any reason. As an Adult with HF Autism (Asperger), I am working for me now. I am returning to what turned me on before I entered public schools at 45: sewing.
“I am returning to what turned me on before I entered public schools at 45: sewing.”
Then I guess Mr. Singer is the right person to read, eh!
Fortune to you in your new endeavor, Ms. Horsley!
Bob-
If you read this, would you please explain Neo-orthodoxy in Christianity in the thread for the Hillsdale College post?
Hillsdale has a course, Neo-orthodoxy renaissance: Barth and Bonhoeffer in Nazi Germany.
I’m not the one to answer this call, Linda. I have an interest in religion and in theology, but it doesn’t extend to following the niceties of modern and postmodern Christian theologies. Neoorthodoxy was a mystical reaction against natural theology and its liberal and, in the limit, deistic tendencies, emphasizing the Otherness and Mysteriousness of God, our inability to comprehend God, and our necessary submission to God as a matter of faith, not reason. None of this makes the slightest bit of sense to me. I am tempted to say, well, I guess I am about to say, it doesn’t make any sense to me because it doesn’t make any sense, period. But that’s the point, isn’t it, of neoorthodoxy? Belief, faith are beyond reason. They are unreasonable.
https://bobshepherdonline.wordpress.com/2014/02/22/the-tractatus-comico-philosophicus-soren-kierkegaard/
Steve Singer is absolutely RIGHT.
Classroom Teachers KNOW more about a student than any standardized test score.
“It’s typically offered (required) three times a year so teachers can measure student progress in the skill of taking standardized tests.”
100% Pure Grade AA Bovine Excrement. . . as there is no “measuring of student progress”.
The most misleading concept/term in education is “measuring student achievement” or “measuring student learning”. The concept has been misleading educators into deluding themselves that the teaching and learning process can be analyzed/assessed using “scientific” methods which are actually pseudo-scientific at best and at worst a complete bastardization of rationo-logical thinking and language usage.
There never has been and never will be any “measuring” of the teaching and learning process and what each individual student learns in their schooling. There is and always has been assessing, evaluating, judging of what students learn but never a true “measuring” of it.
But, but, but, you’re trying to tell me that the supposedly august and venerable APA, AERA and/or the NCME have been wrong for more than the last 50 years, disseminating falsehoods and chimeras??
Who are you to question the authorities in testing???
Yes, they have been wrong and I (and many others, Wilson, Hoffman etc. . . ) question those authorities and challenge them (or any of you other advocates of the malpractices that are standards and testing) to answer to the following onto-epistemological analysis:
The TESTS MEASURE NOTHING, quite literally when you realize what is actually happening with them. Richard Phelps, a staunch standardized test proponent (he has written at least two books defending the standardized testing malpractices) in the introduction to “Correcting Fallacies About Educational and Psychological Testing” unwittingly lets the cat out of the bag with this statement:
“Physical tests, such as those conducted by engineers, can be standardized, of course [why of course of course], but in this volume, we focus on the measurement of latent (i.e., nonobservable) mental, and not physical, traits.” [my addition]
Notice how he is trying to assert by proximity that educational standardized testing and the testing done by engineers are basically the same, in other words a “truly scientific endeavor”. The same by proximity is not a good rhetorical/debating technique.
Since there is no agreement on a standard unit of learning, there is no exemplar of that standard unit and there is no measuring device calibrated against said non-existent standard unit, how is it possible to “measure the nonobservable”?
THE TESTS MEASURE NOTHING for how is it possible to “measure” the nonobservable with a non-existing measuring device that is not calibrated against a non-existing standard unit of learning?????
PURE LOGICAL INSANITY!
The basic fallacy of this is the confusing and conflating of metrological (metrology is the scientific study of measurement) measuring and measuring that connotes assessing, evaluating and judging. The two meanings are not the same and confusing and conflating them is a very easy way to make it appear that standards and standardized testing are “scientific endeavors”-objective and not subjective like assessing, evaluating and judging.
Those supposedly objective results are used to justify discrimination against many students for their life circumstances and inherent intellectual traits.
C’mon test supporters, have at the analysis, poke holes in it, tell me where I’m wrong!
I’m expecting that I’ll still be hearing the crickets and cicadas of tinnitus instead of reading any rebuttal or refutation.
Duane, if I were to give a test on state capitals, then the unit of measurement would be correctly identified state capitals. If I were to give a test on the times table from 1 x 1 to 12 x 12, then the unit of measurement would be correct products of pairs of integers from 1 x 1 to 12 x 12. The state tests are terrible. MAP isn’t much better. And yes, there is a problem with these tests, especially in ELA, with regard to validity. They cannot measure what they purport to measure, but the problem is not always with the lack of a unit of measurement. One HUGE problem with the ELA tests is that the standards are very vague and very broad, and the tests typically have one question per “standard,” and it simply isn’t possible, validly, to test a broad standard with one multiple-choice question. If our school officials had stopped to think about this even for a moment, they would have realized the problem and laughed these tests off the national stage. OK. All that said, it is in fact the case that due to the variety of things that one of these incredibly vague, broad “standards” refers to, there cannot be, in these cases, a single unit of measurement of the “standard.”
The administrators who take these tests seriously show, by this, their utter incompetence. They shouldn’t be anywhere near a decision-making desk in a school system. It really is time for educators to start saying, “Enough! No more of this pseudoscience!”
NO! Not at all. Counting correct answers is not a measurement. We’ve been through this before, many times and your adherence to the standards and testing malpractice regime’s language only serves to hide the fact that there is no measuring going on at all. There is counting. There may even be assessing, judging and/or evaluating going on but not measuring.
This confusion is compounded by what it means to measure something and the similar misuse of the meaning of the word measure by the proponents of the standards and testing regime. Assessment and evaluation perhaps can be used interchangeably but assessment and evaluation are not the same as measurement. Word usage matters!
The Merriam-Webster dictionary definition of measure includes the following:
1a (1): an adequate or due portion (2): a moderate degree; also: moderation, temperance (3): a fixed or suitable limit: bounds b: the dimensions, capacity or amount of something ascertained by measuring c: an estimate of what is to be expected (as of a person or situation) d (1): a measured quantity (2): amount, degree
2a: an instrument or utensil for measuring b (1): a standard or unit of measurement—see weight table (2): A system of standard units of measure
3: the act or process of measuring
4a (1): melody, tune (2): dance; especially: a slow and stately dance b: rhythmic structure or movement: cadence: as (1): poetic rhythm measured by temporal quantity or accent; specifically: meter (2): musical time c (1): a grouping of a specified number of musical beats located between two consecutive vertical lines on a staff (2): a metrical unit: foot
5: an exact divisor of a number
6: a basis or standard of comparison <wealth is not a measure of happiness>
7: a step planned or taken as a means to an end; specifically: a proposed legislative act
Measure as commonly used in educational standard and measurement discourse comes under definitions 1d, 2, and 3, the rest not being pertinent other than to be used as an obfuscating meaning to cover for the fact that, indeed, there is no true measuring against a standard whatsoever in the educational standards and standardized testing regimes and even in the grading of students. What we are left with in this bastardization of the English language is a bewildering befuddle of confusion that can only serve to deceive many into buying into intellectually bankrupt schemes that invalidly sort, rate and rank students resulting in blatant discrimination with some students rewarded and others punished by various means such as denying opportunities to advance, to not being able to take courses or enroll in desired programs of study.
Who says that counted answers cannot be a measurement? It certainly can be and is, in fact, used in this way all the time. It meets the criteria of having a precise measurement unit, a raw score in those measurement units, and a range of possible scores, and it can be subjected to all the standard statistical treatments: calculation of mean, median, mode, range, variance, percentage, standard deviation, quartile, z-score, t-score, etc.
measurement unit: state capitals identified correctly (xSC)
your raw score: number of state capitals correctly identified (nSC)
range, or upper and lower bounds (0SC – 50SC)
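To make that concrete, here is a minimal sketch of a few of those statistical treatments applied to counted answers. The five raw scores and the `summarize` helper are hypothetical, invented only for illustration; they are not data from any real test.

```python
from statistics import mean, median, pstdev

# Hypothetical raw scores: state capitals correctly identified, out of 50.
raw_scores = [50, 42, 35, 48, 25]

def summarize(scores, max_score=50):
    """Return a few standard descriptive statistics for counted answers."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation
    return {
        "mean": mu,
        "median": median(scores),
        "range": max(scores) - min(scores),
        "std_dev": sigma,
        # Percentage of the 50 capitals each test taker identified.
        "percent": [100 * s / max_score for s in scores],
        # z-score: distance from the mean in standard deviations.
        "z_scores": [(s - mu) / sigma for s in scores],
    }

summary = summarize(raw_scores)
print(summary["mean"], summary["median"], summary["range"])
```

Whether the numbers that come out deserve to be called a measurement is, of course, the very thing in dispute in this thread; the sketch only shows that the arithmetic itself is straightforward.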
You cannot simply decide, Duane, that words are not to be used with a technical meaning commonly ascribed to them because they are imprecise or inaccurate when they are, in fact, in many instances quite precise, quite accurate. I am, in fact, measuring, here, by counting correct answers, an intellectual achievement: knowledge of state capitals, and I’m doing it precisely. Yes, things can go wrong with the measurement. I can try to do it when you are sleeping, for example, or hungry or drugged or distracted or emotional, but these are contingent facts, not essential, defining ones.
Don’t give a shit about the psychomeretricians bullshit statistical machinations. Counting is not measuring, it is counting. Ay ay ay ay ay!
So, if the standard is, “The student knows the capitals of the 50 states,” then this standard can be measured, quite accurately, quite precisely, and repeatedly, using a test fairly simply constructed and the unit of measurement, etc., described above.
That’s not a standard. It’s a curriculum goal/objective. It is a statement of what we would like the students to know. That is VERY different than it being a “standard”.
As used in education, the word standard means, precisely, a curriculum goal. It’s a statement of what the student should be able to do or should know at the end of a course of study. The word standard has different meanings in different contexts/fields, as potential means one thing to an electrician and another to a life coach.
No, standard is used to confuse curriculum goals/objectives as being a scientific endeavor when it is not.
Duane, I am entirely with you in spirit with regard to people’s tendency to make pseudoscientific claims about academic measurement. Here’s a story to illustrate that: The greatest Shakespeare scholar of the early 20th century, or so many people thought, was George Lyman Kittredge. He had no PhD. Once, he came to Indiana University to give a lecture and, according to a story told me by an English professor there, one of the IU professors asked Kittredge why he didn’t sit for the PhD exams, which he would obviously fly through. I utterly love his response, which was, “Which of you is going to examine me!”
I love this response because it gets to the overweening pride, the hubris, of exam makers. OFTEN, their tests don’t measure what they purport to measure, and there are specific reasons why, but one of those is NOT the general reason that any measurement of any intellectual attainment is a logical impossibility. That’s poppycock. It’s crank stuff. Obviously, some knowledge and ability can be measured–can be very clearly and precisely defined via the process of operationalizing it–turning the intellectual ability into an equivalent set of operations. So, knowledge of the 50 state capitals becomes the operation, teacher names each state, student responds with the name of its capital. The measurement unit is “name of a state capital.”
There are lots of real problems with these so-called standardized tests (I say so-called because the raw scores of state tests are typically NOT converted to standard scores because these are criterion-referenced, not norm-referenced tests). Here is an inventory of actual problems with state ELA tests:
https://bobshepherdonline.wordpress.com/2020/03/19/why-we-need-to-end-high-stakes-standardized-testing-now/
THE most important problem is the one identified by Noel Wilson, that of the onto-epistemological invalidities involved in the whole process. Lacking validity means that the whole process is bullshit.
Yeah, I’ve gotten to the point of being quite crude with my descriptions because no one seems to have listened over the last twenty years as I have said these things in a nicer fashion.
So, I will agree that there is no standard unit of measurement of something as vague as “reading ability.” Clearly, that’s so, and so to that extent your critique applies, but it is NOT the case that it is never possible to have a measurement, using the term quite precisely, of an intellectual attainment that is precise, valid, and reliable.
There is no “intellectual attainment”, which is just another fancy word for the bogus “academic achievement” concept, that is precise, valid and reliable. Wilson showed us that fact in 1997.
There is no intellectual attainment? So no one ever, for example, learns elementary Algebra or how to write a sonnet in proper form? Or are these somehow not intellectual attainments? That’s an utterly BIZARRE assertion, Duane. Some people know things that others don’t. Some people have trained themselves (or been taught) to do things with their minds that others can’t. These are intellectual attainments. OBVIOUSLY. It’s simply crackpot to say that there is no such thing. And I tried to read Wilson. It was clear to me that he was a crackpot. He’s the educational equivalent of folks who write to famous physicists to explain to them why physics has it all wrong and how to build a perpetual motion machine.
I bother with this because it is extraordinarily important for educators and education policy makers to understand the real reasons why these tests are pseudoscience, and it isn’t because it is never possible to measure, rationally and precisely, ANY intellectual attainment.
Yes, it is! “. . .Because it is never possible to [supposedly] measure, [certainly not] rationally and precisely ANY intellectual attainment.” You’ve fallen into their bastardization of language usage trap and only help them continue with their harming of students.
It’s all a big effin lie, every last bit of it.
Oh, so that’s a relief. There’s no difference between me and John von Neumann or Edward Witten when it comes to mathematics because there is no such thing as intellectual attainment, and those guys didn’t attain anything. LMAO!!!
Ah, you caught me in a verbal trap of my own making. Okay so there is no measurable intellectual attainment. And that’s a fact, Jack! 😉
https://bobshepherdonline.wordpress.com/2020/03/19/why-we-need-to-end-high-stakes-standardized-testing-now/
So, your argument works with regard to these state tests. They don’t have a clearly defined standard unit of measurement (because the standards are too broad and vague for them to be precisely operationally defined). But it doesn’t rule out using the term “measurement” to refer to some kinds of testing of some intellectual attainments.
Yes, it does rule out “using the term measurement.” Again, Wilson proved why in 1997.
Notice that not a single rebuttal or refutation has ever been made of Wilson’s work that I know of and I’ve been asking for rebuttals and refutations for going on 20 years now.
Please refute or rebut Wilson’s work and then I might lend an ear to your arguments here and elsewhere.
One of the defining characteristics of EVERY crackpot treatise is the claim “No one has ever disproved x,” where x is the crackpot thesis–ancient aliens, the squared circle, perpetual motion machines, etc.
Prove Wilson wrong!
You can’t and know you can’t, hence your vociferous defense of the standards and testing malpractice regime.
Again, every crank who writes to a math professor with his squaring of the circle or to a physics professor with his perpetual motion machine or to a history professor with his theory about the ancient British Hebrews caps his or her “argument” with exactly this: “The world leaders are shape-shifting reptilians from Alpha Draconis! Prove me wrong!!!”
This prove me wrong bit is a defining, essential characteristic of crackpot theorizing.
Again, here’s why this Wilson stuff bothers me: there are real reasons why the standardized tests are pseudoscience, and people need to understand those. If they encounter crackpot claims (nothing intellectual is measurable), then they just dismiss ALL critique of the standardized tests as crackpot stuff because that critique is so clearly ridiculous.
You demean and slander Noel Wilson by calling him/insinuating him to be a “crackpot”. Just because you can’t comprehend what he is saying doesn’t mean he is a supposed “crackpot”.
I’ll gladly walk you and anyone else who would like to understand what he is saying through it, going chapter by chapter and discussing what is there. Anyone, feel free to email me at dswacker@centurytel.net, putting Wilson in the subject line.
There is no way to put this nicely, Duane. “Educational Standards and the Problem of Error” is a crackpot work. It belongs on the bookshelf next to Ignatius L. Donnelly’s Atlantis: The Antediluvian World and Michael Drosnin’s The Bible Code, between Lucian’s True History and Alison Bird’s Marconics: The Human Upgrade, between Erich von Däniken’s Chariots of the Gods and Raymond Bernard’s The Hollow Earth, between Samuel Hahnemann’s The Organon of the Healing Art and Rhonda Byrne’s The Secret. Crackpot literature. BTW, one of the criteria on John Baez’s Crackpot Index is offering prize money to anyone who disproves your theory. Why? Because people with crackpot theories ALWAYS say, Disprove This! No one ever has because No One Can! I started reading the Wilson work many years ago when you first suggested it. I dutifully downloaded it and started plowing through it. Immediately, it was suspicious because of the really weird language in it–stuff one wouldn’t find in professional work, and it just got loonier and loonier–misunderstandings of basic concepts, semantic and logical confusions, mere assertion, and yes, some flirting with actual truths, such as the fact that assessment always involves a relation of unequal power between the assessor and the assessed. For many decades, I have had a fascination with the works of cranks and with writing by persons with various psychological disorders, such as the many varieties of schizophrenia. Give me a Unabomber Manifesto, and I’ll find that an interesting, if totally crazy, read. And Wilson seemed to me in that vein. Crank stuff. I really wondered about the man’s sanity.
Nice total ad hominem attack along with a nothing supposed critique.
Just admit you didn’t understand what he is saying.
OK, Duane. I’ve worked in education all my life. I’ve studied educational measurement, and I’ve done assessment development for decades. I am responsible for some pretty major innovations in the field. But I know nothing of this. Thank god that there are crackpots to clear this all up for me.
Your diatribes do nothing to refute what Wilson has shown us. It’s sad that you are so blindly ignorant in not being able to read his work without getting hell bent out of mental shape. Can you say projection?
I don’t get “hell bent” when I read his work. First I get amused. Next I get nauseated. Then I just shake my head.
2+2=4.
You can test for that knowledge of that proposition with complete accuracy.
I gave you a number of existence proofs demonstrating accurate measurement of intellectual attainments. Case closed.
Part of it is Wilson’s deficiencies as a writer. Here is the first sentence, proper, of the work:
The project grew out of a general critique of assessment theory and practices, and in particular of the way in which the notion of error in measurement is obfuscated.
Now, of course, Wilson doesn’t mean to be saying that his is a critique of the manner of the obfuscation, as though all would be fine if the obfuscation were done in a different manner, but that is what he unintentionally says in his first sentence. Fine. Some people use syntax that gets way ahead of them. One can read generously and automatically correct this. And why “the notion of error in measurement” instead of just error in measurement, for error is a notion, though notion is a vague and imprecise word. I suppose he means the concept of error in measurement–that he is going to be critiquing obfuscations, or problems, with how people conceptualize error. But already, this is a lot of work to figure out what he intends to say as opposed to what he actually says, because he is not using the language precisely or well.
The opening of Chapter 1 continues:
The study that subsequently developed . . . Clearly positions the writer in terms of the experience, philosophy and values that he brings to this study.
OK. This is a red flag. Scientific studies don’t do autobiography, and there is a reason for this. They are supposed to strive for objectivity. But this is a standard feature of science books by crackpots/cranks. They always go into detail about “How I arrived at the truths that Einstein missed while working as a dishwasher while in the Army.”
Next sentence:
The study that subsequently developed . . . Develops some of the tools of analysis of the assessment process that enables a more stringent critique of the nature and extent of error in the measurement of standards.
OK. One can look past the obvious errors in grammar and usage, “The study developed . . . Develops” (redundancy) and “tools . . . that enables” (agreement error) and “more stringent” (reference error–more stringent than what?) Then we get,
The study that subsequently developed . . . Re-examines some of the fundamental assumptions of educational assessment generally and psychometrics in particular. Indicates some of their most blatant self-contradictions and fudges.
Fudges. This is the kind of thing one sees in papers by 9th graders–extremely informal language used inappropriately in a highly formal context, but that sort of thing is a common characteristic of crackpot writing. And why the use of the personal pronoun (their) to refer to “assumptions” and the attribution to assumptions of selfhood? Extremely sloppy language and personalization of concepts–again a typical feature of crank science writing.
Then we get,
The study that subsequently developed . . . Reconceptualizes the notion of invalidity, and positions the field of educational categorisation here, from the perspective of the examined, rather than with validity, which is an advocacy for the examiner.
I am all in favor of people looking at assessment from the POV of the examined, but why not simply say that? And WTF is meant by reconceptualizing invalidity, and, again, why the notion of invalidity, given that invalidity is a concept, or notion? And in what sense is validity “an advocacy for the examiner”? Does he mean that anyone who uses terms like valid or validity is taking the POV of the examiner? If so, that clearly is not the case. If I write, as I often have, that the state ELA tests do not validly measure what they purport to measure, I am using the term validly and the concept of validity to make a critique ON BEHALF of the examined.
Then one gets the concluding line of the Summary with which Chapter 1 begins:
As can be seen, the initial research question has generated action as well as understanding, a tool to repair the damage resulting from the critique, and a way to reduce some of the violence it implies.
He doesn’t mean, of course, that the damage done by assessment results from his critique, though that is what he inadvertently says.
Shall I go on?
What annoys me about the Wilson book is that it is indeed the case that
a. tests are often invalid
b. standards are often so broad and vague as to be untestable (not operationalizable and so not measurable)
c. there is often an epistemic gap between the testing instrument and what is being tested (the test often doesn’t measure what its makers say it measures)
d. it’s important when evaluating assessments to think of them from the POV of the assessed
These and other matters treated by Wilson are of extreme practical importance because of the outsized role that invalid assessments now take in education due to G.E.R.M. And so it’s important to educate people on the actual issues with the tests.
And then, in the very next section, Wilson hauls off with what reads like a parody of a literature review, referring to “critical studies that overlap mine,” “The sneakiness of some of the research techniques” in one of these studies; claiming that one of these “overlapping” works shows that “there are no standards, or at least none that psychometrics can produce”; mentioning a study of “public examining,” whatever that is, and generally making claims about the contents of the literature reviewed that are far too vague and hyperbolic to be legitimate summaries.
He goes on:
On the other hand, most of the literature on reliability and validity is pertinent to this study.
He means not “On the other hand” but “In addition.” He then tells us that literature provides “enough invalidity information to self destruct.” He means, I suppose, that if one applies the criteria for invalidity advanced by assessment people to the tests created by them, then the tests are demonstrably invalid. This is true. But if that’s what he means, why doesn’t he just say so?
So, part of it is just that he is an extraordinarily sloppy writer attempting to use sophisticated, formal, academic language of which he has little command, like the 9th grade students many of us have had who write sentences like “In the essay, the author of the essay writes paradigms that wonder what is going on with space and time.”
For years, I have mostly held my tongue about this because you and Wilson and I share antipathy for most standardized testing. I utterly loathe and think demonstrably unscientific the state standardized tests in ELA. But I’ve read, for too long, these postings about Wilson’s crackpot notion that no intellectual attainments can be measured. I could hold my tongue no longer. It’s wacko stuff, obviously, easily refuted, but here’s the thing about ideology, about cultish belief: no rational refutation works. Reread carefully my critiques, above, and my counterexamples. Enough said. I’ve wasted too much time on this already.
Can’t refute his work, eh!
I could wade through Wilson, sentence by sentence, and show how misguided and kooky this stuff is, line by line. And after all those thousands of hours of explicating the problem with each of thousands of bits of crank theorizing, you would just say that I was a tool of the standardized testing industry, swallowing the orthodox line. In other words, this would be an utter waste of my time. I leave people to read the stuff themselves. It’s clearly way outside the bounds of reason.
“I could wade. . . . ”
Then do it because you certainly haven’t refuted nor rebutted anything that he has said.
Again, yes, there is an epistemic gap between the independently observable and measurable and abstract concepts or mental events. But there are rational means for operationalizing those and measuring them. Are there pitfalls? Yes. And the current mandated state standardized testing falls head over heels into those. But this doesn’t mean that no measurements of such things are possible. That’s simply a CRAZY notion, refuted by thousands and thousands of existence proofs of pretty good measurements of those things, something done all through the social sciences, including, for example, anthropology and psychology and political science. We can have pretty good measurements of how free people are to vote. It’s not terribly hard to come up with objective correlatives of this abstract notion.
Just because you say that “thousands and thousands of . . . .” doesn’t mean that Wilson is wrong. It just shows that you are refusing to give concrete examples of his work that you dispute.
I’m from Missouri. . . SHOW ME!
Oh, those people on Ravitch’s site. They’re just a bunch of loons, crackpots, cranks. They think you can’t measure learning of any kind at all ever.
That’s what they end up saying, Duane, and I’m sick of people giving them reason to say that.
And I don’t give a damn about what they say-they’re wrong, dead wrong. I know that not one single rebuttal or refutation have I ever seen. Be the first.
I have a friend whose son was, many years ago, diagnosed with schizophrenia. The son writes like this:
God exists, the Universe descended from a single absolute principle, the Higgs Boson, physics proved that, on the other hand, it can be made in a lab, in the sense that God is already embedded in everything, and there is no practical application of this discovery, once again, only the broken symmetry is of interest. As Fuller put it “Unity is at minimum 2” the monotheism is hand-in-hand with people who are traumatized and have their own God complex, which is why Jesus, with his actual human faults is so appealing, they think they’re gonna get a “get out of jail free card” while having no work to do, no willpower or attention or energy required, and that’s not what YHVH means.
Which reminds me of Noel Wilson writing like this:
Very briefly, the Liebnizian inquiry mode begins with undefined ideas and rules of operation, ending with models that count as explanations. The Lockean mode begins with undefined experiential elements, and uses consensual agreement to establish facts. The Kantian system shows the interdependence of the Liebnizian and Lockean modes, and uses somewhat complementary Liebnizian models to interrogate the same Lockean data bank, to ultimately arrive at the best model. the Hegelian mode uses antithetical models to explain the same data, leaving it for the decision maker to create the most appropriate synthesis for a particular purpose. In this mode values of enquirer and decision maker are exposed. Finally, the inquiry system of Singer (1959), is one of multiple epistemological observation, where each inquiring system is observed from the assumptions of the others, and each methodology is processed by those of the others.
Both are examples of people making peculiar statements about and unwarranted grandiose connections between things that they clearly don’t understand but seem to imagine that they have enveloped in a grand vision, an overall scheme. But it’s nonsense, likely pathological nonsense.
’Tis sad that you don’t understand what Wilson is saying there.
Can you get any lower by comparing him to someone with a serious mental condition?
Well, maybe if you compared him in that fashion with the tRump.
And then we get this kind of weird stuff that one might expect from a remedial ninth grader who hasn’t yet learned what kinds of things one does and does not say in a professional paper:
The general strategy used to make the case for invalidity of most current assessment practice is borrowed from military policies of nuclear deterrance. It is a strategy of overkill. Of the thirteen sources of invalidity developed in this study, any one would, if fully applied to current assessment practices, take them out, neutralize them, render them inoperable.
This is not writing by someone who has participated in an ongoing academic community. It’s the kind of thing that crackpots write, extremely telling in that regard. You know, the crackpots who send their explanations of the universe to professors of cosmology with a note saying, Einstein was wrong, and I have proved this! I just need someone to flesh out the TRUTH, which I have revealed here, with equations, for my work renders inoperable ALL SCIENCE that has to date been foolishly believed by the sheep.
Nice ad hominem attack on Wilson!
This reminds me of someone saying, Duane,
Prove to me that Queen Elizabeth isn’t actually a shape-shifting Reptilian extraterrestrial from Alpha Draconis! HA!!! No one has because no one ever can! I’m sorry, I’ve given several examples, above, of rational measurement of intellectual attainments using a clearly, operationally defined unit of measurement making the attainment quantifiable, and yes, by simple counting. There’s your refutation, Duane. Enough. Wasted too much time on this already.
Prove to me that there is no Russell’s Teapot.
If someone says, “There are no black swans! Black swans are a logical impossibility! Wilson has proved that!” and someone shows you a black swan, well, that’s pretty much end of discussion!
“There can’t be heavier-than-air flying machines! These would violate the principles of physics!”
Uh, are birds heavier than air?
Yes.
Do they fly?
Yes.
Are they biological mechanisms?
Ha! See, you can’t prove that! They are held up by God’s hand!
OK, scratch the birds thing. I didn’t get anywhere. Here’s an airplane. It’s going to take off in a few minutes. Stand and watch. Oh, and have a nice day.
Sad, indeed sad that you dance around the real validity issues brought up by Wilson.
Again, your examples do nothing to refute/rebut what Wilson has proven. I guess I’m expecting too much of you!
But, hey, it’s all only mental masturbation anyway, eh!
The whole point of operationalizing a concept is to make it independently observable, verifiable, countable, measurable. Social scientists and educators do this all the time. Sometimes they do it poorly. Sometimes they do it well. And that’s what we can argue about, not about whether it is possible to do it at all.
But I agree with you that one cannot measure something as broad and vague as, Whitten’s math ability, though it’s pretty darned clear that he’s good at it. Same with these reading tests. They pretend to a scientific accuracy that they don’t have because what is being measured is too complex and too broad and the measurement apparatus is too crude. It’s kinda like trying to measure the width of various synaptic clefts with a yardstick.
From Master of the Education Universe Bill Gates down to the cub reporter writing a story about the latest NAEP scores, people make this fundamental error with regard to the state standardized tests: they think that these accurately measure what they purport to measure, as one can, for example, accurately measure, via a test, whether someone knows the state capitals or the times table or the Level 1 kanji. This is the thing that makes their reasoning and reporting entirely screwy. It’s THE big mistake. Everyone (almost) knows that some things can be accurately measured with simple, straightforward tests that operationalize the learning in straightforward, reasonable ways. And they think that the mandated state tests do that: that they accurately, validly, reliably measure “reading ability.” They don’t. Understanding why isn’t simple, it seems. One has to explain the reasons why, and there are several–why the state reading tests are NOT like, say, a test of the times table, which can be extraordinarily accurate, an actual measurement of learning attained.
Including an extraordinarily accurate accounting of the AMOUNT of it, variously reportable as raw scores, percentiles, standard scores, etc.
What’s needed is for people to understand WHY and HOW the state tests are NOT like those obviously valid and reliable tests, why the scores from those cannot be taken at face value. I got tired of explaining the various reasons why these tests don’t do what they say they do, and there are several, so I put the major ones into one short essay, here, so I wouldn’t have to repeat myself any time this question about the validity of these tests came up:
https://bobshepherdonline.wordpress.com/2020/03/19/why-we-need-to-end-high-stakes-standardized-testing-now/
At any rate, Duane, I AGREE with you that the state tests and tests like MAP Reading do not have a standard unit of measurement and don’t measure what they purport to measure. Both true. What isn’t true is the contention that no measurement of any intellectual attainment is possible, unless, that is, you choose to have private meanings for “measurement” and “intellectual attainment,” which isn’t allowed because word meanings are matters of shared public conventions. Again, I don’t want to give Deformers good reasons to dismiss as cranks, as crackpots, those who attack the standardized testing.
What I don’t want to have happen is for people to dismiss all criticism of the federally mandated state standardized testing as “crank stuff,” which they could easily do if one tried to make the argument that no intellectual attainment is measurable because counterexamples spring readily to mind and are easy to find. It’s important to define, clearly, what the problems are with the standardized tests and their interpretation. People, including, alas, a lot of journalists and politicians and parents, simply assume that a state Reading test is a valid, reliable instrument, like a test of, say, knowledge of state capitals or of the times table. And that’s wrong. Understanding why that’s wrong takes a little work, and that’s perhaps why so many never do it. But it’s not that hard to grok the problems that make the state tests invalid and unreliable and so NOT proper instruments on which to base policy decisions, such as ones related to student, teacher, and school evaluation, directions for curricula and pedagogy to take, etc.
“It’s important to define, clearly, what the problems are with the standardized tests and their interpretation.”
And Wilson has done just that!
[sighs deeply]
If you go around saying, the state tests are terrible because you can’t measure intellectual attainments, people will dismiss that argument because it’s clearly, obviously UNTRUE GENERALLY. There are many obvious instances in which one can do precisely that. The issue is not with measurement IN GENERAL. Rather, there are a number of issues that keep the state tests from measuring what they purport to measure. It’s those that must be clarified for folks who simply assume that the test scores mean what they purport to mean. People need to understand WHY they don’t, and it’s not because measurement in general of any intellectual attainment is impossible.
No, it’s not “UNTRUE GENERALLY”. But if there is one thing that is generally true, it is your need to always have the last say.
“It’s typically offered (required) three times a year so teachers can measure student progress in the skill of taking standardized tests.”
I think this was meant tongue-in-cheek, Duane, to imply that the MAP test measures not reading or math or science knowledge or ability but the ability to take the MAP test.
Yes, I caught that tongue in cheek aspect. It doesn’t negate the fact that NAEP doesn’t measure a damn thing.
Let’s consider the paradigmatic case of “measurement,” Duane: Finding out how tall your kid is. You choose a unit of measurement, you take an instrument, and you COUNT off units from top of the head to heel or vice versa. You COUNT in the units of measurement: one centimeter, two centimeters, as in one state capital correctly identified, two state capitals correctly identified. Is that clear enough?
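The counting described here can be sketched in a few lines of Python. This is only an illustration: the five-item answer key, the student's responses, and the function name `count_correct` are all invented for the example and do not come from any real test.

```python
# Hypothetical answer key: a small sample of state capitals.
ANSWER_KEY = {
    "Missouri": "Jefferson City",
    "Oregon": "Salem",
    "Washington": "Olympia",
    "Florida": "Tallahassee",
    "New York": "Albany",
}

# Hypothetical responses from one student (invented for illustration).
responses = {
    "Missouri": "Jefferson City",
    "Oregon": "Portland",
    "Washington": "Olympia",
    "Florida": "Tallahassee",
    "New York": "New York City",
}


def count_correct(key, answers):
    """Count off one unit for each capital correctly identified."""
    return sum(
        1
        for state, capital in key.items()
        if answers.get(state, "").strip().lower() == capital.lower()
    )


raw_score = count_correct(ANSWER_KEY, responses)
print(f"{raw_score} of {len(ANSWER_KEY)} state capitals correctly identified")
```

The unit here is "one capital correctly identified," and the raw score is simply the count of those units, in the same way that height is a count of centimeters.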
Measuring distance is one thing with an agreed upon standard unit length of measurement, with an exemplar of that unit to judge the ruler against for accuracy. Standardized tests do not have those things.
Yes, absolutely right about that, Duane!
Measuring distance. What I want to know is why it took so long to drive to see my girlfriend when I was a boy and so long to drive home.
And why does it take longer to boil water for tea if you are in a hurry?
I say that being able to make a good cup of English Tea is the measure of a good host
I used to go to church with my girlfriend and her parents so that I could take her home [the long way] afterward.
https://bobshepherdonline.wordpress.com/2019/03/18/wooing-dating-hooking-up/
By the way: nice discussion you two. I would love to float a flat river with some of the folks who post here
I have threatened Duane with coming to do that, but he might throw me over to the gar and the snapping turtles!
I am assuming that the plural of gar is gar.
The gat is a wonderful and ancient fish, but I think it may have been named by a tongue-tied pirate:
“Gar matey!”
Gar not gat
lol
Many years ago, my grandfather put, live, three huge alligator gar he had caught into a public fountain in the middle of town. He was incorrigible. A merry prankster.
“I have threatened Duane with coming to do that, but he might throw me over to the gar and the snapping turtles!”
Come on out and float some of the best rivers in the world. There are no alligator gar in the rivers I float, only spotted gar (they’re fun to catch and to see a three foot one do a tail dance is thrilling). Now alligator snapping turtles get to be a good size-a couple of feet in diameter after 75-100 years of living but there are not that many of them, being endangered here in Missouri.
Current, Jack’s Fork, Eleven Point or North Fork of the famous White River. Take your pick! I’ll even supply the water craft-canoe, kayak or john boat.
So, Duane, here’s how operationalizing works: You find a concrete operation or set of operations that is a correlative of the abstract thing that you want to measure, and then you measure that. And yes, it is common to measure by counting. One millimeter, two millimeters, . . . One gram, two grams, . . .
So, for example, people in democratic countries are naturally interested in the actual freedom that people have to vote. Well, “freedom to vote” is an abstraction. But you can render it as a concrete correlative: a) the person wants to vote and then actually votes. Those two things are clearly and independently observable, unlike abstract “freedom to vote.” Then, you can study b) the ways in which those who wanted to vote and didn’t vote failed to accomplish that objective. Well, five percent didn’t have transportation on the day. Twenty percent were turned away because they didn’t have a current state ID. And so on.
Now, is there an epistemic gap between “freedom to vote” in the abstract and this particular measurable operationalization of that concept? Of course there is, and one can refine one’s operationalization based on a critique of the one currently in use. And sometimes, of course, there simply isn’t a practical, concrete operationalization. So, for example, when one of a couple hundred so-called “standards” is “making inferences from text,” one CANNOT ask one multiple-choice question that is going to measure, accurately, that ability IN GENERAL. It’s far, far, far, far too broad. The one MC question is an operationalization of the abstract (because general and not directly observable) concept, but it’s an utterly ridiculous one. It’s like saying that people are unhealthy because your niece has the measles. Your niece’s having or not having measles is not a proper operationalization of the abstract concept of the health, in general, of people in general.
There’s an epistemic gap. Of course. But that’s a pretty good rough operationalization BECAUSE if people who want to vote then vote, and this is generally the case, then it is likely that people are free to vote. If people who want to vote then fail to vote, this raises a question about their “freedom.” What is preventing them? What is making them not free to carry out their intent?
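To make the voting example concrete, here is a rough Python sketch of that operationalization. Everything in it is an assumption made for illustration: the record fields (`wanted_to_vote`, `voted`, `barrier`), the function name `summarize`, and the five made-up survey records. It is not drawn from any real study.

```python
from collections import Counter

# Hypothetical survey records, invented for illustration only.
records = [
    {"wanted_to_vote": True, "voted": True, "barrier": None},
    {"wanted_to_vote": True, "voted": True, "barrier": None},
    {"wanted_to_vote": True, "voted": False, "barrier": "no transportation"},
    {"wanted_to_vote": True, "voted": False, "barrier": "no current state ID"},
    {"wanted_to_vote": False, "voted": False, "barrier": None},
]


def summarize(rows):
    """Return (share of intending voters who voted, tally of reported barriers)."""
    # Step a) keep only people who wanted to vote, then count those who did.
    intended = [r for r in rows if r["wanted_to_vote"]]
    if not intended:
        return None, Counter()
    succeeded = sum(1 for r in intended if r["voted"])
    # Step b) tally what stopped the ones who wanted to vote but did not.
    barriers = Counter(r["barrier"] for r in intended if not r["voted"])
    return succeeded / len(intended), barriers


rate, barriers = summarize(records)
print(f"{rate:.0%} of those who wanted to vote actually voted")
for reason, count in barriers.most_common():
    print(f"  {reason}: {count}")
```

The abstract "freedom to vote" never appears in the code. Only the observable correlatives do, which is exactly where the epistemic gap sits and where a critique of the operationalization would have to aim.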
So that’s how it works, Duane. You measure the abstract thing by substituting an observable objective correlative action and measuring that. This is not impossible, in all cases, to do. In fact, it’s done all the time quite rationally. So, in a nutshell, that’s why Wilson is a crackpot, a crank, a flat Earther among education theorists.
“When you are excited to share your MAP growth with your principal,” tweeted Atlanta Public Schools Leadership…
OK. I just threw up a bit in my mouth.
“… a bit …?” Lucky you. My mouth’s vomit-holding capacity is nowhere near my stomach’s, so out the nose goes the excess.
Gross.
Yup.
Over the years, I have made my share of enemies–including an NWEA representative who visited the High School of Commerce in Springfield, Massachusetts when I worked there during the 2018-2019 school year–asking tough questions about this irritating bureaucratic ritual (and yes, it is used in the high school in New York City in which I currently serve). In the case of the NWEA rep, I simply asked her what her company was planning to do with the extensive data on students this test gathers. She assured me said data was safe. When I pressed her, explaining that she didn’t answer the question I asked, I ended up with a disciplinary letter from the principal of the school.
You, mere mortal, spoke back to the Lord of the Test!?!?
Excellent job, markstextterminal!
Give em hell!
Away with thee! Thou impertinent knave.
Proof that a mortal can sometimes do this and live!
Honestly, Bob, I was shocked that my colleagues at that school weren’t more skeptical of this test and its representative. One colleague of mine there later approached me, thanked me for challenging the dismal received wisdom on this test, and explained that fear for her job kept her from speaking up. So much for intellectual freedom–let alone skepticism–in education. It was really grim. One of the data points this test gathers (which I am afraid to mention in a public forum–I think we all know how covetous of and defensive about their “intellectual property” these companies are) particularly troubled me. Let’s just say that if I were a parent, I wouldn’t want any private company animated by the profit motive to have this information about my child. And thanks for your endorsement, Bob. It means a lot to me.
Mark, you did what teachers have a duty to do. All honor to you for that. One year I took a sample release 9th-grade FCAT test (FCAT: the Florida Child Abuse Test) and did a written analysis of every question on it, demonstrating how every question, AS WRITTEN, was not answerable given the information provided or had more than one possible answer or had an answer that wasn’t among the answers provided. Then I sent a copy of this analysis to every teacher in my department, our Principal, and our AP. It’s astonishing, really, that they didn’t fire me. It’s breathtaking how sloppy these tests are, in addition to being invalid–that is, in addition to not validly measuring what they purport to measure. I have thought of publishing such analyses, but I, too, am worried about lawsuits. These test publishers are pretty ruthless. They will go after people.
They sure will. I think Pearson has been especially aggressive in this area.
My colleagues, of course, were greatly amused by this tour de force, but it made ZERO difference. It just confirmed what they all suspected, and nothing changed.
And nothing will change until the teachers’ unions stop being complicit, with their silence, in the child abuse that is the standardized testing regimen, take it to the streets, and demand an end to it. This has to be done at the state and national levels. It can’t be done building by building, alas, in public schools. It’s usually too difficult to get all or most of the teachers to agree to an act of major civil disobedience, such as all refusing to administer the tests.
Yes–and I don’t see my union doing this, which is particularly disappointing.
It’s disgusting. The leadership of the national teachers’ unions could end this tragi-farce. And they refuse to do so. Sickening.
That they refuse to act to end this child abuse makes my blood boil. The language I have in reaction to that is not allowed on Diane’s blog.
A head of one of the national unions could call a national month of action to end the federal standardized testing mandate. Until one of them does that, they are all complicit in child abuse. I mean that quite literally.
Thanks for reposting this, Diane. I really appreciate it, and I love reading the comments. This is a good tonic to the gaslighting classroom teachers get day-in, day-out. I wish we could bring everyone here to a school board meeting.
Thank YOU, Steven. The Renaissance test my district is now forcing on us three times a year is also expensive junk.
Testing/teaching to tests=one of the MAJOR reasons for the American teacher shortage.
Exactly!
Imagine if all the money wasted on these invalid tests (ones that do not measure what they purport to measure) were spent on classroom libraries. Imagine that! Millions and millions of enticing books in the hands of kids.
None measure what they purport to measure!
Duane, I gave you your counterexamples. I described completely rational procedures, using clear units of measurement, for validly and accurately measuring mental attainments–learning of the times table, of state capitals, and of the Level 1 kanji. And I explained, clearly, I think, that while there is an epistemic gap between mental events and the observable, this is commonly overcome via operationalizing the not directly observable and then quantifying those operations via measurement. Yes, that epistemic gap should always be interrogated, and this is not done by the people who, ironically, call for accountability but take the outcomes of these state standardized tests and the validity of the instruments entirely on faith.
Wilson’s magnum opus begins as crackpot works often do, with wild claims of a revolutionary overthrowing of existing ideas with references to broad areas of knowledge that the crackpot clearly doesn’t have, and then launches into personal narrative (how I came to discover The Truth) entirely inappropriate to a professional academic work like a dissertation (a tell) before making assertions with a purely semantic basis. It’s not measurement because you are not using a physical standard (as if that were necessary; obviously, one can measure whether a kid has learned the times table by questioning him or her on products from the table and counting how many he or she knows. Duh).
And saying, “That’s not measurement!” is just attempting to enforce a private language. Well, yes, that is a measurement. It’s a clear, completely reproducible, completely valid, quantification of the thing, showing how much. It meets all the standard definitions of measurement. OBVIOUSLY. And it’s entirely crackpot to claim that it isn’t. And as is often the case with crackpot theories, it’s based on fundamental misunderstanding, in this case, failure to grasp that language is a matter of arbitrary convention. We choose to use the word measurement to describe that which can reliably and validly be quantified to show us how much, and we do this both with regard to some physical things (the lengths of things, for example) and some abstract or mental things (via operationalization to make it measurable).
Which means, of course, that we should always interrogate the operationalization, and this is what deformers and those who report on standardized test score results fail to do. They naively take them on faith without interrogating the epistemic gap. Does this in fact measure what it purports to measure? And the answer with regard to the state ELA tests and to a lesser extent with regard to the Math test is, “No. It doesn’t.”
I’m retired going on three years, now. Terrible that these initiatives are still so widespread. What started out as technology as a tool for the teacher has morphed into the teacher serving as a monitor tool for the technology. Absolutely pathetic.