A while back, Michelle Rhee had an article published under her name in the Washington Post criticizing parents who opt their children out of state testing. Her main reason seemed to be that parents won’t know whether the school is doing a good job unless they see standardized test scores.
Matt Di Carlo, no fan of the opt-out movement, here takes issue with Rhee. She doesn’t understand the purpose of testing, he writes.
He writes:
“For example, right at the outset, the article asserts that tests are “designed to measure how well our schools are teaching our children.”
“This is just not accurate. Tests are designed to permit inferences, however imperfect, about how well students know a given block of content (e.g., relative to other students).
“Now, of course, we as a nation also have chosen to use these data to assess schools’ and teachers’ contributions to students’ progress. Done correctly and interpreted carefully, such analyses potentially yield useful information, even if reasonable people disagree on how and how much they should be used. Regardless, an important part of calibrating and designing that role is to understand the tests and what they can and cannot do.
“Michelle Rhee is highly visible and wields vast resources. When she asserts that tests are constructed to do something they’re not, with scarce acknowledgment as to how little we know about using the data in this manner, one can understand why people feel nervous about the standardized testing enterprise.
“Similarly, later in the article, Ms. Rhee goes on to offer the claim that opt-out advocates mistakenly think tests “are designed to pass judgment on students,” and responds that the truth is “quite the opposite” – i.e., that tests are “an indicator of … whether schools, educators and policymakers are doing their jobs.”
“While “pass judgment on students” carries negative connotations (and thus strikes me as a kind of a straw man), the truth is that tests are, at least in many respects, designed for this purpose – to assess (again, imperfectly) students’ knowledge of the material. Moreover, to reiterate, using testing data to draw inferences about the performance of schools, educators and policymakers is enormously complex and difficult.
“This distinction between the measurement of student versus school/educator performance is not semantic (and their conflation not at all confined to this op-ed). The flawed assumption that testing results are, by themselves, indicators of school/teacher performance is poisonous to both education policy and the debate surrounding it. It is, for example, reflected in the consistent misinterpretation of testing data in our public discourse, as well as the painfully crude, sure-to-mislead measures of NCLB.”
Matt is a middle-ground kind of guy. He is always reasonable.
But now, I think, parents are not feeling reasonable. Many believe that their children are cheated of a good education by the current obsession with testing. Many feel that the stakes are too high and the pressure on children and teachers robs schools of the joy of learning. High-stakes testing is out of control, and reasonable people recognize it.
I think they are right.
Parents do not need to be reasonable. They need to be Right for their child! Advocate for their child, demand for their child, stand up for their child. When parents are not heard, they must speak up, LOUD…and they don’t have to be reasonable.
Calls to be “reasonable” are just a variation on the “wait” theme that Dr. King was responding to in his Letter from Birmingham Jail.
The problem is whether parents really know what’s happening. At least in my state, we are not allowed to tell them.
Tell em anyway!
To me it’s UNETHICAL not to discuss these things with the parents, especially ones of younger students and the students themselves.
There need to be pamphlets handed to parents about the testing same as privacy info at doctors’ offices.
“This distinction between the measurement of student versus school/educator performance is not semantic. ..”
The mere fact that Rhee and others define their vision of testing this way, and not as what testing has been and should be, is the best argument for opting out. To say to the parent “we are testing your child (and doing the test prep curriculum) as some kind of quality control” should be all it takes.
It’s just … wrong.
Yep,
Completely UNETHICAL as it uses the students, who have no means of controlling, judging, giving the okay to, etc. . . the situation, as means to an end.
TAGO.
C. Kirabo Jackson published an amazing paper in which he showed that test scores account for a very small percentage of what actually goes on in school. Even worse, they are poor predictors for long term outcomes for things like college going and earnings. You can read part I of my summary of his paper on this issue at my blog: http://bltm.com/blog/2014/04/14/vamvomvom/
Standards for Educational and Psychological Testing: an assessment must be validated for the purpose for which it is used.
Read and understand Noel Wilson’s take down of that Testing Bible in his essay review of it: “A Little Less than Valid: An Essay Review” found at: http://www.edrev.info/essays/v10n5.pdf
This is the wisest statement I’ve heard in a while: “Tests are designed to permit inferences, however imperfect, about how well students know a given block of content ”
And inference is tricky – very tricky. It requires a sophistication with results of this type to draw valid inferences to understand anything from test scores.
So why are test scores published? Probably required by FOIA and tremendous readership boost for newspapers. But publishing test scores school by school (much less teacher by teacher) is like handing a crazy man a loaded gun…not gonna end well no matter how much you think you can control things.
For example… My son’s high school has quite middling average test scores – scores that match the socio-economic makeup of the school. And yet, do those scores mean he’s getting less of an education? The crazy man with the gun (general newspaper readership) would say it does.
But, inference would also investigate whether he isn’t getting a BETTER education because the teachers he’s working with have to be BETTER – because they face more challenges.
I had a boss who worked on Navy shipboard radar in the 1970’s. He was “stuck” with the old tube based systems (not the sexy new solid state) and told me he was frustrated by not having the “best”. Except, he discovered that he really learned how radar worked tube by tube while the solid state guys just knew how to swap boards – never really learning or gaining the instinct for how a radar operated.
School seems entirely similar. There are great teachers everywhere. But the ones facing the biggest challenges often are the ones who develop the most sophistication for teaching.
Thanks Mr. Garnett,
I teach in an inner city school and I second your analysis.
3rd it!
A very common Core-ish smarter balanced question….
Devin and Taylor are two very talented teachers who are equally respected by their peers and students.
However, Devin teaches in a school that scores well above average on state assessments and Taylor teaches in a school that ranks amongst the weakest scores in the state on the very same test. Mr. Swanson who is the Commissioner of Education has ordered both Devin and Taylor to swap teaching assignments for a full year as an experiment to prove his point that teachers like Taylor are the cause of student low test scores and that successful teachers like Devin who teach using best practices and the latest research based methods will improve test scores and prepare Taylor’s students to be college ready.
Devin is coming from a suburban school district that is well supported by a thriving community that takes great pride in their neighbourhoods, parks, their community centre and especially their schools. A majority of Devin’s students come from middle to upper-class 2 parent families whose parents have earned at least a 4 year degree.
Taylor is from an impoverished school district where a majority of students are born and raised in neighbourhoods that offer no more safety or security than most impoverished third world countries, whose neighbourhoods are plagued with gunshots, stabbings, murders, fights, theft, prostitution, drug abuse, alcoholism, abandoned buildings, foul language, and garbage. Most students go to school hungry and rely on their school to feed them breakfast and lunch. Most of Taylor’s students come from broken families whose parents have less than a high school education, are well below the poverty line, rely on government subsidies, and almost half of the students have fathers whose degrees are pinned to a crime and not an education.
Considering each teacher’s successes and the socioeconomic background of each district, make a prediction of the outcome of Mr. Swanson’s experiment. Will Devin continue to produce great student test results in his new location? Will Taylor continue to produce unsatisfactory student test results? Is it possible that Taylor and Devin can swap test results?
“This is the wisest statement I’ve heard in a while: ‘Tests are designed to permit inferences, however imperfect, about how well students know a given block of content ‘”
Not me, as it’s a false designation. It’s not logically true.
The tests are designed to supposedly assess a student’s knowledge in a certain format at a certain limited time (the testing time, no more no less). Any resulting “inferences” by definition have to be limited to a description of that particular interaction. One cannot logically infer anything else about either the test itself or the student, nor make comparisons to other tests, students, etc. . . . To do so is to commit a fundamental logical error that can only lead to falsehoods or, as Wilson calls them, invalidities.
Tests can’t “assess student’s knowledge”. They can only assess a small corner of that knowledge – presuming they are able to translate that knowledge in their minds onto the test paper or screen.
So when he says “infer”… The test process is to ask one set of questions and infer a degree of total knowledge out of those questions. It’s what all testing is about – whether standardized or the tests I give my college students (when I teach).
The accuracy of the inference depends on (a) the validity of the test and (b) the care with which the inference is made.
So I think I understand what you’re suggesting. But to my mind… Any analysis or attempt to create a total course grade or standardized test score based on testing is inherently an inference based operation.
Quite often, one that is entirely wrong. But the definition that it’s an inference process appears accurate to me.
Cheers…
PARCC isn’t doing a very good job of informing parents now. This is their much-hyped “live field test update” link.
5:20 pm (ET)
We are closing out of the Performance Based Assessment (PBA) field test window for most states today. We only had about 5,100 new test takers log in today but our total number of tests completed is over 403,000. See the list below for the total number of tests started by students in each state.
Total number of tests started by state:
Arizona – over 23,000
Arkansas –over 12,000
Colorado –over 13,000
Florida –almost 6,000
Illinois –over 66,000
Louisiana –over 23,000
Maryland –over 38,000
Massachusetts –over 47,000
Mississippi –over 31,000
New Jersey—over 62,000
New Mexico –over 7,000
New York –over 8,000
Ohio –over 68,000
Rhode Island –over 12,000
Tennessee –over 36,000
The number above will change and the number of completed tests will rise as Arkansas, Arizona, New Mexico, and Washington, DC will continue or start to field test later this month. We look forward to keeping you updated through our PARCC Updates newsletter and twitter. We also can’t wait to compile and share everything that we learned at the end of the field test!
What is any parent supposed to do with that information?
Also, the idea that people who have absolutely obsessed over test scores, as national media and ed reform celebrities have done over the last decade, will now suddenly provide a “nuanced” view is just not credible.
We all watched the insane media/marketing campaign they indulged in re: international test scores. That happened. Who is going to rein them in when these scores are released? They haven’t shown any self-discipline or put any number they collect in context in the past. They’re the same group of people. Did they have some kind of collective conversion? Now they’re all about “nuance” instead of using scores to treat public schools as political punching bags? These tests really must be miraculous if that happened!
https://www.parcconline.org/live-field-test-updates-april-11
There’s a sucker born every minute. Thanks for reminding us Chiara.
You have to wonder if Rhee is really that ignorant about testing or whether her 20 year-old acolytes who write for her are. Wait, she married Kevin “Sweet 16” Johnson. I guess I answered my own question.
Mark Collins: funny you should wonder “if Rhee is really that ignorant about testing.”
Let’s see…
Jersey Jazzman, April 7, 2014, “Why Is Michelle Rhee Wrong About Everything?”
Link: http://jerseyjazzman.blogspot.com/2014/04/why-is-michelle-rhee-wrong-about.html
Apparently she not only has not mastered the closet, er, close reading of CCSS, she—in her own inimitable style—“sucks at reading” and perforce “sucks at making assertions supported by what she has read.”
This from someone who declares her faith in data-driven management aka management by the numbers but can’t even come up with a shred of evidence to support her claim that she took her students from the 13th to the 90th percentile during her brief stint as a teacher.
Exactly why I often use the terms “data-drivel management” and “data-drivel instruction” when referring to the self-styled “education reformers.”
But then, she never understood what one of those old Greek guys wrote long ago:
“A good decision is based on knowledge and not on numbers.” [Plato]
😎
Touché!
Here’s a high-stakes test that millions of third graders are taking this year, thanks to the non-stop lobbying of ed reformers. This is Oklahoma, but Ohio has the same Third Grade Reading Guarantee cookie-cutter statute:
“This morning, I made a quick run through sample questions of the Reading Sufficiency Act.
As you probably know, this is the test that third graders take. Fail this test and you don’t advance to fourth grade, unless you fall under the category of some narrow exemptions.
Now for the embarrassing part: I missed two out of six! (Probably shouldn’t admit that as an editor.)
Here is a link to the story that includes the sample questions at the bottom. I missed No. 1 and No. 4. And I’ve got a complaint: I thought both questions had a couple of answers that would work.”
An editor of the Tulsa World did poorly on the 3rd grade reading test.
“Thankfully, I’m only embarrassed and frustrated. For third graders, there’s a lot more at stake than pride.”
Absolutely ridiculous, and obviously harmful to third graders.
Did any of the adults pushing these tests take a high-stakes test in third grade?
http://www.tulsaworld.com/blogs/news/mikestrain/mike-strain-embarrassing-truth-i-just-missed-two-questions-on/article_bcfc9f0a-c18c-11e3-a0e8-001a4bcf6878.html
Is it proper to refer to birds as “fishermen” in an informational text? :)
It does seem as though q 4 has more than one correct answer.
Let me see if I get this right: students should be able to get answers right if they have good “close reading” and out-of-context clues. The writer missed 2 out of 6 sample questions for third grade. I read the original article, and the test publisher thinks your problem was merely that they didn’t state exactly which standard was being addressed with each question (such as determining fact from opinion). Is this information provided to the third graders who are taking the test? Would they even be able to process that information (cognitively, within the time period of testing, or in light of their stress levels)? What a crock!
Are the CC tests going to be used to enforce the Third Grade reading guarantee in all the states that adopted that ed reform gimmick?
What do the well-funded and huge ed reform lobbying groups intend to do to avoid that result?
They have a lot of control over how tests are used. God knows they have an almost total lock on media and they seem to own the US Dept of Ed. Why haven’t they opposed any of the stupid fads around testing? Why would this new round of tests be used any differently than ed reformers have used the past rounds of tests? What’s the guarantee they’ll be more responsible and NOT use these tests to push their political agenda?
I don’t think it’s excessively technical to note that standardized, predominantly multiple-choice tests are not designed to “permit inferences about how well students know a given block of content.” Standardized tests are designed to create the widest spread of student scores in the fewest number of test items. That’s a different aim than trying to figure out how well students know something.
In order to create this spread of scores, questions that are answered correctly by too many students must be rejected. Testing companies don’t want all students to score well on their tests, because they’ve promised to create an assessment that will make obvious distinctions between students – even if those distinctions are due to cultural (read: socioeconomic) biases in the questions.
Therefore, there is a certain amount of instructional insensitivity built into most standardized tests.
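The selection logic described in this comment can be sketched in a few lines. This is only an illustration of the idea, not any testing company's actual procedure: the function names, the field-test data, and the 0.3 to 0.8 difficulty window are all hypothetical. It relies on one standard fact: for a right/wrong item answered correctly by a proportion p of students, the score variance contributed by that item is p(1 - p), which is zero when everyone gets it right and largest when half do.

```python
def item_difficulty(responses):
    """Proportion of students who answered the item correctly (1 = right, 0 = wrong)."""
    return sum(responses) / len(responses)


def item_variance(p):
    """Variance of a right/wrong item with proportion-correct p."""
    return p * (1 - p)


def keep_item(responses, low=0.3, high=0.8):
    """Keep an item only if its difficulty falls inside a (hypothetical) window.

    Items nearly everyone answers correctly add almost no spread to the
    scores, so a test built to spread students out would tend to drop them.
    """
    p = item_difficulty(responses)
    return low <= p <= high


# Hypothetical field-test results for three items, ten students each.
field_test = {
    "item_1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],  # everyone correct
    "item_2": [1, 0, 1, 0, 1, 1, 0, 0, 1, 0],  # half correct
    "item_3": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # nine of ten correct
}

kept = [name for name, responses in field_test.items() if keep_item(responses)]
print(kept)
```

Under these made-up numbers, the two items that most students answered correctly are dropped even though they may reflect material that was taught well, which is the "instructional insensitivity" the comment points to.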
KenS: IMHO, no, you are not being “excessively technical.”
To expand slightly on what you wrote. A standardized test that is properly designed, produced, and pre-tested yields pretty much the “right” amount of fails and passes.
“Right” amount? Nonsense? Nope. Because with many many decades of practical experience behind them, along with the work of sometimes truly gifted [in their own way] psychometricians, high-stakes standardized tests in particular can be, and are, delivered to a client’s/clients’ specifications. The client/clients decide what is “right” and “wrong” re the desired result.
For example, you want something very close to a 70% fail on the recent NYS tests? Known in advance.
Was everyone surprised? Uh, the clients couldn’t have been since they paid for a product that was tailored to their specs. The rest of us—yeah, surprised in the same way one is surprised by a sucker punch.
Thank you for your comments.
😎
Great point about the NY State tests, KrazyTA. I know I could look it up, but it’s late so I’ll just ask: didn’t Commissioner King even warn people, well before the test scores were released, that the pass rates were going to be shocking to many?
King gazed into his crystal ball and saw the number 70 form as if by magic in a cloud of smoke. Taking this as a sign, he confidently proclaimed that the new CCSS test would in fact result in a 70% failure rate. Relaying this premonition to the supervisors at Pearson simply added to the astonishing coincidence of the actual 70% failure rate that played out months after his bold prediction.
Interestingly enough, the Kommishiner King has failed to predict the results of this year’s tests.
NY Teacher: most excellent!
The self-styled “education reformers” present their data-drivel management & instruction as a kind of magic—e.g., charters are miracle schools with those high test scores and graduation rates!—but we can see behind the curtains and up the sleeves and know what they’re hiding in the palms of their hands.
All surface, no depth. All decontextualized talking points that don’t connect. And they are so cluelessly self-assured, so unselfconsciously arrogant, that one of their own, Secretary of Education Arne Duncan, can loudly broadcast his misunderstandings about irrelevant standardized test scores in order to sneer at
“white suburban moms who — all of a sudden — their child isn’t as brilliant as they thought they were, and their school isn’t quite as good as they thought they were.”
Link: http://www.washingtonpost.com/blogs/answer-sheet/wp/2013/11/16/arne-duncan-white-surburban-moms-upset-that-common-core-shows-their-kids-arent-brilliant/
For people who love numbers and stats as they are used in education they know far less than I do—and I’m a rank amateur!
But then, of course, they refuse to abandon their Marxist principles:
“The secret of life is honesty and fair dealing. If you can fake that, you’ve got it made.”
¿? The smart one. Groucho.
And why can’t they abandon those principles?
“You can’t teach an old dogma new tricks.” [Dorothy Parker]
😎
Shine on you Krazy TA
The tests are norm referenced, not criterion based.
If more people understood this, they would realize just how gamed the NCLB/RTTT/CCSS testing policies really are.
A norm-referenced test (NRT) is a type of test, assessment, or evaluation which yields an estimate of the position of the tested individual in a predefined population, with respect to the trait being measured. The estimate is derived from the analysis of test scores and possibly other relevant data from a sample drawn from the population.[1] That is, this type of test identifies whether the test taker performed better or worse than other test takers, not whether the test taker knows either more or less material than is necessary for a given purpose.
A criterion-referenced test is one that provides for translating test scores into a statement about the behavior to be expected of a person with that score or their relationship to a specified subject matter. Most tests and quizzes that are written by school teachers can be considered criterion-referenced tests. The objective is simply to see whether the student has learned the material. Criterion-referenced assessment can be contrasted with norm-referenced assessment.
The money quote: [A norm referenced] test identifies whether the test taker performed better or worse than other test takers, not whether the test taker knows either more or less material than is necessary for a given purpose.
Read this over and over until the almost incredible truth about high-stakes test taking sinks in. It takes a while because the human mind tries very hard to reject ideas that are counterintuitive, illogical, or fundamentally unfair.
An example to help. Teachers use criterion referencing for classroom exams. Imagine if my classes scored in a range of 100 (A+) to 67 (D) on my final exam. This would show that students knew at least 67% of the material tested. If I switched to norm referencing, the bell curve would be applied to the same scores. A 67 becomes a low failing grade.
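To make the contrast concrete, here is a minimal sketch that reads the same set of scores both ways. Everything in it is an assumption for illustration: the ten scores are invented to match the 100-to-67 range above, the letter-grade cutoffs are a common classroom convention rather than anything from a real test, and percentile rank stands in for "applying the bell curve."

```python
def criterion_grade(score, cutoffs=((90, "A"), (80, "B"), (70, "C"), (60, "D"))):
    """Criterion referencing: compare the score to fixed (hypothetical) cutoffs."""
    for cutoff, letter in cutoffs:
        if score >= cutoff:
            return letter
    return "F"


def percentile_rank(score, all_scores):
    """Norm referencing: percent of test takers who scored strictly lower."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


# Hypothetical class results spanning 100 down to 67.
scores = [100, 96, 93, 90, 88, 85, 82, 79, 75, 67]

for s in scores:
    print(s, criterion_grade(s), percentile_rank(s, scores))
```

Read against the fixed cutoffs, the student with a 67 passes with a D, having shown about two thirds of the material. Read against classmates, the same 67 sits at the very bottom of the distribution, and nothing about what that student actually knows has changed.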
Note:
Since the “given purpose” as proclaimed by Duncan and Co. is college and career readiness, then why are they using a norm referenced instrument that does NOT identify whether the test taker knows more or less material for [college and career readiness]?
Would someone please ask Duncan this question.
This SMOKING GUN revelation should be used against them.
Read and re-read until you force your brain to suspend the disbelief.
Reblogged this on 21st Century Theater.
Can we please STOP giving this know-nothing, know-it-all any recognition at all.
Bad publicity is better (for her) than no publicity.
The woman that single-handedly gave erasers a bad name.
“She who shall not be named”, would be acceptable, barely.
LOL@She who shall not be named.
Sometimes I wish there was a “like” button on this blog.
From what I can see parents use both the test scores and their own child’s success or lack of it to judge a school. For example if a school has been in “school improvement” for the whole school life of their child, even that doesn’t cause them to get upset. But if their child can’t read by 5th or 6th grade or should be in special ed and isn’t, then parents start looking at the whole picture and saying, wait…Is it my kid or is there some reason this school can’t compare with others around it?
“Ms. Rhee goes on to offer the claim that opt-out advocates mistakenly think tests “are designed to pass judgment on students,” and responds that the truth is “quite the opposite” – i.e., that tests are “an indicator of … whether schools, educators and policymakers are doing their jobs.”…isn’t passing judgement exactly what these tests do when passing the MCAS is a requirement to graduate from high school as it is in Massachusetts?
All of a sudden I realized Michelle Rhee nailed it – and has no idea what she admitted when she said tests are indicators of whether “policymakers are doing their jobs. ..”.
Add to policy makers, consultants and superintendents. And change “doing their jobs” to “should be paid big bucks and keep their jobs.”
It’s about as bad as when CEOs in the 1990s began to sell their worth as measured by company stock price – shareholder value.
And guess who paid the price for that CEO strategy. Tweren’t the CEOs that’s for sure!
“Tests are designed to permit inferences, however imperfect, about how well students know a given block of content (e.g., relative to other students).”
YES!!!
NOOOOOOOOO!!!!!!!!!!!!
See my responses above. Read Wilson.
Permit inferences. There is no guarantee the inferences will be any good!
…Baby steps…
Duane, Two questions:
Who’s Wilson?
How was the fishing?
Fishin’s this weekend. Heading out tomorrow.
Largemouth?
Bottom line: these tests can have an extremely limited diagnostic value when results are interpreted in the light of other data by expert observers familiar with the contexts in which the tests were given. Otherwise, they can only do more harm than good. My wife and I will continue opting our children out until the tests are zero stakes.
Our society is in the grips of some kind of bizarre mania; we actually believe there’s a formula out there somewhere that can quantify how our kids are doing. Test scores as crystal ball. Rather than abandoning this hopeless quest, we argue endlessly about how to find the phantom number that will guarantee we’ll beat China and our kids will be self-supporting.
It is so easy to see the insanity in someone else’s society, so hard to acknowledge it in our own!