The minority leader of the state Senate Education Committee is proposing legislation to stop using a standardized test as a graduation test. Standardized tests are designed to produce a bell curve. A set proportion of students will fail, by design.
“WEST CHESTER (January 16) – State Senator Andy Dinniman said the lack of resources in Pennsylvania’s financially distressed public schools is so stark that the use of the Keystone Exams as graduate requirements must be stopped before they exacerbate an already dire situation. “It’s clear to me that there are two systems of public education in Pennsylvania: separate and unequal,” said Dinniman, who serves as minority chair of the Senate Education Committee. “Until we resolve that discrepancy, how can we, in good conscience, stamp ‘failure’ on the backs of kids who lack the teachers, resources and classes to pass such standardized tests? To continue down this path without addressing such basic issues is beyond the pale. It’s downright shameful.” Dinniman announced that he will introduce legislation to end passage of the Keystone Exams as high school graduation requirements because they will only widen the growing gap between financially distressed and more affluent high schools.”

Since the very goal of the Republican majorities in the Pa. senate and house is to “widen the gap between financially distressed and affluent” schools, so that Philadelphia, Harrisburg, York and Reading, to name a few, go the way of privatization, I am, sadly, skeptical that Sen. Dinniman will succeed with his bill.
Diane,
Only norm-referenced standardized tests are created to result in a bell-curve distribution. Theoretically, criterion-referenced tests, such as the one in Pennsylvania, would allow for 100% of students to meet or exceed the criterion. I’m not saying that PA should have a standardized test as a graduation requirement, but arguing against having one because it dooms X% to failure is not a valid argument.
The much more important point made by the legislator is that PA does not provide adequate funding to the majority of schools, such that many of them simply don’t have the necessary resources to ensure that all kids have a reasonable chance to meet the graduation standard. The inadequate, inequitable, and classist/racist funding system in PA is the big issue for our state!
Dr Fuller,
You are far too kind in referring to the Keystone exams as “norm-referenced”. The formula for going from raw scores to scaled scores is a “secret sauce” that DRC will not release to teachers, administrators, or independent parties. The tests are not independently reviewed for accuracy or appropriate content (I have seen them, and many of the questions are not at an appropriate level or do not match the state standards that they are supposed to be testing). The pass rate on the Biology exam statewide was under 40%, and I would challenge you to find 10 districts whose Biology passing rate was over 70%. Either 99% of the districts in PA are not teaching Biology to an acceptable level or the tests are grossly out of line. I’ll leave it as an exercise to the reader as to which is the more likely scenario.
I said they are NOT norm-referenced. They are clearly criterion-referenced. Now the cut score can be set too low. But the tests are not norm-referenced. The “secret sauce” is published in the technical digest on the state’s website at http://www.portal.state.pa.us/portal/server.pt/community/state_assessment_system/20965/keystone_exams/1190529. So, they don’t seem very secret. Anyone can look at the equations used to convert the scores. Every testing contractor follows these psychometric properties.
It’s fine to be against testing and accountability, but you do a disservice to our arguments when you don’t make fact-based statements.
@ Dr Fuller
Sorry, I meant to say “criterion-referenced”, at least as it relates to the PA educational standards. You ignored my remarks about inappropriate levels and content and the lack of independent oversight.
The link that you sent does NOT include how a raw score is translated to a scaled score. Yes it is the link that says “Keystone Technical Reports” and yet it does not contain any of that information. I will ask you a very simple question: How many points does it take to score a 1500 on the Biology Keystone exam? You say that this score may be set too low when only a handful of schools have a 70% passing rate? So you think that the vast majority of PA schools are failing to teach Biology to an appropriate level?
Anyone interested in how evaluations are done properly should look to New York State where the subject area tests are released and the conversions are clearly explained.
Pages 168 and 169 have the equation that was used to convert raw scores to scale scores. It’s a linear equation. Plug in the raw score to get the scale score. Nothing tricky here. They are not hiding anything.
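For readers who want to see what a linear raw-to-scale conversion looks like in practice, here is a minimal sketch. The slope and intercept are made-up placeholders chosen for illustration, not the coefficients from the Keystone technical report, and the function names are hypothetical. The 1500 target is simply the score asked about earlier in this thread.

```python
import math

# Hypothetical coefficients for illustration only; the real values
# would have to come from the published technical report.
SLOPE = 10.0
INTERCEPT = 1200.0


def raw_to_scale(raw_score, slope=SLOPE, intercept=INTERCEPT):
    """Linear conversion: scale = slope * raw + intercept, rounded."""
    return round(slope * raw_score + intercept)


def raw_needed_for(scale_target, slope=SLOPE, intercept=INTERCEPT):
    """Invert the line: smallest whole raw score that reaches the target."""
    return math.ceil((scale_target - intercept) / slope)


if __name__ == "__main__":
    print(raw_to_scale(30))       # scale score for 30 raw points
    print(raw_needed_for(1500))   # raw points needed to reach 1500
```

With a published slope and intercept, the same inversion would answer the "how many points does it take" question directly.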
And having actually found the technical report for 2013, how can you say that using the Rasch model is criterion-referenced and not norm-referenced when the difficulty of the items is set by field testing? If the score is based on question difficulty, which is set by field testing, how is that criterion-referenced? I’d love to hear an explanation.
The scaling formula is set after the test results are in and varies from test to test. Again, how is that “criterion-referenced” rather than “norm-referenced”?
I don’t have time to get into an extended discussion. Please read some books about the differences between the two. The definitions are based on whether there is a criterion score that determines passing or whether students are compared to each other. It has nothing to do with the equating, conversion to scale scores, etc. The company does a good job in developing the tests and documenting the methods. They follow accepted psychometric and test-development standards. The cut score is determined by politicians, not the testing agency. The real problem is the cut score, how the tests are used, and the inequitable funding in the state. I am now done discussing the difference between criterion and norm-referenced tests. Please read Daniel Koretz for an easy to understand explanation.
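The distinction drawn here can be shown with a small sketch. It assumes a simplified picture in which a criterion-referenced decision compares each score to a fixed cut score, while a norm-referenced decision passes only a fixed share of test takers by rank. The function names, scores, cut score, and pass fraction are all hypothetical.

```python
def criterion_pass(scores, cut_score):
    """Criterion-referenced: each student is compared to a fixed cut score."""
    return [s >= cut_score for s in scores]


def norm_pass(scores, pass_fraction):
    """Norm-referenced: only the top share of students passes,
    regardless of how high the scores actually are."""
    n_pass = int(len(scores) * pass_fraction)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    passed = set(ranked[:n_pass])
    return [i in passed for i in range(len(scores))]


if __name__ == "__main__":
    scores = [1510, 1520, 1530, 1540]   # hypothetical scale scores
    print(criterion_pass(scores, 1500)) # everyone clears the fixed cut
    print(norm_pass(scores, 0.5))       # half fail by construction
```

Under the first rule, every student can pass if every student clears the cut. Under the second, a set share fails no matter how well the group does, which is the "by design" failure the original post describes.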
Thanks, Ed. I recall hearing–when I was a member of the NAEP board–that criterion referenced tests have an odd habit of turning into norm-referenced tests. Is that an urban myth?
I appreciate the point about Pennsylvania’s inequitable funding. Why don’t you write a blog post about how Pa. funds its schools?
Not an urban myth. I mean, they don’t actually become norm referenced in design, but people keep analyzing the results as if they were.
Norm-referenced by design, as many items test IQ/aptitude or out-of-class experiences instead of questions or prompts directly related to the standards/curriculum. This article from 1998 I lifted from Peter Greene; it does an excellent job of describing the problems of using standardized test scores to measure the quality of an education. One of the most interesting points made is that field test results end up eliminating some of the most important and best taught material. Test items that students do too well on are usually eliminated because they will not produce the proper bell-curve type spread in scores that test writers must produce. A must read:
http://www.ascd.org/publications/educational-leadership/mar99/vol56/num06/Why-Standardized-Tests-Don%27t-Measure-Educational-Quality.aspx
danielkatz2014: with all due respect to Dr. Fuller—and I mean that without sarcasm or mental reservation—you have just eviscerated the second half of his first paragraph.
We rightly place much responsibility on those that design, produce, pretest, publish [in hard copy or digitally] and rent or sell outright high-stakes standardized tests. However, the clients they pander to are the ones that ensure through legally binding mandates the [few] rewards and [many] punishments that come from their use. One may applaud or deplore standardized tests which [IMHO] measure very little in an inherently imprecise way—but the uses and abuses of those eduproducts for purposes for which they are grossly inappropriate and misleading are well known and of long standing. Quite frankly, given the SOPs [StandardOperatingProcedures] of the sellers and buyers of standardized tests, it is a non-starter with me when someone seems to extract a widespread lucrative SOP from its socioeconomic context and tries to explain that ‘technically’ it’s not supposed to work that way.
Again, I do not think Dr. Fuller a fool. Online discussions are prone, even with the best of intentions, to folks misstating or overstating or understating—you get my drift—their own stance. *I include myself in the preceding sentence.*
However, in this case I agree with you and dianeravitch and Bill Bradley.
That’s the way I think the cookie crumbles.
Even if ‘technically’ it’s not supposed to crumble that way.
Thank you for your comments.
😎
I attended the PA Senate Hearing on the Keystone Exams in August 2013. About a year later I wrote a blog post about it:
Much was said, but one thing retired Senator Jeff Piccola said in his testimony has stuck with me. After a brief history of the way our politicians developed more school accountability, he testified about why the Keystones and their higher stakes were necessary.
“It is important to note that the Keystones are the first instance that the students are held accountable for their academic achievement since Pennsylvania began developing these standards in the 1990’s. Heretofore, the PSSA’s could be blown off by the individual students because it didn’t count anything for them. And I recall visiting schools in various school districts and elementary school students can be cajoled, and bribed, and encouraged to do well on the PSSA’s, but by the time they get to 8th grade they’ve figured out they have no stake in the exam…”
Former Senator Piccola continues, “…Therefore we do need some kind of exam that students know they are accountable for their grade. Heretofore, school administrators and teachers were accountable for an exam that students may or may not take seriously. With proficiency requirements for graduation, the students will certainly view the tests as important. That is only fair to the students. That is something we need in our system to be fair to teachers and administrators.”
They implemented the Keystones because they couldn’t “cajole and bribe” our older children?!
I know children with anxiety & learning disabilities. I know children who fail tests because they just found out their parents are getting a divorce, a grandparent died, or a parent is not well. I know children who live in or come from neighborhoods where death and poverty haunt them and have left them with gaps in their content knowledge. All kids deserve a shot at growing up and getting a high school diploma, and degrees beyond. Children (and teens are still our children) care about these tests; indeed, they live in fear of them. There will be a tidal wave of failures if our legislators don’t move forward to stop this high stakes exam.
Read more here:
http://whatsthebigideaschwartzy.blogspot.com/2014/08/former-senator-says-pennsylvania-kids.html
I was also interviewed while at that Senate hearing here:
http://articles.philly.com/2013-08-28/news/41500834_1_keystone-tests-high-school-students-pssas
This post is on the position of the PA NAACP on the Keystones. They call them, “Eugenics…human rights violation…unspeakable horror…holocaust on our youth and society…life-long trauma… a system of entrapment for the youth of Pennsylvania…depraved indifference…deficient in a moral sense of concern…lacks regard for the lives of the children who will be harmed, and puts their lives and futures at risk…LYNCHING OUR OWN YOUNG.”
Read more here: http://whatsthebigideaschwartzy.blogspot.com/2013/09/naacp-calls-it-like-they-see.html
Thank you Senator Dinniman and all who are fighting for our kids.
Danielle Arnold-Schwartz
Parent and Teacher
Lower Merion School District, PA
Indeed, children realize that the state tests are not worth their trouble because there are no consequences if they just blow them off.
Yet, the tests in California were designed so that half of those taking them, at any level, would score below the proficient point, which was set, you guessed it, at the average.
Nobody said anything until “those people” started to demand that teachers be punished for these results. And now we have to contend with the Vergara decision, but in the meantime the CSTs are gone and the SBAC tests are not ready for prime time. Nor are there enough computers or bandwidth to take them as “they” intend them.
So now what?
BTW, there is actual hard data proving that LAUSD kids gave up on the tests once they got into middle school (6th grade here). The verdict on the discrepancy between test scores and classroom grades? Simultaneous grade inflation and deflation across the entire district. Have you heard much about that? Of course not. It got swept under the rug once Deasy came in.
“Test Renormalization”
Norm-reffed is as norm-reffed does:
(Test-type notwithstanding)
Setting cut score after cuz
You found too many standing
Mr. Fuller made a great argument about the difference between norm-referenced and criterion-referenced tests.
Indeed they are very different in theory. But in practice, the difference might not be much if they are treated and designed the same. No amount of reading books by Koretz is going to change that.
The bottom line is that if the tests produce a Gaussian (aka the bell curve), the test is a norm-referenced test.
I’ve gotten into major arguments with one of those responsible for the old California Standards Tests at a well-known education policy blog over this. He finally admitted that his side of the shop did not determine what sets the cut scores. He claimed that it was above his pay level and that the natural distribution of scores in a criterion-referenced test will always shake down into a Gaussian.
Fine and good, but if it quacks like a duck, waddles like a duck and water does not go through its feathers, it is a standardized test.