Despite the fact that major scholarly organizations have debunked value-added measurement as a way of identifying and quantifying teacher quality, there are still a few lonely defenders of VAM.
There is the U.S. Department of Education, which bet nearly $5 billion on VAM.
There is the Gates Foundation, which has bet hundreds of millions on VAM.
There are stragglers here and there.
And then there is the Center for American Progress, which says that despite all the research to the contrary, they are sticking with VAM.
Just a few weeks ago, the American Statistical Association stuck a pin in the VAM bubble.
The National Academy of Education and the American Educational Research Association had earlier expressed their skepticism about the utility of VAM.
Nonetheless, the CAP still wants to believe. They really truly want to believe, no matter what the statisticians and researchers say.
Probably they are just showing their loyalty to Arne Duncan and the Obama administration.
So Audrey Amrein-Beardsley decided to stick a pin in CAP’s ideological bubble.
She writes:
Their research is notably a small subset of the actual research out there on VAMs, research that was used to rightfully construct the aforementioned position statement released by the ASA, and research that for decades has evidenced that teachers account for, or can be credited for, approximately 10% of the variance in student test scores, while the other 90% is typically due to factors outside of teachers’ control.
Regardless, while the Center for American Progress briefly acknowledges this, they spin this into their solution: The reason this percentage is so low is because we have not yet been accounting for growth in student achievement over time; that is, via value-added models (VAMs). In other words, using more sophisticated models of measurements (i.e., VAMs) will help to illuminate the “real” results we know are out there, but simply have not been able to capture given our archaic models of measurement and teacher accountability.
Not to worry, though, as they write that these “[n]ew measures of teacher effectiveness, determined by evidence of teacher practice and improvements in student achievement, are now available [emphasis added] and provide strong markers for assessing teaching quality and the equitable distribution of the most capable teachers.”
CAP wants to believe in VAM, therefore it does believe in VAM, no matter what the evidence may show.
This should be laughable but this skit, she says, is not funny.
The argument of “what we’re doing isn’t working, so what we need to do is more of what we’ve done” is really troubling… it’s tautological.
M: an accomplished numbers/stats guy put it this way—
“Insanity: doing the same thing over and over again and expecting different results.” [Albert Einstein]
Troubling? Tautological? I think, to a 98% “satisfactory” [thank you, Bill Gates!] chance of certainty, that the German guy put the best label on it.
Thank you for your comment.
😎
I think the VAM proponents aka “researchers” at the Center for American Progress should be willing to risk their careers on their beliefs. All the supposed degrees they earned should be stripped away as if they never got an MBA or an MA in economics or political economy or an MA in applied statistics or whatever “advanced” degrees they get that make them “education experts”! After all, teachers’ careers en masse are being put at risk every day for something they know is pure JUNK SCIENCE (and we can take away the word “science” from the phrase to be more respectful of the field of science)! Teachers are being stripped of their certifications, vilified and labelled with the education version of the “Scarlet Letter”… “Ineffective” all because they teach students who come from impoverished backgrounds in schools where they are forced to follow useless top-down curriculum which is becoming increasingly developmentally inappropriate. Yes, let this “think” tank put its money where its big mouth is.
Doing the Wrong Thing Righter
The proliferation of educational assessments, evaluations and VAM belongs in the category of what systems theorist Russ Ackoff describes as “doing the wrong thing righter. The righter we do the wrong thing,” he explains, “the wronger we become. When we make a mistake doing the wrong thing and correct it, we become wronger. When we make a mistake doing the right thing and correct it, we become righter. Therefore, it is better to do the right thing wrong than the wrong thing right.”
Our current neglect of instructional issues is the result of assessment policies that waste resources to do the wrong things, e.g., canned curriculum and standardized testing and VAM, right. Instructional central planning and student control doesn’t – can’t – work. But that never stops people trying.
The result is that each effort to control the uncontrollable does further damage, provoking more efforts to get things in order. So the function of management/administration becomes control rather than creation of resources. When Peter Drucker lamented that so much of management consists in making it difficult for people to work, he meant it literally. Inherent in obsessive command and control is the assumption that human beings can’t be trusted on their own to do what’s needed. Hierarchy and tight supervision are required to tell them what to do. So, fear-driven, hierarchical organizations turn people into untrustworthy opportunists. Doing the right thing instructionally requires less centralized assessment, less emphasis on evaluation and less fussy interference, not more. The way to improve controls is to eliminate most and reduce all.
Former Green Beret Master Sergeant Donald Duncan (Viet Nam) said as much when he noted in Sir! No Sir! that:
“I was doing it right but I wasn’t doing right.”
And from one of America’s premier writers:
“The mass of men [and women] serves the state [education powers that be] thus, not as men mainly, but as machines, with their bodies. They are the standing army, and the militia, jailors, constables, posse comitatus, [administrators and teachers], etc. In most cases there is no free exercise whatever of the judgment or of the moral sense; but they put themselves on a level with wood and earth and stones; and wooden men can perhaps be manufactured that will serve the purpose as well. Such command no more respect than men of straw or a lump of dirt.”- Henry David Thoreau [1817-1862], American author and philosopher
An excellent summation of why worst business management practices are so destructive in education.
“No problem can withstand the assault of sustained thinking.” [Voltaire]
In baseball terms, you hit it out of the park.
😎
Kind of like Jhimmy today for the Cardinals! We needed that!
Sorry, meant Jhonny, not Jhimmy! Can’t even blame that mistake on, as Shannon would say “the cold frosty ones”!
This position should come as no surprise because the Center for American Progress is effectively an Obama think tank that supports his policies. The fact that they call themselves “progressive” is a misnomer, thoroughly disgusting and another example of how center-right Democrats have hijacked our language in order to confuse, manipulate and/or swindle the party’s liberal base.
I think it was Michael Fiorello who coined the term “belief tank” rather than think tank to describe outfits like the Center for American Progress.
This reminds me of paid expert witnesses in legal trials.
It always amazes me how these organizations name themselves, such as the Center for American Progress. And of course the irony is that everything they think is progress is actually setting us back years. It should be renamed the Center of American Reformists or the Experiment in Education Reform Movement, but progress? Hardly.
The earth is flat. The sun revolves around the earth. One might think that there is sufficient research to bury VAM. I think the moral of Ed deform is that research is trumped by neoliberal ideology. Anyone who disagrees can go take a hike and just f-f-fade away. This is no laughing matter.
Santa Claus, the Easter Bunny, the Tooth Fairy, Leprechauns, Munchkins, Scarecrow, Tin Man, Arne Duncan, and Andrew Cuomo have all voiced their strong, unwavering support of VAM teacher evaluations.
CAP has taken $4.8 million from Gates since 2008:
http://www.gatesfoundation.org/How-We-Work/Quick-Links/Grants-Database#q/k=center%20for%20americanprogress
“CAP wants to believe in VAM, therefore it does believe in VAM, no matter what the evidence may show.”
The true believers in the high church of Testology are like any other brainwashed religious believers, true believers “no matter what the evidence” DOES SHOW. And the evidence as presented by Wilson SHOWS how completely lacking in logical thought and basis is the “high church of testology” that renders any conclusions INVALID. To understand the inherent errors, falsehoods and fallacies involved in this belief system read and understand Wilson’s “Educational Standards and the Problem of Error” found at: http://epaa.asu.edu/ojs/article/view/577/700
Brief outline of Wilson’s “Educational Standards and the Problem of Error” and some comments of mine. (updated 6/24/13 per Wilson email)
1. A quality cannot be quantified. Quantity is a sub-category of quality. It is illogical to judge/assess a whole category by only a part (sub-category) of the whole. The assessment is, by definition, lacking in the sense that “assessments are always of multidimensional qualities. To quantify them as one dimensional quantities (numbers or grades) is to perpetuate a fundamental logical error” (per Wilson). The teaching and learning process falls in the logical realm of aesthetics/qualities of human interactions. In attempting to quantify educational standards and standardized testing we are lacking much information about said interactions.
2. A major epistemological mistake is that we attach, with great importance, the “score” of the student, not only onto the student but also, by extension, the teacher, school and district. Any description of a testing event is only a description of an interaction, that of the student and the testing device at a given time and place. The only correct logical thing that we can attempt to do is to describe that interaction (how accurately or not is a whole other story). That description cannot, by logical thought, be “assigned/attached” to the student as it cannot be a description of the student but the interaction. And this error is probably one of the most egregious “errors” that occur with standardized testing (and even the “grading” of students by a teacher).
3. Wilson identifies four “frames of reference,” each with distinct assumptions (epistemological basis) about the assessment process from which the “assessor” views the interactions of the teaching and learning process: the Judge (think college professor who “knows” the students’ capabilities and grades them accordingly); the General Frame (think standardized testing that claims to have a “scientific” basis); the Specific Frame (think of learning by objective, like computer-based learning, getting a correct answer before moving on to the next screen); and the Responsive Frame (think of an apprenticeship in a trade or a medical residency program where the learner interacts with the “teacher” with constant feedback). Each category has its own sources of error, and more error in the process is caused when the assessor confuses and conflates the categories.
4. Wilson elucidates the notion of “error”: “Error is predicated on a notion of perfection; to allocate error is to imply what is without error; to know error it is necessary to determine what is true. And what is true is determined by what we define as true, theoretically by the assumptions of our epistemology, practically by the events and non-events, the discourses and silences, the world of surfaces and their interactions and interpretations; in short, the practices that permeate the field. . . Error is the uncertainty dimension of the statement; error is the band within which chaos reigns, in which anything can happen. Error comprises all of those eventful circumstances which make the assessment statement less than perfectly precise, the measure less than perfectly accurate, the rank order less than perfectly stable, the standard and its measurement less than absolute, and the communication of its truth less than impeccable.”
In other words, all the logical errors involved in the process render any conclusions invalid.
5. The test makers/psychometricians, through all sorts of mathematical machinations, attempt to “prove” that these tests (based on standards) are valid-errorless or supposedly at least with minimal error [they aren’t]. Wilson turns the concept of validity on its head and focuses on just how invalid the machinations and the test and results are. He is an advocate for the test taker not the test maker. In doing so he identifies thirteen sources of “error”, any one of which renders the test making/giving/disseminating of results invalid. A basic logical premise is that once something is shown to be invalid it is just that, invalid, and no amount of “fudging” by the psychometricians/test makers can alleviate that invalidity.
6. Having shown the invalidity, and therefore the unreliability, of the whole process Wilson concludes, rightly so, that any result/information gleaned from the process is “vain and illusory”. In other words, start with an invalidity, end with an invalidity (except by sheer chance every once in a while, like a blind and anosmic squirrel who finds the occasional acorn, a result may be “true”) or, to put it in more mundane terms, crap in, crap out.
7. And so what does this all mean? I’ll let Wilson have the second to last word: “So what does a test measure in our world? It measures what the person with the power to pay for the test says it measures. And the person who sets the test will name the test what the person who pays for the test wants the test to be named.”
In other words it measures “’something’ and we can specify some of the ‘errors’ in that ‘something’ but still don’t know [precisely] what the ‘something’ is.” The whole process harms many students as the social rewards for some are not available to others who “don’t make the grade (sic).” Should American public education have the function of sorting and separating students so that some may receive greater benefits than others, especially considering that the sorting and separating devices, educational standards and standardized testing, are so flawed not only in concept but in execution?
My answer is NO!!!!!
One final note with Wilson channeling Foucault and his concept of subjectivization:
“So the mark [grade/test score] becomes part of the story about yourself and with sufficient repetitions becomes true: true because those who know, those in authority, say it is true; true because the society in which you live legitimates this authority; true because your cultural habitus makes it difficult for you to perceive, conceive and integrate those aspects of your experience that contradict the story; true because in acting out your story, which now includes the mark and its meaning, the social truth that created it is confirmed; true because if your mark is high you are consistently rewarded, so that your voice becomes a voice of authority in the power-knowledge discourses that reproduce the structure that helped to produce you; true because if your mark is low your voice becomes muted and confirms your lower position in the social hierarchy; true finally because that success or failure confirms that mark that implicitly predicted the now self evident consequences. And so the circle is complete.”
In other words, students “internalize” what those “marks” (grades/test scores) mean, and since the vast majority of the students have not developed the mental skills to counteract what the “authorities” say, they accept as “natural and normal” that “story/description” of them. Although paradoxical in a sense, the “I’m an ‘A’ student” is almost as harmful as “I’m an ‘F’ student” in hindering students becoming independent, critical and free thinkers. And having independent, critical and free thinkers is a threat to the current socio-economic structure of society.
Here’s a dash of Banesh Hoffmann to complement a hearty serving of Noel Wilson:
[start quote]
The most important thing to understand about reliance on statistics in a field such as testing is that such reliance warps perspective. The person who holds that subjective judgment and opinion are suspect and decides that only statistics can provide the objectivity and relative certainty that he seeks, begins by unconsciously ignoring, and ends by consciously deriding, whatever can not be given a numeral measure or label. He comes to believe that whatever is non-numerical is inconsequential. He can not serve two masters. If he worships statistics he will simplify, fractionalize, distort, and cheapen in order to force things into a numerical mold.
The multiple-choice tester who meets criticisms by merely citing test statistics shows either his contempt for the intelligence of his readers or else his personal lack of concern for the non-numerical aspects of testing, importantly among them the deleterious effects his test procedures have on education.
[end quote]
[THE TYRANNY OF TESTING, 2003 edition of the 1964 edition of the 1962 original, pp. 143-144]
And literally the last paragraph of his book, Chapter 9, “Don’t Be Pro-Test—Protest” on pp. 216-217—
[start quote]
All methods of evaluating people have their defects—and grave defects they are. But let us not therefore allow one particular method to play the usurper. Let us not seek to replace informed judgment, with all its frailty, by some inexpensive statistical substitute. Let us keep open many diverse and non-competing channels towards recognition. For high ability is where we find it. It is individual and must be recognized for what it is, not rejected out of hand simply because it does not happen to conform to criteria established by statistical technicians. In seeking high ability, let us shun overdependence on tests that are blind to dedication and creativity, and biased against depth and subtlety. For that way lies testolatry.
[end quote]
I urge viewers of this blog to read both.
😎
Uh-oh. My union just retweeted a link to an organization which is helping the State DESE proliferate some of these wacko, unscientific “[n]ew measures of teacher effectiveness, determined by evidence of teacher practice and improvements in student achievement, [that] are now available and provide strong markers for assessing teaching quality and the equitable distribution of the most capable teachers.”
This can help us to see more deeply into the flaw in the whole standardized “value added” metric evaluations, and how these ultimately destroy individual opportunity for our children. Pay attention. Ready?
“Massachusetts Association for Health, Physical Education, Recreation & Dance”
“…we are appointing a MAHPERD Assessment Committee and will work with DESE this June to provide you assessments, examples of DDM’s and the parameters of scoring in regards to PE Metrics (K-12) , an assessment tool which we are encouraging all school districts to utilize for physical education DDM’s. ”
“We highly discourage fitness scores to be used within DDM’s. … because of physiological rationale and time on learning aspects as well…we need to have two things in mind: (1) They must be aligned to content (2) Provide useful information. Scoring for high, medium and low performance and what those parameters look like will be determined by your MAHPERD Assessment Committee … Student growth has to be calculated in one academic year.”
“However, there are many other content areas that have to be addressed besides health and physical education… So most likely we all will receive an extension.”
The twisted logic requires that there must be a metric to represent student physical and health learning as a data point, so teachers can be compared and evaluated by algorithms.
Even though a good teacher will be able to help a student identify good individual physical activity goals and health outcomes, and even measure progress toward them with teacher-created assessments, there is absolutely no possibility of creating any statistical algorithm to deal with the complexity and variation across a population of children’s physiology. So, we can’t score a PE teacher on the fundamental and vital service they might otherwise offer. We have to invent some bogus content-aligned metric that can generate a standardized “student growth” score over one academic year, so we can hold the PE teachers accountable to data-driven instruction.
What will the PE teachers do, in the face of waves of RIFs and closures, especially for the poorest children with the greatest need for their individual attention? Will they try to offer the potentially life-changing opportunity for children to develop healthy and joyous physical activity habits and skills? And lose their jobs? What a tragedy in the making for this abused generation.
So, my eventual take-home is … how could a child’s intellectual growth really be less varied, elusive and complicated than her physiological growth?
Nobody read this, did you? I mean, you all care, but it’s too involved and long, even for you, and there are so many other attacks to fend off.
Oh. The link:
http://www.ma-hperd.org/DESE%20Update%20DDM.htm
Suggestion: It would be helpful if you always define terms like VAM when using them. Sometimes we would like to forward a post to someone who won’t know–but we’d like them to know. Or new subscribers may not be as informed as they would like to be.