New York is very proud of its new Educator Effectiveness Evaluation model, which claims to measure which teachers and principals are effective, relying in part on the increase (or not) of test scores of students.
Bruce Baker of Rutgers demonstrates that the model is biased and inaccurate. It favors classes and schools that start off with higher-performing students.
He concludes with a brief sermon about the importance of ethics:
“I have pointed out that the originators of the SGP approach have stated in numerous technical documents and academic papers that SGPs are intended to be a descriptive tool and are not for making causal assertions (they are not for “attribution of responsibility”) regarding teacher effects on student outcomes. Yet, the authors persist in encouraging states and local districts to do just that. I certainly expect to see them called to the witness stand the first time SGP information is misused to attribute student failure to a teacher.”
“But the case of the NY-AIR technical report is somewhat more disconcerting. Here, we have a technically proficient author working for a highly respected organization – American Institutes for Research – ignoring all of the statistical red flags (after waiving them), and seemingly oblivious to gaping conceptual holes (commonly understood limitations) between the actual statistical analyses presented and the concluding statements made (and language used throughout).”
“The conclusions are WRONG – statistically and conceptually. And the author needs to recognize that being so damn bluntly wrong may be consequential for the livelihoods of thousands of individual teachers and principals! Yes, it is indeed another leap for a local school administrator to use their state approved evaluation framework, coupled with these measures, to actually decide to adversely affect the livelihood and potential career of some wrongly classified teacher or principal – but the author of this report has given them the tool and provided his blessing. And that’s inexcusable.”
Allow me to clarify one thing. You note above that I demonstrate that the model is inaccurate and biased. What is actually so disturbing here is that the author of the technical report itself demonstrates – repeatedly – that the model is biased and inaccurate. And then, the author goes on to completely ignore those findings, setting aside any regard for how this endorsement of bad measures may adversely affect the livelihoods of teachers and the quality of education received by children.
SF101,
What amazes me is that there is one simple FACT about standardized testing (well, actually there are many simple, and even complex, errors in standardized testing), but the one I would like to point out is that all the major testing organizations state that to use the results of a standardized test for anything other than the intended purposes is UNETHICAL. So using a 5th grade standardized math test to make a very error-prone statement about the effectiveness of the teacher, instead of about the interaction of the pupil and the test, is UNETHICAL and therefore WRONG. (And I use the capital letters to EMPHASIZE that FACT.)
And the author of the report is blinded by ideology to say and do as he does, and is therefore also acting UNETHICALLY and is WRONG. To paraphrase Jefferson, “It is error alone which needs the support of [ideology]; truth can stand by itself.” (The original quote has government where I put ideology.)
Any standardized test, as shown by N. Wilson, has at least 13 sources of error which render the whole process completely invalid and any resultant declarations about the student and/or teacher to be, as he states, “vain and illusory”. In other words, more likely than not false. Bovine excrement in, equine excrement out, to paraphrase an old country saying.
Also, aren’t these student growth scores rather than value-added? A slightly different and perhaps even more unreliable method of measuring teacher effectiveness?
Leonie,
If I am not mistaken (and I haven’t read the article referenced above yet) the “student growth scores” are only one part of the VAM process. Due to the myriad (13) sources of error shown by N. Wilson to completely invalidate the whole standards and standardized testing process, any conclusion drawn is a chimera, un duende, “vain and illusory”, bogus, bubble, a delusion, a fabrication, a fancy, a fata morgana, a figment, a fool’s paradise, an hallucination, an ignis fatuus, an illusion, a mirage, a monster, a monstrosity, a pipe dream, a snare, a specter, a virtual reality.
Notice the antonyms: reality, truth (thanks to thesaurus.com for the synonyms and antonyms).
Utah has chosen a new testing program from AIR at a cost of $39 million: http://www.sltrib.com/sltrib/news/55349773-78/tests-state-system-students.html.csp It is all being done for the Common Core and to better identify the needs of each student. Really?!! I am not happy.
Of course not. It’s used to financially help those in power, because now we have to order more software and computers and stuff. My students tracked that bill last year as it worked its way through the state legislature. Even my 8th graders could see that this would take so much more time and space away from classrooms and actual teaching.
These measures may be worthless as a means of evaluating teachers, but they are an excellent weapon to be used against them, and isn’t that what we’re really talking about?
This junk science is in reality the pretext, using the presumed objectivity of numbers, for a political attack against teachers and their unions, so that edupreneurs can sink their claws even more deeply into the schools, without worrying about those pesky teachers and their unions interfering with their God-given right to profit at the expense of children and society.