A study commissioned by school leaders in New York’s Lower Hudson Valley reviewed the state’s teacher evaluation system and concluded that it was irreparably flawed.
“The study, released Friday, found that the state formula for calculating evaluations forces school districts to inflate classroom-observation ratings so teachers do not get poor overall scores.
“If districts were to give more accurate grades to teachers after classroom visits, the study found, many teachers would “unjustly” receive overall ratings of “developing” or “ineffective.” Such districts would “end up looking like they have an underperforming workforce,” the report said.
“This is not something that can be fixed; the state Education Department needs to start over,” said Louis Wool, Harrison schools superintendent, who was president of the Lower Hudson Council of School Superintendents when the group commissioned the study last year.
“The study reviewed 2012-13 evaluation results for 1,400 teachers in 32 districts in Westchester, Rockland, Putnam and Dutchess counties. The superintendents group provided the data to Education Analytics, a non-profit organization in Madison, Wisconsin, which did the study.
“Researchers credited New York state with improving its methods of measuring teacher effectiveness. In fact, the report called New York “a pioneer” in developing a modern evaluation system. But researchers said there are few examples nationally of effective implementation and that strong use of data may not necessarily translate into good policy.”
The state apparently wants a system that gives many teachers low scores so they can be fired; schools and districts want to retain their decision-making power over which teachers should be kept or terminated. The state is trying to take that authority away from schools and districts by creating a mechanical formula. The formula doesn’t work, and no such formula works anywhere in the country. The biggest problem in teaching today is recruiting, supporting, and retaining good teachers, not finding and firing bad ones. Any administrator worth her salt knows how to do the firing part.
The state should not start over. The state should get out of the way.
Why doesn’t this lengthy, detailed study contain a full description of the demographic characteristics and home districts of the students whose test scores were analyzed? There is only a passing reference to a mere 4% of the sample being ELLs and 12% being SWDs, which leads me to believe the data set was not taken from every district in the consortium.
“. . . and concluded that it was irreparably flawed.”
By definition, if any part of a teacher’s evaluation is based on student test scores, then the evaluation is COMPLETELY INVALID. Wilson has shown the COMPLETE INVALIDITY of the educational standards and standardized testing educational malpractices in his 1997 dissertation. Base any of the evaluation on COMPLETE INVALIDITIES and one ends up with COMPLETE INVALIDITIES. Or, in more mundane terms, shit in, shit out!
To understand why, read and comprehend his never refuted nor rebutted “Educational Standards and the Problem of Error” found at: http://epaa.asu.edu/ojs/article/view/577/700
Brief outline of Wilson’s “Educational Standards and the Problem of Error” and some comments of mine. (updated 6/24/13 per Wilson email)
1. A description of a quality can only be partially quantified. Quantity is almost always a very small aspect of quality. It is illogical to judge/assess a whole category only by a part of the whole. The assessment is, by definition, lacking in the sense that “assessments are always of multidimensional qualities. To quantify them as unidimensional quantities (numbers or grades) is to perpetuate a fundamental logical error” (per Wilson). The teaching and learning process falls in the logical realm of aesthetics/qualities of human interactions. In attempting to quantify educational standards and standardized testing the descriptive information about said interactions is inadequate, insufficient and inferior to the point of invalidity and unacceptability.
2. A major epistemological mistake is that we attach, with great importance, the “score” of the student, not only onto the student but also, by extension, the teacher, school and district. Any description of a testing event is only a description of an interaction, that of the student and the testing device at a given time and place. The only correct logical thing that we can attempt to do is to describe that interaction (how accurately or not is a whole other story). That description cannot, by logical thought, be “assigned/attached” to the student as it cannot be a description of the student but the interaction. And this error is probably one of the most egregious “errors” that occur with standardized testing (and even the “grading” of students by a teacher).
3. Wilson identifies four “frames of reference,” each with distinct assumptions (epistemological basis) about the assessment process from which the “assessor” views the interactions of the teaching and learning process: the Judge (think of a college professor who “knows” the students’ capabilities and grades them accordingly); the General Frame (think of standardized testing that claims to have a “scientific” basis); the Specific Frame (think of learning by objective, like computer-based learning, getting a correct answer before moving on to the next screen); and the Responsive Frame (think of an apprenticeship in a trade or a medical residency program where the learner interacts with the “teacher” with constant feedback). Each category has its own sources of error, and more error in the process is caused when the assessor confuses and conflates the categories.
4. Wilson elucidates the notion of “error”: “Error is predicated on a notion of perfection; to allocate error is to imply what is without error; to know error it is necessary to determine what is true. And what is true is determined by what we define as true, theoretically by the assumptions of our epistemology, practically by the events and non-events, the discourses and silences, the world of surfaces and their interactions and interpretations; in short, the practices that permeate the field. . . Error is the uncertainty dimension of the statement; error is the band within which chaos reigns, in which anything can happen. Error comprises all of those eventful circumstances which make the assessment statement less than perfectly precise, the measure less than perfectly accurate, the rank order less than perfectly stable, the standard and its measurement less than absolute, and the communication of its truth less than impeccable.”
In other words, all the logical errors involved in the process render any conclusions invalid.
5. The test makers/psychometricians, through all sorts of mathematical machinations, attempt to “prove” that these tests (based on standards) are valid, that is, errorless or supposedly at least with minimal error [they aren’t]. Wilson turns the concept of validity on its head and focuses on just how invalid the machinations and the test and results are. He is an advocate for the test taker, not the test maker. In doing so he identifies thirteen sources of “error,” any one of which renders the test making/giving/disseminating of results invalid. And a basic logical premise is that once something is shown to be invalid it is just that, invalid, and no amount of “fudging” by the psychometricians/test makers can alleviate that invalidity.
6. Having shown the invalidity, and therefore the unreliability, of the whole process, Wilson concludes, rightly so, that any result/information gleaned from the process is “vain and illusory.” In other words, start with an invalidity, end with an invalidity (except that by sheer chance every once in a while, like a blind and anosmic squirrel who finds the occasional acorn, a result may be “true”) or, to put it in more mundane terms, crap in, crap out.
7. And so what does this all mean? I’ll let Wilson have the second to last word: “So what does a test measure in our world? It measures what the person with the power to pay for the test says it measures. And the person who sets the test will name the test what the person who pays for the test wants the test to be named.”
In other words, it attempts to measure “‘something’ and we can specify some of the ‘errors’ in that ‘something’ but still don’t know [precisely] what the ‘something’ is.” The whole process harms many students, as the social rewards for some are not available to others who “don’t make the grade (sic).” Should American public education have the function of sorting and separating students so that some may receive greater benefits than others, especially considering that the sorting and separating devices, educational standards and standardized testing, are so flawed not only in concept but in execution?
My answer is NO!!!!!
One final note with Wilson channeling Foucault and his concept of subjectivization:
“So the mark [grade/test score] becomes part of the story about yourself and with sufficient repetitions becomes true: true because those who know, those in authority, say it is true; true because the society in which you live legitimates this authority; true because your cultural habitus makes it difficult for you to perceive, conceive and integrate those aspects of your experience that contradict the story; true because in acting out your story, which now includes the mark and its meaning, the social truth that created it is confirmed; true because if your mark is high you are consistently rewarded, so that your voice becomes a voice of authority in the power-knowledge discourses that reproduce the structure that helped to produce you; true because if your mark is low your voice becomes muted and confirms your lower position in the social hierarchy; true finally because that success or failure confirms that mark that implicitly predicted the now self evident consequences. And so the circle is complete.”
In other words, students “internalize” what those “marks” (grades/test scores) mean, and since the vast majority of the students have not developed the mental skills to counteract what the “authorities” say, they accept as “natural and normal” that “story/description” of them. Although paradoxical in a sense, the “I’m an ‘A’ student” is almost as harmful as “I’m an ‘F’ student” in hindering students from becoming independent, critical and free thinkers. And having independent, critical and free thinkers is a threat to the current socio-economic structure of society.
By Duane E. Swacker
In the ’60s and early ’70s, while computers were becoming more prevalent, the term was GIGO: Garbage In, Garbage Out. With the “miracle” computer, people were under the assumption that the computer was always correct. The biggest error was made by entering false or invalid data (Garbage) and not recognizing the error.
Today, not only are we entering Garbage, but the Program (the formulas) is flawed as well, which doubly invalidates the results (if such a thing were possible)!
The problem is that Cuomo is already talking about changing the evaluations to require a HIGHER percentage of the “test scores” (Garbage) and lowering the percentage of local evaluation. In other words, he wants more Teachers failing evaluation!
“Beam me up, Arne…On second thought, cancel that”
“Value In Garbage Out (VIGO)”
Star Trek transporter run amuck.
Ed reform is scramble stuck
Humans in and energize
Garbage out that’s full of flies
State takeovers are not working…
http://www.indystar.com/story/news/education/2014/10/06/school-turnaround-efforts-take-new-twist/16678403/
The good news for teachers is that it seems that administrators are unwilling to fire large numbers of teachers no matter what the evaluation system is.
Maybe the best administrators realize what the public does not – there is no need to fire large numbers of teachers in spite of the evaluation system.
Congratulations to New York for its amazing progress with the terrible teacher effectiveness rating system! As Thomas Edison would say, “Now we have learned one more way not to rate a teacher!” Keep up the good work!
Reblogged this on peakmemory and commented:
“The biggest problem in teaching today is recruiting, supporting, and retaining good teachers, not finding and firing bad ones.”
The state is interpreting and implementing federal policy: deeply flawed, reductive classifications of teachers based on unreliable, invalid, one-size-fits-all instruments. The evaluation system is rigged to produce a portrait of at least 50% of teachers as incompetent or marginally competent on the HEDI scale.
“The biggest problem in teaching today is recruiting, supporting, and retaining good teachers, not finding and firing bad ones. Any administrator worth her salt knows how to do the firing part.
“The state should not start over. The state should get out of the way.”
Refreshing to read. Thank you, Diane.
I have 2 children. One is very bright and never put forth much effort in school, and still earned decent grades, went on to college, graduated, and is working full-time and living on his own soon after he graduated. The second child is also smart and puts a great deal of herself into her work (we can’t take blame or credit for either child — just that we supported them with books and some normal middle class opportunities). Should the teachers be blamed or credited with my kids’ achievement? I think not.
My colleague and I recently spoke to the perils of applying physical systems principles to human systems (e.g., teacher evaluation) in a commentary published in Teachers College Record:
Why Applying Engineering Design Principles and Methods to Improve Teacher Evaluation Systems May Not Produce Desired or Expected Results: A Response to Kersting’s Engineering Teacher Evaluation Commentary
by Amanda Bozack & Amy Thompson — 2014
Recently Nicole Kersting wrote a commentary suggesting that teacher evaluation design could benefit from being modeled after engineering design principles. She indicated that a systematic approach to design, coupled with continuous improvements, could make systems stronger and less political. This response addresses several of her assumptions and argues that an engineering approach to improving teacher evaluation systems may not produce desired or expected results.
This is truly a double-edged sword for me. I used to believe that a teacher’s review should include the scores of their children; after all, that is supposed to be a measure of what was taught. If you were hired to make widgets and less than 40% passed quality control, you would be out of a job, so why should it be different for a teacher?
There are two other ways of looking at this. On one hand, you could argue that a child who gets straight “A’s” is not challenged to their full ability. On the other, you could say that the score, based on what the test was designed to measure, is a reflection of how much the student retained or understood.
Looking at the latter, retained doesn’t mean understood. You might equate this to a parrot learning to say, “Hello.” Additionally, understood doesn’t mean learned. That is, the student might have had full comprehension of the topic before the teacher began the class. That, in turn, would mean that any score by the child is not a reflection of the teacher’s teaching abilities or effectiveness; rather, a good grade is more likely to be an indicator that the student is not being challenged to their fullest abilities. That would leave you wondering whether you have an underachiever, a sandbagger, or a wrongly placed student.
So maybe we should look to Europe. Some countries there include some or all of these: student scores, teachers’ education, teachers’ post-graduation updates, principal/administration review, as well as both parental and student evaluation of the teach…
Yes, let’s look to Europe. Our friends from Finland, who, inspired by the best of what we already have here in the U.S.A., completely turned around their school system:
http://pasisahlberg.com/five-u-s-innovations-that-helped-finlands-schools-improve-but-that-american-reformers-now-ignore/
Let’s also listen to words of warning from our friends in China, who warn us, from experience, that we have lost course:
http://zhaolearning.com/2014/09/13/fatal-attraction-americas-suicidal-quest-for-educational-excellence/
It is time to take back teaching as a profession.