Thomas Good sent me this research paper about teacher evaluation that he wrote with Alyson Lavigne.
Division 15 (Educational Psychology) of the American Psychological Association is proud to announce its second policy brief, “Addressing Teacher Evaluation Appropriately.” This brief, focused on teacher evaluation practices and policies in schools, was written by Alyson Lavigne and Thomas Good. A copy of the brief is attached for you to read and share.
About the Brief: In this policy brief, Lavigne and Good argue that the most commonly used practices to evaluate teachers—statistical approaches to determining student growth, like value-added measures, and the observation of teachers—have not improved teaching and learning in U.S. schools. They have not done so because these approaches are problematic, including their failure to adequately account for context and complexity, and for the fact that teacher effectiveness and practice vary. With these limitations in mind, the authors provide recommendations for policy and practice, including the elimination of high-stakes teacher evaluation and a greater emphasis on formative feedback, giving teachers more voice, and underscoring that improving instruction should be at least as important as evaluating instruction.
Share the Brief! It’s important that our national policy be based on sound evidence. We have attached a copy of the brief so that you may share it directly with your constituents—local policymakers, practitioners, educational organizations, faculty, staff, and students who are engaged in K-12 settings and research. You can also promote this important work on social media, via Twitter or Facebook, using the following link: EdPsych.us/AddressingTeacherEvaluation
If you have any questions about the contents of this brief, please contact Alyson Lavigne (alyson.lavigne@usu.edu). Any questions or ideas for future Division 15 policy briefs should be directed to Sharon Nichols, Chair of Division 15’s Policy and Practice Committee (Sharon.Nichols@utsa.edu). For additional information about research related to problems involved in current teacher evaluation practices, see Lavigne and Good’s recent publication, Enhancing Teacher Education, Development, and Evaluation.
You can read the report here.
Using algorithms to help evaluate teachers is inadequate and inappropriate. Teaching is far too complex to assume that a formula based on false assumptions will result in some type of truth. Since these systems have already been imposed on many teachers, some trends have emerged. Teachers who serve wealthy students are highly rated, and those who serve the poor get labeled as failing teachers. This form of evaluation is biased and useless.
During my career I received numerous formative evaluations from central administrators, building principals, and department chairs. While that system is not perfect, it is far less flawed than a high-stakes evaluation based on an inaccurate algorithm. Since I spent my career working with some of the poorest and neediest ELLs, my value-added scores would likely have gotten me fired or ranked “in need of supervision.” If I had lost my job to a rigged algorithm, I would not have had the chance to see so many of my former students go on to lead successful, independent lives. The success of these students is the most meaningful form of evaluation. Students often returned to visit me years after they were in my class to thank me for helping them through a difficult period in their lives. That is the best form of evaluation!
There are far too many variables in what different types of teachers do to assume their value can be distilled into a number. All statistical approaches to evaluation do is contribute to the churn and burn of disruption.
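To make concrete what the commenters are objecting to, here is a deliberately naive sketch of the core value-added idea, using entirely hypothetical student data and a single prior score. Real VAM systems are far more elaborate, but the basic move is the same: predict each student's current score from prior scores, then credit (or blame) the teacher for the average leftover residual — which is exactly where unmodeled context (poverty, language status, disability) leaks into the "measure" of the teacher.

```python
from collections import defaultdict

# Hypothetical (prior_score, current_score, teacher) records
students = [
    (60, 65, "A"), (70, 74, "A"), (80, 83, "A"),
    (55, 62, "B"), (65, 70, "B"), (75, 78, "B"),
]

# Fit current = a + b * prior by ordinary least squares, by hand
n = len(students)
xs = [s[0] for s in students]
ys = [s[1] for s in students]
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx

# A teacher's "value added" = mean residual (actual - predicted) of their students
residuals = defaultdict(list)
for prior, current, teacher in students:
    residuals[teacher].append(current - (a + b * prior))
vam = {t: sum(r) / len(r) for t, r in residuals.items()}
```

Everything the model fails to include — and the brief argues it can never include enough — ends up inside those residuals and is attributed to the teacher anyway.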
I don’t have the answer to good teacher evaluation methods, but it seems to me that having school administrators do evaluations is flawed. Teaching and administrating require different skill sets. Also, being out of the classroom for an extended period of time causes administrators to lose touch with the craft. Administrators are not likely to be well positioned to give good feedback on teaching because in most cases they were not in the classroom long enough to master the craft themselves and have lost contact with what day-to-day teaching entails.
tultican : “Administrators are not likely to be well positioned to give good feedback on teaching because in most cases they were not in the classroom long enough to master the craft themselves and have lost contact with what day to day teaching entails.”
It sounds like you have had the same administrators that I had. As a music specialist in elementary schools, I’ve taught in many schools in the suburbs of Illinois and I’ve taught overseas in two American schools. Only a few administrators did a good job…either as an evaluator or as a leader in the school.
I agree that administrators have been “out of the classroom for an extended period of time” and that this “causes administrators to lose touch with the craft.” Many were just plain incompetent. Perhaps they go into this field because it pays more than being in the classroom.
There is a reason I tend to call administrators “adminimals*”. What tultican stated can be found in every school building in the country.
*Adminimal: A spineless creature formerly known as an administrator and/or principal who gleefully implements unethical and unjust educational malpractices such as the standards and testing malpractice regime. Adminimals are known by/for their brown-nosing behavior in kissing the arses of those above them in the testucation hierarchy. These sycophantic toadies (not to be confused with cane toads, adminimals are far worse to the environment) are infamous for demanding that those below them in the testucation hierarchy kiss the adminimal’s arse on a daily basis, having the teachers simultaneously telling said adminimals that their arse and its byproducts don’t stink. Adminimals are experts at Eichmanizing their staff through using techniques of fear and compliance inducing mind control. Beware, any interaction with an adminimal will sully one’s soul forever unless one has been properly intellectually vaccinated.
While some administrators may not do the best job in evaluations, at least decisions are made by a human who has some awareness of what competent instruction and classroom management look like. They understand human development, and they can get some idea about whether a teacher is reaching students. Algorithms fail all teachers of poor, disabled, or other students who don’t fit the middle-class mold. No system is perfect.
The challenge with administrators doing evaluations is that they are not experts in the content areas—traditionally most administrators have backgrounds in elementary ed, or in English or history (as opposed to the STEM areas or, as someone mentioned above, music). As with many topics, there is no easy solution to teacher evaluation.
I’m a French teacher. In my almost 30 years of teaching, I’ve been observed and evaluated by administrators who were former social studies teachers, health teachers, science teachers, English teachers, and my favorite of all—gym teachers! Usually they understood not one word of French, so I could have been up there making all kinds of mistakes, giving the wrong information, etc., and they wouldn’t have known. I can’t recall a high school administrator who made what I would consider to be helpful comments about my teaching. I wonder if they often felt out of their league having to observe a French teacher. So I don’t really blame them. It’s a screwed-up system. The best teaching help I ever received was when I was starting my Master’s program in French and had a Teaching Assistantship. My supervisor was a French professor who specialized in pedagogy and linguistics, and she was very helpful. Although I didn’t always agree with her, I did respect her because she was a professional in my field.
Many evaluation rubrics would fail a teacher who lectures. But a good lecture is one of the best modes of teaching, far better than inquiry and the other fashionable techniques that the rubrics favor. We need to fix our ideas of what constitutes good teaching and good education before we can fix our evaluation system.
I didn’t find anything that was remotely new in the article. We have known what the authors state since at least, oh, the last 40 years or so, probably longer. But if the “APA approved” label gets some readers and policymakers to listen, giving it a scientific sheen/validity that it doesn’t warrant, then I guess that is a good thing. Teachers have been saying these things since at least when I started teaching in ’94.
Unfortunately, the authors and the APA are stuck in the pseudo-science of “measuring classroom teaching.” From the article:
“Current observation systems are problematic for high-stakes evaluation. Research indicates that:
a. Classroom teaching is complex, dynamic, and contextual.15 No single observation system measures all aspects of good teaching. One ramification of the selectivity of observation systems is that they measure only how the teacher interacts with the class as a whole and ignore how teachers interact with individual students.”
No single observation system measures anything about the teaching and learning process. As I have posted many times before in regard to standardized testing—and the same holds for believing one can measure a teacher’s effectiveness:
The most misleading concept/term in education is “measuring student achievement” or “measuring student learning” or, for this article, “measuring teacher effectiveness.” The concept has been misleading educators into deluding themselves that the teaching and learning process can be analyzed/assessed using “scientific” methods which are actually pseudo-scientific at best and, at worst, a complete bastardization of rationo-logical thinking and language usage.
There never has been and never will be any “measuring” of the teaching and learning process and what each individual student learns in their schooling. There is and always has been assessing, evaluating, judging of what students learn but never a true “measuring” of it.
But, but, but, you’re trying to tell me that the supposedly august and venerable APA, AERA and/or the NCME have been wrong for more than the last 50 years, disseminating falsehoods and chimeras??
Who are you to question the authorities in testing???
Yes, they have been wrong, and I (and many others—Wilson, Hoffman, etc.) question those authorities and challenge them (or any of you other advocates of the malpractices that are standards and testing) to answer the following onto-epistemological analysis:
The TESTS MEASURE NOTHING, quite literally when you realize what is actually happening with them. Richard Phelps, a staunch standardized test proponent (he has written at least two books defending the standardized testing malpractices) in the introduction to “Correcting Fallacies About Educational and Psychological Testing” unwittingly lets the cat out of the bag with this statement:
“Physical tests, such as those conducted by engineers, can be standardized, of course [why of course of course], but in this volume, we focus on the measurement of latent (i.e., nonobservable) mental, and not physical, traits.” [my addition]
Notice how he is trying to assert by proximity that educational standardized testing and the testing done by engineers are basically the same, in other words, a “truly scientific endeavor.” Asserting sameness by proximity is not a good rhetorical/debating technique.
Since there is no agreement on a standard unit of learning, there is no exemplar of that standard unit and there is no measuring device calibrated against said non-existent standard unit, how is it possible to “measure the nonobservable”?
THE TESTS MEASURE NOTHING for how is it possible to “measure” the nonobservable with a non-existing measuring device that is not calibrated against a non-existing standard unit of learning?????
PURE LOGICAL INSANITY!
The basic fallacy here is the confusion and conflation of metrological measuring (metrology is the scientific study of measurement) with measuring that connotes assessing, evaluating, and judging. The two meanings are not the same, and confusing and conflating them is a very easy way to make it appear that standards and standardized testing are “scientific endeavors”: objective, and not subjective like assessing, evaluating, and judging.
Those supposedly objective results are then used to justify discrimination against many students on the basis of their life circumstances and inherent intellectual traits.
My teaching career is not standard or public– a few yrs in 1970’s teaching all levels of French at a private high school, plus the last 20 yrs as a for-lang enrichment to PreK’s [some Fr, mostly Spanish].
At the high school I was observed & mentored by the for-lang dept head, which was terrific [why isn’t this the normal practice?].
These days my boss (head of the agency supplying enrichments to regional PreK’s) & PreK directors occasionally observe. They are specialists in teaching PreK generally; their input is welcome & helpful. And in the last 5 yrs, PreK Span has been transitioning from a parent-signup pullout class to a regular activity for all (like music, art, phys ed), so classroom teachers often attend: what a bonus! They’re there simply because either I’m in their classroom, or several classes are combined & extra coverage reqd. So they could just tap away on their iPhones or snooze—doesn’t happen! Knowledge of Span or for-lang teaching methods is irrelevant. They’re observing whether their kids are engaged & what they’re learning, & often shoot me pertinent questions or suggestions as the kids are leaving.
In both cases my experience is that sharing info with observing supervisors or teachers in the field is highly beneficial, & perhaps essential. It should be the backbone of teacher development [and, probably, curriculum devpt as well]. “Evaluation” and “accountability” are terms/ concepts that have gotten the tail wagging the dog.
Indiana once again proves the uselessness of standardized testing. Legislators keep trying to find the ‘perfect test.’ The newest one, ILEARN, totally bombed: most schools would have received a D or an F. Legislators, in all their brilliance, passed exemptions from the 2019 scores if a school’s 2019 score was lower than its 2018 score.
……………………………………………..
Search for your Indiana school’s 2019 A-F grade
Indiana released 2019 A-F grades for schools and districts Wednesday in a quiet rollout that points to the diminished meaning of the measure.
Last month, lawmakers passed a two-year “hold harmless” provision to protect schools and teachers from the negative consequences of low ILEARN scores. Scores dropped to a new low in the first year of the new state standardized exam.
The hold harmless blocks a school’s grade from falling by allowing it to be calculated using 2018 scores if those scores were higher than the current year. That helped most schools in the state: Only 120 schools scored well enough in 2019 to see their grades increase, a Chalkbeat analysis found.
As widely expected, the exemption means 2019 A-F grades look a little rosier: Most schools — 73% — received an A or B. Fewer schools received an F compared to the year before.
Without the exemption, state officials said most schools would have received a D or an F. That would have affected teachers’ evaluations, and therefore pay, and put many schools on the path to state intervention…
https://chalkbeat.org/posts/in/2020/03/04/search-for-your-indiana-schools-2019-a-f-grade/
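The hold-harmless rule described in the excerpt is simple enough to sketch in a few lines. The score-to-letter cutoffs below are hypothetical (Indiana's actual A-F formula is more involved); the point is only the mechanism: a school's grade is computed from whichever year's score is higher, so a grade can rise but never fall.

```python
def letter_grade(score):
    """Hypothetical score-to-letter cutoffs, for illustration only."""
    for cutoff, grade in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if score >= cutoff:
            return grade
    return "F"

def held_harmless_grade(score_2018, score_2019):
    # Use the 2018 score if it was higher than the 2019 score
    return letter_grade(max(score_2018, score_2019))

# A school whose score dropped from 85 to 58 keeps its 2018-based B
print(held_harmless_grade(85, 58))  # B
# A school that improved is graded on its higher 2019 score
print(held_harmless_grade(58, 85))  # B
```

Under this rule, only schools whose 2019 score crossed a cutoff upward could see their grade change — which matches the Chalkbeat finding that just 120 schools saw an increase.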