Mitchell Robinson, who teaches music at Michigan State University, writes here about the madness of assessing teachers by “value-added” or growth measures, especially when they don’t teach the tested subjects.
State officials listen attentively to the unaccredited National Council on Teacher Quality, which was created by the conservative Thomas B. Fordham Foundation and kept alive by an emergency infusion of $5 million from then-Secretary of Education Rod Paige.
A state official explained why VAM was necessary:
Venessa Keesler, deputy superintendent of accountability services at MDE, said measuring student growth is a “challenging science,” but student growth percentiles represent a “powerful and good” way to tackle the topic. “When you don’t have a pre- and post-test, this is a good way to understand how much a student has progressed,” she said. Under the new law, 25 percent of a teacher’s evaluation will be based on student growth through 2017-18. In 2018-19, the percentage will grow to 40 percent. State standardized tests, where possible, will be used to determine half that growth. In Michigan, state standardized tests – most of which focus on reading and math – touch a minority of teachers. One study estimated that 33 percent of teachers teach in grades and subjects covered by state standardized tests.
What Dr. Keesler doesn’t seem to understand is that the student growth percentiles she refers to are simply another name for value-added measures, or VAM: a statistical method for predicting students’ academic growth that has been thoroughly debunked. Nearly every leading professional organization in education and statistics has issued statements against using these measures to make high-stakes decisions about teacher effectiveness, which is exactly what MDE recommends they be used for in teacher evaluations. The science here is more than “challenging”: it is deeply flawed, invalid, and unreliable, and its claimed usefulness for determining teacher effectiveness rests largely on a single, now-suspect study conducted by a researcher who has been discredited for “masking evidence of bias” in his research agenda.
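For readers unfamiliar with the method under dispute: a student growth percentile is, roughly, a student's percentile rank among "academic peers" who had similar prior test scores. The toy Python sketch below uses invented data and a crude exact-match peer group (real SGP models use quantile regression across many prior-score histories); it illustrates only the basic idea, not MDE's actual implementation.

```python
# Toy illustration of the "student growth percentile" idea: rank each
# student's current score against peers who started from a similar prior
# score. Data and peer-grouping rule are invented for illustration;
# production SGP models use quantile regression, not exact matching.
from bisect import bisect_left

# (prior_score, current_score) pairs for a hypothetical cohort
students = [
    (50, 55), (50, 70), (50, 60),
    (80, 82), (80, 90), (80, 85),
]

def growth_percentile(prior, current, cohort):
    """Percentile rank of `current` among students with the same prior score."""
    peers = sorted(c for p, c in cohort if p == prior)
    rank = bisect_left(peers, current)  # number of peers scoring below
    return round(100 * rank / len(peers))

# A student who went 50 -> 70 outgrew both peers who also started at 50.
print(growth_percentile(50, 70, students))  # 67
```

Even in this simplified form, the method's fragility is visible: the percentile depends entirely on how "peers" are defined and on small score differences within tiny comparison groups, which is part of why statisticians object to using such estimates for high-stakes personnel decisions.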
Dr. Keesler also glosses over the fact that these measures of student growth apply only to math and reading, subjects that account for less than a third of the classes being taught in the schools. If the idea of evaluating, for example, music and art teachers by using math and reading test scores doesn’t make any sense to you, there’s an (awful) explanation: “The idea is that all teachers weave elements of reading and writing into their curriculum. The approach fosters a sense of teamwork, shared goals and the feeling that ‘we’re all in this together,'” said Erich Harmsen, a member of GRPS’ human resources department who focuses on teacher evaluations.
While I’m all for teamwork, this “explanation” is, to be polite, simply a load of hooey. If Mr. Harmsen truly believed in what I’ll call the “transitive property” of teaching and learning, then we would expect to see math and reading teachers evaluated using the results of student learning in music and art. Because what’s good for the goose…right?
The truth is, as any teacher knows, that for an evaluation to be considered valid, the measures must be related to the actual content taught in the teacher’s class. You can’t just wave some magical “we’re all in this together” wand that miraculously converts the stuff taught in band class into wonderful, delicious math data. It just doesn’t work that way, and schools that insist it does are now getting sued for their ignorance.
Why should teacher evaluation be standardized when there is so much messy human, social, and economic intervention in the scores that cannot be controlled or measured?
Robinson disputes the value of standardization:
Teachers work with children, and these children are not standardized.
Teachers work in schools, and these schools exist in communities that are not standardized.
And teachers work with other teachers, custodians, secretaries, administrators, school board members, and other adults–none of whom are standardized.
So why should schools in communities as diverse as the Upper Peninsula and downtown Detroit evaluate their teachers using the same system? And why is the finding that “local assessments can vary among ‘teachers at the same grade, in the same school, teaching the same subjects'” a bad thing?
The thing that we should be valuing in these children, schools and communities is their diversity–the characteristics, talents and interests that make them gloriously different from one another. A school in Escanaba shouldn’t look like a school in Kalamazoo, and the curriculum in each school should be tailored to the community in which it resides. The only parties that benefit from “standardizing” education are the Michigan Department of Education and the testing companies that produce these tests, because standardizing makes their jobs easier. Standardizing teaching and learning doesn’t help students, teachers or schools, so why are we spending so much time and money in a futile attempt to make Pearson and ETS’s jobs easier?