NASSP: The Danger of Misguided Evaluations

It is rare to see a high-ranking leader of a major association speak hard truths to power. For her courage and candor, Joann Bartoletti joins the honor roll as a champion of public education.

In the March 2013 issue of NASSP’s “News Leader,” Bartoletti, the executive director of the National Association of Secondary School Principals, blasted the new teacher evaluation systems that were foisted on the nation’s schools by Race to the Top and its highly prescriptive waivers.

She notes that these dubious, non-evidence-based evaluation systems are coming into use at the very time that the Common Core is being implemented. Common Core–untested, never validated, whose consequences are unknown, arriving with not enough time or money for implementation or adequate technology for the computer-based testing–is widely expected to cause test scores to fall. It would be hard, she writes, to “come up with a better plan to discredit and dismantle public education.”

What motives should one attribute to policymakers who wreak havoc on their nation’s public schools and who blithely ignore all warning signs? Bartoletti won’t speculate.

Malice or stupidity? You decide.

She writes:

• A perfect storm is brewing, and it will wreak havoc on the collaborative cultures that principals have worked so hard to build. New teacher evaluation systems have begun making their way into schools, and over the next three years, more than half of states will change the way they assess teachers’ effectiveness. The revised systems come as the result of Race to the Top and NCLB waivers. To be eligible for either, states had to commit to developing new teacher evaluation systems that use student test scores to determine a “significant proportion” of a teacher’s effectiveness. In a January survey of NASSP and NAESP members, nearly half of respondents indicated that 30% or more of their teacher evaluations are now tied to student achievement.

There is no research supporting the use of that kind of percentage, and even if the research recommended it, states don’t have data systems sophisticated enough to do value-added measurement (VAM) well. Still, the test-score proportion on evaluations will increase at a time when we predict that test scores will decrease.

These evaluation systems will be put in place just as the Common Core State Standards assessments roll out in 2014. This volatile combination could encourage many teachers and principals to leave the profession or at least plan their exit strategies. I don’t want to attribute a malicious intent to anyone, but if policymakers were going to come up with a plan to discredit and dismantle public education, it’s hard to think of a more effective one.

Identity Crisis?

One of the most troubling issues, as Jim Popham describes in this month’s Principal Leadership, is that the overhauled evaluations are being designed to serve dual purposes.

Principals want to believe that the evaluations are formative and are inclined to give constructive feedback to teachers to help them improve their instructional practice. Lawmakers, on the other hand, see the evaluations as being summative: a way to identify weaknesses and fire ineffective teachers. Principals are caught in the middle: they want to offer frank feedback but are all too aware that any criticism is a black mark that can be used to deny a teacher’s contract renewal or tenure. In this case, killing two birds with one stone (when those birds have about as much in common as a penguin and a pigeon) is extraordinarily ineffective.

And so, principals tread lightly. Although the days when 99% of tenured teachers earned “satisfactory” ratings are long gone, emerging data shows that even with the new evaluations in place, the majority of teachers are still being deemed “effective.” Education Week noted in a February 5 article that at least 9 out of 10 teachers in Michigan, Tennessee, and Georgia received positive reviews under the new measurements.

With little difference in outcomes, it’s hard to justify the extensive training and time commitment that the new systems demand. In some districts in Rhode Island, a popular off-the-shelf model requires principals to view 60 hours of video training and pass a test before administering the evaluation tool. If they fail, they’ll have to wait three months to take it again. Other states are developing their own systems that dramatically increase the hours spent assessing teachers.

Tennessee principal and NASSP board member Troy Kilzer devotes nearly six hours to a single teacher’s evaluation, not counting the time spent observing that teacher in the classroom. This figure is similar to the respondents’ answers in the NASSP survey. Almost all (92%) said they spend anywhere from 6 to 31 or more hours evaluating each teacher.

These evaluations are simply trying to accomplish too much. What’s even worse, principals must apply them across the board: 66% of the survey respondents are required to use one instrument for all teachers and staff, including those in non-tested subjects. School nurses, athletic directors, and school psychologists are expected to be assessed with the same tools. Since when can a nurse’s capacity for empathy be measured by a student’s ability to factor polynomials?

High Anxiety

Although only some states have fully implemented the new models, exhausted teachers are showing signs of wear. The “teach-to-the-test” frenzy is compounded by the fact that their evaluations will rely on scores over which teachers have limited control. NASSP’s Breaking Ranks tells about the importance of a positive culture, yet the atmosphere that the new evaluation systems create is anything but positive.

Shawn DeRose, an assistant principal in Virginia, said that since the implementation of his state’s new evaluation system this past fall, many teachers in his school have indicated that they feel additional stress. It’s no wonder. Fifth-grade teacher Sarah Wysocki was fired at the end of her second year with the DC Public Schools because her students didn’t reach their expected growth rate in reading and math under the city’s new value-added model. Never mind that she received positive ratings in her observations and was encouraged to share her engaging teaching methods with other district educators. This is hardly an isolated event.

The anxiety levels raise an even more acute challenge for principals in urban, high-poverty schools. No teacher wants to teach in a school with a traditionally low-performing population. Add test scores as a part of their evaluation, and it now becomes impossible to recruit teachers for high-needs schools. But regardless of a teacher’s placement, the onus is still on principals to ensure that evaluations are fair and meaningful, and that they improve teachers’ capacity to enhance student learning.

NASSP is regularly delivering this message to Congress and the Department of Education. In meetings with Assistant Secretary for Elementary and Secondary Education Deb Delisle, I’ve shared NASSP’s recommendations and have reinforced that teacher evaluations should serve their intended purpose: to help teachers improve their instructional practice. NASSP is making it glaringly clear to policymakers that if they want to push out ineffective teachers, there are other ways to go about it. Throwing the entire profession into a tailspin is not only ineffective and misguided, but it’s a poor way to play the long game as well.

A.S.Neill says:

March 12, 2013 at 7:18 am

How to kill a long honored profession. And good luck finding replacements even with the Great Recession and TFA. I predict the next crisis in education will be teacher retention. “Malice or stupidity?” Maybe we should add greed and ideology.

John Young says:

March 12, 2013 at 7:48 am

Reblogged this on Transparent Christina and commented:
The principals apparently think DPASS-II is awful too. Surprised?

Not.

thenextlevel2000 says:

March 12, 2013 at 9:44 am

When I first read the statute in our state on the new teacher evaluation, I thought to myself, “If you really want to destroy education as we know it, this is what you would do.”

I am torn between admiring the administrators who optimistically hold their heads up and carry on with a “can do” attitude and being angry at those same administrators for not banding together to uniformly oppose what they know will hurt children in their district.

I am positive that the governors who do this understand that our children are only collateral damage as they line the pockets of the corporate benefactors of this change that will simply replace public education with the corporate version of the same exact thing.

NM Teacher says:

March 12, 2013 at 10:15 am

So many of the best teachers and principals are certainly “planning their exit strategies”. We talk about it all the time. Many are going to teach overseas, finding new professions, retiring the moment they’re eligible, or figuring out where they can hide for a few years until this craziness blows over. Incredibly stressful on the very best teachers and principals, and yet no one with any power wants to listen, they say we’re just “married to the status quo”. If they only knew.

Carrie says:

March 12, 2013 at 10:19 am

The ed tech entrepreneurs know that salaries for teachers, social workers, nurses and others make up the large part of a school district budget. They must engineer huge layoffs in order to free up the millions they want for tablets delivering curriculum and tests to students sitting in large classes.

DC teacher says:

March 12, 2013 at 12:35 pm

For three years, the DC Public Schools IMPACT teacher evaluation system counted test scores as 55% of teacher evaluations for fourth grade and up. 50% was based on the class the teacher taught, and 5% was based on the school-wide scores. It was lowered beginning in SY2012-2013 because this seemed like “too much.” Yes. Three years later and day after day of stress for three years. And talk about focus on tests. This was/is absolute madness. Additionally, it is sad to say that principals do not always use their “power” as evaluators to provide support for teachers under stress. In fact, it does happen that a teacher who “questions too much” or “fights for change of any kind” is given low scores that have nothing to do with their teaching ability, but rather because of their advocacy.

Robert Valiant says:

March 12, 2013 at 12:55 pm

Now if NASSP can find local principals who will speak up we might be getting someplace.

Robert Johnson says:

March 12, 2013 at 5:53 pm

I’ll speak up. I, too, have watched the 60 hours of Teachscape (Danielson) videos and passed the proficiency exam. And even though we’ve been very collaborative and open with our staff, there is still apprehension on teachers’ part about the final evaluation and whether or not they’re classified as distinguished, proficient, basic, or unsatisfactory. I’m hopeful (naively optimistic?) that it will result in improved teaching and ultimately learning, but I haven’t yet marked someone as “basic” or “unsatisfactory” (new tool doesn’t take effect until next year) so we’ll see if the union takes the same approach about a willingness to help someone grow and improve or if they’ll just say “he’s out to get them.” I’ll reply again next year and let you know.

- ME says:
  
  March 13, 2013 at 2:17 am
  
  Although appreciative Robert, you are but one.
  
James Gale says:

March 12, 2013 at 2:45 pm

My district is dealing with an evaluation overhaul right now. Teachers are required to provide 17 different categories of evidence to prove their own effectiveness. It is offensive because it clearly demonstrates a lack of trust on the part of administration. As Pasi Sahlberg says, “accountability is what is left when responsibility goes out the window.” Also, it is not differentiated for inexperienced teachers or folks on a teacher contract who don’t regularly work with students. What this overhaul amounts to is an extremely complicated, expensive and time-consuming way to do exactly what has been done for decades. Teachers are not evaluated for competence; they are evaluated for following directions. This system is still susceptible to every bit as much bias and irrelevance as the prior system; it just requires a great deal more time, effort, and resources. What we need are experienced, trusted administrators to deliver subjective, yet trustworthy performance evaluations. If administrators are trusted master teachers, and all teachers have the qualifications for the job, then an evaluation simply determines whether or not a teacher is doing his or her job as a responsible public servant. If everybody is qualified, teachers and administrators alike, there is no need for value-added anything, and accountability is assumed, not proven or disproven by a super-evaluation.

Tom McMorran, Ed.D. says:

March 12, 2013 at 3:01 pm

It is time to school our politicians about CCSS and High-Stakes testing.
Here is a day in course level 101.

Tom McMorran
2012 High School Principal of the Year NASSP

Philosophy 101:

In order for an argument to carry weight and cause one not only (1) to believe it, but also (2) to take action based on that belief, the argument must have warrant. There is nothing subtle here. The weakest form of argument is some version of “I am in power and I say so…” Or, in any teen’s mother’s words: “Because I am the parent!”

When the person presenting the argument relies on some authority to shore up his/her argument, then we have a duty to test the reliability of the authority. In philosophy or rhetoric or simply argumentation this is known as an appeal to authority.

Last week Gina, Mary Ann, and I attended another workshop at the Connecticut Association of Schools (CAS). This is the body that is, in theory, an institution that is independent from the State Department of Education. The presentation was made by Dr. Diane Ulman, who is the Chief Talent Officer at the DOE. She was appointed by Commissioner Pryor.

As part of her presentation, Dr. Ulman reminded us that the Governor’s Council, The Gates Foundation, a range of other foundations and 46 states have signed on to CCSS. In other words, she offered an appeal to authority. Now, for an appeal to authority to work, credentials must be established. And any group that has a personal, financial interest in public policy must make their bias known. So, let’s ask a very basic question: Where’s the money? For Pearson, Houghton Mifflin, and other publishing companies the prospects are enormous. Smarter Balance, the private, for-profit company received half a billion federal dollars to develop the next generation of assessments, which will replace the CMT and CAPT and be administered in about 26 states. You may recall the President’s State of the Union Address; he all but bragged about the 4.3 billion for Race to the Top (RTTT) funding, and how it was amazingly inexpensive for the Federal government to get these 46 cash-strapped states to sign on.

So, when you hear the proponents of the Common Core State Standards and High-Stakes Testing appeal to authority, you have a duty to weigh the degree to which the authority has sufficient warrant to be believed. Here, let me try it: Elvis is still alive. Evidence? 50 million Elvis fans cannot be wrong.

Statistics 101:

Before meaningful inferences can be drawn from any data set, the researcher has a duty to ensure that the social phenomenon under consideration has not been conflated with other factors. In other words, if you want to give a test that measures the contributions of a teacher to a student’s growth, you must account for and guard against any other factor that might conflate with the primary inquiry. It works like this:

1. We want to know if the teacher’s skill as a reading teacher leads to observable reading skills in her/his students.
2. Therefore, if we give all students the same reading assessment, we should be able to conduct a comparison between teacher A’s students and teacher B’s students.
3. From that comparison we can tell if one teacher is better than another at teaching reading.

So, what’s wrong with that?
A. If the assessment was designed to measure student performance, it can only be used for teacher evaluation by an act of hopeful extension. If the assessment had been designed to measure teacher performance, then it could only be used to measure student performance indirectly.
B. In order for teacher A to be compared with teacher B, the context for all potentially confounding factors for the experiment must be the same. In other words, the only factor that can be measured is, in this case, reading.

But wait, Tienken, Lynch, Turnamian, and Tramaglini have something to say about this in “Use of Community Wealth Demographics to Predict Statewide Test Results in Grades 6 & 7.”

Here’s the very short version: If you tell these researchers three out-of-school demographic variables, then they can tell you a New Jersey school system’s 6th-grade Language Arts scores on the New Jersey Assessment of Skills and Knowledge for grade 6 (NJASK6). Tell them (a) the percentage of lone-parent households in the community, (b) the percentage of people with advanced degrees, and (c) the percentage of people without a high school diploma, and they can plug those data points into a formula that will predict the scores within an acceptable range.

If confounding factors such as a town’s wealth are predictors of performance, then how can we use a reading assessment designed to measure a student’s performance in order to decide whether or not a teacher has effectively taught the skills or knowledge measured by the test?

Here is another wee complication: In New York the APPR rating system that is a year ahead of Connecticut’s uses a growth over time model, which sounds great. But, if you are the unlucky teacher who earned the highest rating in your first year and then for some reason you “slipped” to proficient in your second year, you have not shown growth over time, have you?

Economics 101:

The foundation of the CCSS argument has been negative comparisons between international assessments of 15 year olds in which Americans appear to come out near the middle of the testing range. The argument runs like this: The future economy needs 21st Century Skills. Other countries are out-scoring us, therefore the strength of our economy is threatened over the next few decades.

But, if we recall our faculty reading of Yong Zhao’s Catching Up, or Leading the Way, we recall that there is an inverse relationship between performance on a standardized international assessment and productivity over time. Yes, that’s right. The same group of 14-year-olds who came in dead last in the First International Math Study (FIMS) is now the group of 60-somethings who control the American economy, which is still rated among the top three most productive economies according to the World Economic Forum.

So, to make the international comparisons look bad, the proponents of this argument have to place the USA into a comparison with the 58 countries for which there is competitive data. Yikes, it looks like the mid-21st century will be dominated by Bulgaria; didn’t see that coming, but that’s what the tests show. If, on the other hand, one compares the US to the G-20 or G-7 Economies, the negative comparisons cease to be statistically valid.

Also, let’s just pause for a minute here and consider the PISA study of 15 year olds. You have to be 15 to take the test. So, if an American kid averages 170 days of school attendance a year, and among those days are mid-years, finals, and field trips, then let’s say there is a good chance for 140 days of instruction. But Asian countries regularly offer up to 240 days of school, so let’s knock off twenty and call it 220. Should an American student be able to compete with his/her counterparts in math? Well, actually, even on the much-vaunted PISA fully one out of four students performing at level five, the highest level, is an American.

So, if we follow the scores-to-economics argument, we would be likely to engage in behaviors that promote success on a test, but this will lead to lower creativity and productivity in the adult world!

Sociology 101:

Campbell’s Law: 1975 “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social process it was intended to monitor.”

Here is what Nichols and Berliner have to say at the end of a comprehensive examination of NCLB and high-stakes testing: “We are going to do something unheard of in the history of academic research. In this concluding chapter, we are not going to call for more research. There is absolutely no need for new research on high-stakes testing! Sufficient evidence to declare that high-stakes testing does not work already exists.” (2006, Collateral Damage, p. 175).

CONCLUSION:

1. I am NOT saying that we should have no standards. I am not saying that a standards-based curriculum is a bad thing; in fact, I am in favor of it.
2. I am NOT saying that we shouldn’t desire excellence for all students. I am not saying that all students shouldn’t be able to have meaningful adult lives.
3. I am NOT saying that teachers shouldn’t link their performance to student achievement. I am not saying that we should avoid standardized assessments.

I AM SAYING that the worn-out application of so-called hard-nosed business practices (which I do not believe business men or women apply to their own concerns) has no place in a school. I AM SAYING that there is a better way, and it is for all of us educators to embrace our responsibilities as professionals and act from Informed Professional Judgment. I AM SAYING that we can either define ourselves or accept the so-called reform that is happening to us.

It might be that we have to acknowledge and optimistically embrace the following proposition: The High School Structure that has served us so well is not broken; it is obsolete, and it is time for us to transform it!

Tom

paula says:

March 12, 2013 at 3:04 pm

Holding any teacher responsible for test scores is stupidity with social promotion. This is what happens when people who are not educators and care nothing about public education get control of our schools. These types of inane decisions will happen and we are all the worse for it. Scaring the public into irrational reforms that have nothing to do with increasing student achievement is the way. Educators have to direct educational reform or it’s meaningless. Just look at the mess with Los Angeles Unified. Every has-been with money is trying to influence the way schools reform and operate. Scary.

kafkateach says:

March 12, 2013 at 3:32 pm

The new and improved teacher evaluations in my district have proven to be nonexistent. It’s March 12th 2013 and we still have yet to receive evaluations and our VAMs for the 2011-12 school year. The state, the district, and the union have been tossing around the stinking pile of value added bogosity like a hot potato. Nobody wants to accept responsibility for the data. Millions of public school dollars have been wasted on designing an evaluation system that is so flawed, cumbersome, and complicated it can’t even be used. You can read more about my quest for VAM here http://kafkateach.wordpress.com/2013/02/07/the-quest-for-vam.

dianerav says:

March 13, 2013 at 10:30 am

Please tell Bill Gates about your fruitless quest to learn your VAM score.

scfeeney says:

March 12, 2013 at 4:32 pm

There are principals who have spoken against some of these destructive efforts! Check out http://www.newyorkprincipals.org.

Brian Ford says:

March 13, 2013 at 7:13 am

Well, it depends on the individual. Some are malicious, some are stupid, some are ignorant, some are economists, some are just greedy, some are smart greedy people who use stupid people and get them to go along, some are self-delusional and see themselves as Messiahs, some are in denial and one is President.

Why is Barack Obama doing this?

Lisa Myers says:

March 13, 2013 at 10:04 am

Never a better time for The Network for Public Education. What corp-ed-reformers have is a unified effort against a common enemy – us. It’s time we do the same. Just for fun, here’s a story told in epic style describing our current dilemma and our hope.

In the Land of Education, the people quaked in fear as rumors of their impending doom spread throughout the kingdom. It was only a matter of time before neighboring warriors descended upon them, making them low-cognitive servants in their own land. Surely the gods, those responsible for structuring academic goals, were fashioning someone to come to the people’s rescue, someone who harbored no fear, a leader eloquent of tongue and heroic in deeds who could slay this elusive monster devouring the students of the kingdom.

[enter Michelle Rhee and fellow ed-reform warriors]

Hear ye, hear ye, my fellow citizens. I have traveled long (2 years) and hard (1 grade level) through the halls of education and return to you today with lips of truth. If we are to survive as a worthy kingdom of scholars, we must slay those who are enslaving our young to the monster of mediocrity. We must purge our kingdom of ineffective teachers, and to do so, all teachers must be subject to trial by fire!

see more at http://lisamyers.org for Network for Public Education to the rescue.

Michael McDermott says:

March 13, 2013 at 2:05 pm

It’s why for those of us on the cusp of retirement it becomes all too easy to confuse APPR with AARP!

Gil Gallant says:

March 14, 2013 at 12:47 pm

The biggest problem with education is there are way too many people making decisions about a topic they have very little understanding of. The Common Core of State Standards was developed by a bunch of business leaders and professors who don’t teach at the elementary level. How can they possibly know what is developmentally appropriate for a Kindergartener or a first grader to know or understand? Also, colleges and high schools are the places that should change. They still don’t differentiate instruction as elementary schools have been doing all along. The stand-and-lecture format of high school and college is why US graduates don’t know how to think for themselves.

S Nix says:

March 15, 2013 at 2:44 pm

Two comments: When we are trying to balance the budget, this continued funding of unnecessary programs does nothing to advance that goal; and too bad there is no extensive evaluation of the office of POTUS of the same kind foisted on the nation’s hard-working teachers and administrators!

searcherb says:

March 23, 2013 at 12:45 pm

One point – Educators could press their leaders to scale back on use of test scores and still meet the RTTT-ESEA Waiver requirements. It is not “test scores” and not “significant proportion” that is required – it’s that “student growth” must be a “significant factor.” Test scores must be part of “student growth” in ESEA tested grades and subjects (TGS), but many other measures can also be part of “student growth,” even for the TGS. Many of the state plans go way beyond what is required. USED has never said what amount of “test score” measures must be in the mix of overall “student growth” for it to equate to a “significant factor.”

Diane Ravitch's Blog
