Vincent Marsala, a National Board Certified Teacher in Ohio, explains what our politicians don’t understand: merit pay and stack ranking don’t work. They don’t work in business and they don’t work in schools.
He writes:
First, teachers will be forced to compete against each other based on student test scores. Eventually, teachers may resent having a special-needs or low-performing child in class because a student’s inability to do well on tests will reflect poorly on a teacher. Add the idea of merit pay based on test scores/evaluations, and teachers may resent these students even more. Next, when teachers work together, kids win, but teachers, just like the workers at Microsoft, are human, too. Teachers competing for the highest test score and biggest bonus will in-fight, not collaborate, and instead of freely sharing ideas, teachers will hide them from each other and ultimately from students.
All of these misguided reforms are now hurting students, and things will soon get worse. Students are about to be tested more than ever, just so we can get the data needed to stack rank teachers and schools. PARCC’s newly released testing guidance to schools calls for 9¾ hours of testing time for third grade, 10 hours for grades 4-5, 10¾ hours for grades 6-8 and 11 to 11¼ hours for grades 9-12. Of course, this testing schedule does not even account for teacher-created tests.
Dealing with this obvious over-testing has brought on a nonsensical answer from Ohio State Superintendent of Public Instruction Dr. Richard A. Ross. In order to reduce testing time, schools may do something that his own department does not recommend. His solution is shared attribution for teachers of art, music, foreign languages and some years of science and social studies. In simple terms, up to 50 percent of these teachers’ ratings, plus pay and hiring and firing decisions may be based on student tests in other subject areas, on students these teachers may have never even seen. Meanwhile, some of these subjects may no longer be taught by certified teachers if the Ohio State Board of Education has its way. The Board wants to eliminate the 5 of 8 rule that demands that school districts hire five full-time teachers in eight areas, including music, art, physical education, library science, nursing and social work, for every 1,000 students. With the dysfunction occurring at the state level because of these types of misguided reforms, is it any wonder that young people are bailing on the profession? According to the U.S. Department of Education’s estimates, teacher-preparation program enrollments shrank by about 10 percent from 2004 to 2012, with California losing approximately 22,000 teacher-prep enrollments, or 53 percent, between 2008-09 and 2012-13.
Teaching is not a simple task that can be easily assessed. Stack ranking and merit pay may sound fine on paper and easy to devise, but they will be a debacle. American schools are not in crisis, and collaborating, student-focused teachers are already working hard and producing great results for children every day.

Now if only our whack-job Governor here in NY would read this. Oops, not written by a major corporate sponsor, no chance he’ll ever see it.
Meanwhile, the head of the NEA appears to favor “merit pay.” Our national union leaders have failed teachers many steps along the way: https://www.americanprogress.org/issues/education/news/2013/10/25/77986/the-nations-largest-teachers-union-calls-for-revamp-of-teacher-pay-system/
That was from 2013. There is a new president (not that I have confidence in her either).
Thanks, Duane. That’s at least somewhat heartening.
And the poor souls who teach the neediest, most vulnerable students will be fired due to unjust, false metrics. Then, I suppose they can always count on TFA to keep these kids busy since it’s assumed these students aren’t going too far anyway.
Merit pay and stack ranking will never work in education because they are strategies borrowed from the business world—where they don’t work, either, by the way. Microsoft, which pioneered stack ranking, got rid of the practice a few years ago, calling it the most damaging thing they had ever done to their corporate culture.
What the reformers don’t seem to understand is that teaching is more like parenting than it is like business, and no parent would be judged on how well their kids did on a test, or would even consider favoring one child over another based on how well each child did in school. Teaching is not about test scores; it’s about meeting kids where they are, getting to know them as persons, and helping them become who they want to be.
Until the reformers understand that teaching is about caring, not earning, things are not going to get any better.
Privatizers always think they’ll get the best part of the public sector and the best part of the private sector but in reality they often get the worst parts of both. When Ohio ed reform started we were told they were going to get rid of those icky teachers union lobbyists in the statehouse. Almost immediately those lobbyists were replaced by charter school lobbyists. I don’t know- is this supposed to be a big improvement? The private sector lobbyists are somehow just better people?
People who want to run private sector businesses should GO DO THAT. Is someone forcing them to work in the public sector? If they so admire the private sector that they seek to turn the public sector into the private sector they’re in the wrong business. They should follow their dreams and start a company that isn’t publicly-funded.
Alfie Kohn would ask, “work to do what”? I think many of the people imposing these policies know they don’t work to improve education (or anything else, for that matter). But if improving is not your goal, but rather destroying, merit pay and stack-ranking both work very well indeed.
When you want to destroy something, why not impose the worst business strategies on education? Failure for public schools is music to their ears.
Dr. Ross’s “solution is shared attribution for teachers of art, music, foreign languages and some years of science and social studies. In simple terms, up to 50 percent of these teachers’ ratings, plus pay and hiring and firing decisions may be based on student tests in other subject areas, on students these teachers may have never even seen.”
Correct. This is one of three “solutions” to the evaluation of TEACHERS by test scores of their students when the teachers are not being evaluated by tests required for the Elementary and Secondary Education Act and/or a state-wide VAM.
The first “solution” is to use scores from any test administered statewide and meeting psychometric standards, including out-of-state end-of-course assessments (e.g. California high school history). Ohio keeps an approved list of these.
A second “solution” is the infamous SLOs (about which I have written), with not one ounce of validity or reliability but heavily marketed by USDE and William Slotnik. This is a version of management by objectives, introduced in 1954 by business guru Peter Drucker. The general method is no longer used by savvy business leaders.
In Ohio, Dr. Ross and his advisors decided that the teacher-made tests for SLOs were taking more time than the required statewide tests producing VAM or other required measures. So, to reduce the state’s average testing time, he said option 3 would do.
“Solution” three has been used in Florida for some time. A lawsuit about this and related schemes ended with the judge concluding that the process was UNFAIR, but perfectly LEGAL.
So, under option three, about 69% of teachers (those who do not have job assignments that produce scores from statewide tests) are assigned the school-wide scores of students on ELA and/or math.
Now the kicker is that Dr. Ross and all other CEOs of state education agencies that have signed on for the Common Core can say these scores are perfectly legitimate for evaluating all teachers. Why? Because the Common Core calls for all teachers to improve performances in math and ELA.
In addition to being a cheap and simple solution for evaluating teachers, this move cuts down on testing time for SLOs. So Dr. Ross and his counterparts across the nation could have the same policy, in effect saying the only subjects that count (literally) are math and ELA, possibly science. That is the end game.
Unless I am mistaken, Ohio has already rescinded the 5 of 8 rule that required districts to hire five full-time teachers in eight areas, including music, art, physical education, library science, nursing and social work, for every 1,000 students. Districts and school committees lobbied for this change so they would have more freedom to hire “specials” of their own choosing.
In Escambia County, Florida, the superintendent is trying to terminate a contract with the charter chain Newpoint. In addition to other charges, he maintains Newpoint changed student grades so that the school could get an A rating. Newpoint denies the charges and claims they are the only A-rated school in the county. They are going to start with a board of education review. If they cannot resolve the issues, it may become a lawsuit.
Other communities should beware. Once these charters set up shop, it may be difficult to change course, especially if the charter is well connected.
“‘Solution’ three has been used in Florida for some time. A lawsuit about this and related schemes ended with the judge concluding that the process was UNFAIR, but perfectly LEGAL.”
This suit was based on a constitutional challenge, not harm inflicted. The ruling did not set any precedent for future litigation in which a teacher is denied tenure, faces dismissal, or is rated ineffective.
“Stacked Ranking”
Stacked ranking of their bills
Is what they’re really after
With Franklin stacked in hills
That reach up to the rafter
Rank stacking is what smells
from here to heaven high
and what it tells
implies many a bye bye
Reblogged this on Notes from a Frazzled Teacher and commented:
This nation has lost all common sense.
Wow, what a huge straw man from the OP on stack ranking for teachers. Mr Marsala clearly conflated stack rankings with merit pay to raise the ire of teachers everywhere. No legislature or governor or state department of education anywhere has come close to implementing stack rankings.
Bill Gates’s creation, Microsoft, was the leader in creating stack-ranking systems. Bill Gates funds education with the vision of turning it into a business. Proposals for stack-ranking of teachers have abounded, even if none have been implemented.
In any case, merit pay has been implemented. Tell me how that’s any better? It’s still an intentional competition among teachers (who should be collaborating) based on evaluation. It’s still a system that sets up backstabbing or, at the very least, isolating behavior in an effort to be among the “winners” of the merit pay. Merit pay and stack ranking are just two sides of the same coin.
What’s the OP?
I’m AI so I have trouble deciphering acronyms.
TIA!
Probably “Original Poster” – Vincent Marsala’s article.
It’ll always be Olden Polynice for me!
Or if you’re an 80s gal like me, Ocean Pacific.
I haven’t heard the words “Ocean Pacific” in a long, long time.
Probably better that way. Sorry to spoil your streak.
Stack ranking works by first assigning scores to teachers, then implementing policies that use those scores – firing, pay, promotion, assignments, best butt kisser award.
Why? Stack ranking and merit pay are exactly the tactics governors and reformers have been using in their deforming practice. There is no evidence whatsoever that merit pay has lifted the morale of the entire body of teachers, as opposed to some lucky dozen. The same is true in other developed and developing countries.
I’ll tell you why they’ll always fail: the people trying to make up the systems are dolts. They are trying to build some all-encompassing food grading system for what should be kept and what should be thrown out. Let’s measure fish and beef, randomly assign those numbers to fruits and veggies that are nearby and throw those out. Brilliant thinking.
That’s how one solves any inequities in the broad swath of teacher pay, by making it more arbitrary and capricious?
Generally, there is a basic assumption that every teaching position in a state is of equal difficulty and value. Is that a good assumption? Are different subjects, ages, or SES groups of more difficulty and deserving more value?
The testing is peripheral. Let’s compare John 8th grade math teacher to Jane 8th grade math teacher? That’s minutiae. They skimmed by some big issues, got lost in the details, misapplied it so idiotically it is an insult to logic and rational thought, and ended up with a big mess.
Keep it simple, stupid. Experience, Education, then what other factors?
Here in Ohio, it is (was?) the prestigious Battelle Memorial Institute under Battelle for Kids that signed on to the VAM junk science bandwagon of ratings and metrics. Shame on them for dragging what used to be a respected organization into the political mud.
Big bucks from Gates plus rebranding as Battelle for Kids.
Some other factors could be attendance and punctuality; accuracy and timeliness of grading/gradebook; meeting deadlines; portfolios of lessons and assessments; and professional development.
I worked for a few years at a school that did merit pay and had no pay schedule. Teachers did not stab each other in the back or refuse to cooperate.
Of course, none of their raises were based on the test scores of other people (i.e., students) because that’s just stupid. Just. Stupid.
The grade 11 “ELA” assessment IS on topics in “other areas.” It’s not about literature, at least not more than 10%. Passages are about history, sociology, music, science, arts, etc., etc. If anything, attribution should be shared more often among ELA, SS, the arts, and science– not less.
But what if ELA test performance is determined mostly by the amount and quality of parent-talk, a kid’s IQ, and their doggedness, NOT knowledge and skill that a school can impart? This certainly seems true from what I’ve seen of the SBAC ELA test. Language arts is a discipline not like the others. Reading comprehension, a big part of any ELA test, is not a function of a discrete package of skills a school can impart, but rather a function of the world- and word-knowledge one holds in one’s brain. This can only be acquired slowly over time, and, for good readers, much or most of the knowledge comes from knowledge- and vocab-rich environments outside of school. The Hart and Risley study showed that kids in professional homes hear 35 million more words than kids in welfare homes by age 4: 35 MILLION. A geography or chemistry test can measure what school can be reasonably expected to impart, but not ELA tests. Thus ELA teachers are being unfairly crucified for low scores they cannot possibly prevent, and Ross now wants the whole staff to suffer for scores that are beyond their control (although as a staff they do stand a slightly better chance of giving a kid the broad knowledge they need to be good readers). Until the truth about ELA is well-understood, ELA teachers will continue to be unjustly faulted (or credited) for scores they have little responsibility for.
Studies in the arts are not required in many high schools. The courses are elective, and at best in the good old days they were available and taken by at most 20% of students who attended schools in affluent districts. The last NAEP tests in the arts found that few students had consecutive years of study in the visual arts and music, and attempts to assess theater and dance were a wash.
I, for one, am the most uncompetitive person around. I went to a grad school that didn’t rank people. You passed or failed. Few people failed and that’s ok. We were a small, Presbyterian School in Richmond: great faculty and great community. We were progressive in the 80s. So, I did not learn competitiveness. I just don’t compete. I play, but not compete.
I hate ranking kids. Today I am reading about one little boy I am doing a plan for tomorrow, whose main problem is language: he’s Hispanic. He has an older sibling who I am sure does things for him without his asking. They make him out to be extremely low. He isn’t.
Next week are the tests. The tests. Secretly, I have hexed them so the computers all mess up…..lol. I wish. I am just so against insanity.
A reminder why Bill Gates, de facto head of public education in the USofA, doesn’t learn from his own mistakes.
From the Vanity Fair article of just a few years ago, a section entitled “The Bell Curve”:
[start excerpt]
By 2002 the by-product of bureaucracy—brutal corporate politics—had reared its head at Microsoft. And, current and former executives said, each year the intensity and destructiveness of the game playing grew worse as employees struggled to beat out their co-workers for promotions, bonuses, or just survival.
Microsoft’s managers, intentionally or not, pumped up the volume on the viciousness. What emerged—when combined with the bitterness about financial disparities among employees, the slow pace of development, and the power of the Windows and Office divisions to kill innovation—was a toxic stew of internal antagonism and warfare.
“If you don’t play the politics, it’s management by character assassination,” said Turkel.
At the center of the cultural problems was a management system called “stack ranking.” Every current and former Microsoft employee I interviewed—every one—cited stack ranking as the most destructive process inside of Microsoft, something that drove out untold numbers of employees. The system—also referred to as “the performance model,” “the bell curve,” or just “the employee review”—has, with certain variations over the years, worked like this: every unit was forced to declare a certain percentage of employees as top performers, then good performers, then average, then below average, then poor.
“If you were on a team of 10 people, you walked in the first day knowing that, no matter how good everyone was, two people were going to get a great review, seven were going to get mediocre reviews, and one was going to get a terrible review,” said a former software developer. “It leads to employees focusing on competing with each other rather than competing with other companies.”
Supposing Microsoft had managed to hire technology’s top players into a single unit before they made their names elsewhere—Steve Jobs of Apple, Mark Zuckerberg of Facebook, Larry Page of Google, Larry Ellison of Oracle, and Jeff Bezos of Amazon—regardless of performance, under one of the iterations of stack ranking, two of them would have to be rated as below average, with one deemed disastrous.
For that reason, executives said, a lot of Microsoft superstars did everything they could to avoid working alongside other top-notch developers, out of fear that they would be hurt in the rankings. And the reviews had real-world consequences: those at the top received bonuses and promotions; those at the bottom usually received no cash or were shown the door.
Outcomes from the process were never predictable. Employees in certain divisions were given what were known as M.B.O.’s—management business objectives—which were essentially the expectations for what they would accomplish in a particular year. But even achieving every M.B.O. was no guarantee of receiving a high ranking, since some other employee could exceed the assigned performance. As a result, Microsoft employees not only tried to do a good job but also worked hard to make sure their colleagues did not.
“The behavior this engenders, people do everything they can to stay out of the bottom bucket,” one Microsoft engineer said. “People responsible for features will openly sabotage other people’s efforts. One of the most valuable things I learned was to give the appearance of being courteous while withholding just enough information from colleagues to ensure they didn’t get ahead of me on the rankings.”
Worse, because the reviews came every six months, employees and their supervisors—who were also ranked—focused on their short-term performance, rather than on longer efforts to innovate.
“The six-month reviews forced a lot of bad decision-making,” one software designer said. “People planned their days and their years around the review, rather than around products. You really had to focus on the six-month performance, rather than on doing what was right for the company.”
There was some room for bending the numbers a bit. Each team would be within a larger Microsoft group. The supervisors of the teams could have slightly more of their employees in the higher ranks so long as the full group met the required percentages. So, every six months, all of the supervisors in a single group met for a few days of horse trading.
On the first day, the supervisors—as many as 30—gather in a single conference room. Blinds are drawn; doors are closed. A grid containing possible rankings is put up—sometimes on a whiteboard, sometimes on a poster board tacked to the wall—and everyone breaks out Post-it notes. Names of team members are scribbled on the notes, then each manager takes a turn placing the slips of paper into the grid boxes. Usually, though, the numbers don’t work on the first go-round. That’s when the haggling begins.
“There are some pretty impassioned debates and the Post-it notes end up being shuffled around for days so that we can meet the bell curve,” said one Microsoft manager who has participated in a number of the sessions. “It doesn’t always work out well. I myself have had to give rankings to people that they didn’t deserve because of this forced curve.”
The best way to guarantee a higher ranking, executives said, is to keep in mind the realities of those behind-the-scenes debates—every employee has to impress not only his or her boss but bosses from other teams as well. And that means schmoozing and brown-nosing as many supervisors as possible.
“I was told in almost every review that the political game was always important for my career development,” said Brian Cody, a former Microsoft engineer. “It was always much more on ‘Let’s work on the political game’ than on improving my actual performance.”
Like other employees I interviewed, Cody said that the reality of the corporate culture slowed everything down. “It got to the point where I was second-guessing everything I was doing,” he said. “Whenever I had a question for some other team, instead of going to the developer who had the answer, I would first touch base with that developer’s manager, so that he knew what I was working on. That was the only way to be visible to other managers, which you needed for the review.”
I asked Cody whether his review was ever based on the quality of his work. He paused for a very long time. “It was always much less about how I could become a better engineer and much more about my need to improve my visibility among other managers.”
In the end, the stack-ranking system crippled the ability to innovate at Microsoft, executives said. “I wanted to build a team of people who would work together and whose only focus would be on making great software,” said Bill Hill, the former manager. “But you can’t do that at Microsoft.”
[end excerpt]
Link: http://www.vanityfair.com/news/business/2012/08/microsoft-lost-mojo-steve-ballmer
Every single paragraph contains something that could be usefully applied to understanding what is happening in public education today.
But what is the response of the self-proclaimed “education reform” movement to proven failure?
To paraphrase the NJ Commissioner of Education, double down on whatevers.
They’re clueless.
Time to opt out of high-stakes standardized testing. Time to opt in to genuine learning and teaching.
😎
M.B.O. is no different from the SLOs used to evaluate teachers in at least 27 states. The only difference is that teachers are required to write their SLOs, and these plus the tests for students are approved at the district level. Example: At least 80% of all of my fourth graders will score at or above 90 on a twenty-word end-of-course vocabulary test in music. Production quota for test scores. Bizarre. A computer calculates the performance of the teacher who enters the test scores. You meet the expectation or not. MBO is a failed business practice marketed to schools as SLOs by William Slotnik with help from USDE. The first use was in Denver in 1999 for a pay-for-performance scheme funded by the Broad Foundation among others.