The editorial board of the “Journal-News” in the Lower Hudson Valley calls out the absurdity of Governor Cuomo’s teacher evaluation plan. The deadline for a new plan is June 30, which is impossible.

They understand why parents are angry at the testing system and the governor:

“Declining morale in neighborhood schools is one big reason that many parents boycotted the state tests. How can Cuomo not see the connection?

“Now our leaders are racing to fix the system, but are likely to make it worse. Cuomo and legislative leaders, as part of their budget agreement, gave the state Board of Regents until June 30 to re-create the evaluation system, setting strict rules that tie the Regents’ hands.

“Stop it. It’s time for the Board of Regents to take a stand – and stand up to Cuomo. The board should declare that it can’t slap together a viable evaluation system. New York should keep its current system in place and use at least the rest of 2015 to design a system that would promote classroom instruction and hold teachers accountable.

“Judith Johnson, the Lower Hudson Valley’s new representative on the Board of Regents, has the right idea. ‘What the governor has put in place makes no sense,’ she said. ‘If you want a scholarly system, you can’t throw it together in 30 to 60 days. If we ignore the science behind teacher evaluations, it’s just a political decision.’”

Does the Board of Regents have the backbone to tell the governor and the legislature that they are wrong? Will they stick to science and turn their backs on Cuomo’s vindictive agenda?

Here is a curious turn of events. Just as the federal government is forcing schools across the nation to evaluate and rank teachers using dubious metrics, corporations are beginning to back away from simplistic performance measures. The change reflects the philosophy of business guru W. Edwards Deming, who staunchly opposed merit pay and rankings, on grounds that they demoralized employees and made for a less efficient workplace.

This article appeared in the Wall Street Journal.

The Trouble With Grading Employees

Performance ratings such as ‘meets expectations’ sap workers’ morale,
but firms aren’t sure they can do without them

Can a year’s worth of work be boiled down to a stock phrase like
“meets expectations”?

As companies reinvent management by slashing layers of hierarchy or
freeing workers to set their own schedules, performance ratings, which
grade workers on a 1-5 scale or with labels like “on target,”
stubbornly hang on. Companies like Gap Inc., Adobe Systems Inc. and
Microsoft Corp. abolished such ratings after leaders decided
they deterred collaboration and stoked staffers’ anxieties. Yet other
companies are having a harder time letting go.

Intel Corp. has long rated and ranked its approximately 105,000
workers on a four-level scale, from “outstanding” to “improvement
required.” Devra Johnson, a human-resources director at the chip
maker, observed that ratings tended to deflate morale in a good chunk
of the 70% of the company’s workforce that receives a “successful”
rating each year—the second-lowest label.

“We’d call them the walking wounded,” she said.

Human-resources managers conducted an experiment to test a new way of
managing performance, allowing 1,700 workers in the HR department to
go unrated, although not without feedback, for about two years,
according to Ms. Johnson.

Managers found they could still differentiate performance and
distribute compensation. However, when Ms. Johnson’s team presented
its findings, company executives weren’t ready to give the labels up,
concerned that forgoing ratings would suck healthy tension out of the
workplace, she said. So the HR department started rating the employees
in the experiment again….

Marc Farrugia, the vice president for human resources at Sun
Communities Inc., is going through the “exhausting” process of
revamping performance management at the owner and operator of
manufactured housing communities. He’s concerned about the accuracy of
the company’s current approach to ratings; some managers just dole out
higher scores in order to maximize bonuses for employees they’re
scared might leave; others give everyone average ratings because it is
easy. Workers complain the ratings aren’t fair and don’t paint a true
picture of their annual performance.

“I’m being more and more convinced that ratings are doing more harm
than good,” Mr. Farrugia said….

Some executives worry that figuring performance measures, such as the
time it takes for restaurant workers to take an order, into reviews
might lack context.

“I have a real love-hate relationship with data,” said Kevin Reddy, the
CEO of fast-casual restaurant chain Noodles & Co. “You can get a false
sense of security if you zero in too closely on a rating system.”

The company moved away from numeric ratings about seven years ago but
still places workers into broad categories like “meets expectations.”
Mr. Reddy said he and his leadership team continue to question whether
they’re doing feedback right and motivating employees.

Jean Martin, a director at research and advisory firm Corporate
Executive Board who works with companies on performance management
systems, said executives are “giving the numbers too much power” by
endlessly debating their worth. An analysis of 30,000 employees by her
organization shows ratings don’t have a direct impact on performance,
she said.

Others say they have evidence showing that workers contribute less
after receiving a poor rating. David Rock, the director of the
NeuroLeadership Institute, a research firm that applies neuroscience
to the workplace, said ratings conjure a “threat response” in workers,
or “a sensation of danger,” especially if they don’t get the number
they expect. And the hangover from a bad rating can last for months,
Dr. Rock said….

Companies that have gotten rid of ratings say their employees feel
better about their jobs, and actually listen to managers’ feedback
instead of obsessing over a number. John Ritchie, a Microsoft
human-resources executive who goes by “J,” said the technology
company’s practice of rating and ranking employees discouraged
risk-taking and collaboration; since discontinuing the practice in
late 2013, teamwork is up, he said.

The internal change mirrors the shift CEO Satya Nadella is working to
effect externally, charming and collaborating with startups and
venture-capital firms so that Microsoft doesn’t get left behind in the
increasingly heterogeneous world of technology.

“We needed to change and everybody knew it,” Mr. Ritchie said of the
new performance management system.

The Gap’s new approach dumps ratings in favor of monthly coaching
sessions and frequent employee-manager conversations. But HR
executives had to convince leaders that the move wasn’t
“sacrilegious,” according to Eric Severson, the company’s co-head of
human resources.

Holly Bonds, a 17-year veteran of the company, said it was strange at
first; she was used to scanning her review for her rating and bonus
number. She now talks more frequently with her manager, so she has a
better idea of where she stands, a process that she’s found less
stressful than worrying about her rating.

“I haven’t missed it,” she said.

Write to Rachel Feintzeig at rachel.feintzeig@wsj.com

Merryl Tisch, chancellor of the New York Board of Regents, has delayed implementation of Governor Andrew Cuomo’s draconian and misguided plan to evaluate teachers by test scores.

When New York sought Race to the Top money, it promised that test scores would count for 20% of a teacher’s evaluation. Under pressure from Governor Cuomo, the proportion rose to 40%. Cuomo was angry when almost every teacher was rated effective or highly effective. He wanted to fire teachers. Tisch wrote a letter to Cuomo agreeing with his demand to raise the testing proportion to 50%.

The legislature caved during budget negotiations and passed a “matrix” that implies 50% but left the final determination to the Regents. Tisch decided more time was necessary and extended the deadline.

The sad part of this drama is that no one ever refers to research. Numerous studies and reports have refuted the validity of test scores for measuring teacher quality. Start with the American Statistical Association’s statement on VAM. There are too many variables that the teacher does not control that influence test scores.

The current dispute seems to be about whether to misjudge teacher quality sooner or later.

In the midst of a story about a teacher who walked 150 miles to deliver a letter to Governor Cuomo, there was mention of a statement about the opt outs by the State Education Department.

Basically the SED said that the opt outs will not derail its determination to rate teachers based on test scores.

The State Education Department released a statement saying, “We are confident the Department will be able to generate a representative sample of students who took the test, generate valid scores for anyone who took the test, and calculate valid State-provided growth scores to be used in teacher evaluations.”

The SED did not say how it will generate valid ratings for teachers whose students opted out, especially in districts where the majority of students did so; nor did it say how it would generate valid ratings for the 70% of teachers who don’t teach the tested subjects. Even if only 10% opted out, how will the SED know if they were high-scoring students or low-scoring students? The SED will succeed in making a process of dubious value even less valid. The SED is determined to do the wrong thing with or without adequate data.

Read More at: http://www.cbs6albany.com/news/features/top-story/stories/as-common-core-testing-enters-second-week-controversy-still-abounds-24810.shtml

Since 2009, when Race to the Top was launched, Arne Duncan has been an avid proponent of evaluating teachers by test scores. Some states evaluate teachers by the scores of students they never taught or subjects they don’t teach. To be eligible for Race to the Top money, states had to agree to evaluate teachers by test scores. To get a waiver from the impossible mandates of NCLB, states had to agree to do it.

When Duncan testified, Congresswoman DeLauro asked if he was willing to rethink VAM. He responded that the federal government doesn’t require VAM. Duncan said that while the Feds don’t require VAM, they require evidence of growth in learning.

Sounds like VAM. Can anyone make sense of this?

*I had several spelling errors in the original post, due to composing it on my cellphone in a bumpy car-ride. I fixed them.

New York State education officials released data showing that the top-rated teachers, based on student test scores, are less likely to work in schools enrolling black and Hispanic students.

Did State Education Department officials read the VAM reports showing that VAM is statistically flawed as a measure of individual teachers? Are they aware that fewer than 20% of black and Hispanic students met the absurd passing mark on the state’s Common Core test for the past two years? Are they aware that test-based accountability discourages teachers from working in high-needs schools? Interesting that the article cites the leader of Michelle Rhee’s organization, TNTP (the New Teacher Project), whose goal is to replace experienced teachers with new hires. At the rate these so-called reforms are accepted as credible (despite evidence to the contrary), TNTP will be able to place millions of new hires.

Andrew Cuomo can put one notch on his belt. Carol Burris is stepping down. He had better have a very big belt, because his hatred for teachers will drive many out of the profession. Who will replace them? Does he care? The much-honored principal of South Side High School in Rockville Centre decided to retire early because of Cuomo’s punitive law. Morally and ethically, she could not continue to work in the environment he has created.

She said:

“We are now turning our backs on the very experiences that build on our children’s natural strengths in order to pursue higher test scores in this era of corporate reform. We have become blind to indicators of quality that can’t be demonstrated on a scan sheet.

“The opinions of billionaires and millionaires who send their own children to private schools awash in the arts hold more sway than those of us who have dedicated our lives to teaching children. In the words of our chancellor [Merryl Tisch], we who object are “noise.”

“Much to the dismay of Albany, the noise level is on the rise since the passage of a new teacher evaluation system that elevates the role of testing. I am not sure why I was shocked when the legislature actually adopted the nonsensical evaluation plan designed by a governor who is determined to break the spirit of teachers, but I was. What is even more shocking is the legislature’s refusal to admit what they did, which was to create a system in which 50 percent of a teacher’s evaluation is based on test scores. Whether that denial comes from ignorance or willful deceit doesn’t matter. It is inexcusable.

“What will happen to our profession is not hard to predict. Since the state has generated student “growth” scores, the scores of 7 percent of all elementary and middle school principals are labeled ineffective. Likewise, 6-7 percent of Grades 4-8 teachers of English Language Arts and math received ineffective growth scores. That is because the metrics of the system produce a curve.

“Based on the law, we know before even one test is given that at least 7 percent of teachers and principals, regardless of their supervisors’ opinion, will need to be on an improvement plan. They will be labeled either developing or ineffective. We have no idea what growth scores for high school teachers and teachers of the arts will look like — that has been, in the words of Assemblywoman Pat Fahy, “punted” to a State Education Department. Yes, they [state lawmakers] have turned the football over to the folks whom they publicly berate for the botched rollout of the Common Core.

“Well, the legislature has woken a sleeping giant. Around the state today parents are saying “no more.” The robust opt-out movement, which began on Long Island, has now spread across rural and suburban areas in upstate New York as well. Over 75 percent of the students in Allendale Elementary School in West Seneca refused the Common Core tests today. In the Dolgeville district, the number is 88 percent. Over 70 percent of the students in the Ichabod Crane Elementary and Middle School refused. On Long Island, 82 percent of Comsewogue students, 68 percent of Patchogue Medford students and 61 percent of Rockville Centre students opted out of the tests. And that is but a sample.

“This is happening because the bond between students and teachers is understood and valued by the parents we serve. They have no stomach for the inevitable increased pressures of testing. Through opt out, they are speaking loud and clear.”

She is not going away. She was already a leader in the battle against corporate reform. She has written many posts for Valerie Strauss’s “Answer Sheet” blog at the Washington Post. She will write more. Now she is joining the fight to save children and public education from corporate raiders full-time. Hers will be an experienced, wise voice in the fight for democratic public education.

Valerie Strauss analyzes the debate between Chancellor Merryl Tisch and me on MSNBC’s “All In With Chris Hayes.”

She includes the transcript.

What she found odd was Tisch’s response right after I explained that teachers are not allowed to see how individual students answered questions, so the tests have no diagnostic value. All that teachers see is the students’ scores and how they compare to others. There is no item analysis, no description of students’ weaknesses or strengths.

Tisch answered:

“TISCH: Well, I would say that the tests are really a diagnostic tool that is used to inform instruction and curriculum development throughout the state. New York State spends $54 billion a year on educating 3.2 million schoolchildren. For $54 billion a year I think New Yorkers deserve a snapshot of how our kids are doing, how our schools are doing, how our systems are doing. There is a really important data point.”

She began by saying that the Common Core standards and tests would close the achievement gap, although there is no evidence for that claim. Then she said the tests are a valuable diagnostic tool, but they don’t provide enough information to perform that function. Then she said the tests would show how our schools were doing, which I disagree with, because the passing mark was set artificially high, guaranteeing that most children would fail.

Unfortunately I had no opportunity to respond.

The resounding success of the opt out movement in New York state prompted a state senator to introduce a bill to exempt the highest-performing districts from Governor Cuomo’s test-based teacher evaluation plan.

Presumably the advocates of the plan hope to take the steam out of the opt out movement. Divide and conquer. Apparently high-stakes testing will be for the middle class and the poor, not the affluent high-performing districts.

Call it segregated testing. None for the rich. Only for peons.

Miriam Kurtzig Freedman, an attorney who represents public schools in education matters, including testing and special education—and is currently working to reform special education—posted this comment. Her website is http://www.schoollawpro.com.

Can we really use student tests to measure teacher effectiveness?

Miriam Kurtzig Freedman, M.A., J.D.

This is the year! Tests related to the Common Core State Standards (CCSS) are launching across our country. They are designed to measure how well students are learning the CCSS. Meanwhile, some states, with federal encouragement, plan to use them also to measure teacher effectiveness. Is this use valid?

There is no shortage of controversy about educational testing and, unfortunately, this controversy includes the opportunity to file lawsuits. The use of student achievement data to also evaluate teacher effectiveness is certainly controversial. Notably, Arne Duncan, the Secretary of Education, gave states a year’s reprieve on implementing this practice. Across the country, teacher unions have called it unfair. My concern is far more basic. It’s about validity.

As an attorney who has represented public schools for more than 30 years, I am concerned about this multipurpose use. It may not get us what we need: a valid, reliable, fair, trusted, and transparent accountability system. The tests at issue include those developed by PARCC and SBAC, two multi-state consortia that are funded by the U.S. Department of Education and private funders. They were charged with developing an assessment system aligned to the CCSS by the 2014-15 school year.

At last count, these consortia have 27 states and the District of Columbia signed up, affecting 42% of U.S. students, according to Education Week.

The media remind us constantly that our ‘failing’ schools need fixing; that, to do so, we should assess student skills and knowledge to help teachers improve instruction; that we also need to evaluate and rate teachers and weed out poor performers. And we are told that these tests can be multipurposed to do all of the above!

Sounds good? Actually, it sounds too good to be true. Does this multipurpose use to evaluate teacher effectiveness clear a key psychometric hurdle: test validity?

What is test validity?

At its core, it is the basic, bedrock requirement that a test measure what it is designed to measure. Thus, if a test is designed to measure how well 3rd graders decode, we judge the test according to how well it does that. Can students decode? If it is designed to be predictive; say, to measure if students are ‘on track’ or progressing toward college or career-readiness, we judge it accordingly. Either way, we must ask if a test whose purpose is to measure what students learn or whether they are ‘on track’ can also be used to measure something else—such as how well teachers teach?

So what are these tests’ purposes? For answers, let’s review the PARCC and SBAC websites. First PARCC, the Partnership for Assessment of Readiness for College and Careers:

PARCC is a group of states working together to develop a set of assessments that measure whether students are on track to be successful in college and their careers. These high quality, computer-based K–12 assessments in Mathematics and English Language Arts/Literacy give teachers, schools, students, and parents better information whether students are on track in their learning and for success after high school, and tools to help teachers customize learning to meet student needs.

PARCC is based on the core belief that assessment should work as a tool for enhancing teaching and learning. Because the assessments are aligned with the new, more rigorous Common Core State Standards, they ensure that every child is on a path to college and career readiness by measuring what students should know at each grade level. They will also provide parents and teachers with timely information to identify students who may be falling behind and need extra help. [Emphasis added]

Second, the SBAC, Smarter Balanced Assessment Consortium:

The [SBAC] is a state-led consortium working to develop next-generation assessments that accurately measure student progress toward college- and career-readiness. Smarter Balanced is one of two multistate consortia awarded funding from the U.S. Department of Education in 2010 to develop an assessment system aligned to the Common Core State Standards (CCSS) by the 2014-15 school year.

The work of Smarter Balanced is guided by the belief that a high-quality assessment system can provide information and tools for teachers and schools to improve instruction and help students succeed – regardless of disability, language or subgroup.

Smarter Balanced involves experienced educators, researchers, state and local policymakers and community groups working together in a transparent and consensus-driven process. [Emphasis added]

Clearly, these tests’ purpose is to (a) measure student progress on the Common Core State Standards (CCSS) and college or career readiness, (b) give teachers and parents better information about students, and (c) help improve instruction. No mention is made of gauging teacher effectiveness.

Yet, questions about the validity of using these tests in this multipurpose way seem to be missing from national discussions, even as other validity issues are raised. For example, questions are raised about score validity when tests are administered in different ways (on a computer or with paper and pencil) and at different times of the year.

Also discussed are questions about whether these tests are aligned to the CCSS. The media report battles among states, unions, and others about how to measure teacher effectiveness through these tests; e.g., through value-added models, student growth percentages, or other approaches. But questions of basic test validity from the get-go about this multipurpose use of these tests are not part of today’s public discourse.

They should be.

If we continue on this track of creating high stakes for teachers with tests designed for a different purpose, we may well end up with unintended consequences, including distrust of the system, questionable accountability, and lawsuits.

My suggestion? Given the reprieve for states and growing concern among the public about these tests and the CCSS themselves, test consortia and our federal and state governments should take a deep breath and do two things.

First, the consortia should remind the public that the purpose of these tests is to measure student achievement on the new CCSS and career and college readiness, provide better information to teachers and parents, and improve instruction.

Second, the states (with federal approval and encouragement) that intend to use these results also to evaluate teacher effectiveness must inform the public explicitly about how they intend to validate the tests for this new purpose. They need to provide solid proof that their proposed use, which differs from the stated purpose of these tests, is valid, reliable, and fair. The current silence is worrisome, not transparent, and unwise.

This test validity issue needs to be fully aired and resolved satisfactorily before we can begin to tackle the larger issues about the multiple uses of testing. Otherwise, in our litigious land of opportunity, the ensuing battles may be costly and not pretty. Let’s not go there.