Archives for category: Teacher Evaluations

John King inherited a lot of very bad ideas from his predecessor Arne Duncan. One of them is the belief that teacher education programs can be judged by the test scores of the students taught by their graduates. King recently issued regulations cementing the rules that Duncan began fashioning a few years back. It would be asking too much to expect anyone at the U.S. Department of Education to rethink their failed policies of the past 7 1/2 years.

Fortunately we have a commentary from lawyer Sarah Blaine that explains why the King-Duncan regulations are nonsense. They will increase the nation’s teacher shortage and demoralize those who spend their days trying to teach children.

In the original post, I called this “Arne’s Worst Idea Yet.” Now it is John King’s “worst idea yet.”

It has no validity. It will worsen the problems it is intended to solve.

Sarah Blaine called this proposal “asinine.” Read her entire post.

Here is an excerpt:

“Now, please bear with me. Out here in lawyer-land, there’s a slippery concept that every first year law student must wrap her head around: it’s the idea of distinguishing between actual (or “but for”) causation and proximate (or “legal”) causation. Actual causation is any one of a vast link in the chain of events from the world was created to Harold injured me by hitting me, that, at some level, whether direct or attenuated, “caused” my injury. For instance, Harold couldn’t have hit me if the world hadn’t been created, because if the world hadn’t been created, Harold wouldn’t exist (nor would I), and therefore I never would have been hit by Harold. So, if actual or “but for” causation was legally sufficient to hold someone responsible for an injury, I could try suing “the Creator,” as if the Creator is somehow at fault for Harold’s decision to hit me.

Well, that’s preposterous, even by lawyer standards, right?

The law agrees with you: the Creator is too far removed from the injury, and therefore cannot be held legally responsible for it.

So to commit a tort (legal wrong) against someone else, it isn’t sufficient that the wrong allegedly committed actually — at some attenuated level — caused the injured’s injury (i.e., that the injury would not have happened “but for” some cause). Instead, the wrong must also be proximally related to that injury: that is, there must be a close enough tie between the allegedly negligent or otherwise wrongful act and the injury that results. So while it would be silly to hold “the Creator” legally responsible for Harold hitting me, it would not be similarly silly to hold Harold responsible for hitting me. Harold’s act was not only an actual or “but for” cause of my injury, it was also an act closely enough related to my injury to confer legally liability onto Harold. This is what we lawyers call proximate (or legal) causation: that is, proximate causation is an act that is a close enough cause of the injury that it’s fair — at a basic, fundamental level — to hold the person who committed that injurious act legally responsible (i.e., liable to pay damages or otherwise make reparations) for his act. [As an aside to my aside, if this sort of reasoning makes your head explode, law school probably isn’t a great option for you.]

Well, it appears that Arne Duncan would have failed his torts class. You see, Arne didn’t get the memo regarding the distinction between actual causation and proximate causation. Instead, what Arne proposes is to hold teacher prep programs responsible for the performance of their alumni’s K-12 students (and to punish them if their alumni’s students don’t measure up). Never mind the myriad chains in the causation link between the program’s coursework and the performance of its graduates’ students (presumably on standardized tests). Arne Duncan somehow thinks that he can proximally — fairly — link these kids’ performance not just to their teachers (a dicey proposition on its own), but to their teachers’ prep programs. Apparently Arne can magically tease out all other factors, such as where an alumna teaches, what her students’ home lives are like, how her students’ socio-economic status affects their academic performance, the level of her students’ intrinsic motivation, as well as any issues in the new alumna’s personal life that might affect her performance in the classroom, and, of course, the level of support provided to the new alumna as a new teacher by her department and administration, and so forth. As any first year law student can tell you, Arne’s proposal is asinine, as the alumna’s student’s test results will be so far removed from her teaching program’s performance that ascribing proximate causation from the program to the children’s performance offends a reasonable person’s sense of justice. [Not to mention the perverse incentives this would create for teaching programs’ career advising centers — what teaching program would ever encourage a new teacher to take on a challenging teaching assignment?]”

Perhaps you read the editorial in the New York Times a few days ago, blasting teacher education programs and approving John King’s new regulations to judge them by the test scores of the students taught by their graduates. The editorial cites the Gates-funded National Council on Teacher Quality’s claim that 90% of teacher education institutions stink. NCTQ, you may recall, publishes rankings of teacher education programs without ever actually visiting any of them. It just reads the catalogues and decides which are the best and which are the worst, based in part on their adherence to the Common Core and scripted reading programs.

I agree that the entry standards for teacher education programs must be higher, and I would love to see online teaching degree programs shut down. But King’s new rules don’t address entry standards or crummy online programs. Their main goal is to judge teacher education programs by the test scores of the students who studied under the graduates of the programs. They will discourage teachers from teaching in high-needs districts. They will allow the U.S. Department of Education to extend its test-crazed control into yet another sector of American education. This is federal overreach at its dumbest.

John Merrow, who knows much more than the Times’ editorial writer on education (the same person for the past 20 years or more), has a different and better informed perspective.

He writes that the problem is not teacher education but the underpaid, under-respected profession.

The federal government thinks that tighter regulation of these institutions is the answer. After all, cars that come out of an automobile plant can be monitored for quality and dependability, thus allowing judgments about the plant. Why not monitor the teachers who graduate from particular schools of education and draw conclusions about the quality of their training programs?

That’s the heart of the new regulations issued by the U.S. Department of Education this week: monitor the standardized test scores of students and analyze the institutions their teachers graduated from. Over time, the logic goes, we’ll discover that teachers from Teacher Tech or Acme State Teachers College generally don’t move the needle on test scores. Eventually, those institutions will lose access to federal money and be forced out of business. Problem solved!

Education Secretary John B. King, Jr., announced the new regulations in Los Angeles. “As a nation, there is so much more we can do to help prepare our teachers and create a diverse educator workforce. Prospective teachers need good information to select the right program; school districts need access to the best trained professionals for every opening in every school; and preparation programs need feedback about their graduates’ experiences in schools to refine their programs (emphasis added). These regulations will help strengthen teacher preparation so that prospective teachers get off to the best start they can, and preparation programs can meet the needs of students and schools for great educators.”

Work on the regulations began five years ago, and they reflect former Secretary Arne Duncan’s views.

John Merrow says that the Department is trying to solve a problem by issuing regulations that will make the problem worse. Teacher churn and attrition are at extraordinarily high levels. The regulations will not encourage anyone to improve teaching.

He writes:

Strengthen training, increase starting pay and improve working conditions, and teaching might attract more of the so-called ‘best and brightest,’ whereas right now it’s having trouble attracting anyone, according to the Learning Policy Institute, which reported that

“Between 2009 and 2014, the most recent years of data available, teacher education enrollments dropped from 691,000 to 451,000, a 35% reduction. This amounts to a decrease of almost 240,000 professionals on their way to the classroom in the year 2014, as compared to 2009.”

Merrow writes, in the voice of wisdom, a voice that has been non-existent in Washington, D.C., for the past 15 years:

I am a firm believer in the adage, “Harder to Become, Easier to Be.” We need to raise the bar for entry into the field and at the same time make it easier for teachers to succeed. This approach will do the opposite; it will make teaching more test-centric and less rewarding.

This latest attempt to influence teaching and learning is classic School Reform stuff. It worships at the altar of test scores and grows out of an unwillingness to face the real issues in education (and in society). While it may be well-meaning, it’s misguided and, at the end of the day, harmful.

Listen up, New York Times editorial writer!

Doug Garnett is a communications specialist and a regular reader of the blog. He writes here about reading “Policy Patrons,” by Megan Tompkins-Stange.

Been reading Policy Patrons. And it’s given me a different insight.

We all feel like Gates, Broad and others are “dictating” what happens. It’s hard – because they aren’t. What they’re doing is far more subtle but with similar results.

What they’ve done is create a “walled garden” of groups that are all paid to support their position. The list in this article is an example of creating that walled garden – a range of community organizations, researchers, university credibility, etc…

THEN, with the walled garden created, the foundations themselves never have to “tell the government what to do”. They are able to say “well, I know somebody who deals with that – you should talk with them”. Except the foundations have ensured that this “somebody” is somebody who will give the answer they want.

It’s incredibly deceptive – but politicians and press seem incapable of detecting when they’ve been had in this way. Because the “walled garden” of true “ed reform believers” are the only people they end up talking to. In a sense, Gates, Broad, et al. deliver answers on a silver platter so that state education departments, school districts, politicians, and press don’t have to work hard.

This informal (but massive) walled garden they’ve built believes in testing as management, believes in CCSS, believes in charter schools, and believes that privatizing government services is always good.

As a result, state education bureaucrats NEVER have to wander outside the garden – so they never have to confront uncomfortable truths. (It’s dangerous outside those walls and that threatens one’s career.)

But this also explains why politicians are so shocked when citizens confront them with dissatisfaction with their policies – they’ve been blissfully living inside the Eden of Reform – unaware that they aren’t in touch with reality. I’ve seen this in Oregon. Our legislators cannot believe it when someone rational challenges what they’ve been doing.

It’s a HUGE problem for those of us who believe in public schools and believe in the value of researched answers. Because it’s not illegal what they’ve done. They believe it’s entirely moral. And they think they’re being “good people” by doing it. And it spreads blame by breaking it into tiny bits so no single organization can be blamed for much. Kind of a guaranteed “plausible deniability” clause.

Yet the result is entirely immoral – because it’s the future of our children.

Our reader Laura Chapman reviewed the regulations for teacher education issued by John King’s Department of Education today.

She writes:

“I downloaded the regulations. They are final, include some discussion of comments, but the parts that matter are concentrated in “definitions.” Here you go on the definition of “student growth.”

“Student growth: The change in student achievement between two or more points in time, using a student’s scores on the State’s assessments under section 1111(b)(2) of the ESEA or other measures of student learning and performance, such as

“student results on pre-tests and end-of-course tests;

“objective performance-based assessments;

“student learning objectives;

“student performance on English language proficiency assessments; and

“other measures that are rigorous, comparable across schools, and consistent with State guidelines.

“Teacher evaluation measure: A teacher’s performance level based on an LEA’s teacher evaluation system that differentiates teachers on a regular basis using at least three performance levels and multiple valid measures in assessing teacher performance.

“For purposes of this definition, multiple valid measures must include data on

“student growth for all students (including English learners and students with disabilities) and

“other measures of professional practice (such as observations based on rigorous teacher performance standards, teacher portfolios, and student and parent surveys).

“There is no real difference between ESSA as interpreted by these regulations and the last iteration of regulations in NCLB.

“The persistent reference to student learning objectives (SLOs) and gains between pretests and same year end-of-course tests reflects a profound misunderstanding of teaching, learning, curriculum organization across and within a year, the difference between what may be explored by individuals and subgroups or the whole class and what may be treated as a matter of “mastery” (especially of easy to test content/skill-sets).

“The explicit and implicit assumptions about education are wrong from the get go. The process can be followed but it will mean more of the same invalid stack ratings that have prevailed since 2001.

“Student Learning Objectives–SLOs–are not valid. Recent research from the American Institutes for Research confirms that there is no evidence of gains in student achievement or basis for claims of validity for every grade and subject where those convoluted writing exercises are required.”

Secretary of Education John King refuses to believe that the new federal law restricts his ability to control U.S. education. Today he released regulations that would threaten the federal funding of teacher education programs if their graduates teach low-scoring students.

Randi Weingarten, president of the American Federation of Teachers, blasted King’s overreach and poor judgment.

AFT’s Weingarten on Teacher Preparation Programs Regulations

“WASHINGTON—Statement from American Federation of Teachers President Randi Weingarten on the Department of Education’s final regulations for teacher preparation programs.

“It is, quite simply, ludicrous to propose evaluating teacher preparation programs based on the performance of the students taught by a program’s graduates. Frankly, the only conceivable reason the department would release regulations so out of sync with the Every Student Succeeds Act (ESSA) and President Obama’s own call to reduce high-stakes testing is that they are simply checking off their bucket list of outstanding issues before the end of their term.

“The final regulations could harm students who would benefit the most from consistent, high-quality standards for teacher preparation programs. The regulations will create enormous difficulty for teacher prep programs and place an unnecessary burden on institutions and states, which are also in the process of implementing ESSA.

“Instead of designing a system to support and improve teacher prep programs, the regulations build on the now-rejected high-stakes testing system established under NCLB and greatly expanded under this administration’s Race to the Top and waiver programs. It’s stunning that the department would evaluate teaching colleges based on the academic performance of the students of their graduates when ESSA—enacted by large bipartisan majorities in both the House and Senate last December—prohibited the department from requiring school districts to do that kind of teacher evaluation.

“Teacher prep programs need to help ensure that teachers are ready to engage their students in powerful learning and creating an environment that is conducive to learning. These regulations will not help achieve that goal. These regulations do not address ways to help the current status of the teaching profession: the shortages, the lack of diversity or the high turnover.

“While the department has made minor tweaks, the flawed framework remains the same. The regulations will punish teacher prep programs whose graduates go on to teach in our highest-needs schools, most often those with high concentrations of students who live in poverty and English language learners—the exact opposite strategy of what we need. As we brought up in January 2015—in our comments to the department’s proposal— if programs are rated as the department proposes, teacher prep schools will have incentive to steer graduates away from assignments in our toughest schools, and that will only make matters worse.

“If we want to get it right, we should look to countries like Finland, where prospective teachers receive extensive training in their subject matter and teaching strategies combined with clinical training. Finland has no alternative prep programs. Programs are highly selective and free of cost; their graduates go on to work in supportive, professional environments with strong unions, fair pay and benefits, and without high-stakes testing.”


Standardized tests produce results normed on a bell curve. The students who cluster in the bottom half of the bell curve are predominantly poor children, children with disabilities, and children of color. The bell curve, by design, never closes. That is why it is fundamentally wrong to rank students, teachers, and schools by a measure that favors the most affluent.

Secretary of Education John King is releasing regulations that will punish education programs if their graduates teach students whose scores are low. “Reformers” are supposed to be aware of the power of incentives, but not Secretary King. He thinks he can scare education programs to focus more on raising test scores. More likely is that teachers will get the message to avoid teaching in schools that enroll students who are impoverished, and that their preparation programs will encourage them to steer clear of the neediest children.

This is the report that appeared this morning in Politico Education:

TEACHER PREP RULES OUT TODAY: The Obama administration unveiled its long-delayed final regulations governing teacher preparation programs today. The rule preserves much of the administration’s original proposal from 2014, and requires states to develop a rating system for teacher-preparation programs.

– The rule will also eventually punish low-performing programs by cutting off their access to federal TEACH grants that help students pay for teacher training.

– The final rule retains a particularly-controversial component, which holds teacher-preparation programs accountable, in part, for how their graduates perform as teachers, based upon their students’ academic success. However, states will have flexibility in determining how to measure student learning.

– Randi Weingarten, president of the American Federation of Teachers, sharply criticized the regulations, saying in a statement that although the department “has made minor tweaks, the flawed framework remains the same.”

– Weingarten said it was “ludicrous to propose evaluating teacher preparation programs based on the performance of the students taught by a program’s graduates.” And she said the rules ultimately punish teacher prep programs that send graduates into the highest-need schools.

– Chris Minnich, executive director of the Council of Chief State School Officers, said in a statement that the group was “pleased the department listened to feedback and made these regulations stronger.”

– Kate Walsh, president of the National Council on Teacher Quality, said she’s impressed with how much the department kept with its original intent, “which was to insert far greater accountability for program quality.” She added that the effectiveness of the rule “very much depends on states doing their bit to hold programs accountable for quality.”

– “I told people they would never see the light of day,” Walsh said of the rules. “I’m happy to be wrong.”

– Education Secretary John B. King Jr. will be speaking about teacher preparation at the University of Southern California’s Rossier School of Education today. Watch the USC event live, starting at 1 p.m. ET, here.

– Read the regulations here

It takes a comedian or a cartoonist to explain the nutty world of education reform.

Check out this great Dilbert cartoon, giving a fast explanation of the idiocy of VAM.


PS: Thanks to KrazyTA for sending me the cartoon and also giving the correct link!

Cathy O’Neil has written a new book called “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.” I haven’t read it yet, but I will.

In this article, she explains that VAM is a failure and a fraud. The VAM fanatics in the federal Department of Education and state officials could not admit they were wrong, could not admit that Bill Gates had suckered the nation’s education leaders into buying his goofy data-based evaluation mania, and could not abandon the stupidity they inflicted on the nation’s teachers and schools. So they say now that VAM will be one of many measures. But why include an invalid measure at all?

While she is out on book tour, people ask questions, and the most common defense she hears is that VAM is only one of multiple measures.

She writes:

“Here’s an example of an argument I’ve seen consistently when it comes to the defense of the teacher value-added model (VAM) scores, and sometimes the recidivism risk scores as well. Namely, that the teacher’s VAM scores were “one of many considerations” taken to establish an overall teacher’s score. The use of something that is unfair is less unfair, in other words, if you also use other things which balance it out and are fair.

“If you don’t know what a VAM is, or what my critique about it is, take a look at this post, or read my book. The very short version is that it’s little better than a random number generator.

“The obvious irony of the “one of many” argument is, besides the mathematical one I will make below, that the VAM was supposed to actually have a real effect on teachers assessments, and that effect was meant to be valuable and objective. So any argument about it which basically implies that it’s okay to use it because it has very little power seems odd and self-defeating.

“Sometimes it’s true that a single inconsistent or badly conceived ingredient in an overall score is diluted by the other stronger and fairer assessment constituents. But I’d argue that this is not the case for how teachers’ VAM scores work in their overall teacher evaluations.

“Here’s what I learned by researching and talking to people who build teacher scores. That most of the other things they use – primarily scores derived from categorical evaluations by principals, teachers, and outsider observers – have very little variance. Almost all teachers are considered “acceptable” or “excellent” by those measurements, so they all turn into the same number or numbers when scored. That’s not a lot to work with, if the bottom 60% of teachers have essentially the same score, and you’re trying to locate the worst 2% of teachers.

“The VAM was brought in precisely to introduce variance to the overall mix. You introduce numeric VAM scores so that there’s more “spread” between teachers, so you can rank them and you’ll be sure to get teachers at the bottom.

“But if those VAM scores are actually meaningless, or at least extremely noisy, then what you have is “spread” without accuracy. And it doesn’t help to mix in the other scores.”
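Her point about "spread without accuracy" can be illustrated with a rough simulation. This is only a sketch under made-up assumptions, not anyone's actual evaluation formula: the 1-4 rating scale, the share of teachers at each observation rating, the 50% weight, and the function names `pearson` and `composite_vs_noise` are all hypothetical. The VAM score here is deliberately pure noise, and the question is how closely the combined rating tracks that noise.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out by hand."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def composite_vs_noise(n_teachers=1000, vam_weight=0.5, seed=0):
    """Correlation between a composite rating and a pure-noise VAM score."""
    rng = random.Random(seed)
    # Hypothetical observation ratings on a 1-4 scale:
    # almost every teacher is rated 3 or 4, so there is little variance.
    observation = rng.choices([2, 3, 4], weights=[2, 58, 40], k=n_teachers)
    # Hypothetical VAM score: pure noise spread over the same 1-4 scale.
    vam = [rng.uniform(1, 4) for _ in range(n_teachers)]
    # Weighted composite of the two components.
    composite = [
        (1 - vam_weight) * o + vam_weight * v
        for o, v in zip(observation, vam)
    ]
    return pearson(composite, vam)


if __name__ == "__main__":
    r = composite_vs_noise()
    print(f"correlation of composite rating with the noise: {r:.2f}")
```

Under these assumptions the composite ranking follows the noisy component closely, because the observation ratings barely differ from one teacher to the next. Lowering `vam_weight` weakens that link, but it also removes the "spread" the VAM was brought in to supply.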

This is a book I want to read. Bill Gates should read it too. Send it to him and John King too. Would they read it? Not likely.

When this statement first appeared in 2014, I said at the time that it should be on the bulletin board of every public school.

The American Statistical Association explains here why the evaluations of individual teachers should not be based on their students’ test scores.

Here is an excerpt. Read the whole statement, which is only 8 pages long:

It is unknown how full implementation of an accountability system incorporating test-based indicators, such as those derived from VAMs, will affect the actions and dispositions of teachers, principals and other educators. Perceptions of transparency, fairness and credibility will be crucial in determining the degree of success of the system as a whole in achieving its goals of improving the quality of teaching. Given the unpredictability of such complex interacting forces, it is difficult to anticipate how the education system as a whole will be affected and how the educator labor market will respond. We know from experience with other quality improvement undertakings that changes in evaluation strategy have unintended consequences. A decision to use VAMs for teacher evaluations might change the way the tests are viewed and lead to changes in the school environment. For example, more classroom time might be spent on test preparation and on specific content from the test at the exclusion of content that may lead to better long-term learning gains or motivation for students. Certain schools may be hard to staff if there is a perception that it is harder for teachers to achieve good VAM scores when working in them. Overreliance on VAM scores may foster a competitive environment, discouraging collaboration and efforts to improve the educational system as a whole.

Research on VAMs has been fairly consistent that aspects of educational effectiveness that are measurable and within teacher control represent a small part of the total variation in student test scores or growth; most estimates in the literature attribute between 1% and 14% of the total variability to teachers. This is not saying that teachers have little effect on students, but that variation among teachers accounts for a small part of the variation in scores. The majority of the variation in test scores is attributable to factors outside of the teacher’s control such as student and family background, poverty, curriculum, and unmeasured influences.

The VAM scores themselves have large standard errors, even when calculated using several years of data. These large standard errors make rankings unstable, even under the best scenarios for modeling. Combining VAMs across multiple years decreases the standard error of VAM scores. Multiple years of data, however, do not help problems caused when a model systematically undervalues teachers who work in specific contexts or with specific types of students, since that systematic undervaluation would be present in every year of data.
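The ASA's last point, that more years of data shrink the random error but not a systematic one, can be seen in a small simulation. This is a sketch with invented numbers, not the ASA's model: the true effect of 0, the context bias of -0.5, the yearly noise level, and the function names `averaged_vam` and `summarize` are all assumptions made for illustration.

```python
import random
import statistics


def averaged_vam(true_effect, context_bias, noise_sd, n_years, rng):
    """Average of n_years yearly scores, each equal to effect + bias + noise."""
    yearly = [
        true_effect + context_bias + rng.gauss(0, noise_sd)
        for _ in range(n_years)
    ]
    return statistics.fmean(yearly)


def summarize(n_years, n_trials=5000, seed=1):
    """Mean and spread of the multi-year average across many simulated teachers."""
    rng = random.Random(seed)
    # Hypothetical teacher: true effect 0, but the model undervalues
    # her teaching context by 0.5 in every single year.
    estimates = [
        averaged_vam(0.0, -0.5, 1.0, n_years, rng) for _ in range(n_trials)
    ]
    return statistics.fmean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    for years in (1, 4):
        mean, spread = summarize(years)
        print(f"{years} year(s): mean estimate {mean:.2f}, spread {spread:.2f}")
```

In this setup the spread of the estimates narrows as years are added, while the average estimate stays near the built-in bias rather than the true effect. Averaging makes the wrong answer more stable, not more correct.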

Despite the warning from ASA, which has no special interest and does not represent teachers or public school administrators, many states continue to use this method (called VAM, or value-added measurement or value-added modeling).

States were coerced into adopting this unproven method by the U.S. Department of Education, which said that states had to adopt it if they wanted to be eligible to compete for nearly $5 billion in federal funds in 2009, as every state was undergoing a budget crisis caused by the economic meltdown of fall 2008.

Many states adopted it, and it has not had positive effects in any state.

In Colorado and New York, among others, VAM scores count for as much as 50% of teachers’ evaluation.

A state court in New York ruled this method “arbitrary and capricious” when challenged by fourth grade teacher Sheri Lederman and her lawyer-husband Bruce Lederman.

Some states assign VAM scores to teachers based on students they never taught in subjects they don’t teach.

This is an example of federal and state policy that has no basis in evidence and that has harmed the lives of many teachers. It very likely has caused teachers to leave the profession and contributed to teacher shortages.

This post, with an anonymous author, reviews the research on value-added measurement, with frequent references to those who claim that the rise or fall of test scores is the best way to judge teacher quality.


The basic question he or she addresses is whether the actions of your kindergarten teacher or your third-grade teacher can affect your lifetime earnings, as Raj Chetty and his team asserted in a study a few years back.


The author goes into a lengthy back-and-forth about whether such claims make sense.


But the one essential fact that the post is missing is that 70% of teachers do not teach tested subjects. A district or school can evaluate teachers with VAM only when there are enough years of test scores to document the effects of the teacher over several years. Teachers of subjects other than reading and mathematics in grades 3-8 will never get VAM ratings.


But many states have solved this problem by assigning VAM ratings to the 70%. Their ratings are based on the scores of students they never met in subjects they never taught. This is called an “attributed rating.”


That makes sense, said no one ever.


That may be why Hawaii and Oklahoma have dropped VAM. It is expensive and gives false positives and false negatives. Expect more states to join these two states.