Search results for: "Value-Added"

Hah! This is what we have been waiting for! Economists are now borrowing from the education research literature to develop value-added metrics for physicians. Next, I hope, will be the development of VAMs for lawyers and soon you will hear the screams of outrage not only from the American Medical Association but the American Bar Association. With the economists figuring out metrics to measure these politically powerful professions, teachers won’t be alone in their battle against obsessive compulsive metrical disorder. If only someone would come up with VAM for elected officials! Better yet, how about a VAM for economists? For example, how often do their predictions about the economy come true?


Here is how you measure the value-added of physicians according to the link above from the National Bureau of Economic Research:


“Despite increasing calls for value-based payments, existing methodologies for determining physicians’ “value added” to patient health outcomes have important limitations. We incorporate methods from the value added literature in education research into a health care setting to present the first value added estimates of health care providers in the literature. Like teacher value added measures that calculate student test score gains, we estimate physician value added based on changes in health status during the course of a hospitalization. We then tie our measures of physician value added to patient outcomes, including length of hospital stay, total charges, health status at discharge, and readmission. The estimated value added varied substantially across physicians and was highly stable for individual physicians. Patients of physicians in the 75th versus 25th percentile of value added had, on average, shorter length of stay (4.76 vs 5.08 days), lower total costs ($17,811 vs $19,822) and higher discharge health status (8% of a standard deviation). Our findings provide evidence to support a new method of determining physician value added in the context of inpatient care that could have wide applicability across health care setting and in estimating value added of other health care providers (nurses, staff, etc).”
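For readers who want to see the bare idea behind "changes in health status during the course of a hospitalization," here is a minimal sketch. It is not the NBER authors' model. The names, scores, and the `gain_score_value_added` helper are all made up for illustration, and the sketch assumes no adjustment for how sick each patient was on arrival, which a real study would have to handle.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (physician, health score at admission, health score at discharge).
# These numbers are invented for illustration; they are not from the NBER paper.
stays = [
    ("Dr. A", 40, 55),
    ("Dr. A", 50, 62),
    ("Dr. B", 45, 50),
    ("Dr. B", 60, 68),
    ("Dr. C", 30, 50),
    ("Dr. C", 55, 65),
]

def gain_score_value_added(records):
    """Return each physician's mean health gain minus the overall mean gain."""
    gains = defaultdict(list)
    for physician, admit, discharge in records:
        gains[physician].append(discharge - admit)
    # Average gain across every patient stay, regardless of physician.
    overall = mean(g for per_doc in gains.values() for g in per_doc)
    return {doc: mean(per_doc) - overall for doc, per_doc in gains.items()}

value_added = gain_score_value_added(stays)
for doc, va in sorted(value_added.items()):
    print(f"{doc}: {va:+.2f}")
```

Notice that the whole ranking rests on which patients each doctor happened to get, which is the same objection raised against teacher VAM when students are not randomly assigned.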

Audrey Amrein-Beardsley has updated her reading lists on value-added assessment. Most of the studies cited show that it is inaccurate, unstable, and unreliable. The error rate is high. Students are not randomly assigned to teachers. Ratings fluctuate from year to year. About 70% of teachers do not teach tested courses. Perhaps that is why other nations do not judge teachers by the rise or fall of the test scores of their students. Unfortunately, in this country, at this time, we have a cult worship of standardized testing, which is used to evaluate students, teachers, principals, and schools. People’s lives hang on the right answer. In a just world this practice would be recognized for what it is: junk science.

Here are her top 15 studies. Open the link to find the top 25, along with links to all of these readings. With Amrein-Beardsley’s help, you too can be an expert.

American Statistical Association (2014). ASA statement on using value-added models for educational assessment. Alexandria, VA.

Amrein-Beardsley, A. (2008). Methodological concerns about the Education Value-Added Assessment System (EVAAS). Educational Researcher, 37(2), 65-75. doi: 10.3102/0013189X08316420.

Amrein-Beardsley, A., & Collins, C. (2012). The SAS Education Value-Added Assessment System (SAS® EVAAS®) in the Houston Independent School District (HISD): Intended and unintended consequences. Education Policy Analysis Archives, 20(12), 1-36.

Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., Ravitch, D., Rothstein, R., Shavelson, R. J., & Shepard, L. A. (2010). Problems with the use of student test scores to evaluate teachers. Washington, D.C.: Economic Policy Institute.

Baker, B. D., Oluwole, J. O., & Green, P. C. (2013). The legal consequences of mandating high stakes decisions based on low quality information: Teacher evaluation in the Race-to-the-Top era. Education Policy Analysis Archives, 21(5), 1-71.

Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation. Phi Delta Kappan, 93(6), 8-15.

Fryer, R. G. (2013). Teacher incentives and student achievement: Evidence from New York City Public Schools. Journal of Labor Economics, 31(2), 373-407.

Haertel, E. H. (2013). Reliability and validity of inferences about teachers based on student test scores. Princeton, NJ: Educational Testing Service.

Hill, H. C., Kapitula, L., & Umland, K. (2011). A validity argument approach to evaluating teacher value-added scores. American Educational Research Journal, 48(3), 794-831. doi:10.3102/0002831210387916

Jackson, C. K. (2012). Teacher quality at the high-school level: The importance of accounting for tracks. Cambridge, MA: The National Bureau of Economic Research.

Newton, X., Darling-Hammond, L., Haertel, E., & Thomas, E. (2010). Value-added modeling of teacher effectiveness: An exploration of stability across models and contexts. Education Policy Analysis Archives, 18(23), 1-27.

Papay, J. P. (2010). Different tests, different answers: The stability of teacher value-added estimates across outcome measures. American Educational Research Journal, 48(1), 163-193. doi:10.3102/0002831210362589

Paufler, N. A. & Amrein-Beardsley, A. (2014). The random assignment of students into elementary classrooms: Implications for value-added analyses and interpretations. American Educational Research Journal, 51(2), 328-362. doi: 10.3102/0002831213508299

Rothstein, J. (2009). Student sorting and bias in value-added estimation: Selection on observables and unobservables. Education Finance and Policy, 4(4), 537-571. doi:

Schochet, P. Z. & Chiang, H. S. (2010). Error rates in measuring teacher and school performance based on student test score gains. Washington DC: U.S. Department of Education.

The studies of value-added measurement keep on coming, and the findings usually show what an utterly absurd idea it is to think that teacher quality can be judged by student test scores. In a just world, Arne Duncan would be held accountable for the stupid and harmful theories he has imposed on the nation’s public schools. The U.S. Department of Education has become a malignant force in American education. I cannot think of any time in our nation’s history when public schools and teachers were literally endangered by the mandates coming from Washington, D.C., where the leadership is wholly ignorant of federalism.

This story in Education Week summarizes the latest batch of studies of VAM. Some researchers, having made this their area of specialization, continue to prod in hopes of good news.

But look at this:

“In a study that appears in the current issue of the American Educational Research Journal, Noelle A. Paufler and Audrey Amrein-Beardsley, a doctoral candidate and an associate professor at Arizona State University, respectively, conclude that elementary school students are not randomly distributed into classrooms. That finding is significant because random distribution of students is a technical assumption that underlies some value-added models.

“Even when value-added models do account for nonrandom classroom assignment, they typically fail to consider behavior, personality, and other factors that profoundly influenced the classroom-assignment decisions of the 378 Arizona principals surveyed. That, too, can bias value-added results.

“Perhaps most provocative of all are the preliminary results of a study that uses value-added modeling to assess teacher effects on a trait they could not plausibly change, namely, their students’ heights. The results of that study, led by Marianne P. Bitler, an economics professor at the University of California, Irvine, have been presented at multiple academic conferences this year.
“The authors found that teachers’ one-year “effects” on student height were nearly as large as their effects upon reading and math. The researchers did not find any correlation between the “value” that teachers “added” to height and the value they added to reading and math. In addition, unlike the reading and math results, which demonstrated some consistency from one year to the next, the height outcomes were not stable over time. The authors suggested that the different properties of the two models offered “some comfort.” Nevertheless, they advised caution.”

So, let’s get this right: teachers’ effects on students’ height were nearly as large as their effect on reading and math.

Perhaps Arne can just arrange to have all teachers fired (except for TFA), close every school (except “no-excuses” charter schools), and turn around the whole country.

Paul Thomas follows Anthony Cody’s previously cited post by describing the unrelenting attack on teachers, which has intensified with the use of statistically inappropriate measures.

He writes:

“As Cody notes above, however, simultaneously political leaders, the media, and the public claim that teachers are the most valuable part of any student’s learning (a factually untrue claim), but that high-poverty and minority students can be taught by those without any degree or experience in education (Teach for America) and that career teachers no longer deserve their profession—no tenure, no professional wages, no autonomy, no voice in what or how they teach.

“And while the media and political leaders maintain these contradictory narratives and support these contradictory policies, value-added methods (VAM) of evaluating and compensating U.S. public teachers are being adopted, again simultaneously, as the research base repeatedly reveals that VAM is yet another flawed use of high-stake accountability and testing.”

Thomas cites review after review to demonstrate that VAM is inaccurate and deeply flawed. Yet the evidence is ignored and VAM is being used as a political weapon by the odd bedfellows of the Obama administration and right-wing governors, as well as some Democratic governors, like Andrew Cuomo of New York and Dannel Malloy of Connecticut, to attack teachers. President Obama made a point of praising the Chetty study in his 2012 State of the Union address, not waiting for the many reviews that showed the error of measuring teacher quality by test scores.

Thomas writes:

“The rhetoric about valuing teachers rings hollow more and more as teaching continues to be dismantled and teachers continue to be devalued by misguided commitments to VAM and other efforts to reduce teaching to a service industry.

“VAM as reform policy, like NCLB, is sham-science being used to serve a corporate need for cheap and interchangeable labor. VAM, ironically, proves that evidence does not matter in education policy.”

Race to the Top placed a $4.45 billion bet that the way to improve schools was to tie teachers’ evaluations to their students’ test scores.

As it happens, the state of Tennessee has been using value-added assessment for 20 years, though the stakes have not been as high as they are now.

What can we learn from the Tennessee experience? According to Andy Spears of the Tennessee Education Report, well, gosh, sorry: nothing.

Spears has a list of lessons learned. Here are the key takeaways:

“4. Tennessee has actually lost ground in terms of student achievement relative to other states since the implementation of TVAAS.

Tennessee received a D on K-12 achievement when compared to other states based on NAEP achievement levels and gains, poverty gaps, graduation rates, and Advanced Placement test scores (Quality Counts 2011, p. 46). Educational progress made in other states on NAEP [from 1992 to 2011] lowered Tennessee’s rankings:

• from 36th/42 to 46th/52 in the nation in fourth-grade math

• from 29th/42 to 42nd/52 in fourth-grade reading

• from 35th/42 to 46th/52 in eighth-grade math

• from 25th/38 (1998) to 42nd/52 in eighth-grade reading.

5. TVAAS tells us almost nothing about teacher effectiveness.

While other states are making gains, Tennessee has remained stagnant or lost ground since 1992 — despite an increasingly heavy use of TVAAS data.

So, if TVAAS isn’t helping kids, it must be because Tennessee hasn’t been using it right, right? Wrong. While education policy makers in Tennessee continue to push the use of TVAAS for items such as teacher evaluation, teacher pay, and teacher license renewal, there is little evidence that value-added data effectively differentiates between the most and least effective teachers.

In fact, this analysis demonstrates that the difference between a value-added identified “great” teacher and a value-added identified “average” teacher is about $300 in earnings per year per student. So, not that much at all. Statistically speaking, we’d call that insignificant. That’s not to say that teachers don’t impact students. It IS to say that TVAAS data tells us very little about HOW teachers impact students.”

Read the whole article.

It is one of the best, most sensible things you will read on value-added assessment. It is a shame that Tennessee has wasted more than $300 million in search of the magic metric that identifies the “best” teachers. It is ridiculous that Congress and the U.S. Department of Education wasted nearly $5 billion to do the same thing, absent any evidence at all. Just think how many libraries they might have kept open, how many health clinics they could have started, how many early childhood programs initiated, how many class sizes reduced for needy kids.

But let’s not confuse the DOE with actual evidence when they have hunches to go on.

E.D. Hirsch, Jr., the founder of the Core Knowledge curriculum, wrote an article opposing value-added teacher evaluation, especially in reading. Hirsch supports the Common Core but thinks it may be jeopardized by the rush to test it and tie the scores to teacher evaluations. He knows this will encourage teaching to the test and other negative consequences.

Hirsch believes that if teachers teach strong subject matter, their students will do well on the reading tests. But he sees the downside of tying test scores to salary and jobs.

He writes:

“The first thing I’d want to do if I were younger would be to launch an effective court challenge to value-added teacher evaluations on the basis of test scores in reading comprehension. The value-added approach to teacher evaluation in reading is unsound both technically and in its curriculum-narrowing effects. The connection between job ratings and tests in ELA has been a disaster for education.”

He is right. Will the so-called reformers who recently became Hirschians listen?

Data hounds continue to search for a measuring stick to identify teacher quality.

They can’t believe they are on a fruitless hunt, like trying to find a barometer or yardstick to say which piece of art is best, which doctor is best, which... as though human judgment means nothing.

Here is Matt Di Carlo summarizing the research on the instability of VAM, meaning that the best teacher this year might be only average next year, or vice versa.

A little known group called Educators for Shared Accountability designed a rubric for evaluating Secretaries of Education. It incorporates multiple measures.

By its metric, Richard Riley was our best national leader.

Check out Secretary Duncan’s value added rating.

The New York City teacher evaluations were released, and there was nearly no media coverage.

Mayor Bloomberg noticed. As Peter Goodman points out on his blog,

“The mayor didn’t like the original law, didn’t like the law which protected teachers from the public release of the scores and doesn’t like the requirement that the details of the plan must be negotiated with the collective bargaining agent, the union.

“On his weekly radio program he made it clear – he has no intention of negotiating a plan – he’ll accept the $250 million cut in state funding unless the union succumbs to all his preconditions. Apparently he “forgot” that the current law prohibits the release of the scores.”

Goodman checked with principals and teachers and they seemed genuinely puzzled by the ratings.

They don’t know what they mean or how they are supposed to help.

“UFT President Mulgrew announced that 6% of teachers were rated “ineffective” and 9% rated “highly effectively.” In order to be charged a teacher must be rated “ineffective” on their overall score or on the VAM and “locally negotiated” section for two consecutive years. When we consider the “instability” of the scores – wide year to year variation – the percentage of teachers impacted will be quite low.”

So very few teachers will be found ineffective, and anyone who is discharged on the basis of these flawed metrics is likely to sue.
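To put rough numbers on "quite low," here is a back-of-the-envelope sketch. It uses only the 6% figure quoted above and simplifies the rule to "rated ineffective two years in a row." The `share_rated_twice` helper and the persistence values are assumptions chosen for illustration, not anything published by the city or the union.

```python
def share_rated_twice(p_ineffective, persistence):
    """Share of all teachers rated ineffective in two consecutive years.

    persistence is the assumed chance that a teacher rated ineffective
    this year is rated ineffective again next year.
    """
    return p_ineffective * persistence

p = 0.06  # 6% rated ineffective, per the figure quoted above

# Assumption: ratings are pure noise, so last year's rating tells you nothing.
noisy = share_rated_twice(p, persistence=p)

# Assumption: ratings are perfectly stable, so the same 6% are flagged again.
stable = share_rated_twice(p, persistence=1.0)

print(f"if ratings were pure noise:     {noisy:.2%}")
print(f"if ratings were perfectly stable: {stable:.2%}")
```

Under these assumptions the share facing charges falls somewhere between about a third of one percent and 6 percent, and the more unstable the scores, the closer it sits to the low end.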

Think of the hundreds of millions wasted on this junk science and how the money might have been used to improve schools.

Linda Darling-Hammond and Edward Haertel of Stanford University explain why value-added assessment doesn’t work and how inaccurate it is.

Will John Deasy listen? Will the Gates Foundation listen?

Will the Los Angeles Times, which published their article, stop seeking names to publish inaccurate data about teacher “effectiveness”?

