Audrey Amrein-Beardsley has updated her reading lists on value-added assessment. Most of the studies cited show that it is inaccurate, unstable, and unreliable. The error rate is high. Students are not randomly assigned to teachers. Ratings fluctuate from year to year. About 70% of teachers do not teach tested courses. Perhaps that is why other nations do not judge teachers by the rise or fall of the test scores of their students. Unfortunately, in this country, at this time, we have a cult worship of standardized testing, which is used to evaluate students, teachers, principals, and schools. People’s lives hang on the right answer. In a just world this practice would be recognized for what it is: junk science.
Here are her top 15 studies. Open the link to find the top 25, along with links to all of these readings. With Beardsley’s help, you too can be an expert.
American Statistical Association (2014). ASA statement on using value-added models for educational assessment. Alexandria, VA.
Amrein-Beardsley, A. (2008). Methodological concerns about the Education Value-Added Assessment System (EVAAS). Educational Researcher, 37(2), 65-75. doi: 10.3102/0013189X08316420.
Amrein-Beardsley, A., & Collins, C. (2012). The SAS Education Value-Added Assessment System (SAS® EVAAS®) in the Houston Independent School District (HISD): Intended and unintended consequences. Education Policy Analysis Archives, 20(12), 1-36.
Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., Ravitch, D., Rothstein, R., Shavelson, R. J., & Shepard, L. A. (2010). Problems with the use of student test scores to evaluate teachers. Washington, D.C.: Economic Policy Institute.
Baker, B. D., Oluwole, J. O., & Green, P. C. (2013). The legal consequences of mandating high stakes decisions based on low quality information: Teacher evaluation in the Race-to-the-Top era. Education Policy Analysis Archives, 21(5), 1-71.
Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation. Phi Delta Kappan, 93(6), 8-15.
Fryer, R. G. (2013). Teacher incentives and student achievement: Evidence from New York City Public Schools. Journal of Labor Economics, 31(2), 373-407.
Haertel, E. H. (2013). Reliability and validity of inferences about teachers based on student test scores. Princeton, NJ: Educational Testing Service.
Hill, H. C., Kapitula, L., & Umland, K. (2011). A validity argument approach to evaluating teacher value-added scores. American Educational Research Journal, 48(3), 794-831. doi:10.3102/0002831210387916
Jackson, C. K. (2012). Teacher quality at the high-school level: The importance of accounting for tracks. Cambridge, MA: The National Bureau of Economic Research.
Newton, X., Darling-Hammond, L., Haertel, E., & Thomas, E. (2010). Value-added modeling of teacher effectiveness: An exploration of stability across models and contexts. Education Policy Analysis Archives, 18(23), 1-27.
Papay, J. P. (2010). Different tests, different answers: The stability of teacher value-added estimates across outcome measures. American Educational Research Journal, 48(1), 163-193. doi:10.3102/0002831210362589
Paufler, N. A. & Amrein-Beardsley, A. (2014). The random assignment of students into elementary classrooms: Implications for value-added analyses and interpretations. American Educational Research Journal, 51(2), 328-362. doi: 10.3102/0002831213508299
Rothstein, J. (2009). Student sorting and bias in value-added estimation: Selection on observables and unobservables. Education Finance and Policy, 4(4), 537-571. doi: 10.1162/edfp.2009.4.4.537
Schochet, P. Z. & Chiang, H. S. (2010). Error rates in measuring teacher and school performance based on student test score gains. Washington DC: U.S. Department of Education.

This is an impressive list of sources. Will teachers need to become amateur lawyers in order to defend themselves from termination based on faulty employment practices by our supervisors?
Not being a lawyer myself, I hope someone commenting here might address equal protection under the law, or some other legal defense for teachers who are or will be subject to termination using VAM.
Seems pretty unconstitutional to me that math and English teachers are subject to a different set of evaluation standards than a physical education teacher, or even a history teacher for that matter. Somehow, I can’t get the words ‘equal protection’ out of my head. I’m sure a sharp legal mind here can set me straight.
A great list on the VAM scam. Now we need a list for the farce that has been marketed to rate the estimated 70% of teachers who have job assignments for which there are not statewide tests. That farce is known as meeting or attaining an SLO, student learning objective, student growth objective, or variant about which I have posted.
SLOs are a version of Peter Drucker’s principles for management-by-objectives long since abandoned by the most successful companies, but now a requirement in at least 26 states.
All of those unruly teachers who are wandering aimlessly need the discipline of top down command and control management, starting with a writing assignment.
This assignment requires setting targets for learning and “growth” in learning after studying baseline data about students, then identifying the assessments to measure learning, and then putting scores on the post-tests into a computer and, bingo, getting a score on your skill in predicting whether your students would meet the targets, as well as on the outcome of that prediction.
This farce, a proxy for VAM, counts for up to 50% of a teacher’s evaluation in multiple states.
Only one additional point. A USDE commissioned review of the research literature on SLOs produced exactly TWO peer reviewed studies about this convoluted process, nothin connection with pay for performance plans.
There is no evidence to support the reliability or validity of this method of evaluation. Claims that it improves instruction and learning are supported only by circular reasoning. Better instruction means you comply with the SLO. Improved learning means that you are able to teach to the tests that define student learning.
SLOs count for 60% in NYS.
I want to thank you for continuing to bring this issue to the forefront. SLOs (SGOs) are used to evaluate the majority of public school teachers, in place of VAM scoring. Here’s the link to the NYS system:
Click to access slo-guidance.pdf
Two additional points.
1) I urge readers of this blog to buy and read and reread Audrey Amrein-Beardsley’s RETHINKING VALUE-ADDED MODELS IN EDUCATION: CRITICAL PERSPECTIVES ON TESTS AND ASSESSMENT-BASED ACCOUNTABILITY (2014).
2) Consider getting THE ESSENTIAL DEMING: LEADERSHIP PRINCIPLES FROM THE FATHER OF QUALITY W. EDWARDS DEMING (2013, Joyce Orsini, ed.).
Consider how the self-styled leaders of the “new civil rights movement of our time” aka “education reform” customarily and casually employ mathematical intimidation and obfuscation. Then just as one example among many: LAUSD Superintendent John Deasy’s recent use of a 77% graduation rate that was really 67%. Now the following comment by Deming over 30 years ago: “A numerical goal is a number drawn out of the sky. … Anybody can achieve almost anything by distortion and faking, redefinition of terms, running up costs.” (p. 55)
Uh, yeah, like Deasy leaving out the “graduation rate” suppressors so he can juke the numbers!
😡
And from p. 199 (from a piece written in 1992!):
[start quote]
Let’s talk about education for a minute. There is deep concern in the United States today about education. No notable improvement will come until our schools abolish grades (A, B, C, D) in schools from toddlers, on up through the university. Grades are often a forced ranking. Only 20 percent permitted to get As. Thirty percent may get Bs, 30 percent may get Cs, 20 percent may get Ds. Forced ranking. You mean there’s a shortage of good pupils? Well, I don’t think so. Why should there be a shortage? I don’t believe there is a shortage. Only 20 percent? That’s nonsense. Maybe there aren’t any; maybe everybody should get As. Forced ranking is wrong, I believe.
Abolish merit ratings for teachers. Who knows what a great teacher is? Not until years have gone by. Abolish comparison of schools on the basis of scores. The aim is to get a high score, not to learn, but to cram your head full of information. Abolish gold stars for athletics. Indeed, we’re worse off than we thought we were.
[end quote]
Agree or disagree, in whole or in part, but notice just that use of the phrase “forced ranking.” A predictably failed policy that wrecked Microsoft but Bill Gates, Mr. Forced Ranking himself, wants to impose it on the entire USofA via VAM with its conjoined twin, high-stakes standardized testing—even given its well-documented poor track record in the one place he supposedly knows the most about!
Sometimes I think that the best label to apply to the movers and shakers of the charterite/privatizer movement is “thought leaders of failure.”
They not only don’t get it right; they seem incapable of self-correction when they double and triple down on proven megaflops.
Just sayin’…
😎
I hate to do this to you, but we all should have seen this coming 🙂
They think it’s working so great for teachers, they now want to extend it to physicians:
“Like teacher value added measures that calculate student test score gains, we estimate physician value added based on changes in health status during the course of a hospitalization. We then tie our measures of physician value added to patient outcomes, including length of hospital stay, total charges, health status at discharge, and readmission. The estimated value added varied substantially across physicians and was highly stable for individual physicians. Patients of physicians in the 75th versus 25th percentile of value added had, on average, shorter length of stay (4.76 vs 5.08 days), lower total costs ($17,811 vs $19,822) and higher discharge health status (8% of a standard deviation). Our findings provide evidence to support a new method of determining physician value added in the context of inpatient care that could have wide applicability across health care setting and in estimating value added of other health care providers (nurses, staff, etc).”
I don’t know if you talk to any primary care physicians, but they sound exactly like K-12 teachers. They are miserable. They say economic theory (government) and MBAs (private sector) have consumed their profession and sucked all the joy out of it.
http://www.nber.org/papers/w20534?utm_campaign=ntw&utm_medium=email&utm_source=ntw
One of the leading causes of death and complications following surgery is being released too soon! Bean counters put a monetary value on everything but seemingly know the value of nothing.