Archives for category: Teacher Evaluations

The New York State United Teachers decided not to endorse Andrew Cuomo for re-election, nor anyone else.

NYSUT has 600,000 members and a strong get-out-the-vote operation. It did endorse the Democratic Attorney General Eric Schneiderman and Comptroller Thomas DiNapoli.

“Those who earn endorsements are friends of public education and labor,” NYSUT president Karen Magee said in a statement. “Over the last two years, they earned our support by advocating effectively for our public schools, colleges and health care institutions; listening intently to the concerns and aspirations of our members, and voting consistently the right way.”

Cuomo is a staunch advocate of privately managed charter schools and has received large campaign contributions from the hedge fund industry, which supports charter schools. Only 3% of the state’s children are enrolled in charter schools. Cuomo is also a firm advocate of evaluating teachers by the test scores of their students, although he gives no sign of knowing that this approach has been tried, and has failed, in many other places.

Stephanie Simon of politico.com reports on the story behind Michelle Rhee-Johnson’s decision to step down as leader of StudentsFirst, the organization she founded in 2010.

Although she managed to raise millions from big donors like the Eli and Edythe Broad Foundation, the Walton Family Foundation, and the Michael Bloomberg Foundation for her efforts to curb collective bargaining, eliminate tenure, and promote vouchers and charters, she fell far short of her announced goal of $1 billion.

But even more important, Rhee-Johnson alienated some of her allies in the movement.

“As she prepares to step down as CEO, she leaves a trail of disappointment and disillusionment. Reform activists who shared her vision say she never built an effective national organization and never found a way to use her celebrity status to drive real change.

“StudentsFirst was hobbled by a high staff turnover rate, embarrassing PR blunders and a lack of focus. But several leading education reformers say Rhee’s biggest weakness was her failure to build coalitions; instead, she alienated activists who should have been her natural allies with tactics they perceived as imperious, inflexible and often illogical. Several said her biggest contribution to the cause was drawing fire away from them as she positioned herself as the face of the national education reform movement.

“‘There was a growing consensus in the education reform community that she didn’t play well in the sandbox,’ one reform leader said.

Rhee-Johnson says she intends to devote more time to her family, which some assume means that her husband Kevin Johnson may run for governor or senator of California. Whether Rhee-Johnson will spend more time with her two daughters who live in Tennessee is unclear.

She recently announced her decision to become chairman of her husband’s charter schools. In some states, that would be considered nepotism, but apparently not in California.

The growing recognition of the failure of her style of high-stakes testing and test-based teacher evaluation did not seem to have played a role in her decision to step aside. Probably, living in the corporate reform echo chamber, she was unaware that her prize policies are on the ropes, as parents and teachers join to fight the reign of standardized testing.

Peter Greene writes a farewell letter to Michelle and dissects Campbell Brown’s talking points.

Peter speculates on Michelle’s departure and hopes she understands why she provoked the reactions she did:

“Maybe education was providing too few rewards and too much tempest. People have called you some awful names and said some terrible personal things about you, and though I have called you the Kim Kardashian of education, I don’t condone or support the ugly personal attacks that are following you out the door. But I understand them– you have done some awful, awful things, and I’m not sure that it’s ever seemed, from out here in the cheap seats, that you understand that teachers and students are real, live human beings and not simply props for whatever publicity moment you are staging. I’m not saying that you deserve the invective being hurled at you; I am saying that when you poke a bear in the face repeatedly, it eventually gets up and takes a bite out of you.”

As Michelle leaves, stage right, she hands off the baton as leader of the campaign against “bad teachers” to Campbell Brown, who explained her goals in an opinion piece. Since Brown is not an educator, Greene feels he must explain that her effort to take away tenure from “bad teachers” will take away tenure from ALL teachers. He asks how she will judge the quality of teachers: if she really wants to protect those who are caring, dependable, and inspiring, how can she be sure that these are the same teachers whose students get higher scores?

Ultimately, he wants to know how removing employment protections will attract more “great” teachers to the classroom.

Campbell would do well to listen to Peter.

Joy Resmovits of Huffington Post reports that Michelle Rhee is stepping down as leader of StudentsFirst, a group she founded in 2010. She is likely to remain a board member. She recently changed her name to Michelle Johnson.

“StudentsFirst was launched on Oprah’s TV talk show in late 2010 and immediately set ambitious goals, such as amassing $1 billion in its first year and becoming education’s lobbying equivalent to the National Rifle Association. Its policy goals focused on teacher quality, teacher evaluations, school accountability and the expansion of charter schools. But the group has failed to achieve some of its major goals. After revising its fundraising goal to $1 billion over five years, the group only netted $62.8 million in total: $7.6 million in its first year, $28.5 million in its second year and $26.7 million between August 2012 and July 2013. The group also has seen much staff turnover, cycling through at least five prominent spokespeople since 2010.

“After the group began, it saw some legislative and electoral successes. It claims credit for changing more than 130 education laws in many states. It has released report cards ranking states on their education policies, supported candidates through political action committees, and lobbied state legislatures and governors on reform issues.”

Although Rhee always claimed to be a Democrat, most of her group’s campaign contributions went to conservative Republicans. Last year, StudentsFirst honored Tennessee State Representative John Ragan as “education reformer of the year,” despite the fact that he was a co-sponsor of the infamous “don’t say gay” bill. She opposed unions, tenure, and seniority, and she supported vouchers and charters. She was a leader of the privatization movement as well as the movement to evaluate teachers by test scores. Ironically, her successor in the District of Columbia announced yesterday the suspension of test-based evaluation of teachers, a move supported by the Gates Foundation.

Resmovits speculates that former CNN news anchor Campbell Brown will become the face of the movement to strip due process rights from teachers. StudentsFirst, however, is unlikely to have the national visibility that it had under Rhee’s controversial leadership.

District of Columbia Chancellor Kaya Henderson announced the suspension of the practice of evaluating teachers by their students’ test scores. This practice was considered the signal policy initiative of Henderson’s predecessor Michelle Rhee.

Henderson described the move as “necessary in order to allow students to acclimate themselves to new tests built around the standards established by the Common Core.”

The Gates Foundation applauded the retreat on test-based evaluation, as did Randi Weingarten of the AFT. The U.S. Department of Education released a statement expressing its disappointment. It said: “Although we applaud District of Columbia Public Schools (DCPS) for their continued commitment to rigorous evaluation and support for their teachers, we know there are many who looked to DCPS as a pacesetter who will be disappointed with their desire to slow down.” Since test-based evaluation is the centerpiece of Arne Duncan’s Race to the Top, this is surely a setback for Duncan and his theory of change. On the other hand, D.C.’s test scores have been stagnant since 2009, which does not speak well of test-based evaluation, whether pushed by Michelle Rhee or Arne Duncan.

By the way, Michelle Rhee has apparently changed her name to Michelle (Rhee) Johnson.

This is an important article, which criticizes and deconstructs the notorious VAM study by Chetty et al. I refer to it as notorious because it was reported on the front page of the New York Times before it was peer-reviewed; it was immediately featured on the PBS Newshour; and President Obama referred to its findings in his State of the Union address only weeks after it first appeared.

These miraculous events do not happen by accident. The study made grand claims for the importance of value-added measures of teacher quality, a keystone of Obama’s education policy. One of the authors told the New York Times that the lesson of the study was to fire teachers sooner rather than later. A few months ago, the American Statistical Association responded to the study, not harshly, but made clear that its claims were overstated: teachers account for only 1% to 14% of the variability in test scores, and changes to the system as a whole would likely have more influence on students’ academic outcomes than attaching the scores of students to individual teachers.

I have said it before, and I will say it again: VAM is Junk Science. Looking at children as machine-made widgets and looking at learning solely as standardized test scores may thrill some econometricians, but it has nothing to do with the real world of children, learning, and teaching. It is a grand theory that might net its authors a Nobel Prize for its grandiosity, but it is both meaningless in relation to any genuine concept of education and harmful in its mechanistic and reductive view of humanity.

CHETTY, ET AL, ON THE AMERICAN STATISTICAL ASSOCIATION’S RECENT POSITION STATEMENT ON VALUE-ADDED MODELS (VAMs): FIVE POINTS OF CONTENTION

by Margarita Pivovarova, Jennifer Broatch & Audrey Amrein-Beardsley — August 01, 2014

Over the last decade, teacher evaluation based on value-added models (VAMs) has become central to the public debate over education policy. In this commentary, we critique and deconstruct the arguments proposed by the authors of a highly publicized study that linked teacher value-added estimates to students’ long-run outcomes, Chetty et al. (2014, forthcoming), in their response to the American Statistical Association’s statement on VAMs. We draw on recent academic literature to support our counter-arguments along the main points of contention: the causality of VAM estimates, the transparency of VAMs, the effect of non-random sorting of students on VAM estimates, and the sensitivity of VAMs to model specification.

INTRODUCTION

Recently, the authors of a highly publicized and cited study that linked teacher value-added estimates to the long-run outcomes of their students (Chetty, Friedman, & Rockoff, 2011; see also Chetty, et al., in press I, in press II) published a “point-by-point” discussion of the “Statement on Using Value-Added Models for Educational Assessment” released by the American Statistical Association (ASA, 2014). This once again brought the value-added model (VAM) and its use for increased teacher and school accountability to the forefront of heated policy debate.

In this commentary we elaborate on some of the statements made by Chetty, et al. (2014). We position both the ASA’s statement and Chetty, et al.’s (2014) response within the current academic literature. As well, we deconstruct the critiques and assertions advanced by Chetty, et al. (2014) by providing counter-arguments and supporting them with scholarly research on this topic.

In doing so, we rely on the current research literature on this subject from the past ten years. This more representative literature was completely overlooked by Chetty, et al. (2014), even though, paradoxically, they criticize the ASA for not citing the “recent” literature appropriately themselves (p. 1). With this being our first point of contention, we also discuss four additional points of dispute within the commentary.

POINT 1: MISSING LITERATURES

In their critique of the ASA statement, posted on a university-sponsored website, Chetty, et al. (2014) marginalize the current literature published in scholarly journals on the issues surrounding VAMs and their uses for measuring teacher effectiveness. Rather, Chetty et al. cite only econometricians’ scholarly pieces, apparently in support of their a priori arguments and ideas. Hence, it is important to make explicit the rather odd and extremely selective literature Chetty, et al. included in the reference section of their critique, on which Chetty, et al. relied “to prove” some of the ASA’s statements incorrect. The whole set of peer-reviewed articles that counter Chetty, et al.’s arguments and ideas is completely left out of their discussion.

A search on the Educational Resources Information Center (ERIC) with “value-added” as key words for the last five years yields 406 entries, and a similar search in Journal Storage (JSTOR, a shared digital library) returns 495. Chetty, et al., however, cite only 13 references to critique the ASA’s statement, one of which was the actual statement itself, leaving 12 external citations in total in support of their critique. Of these 12 external citations, three are references to their two forthcoming studies and a replication of these studies’ methods; three have thus far been published in peer-reviewed academic journals; six were written by their colleagues at Harvard University; and 11 were written by teams of scholars with economics professors/econometricians as lead authors.

POINT 2: CORRELATION VERSUS CAUSATION

The second point of contention surrounds whether the users of VAMs should be aware of the fact that VAMs typically measure correlation, not causation. According to the ASA, as pointed out by Chetty, et al. (2014), effects “positive or negative—attributed to a teacher may actually be caused by other factors that are not captured by the model” (p. 2). This is an important point with major policy implications. Seminal publications on the topic, Rubin, Stuart and Zanutto (2004) and Wainer (2004), who positioned their discussion within the Rubin Causal Model framework (Rubin, 1978; Rosenbaum & Rubin, 1983; Holland, 1986), clearly communicated, and evidenced, that value-added estimates cannot be considered causal unless a set of “heroic assumptions” is agreed to and imposed. Moreover, “anyone familiar with education will realize that this [is]…fairly unrealistic” (Rubin, et al. 2004, p. 108). Instead, Rubin, et al. suggested, given these issues with confounded causation, we should switch gears and evaluate interventions and reward incentives based on the descriptive qualities of the indicators and estimates derived via VAMs. This point has since gained increased consensus among other scholars conducting research in these areas (Amrein-Beardsley, 2008; Baker, et al., 2010; Betebenner, 2009; Braun, 2008; Briggs & Domingue, 2011; Harris, 2011; Reardon & Raudenbush, 2009; Scherrer, 2011).

POINT 3: THE NON-RANDOM ASSIGNMENT OF STUDENTS INTO CLASSROOMS

The third point of contention pertains to Chetty, et al.’s statement that recent experimental and quasi-experimental studies have already solved the “causation versus correlation” issue. This claim is made despite the substantive research that evidences how the non-random assignment of students constrains VAM users’ capacities to make causal claims.

The authors of the Measures of Effective Teaching (MET) study cited by Chetty, et al. in their critique clearly state, “we cannot say whether the measures perform as well when comparing the average effectiveness of teachers in different schools…given the obvious difficulties in randomly assigning teachers or students to different schools” (Kane, McCaffrey, Miller & Staiger, 2013, p. 38). VAM estimates were found to be biased for teachers who taught relatively more homogeneous sets of students with lower levels of prior achievement, despite the level of sophistication of the statistical controls used (Hermann, Walsh, Isenberg, & Resch, 2013; see also Ehlert, Koedel, Parsons, & Podgursky, 2014; Guarino et al., 2012).

Researchers repeatedly demonstrated that non-random assignment confounds value-added estimates independent of how many sophisticated controls are added to the model (Corcoran, 2010; Goldhaber, Walch, & Gabele, 2012; Guarino, Maxfield, Reckase, Thompson, & Wooldridge, 2012; Newton, Darling-Hammond, Haertel, & Thomas, 2010; Paufler & Amrein-Beardsley, 2014; Rothstein, 2009, 2010).

Even in experimental settings, it is still not possible to distinguish between the effects of school practice, which are of interest to policy-makers, and the effects of school and home context. There are many factors at the student, classroom, school, home, and neighborhood levels, beyond researchers’ control, that would confound causal estimates. Thus, the four experimental studies cited by Chetty, et al. (2014) do not provide ample evidence to refute the ASA on this point.
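The sorting bias the researchers describe can be seen in a toy simulation. The sketch below is purely a hypothetical illustration (Python, with made-up parameters, not drawn from any of the cited studies): two teachers have identical true effects, but students are sorted to them by a noisy pre-test, so regression to the mean makes a naive gain-based value-added estimate reward the teacher who received the lower-scoring students.

```python
import random
import statistics

random.seed(42)

# Hypothetical setup: ability is stable; pre- and post-tests each add noise.
N = 20000
students = []
for _ in range(N):
    ability = random.gauss(0, 1)
    prior = ability + random.gauss(0, 0.5)   # noisy pre-test score
    post = ability + random.gauss(0, 0.5)    # both teachers have ZERO true effect
    students.append((prior, post))

# Non-random sorting: low pre-test scorers go to teacher A, the rest to teacher B.
cut = statistics.median(p for p, _ in students)
gains_a = [post - prior for prior, post in students if prior < cut]
gains_b = [post - prior for prior, post in students if prior >= cut]

# A naive gain-based "value-added" estimate per teacher:
print(statistics.mean(gains_a))  # spuriously positive (about +0.18 in expectation)
print(statistics.mean(gains_b))  # spuriously negative, by symmetry
```

No statistical control on the observed pre-test can undo this, because the sorting variable itself contains the measurement error; that is the sense in which bias can survive even sophisticated controls.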

POINT 4: ISSUES WITH LARGE-SCALE STANDARDIZED TEST SCORES

In their position statement, the ASA authors (2014) rightfully state that the standardized test scores used in VAMs should not be the only outcomes of interest to policy makers and stakeholders. Indeed, the current agreement is that test scores may not even be among the most important outcomes capturing a student’s educated self. Also, if value-added estimates from standardized test scores cannot be interpreted as causal, then the effect of “high value-added” teachers on college attendance, earnings, and reduced teenage birth rates cannot be considered causal either, contrary to what is implied by Chetty, et al. (2011; see also Chetty, et al., in press I, in press II).

Ironically, Chetty, et al. (2014) cite Jackson’s (2013) study to confirm their point that high value-added teachers also improve the long-run outcomes of their students. Jackson (2013), however, actually found that the teachers who are good at boosting test scores are not always the same teachers who have positive and long-lasting effects on non-cognitive skills acquisition. Moreover, value-added estimates based on test scores and those based on non-cognitive outcomes for the same teachers were then, and have since been, shown to be weakly correlated with one another.

POINT 5: MODEL SPECIFICITY

Lastly, the ASA (2014) expressed concerns about the sensitivity of value-added estimates to model specification. Recently, researchers have found that value-added estimates are highly sensitive to the tests being used, even within the same subject areas (Papay, 2011), and to the different subject areas taught by the same teachers given different student compositions (Loeb & Candelaria, 2012; Newton, et al., 2010; Rothstein, 2009, 2010). While Chetty, et al. rightfully noted that different VAMs typically yield correlations around r = 0.9, this is typical of most “garbage in, garbage out” models. These models are too often used, too often without question, to process questionable input and produce questionable output (Banchero & Kesmodel, 2011; Gabriel & Lester, 2012, 2013; Harris, 2011).

What Chetty, et al. overlooked, though, are the repeatedly demonstrated weak correlations between value-added estimates and other indicators of teacher quality, on average between r = 0.3 and 0.5 (see also Broatch & Lohr, 2012; Corcoran, 2010; Goldhaber et al., 2012; McCaffrey, Sass, Lockwood, & Mihaly, 2009; Mihaly, McCaffrey, Staiger, & Lockwood, 2013).
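The contrast between those two correlations, roughly r = 0.9 between VAM specifications but r = 0.3–0.5 against other quality indicators, is exactly what one would expect when different models reprocess the same noisy signal. A minimal sketch (Python, with invented parameters rather than real data) illustrates the point:

```python
import math
import random

random.seed(0)

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

N = 10000
# True teacher quality is never observed directly...
quality = [random.gauss(0, 1) for _ in range(N)]
# ...and the test-score data capture it only weakly (assumed signal strength 0.4).
signal = [0.4 * q + random.gauss(0, 1) for q in quality]

# Two VAM specifications reprocess the SAME noisy signal.
vam1 = [s + random.gauss(0, 0.3) for s in signal]
vam2 = [s + random.gauss(0, 0.3) for s in signal]

print(pearson(vam1, vam2))    # high (~0.9): the specifications agree with each other
print(pearson(vam1, quality)) # weak (~0.36): agreement with true quality is poor
```

High agreement among specifications, in other words, tells us only that they share inputs, not that those inputs measure teaching well.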

CONCLUSION

In sum, these are only a few “points” from this “point-by-point discussion” that would strike anyone even fairly familiar with the debate over the use and abuse of VAMs. These “points” are especially striking given the impact Chetty, et al.’s original (2011) study and now forthcoming studies (Chetty, et al., in press I, in press II) have already had on actual policy and the policy debates surrounding VAMs. Chetty, et al.’s (2014) discussion of the ASA statement, however, should give others pause as to whether Chetty, et al. are indeed experts in this field. What has certainly become evident is that they have not wrapped their minds around the extensive literature on this topic. If they had, they might not have come across as so selective, as well as biased, citing only those representing certain disciplines and certain studies to support certain assumptions and “facts” upon which their criticisms of the ASA statement were based.

References

American Statistical Association. (2014). ASA Statement on using value-added models for educational assessment. Retrieved from http://www.amstat.org/policy/pdfs/ASA_VAM_Statement.pdf

Amrein-Beardsley, A. (2008). Methodological concerns about the Education Value-Added Assessment System (EVAAS). Educational Researcher, 37(2), 65–75. doi: 10.3102/0013189X08316420

Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., Ravitch, D., Rothstein, R., Shavelson, R. J., & Shepard, L. A. (2010). Problems with the use of student test scores to evaluate teachers. Washington, D.C.: Economic Policy Institute. Retrieved from http://www.epi.org/publications/entry/bp278

Banchero, S. & Kesmodel, D. (2011, September 13). Teachers are put to the test: More states tie tenure, bonuses to new formulas for measuring test scores. The Wall Street Journal. Retrieved from http://online.wsj.com/article/SB10001424053111903895904576544523666669018.html

Betebenner, D. W. (2009). Norm- and criterion-referenced student growth. Educational Measurement: Issues and Practice, 28(4), 42–51. doi:10.1111/j.1745-3992.2009.00161.x

Braun, H. I. (2008). Vicissitudes of the validators. Presentation made at the 2008 Reidy Interactive Lecture Series, Portsmouth, NH. Retrieved from http://www.cde.state.co.us/cdedocs/OPP/HenryBraunLectureReidy2008.ppt

Briggs, D. & Domingue, B. (2011, February). Due diligence and the evaluation of teachers: A review of the value-added analysis underlying the effectiveness rankings of Los Angeles Unified School District Teachers by the Los Angeles Times. Boulder, CO: National Education Policy Center. Retrieved from nepc.colorado.edu/publication/due-diligence

Broatch, J., & Lohr, S. (2012). Multidimensional assessment of value added by teachers to real-world outcomes. Journal of Educational and Behavioral Statistics, 37(2), 256–277.

Chetty, R., Friedman, J. N., & Rockoff, J. E. (2011). The long-term impacts of teachers: Teacher value-added and student outcomes in adulthood. Cambridge, MA: National Bureau of Economic Research (NBER), Working Paper No. 17699. Retrieved from http://www.nber.org/papers/w17699

Chetty, R., Friedman, J. N., & Rockoff, J. (2014). Discussion of the American Statistical Association’s Statement (2014) on using value-added models for educational assessment. Retrieved from http://obs.rc.fas.harvard.edu/chetty/ASA_discussion.pdf

Chetty, R., Friedman, J. N., & Rockoff, J. E. (in press I). Measuring the impact of teachers I: Teacher value-added and student outcomes in adulthood. American Economic Review.

Chetty, R., Friedman, J. N., & Rockoff, J. E. (in press II). Measuring the impact of teachers II: Evaluating bias in teacher value-added estimates. American Economic Review.

Corcoran, S. (2010). Can teachers be evaluated by their students’ test scores? Should they be? The use of value added measures of teacher effectiveness in policy and practice. Educational Policy for Action Series. Retrieved from: http://files.eric.ed.gov/fulltext/ED522163.pdf

Ehlert, M., Koedel, C., Parsons, E., & Podgursky, M. J. (2014). The sensitivity of value-added estimates to specification adjustments: Evidence from school- and teacher-level models in Missouri. Statistics and Public Policy. 1(1), 19–27.

Gabriel, R., & Lester, J. (2012). Constructions of value-added measurement and teacher effectiveness in the Los Angeles Times: A discourse analysis of the talk surrounding measures of teacher effectiveness. Paper presented at the Annual Conference of the American Educational Research Association (AERA), Vancouver, Canada.

Gabriel, R. & Lester, J. N. (2013). Sentinels guarding the grail: Value-added measurement and the quest for education reform. Education Policy Analysis Archives, 21(9), 1–30. Retrieved from http://epaa.asu.edu/ojs/article/view/1165

Goldhaber, D., & Hansen, M. (2013). Is it just a bad class? Assessing the long-term stability of estimated teacher performance. Economica, 80, 589–612.

Goldhaber, D., Walch, J., & Gabele, B. (2012). Does the model matter? Exploring the relationships between different student achievement-based teacher assessments. Statistics and Public Policy, 1(1), 28–39.

Guarino, C. M., Maxfield, M., Reckase, M. D., Thompson, P., & Wooldridge, J.M. (2012, March 1). An evaluation of Empirical Bayes’ estimation of value-added teacher performance measures. East Lansing, MI: Education Policy Center at Michigan State University. Retrieved from http://www.aefpweb.org/sites/default/files/webform/empirical_bayes_20120301_AEFP.pdf

Harris, D. N. (2011). Value-added measures in education: What every educator needs to know. Cambridge, MA: Harvard Education Press.

Hermann, M., Walsh, E., Isenberg, E., & Resch, A. (2013). Shrinkage of value-added estimates and characteristics of students with hard-to-predict achievement levels. Princeton, NJ: Mathematica Policy Research. Retrieved from http://www.mathematica-mpr.com/publications/PDFs/education/value-added_shrinkage_wp.pdf

Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945–960.

Jackson, K. C. (2012). Non-cognitive ability, test scores, and teacher quality: Evidence from 9th grade teachers in North Carolina. Cambridge, MA: National Bureau of Economic Research (NBER), Working Paper No. 18624. Retrieved from http://www.nber.org/papers/w18624

Kane, T., McCaffrey, D., Miller, T. & Staiger, D. (2013). Have we identified effective teachers? Validating measures of effective teaching using random assignment. Bill and Melinda Gates Foundation. Retrieved from http://www.metproject.org/downloads/MET_Validating_Using_Random_Assignment_Research_Paper.pdf

Loeb, S., & Candelaria, C. (2013). How stable are value-added estimates across years, subjects and student groups? Carnegie Knowledge Network. Retrieved from http://carnegieknowledgenetwork.org/briefs/value‐added/value‐added‐stability

McCaffrey, D. F., Sass, T. R., Lockwood, J. R., & Mihaly, K. (2009). The intertemporal variability of teacher effect estimates. Education Finance and Policy, 4, 572–606.

Mihaly, K., McCaffrey, D., Staiger, D. O., & Lockwood, J. R. (2013). A composite estimator of effective teaching. Seattle, WA: Bill and Melinda Gates Foundation. Retrieved from: http://www.metproject.org/downloads/MET_Composite_Estimator_of_Effective_Teaching_Research_Paper.pdf

Newton, X. A., Darling-Hammond, L., Haertel, E., & Thomas, E. (2010). Value added modeling of teacher effectiveness: An exploration of stability across models and contexts. Educational Policy Analysis Archives, 18(23). Retrieved from: epaa.asu.edu/ojs/article/view/810.

Papay, J. P. (2011). Different tests, different answers: The stability of teacher value-added estimates across outcome measures. American Educational Research Journal, 48(1), 163–193.

Paufler, N. A., & Amrein-Beardsley, A. (2014). The random assignment of students into elementary classrooms: Implications for value-added analyses and interpretations. American Educational Research Journal.

Reardon, S. F., & Raudenbush, S. W. (2009). Assumptions of value-added models for estimating school effects. Education Finance and Policy, 4(4), 492–519. doi:10.1162/edfp.2009.4.4.492

Rosenbaum, P., & Rubin, D. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55.

Rothstein, J. (2009). Student sorting and bias in value-added estimation: Selection on observables and unobservables. Education Finance and Policy, (4)4, 537–571. doi:http://dx.doi.org/10.1162/edfp.2009.4.4.537

Rothstein, J. (2010, February). Teacher quality in educational production: Tracking, decay, and student achievement. Quarterly Journal of Economics. 175–214. doi:10.1162/qjec.2010.125.1.175

Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. The Annals of Statistics, 6, 34–58

Rubin, D. B., Stuart, E. A., & Zanutto, E. L. (2004). A potential outcomes view of value-added assessment in education. Journal of Educational and Behavioral Statistics, 29(1), 103–116.

Scherrer, J. (2011). Measuring teaching using value-added modeling: The imperfect panacea. NASSP Bulletin, 95(2), 122–140. doi:10.1177/0192636511410052

Wainer, H. (2004). Introduction to a special issue of the Journal of Educational and Behavioral Statistics on value-added assessment. Journal of Educational and Behavioral Statistics, 29(1), 1–3. doi:10.3102/10769986029001001

Cite This Article as: Teachers College Record, Date Published: August 01, 2014, http://www.tcrecord.org, ID Number: 17633.

This is a good news story about a state commissioner of education who stood up and said, with quiet determination, that the emperor has no clothes.

That state commissioner is Rebecca Holcombe of Vermont. She wrote a clear and eloquent letter to the parents and caregivers of Vermont, explaining the punitive and incoherent nature of federal education policy, which (under NCLB) requires that every single school in Vermont be labeled low-performing, even though many national and international measures show that Vermont is a high-performing state. She explained that Vermont refused to apply for a waiver from NCLB offered by Secretary Duncan because it would have forced the state to evaluate teachers by their students’ test scores, a practice that is unreliable and unfair to teachers and students.

Commissioner Holcombe wrote that Vermont believes that schools have purposes that are no less important (and perhaps more important) than test scores.

For her thoughtfulness, her integrity, her devotion to children, her understanding of the broad aims of education, and her courage in standing firm against ruinous federal policies, Rebecca Holcombe is a hero of American education. Most people go along with the crowd, even when doing so violates their sense of personal and professional ethics. Not Commissioner Holcombe. If our nation had more state commissioners like her, it would save our children from a mindless culture of test and punish that the federal department of education has imposed on them and our nation’s schools.

This is the letter that State Commissioner Holcombe wrote to every parent and caregiver in Vermont:

“Under the No Child Left Behind Act (NCLB), as of 2014, if only one child in your school does not score as “proficient” on state tests, then your school must be “identified” as “low performing” under federal law. This year, every school whose students took the NECAP tests last year is now considered a “low performing” school by the US Department of Education. A small group of schools were not affected by this policy this year because they helped pilot the new state assessment and so did not take the NECAPs last year. Because these schools had their federal AYP status frozen at 2013 levels, eight schools are not yet identified as low performing by federal criteria. However, had these schools taken the NECAPs as well, it is likely that every single school in the state would have to be classified as “low performing” according to federal guidelines.

“The Vermont Agency of Education does not agree with this federal policy, nor do we agree that all of our schools are low performing.

“In 2013, the federal Education Department released a study comparing the performance of US states to the 47 countries that participated in the most recent Trends in International Mathematics and Science Study, one of the two large international comparative assessments. Vermont ranked 7th in the world in eighth-grade mathematics and 4th in science. Only Massachusetts, which has a comparable child poverty rate, did better.

“On the National Assessment of Educational Progress, Vermont consistently ranks at the highest levels. We have the best graduation rate in the nation and are ranked second in child well-being.

“Just this week, a social media company that compares financial products (WalletHub) analyzed twelve different quality metrics and ranked Vermont’s school system third in the nation in terms of school performance and outcomes.

“Nevertheless, if we fail to announce that each Vermont school is “low performing,” we jeopardize federal funding for elementary and secondary education. The “low performing” label brings with it a number of mandatory sanctions, which your principal is required to explain to you. This policy does not serve the interest of Vermont schools, nor does it advance our economic or social well-being. Further, it takes our focus away from other measures that give us more meaningful and useful data on school effectiveness.

“It is not realistic to expect every single tested child in every school to score as proficient. Some of our students are very capable, but may have unique learning needs that make it difficult for them to accurately demonstrate their strengths on a standardized test. Some of our children survived traumatic events that preclude good performance on the test when it is administered. Some of our students recently arrived from other countries, and have many valuable talents but may not yet have a good grasp of the academic English used on our assessments. And, some of our students are just kids who for whatever reason are not interested in demonstrating their best work on a standardized test on a given day.

“We know that statewide, our biggest challenge is finding better ways to engage and support the learning of children living in poverty. Our students from families with means and parents with more education are consistently among the top performing in the country. However, federal NCLB policy has not helped our schools improve learning or narrow the gaps we see in our data between children living in poverty and children from more affluent families. We need a different approach that actually works.”

What are the alternatives? Most other states have received a waiver to get out from under the broken NCLB policy. They did this by agreeing to evaluate their teachers and principals based on the standardized test scores of their students. Vermont is one of only five states that do not have a waiver at this time. We chose not to agree to a waiver for a lot of reasons, including that the research we have read on evaluating teachers based on test scores suggests these methods are unreliable in classes with 15 or fewer students, and this represents about 40-50% of our classes. It would be unfair to our students to automatically fire their educators based on technically inadequate tools. Also, there is evidence suggesting that over-relying on test-based evaluation might fail to credit educators for doing things we actually want them to do, such as teach a rich curriculum across all important subject areas, and not just math and English language arts. In fact, nationwide, we expect more and more states to give up these waivers for many of the reasons we chose not to pursue one in the first place.

Like other Vermont educators, I am deeply committed to continuously improving our schools and the professional skill of our teachers. I have heard from principals and teachers across the state who are deeply committed to developing better ways of teaching and working with parents and other organizations to ensure that every child’s basic needs are met. If basic needs are not met, children cannot take advantage of opportunities that we provide in school. However, the federal law narrows our vision of schools and what we should be about. Ironically, the only way a school could pass the NCLB criteria would be to leave some children behind – to exclude some of the students who come to our doors. That is something public schools in Vermont will not do.

Matching Our Measures to Our Purpose

Certainly, we know tests are an important part of our tool kit, but they do not capture everything that is important for our children to learn. With this in mind, our State Board of Education clearly outlined five additional education priorities in our new Education Quality Standards, including scientific inquiry, citizenship, physical health and wellness, artistic expression and 21st century transferable skills.

As parents and caregivers, we embrace a broader vision for our children than that defined in federal policy. Thus, we encourage you to look at your own child’s individual growth and learning, along with evidence your school has provided related to your child’s progress. Below are some questions to consider:

• What evidence does your school provide of your child’s growing proficiency?

• Is your child developing the skills and understanding she needs to thrive in school and
the community?

• Are graduates of your school system prepared to succeed in college and/or careers?

• Is your child happy to go to school and engaged in learning?

• Can your child explain what he is learning and why? Can your child give examples of
skills he has mastered?

• Is your child developing good work habits? Does she understand that practice leads to
better performance?

• Does your child feel his work in school is related to his college and career goals?

• Does your child have one adult at the school whom she trusts and who is committed to
her success?

• If you have concerns, have you reached out to your child’s teacher to share your
perspective?

Be engaged with your school, look at evidence of your own child’s learning, and work with your local educators to ensure that every child is challenged and supported, learning and thriving. Schools prosper when parents are involved as the first teachers of their children.

The State’s Obligation to Our Children

Working with the Governor, the State Board, the General Assembly and other agencies, and most importantly, with educators across the state, the Agency of Education will invite schools across the state to come together to innovate and improve our schools. We hope your school will volunteer to help develop and use a variety of other measures that will give parents, citizens and educators better information on student learning and what we can do to personalize and make it better. These measures include:

• collaborative school visits by teams of peers, to support research, professional learning and sharing of innovative ideas,

• personalization of learning through projects and performance assessments of proficiency,

• gathering and sharing of feedback from teachers, parents and students related to school climate and culture, student engagement and opportunities for self-directed learning,

• providing teachers and administrators standards-based feedback on the effectiveness of their instruction,

• developing personalized learning plans that involve students in defining how they will demonstrate they are ready to graduate, and basing graduation on these personalized assessments of proficiency rather than “seat-time”,

• analyzing growth and improvement at the Supervisory level as well as the school level, to identify systems that seem to be fostering greater growth in students, as a way of identifying and sharing promising practices across schools.

Vermont has a proud and distinguished educational history, but we know we can always do better. We are committed to supporting our schools as they find more effective and more engaging ways to improve the skills and knowledge of our children. As we have done before, we intend to draw on the tremendous professional capability of teachers across the state as we work to continuously improve our schools. Our strength has always been our ingenuity and persistence. In spite of federal policies that poorly fit the unique nature of Vermont, let’s continue to work together to build great schools that prepare our children to be productive citizens and contributors to our society.

When Campbell Brown’s Vergara-style trial moves to New York City, its star witnesses should testify for the defense, not the plaintiffs, argues Gary Rubinstein of New York City. Tom Kane of Harvard testified in Los Angeles that teachers in New York City were not maldistributed, as they were in Los Angeles. Another star witness in the Los Angeles case was Raj Chetty.

Rubinstein reviewed their testimony and concluded that it could be used by the defense in New York City.

He wrote:

“Also a major witness was Raj Chetty, who published a big paper about how students who had teachers with higher value-added scores made more money in life. These conclusions have been challenged pretty convincingly, but still even the President paraphrased the paper in one of his State of the Union addresses.

“So now there will be a Vergara-like trial in New York City. The prosecution has some very heavy hitters. For the defense, I don’t know. The defendants are people like John King. I wouldn’t be surprised if many of the defendants hope to lose the trial and maybe won’t get the best representation to go against a dream team rivaling the OJ Simpson lawyers.

“But the New York trial, with competent defense attorneys, should be much easier to win. First of all, New York has a three year tenure process, which is something that was argued would be a reasonable amount of time, even by the prosecutors, during the Vergara case…..

“Later on, Kane explains to the defense that Chetty also did not find that ‘ineffective’ teachers were disproportionately assigned to poor students.

Q AND WITH RESPECT TO THIS MALDISTRIBUTION OF EFFECTIVE TEACHERS, PROFESSOR CHETTY DID NOT FIND THIS MALDISTRIBUTION IN THE NEW YORK CITY SCHOOLS; CORRECT?

A THAT’S MY READING OF HIS RESULTS, YES.

“So it seems that this would make Kane and Chetty pretty bad witnesses for the New York case. Perhaps they will get other witnesses, or they will get Kane and Chetty and hope that the defense doesn’t have (or doesn’t care to do) what it takes to go up against the big hired pro-bono guns. If only I had gone to law school, like I had originally planned, rather than do TFA, I’m sure I could win this case.

“But the biggest problem the prosecutors are going to have is that in the Vergara case, Kane had said in his testimony that poor kids in Los Angeles had a disproportionate percent of ineffective teachers, according to his research. To show how bad it was, he compared it to New York where this same phenomenon did not occur.”

Peter Greene read a column by Joe Klein of TIME magazine about what’s wrong with education, and Greene had a hard time controlling his indignation.

Klein did not like the contract that Mayor de Blasio negotiated with the teachers’ union. What really bothers Klein, he says, is that teachers have something to say about their working conditions. His bottom-line beef, says Greene, is unions.

Greene writes:

“There are lots of things Joe Klein doesn’t get, and many of them are related to education. In the process of railing last week about a de Blasio “giveback” of 150 minutes of special student tutoring time in New York schools, Klein managed to trot out a whole raft of misconceptions and complaints. Here he gets himself all lathered up.

“I’m not going to take Klein to task for slamming assembly-line workers as if they are a bad thing. I know what he means– teachers should act like salaried workers instead of workers paid by the hour. Of course, if he tried to get his doctor or his lawyer to put in extra unbilled hours and be “paid in professional satisfaction,” I think he’d have another complaint to make. So I’m not sure exactly which profession he wants us to act like. Hell, even the oldest profession (I mean, of course, plumbing) charges by the hour…..

“It bothers Klein that the union negotiates things down to the half-minute, but he seems to forget that for every teacher union not saying, “We’ll work long extra hours just out of professional pride,” there’s a school board not saying, “You know what? We’ll just pay you what the work is worth and trust you to give us the hours needed.” Teachers could easily put in every single hour of the week doing the work, and many districts would let them do it, for free. “Wow, you’re working so hard and long we’re going to pay you more. Really, we insist,” said no school district ever. Nor do they say, “We’ll trust you to do what’s right and never clock you in and out so we’re sure we get every hour you owe us.” A line has to be drawn somewhere; professionals also do not regularly give away their work for free. I agree that the half-minute is a little silly, but the line still has to be drawn.

“Klein also throws into the pot his assertion that real professionals don’t resist evaluation. This is partly almost true. Real professionals do not resist evaluation by qualified, knowledgeable fellow professionals who are using a fair and accurate measuring instrument. But if Klein’s editor announced “the guys in the mailroom have decided that you will be evaluated on how thick your hair grows in and how much garbage is in your wastebasket,” I don’t think Klein’s reply would be, “I’m a professional. That’s fine.”

“Teachers and our unions are not opposed to evaluation. We are opposed to bad evaluations conducted unfairly using invalid methods developed by amateurs who don’t know what the hell they’re talking about.

“Klein also asserts a bedrock principle for systems that are not working in schools– you don’t scrap them, but you fix them. I was going to hunt down a column in which Klein uses this same argument to vehemently oppose things like, say, letting Eva Moskowitz shove aside public schools to make room for charters. Because, if a public school is struggling, Joe Klein will apparently be there to argue fiercely that you don’t close public schools– you fix them. But my googler seems to be broken. Can somebody help me with that? Kthanks.

“But Klein saves the worst for last. You see, there’s a struggle going on in this country and it’s time to pick sides– either the unions or the students.

“That’s an interesting choice, particularly since these days many teachers are wishing that teacher unions would choose the side of teachers. But really– is that it? The biggest obstacle standing in the path of educating students is teachers’ unions? Teachers unions are out there saying, “We’ve got to smack down those damn students and get them out of our way”?

“I think not. I think in many districts, particularly big messy urban districts, the only adults around to stand up for the interests of the students are the teachers (whose working conditions are the very same as the students’ learning conditions), and the only hope the teachers have of being heard at all is to band together into a group, a union. Consequently, much of what good has happened for students is there not because of some school board largesse but because a teachers’ union (or a group of parents, or both) stood up and demanded it.

“It’s ironic I’m writing this, because I have plenty of beefs with the union. But to assert that making the unions shut up and go away would usher in an era of student greatness and success is just silly.

“Of course, I could be wrong. I would do a search for states that hamstrung or abolished teacher unions and which now lead the nation in school and student excellence. Perhaps there are such places. Unfortunately, my googler is busted.”

I read Jeff Bryant’s interview with the President-elect of NEA, Lily Eskelsen, and I think I love her.

She is smart, strong, and she doesn’t mince words.

She was a classroom teacher for many years, and she speaks from experience teaching many kinds of kids, including kids in special education and kids in a homeless shelter.

She knows that VAM is ridiculous.

She knows that tests can be valuable when used for diagnostic purposes, but harmful when used to pin a ranking on students, teachers, principals, and schools.

She gets it.

Here is a small part of the interview. Jeff asked why NEA delegates voted for a resolution calling on Duncan to resign.

“Bryant: So what’s the frustration for teachers?

“Eskelsen: Here’s the frustration – and I’m not blaming the delegates; I will own this; I share in their anger. The Department of Education has become an evidence-free zone when it comes to high stakes decisions being made on the basis of cut scores on standardized tests. We can go back and forth about interpretations of the department’s policies, like, for instance, the situation in Florida where teachers are being evaluated on the basis of test scores of students they don’t even teach. He, in fact, admitted that was totally stupid. But he needs to understand that Florida did that because they were encouraged in their applications for grant money and regulation waivers to do so. When his department requires that state departments of education have to make sure all their teachers are being judged by students’ standardized test scores, then the state departments just start making stuff up. And it’s stupid. It’s absurd. It’s non-defensible. And his department didn’t reject applications based on their absurd requirements for testing. It made the requirement that all teachers be evaluated on the basis of tests a threshold that every application had to cross over. That’s indefensible.

“Bryant: So any good the Obama administration has tried to accomplish for education has been offset by the bad?

“Eskelsen: Yes. Sure, we get pre-K dollars and Head Start, but it’s being used to teach little kids to bubble in tests so their teachers can be evaluated. And we get policies to promote affordable college, but no one graduating from high school gets an education that has supported critical and creative thinking that is essential to succeeding in college because their education has consisted of test-prep from Rupert Murdoch. The testing is corrupting what it means to teach. I don’t celebrate when test scores go up. I think of El Paso. Those test scores went up overnight. But they cheated kids out of their futures. Sure, you can “light a fire” and “find a way” for scores to go up, but it’s a way through the kids that narrows their curriculum and strips their education of things like art and recess.

“Bryant: Doesn’t Duncan understand that?

“Eskelsen: No. That reality hasn’t entered the culture of the Department of Education. They still don’t get that when you do a whole lot of things on the periphery, but you’re still judging success by a cut score on a standardized test and judging “effective” teachers on a standardized test, then you will corrupt anything good that you try to accomplish.”
