This is an excellent letter to the U.S. Department of Education, which patiently explains the harm caused by value-added modeling (VAM). It was submitted by a New York group called “Change the Stakes,” which opposes high-stakes testing. The letter was written by psychologist Dr. Rosalie Friend, a member of Change the Stakes. It is a good source for parents and educators who want to explain why testing is being overused and misused.
USDOE’s Proposed Regs for Teacher Education Programs
Change the Stakes submitted these comments in response to the U.S. Department of Education’s proposal to impose new accountability measures on teacher education programs, https://www.federalregister.gov/articles/2014/12/03/2014-28218/teacher-preparation-issues.
The U.S. Department of Education has proposed that teacher education programs be rated by the employment, placement, and performance of their graduates. Ratings of the performance of graduates would include the test scores of the students who are taught by graduates of those programs.
Change the Stakes (changethestakes.org), an organization of New York City parents and educators promoting alternatives to high-stakes testing, opposes this proposal.
Rating teacher education programs by what teachers do after they leave the programs is unrealistic. The decisions made by graduates and their employers are not determined by the teacher education programs. Teacher education programs are already assessed by professional accrediting boards that understand the nuances of teaching and learning.
The accountability procedures imposed on K-12 schools have diverted astounding amounts of money and time from teaching and learning. The accountability procedures have not led to any measurable improvement in student achievement. Extending these ill-conceived procedures to teacher education programs is counter-productive. Attaching high stakes to evaluation leads to the distortion of the processes that are being evaluated, as documented by Dr. Donald Campbell, the pre-eminent social scientist.
Teaching is a difficult profession. Industrial-type accountability procedures distract from the focus on teaching and learning. We want teachers to learn how to engage children in learning new ideas and using those ideas to reason and solve problems. At the same time, teachers must be able to assist children with developing socially and emotionally. This requires dealing with enormous differences among children’s backgrounds and personalities. Of course, teachers must also be expert in the skills and materials they teach. Teacher education programs must prepare teachers to think on their feet and respond to the ever-changing conditions under which they labor, not to drill children for shallow, regimented tests.
Teachers’ working conditions are a major factor in their professional achievement. Social conditions, school culture, school leadership, class assignments, and relationships among colleagues are all important in determining both students’ and teachers’ success. Management expert W. Edwards Deming said, “It is the structure of the organization rather than the employees, alone, which holds the key to improving the quality of output.” All these factors are independent of teacher education programs.
Perhaps the most wrong-headed part of the proposal is the use of student test scores in assessing the teachers who graduated from the programs. Using student scores to evaluate teachers and then using that “so-called” data to rate their teacher education programs is unsound and unacceptable for the following reasons.
Low Reliability of Standardized Test Results
Value-added modeling (VAM) cannot be accurately used for a small sample such as a single class. The aggregation of student test scores to derive a score for an individual teacher has been demonstrated to be wildly unstable, with the score assigned to a given teacher varying widely from year to year or even from class to class. The American Statistical Association has warned against the use of VAM for teacher evaluation. Using these unreliable figures to draw conclusions about the programs that educated teachers is folly.
Low Validity of Standardized Test Results
Tests cannot adequately account for every factor outside of a teacher’s instruction that impacts how students perform on a test because there are far too many other factors affecting students’ scores. Research shows that whatever teachers’ impact is, it accounts for only 1-14% of student variability in standardized test scores. If the teacher’s score is based on factors other than the teacher’s influence, it is not valid.
Studies since the 1966 Coleman report continue to show that nothing affects student achievement as much as the student’s home. Parents in poor families cannot provide their children with the same social and learning supports and enrichment that affluent and middle-class parents can provide. Furthermore, well-funded schools in prosperous communities consistently get higher test scores than cash-starved schools in poverty-stricken neighborhoods.
A teacher’s effectiveness is directly affected by the composition of the class assigned to that teacher even within the same school. What kind of academic background do the children have? Are their goals aligned with the school’s goals? How cooperative are they? How well behaved or self-regulated are they?
Conclusion
The entire process of professional training of an educator is exceptionally complex. While a school of education affects the resulting quality of the professional educator, so much more goes into their success. Any evaluation of such an institution should be developed to be inclusive of all the contributing factors, not simply the ones for which quantitative data (however invalid and unreliable) are available.
Ignoring these additional factors and the research supporting them is an injustice not only to the programs the Education Department plans to rate but also to students, teachers, parents, and communities alike.
Sources
American Statistical Association. (2014). ASA Statement on Using Value-Added Models for Educational Assessment. http://www.amstat.org/policy/pdfs/ASA_VAM_Statement.pdf
Baker, E. L., Barton, P. E., Darling-Hammond, L. D., Haertel, E., Ladd, H. F., Linn, R. L., Ravitch, D., Rothstein, R., Shavelson, R. J., & Shepard, L.A. (2010). Problems with the Use of Student Test Scores to Evaluate Teachers: Briefing Paper 278. Washington, DC: Economic Policy Institute.
Campbell, D.T. (1976). Assessing the Impact of Planned Social Change. Dartmouth College, Occasional Paper Series, #8.
Greene, D. (2013). Doing the Right Thing: A Teacher Speaks. Victoria, Canada: Friesen Press.
Haertel, E.H. (2013). Reliability and validity of inferences about teachers based on student test scores. William H. Angoff Memorial Lecture Series. Princeton, NJ: Educational Testing Service.
Johnson, S.M., Kraft, M.A., & Papay, J.P. (2012). How Context Matters in High-Need Schools: The Effects of Teachers’ Working Conditions on Their Professional Satisfaction and Their Students’ Achievement. Teachers College Record, 114:1-39.
Viadero, D. (2006). Race Report’s Influence Felt 40 Years Later: Legacy of Coleman study was new view of equity. EdWeek [Online] Available http://www.edweek.org/ew/articles/2006/06/21/41coleman.h25.html
These comments were written by Dr. Rosalie Friend, Educational Psychologist and a member of Change the Stakes.
It’s a bit like patiently explaining to a vampire that you object to the loss of blood.
“Change the Stakes” indeed …
Well said. I was just thinking, how is it even possible to respond to those who could think up such an ill-conceived and ridiculous plan? Are medical schools going to be evaluated by the cholesterol levels of the patients of the doctors who graduated from those schools?
They are changing the stakes.
They’re making them higher.
I’m happy “Change the Stakes” is speaking out against VAM. I wasn’t sure where to include this, but I wanted to bring up the fact that teachers aren’t allowed to do many of the things we used to be able to do–like a classroom meeting to get to the bottom of problems or to allow a place for kids to feel safe to speak about things happening at home or at school. I am now retired and work closely with the Sudanese population here in Phoenix. I eliminated names, but this is what the person who helps many of our children with problems of education sent me recently.
“_____ has told me, several times, that there is a boy in her class who “grinds” against the girls in the class. She thinks the girls don’t really mind because they laugh. She also thinks the teacher(s) know about it and must not think it’s a big deal because, she thinks, they don’t really do anything. She says that when he has tried to do it to her, she averts it by saying he “can’t have any.”
She started a Facebook account a few weeks ago by lying about her age, as under federal law Facebook can’t let kids under 13 have accounts. She immediately “friended” slews of other kids who attend ______, most of whom are under 13, including the “grinder” boy. Several older boys also ended up friending her. Within two days, she reported at Sunday school that she was being “cyberstalked”. ________, another volunteer with the Sunday school, friended her to check it out. Lots of overly sexualized and totally inappropriate stuff was being circulated. We talked to her about how 11-year-olds just can’t deal with all of this and she agreed to delete her Facebook page, with _____ technical assistance.
I’m wondering what the school can do to address any of this. It seems to be turning into a real problem. ______ has more documentation of the Facebook stuff. The on-campus sexualized harassment definitely needs to be addressed for ______ and everyone else’s benefit.
Here’s a resource from DOE that’s on the internet, although you probably have already seen it –
http://www.stopbullying.gov/respond/support-kids-involved/index.html”
This was sent to the child’s team; she is getting counseling for problems she is dealing with. My point in posting this is that we have all these troubled children, who are doing things like this and not getting the help they need, because teachers are too busy teaching to the test or are new and don’t understand what our Title I kids (probably others, too) truly need. Title I schools’ funding has also been cut back, so they don’t receive help that they desperately need. Eventually, many of these children end up in juvenile detention or later prison.
Unfortunately, well-reasoned statements about USDE’s proposed requirements for evaluating teacher education programs may not be considered if the writers ignore the regulatory framework already in place and framing the “invitation” to comment.
The comments anyone or any group submits should be framed to acknowledge some statutory requirements already in place– and “intended to ensure that teacher preparation programs produce new teachers who will address areas of need in local educational agencies and States. Congress’s expectations are manifested in statutory requirements that each program provide assurances to the Secretary in its IRC that it is training prospective teachers to fill these needs (sections 205(a)(1)(A)(ii) and 206 of the Higher Education Act).”
“Specifically, institutions of higher education (IHEs) that conduct teacher preparation programs are required to provide an assurance in the institutional report card that the IHE is providing training to prospective teachers that “responds to the identified needs of the local educational agencies or States where the institution’s graduates are likely to teach based on past hiring and recruitment trends.””
In addition, USDE links the requirements in this Higher Education Act (reauthorized in 2014) to other in-place requirements from multiple sources such as regulations attached to flows of federal funds from No Child Left Behind (waivers for these granted by USDE) and the infamous data-link project jointly funded by USDE and Gates since 2005 (about which I have commented on the blog).
Here is part of the bureaucratic architecture that flows over into the proposed teacher evaluation regulation. I judge that new thinking is not really wanted, or if so, only at the margins…
“Consistent with teacher-student data link requirements related to the American Recovery and Reinvestment Act (ARRA), State Longitudinal Data System program (SLDS), and the ESEA Flexibility initiative, proposed § 612.5(a)(1) would require States to provide data on student learning outcomes, defined as the aggregate learning outcomes of students taught by new teachers trained by each teacher preparation program in the State. States would have the discretion to report student learning outcomes on the basis of student growth (that could factor in variance in expected growth for students with different growth trajectories), teacher evaluation measures, or both. States also would have discretion on whether to use a value-added method of adjusting for student characteristics.”
“Regardless of which method States use to report student learning outcomes, States would be required to link the results of those indicators of teaching skill to the teacher preparation programs with which the teachers are associated.”
Understanding this maze of legislation and interlocking regulations is not easy. I am not an expert on that, but I know that comments that do not address the numbered paragraphs in the proposed regulations are not likely to have much influence.
I also think the comments made by any respondent will be given some consideration based on “political standing,” not just perceived expertise.
USDE’s final regulations always include some brief paragraphs that acknowledge some comments from individuals or groups that USDE officials and staff view as credible. The final regulations do not routinely name specific people or groups. After giving some token attention to these comments, the final draft will have a closing paragraph that settles each of the matters of concern.
Incidentally, only parts of these regulations are being opened for comment, and at the invitation of the Secretary. The key phrase to look for is: “the Secretary invites comments on…..”
The relevant paragraphs opened for comments have numbers. These were not visible on the document I downloaded. They appeared when my computer mouse strayed into the wide margin on the right-hand side of the document. It is wise to use these paragraph numbers when you frame any comment.
“Low Reliability . . . and Validity. . . of Standardized Test Results”
By definition, if an assessment is invalid it is not reliable, and vice versa. Wilson has shown that educational standards and standardized testing are COMPLETELY INVALID in his seminal 1997 destruction of those two educational malpractices, which has never been refuted or rebutted. To understand the COMPLETE INVALIDITY AND UNRELIABILITY of those false concepts see: “Educational Standards and the Problem of Error” found at: http://epaa.asu.edu/ojs/article/view/577/700
Brief outline of Wilson’s “Educational Standards and the Problem of Error” and some comments of mine. (updated 6/24/13 per Wilson email)
1. A description of a quality can only be partially quantified. Quantity is almost always a very small aspect of quality. It is illogical to judge/assess a whole category only by a part of the whole. The assessment is, by definition, lacking in the sense that “assessments are always of multidimensional qualities. To quantify them as unidimensional quantities (numbers or grades) is to perpetuate a fundamental logical error” (per Wilson). The teaching and learning process falls in the logical realm of aesthetics/qualities of human interactions. In attempting to quantify educational standards and standardized testing the descriptive information about said interactions is inadequate, insufficient and inferior to the point of invalidity and unacceptability.
2. A major epistemological mistake is that we attach, with great importance, the “score” of the student, not only onto the student but also, by extension, the teacher, school and district. Any description of a testing event is only a description of an interaction, that of the student and the testing device at a given time and place. The only correct logical thing that we can attempt to do is to describe that interaction (how accurately or not is a whole other story). That description cannot, by logical thought, be “assigned/attached” to the student as it cannot be a description of the student but the interaction. And this error is probably one of the most egregious “errors” that occur with standardized testing (and even the “grading” of students by a teacher).
3. Wilson identifies four “frames of reference,” each with distinct assumptions (epistemological basis) about the assessment process from which the “assessor” views the interactions of the teaching and learning process: the Judge (think of a college professor who “knows” the students’ capabilities and grades them accordingly), the General Frame (think of standardized testing that claims to have a “scientific” basis), the Specific Frame (think of learning by objective, like computer-based learning, getting a correct answer before moving on to the next screen), and the Responsive Frame (think of an apprenticeship in a trade or a medical residency program where the learner interacts with the “teacher” with constant feedback). Each category has its own sources of error, and more error in the process is caused when the assessor confuses and conflates the categories.
4. Wilson elucidates the notion of “error”: “Error is predicated on a notion of perfection; to allocate error is to imply what is without error; to know error it is necessary to determine what is true. And what is true is determined by what we define as true, theoretically by the assumptions of our epistemology, practically by the events and non-events, the discourses and silences, the world of surfaces and their interactions and interpretations; in short, the practices that permeate the field. . . Error is the uncertainty dimension of the statement; error is the band within which chaos reigns, in which anything can happen. Error comprises all of those eventful circumstances which make the assessment statement less than perfectly precise, the measure less than perfectly accurate, the rank order less than perfectly stable, the standard and its measurement less than absolute, and the communication of its truth less than impeccable.”
In other words, all the logical errors involved in the process render any conclusions invalid.
5. The test makers/psychometricians, through all sorts of mathematical machinations, attempt to “prove” that these tests (based on standards) are valid, that is, errorless or supposedly at least with minimal error [they aren’t]. Wilson turns the concept of validity on its head and focuses on just how invalid the machinations and the test and results are. He is an advocate for the test taker, not the test maker. In doing so he identifies thirteen sources of “error”, any one of which renders the test making/giving/disseminating of results invalid. And a basic logical premise is that once something is shown to be invalid it is just that, invalid, and no amount of “fudging” by the psychometricians/test makers can alleviate that invalidity.
6. Having shown the invalidity, and therefore the unreliability, of the whole process, Wilson concludes, rightly so, that any result/information gleaned from the process is “vain and illusory”. In other words, start with an invalidity, end with an invalidity (except by sheer chance every once in a while, like a blind and anosmic squirrel who finds the occasional acorn, a result may be “true”) or, to put it in more mundane terms, crap in, crap out.
7. And so what does this all mean? I’ll let Wilson have the second to last word: “So what does a test measure in our world? It measures what the person with the power to pay for the test says it measures. And the person who sets the test will name the test what the person who pays for the test wants the test to be named.”
In other words, it attempts to measure “‘something’ and we can specify some of the ‘errors’ in that ‘something’ but still don’t know [precisely] what the ‘something’ is.” The whole process harms many students, as the social rewards for some are not available to others who “don’t make the grade (sic).” Should American public education have the function of sorting and separating students so that some may receive greater benefits than others, especially considering that the sorting and separating devices, educational standards and standardized testing, are so flawed not only in concept but in execution?
My answer is NO!!!!!
One final note with Wilson channeling Foucault and his concept of subjectivization:
“So the mark [grade/test score] becomes part of the story about yourself and with sufficient repetitions becomes true: true because those who know, those in authority, say it is true; true because the society in which you live legitimates this authority; true because your cultural habitus makes it difficult for you to perceive, conceive and integrate those aspects of your experience that contradict the story; true because in acting out your story, which now includes the mark and its meaning, the social truth that created it is confirmed; true because if your mark is high you are consistently rewarded, so that your voice becomes a voice of authority in the power-knowledge discourses that reproduce the structure that helped to produce you; true because if your mark is low your voice becomes muted and confirms your lower position in the social hierarchy; true finally because that success or failure confirms that mark that implicitly predicted the now self evident consequences. And so the circle is complete.”
In other words, students “internalize” what those “marks” (grades/test scores) mean, and since the vast majority of the students have not developed the mental skills to counteract what the “authorities” say, they accept as “natural and normal” that “story/description” of them. Although paradoxical in a sense, the “I’m an ‘A’ student” is almost as harmful as “I’m an ‘F’ student” in hindering students from becoming independent, critical and free thinkers. And having independent, critical and free thinkers is a threat to the current socio-economic structure of society.
By Duane E. Swacker
The real reforms in teacher education should be as follows:
1) Only the top 10% of graduating students should be accepted into a teaching program.
2) There should be massive tax-driven subsidization of teacher programs.
3) The internship for teaching should last 3 solid years with a DAILY master teacher mentor overseeing the intern. There should be three years before such a person is allowed to step inside a classroom alone and handle a class. Ditto for those studying to become educational leaders.
4) Teaching practicums should be offered by the third year in college starting with supervised small group instruction and there should be permanent real in-house k-12 schools housed on the same campus and partnering with colleges of education.
5) High-stakes testing and test scores of students in k-12 should never be used to judge a person majoring in education or the student or the college they attend.
6) Test scores should not drive teacher evaluations; excellence in teaching should drive them.
7) Teachers should be required, as part of their licensing, to demonstrate significant abilities in political advocacy in standing up for the prevention of poverty of children BEFORE they even enter pre-k or nursery school. There should be 2 to 3 such courses required for a teaching degree.
8) To get a teaching degree, teachers should be mandated to take at least one semester in all of the following: critical issues in education, school law, school finance, curriculum development, and leadership skills with organizational and communication theory.
What the reform movement has done by attaching VAM to everyone is utterly destructive and ill informed.
If we do not push back hard, we stand to be a land of at least 50% privatized schools in less than 10 years, with a three-tier system: those who get little to no education, those who get a complete but poor education, and those who get a resplendent, superior, strong education.
Who do we teachers and our unions really think we want to be and where do we want to be 5 years from now? . . . . .
This would go a long way towards supporting teachers, but, the current fervor seems to be that you either have “it” or you don’t, and if you can’t cut it with the few supports in place, you shouldn’t be a teacher no matter how challenging they make it for you as a professional on a daily basis.
Not to mention it would mean backing off of these current expensive reforms that reward those implementing them, to hand the money and reins over to experienced educators.
I don’t see politicians OR private industry backing off the control of the billions of dollars at stake.
I get the distinct impression that those in these 2 areas see children as little walking wads of cash and not future people to be developed to their fullest.
correction: “so-called” data should be so-called data, or so-called “data”
Thanks for writing and submitting this.
“Reason”
“Reason” only works
With reasonable folks
It doesn’t work with jerks
And doesn’t work with jokes
More than 800 comments have been posted at the Federal Register, concerning the U.S. Dept. of Ed.’s proposals under the heading, “Teacher Prep Issues”. The comments are uniformly negative. They can be viewed by going to “regulations.gov” and typing, at the search prompt, ED-2014-OPE-0057.
Based on the comments, the anger towards education “reform” in Washington, D.C., is palpable.
Linda, that is not surprising. Will they listen? Do they care what the public thinks?
Duncan’s Dept. of Education will fail as a result of its unethical machinations and its duplicitous motivations.