Audrey Beardsley reveals the answer to the intriguing question: Why is D.C. hiding its VAM data? The data were earlier leaked to blogger and retired math teacher G.F. Brandenburg, whom Beardsley cites in this post.
The VAM data show that VAM is junk science. D.C. school officials are trying to keep that a secret.
In Brandenburg’s words: “Value-Added scores for any given teacher jumped around like crazy from year to year. For all practical purposes, there is no reliability or consistency to VAM whatsoever. Not even for elementary teachers who teach both English and math to the same group of children and are ‘awarded’ a VAM score in both subjects. Nor for teachers who taught, say, both 7th and 8th grade students in, say, math, and were ‘awarded’ VAM scores for both grade levels: it’s as if someone was to throw darts at a large chart, blindfolded, and wherever the dart lands, that’s your score.”
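For readers who want to run this kind of check themselves, here is a minimal sketch in Python of the year-to-year consistency test Brandenburg describes. The file name and column names are hypothetical, since the layout of the released D.C. scores isn't reproduced here.

```python
# Sketch of the year-to-year consistency check Brandenburg describes.
# Assumes a CSV with one row per teacher and hypothetical columns
# "vam_2010" and "vam_2011"; the real released file may be laid out differently.
import pandas as pd
from scipy.stats import spearmanr

df = pd.read_csv("dc_vam_scores.csv")  # hypothetical file name
both_years = df.dropna(subset=["vam_2010", "vam_2011"])

# A stable, reliable measure would give a rank correlation near 1 for the
# same teachers across consecutive years; the "darts thrown blindfolded"
# pattern shows up as a correlation near 0.
rho, p_value = spearmanr(both_years["vam_2010"], both_years["vam_2011"])
print(f"Teachers with scores in both years: {len(both_years)}")
print(f"Year-to-year Spearman correlation: {rho:.2f} (p = {p_value:.3g})")
```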

It shouldn’t be a surprise, given the origins of VAM described in a post earlier this week:
William Sanders thought that educators struggling with student achievement in the state should “simply” use more advanced statistics, similar to those used when modeling genetic and reproductive trends among cattle.
It isn’t too big a jump (no double entendre intended) from modeling “reproductive trends among cattle” to “bull.” After all, some of the same procedures are involved . . .
Where are they hiding the VAM data, among the cattle?
“If I have seen farther, it is by standing on the shoulders of giants” — Isaac Newton, who invented differential calculus and wrote down what are now known as Newton’s Laws of Motion
“If I have seen fodder, it is by standing on the horns of cattle” — William Sanders, who first applied VAM for cattle to teachers and wrote down Sanders’ Laws of Self-promotion and teacher demotion
Heretofore, education was the search for “truths,” utilizing deep, scholarly research to enhance our best understandings. Now, harebrained philosophies, untested and unexamined by such scholarly research, usurp the academic processes that were formerly the basis of education.
Politicians usurp the academic freedoms necessary for democracy. This has been said over and over and over, but it still does not get through to the general public. Only when parents’ own children find out about the negative results do people stand up and scream.
Yet again, this is NOT only in public school education.
“VAMdumness” is also what Gary Rubinstein found when he looked at the data for NY City and is entirely in keeping with ASA’s warnings about the noisy, unstable nature of VAM scores.
The idea that a reputable “education” organization just gave the originator of a VAMdumb number generator a prize for his contributions to “education” is actually quite hilarious.
The people who awarded the prize are obviously as clueless as the fellow who thought it a brilliant idea to apply a model for cattle growth to teachers.
“Vamdumb Number Generator”
The VAMdumb number maker
Is really quite a thing
Put teacher in and shake her
And out comes random string
DC officials are probably hiding more data than VAM. The graph for this post plots teachers’ rankings on VAM against their rankings on the DCPS “Teacher Learning Framework” (TLF). Here is the problem with the undisclosed data and with the reliability and validity indicators for the TLF.
The TLF is a 41-page protocol for evaluating teachers. Only the TEACH part of this protocol appears to have been used recently. The TEACH part calls for judgments at four levels of performance (rubrics) on nine teaching standards with 42 detailed criteria, based on formal or informal classroom observations by an evaluator.
In effect, the evaluator is asked to make 168 specific judgments in each evaluative session. Based on compliance with the 42 detailed criteria, a teacher is then placed into one of five reductive categories ranging from highly effective to ineffective.
In addition to not having a clear system for weighting all of these observation-based criteria, the overall evaluation scheme calls for different frequencies of evaluations–from 2 to 8 per year–depending on a teacher’s career classification. Those classifications also determine the number of formal and informal observations. There are five career classifications. These are not stable. They are contingent on performance measures from the prior two years.
So the graph offers data points all over the place, with two variables not fully described, drawn from a complex evaluation system with multiple criteria and no clear description of how the 42 criteria are weighted other than by descriptive rubrics (a system known to be unreliable). Further, there is no clear case for the evaluative validity, for teachers, of test scores churned through VAM.
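To make the weighting problem concrete, here is a small illustrative sketch. The ratings, cut points, and weights are all hypothetical, not DCPS's actual, undisclosed formula; the point is simply that the same 42 rubric judgments can land a teacher in different summary categories depending on how the criteria are weighted.

```python
# Sketch: the same 42 rubric judgments (each rated 1-4) may map to different
# summary categories depending on the weights, which are not published.
# Ratings, cut points, and weights below are all hypothetical.
import numpy as np

rng = np.random.default_rng(0)
ratings = rng.integers(1, 5, size=42)  # one observation: 42 criteria, each rated 1-4

def category(score):
    # Hypothetical cut points on the 1-4 scale, not DCPS's actual ones.
    cuts = [("ineffective", 1.5), ("minimally effective", 2.25),
            ("developing", 2.75), ("effective", 3.5), ("highly effective", 4.01)]
    return next(label for label, cut in cuts if score < cut)

equal_weights = np.full(42, 1 / 42)
skewed_weights = rng.dirichlet(np.ones(42))  # a different, equally undocumented weighting

for name, weights in [("equal weights", equal_weights), ("skewed weights", skewed_weights)]:
    score = float(ratings @ weights)
    print(f"{name}: weighted score {score:.2f} -> {category(score)}")
```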
In addition, the DC observation/evaluation protocol (the TLF) requires teachers to engage in practices long associated with direct instruction, such as mastery of easy-to-define bits and chunks of information, with students required to demonstrate their clear understanding and MASTERY of the lesson content by being able to state the “learning objective(s)” for the lesson and what they learned. (Apparently this must be demonstrated every time the teacher is observed.)
I could not find any references for the reliability and validity of this evaluation scheme. All I found was a list of documents consulted in developing the criteria in the Teacher Learning Framework.
I am currently doing a case study of the teacher evaluation system in Ohio, with a focus on this issue of the sheer quantity of judgments that, in the end, place teachers in one of four or five highly reductive categories, from effective to ineffective, with the report churned out by a computer.
A recent Wall Street Journal article reported on the corporate quest for “talent” and the crude measures of employee promise and actual performance at “high-performing” companies, especially when the companies want to do leading-edge or ground-breaking research, or are engaged in research and development where the outcomes cannot be traced to individual workers or to tidy increments in quarterly and annual reports. The gurus in human resource departments and key managers do not have screening and performance measures nearly as tidy as the managerial class in education seems to think it can demand for teachers.
The evaluation systems in place for teachers are really instruments for managerial control (micromanaging) and surveillance, on a scale out of proportion, in detail and effort (for everyone), to the educational efficacy of the end result. The result is increasingly a simplistic measure of compliance with ideological criteria, branded with fancy labels such as “data-driven instruction,” “best practices,” “continuous improvement,” or “our talent management system.”
Laura – I’ll be happy to share my 20-year personal history of state test scores and evaluations. I’ve kept all my Ohio state test result reports – Proficiency, OAT, and OAAs – and I have all my evaluations. I’ll also gladly provide my EVAAS info for you.
Laura,
Your last paragraph is a perceptive description of evaluation systems and why the fancy labels are great advertising for those systems.
Gosh, where have we seen this before? In NYC, where the exact same thing happened! A newspaper sued to have the scores released, Gary Rubinstein crunched the numbers, and the scatter plot showed that they were completely RANDOM! And just as bizarre! http://garyrubinstein.teachforus.org/2012/09/15/analyzing-released-nyc-value-added-data-part-6/
VAM is a rusted-out Yugo with no wheels or engine, yet its proponents insist that it’s a Lamborghini Huracan!
It’s a hurricane all right, leaving devastation in its path everywhere it goes.
The entire D.C. school reform effort is based on fraudulent/criminal activity. If states want to stop this cancer from spreading locally, ask your Senators and Congresspeople to request an investigation or audit of D.C. school reform by the GAO and the Justice Department.
Using Shavelson’s theory/framework, Stephen Klein and others have followed up in Research in Higher Education for two decades or more. Here is an abstract from Klein, and I like the way he titles it: “An Approach….” It is theory, and it is still in the “lab” testing stage, not ready for implementation at the student level.
quoting abstract: “Over the past decade[actually, two now], state legislatures have experienced increasing pressure to hold higher education accountable for student learning. This pressure stems from several sources, such as increasing costs and decreasing graduation rates. To explore the feasibility of one approach to measuring student learning that emphasizes program improvement, we administered several open-ended tests to 1365 students from 14 diverse colleges. The strong correspondence between hand and computer assigned scores indicates the tests can be administered and graded cost effectively on a large scale. The scores were highly reliable, especially when the college is the unit of analysis; they were sensitive to years in college; and they correlated highly with college GPAs. We also found evidence of ‘‘value added’’ in that scores were significantly higher at some schools than at others after controlling on the school’s mean SAT score. Finally, the students said the tasks were interesting and engaging.”
All of this got worse during the Great Recession, so it has filtered down to the IT office in the Department of Education and your local schools. Please note that he states the reliability when the COLLEGE is the unit of analysis (not the student). When they take these models from higher ed, we get 3,500 students in Boston poverty-area schools tested on “grit,” with questionnaires devised for older students. There are a lot of implementation flaws. Also, note that the pressure comes from increasing costs and from computerized scoring (it saves $$ if you believe them; I don’t). The work of research is often grabbed up for a political agenda, and the implementation in public schools (before testing the theory and working out the “kinks”) turns out the way we are experiencing things today. I think Klein is also cautious in interpreting “value added,” but the politicians and “educrats” over-generalize and make bumper stickers, and then Arne Duncan takes all the R&D money and squanders it while providing no further resources for improving curriculum or programs (other than “turning around” schools with a gun to the teacher’s head).
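For what it's worth, the unit-of-analysis point is easy to illustrate with a toy simulation. The numbers below are invented, not Klein's, but they show why averaging over a whole college makes noisy individual scores look reliable.

```python
# Toy simulation: why scores can be "highly reliable ... when the college is
# the unit of analysis" even though individual student scores are noisy.
# All numbers are invented for illustration, not Klein's data.
import numpy as np

rng = np.random.default_rng(42)
n_schools, n_students = 14, 100  # 14 colleges as in the abstract; class size is made up

school_effect = rng.normal(0, 1, n_schools)                   # true between-college differences
student_ability = rng.normal(0, 1, (n_schools, n_students))   # true within-college differences

def administer_test():
    # Each administration adds fresh measurement noise for every student.
    noise = rng.normal(0, 2, (n_schools, n_students))
    return school_effect[:, None] + student_ability + noise

form_a, form_b = administer_test(), administer_test()

student_r = np.corrcoef(form_a.ravel(), form_b.ravel())[0, 1]
school_r = np.corrcoef(form_a.mean(axis=1), form_b.mean(axis=1))[0, 1]

print(f"Student-level test-retest correlation: {student_r:.2f}")  # modest
print(f"College-level test-retest correlation: {school_r:.2f}")   # much higher: noise averages out
```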