Peter Greene here evaluates a report by two analysts at Bellwether Education, a DC think tank, about how teachers should be evaluated. His post is a model of how to tear apart and utterly demolish the musings of people far removed from the classroom about how things ought to work.
He begins by situating its sponsor:
“I am fascinated by the concept of think tank papers, because they are so fancy in presentation, but so fact-less in content. I mean, heck– all I need to do is give myself a slick name and put any one of these blog posts into a fancy pdf format with some professional looking graphic swoops, and I would be releasing a paper every day.
“Bellwether Education, a thinky tank with connections to the standards-loving side of the conservative reformster world, has just released a paper on the state of teacher evaluation in the US. “Teacher Evaluation in an Era of Rapid Change: From ‘Unsatisfactory’ to ‘Needs Improvement.'” (Ha! I see what you did there.) Will you be surprised to discover that the research was funded by the Bill and Melinda Gates Foundation?”
He reviews what they describe as current trends and pulls each one apart.
Here is an example of a current trend and Greene’s response:
“3) Districts still don’t factor student growth into teacher evals
“Here we find the technocrat blind faith in data rearing its eyeless head again.”
The authors say: “While raw student achievement metrics are biased—in favor of students from privileged backgrounds with more educational resources—student growth measures adjust for these incoming characteristics by focusing only on knowledge acquired over the course of a school year.”
“This is a nice, and inaccurate, way to describe VAM, a statistical tool that has now been discredited more times than Donald Trump’s political acumen. But some folks still insist that if we take very narrow standardized test results and run them through some incoherent number-crunching, the numbers we end up with represent useful, objective data. They don’t. We start with standardized tests, which are not objective, and run them through various inaccurate variable-adjusting programs (which are not objective), and come up with a number that is crap. The authors note that there are three types of pushback against using said crap.
“Refuse. California has been requiring some version of this for decades, and many districts, including some of the biggest, simply refuse to do it.
“Delay. A time-honored technique in education, known as Wait This New Foolishness Out Until It Is Replaced By The Next Silly Thing. It persists because it works so often.
“Obscure. Many districts are using loopholes and slack to find ways to substitute administrative judgment for the Rule of Data. They present Delaware as an example of how futzing around has polluted the process and buttress that with a chart that shows statewide math score growth dropping while teacher eval scores remain the same.
“Uniformly high ratings on classroom observations, regardless of how much students learn, suggest a continued disconnect between how much students grow and the effectiveness of their teachers.
“Maybe. Or maybe it shows that the data about student growth is not valid.
“They also present Florida as an example of similar futzing. This time they note that neighboring districts have different distributions of ratings. This somehow leads them to conclude that administrators aren’t properly incorporating student data into evaluations.
“In neither state’s case do they address the correct way to use math scores to evaluate history and music teachers.”
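For anyone curious what the “variable-adjusting” Greene mocks looks like mechanically, here is a toy sketch in Python of the simplest possible value-added scheme: regress this year’s scores on last year’s, then call each teacher’s average residual her “effect.” Everything below is invented, and no state’s actual model is this simple; the point is only to show how thin the arithmetic underneath the jargon can be.

```python
# Toy illustration of the "variable-adjusting" machinery Greene describes.
# This is NOT any state's actual VAM; real models add many covariates,
# shrinkage, and proprietary tweaks. All numbers below are invented.
import numpy as np

rng = np.random.default_rng(0)

# Invented data: prior- and current-year scores for 200 students,
# each assigned to one of 10 teachers.
n_students, n_teachers = 200, 10
teacher = rng.integers(0, n_teachers, n_students)
prior = rng.normal(500, 50, n_students)
current = prior + rng.normal(20, 30, n_students)  # purely random "growth"

# Step 1: regress current scores on prior scores (the "adjustment").
slope, intercept = np.polyfit(prior, current, 1)
predicted = intercept + slope * prior

# Step 2: call each teacher's mean residual her "value-added."
residual = current - predicted
value_added = [round(residual[teacher == t].mean(), 1) for t in range(n_teachers)]

# Even with growth that is random by construction, every teacher gets a
# nonzero "effect": noise dressed up as an objective measurement.
print(value_added)
```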
After the report has been carefully pulled apart, here are the conclusions, theirs and his:
Greene reviews their recommendations:
“It’s not a fancy-pants thinky tank paper until you tell people what you think they should do. So Aldeman and Chuong have some ideas for policymakers.
“Track data on various parts of new systems. Because the only thing better than bad data is really large collections of bad data. And nothing says Big Brother like a large centralized data bank.
“Investigate with local districts the source of evaluation disparities. Find out if there are real functional differences, or whether the data just reflect philosophical differences. Then wipe those differences out. “Introducing smart timelines for action, multiple evaluation measures including student growth, requirements for data quality, and a policy to use confidence intervals in the case of student growth measures could all protect districts and educators that set ambitious goals.”
“Don’t quit before the medicine has a chance to work. Aldeman and Chuong are, for instance, cheesed that the USED postponed the use of evaluation data on teachers until 2018, because those evaluations were going to totally work, eventually, somehow.
“Don’t be afraid to do lots of reformy things at once. It’ll be swell.
“Their conclusion
“Stay the course. Hang tough. Use data to make teacher decisions. Reform fatigue is setting in, but don’t be wimps.
“My conclusion
“I have never doubted for a moment that the teacher evaluation system can be improved. But this nifty paper sidesteps two huge issues.
“First, no evaluation system will ever be administrator-proof. Attempting to provide more oversight will actually reduce effectiveness, because more oversight = more paperwork, and more paperwork means that the task shifts from ‘do the job well’ to ‘fill out the paperwork the right way,’ which is easy to fake.
“Second, the evaluation system only works if the evaluation system actually measures what it purports to measure. The current “new” systems in place across the country do not do that. Linkage to student data is spectacularly weak. We start with tests that claim to measure the full breadth and quality of students’ education; they do not. Then we attempt to create a link between those test results and teacher effectiveness, and that simply hasn’t happened yet. VAM attempted to hide that problem behind a heavy fog bank, but the smoke is clearing and it is clear that VAM is hugely invalid.
“So, having an argument about how to best make use of teacher evaluation data based on student achievement is like trying to decide which Chicago restaurant to eat supper at when you are still stranded in Tallahassee in a car with no wheels. This is not the cart before the horse. This is the cart before the horse has even been born.”
“I mean, heck– all I need to do is give myself a slick name and put any one of these blog posts into a fancy pdf format with some professional looking graphic swoops”
I think this also applies to economists.
“Econo Mist*”
The fog-bank from economists
Is obfuscating conman trysts
Obliterating common sense
And sabotaging recompense
*The German meaning also works
How the hell do you just whip these off?? I have nothing to say other than, you’re just freaky brilliant. Keep it coming.
in the case of economists, they whip themselves (self flagellation)
Don’t need any help from me. 🙂
🙂
The verdict is in.
As put so eloquently by a very old and very dead and very Greek guy:
“As a vessel is known by the sound, whether it be cracked or not; so men are proved, by their speeches, whether they be wise or foolish.” [Demosthenes]
And as amply demonstrated, SomeDAM Poet is not just wise.
“Against the assault of laughter nothing can stand.” [Mark Twain]
Wisdom. Humor. Reminds me of a few lines from that Tennessee Ernie Ford song, “Sixteen Tons”:
“One fist of iron, the other of steel
If the right one don’t a-get you
Then the left one will”
😎
You are like the Eminem of education
Where is TE? He is oddly silent.
TE phone home
I am here wondering why a paper written by a man with a master’s in public policy and a woman with a degree in community health would lead to a discussion ridiculing the work of people like James Heckman of the University of Chicago.
Long story short, we now have the Donald Trumpification of American public education. A disaster in the making.
What about THOUGHT LEADERS? OY! Same-o, same-o. Insanity.
This “think tank” is about “edujobs” which are basically jobs in the education “reform” sector that do not require direct service to children (i.e. teachers need not apply). It is about siphoning school tax money into private pockets. Fortunately the media and the public are catching on.
Bill Gates could have done so much for education just by familiarizing himself with 50 years of educational research. There is a lot that we know (Hint: What’s good for the privileged child is usually good for the underprivileged). I like the man and still hope that he changes course.
You like Bill Gates? I would suggest that your affections are misplaced.
It has always bothered me that there are people who influence the classroom who have spent their careers trying to figure out how to avoid the classroom.
…and reality, too
“Divorced from Reality”
I never married reality
So can not be divorced
Reality is not for me
And sure can’t be enforced
You are the total BOMB, DAM poet! Thank you!
You are Dylan-like – economy of words – either Dylan…
I like this because it is all encompassing: it can be applied to Michelangelo and Bill Gates.
There’s nothing like the “scientific method”: you’re supposed to look at data to see if it supports or refutes your hypothesis. The problem with faux think tanks is that they start with a conclusion and then make everything else fit their biased model. Drug companies do this all the time! They want to pin the academic stumbling of some American children on teachers. If they can discredit and demoralize teachers, and attack their due process and pensions, they can swoop in like the vultures they are and pick the bones of American education through privatization. The American Statistical Association has already come up with the unbiased answer: teachers account for only about 1-14% of the variability in student test scores. There are too many variables, and the teacher’s share is not large enough to hang a paycheck on.
The test VAM monster lurks in the fog
Public funds to eat it cultivates
It must be chopped down like a log
Then once more we’ll educate
From the article:
“Here we find the technocrat blind faith in data rearing its eyeless head again.”
But Peter, I’m sure that there are at least a few light-sensing cells on that head!
How is it possible to make rational arguments based on false / flawed hypotheses / assumptions?
An economic concept of growth as a “measurable gain” has migrated into federal policies for education. The policy impulse is to simplify the multifaceted character of education and treat the enterprise of teaching and learning as a business in need of proper management to get results. The desired results are defined by forms of learning that can be measured and with a calculation of the rate of learning within a year and year-to-year, comparable to knowing whether profits are increasing—on a trajectory of growth or not.
This economic concept of growth as a “rate of increase” now overrides the educational meanings of human growth and learning—as a multifaceted, dynamic, and interactive process with daily surprises and influences from many sources.
Federal policies treat the economic meaning of growth as a virtue and as an imperative for accountability. This “accountability imperative” is evident in key definitions within RttT regulations and other grant programs. Federal Register (2009, November 18). Rules and Regulations, Department of Education: Final Definitions, 74(221), 59751-52.
“Student achievement means (a) For tested grades and subjects: (1) A student’s score on the State’s assessments under the ESEA; and, as appropriate, (2) other measures of student learning, such as those described in paragraph (b) of this definition, provided they are rigorous and comparable across classrooms. (b) For non-tested grades and subjects: Alternative measures of student learning and performance such as student scores on pre-tests and end-of-course tests; student performance on English language proficiency assessments; and other measures of student achievement that are rigorous and comparable across classrooms.”
“Student growth means the change in student achievement for an individual student between two or more points in time.”
“Rigorous” means “statistically rigorous.” Federal Register (2009, July 29). Notices, 74(144), 37803-37. Retrieved from the Federal Register Online via GPO Access [wais.access.gpo.gov] [DOCID:fr29jy09-148]
The federal definition of an “effective” teacher requires attention to the rates at which students’ scores increase.
“Effective teacher means a teacher whose students achieve acceptable rates (e.g., at least one grade level in an academic year) of student growth (as defined in this notice).
“Highly effective teacher means a teacher whose students achieve high rates (e.g., one and one-half grade levels in an academic year) of student growth (as defined in this notice).”
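Taken at face value, those definitions reduce to a simple threshold comparison. Here is a minimal sketch, assuming (what the notice never actually specifies) that a teacher is rated on her students’ average gain expressed in “grade levels”; every number is invented for illustration.

```python
# A sketch of what the RttT definitions reduce to in practice. Note that
# the "grade level" unit is itself never defined anywhere in the notice.

def rate_teacher(grade_level_gains):
    """Classify a teacher from her students' gains, per the thresholds."""
    avg_gain = sum(grade_level_gains) / len(grade_level_gains)
    if avg_gain >= 1.5:
        return "highly effective"
    if avg_gain >= 1.0:
        return "effective"
    return "needs improvement"

# One invented class: five students' gains expressed in "grade levels."
print(rate_teacher([0.8, 1.2, 1.5, 0.9, 1.1]))  # -> effective
```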
It should be obvious that calculations to determine “rates” of growth depend on a data system that matches the test scores of individual students and the “teacher of record” for a given student and test. Gates and USDE have poured millions into getting data systems linked and free of crud that will compromise the metrics for accounting.
The data in these records serve as “baselines” for estimates of the “value-added” by a teacher to the scores of their students and various sub-groups. VAMs produce these estimates. SLOs (student learning objectives) are a proxy for VAM until statewide tests for non-tested subjects are developed.
Federal definitions mandate “comparable” ratings of teachers regardless of the grade or subject. Learning a foreign language, or math, or learning in dance must be made to look comparable. The bean counters, and bookkeepers, and accountants, and statisticians can’t deal with qualitative differences.
Federal policy makers have sought to “normalize” the idea that economic growth is the same as “student growth” and just an extension of the longstanding metaphor of teaching as nurture, cultivation, gardening (kindergarten)—a child’s garden.
Today, almost every teacher who uses the phrase “student growth” in connection with evaluation has been infected with the federal definition.
Some value-added experts love this easy conflating of the meanings of growth because it makes the convoluted metrics for VAM and SLOs easier to sell… And the silly oak tree analogy is one means of doing so. See http://www.varc.wceruw.org/tutorials/Oak/index.htm
“Highly effective teacher means a teacher whose students achieve high rates (e.g., one and one-half grade levels in an academic year) of student growth (as defined in this notice).”
How completely absurd is that definition, considering there is no solid, agreed-upon definition of a “grade level,” nor of what exactly “student growth” is, except for the circular definition implied in that statement!?!?
Otros pensamientos brillantes. (“Other brilliant thoughts.”)
OPB!
Part and parcel of this problem is that economic growth measures have all been skewed toward the short term– the quarterly and annual fiscal reports– through changes in accounting practices underway since the late ’70s. Mergers & acquisitions, leveraged buyouts, quick turnaround, emergency takeover– get in, get your $ & get out– the sort of predatory financing that set in when long-term financial stability began to leave our shores– the picking of the carcass if you will– characterizes the prevailing ‘economic concept of growth as a measurable gain’. Long-term planning characterizes proper management of education, transportation, infrastructure.
I cannot help but think of the state of our world, with wars in Ukraine/Russia and Israel/Gaza-Palestine, and the lunatic savages in Syria/Iraq – and these idiot think tanks backed by millionaires have nothing better to do than create bogus one-sided reports, manipulate statistics, and spend billions on elections, all to undermine US education, break unions, eliminate teachers and instill a curriculum to create dummies. It’s shameful. There are more important things going on in the world than to waltz in edu-boob circles.
Can I quote you Donna? Brilliant!
The idea that you can isolate or accurately measure student growth is like a bad rumor in high school that you can’t seem to shake.
If a student scores lower on a post-test, that means they somehow lost knowledge that you, as a teacher, never taught them.
And what of students several grade levels below the threshold where the test theoretically begins measuring growth? They do not even register, even if they improve several grade levels.
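That floor effect is easy to show in miniature; the scale and scores below are invented for illustration.

```python
# Floor effect in miniature: a test cannot register growth that happens
# below its reporting floor. Scale and scores invented for illustration.
FLOOR = 400   # lowest score the test can report

def reported(true_score):
    return max(true_score, FLOOR)

# A student who truly improves from 310 to 380 shows zero measured growth:
print(reported(380) - reported(310))   # -> 0
```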
More of the absurd definitions and econometric nonsense of the day.
Students are performing on grade level if their scores on a standardized test are at or above the median on a percentile scale (1-99). On a large-scale test, a score at or near the 50th percentile (the median) will usually classify a student as proficient in the skills and subject matter on the test.
Expected growth means that gain-scores of students (on tests in a single subject, such as math or art) are staying in about the same location in a distribution from year to year—below average, average, or above average. For a large number of students, the distribution is likely to resemble a bell or normal curve.
Predicted growth is an inference about a student’s future gain-score, derived from a linear regression analysis of two or more years of that student’s gain-scores. This analysis assumes that past performance will predict future performance. Perhaps, but in education, this is a dismal assumption. It can become a self-fulfilling prophecy. The assumption is so risky that almost every corporate report begins with this caveat: Past performance does not predict future performance.
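A minimal sketch of that inference, with invented gain-scores; the “predicted growth” is just a fitted line extended one more year.

```python
import numpy as np

# Sketch of "predicted growth" as described above: fit a line to one
# student's past gain-scores and extrapolate. All numbers are invented.
years = np.array([1, 2, 3])
gains = np.array([14, 11, 9])             # gain-scores for years 1-3

slope, intercept = np.polyfit(years, gains, 1)
predicted_year4 = intercept + slope * 4   # the "trajectory," extended
print(round(predicted_year4, 1))          # -> 6.3
```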
A student is said to have achieved a year’s worth of growth if his or her gain-score on a test of proficiency is equal to, or greater than, the gain-score made by a 50th percentile student. The same measure is applied to teachers. Teachers in some districts are rated highly effective only if all or most of their students have gain-scores of more than a year’s worth of growth.
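Because the yardstick is itself the median gain, the standard is relative by construction: roughly half of all students must fall short of “a year’s worth of growth” no matter how much anyone learns. In miniature, with invented gain-scores:

```python
import numpy as np

# Sketch of the "year's worth of growth" yardstick, with invented
# gain-scores. The benchmark is simply the median gain.
gains = np.array([3, 7, 10, 12, 12, 14, 18, 22, 25, 31])

years_worth = np.median(gains)            # the 50th-percentile gain
made_it = gains >= years_worth
print(years_worth, int(made_it.sum()), "of", gains.size)  # 13.0 5 of 10
```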
References to a year’s worth of growth are fundamentally misleading because the common mental picture of a calendar year is different from a school year (typically 180 days); an instructional year (typically 172 days); and a typical accountability year (130 days from pre-test to post-test).
Academic peers are students whose test scores in a given year are the same or nearly the same. This concept permits comparisons of their gain-scores from the prior year to the current year. Students who make greater gains than their academic peers have an accelerated growth trajectory. Students who fall behind their academic peers need remedial work to keep up. The average of the gain-scores for academic peers in a teacher’s classes is typically used as a measure of the teacher’s productivity and effectiveness. This use requires a studied indifference to other influences on test scores.
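A minimal sketch of the peer comparison, assuming a simple lookup of statewide average gains by prior-year score; all values are invented.

```python
import numpy as np

# Sketch of the "academic peers" comparison, all numbers invented. Each
# student's gain is compared with the statewide average gain of students
# who had the same prior-year score; the class mean of those differences
# is then treated as the teacher's "productivity."
statewide_avg_gain = {410: 11.0, 450: 14.0, 500: 10.0}  # by prior score

prior = np.array([410, 410, 450, 450, 450, 500, 500])   # this class
gain = np.array([12, 18, 9, 15, 21, 6, 14])

peer_avg = np.array([statewide_avg_gain[p] for p in prior])
print((gain - peer_avg).mean())   # ~1.57 "units" of teacher productivity
```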
A growth trajectory needs a target. Targets for learning need to be set using baseline data so the instruction offered to each student, during a known interval of time, is efficient and has a measurable impact on student learning. Meeting targets for learning is analogous to meeting a sales target or a production quota by a date certain.
Teachers and others who say they are “impacting the growth of their students” are not thinking about the meaning of words. They are parroting econometric jargon.
Experts associated with MetaMetrics hope to set growth velocity standards. They describe their theoretical mapping of “aspirational trajectories toward graduation targets” in reading skills as analogous to “modifying the height, velocity, or acceleration respectively of a projectile launched in the physical world.” They seek greater precision in setting targets and cut scores for grade-to-grade progress in meeting the CCSS. (Williamson, G. L., Fitzgerald, J., & Stenner, A. J. (2013). The Common Core State Standards’ quantitative text complexity trajectory: Figuring out how much complexity is enough. Educational Researcher, 42(2), 59-69.)
Calibration refers to the quest for precision and consistency in measurement in the context of just-in-time delivery of a result, especially in manufacturing. In education, the term means that evaluators and other monitors have followed specifications in rating performances, presentations, processes, and products. Calibration events are training sessions intended to standardize how raters use or interpret language and to verify that rules for making judgments have been followed with fidelity. Such events are also called trainings or calibrations.
Audits are conducted to verify that calibrations are not needed, that rules have been followed, that data are free of ambiguity, and that low-inference definitions of performances and metrics are used consistently. Questions about the validity of the metrics may be ignored.
Bring to scale means that an educational policy, practice, or product is believed to merit replication in multiple locations, as in manufacturing and franchise systems for a mass market.
“Questions about the validity of the metrics may be ignored.”
YEP!
All the edu-babble from the edudeformers, devout devotees of edumetrics, who love their mental masturbation with edumetric memes.
Hey, you have a poem there.
All it needs is a little rearrangement and a title, if I may
“Edu Scrabble”
Edubabble from edudeformers,
Devotees of edunormers
Apply their edumathturbation
To edumetric memes
Resulting in disasturbation
Of education themes
I sit in awe of you SDP!
My brain doesn’t work very well in rhyme.
I hope you don’t mind if I save and use that one!
Laura,
I am curious about how you determine grades for students in your classes. Does it have nothing to do with comparing how much a student should learn from the class to how much each individual student has learned from the class (measured by examining teacher-constructed exams and assignments)?
“more paperwork means that the task shifts from ‘do the job well’ to ‘fill out the paperwork the right way,’ which is easy to fake.” I wish that were true. I’m in trouble not because of anything missing during the actual observations of my teaching but because I didn’t do a good enough binder of paperwork documenting that I did a good job of teaching. I need help with how to fake it.