This comment was posted yesterday:

I am a former part-time item writer for a private testing company; I wrote for many different state standards under NCLB. I must say that poorly constructed, confusing, or developmentally inappropriate items undermine the validity of standardized scores and their subsequent use in teacher evaluation. When standardized tests are properly constructed, any such items that make it to a field test will almost certainly be vetted during what is typically a two-year process. Many items on the Pearson math and ELA tests administered last April here in NY were written, in my opinion, in an intentionally confusing style using obtuse or arcane vocabulary. The ELA test in particular included confusing item stems and distractors that were not clearly wrong. There were far too many items that turned subjective opinions ("most likely"; "best"; "author's intent"; etc.) into a "one right, three wrong" format. Many teachers were unsure of the correct answers on a number of vague and fuzzy items.
The math test included many items that were ridiculously convoluted. Although there may be other compelling arguments against VAM teacher evaluations, corrupt test writing, norm-referencing (instead of criterion-referenced scoring), and manipulated cut scores add up to a rather important set of reasons to invalidate the entire process.