Jennifer Borgioli, whom I met via Twitter and know as DataDiva, has sent me a post about performance assessments in New York. She is responding to an earlier post about the New York Performance Standards Consortium, which has thus far not gotten permission from the state to add 19 schools to its group. The Consortium many years ago won an exemption from all state standardized testing (except for the Regents ELA exam) and relies instead on performance assessments judged by teachers and others. The article cited in the earlier post indicated that the state was reluctant to allow other schools to escape the state testing regime, which Jennifer does not contest. She believes that the state is open to performance assessment within its testing regime.
In your blog, your last paragraph seemed to suggest that NYS wants to discourage the use of performance-based tasks when in fact, I don’t think that’s the case. I cannot speak to the reasons why there is hesitation to approve more schools for performance-based alternatives to the Regents, but I do know there is room for performance-based tasks in all schools in New York. There are, in effect, two types of performance tasks: the large-scale, long-term tasks you described (portfolio assessments, etc.), which are used as exit criteria and reflect a deep understanding of the content and skills in a given domain or topic, and small-scale, shorter on-demand performance tasks that require students to follow a series of steps in order to generate a product or engage in a performance.
The NYS APPR guidance documents reference performance tasks at least twice:
F5. We want to use locally‐developed performance tasks for a variety of grades and subjects that would be assessed using a rubric. Is that allowable?
Subject to local negotiation, locally‐developed performance tasks scored by a rubric could be used as a district, regional, or BOCES developed assessment wherever locally developed assessments are allowed as either a comparable growth measure or a locally selected measure provided that such assessments are rigorous and comparable as described above.
G4. Does vested interest rule apply to pre‐tests given to establish a baseline for a SLO?
To the extent practicable, districts or BOCES should ensure that any assessments or measures, including those used for performance‐based or performance task assessments that are used to establish a baseline for student growth are not disseminated to students before administration and that teachers and principals do not have a vested interest in the outcome of the assessments they score.
In a number of regions across the state, teachers are working together to design performance tasks that get at the most critical learning of their content area or course, as determined by the New York State Learning Standards, in a way that is as authentic as possible. These are not large-scale tasks full of student choice and authentic assessment, but neither are they traditional, multiple choice tests. Their use reflects a commitment by the participating schools to minimize the impact of the APPR regulations on students. These assessments are designed by classroom teachers and will be scored by them using rubrics they create. Though we do not have evidence of reliability yet (that will come after the administration of the pre-assessments and an inter-rater reliability analysis this fall), there is every reason to believe that these assessments will generate results as consistent as those generated by a machine-scored, publisher-created multiple choice test.
It’s a small move but it’s a start.
Just to be clear, the last two paragraphs in italics are not in the state guidance doc but are instead Jennifer’s interpretation of the situation. The state doc is posted here: http://goo.gl/PQ1L9 In addition, Commissioner King has threatened to intervene in any teacher evaluation system that is not “rigorous” enough.
Thanks for clarifying that, Leonie. I think the formatting of Diane’s blog bumped over my quotes. Rigor is always an interesting challenge. Still, I have confidence that the teachers are designing tasks that are more complex, rich, and deep than any multiple choice test will ever be.
I do not see how Jennifer can claim that a single constructed response item can generate results as consistent as those of a multiple choice test.
More valuable? Perhaps. More valid? Could be. More consistent? Not really. Inter-rater reliability does not address the Item x Test-Taker interaction. The sampling of the domain would be FAR less broad, leading to inference issues, too.
Yes, we need constructed response items to get into problem solving, reasoning and writing. And yes, those three things could be the most important things about formal education. We need constructed response.
There are real reasons to include multiple choice questions. Because they take students less time, they allow us to sample more broadly from the content domain (people care a lot about content knowledge in most subjects), and they reduce variance by increasing the number of items.
Unless we are willing to devote a lot more time to testing, we need to include multiple choice items. One to three constructed response items simply cannot be as consistent as 25-40 multiple choice items.
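The point about item count and consistency can be made concrete with the classical Spearman-Brown prophecy formula, which predicts how a test's reliability changes as parallel items are added. A minimal sketch follows; the per-item reliability of 0.15 is an illustrative assumption, not a figure from any actual New York assessment, and the formula assumes items are parallel (equally reliable and measuring the same construct):

```python
def spearman_brown(item_reliability: float, n_items: int) -> float:
    """Predicted reliability of a test built from n_items parallel items,
    each with the given single-item reliability (Spearman-Brown formula)."""
    return (n_items * item_reliability) / (1 + (n_items - 1) * item_reliability)

# Illustrative per-item reliability; real values vary by item, subject, and rubric.
r = 0.15
print(f"3 constructed response items:  {spearman_brown(r, 3):.2f}")   # ~0.35
print(f"30 multiple choice items:     {spearman_brown(r, 30):.2f}")  # ~0.84
```

Even when each item is equally informative on its own, the longer test averages out more item-level noise, which is the variance-reduction argument above. Constructed response items also add rater variance as a further error source, which this sketch does not model.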
I am not sure that the unwillingness of the state to approve performance assessments in general is as much an issue as the commissioner’s unwillingness to back off on the weight of standardized state testing in teacher evaluations. He was pressuring (threatening?) districts and unions to agree to such systems before the new state tests were even ready; the most recent tests were not of good quality overall (by his own admission); and word has it that in the near future we will move to through-the-year PARCC assessments. So it doesn’t appear our own ed commissioner’s focus is on quality assessments, or on the people who are always ready to help with the concept of what students need and the many ways they demonstrate success. It appears his focus is on fast and sloppy, and on getting rid of the people who know what fast and sloppy is.