I posted earlier today about a new Xerox machine that is being marketed to “read” and grade student essays. Not to score bubble tests, but to grade essays. Granted, this is not a new idea. There are now different companies selling machines to grade student writing. I have seen demonstrations of this technology, and I can’t shake the feeling that this is not right.
Why? I am not opposed to technology. But here is the nub of my discomfort. I am a writer. The moment I realized I was a writer was when I discovered many years ago that I write for an audience. I think of my reader(s). If I am writing for a tabloid, I write in a certain style. If I am writing for the New York Times, I write in another way. If I am writing a letter to a family member, another style. If I am writing for a scholarly journal, something else. When I write for this blog, I have a voice different from the voice in my books. I don’t know how to write for a machine.
Robert Shepherd reminded me how important the audience is for a writer when he posted this comment about the Xerox grading machine:
“The slick piece of marketing collateral that Xerox produced for this product features, most prominently, a picture of a smiling teacher bent over to help a smiling student. But the promise of the product is precisely the opposite–that teacher feedback will be eliminated (automated).
“Clearly, it’s a fairly simple matter to create technologies that correct multiple-choice and other so-called “objective” tests. More troubling is the promise that the technology will score “constructed response” items (in non-EduSpeak, writing). Let’s be clear about this. There is no existing system that can read, as that term is understood when it is predicated of a human being. What creators of such software can do is to correlate various features of pieces of writing that can easily be recognized by software to outcomes assigned those pieces of writing by human teachers.
“So, one might come up with some formula involving use in the piece of writing of terms from the writing prompt, average sentence length, average word length, number of spelling errors, number of distinct words used, frequency of words used, etc., that yields a score that is highly correlated with scores given by human readers/graders using a rubric. At a whole other level of sophistication, one might create a system that has a parser and that does rudimentary checking of grammar and punctuation. Some of that is easy–e.g., does each sentence begin with a capital letter? Some of it is rather more difficult (a system that correctly identifies all and only those groups of words that are sentence fragments would have to be a complete model of grammatical patterns for well-formed sentences in English).
“Who knows whether the Xerox system is that sophisticated? One cannot tell whether it is from the marketing literature, which is a concatenation of glittering vagaries. But even if one had a perfect system of this kind that almost perfectly correlated with scoring by human readers, it would still be the case that NO ONE was actually reading the student’s writing and attending to what he or she has to say and how it is said. The whole point of the enterprise of teaching kids how to write is for them to master a form of COMMUNICATION BETWEEN PERSONS, and one cannot eliminate the person who is the audience of the communication and have an authentic interchange.
“Since these writing graders first started appearing, I have read an enormous amount of hogwash about them from people who don’t understand that we don’t yet have artificial intelligences that can read. Instead, we have automated systems for doing various tasks that stand in lieu of anyone doing any reading.”
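To make Shepherd's point concrete, here is a minimal sketch in Python of the kind of formula he describes. It is only an illustration under stated assumptions: the function names, the feature names, and the weights are all hypothetical, and nothing here is taken from Xerox or from any real grading product. In a real system the weights would presumably be fitted so that the output correlates with scores assigned by human graders.

```python
import re


def surface_features(essay: str, prompt: str) -> dict:
    """Count easily recognized surface features of an essay (hypothetical helper)."""
    # Crude sentence split: break after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", essay.strip()) if s]
    # Lowercased word tokens made of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", essay.lower())
    prompt_terms = set(re.findall(r"[A-Za-z']+", prompt.lower()))
    n_words = len(words)
    return {
        # How many distinct terms from the writing prompt appear in the essay.
        "prompt_terms_used": len(prompt_terms & set(words)),
        "avg_sentence_length": n_words / len(sentences) if sentences else 0.0,
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "distinct_words": len(set(words)),
        # The "easy" check: does each sentence begin with a capital letter?
        "capitalized_sentences": sum(1 for s in sentences if s[0].isupper()),
    }


# Made-up weights for illustration only; they are NOT fitted to any data.
EXAMPLE_WEIGHTS = {
    "prompt_terms_used": 0.5,
    "avg_sentence_length": 0.1,
    "avg_word_length": 0.3,
    "distinct_words": 0.02,
    "capitalized_sentences": 0.2,
}


def formula_score(features: dict, weights: dict = EXAMPLE_WEIGHTS) -> float:
    """Weighted sum of surface features; features without a weight count as zero."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


if __name__ == "__main__":
    essay = "Dogs are loyal. they bark loudly!"
    prompt = "Write about dogs."
    features = surface_features(essay, prompt)
    print(features)
    print(formula_score(features))
```

Notice what the sketch never does: it counts and averages, but at no point does it attend to what the student has to say. That is exactly the gap Shepherd is describing.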