Georg Lind: Our Stupid Standardized Tests

Georg Lind is a retired psychologist.

“As a psychologist I recommend abandoning all tests based on Classical and Modern “test theories.” But I am not sure whether my colleagues at APA and AERA will agree with me. They make a living by applying traditional tests. Even those who critically examine test usage do not question their validity and their use in principle. They not only have vested interests but also have not heard of possible alternatives to which they could switch. Critical scholars like Alan Schoenfeld, professor of math didactics and former president of AERA, warn us against the use of psychometric methods, but all they suggest is a moratorium on tests. I think we can do better.

“I am a retired German professor of psychology who specialized in experimental and psychometric methods, alongside my involvement in the study of moral-democratic competence and its application in education. Already during my university studies I grew suspicious of Classical Test Theory and its modern variations (IRT, Rasch scaling), on which nearly all tests are based. The better I understood these “theories,” the more I discovered that they have nothing to do with scientific psychology. Prevailing test theories are a modern form of voodooism, with sacred rituals meant to make people believe that our sorting and evaluating of people is something rational and scientific. It is not.

“Prevailing test theories fail an important standard of sound science: they cannot be falsified by data; they are immune to reality. If a test yields some anomalies, its items are replaced until the data fit the statistical dogma of reliability, regardless of the damage this “item analysis” does to the overall validity of the test. Because test makers have no real understanding of what they measure, they cannot answer the basic question of validity: Does the test really measure what we intend to measure? Instead they invent all kinds of “validities” in order to save their assumptions.

“No wonder that these tests have all failed. They have little, if any, “prognostic validity.” Even much-criticized teacher grading is a better predictor of college success. Moreover, no support can be found for the allegation that their use would improve teaching and learning. I have analyzed many studies of the effects of the high-stakes testing which began with the Head Start program in 1965, the year when I was an exchange student in the US. I could not find any support for this allegation. Some small, short-term increases in test scores occurred, but they could be fully explained by growing test-wiseness and cheating. Therefore, tests have to be replaced by new versions at an ever faster rate.

“That was the first time I had to take a test as a school student. In Germany we had no multiple-choice tests in school until PISA started. I was surprised how easy it was to get an A. A 90-minute test took me just ten minutes to answer. I did not know many of the answers; I just made guesses. Only much later did I understand why my schoolmates worked harder but got lower test scores. It was BECAUSE they worked harder. For me tests were just fun, like crossword puzzles. I was not obliged to earn credits. For them tests were high-stakes. The tests scared the hell out of them and confused them. Peter Sacks has shown how test anxiety, students’ background and test scores are connected. Tests cannot compensate for student poverty, bad teacher education and poor curriculum. On the contrary, they even seem to deepen these disadvantages.

“But if tests are based on well-elaborated teaching goals and on sound psychology, and if they are used anonymously, they can be a great help for improving curriculum and teaching methods. If tests are used not for evaluating people (which I believe is a human rights issue) but for evaluating teaching methods and content, and for improving teacher education programs, they can be a real blessing. I have shown how a valid test can help to multiply the effect size of methods for teaching moral competence. Just google the experimentally designed Moral Competence Test. Its construction principle, the Experimental Questionnaire, can be easily adapted for other fields of teaching.”

Dienne says:

April 5, 2017 at 11:17 am

I’m sorry, I know I’ve gotten awfully cynical, but this guy just sounds like he’s peddling his own wares. Those standardized tests are baaaad. But mine are good because they’re based on “sound theory”. But he’s talking about teaching and testing “moral competence”. What the heck? What is that? If you surveyed 100 people about what “moral competence” is, you’d get 110 different answers. You can’t assess, let alone measure, anything that can’t be defined (and even many things that can be defined still can’t be measured). He can keep his test of “moral competence” away from my kids, thanks.

ciedie aech says:

April 5, 2017 at 1:29 pm

Yes. Seems to me the most telling line in Lind’s criticism can be pared down just a bit to truly reveal the most dangerous problem with testing: “THEY MAKE A LIVING on applying…tests.”

Máté Wierdl says:

April 6, 2017 at 9:22 am

Not sure we want to throw out every idea with the word “test” in it. Psychologists use the word “test” left and right without realizing that tests and measurements, by now, have a really bad rep. Freud’s language got hijacked by the dark side.

I don’t think he proposes to “measure” in the trivial sense. He wants to figure some things out, like what he calls moral competence.

General definition of moral competence:

The ability to resolve problems and conflicts on the basis of moral principles through deliberation and discussion instead of violence, deceit, and force.

Though he does use, unfortunately for him, the word “measure” instead of “figure out”. Bad framing again. 🙂

Operational definition (MCT):

The MCT measures the ability to rate arguments by their moral quality rather than other criteria like opinion-agreement.

http://www.uni-konstanz.de/ag-moral/mut/mjt-engl.htm

- jrkrideau says:
  
  April 16, 2017 at 12:05 pm
  
  Minor complaint about “Freud’s language got hijacked by the dark side.”
  It was not Freud, whom most (all?) psychologists do not consider ever to have been a psychologist.
  
  You are thinking about people like Alfred Binet who, I think, would have been horrified by some current misuses of “tests”. I am not sure but I suspect he would have been a bit horrified about some uses of adaptations of his test and their use in the USA 0 or 80 years ago.
  
  I have no idea what “moral competence” is but I am pretty sure it is not a “test” as most psychologists understand the word. We might use the word “instrument” or even “measure” but not “test”. And we probably would be very dubious too.
  
Marian Cruz says:

April 5, 2017 at 11:28 am

Pay attention, educators!

LeftCoastTeacher says:

April 5, 2017 at 8:12 pm

Indeed. Pay attention, policy makers!

joe prichard says:

April 5, 2017 at 11:37 am

I liked what Georg Lind had to say. I must confess, I had a bit of trouble understanding test anxiety and anonymous tests. I can understand why test anxiety would cause fear and confusion. But I wonder how anyone using anonymous test results can be certain that those taking the tests had an incentive to care about their answers. It is sort of like when we are on the internet and are presented with a pop-up test to help companies learn stuff. I often just do what I have to to get them out of the way.

retired teacher says:

April 5, 2017 at 11:55 am

A big problem with many of the tests is that they are only offered via computer. There have been many glitches and technical problems, or districts do not have the money to run the tests on up-to-date computers. These tests are inappropriate for young students, and they put poor students at a bigger disadvantage. Despite the fact that research shows students do better on pencil-and-paper tests, we continue with unfair computer testing designed for the ease of testing companies.

retired teacher says:

April 5, 2017 at 11:50 am

Ever since the monetization of our schools, test scores have been misused and abused by policymakers. Results of testing have not been used to inform or improve anything. Scores have been used to harm students, limit their choices and promote punitive retention policies. Scores have been misused to close public schools and enhance privatization. Scores have been misused against teachers and have ended careers. They have been misused against traditional teacher preparation programs in order to promote fake market-based “training.” Scores have been used to target that which corporations and billionaires want to destroy, while many of our policies and representatives have used high-stakes testing to undermine public education.

retiredbutmissthekids says:

April 5, 2017 at 5:44 pm

In terms of the “standardized” tests–nothing new that our friend, Duane, hasn’t already told us via Wilson.
And, Duane, you can never repeat it enough! (Always new readers here.)

Laura H. Chapman says:

April 5, 2017 at 6:16 pm

How about giving the Moral Competence Test to Trump, his hirelings, and members of Congress who support him?

I take that back. Any test of moral competence scares me.

The post does seem to be promoting a specific test while also being critical of some circular reasoning that is rampant in the testing industry.

Richard P. Phelps (@RichardPPhelps) says:

April 7, 2017 at 9:10 am

He’s wrong or misleading on all counts. The theories can be falsified and there’s plenty of evidence where he says none exists (e.g., http://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ). Of course, tests can be misused and often are (e.g. Common Core). And, test developers and users can behave unethically, just like anyone else. A standardized test is a tool, and like any other tool can be used beneficially or not depending on human decisions.

jrkrideau says:

April 16, 2017 at 12:14 pm

I don’t live in the USA but I get the impression that most ‘large-scale’ tests in the USA are misused, due in large part to a lack of knowledge of what a good test can and cannot do. Oh, and statistical uncertainty seems to be unknown.

It appears that the confidence intervals are usually too large for the use the test is put to. This may not mean there is anything wrong with the test, just that the users don’t have a clue about what they are using.

Diane Ravitch's Blog
