From a reader:
“You might be interested in a related discussion list post “The Defiant Parents: Testing’s Discontents – Response to Hunt” [Hake (2014)]. The abstract reads:
***************************************************
ABSTRACT: In a post “Re: The Defiant Parents: Testing’s Discontents” [Hake (2014)], I pointed to the “vigorous leadership, voluminous messaging, and pro-public-/anti-private-education positions” of (a) Diane Ravitch and (b) FairTest.
Then I commented that neither appeared to be informed regarding the virtues of rigorous measurement of students’ higher-order learning by means of zero-stakes formative evaluation “designed and used to improve an intervention, especially when it is still being developed” [JCSEE, copied onto p. 132 by Frechtling et al. (2010)].
In response, Russ Hunt (2014) wrote: “The virtues of rigorous testing aren’t really the point: it’s how the tests are administered and what uses they’re put to that Ravitch and FairTest (and I) are concerned with.”
However, it IS to the point for many of those who wish to enhance students’ higher-level learning. Modesty forbids mention of these examples:
1. “Lessons from the Physics Education Reform Effort” [Hake (2002)];
2. “The Physics Education Reform Effort: A Possible Model for Higher Education” [Hake (2005)];
3. “Should We Measure Change? Yes!” [Hake (2007a)] (2.5 MB);
4. “Re: pre-to-post tests as measures of learning/teaching” [Hake (2008a)];
5. “Design-Based Research in Physics Education Research: A Review” [Hake (2008b)] (1.1 MB);
6. “The Impact of Concept Inventories On Physics Education and Its Relevance For Engineering Education” [Hake (2011a)] (8.7 MB);
7. “SET’s Are Not Valid Gauges of Students’ Higher-Level Learning #2” [Hake (2011b)];
8. “The NRC Finally Comes to Its Senses on Improving STEM Education” [Hake (2013a)]; and
9. “Can the Cognitive Impact of Calculus Courses be Enhanced?” [Hake (2013b)] (2.7 MB).
***************************************************
To access the complete 66 kB post, please see Hake (2014) in the REFERENCES.
Regards,
Richard Hake, Emeritus Professor of Physics, Indiana University; Honorary Member, Curmudgeon Lodge of Deventer, The Netherlands; President, PEdants for Definitive Academic References which Recognize the Invention of the Internet (PEDARRII); LINKS TO: Academia; Articles; Blog; Facebook; GooglePlus; Google Scholar; LinkedIn; ResearchGate; Socratic Dialogue Inducing (SDI) Labs; Twitter.
“Physics educators have led the way in developing and using objective tests to compare student learning gains in different types of courses, and chemists, biologists, and others are now developing similar instruments. These tests provide convincing evidence that students assimilate new knowledge more effectively in courses including active, inquiry-based, and collaborative learning, assisted by information technology, than in traditional courses.” – Wood & Gentile (2003).
REFERENCES [URLs accessed on 31 Jan 2014.]
Hake, R.R. 2014. “The Defiant Parents: Testing’s Discontents – Response to Hunt,” online on the OPEN! AERA-L archives. The abstract and link to the complete post are being transmitted to several discussion lists and are on my blog “Hake’sEdStuff,” with a provision for comments.
Mead, R. 2014. “The Defiant Parents: Testing’s Discontents,” New Yorker, 23 January.
Wood, W.B., & J.M. Gentile. 2003. “Teaching in a research context,” Science 302: 1510; 28 November; online as a 213 kB pdf, thanks to Ecoplexity.”
Hake writes: “Modesty forbids mention of these examples:” followed by 9 citations. Odd kind of modesty.
🙂
Irony, friend, irony.
Overuse of high-stakes testing has trained students that learning is about right and wrong answers, not the learning process. After a decade of high-stakes testing, students are unwilling to take risks in the learning process, and they demonstrate weak problem-solving skills. One standardized test at each level (elementary, middle, and high school) would suffice.
Why do we even need that? NO standardized testing! It teaches us nothing. However, I DO agree that the decade of testing has conditioned students. I have a hard time getting students to give answers that are longer than a sentence.
Testing, especially bubble tests are lousy no matter how you look at them. However, if connected to the classroom, they can have value as a snapshot in time. They must, first, be compared with other assessments as well as classroom activities, as one piece of the puzzle. Second, they MUST be given pre and post; nothing else truly assesses a student that you have in your class. And third, they must be short and not time-consuming. Remember, they are snapshots in time. http://savingstudents-caplee.blogspot.com/2013/12/accountability-with-honor-and-yes-we.html
“Testing, especially bubble tests are lousy no matter how you look at them”?? Not everyone agrees. Psychometricians Wilson & Bertenthal (2005) wrote (p. 94):
“Performance assessment is an approach that offers great potential for assessing complex thinking and learning abilities, but multiple choice items also have their strengths. For example, although many people recognize that multiple-choice items are an efficient and effective way of determining how well students have acquired basic content knowledge, many do not recognize that they can also be used to measure complex cognitive processes. For example, the Force Concept Inventory . . . [Hestenes et al. 1992] . . . is an assessment that uses multiple-choice items to tap into higher-level cognitive processes.”
REFERENCES
Halloun, I., R.R. Hake, E.P. Mosca, & D. Hestenes. 1995. “Force Concept Inventory (1995 Revision),” online (password protected) at http://bit.ly/b1488v, scroll down to “Evaluation Instruments.” Currently available in 21 languages: Arabic, Chinese, Croatian, Czech, English, Finnish, French, French (Canadian), German, Greek, Indonesian, Italian, Japanese, Malaysian, Persian, Portuguese, Russian, Spanish, Slovak, Swedish, & Turkish.
Hestenes, D., M. Wells, & G. Swackhamer. 1992. “Force Concept Inventory,” Phys. Teach. 30: 141-158; online (except for the test itself) at http://bit.ly/b1488v. For the 1995 revision see Halloun et al. (1995).
Wilson, M.R. & M.W. Bertenthal, eds. 2005. “Systems for State Science Assessment,” Nat. Acad. Press; online at http://bit.ly/f6WFeg.
Hopefully, the type of testing that is appropriate for 2nd graders and the type that is applicable to 11th-grade math and science students are two different discussions.
My thoughts exactly. Too often talk of teaching and learning means high school “subjects” rather than anything meaningful to my 1st graders and me.
This completely ignores the premises, context and political economy in which so-called education reform is unfolding.
Healthy food doesn’t grow in poisoned waters, and while it’s common sense that challenging, no-stakes exams can inform and improve teaching, it’s naive to think that those who are pushing and benefitting from the testing-as-curriculum regime would respond to this with anything other than a condescending smirk.
As a physics teacher I am familiar with these tests. Usually they are designed to diagnose misconceptions, and they are very good at that. However, they are low-stakes and diagnostic, and limited in scope. They do not test, nor do they claim to test, everything about the student. So really, Hake is out of touch with the landscape of the K-12 testing mess.
For anybody else who wonders WHAT that was, here is Prof. Hake’s Google Scholar page, all in one link.
http://scholar.google.com/citations?user=10EI2q8AAAAJ&hl=en
So, poking around in the citations, I can see that Hake did investigate the relationship between inquiry methods and physics test scores in 1998: “pre/post-test data using the Halloun–Hestenes Mechanics Diagnostic test or more recent Force Concept Inventory is reported for 62 introductory physics courses enrolling a total number of students N = 6542.”
That paper accounts for the bulk of the citations since, possibly because it’s there to cite and has the same search terms as the current discussion. I wouldn’t mind a discussion about it, although I’d hope the professor also held forth on superconductivity if I were picking up the tab.
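For readers wondering how such pre/post data are summarized: Hake (1998a) compares courses by the average normalized gain, the pre-to-post improvement divided by the maximum possible improvement. A minimal sketch in Python (the function name and example numbers are illustrative assumptions, not taken from the paper):

```python
# Sketch of the average normalized gain <g> used in Hake (1998a) to compare
# courses: the fraction of the possible pre-to-post improvement that a class
# actually realized. Scores are class-average percentages.
def normalized_gain(pre_avg: float, post_avg: float) -> float:
    """Return <g> = (post% - pre%) / (100% - pre%) for class-average scores."""
    if not 0.0 <= pre_avg < 100.0:
        raise ValueError("pre_avg must be in [0, 100)")
    return (post_avg - pre_avg) / (100.0 - pre_avg)

# A hypothetical class averaging 40% on the FCI pretest and 70% on the
# posttest realized half of its possible improvement:
g = normalized_gain(40.0, 70.0)  # 0.5
```

Because it is normalized by room-to-improve, the measure lets a course starting at 40% be compared with one starting at 70%.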
Prof Hake, nobody is attacking low stakes testing research, unless it’s the high stakes testing industry. Your assertion that Diane Ravitch is unfamiliar with the field is off the mark. We’re just discussing an entirely different animal these days. You would readily admit that your finding was limited and you wouldn’t rush in with armed federal marshals and demand the authority to hold all those universities and high schools accountable to a new, universally enforced master metric you proposed to design, would you?
Unfortunately WordPress ruined my post “The Defiant Parents: Testing’s Discontents – Response to Hunt” by removing all the URL’s because they were surrounded by angle brackets – anathema to WordPress. Here’s my post with the angle brackets removed:
Some subscribers might be interested in a discussion list post “The Defiant Parents: Testing’s Discontents – Response to Hunt” [Hake (2014)]. The abstract reads:
***************************************************
ABSTRACT: In a post “Re: The Defiant Parents: Testing’s Discontents” [Hake (2014)] at http://bit.ly/1mYwWoa, I pointed to the “vigorous leadership, voluminous messaging, and pro-public-/anti-private-education positions” of (a) Diane Ravitch and (b) FairTest http://www.fairtest.org/.
Then I commented that neither appeared to be informed regarding the virtues of rigorous measurement of students’ higher-order learning by means of zero-stakes formative evaluation “designed and used to improve an intervention, especially when it is still being developed” [JCSEE, copied onto p. 132 by Frechtling et al. (2010) at http://bit.ly/1aYcgYn].
In response, Russ Hunt (2014) at http://bit.ly/1hSsq6L wrote: “The virtues of rigorous testing aren’t really the point: it’s how the tests are administered and what uses they’re put to that Ravitch and FairTest (and I) are concerned with.”
However, it IS to the point for many of those who wish to enhance students’ higher-level learning. Modesty forbids mention of these examples:
1. “Lessons from the Physics Education Reform Effort” [Hake (2002)] at http://bit.ly/aL87VT;
2. “The Physics Education Reform Effort: A Possible Model for Higher Education” [Hake (2005)] at http://bit.ly/9aicfh;
3. “Should We Measure Change? Yes!” [Hake (2007a)] at http://bit.ly/d6WVKO (2.5 MB);
4. “Re: pre-to-post tests as measures of learning/teaching” [Hake (2008a)] at http://bit.ly/MmPxwp;
5. “Design-Based Research in Physics Education Research: A Review” [Hake (2008b)] at http://bit.ly/9kORMZ (1.1 MB);
6. “The Impact of Concept Inventories On Physics Education and Its Relevance For Engineering Education” [Hake (2011a)] at http://bit.ly/nmPY8F (8.7 MB);
7. “SET’s Are Not Valid Gauges of Students’ Higher-Level Learning #2” [Hake (2011b)] at http://bit.ly/jLZaz5;
8. “The NRC Finally Comes to Its Senses on Improving STEM Education” [Hake (2013a)] at http://bit.ly/154M5yf; and
9. “Can the Cognitive Impact of Calculus Courses be Enhanced?” [Hake (2013b)] at http://bit.ly/1loHgC4 (2.7 MB).
10. Added on 16 March 2014: “Teaching and physics education research: bridging the gap” [Fraser et al. (2014)] at http://bit.ly/1qITBqi.
***************************************************
To access the complete 66 kB post please click on http://bit.ly/1lqiR4u.
Richard Hake, Emeritus Professor of Physics, Indiana University; Honorary Member, Curmudgeon Lodge of Deventer, The Netherlands; President, PEdants for Definitive Academic References which Recognize the Invention of the Internet (PEDARRII); LINKS TO: Academia http://bit.ly/a8ixxm; Articles http://bit.ly/a6M5y0; Blog http://bit.ly/9yGsXh; Facebook http://on.fb.me/XI7EKm; GooglePlus http://bit.ly/KwZ6mE; Google Scholar http://bit.ly/Wz2FP3; LinkedIn http://linkd.in/14uycpW; ResearchGate http://bit.ly/1fJiSwB; Socratic Dialogue Inducing (SDI) Labs http://bit.ly/9nGd3M; Twitter http://bit.ly/juvd52.
“Physics educators have led the way in developing and using objective tests to compare student learning gains in different types of courses, and chemists, biologists, and others are now developing similar instruments. These tests provide convincing evidence that students assimilate new knowledge more effectively in courses including active, inquiry-based, and collaborative learning, assisted by information technology, than in traditional courses.” – Wood & Gentile (2003).
REFERENCES [URLs shortened by http://bit.ly/ and accessed on 31 Jan 2014.]
Hake, R.R. 2014. “The Defiant Parents: Testing’s Discontents – Response to Hunt,” online on the OPEN! AERA-L archives at http://bit.ly/1lqiR4u. The abstract and link to the complete post are being transmitted to several discussion lists and are on my blog “Hake’sEdStuff” at http://bit.ly/1iVpgmf with a provision for comments.
Wood, W.B., & J.M. Gentile. 2003. “Teaching in a research context,” Science 302: 1510; 28 November; online as a 213 kB pdf http://bit.ly/SyhOvL, thanks to Ecoplexity http://bit.ly/152aFQ9.
Can any rigorous high- or low-stakes test given to a diverse population of school children in the United States be valid and reliable?
“The virtues of rigorous testing aren’t really the point: it’s how the tests are administered and what uses they’re put to that Ravitch and FairTest (and I) are concerned with.”
Uh…not true. The virtues of rigorous testing are a problem. They have merit only to reformers.
Thanks Jon.
Harvard needs rigorous tests.
That is for selection purposes.
Public schools need diagnostic tests to help students and teachers, not rigorous tests.
XO,
Ang
My online dictionary defines “rigorous” as “extremely thorough, exhaustive, or accurate.”
No one that I know of ever claimed that the “Force Concept Inventory” [Hestenes et al. (1992)] was “rigorous” in that sense. The FCI is a *diagnostic test* that measures only the most rudimentary understanding of the basic concepts of Newtonian mechanics.
In “Concept Inventories Alone Don’t Gauge ‘Good Teaching’,” I wrote:
“In the editor-suppressed ‘Interactive-engagement methods in introductory mechanics courses’ [Hake (1998b)], I pointed out on p. 14 that among the desirable outcomes of the introductory physics course that . . . [[the average pre-to-posttest normalized gain]] . . . *does not* measure directly are students’:
(a) satisfaction with and interest in physics;
(b) understanding of the nature, methods, and limitations of science;
(c) understanding of the processes of scientific inquiry such as experimental design, control of variables, dimensional analysis, order-of-magnitude estimation, thought experiments, hypothetical reasoning, graphing, and error analysis;
(d) ability to articulate their knowledge and learning processes;
(e) ability to collaborate and work in groups;
(f) communication skills;
(g) ability to solve real-world problems;
(h) understanding of the history of science and the relationship of science to society and other disciplines;
(i) understanding of, or at least appreciation for, “modern” physics;
(j) ability to participate in authentic research.”
REFERENCES [All URLs shortened by http://bit.ly/ and accessed on 16 March 2014.]
Hake, R.R. 1998a. “Interactive-engagement vs traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses,” Am. J. Phys. 66: 64-74; online as an 84 kB pdf at http://bit.ly/9484DG. See also the crucial but generally ignored companion paper Hake (1998b).
Hake, R.R. 1998b. “Interactive-engagement methods in introductory mechanics courses,” online as a 108 kB pdf at http://bit.ly/aH2JQN. A crucial companion paper to Hake (1998a). Submitted on 6/19/98 to the “Physics Education Research Supplement” (PERS) of the American Journal of Physics, but rejected by its editor on the grounds that the very transparent, well organized, and crystal clear Physical-Review-type data tables were “impenetrable”!
Hake, R.R. 2012a. “Concept Inventories Alone Don’t Gauge ‘Good Teaching’,” online on the OPEN! AERA-L archives at http://bit.ly/VeDTWI. Post of 28 Nov 2012 13:24:52-0800 to AERA-L and Net-Gold. The abstract and link to the complete post are being transmitted to several discussion lists and are also on my blog “Hake’sEdStuff” at http://bit.ly/V6PSzT with a provision for comments.
Halloun, I., R.R. Hake, E.P. Mosca, & D. Hestenes. 1995. “Force Concept Inventory (1995 Revision),” online (password protected) at http://bit.ly/b1488v, scroll down to “Evaluation Instruments.” Currently available in 21 languages: Arabic, Chinese, Croatian, Czech, English, Finnish, French, French (Canadian), German, Greek, Indonesian, Italian, Japanese, Malaysian, Persian, Portuguese, Russian, Spanish, Slovak, Swedish, & Turkish.
Hestenes, D., M. Wells, & G. Swackhamer. 1992. “Force Concept Inventory,” Phys. Teach. 30: 141-158; online (except for the test itself) at http://bit.ly/b1488v. For the 1995 revision see Halloun et al. (1995).
That’s certainly the way I learn best: tests that don’t make you nervous during the course but show you what you don’t know. Physics students may, however, be a self-selected group of motivated persons.
I suspect physics students may be about like engineering students. They were used to professors asking incredibly complex problems and then grading on a curve where a 60% was an “A.” I am totally revealing my dinosaur status since that is what used to happen back in the 60s and 70s at the college level. Being a liberal arts major, I could not understand how a grade of 60 could be considered an “A.” I would really like to hear from someone who actually went through the experience I am only telling about as a horrified onlooker.
Think normal curve.
I get that, Harlan. I had trouble wrapping my head around giving a test where 60% was at the top of the range. We all had those percentages so ingrained in our psyche that it would take major reeducation to accept what was previously considered to be a failing or close to failing grade as excellent work. I can see this reeducation as valuable to engineering students who were dealing with extremely complex problems (without the aid of computers). It was like a rite of passage: “This ain’t high school anymore, Dorothy.” Those who made the mental shift succeeded in becoming engineers.
“I had trouble wrapping my head around giving a test where 60% was at the top of the range.”
Try making sense of batting averages in baseball.
That’s a very good analogy: batting averages. Thank you. Engineering students had to accept a whole new paradigm of grading practices. If they had stuck with the old understanding, no one would have become an engineer. By changing the perception of proficiency, they could avoid the frustration of continually being told that their performance was inadequate.
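The regrading described above can be sketched in code. This is an illustrative toy model, not any particular professor’s actual scheme: it assumes a simple z-score curve (the letter cutoffs and scores are invented), and shows how a raw 60% can top the range when the whole class scores low.

```python
# Toy "grading on a curve": convert raw scores to z-scores against the class
# mean and award letters by distance from the mean, so a raw 60% earns an "A"
# on a hard exam. Cutoffs here are arbitrary illustration values.
from statistics import mean, stdev

def curve_grades(scores: list[float]) -> dict[float, str]:
    """Map each raw score to a letter grade relative to the class distribution."""
    mu, sigma = mean(scores), stdev(scores)

    def letter(z: float) -> str:
        if z >= 1.0:
            return "A"
        if z >= 0.0:
            return "B"
        if z >= -1.0:
            return "C"
        return "D"

    return {s: letter((s - mu) / sigma) for s in scores}

# A hard exam where the best raw score is 60%:
print(curve_grades([60, 45, 38, 30, 22]))
# {60: 'A', 45: 'B', 38: 'C', 30: 'C', 22: 'D'}
```

The percentages themselves carry no fixed meaning under this scheme; only a score’s position in the class distribution does, which is exactly the mental shift the thread describes.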