At my request, Pasi Sahlberg has written comments on the latest international test scores. Sahlberg is a prominent Finnish educator and author of the award-winning book “Finnish Lessons.”
Sahlberg writes:
International testing mania
This week educators around the world got a new opportunity to benchmark their students’ performance against their international competitors when the International Association for the Evaluation of Educational Achievement (IEA) released the results of TIMSS (Trends in International Mathematics and Science Study), covering mathematics and science in 63 countries, and PIRLS (Progress in International Reading Literacy Study), covering reading in 48 countries. The United States took part in both of these studies, which tested how well 4th grade children can read and what 4th and 8th grade students know about mathematics and science in school.
The media ran rather blunt headlines about U.S. performance in these international tests this week. “U.S. students continue to trail Asian students in math, reading, science,” wrote the Washington Post, and “Competitors Still Beat U.S. in Tests,” reported the Wall Street Journal. Only Number One, it seems, is good enough for the American media. Similar headlines were published in Canada, New Zealand and Norway – all countries with lower-than-expected results.
But a glance at participating countries’ national averages reveals some interesting aspects of American students’ performance in the 2011 TIMSS and PIRLS studies. 4th grade Americans scored high in science and reading and a bit lower in mathematics (7th, 6th and 11th, respectively). Only the East Asian countries (South Korea, Singapore, Hong Kong and Japan) and Finland were ahead. American 4th graders did better than most of their European peers in all tested areas.
Eighth grade American students also did well, placing 9th in mathematics and 10th in science. Here again, ahead of the U.S. came the East Asians, the Finns and, perhaps against the odds, the Russians. To some in America this may not seem good enough. But it is good to remember that, according to historical data, American education has never been good if the criterion is performance in international studies. IEA has tested students in mathematics and science since the 1960s, with the U.S. as one of the permanent participants. Over that half century, as Yong Zhao has concluded, American students’ performance in international mathematics and science tests has improved from the bottom to above the international average.
Another interesting revelation in TIMSS 2011 is the amazingly high performance of some U.S. states that took part in the study as ‘countries.’ For example, 4th grade pupils in Florida outperformed the Canadian provinces of Alberta, Ontario and Quebec in reading, science and mathematics, and were on par with Finland except in science. Furthermore, 8th grade students in Massachusetts, Minnesota and Colorado did better than high-performing Hong Kong in science. If Massachusetts, Minnesota, North Carolina, Indiana and Colorado were countries, they would all fit into the top ten in 8th grade mathematics. Not bad at all.
Less than a month ago the Pearson Corporation published another international study that compared 40 selected education systems, among them the U.S. and Finland. This study was based on an analysis of cognitive skills and educational attainment in these countries. In essence, “The Learning Curve,” as Pearson has named its study, is a composite index that combines different sources of data and information. Finland was the winner in this new index, with the U.S. placing 17th. There has been some controversy concerning this study because it was designed and conducted by a commercial firm that has a strong interest in pushing forward certain education reforms around the world.
To make the scene of international comparisons even more complex, there is yet another international study, one that has gained more momentum and popularity than any other: the Programme for International Student Assessment, better known as PISA, administered by the Organisation for Economic Co-operation and Development (OECD). PISA was first run in 2000, and it tests 15-year-olds’ competences in mathematics, literacy and science in the 34 OECD countries and a similar number of non-member countries. It is conducted in three-year cycles – the results of PISA 2012 will be launched in December 2013.
In my book Finnish Lessons (2011) I highlighted early trends in American and Finnish students’ performance in reading, mathematics and science literacy. The findings were rather interesting. The U.S. students’ performance trend from PISA 2000 to 2006 was one of decline, as in all other countries infected in the 1990s by GERM (the global educational reform movement of competition, choice, testing and privatization). At the same time Finland’s scores in all areas were improving. Overall, as many people in the U.S. now know, American students have been left behind by most other OECD countries according to the PISA tests.
All of the above invites two important questions. First, how is it possible that different international studies that compare education systems by focusing on students’ learning outcomes lead to such different results? Who is right? What do these studies really tell us? Second, are these studies in the end really able to inform policy-makers and guide education reforms in coherent ways, so that teachers and students have better opportunities to succeed? Do they help politicians understand the nature of human learning?
Well, TIMSS and PISA are technically different studies, although both build on similar measurement methodology. A simplified distinction is that where TIMSS tests students’ mastery of what has been taught in the curriculum, PISA assesses how students can use the knowledge and skills they were taught in new situations. Both are student assessment studies. Pearson’s “The Learning Curve” is a different kind of measure: it combines various indicators and is therefore a composite index. The problem with any study that relies on a composite index is that it is open to designer manipulation. The “Global Economic Competitiveness Index” and “The Best Country in the World” are good examples, just like “The Learning Curve.”
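To make the manipulation point concrete, here is a minimal sketch with invented numbers (not actual Learning Curve data): the same three indicator scores produce different national rankings depending solely on the weights the index designer chooses.

```python
# Hypothetical indicator scores (cognitive skills, attainment, spending),
# each rescaled to 0-100. All numbers are invented for illustration.
indicators = {
    "Country A": (78, 90, 60),
    "Country B": (85, 70, 80),
    "Country C": (70, 85, 95),
}

def composite(scores, weights):
    """Weighted average of the indicator scores; weights sum to 1."""
    return sum(s * w for s, w in zip(scores, weights))

for weights in [(0.6, 0.3, 0.1), (0.2, 0.3, 0.5)]:
    ranking = sorted(indicators,
                     key=lambda c: composite(indicators[c], weights),
                     reverse=True)
    print(weights, "->", ranking)
# (0.6, 0.3, 0.1) -> ['Country B', 'Country A', 'Country C']
# (0.2, 0.3, 0.5) -> ['Country C', 'Country B', 'Country A']
```

Nothing in the underlying data changes between the two runs; only the designer’s weights do, and the “winner” changes with them.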
One may also conclude that these international standardized tests are becoming global curriculum standards. Indeed, the OECD has observed that its PISA test already plays an important role in national policy making and education reforms in many countries. Schools, teachers and students are now prepared in advance to take these tests. Learning materials are adjusted to fit the style of these assessments. Life in many schools around the world is becoming split into the important academic study that these tests measure and the other, not-so-important study that these measurements don’t cover. A kind of GERM on a grand scale.
Pasi Sahlberg
Helsinki FINLAND
Wouldn’t it be wonderful if we measured and valued student scores on empathy, community service, citizenship and sense of fairness?
Reblogged this on Capitan Typo's Adventures in Education and commented:
A timely reminder of the impact standardised testing has on school curriculum. Now on an international level!
This is a sobering read. I struggle to make it through the last few days before we break for the two-week holiday (excitement is in the air). I am reminded daily by my 600 students that there is more to life than the 3 Rs. Their social/emotional needs are constant, and I try to keep a balance so they can be successful in school. Articles like this help remind me not to get too caught up in the latest testing trend and to focus on what I know is the big picture for these students. The hard part is supporting teachers to understand how to navigate through the latest B.S. The students are the easy part. 🙂
We need to focus on the success of students after high school. Too much talk about test scores.
“The U.S. students’ performance trend from PISA 2000 to 2006 was one of decline, as in all other countries infected in the 1990s by GERM (the global educational reform movement of competition, choice, testing and privatization).”
We need to be inoculated against this GERM, fast!
Finnish miracle: fata morgana?
Finnish students’ achievement (at age 15) has declined significantly, according to a University of Helsinki study.
University of Helsinki, Faculty of Behavioral Sciences, Department of Teacher Education, Research Report No. 347. Authors: Jarkko Hautamäki et al. Learning to learn at the end of basic education: Results in 2012 and changes from 2001.
S.: The change between 2001 and 2012 is significant. The level of students’ attainment has declined considerably, to below the mean of the scale used in the questions. The difference is comparable to Finnish students’ attainment in PISA reading literacy falling from the 539 points of PISA 2009 to 490 points, below the OECD average. The mean level of students’ learning-supporting attitudes still falls above the mean of the scale used in the questions, but that mean too has declined since 2001.
Since 1996, educational effectiveness has been understood in Finland to include not only subject-specific knowledge and skills but also the more general competences that are not the exclusive domain of any single subject but develop through good teaching along a student’s educational career. Many of these, including the object of the present assessment, learning to learn, have been named in the education policy documents of the European Union as key competences that each member state should provide its citizens as part of general education (EU 2006).
In spring 2012, the Helsinki University Centre for Educational Assessment implemented a nationally representative assessment of ninth grade students’ learning to learn competence. The assessment was prompted by signs of declining results in recent years’ assessments. This decline had been observed in the subject-specific assessments of the Finnish National Board of Education, in the OECD PISA 2009 study, and in the learning to learn assessment implemented by the Centre for Educational Assessment in all comprehensive schools in Vantaa in 2010.
The results of the Vantaa study could be compared against the results of a similar assessment implemented in 2004. As the decline in students’ cognitive competence and in their learning-related attitudes was especially strong in the two Vantaa studies, only six years apart, a decision was made to direct the national assessment of spring 2012 to the same schools that had participated in a corresponding study in 2001.
The goal of the assessment was to find out whether the decline in results observed in the Helsinki region held for the whole country. The assessment also offered a chance to look at the readiness of schools to implement a computer-based assessment, and at how this had changed during the 11 years between the two assessments. After all, the 2001 assessment was the first in Finland in which large-scale student assessment data was collected in schools using the Internet.
The main focus of the assessment was on students’ competence and their learning-related attitudes at the end of comprehensive school education, but the assessment also relates to educational equity: to regional, between-school, and between-class differences, and to the relation of students’ gender and home background to their competence and attitudes.
The assessment reached about 7,800 ninth grade students in 82 schools in 65 municipalities. Of the students, 49% were girls and 51% boys. The share of students in Swedish-speaking schools was 3.4%. As in 2001, the assessment was implemented in about half of the schools using a printed test booklet and in the other half via the Internet. The results of the 2001 and 2012 assessments were placed on a common scale through IRT modelling to secure their comparability. Hence, the results can be interpreted to represent the full Finnish ninth grade population.
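For readers unfamiliar with IRT linking, a minimal sketch follows (with invented numbers, not the study’s actual data). The basic idea: items administered in both years act as anchors, and the average drift in their estimated difficulties gives the constant that maps one year’s score scale onto the other’s (mean/mean linking under a Rasch model).

```python
import math

def rasch_p(theta, b):
    """Rasch model: probability that a student of ability theta
    answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Hypothetical difficulty estimates for anchor items that appeared in
# both administrations, each year scaled separately at first.
anchor_2001 = [-0.50, 0.10, 0.80, 1.20]
anchor_2012 = [-0.20, 0.45, 1.05, 1.50]

# Mean/mean linking: the average difficulty drift maps the 2012 scale
# onto the 2001 scale.
shift = sum(b01 - b12 for b01, b12 in zip(anchor_2001, anchor_2012)) / len(anchor_2001)

theta_2012 = 0.30                  # an ability estimate on the 2012 scale
theta_linked = theta_2012 + shift  # the same ability on the 2001 scale
print(f"shift = {shift:+.2f}, linked ability = {theta_linked:+.2f}")
print(f"P(correct) on anchor item 2: {rasch_p(theta_linked, anchor_2001[1]):.2f}")
```

Real studies estimate all parameters jointly from the response data; the sketch only shows why common items make scores from different years comparable.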
Girls performed better than boys in all three fields of competence measured in the assessment: reasoning, mathematical thinking, and reading comprehension. The difference was especially noticeable in reading comprehension, even though in this task girls’ attainment had declined more than boys’. Differences between the AVI districts were small. The impact of students’ home background was, by contrast, obvious: the higher the education of the parents, the better the student performed in the assessment tasks. There was no difference in the impact of the mother’s education on boys’ and girls’ attainment. The between-school differences were very small (explaining under 2% of the variance), while the between-class differences were relatively large (9–20%).
The change between 2001 and 2012 is significant. The level of students’ attainment has declined considerably. The difference is comparable to Finnish students’ attainment in PISA reading literacy falling from the 539 points of PISA 2009 to 490 points, below the OECD average. The mean level of students’ learning-supporting attitudes still falls above the mean of the scale used in the questions, but that mean too has declined since 2001.
The mean level of attitudes detrimental to learning has risen, but the rise is more modest. Girls’ attainment has declined more than boys’ in three of the five tasks. There was no gender difference in the change of students’ attitudes, however. Between-school differences were unchanged, but differences between classes and between individual students had grown. The change in attitudes, unlike the change in attainment, was related to students’ home background: the decline in learning-supporting attitudes and the growth in attitudes detrimental to schoolwork were weaker the better educated the mother. Home background was not related to the change in students’ attainment, however. A decline could be discerned among both the best and the weakest students.
The results of the assessment point to a deeper, ongoing cultural change which seems to affect the young generation especially hard. Formal education seems to be losing its former power, and acceptance of the societal expectations that the school represents seems to be related more strongly than before to students’ home background. The school has to compete with students’ self-chosen pastimes, social media, and the boundless world of information and entertainment open to all through the Internet. To a growing number of young people, the school is just one, often critically reviewed, developmental environment among many.
The change is not a surprise, however. A similar decline in student attainment has already been registered in the other Nordic countries. It is time to concede that the signals of change have been discernible for a while, and to open a national discussion on the state and future of the Finnish comprehensive school, which rose to international acclaim thanks to our students’ success in the PISA studies.
Reblogged this on Blogcollectief Onderzoek Onderwijs and commented:
Pasi Sahlberg wrote this post at the request of Diane Ravitch. His comments concern American education compared with other countries. It would be interesting to look through his eyes at the TIMSS and PIRLS scores of the Netherlands. Sahlberg poses two sober questions that apply directly to the Dutch situation:
“First, how is it possible that different international studies that compare education systems by focusing on students’ learning outcomes lead to such different results? Who is right? What do these studies really tell us? Second, are these studies in the end really able to inform policy-makers and guide education reforms in coherent ways, so that teachers and students have better opportunities to succeed? Do they help politicians understand the nature of human learning?”
Wonderful questions! One more very important question: How are students (schools) chosen to participate, or do they volunteer? Bad data beget bad data!
Great post.
YES, inoculate against GERM, ASAP.
Since they all seem to be drinking the same Kool-Aid, maybe the GERM vaccine could be put into the Kool-Aid to stop the spread of this horrible virus. Reminds me of Jim Jones and his utopian society.
Tests, assessments, evaluations – they are not bad in themselves. They are bad when they dictate what we do next, and when the test that was meant to assess how successfully we did what we thought was important becomes the source of the things we think are important. GERM is an example of this latter in action. But I still cannot find any significant explanation of the substantial differences between PISA and PIRLS. Countries that have done relatively well on PISA are appalling on PIRLS, and vice versa. Could it be that kids who like to think creatively and apply what they know (PISA) simply are not wired to do tests that just ask them to regurgitate the taught curriculum? Could it be that in the US (and especially in Florida) kids have got so good at drill-and-kill education that they can blitz a multiple-choice, fill-the-gap assessment but stumble when asked to think deeply or creatively?
The following blog (2006) contains a good deal of criticism of international surveys and is perhaps welcome here.
Willem Smit
Introduction
0. In the Netherlands an educational war has been raging since about 1996. Innovators and traditionalists fight each other in public debates of which the end is not in sight. The traditionalists are organized into an association named Beter Onderwijs Nederland, BON (Better Education for the Netherlands), which registered more than 4,000 members, mostly teachers from all sectors of the educational system, within one year of its existence. Among other activities BON runs its own site, http://www.beteronderwijsnederland.nl, and initiates and supports discussions with its opponents, mostly educational managers and politicians. Insofar as these discussions are sensitive to the outcomes of relevant educational research, BON is trying to construct a knowledge base for good education. One chapter of this knowledge base critically surveys the outcomes of PISA and TIMSS. Dutch results have so far been rather positive, though less so with each successive cycle; these outcomes made it possible to silence BON’s critique of the innovations rather effectively with the statement: “But internationally our results are good.”
The following paragraphs are translated from the Dutch text to be found at the BON-site: http://www.beteronderwijsnederland.nl/?q=node/1340
All official PISA (2000, 2003, 2006) and TIMSS (1995, 1999, 2003) reports of many sorts, amounting to many thousands of pages, are freely downloadable at the PISA and TIMSS sites mentioned in the References. Other texts cited can be consulted at the BON site.
As criticism of PISA and TIMSS has grown into an equally unsurveyable quantity, the reader is advised to start with publications that categorize and summarize it: Smithers (2004); Prais (2003, 2004) and the PISA response by Adams (2003); Jahnke (2006, especially Wuttke’s chapter); Haahr (2005); Hagemeister (2006); Topping (2003); Bender (2003).
1. The most important and most widely heard critique concerns the validity of the test items for the national curricula. Added to that is the more usual and substantial critique of badly constructed and badly translated items (Hagemeister, 1999; Baumert’s reply, 1999; Bender, 2003, par. 3.2). TIMSS, in contrast with PISA, uses test items more closely derived from the national curricula as printed in book form. These items try to answer the question: what did the students learn? It is quite normal for test items in international surveys to come under heavy fire; transnational validity is obviously hard to construct. In a country like the USA, where a math war between realistic (contextual) and traditional (conceptual) mathematics has been raging for many years, even national curricular validity of items is very difficult to achieve. Of course, even without math wars and controversial innovations, curricula within countries may differ substantially (Bracey, 2000; Clarke, 2001, par. 3). This being the case, test items will always connect better to the curricula of country A than of country B, causing variance in the scores that is not controlled for. Positions on the country lists of PISA and TIMSS (see paragraph 2 below) change inexplicably and substantially over time (Bender, in Jahnke, 2006, p. 193; Smithers, par. 71–74). Sometimes countries with widely different curricula end up ex aequo (e.g., Belgium and the Netherlands, TIMSS 2003; De Lange, fig. 5, p. 19), while countries with roughly equal curricula end up far apart (Flanders and the Walloon provinces of Belgium). Explanations are missing, so these country lists are no good. “One number doesn’t tell all.” Indeed, the first TIMSS report notices this (TIMSS II, Baumert, J. et al., 1997: TIMSS – Mathematisch-naturwissenschaftlicher Unterricht im internationalen Vergleich, p. 18 ff.), but in later reports this sensible reticence is lacking.
2. PISA will have nothing to do with transnational validity (Smithers, par. 20) and instead introduces the terms literacy (reading, RL; mathematics, ML; science, SL) and “skills for life.” For a comparison of PISA and TIMSS items see Smithers, par. 24–30; the differences turn out to be remarkably small. These new constructs made bad things worse. Even De Lange (2006), chair of the PISA Expert Group Mathematics, does not succeed in explaining the precise meaning of math literacy, either for mathematics or for daily life. The same judgment applies to the repeated and verbose attempts in the final and national reports. This almost certainly means one thing: literacies are competencies (Bender, 2003, par. 3.3–3.7; Bender, 2004) – undefinable, hard-to-teach and untestable learning goals that, when taken seriously by the teacher, quickly put all education into a state of weightlessness.
A PISA test, disconnected from the national curriculum, is not a direct measurement of the quality of that curriculum and cannot compare the educational systems of countries in this respect. Without curricular validity there can be no country lists and no feedback of results into the curriculum. Nor does instructional practice gain anything, because the test items try to measure intangible competencies.
The positions of the countries on the PISA and TIMSS lists have but little meaning (Haahr, p. 33 ff.).
This was also conceded by the authors of the first PISA report (PISA, 2000, Knowledge and Skills for Life, p. 26 and p. 212). Again, as with TIMSS, this standpoint disappeared from later reports without a trace or comment. Neither TIMSS nor PISA offers explanations for unexpected and shifting positions of nations (Smithers, par. 113). Both surveys lack the theoretical stamina needed for that (Smithers, par. 106).
3. What does PISA measure? Its items resemble those of the American Scholastic Aptitude Test and of the Dutch CITO test. Indeed, together with some other organizations CITO produces PISA items – an advantage for the Netherlands and a disadvantage for countries that have to deal with translations. Meyerhöfer even remarks that some PISA items come from Dutch schoolbooks (Jahnke, p. 135). PISA thus seems to measure a fuzzy mix of intelligence, knowledge and experience – in other words, common sense (Prais, 2003, p. 141–145). Smithers draws attention to the fact that the mean scores for the three literacies are about equal within each country, and he therefore concludes that PISA tests general reading skills (Smithers, par. 50; 92–95). If PISA scores are indeed influenced by intelligence, all outcomes must be reconsidered – unless, of course, one is of the opinion, shared by some educational innovators, that intelligence is a competence that is teachable and learnable in schools. In the many official PISA and TIMSS reports, not a single word is spent on this important question.
4. Naturally the educational systems of the participating countries differ in many more respects than the content and didactics of their curricula. The outcomes are influenced by all kinds of educational and cultural conditions in which countries differ (Le Tendre et al., 2001). To mention but a few: the availability of instructional material, the wealth and educational spending of nations, discipline in the classroom, the different importance attributed to subjects, the working conditions and salaries of teachers, the status of their profession, class size, the certification of teachers, the socio-economic status of parents, experience with multiple-choice questions (not in use in many countries, yet used in about 30% of the test items), and the presence and size of groups of migrant pupils with language deficits. All these and other unmentioned factors have their effects on teaching and learning, and therefore on the outcomes of these international surveys. When controlling for these factors is faulty or missing, interpretations and conclusions suffer correspondingly and country lists lose their meaning (Bracey, 2000; Wang, 2001).
5. What follows is a small selection of examples of that happening.
5.1 The sampling of respondents and schools caused unsolved problems in some countries (Smithers, 2004, par. 31–42; Prais, 2003, p. 145–152). Swedish TIMSS respondents, for example, are so much older than their German counterparts that this alone explains the positions of the two countries on the list (Bender, in Jahnke, 2006, p. 192). More on this in Wuttke, in Jahnke, 2006, par. 8, p. 114. Between countries there are large differences in the percentages of pupils not taking the test (e.g., test refusers and drop-outs). Not enough information on these groups was gathered to make sure that the replacements were representative (Prais, 2004, p. 571). Much more critical remarks on sampling problems appear in Collani (2000, par. 3 and 4), who states that, apart from the Netherlands, fifteen other countries should have been removed from the final PISA 2000 report had the PISA standards been applied (ibid., par. 5). Standards were violated more often, e.g. concerning the rules for excluding special groups of pupils and the rules for handling incomplete answer booklets (Wuttke, in Jahnke, 2006, par. 3–5 and 7).
5.2 Differences between country scores can sometimes be explained by causes other than those covered by PISA and TIMSS. De Lange (2006, p. 20, TIMSS table 6) reports that in Singapore, winner of the MathLit competition, pupils spend far more time on these subjects than pupils in other countries. As table 6 shows, the factor ‘student time spent’ neatly predicts the outcomes in math. This factor was, however, not controlled for, although the data were available.
5.3 Pupils with a language deficit caused by a migrant background naturally score lower than usual. For these lower scores see Smithers, par. 68–70, table 13. This table also gives the very dissimilar percentages of such respondents per country: Finland 1.2%, Japan 0.1%, Germany 15.2%, Switzerland 20.7%, Canada 20.6%. Moreover, countries apply very dissimilar entrance policies, e.g. selection on the prior schooling of applicants (Ireland and Canada, for instance) or rules concerning the right of family reunion. As a rule, immigrants from Asian and East European countries are better educated than immigrants from other countries, and immigrants in English-speaking countries as a rule have smaller language deficits than immigrants in countries like the Netherlands and Germany. This factor has not been handled with due care in the surveys (Hagemeister, 2006, par. 7; also in Jahnke, 2006; Bender, 2005, par. 1.5).
5.4 The importance of class size is underestimated by PISA because of the high PISA scores of countries like Japan and Korea in combination with the big classes in these countries. However, research supplies ample evidence that class size is an important predictor of learning effects and that weak pupils especially profit from a high teacher–pupil ratio. For more on this see Hagemeister, 2006, par. 2.
5.5 The opacity and possibly inadequate application of psychometric statistics (e.g., Item Response Theory) is treated and criticized by Prais (2003, Annex, p. 159), Goldstein (2004) and Von Collani (2001). This point deserves more attention than it gets, like the next one. In the national report of the USA (PISA – Outcomes of Learning, 2002, p. 11) the reading literacy country list shows a large middle group of twenty countries, including the Netherlands and Germany, which do not differ significantly from the USA mean score. The leading group counts three countries (Finland, Canada, New Zealand) and there is a tail group of four. Other national reports, e.g. Germany’s (PISA 2000, Zusammenfassung zentraler Befunde, p. 13), show a middle group of six countries and Germany in the tail group, simply because the benchmark changed from the USA mean score to the OECD mean score. Inconsistencies like this abound in the national reports.
It is clear throughout that the differences between countries are small: just 10% of the total variance in the scores lies between countries. It is quite possible that this partly explains the unexplained shifts in country positions mentioned earlier (par. 1).
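To see what a 10% between-country variance share implies for league tables, here is a small simulation sketch (stylized numbers, not actual PISA data): when 90% of the variance lies within countries, sampling error alone can reshuffle the rank order of countries whose true means sit close together.

```python
import random

random.seed(1)

# Stylized setup: total variance 100, of which 10% lies between countries
# (sd ~3.2) and 90% within countries (sd ~9.5). All values are invented.
true_means = {c: random.gauss(500, 10 ** 0.5) for c in "ABCDEFGH"}
within_sd = 90 ** 0.5
n_students = 200  # sampled students per country

def observed_ranking():
    """Rank countries by the mean score of a fresh student sample."""
    sample_means = {
        c: sum(random.gauss(mu, within_sd) for _ in range(n_students)) / n_students
        for c, mu in true_means.items()
    }
    return "".join(sorted(sample_means, key=sample_means.get, reverse=True))

for trial in range(5):
    print(observed_ranking())
# The trials typically print different orderings even though the countries'
# true means never change: adjacent countries often differ by little more
# than the sampling error of their observed means.
```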
5.6 PISA has a barely hidden agenda promoting comprehensive schools and so-called equity and equality. Finland and some other high-scoring countries have comprehensive schools. Finland also has high-quality, high-status teachers and an excellent remedial system, among other factors with a positive influence on learning results. More on this in Naylor (2004). Smithers (par. 61–67; 107–110) demonstrates some inadequacies of the PISA measures of equality. The political flavor of PISA must remain undiscussed here.
TIMSS and PISA scores may offer useful information to individual countries; Prais gives an example (2003, p. 144). However, they do not lend themselves to comparisons of the quality of national educational systems (Smithers, par. 111–114). Comparing countries as diverse as Finland, Singapore, Germany, the USA and Turkey along the single scale of a country list is utterly impossible with the few elementary instruments used by PISA and TIMSS.
Willem Smit
References
Official PISA-reports are downloadable from http://www.pisa.oecd.org
TIMSS-reports from the NCES-site: http://nces.ed.gov/timss/index.asp
Adams, R. J. (2003). Response to ‘Cautions on OECD’s recent educational survey (PISA)’. Oxford Review of Education, vol. 29, no. 3.
Baumert, J. et al. (1999). Konzeption und Aussagekraft der TIMSS-Leistungstests. Response to Hagemeister (1999).
Bender, P. (2003). Die etwas andere Sicht auf die internationalen Vergleichsuntersuchungen TIMSS, PISA und IGLU.
Bender, P. (2004). Die etwas andere Sicht auf den mathematischen Teil der internationalen Vergleichsuntersuchungen PISA sowie TIMSS und IGLU. DMV-Mitteilungen 12-2/2004.
Bender, P. (2005). Neue Anmerkungen zu alten und neuen PISA-Ergebnissen und Interpretationen.
More from Bender at http://math-www.uni-paderborn.de/~bender.
Bracey, G. W. (2000). The TIMSS “Final Year” study and report: A critique. Educational Researcher, vol. 29, May 2000, p. 4–10.
Clarke, D. (2001). Developments in international comparative research in mathematics education: Problematising cultural explanations.
Collani, E. von (2001). OECD PISA – An example of stochastic illiteracy? Economic Quality Control, vol. 16, no. 2, p. 227–253.
Downes, S. (2005). Understanding PISA. Turkish Online Journal of Distance Education, vol. 6, no. 2, art. 1.
Goldstein, H. (2004). International comparisons of student attainment: some issues arising from the PISA study.
Haahr, J. H. et al. (2005). Explaining student performance: Evidence from the international PISA, TIMSS and PIRLS surveys.
Hagemeister, V. (1999). Was wurde bei TIMSS erhoben? Eine Analyse der empirischen Basis von TIMSS.
Hagemeister, V. (2006). Kritische Anmerkungen zum Umgang mit den Ergebnissen von PISA.
Jahnke, T. & Meyerhöfer, W., Hrsg. (2006). Pisa & Co. Franzbecker Verlag, Berlin.
Lange, J. de (2006). Mathematical literacy for living from OECD-PISA perspective.
Le Tendre, G. K. et al. (2001). Teachers’ work: Institutional isomorphism and cultural variation in the U.S., Germany, and Japan. Educational Researcher, vol. 30, no. 6, p. 3–15.
Naylor, F. (2004). The Trojan horse within. Current Concerns, no. 1, 2004.
Prais, S. J. (2003). Cautions on OECD’s recent educational survey (PISA). Oxford Review of Education, vol. 29, no. 2.
Prais, S. J. (2004). Cautions on OECD’s recent educational survey (PISA): Rejoinder to OECD’s response. Oxford Review of Education, vol. 30, no. 4.
Smithers, A. (2004). England’s education: What can be learned by comparing countries? University of Liverpool.
Topping, K. et al. (2003). Policy and practice implications of PISA 2000. Report of the International Reading Association PISA Task Force.
Wang, J. (2001). TIMSS primary and middle school data: some technical concerns. Educational Researcher, vol. 30, no. 6, p. 17–21.
Wuttke, J. (2006). Fehler, Verzerrungen, Unsicherheiten in der PISA-Auswertung. In: Jahnke, T. & Meyerhöfer, W. (Hrsg.), PISA & Co, p. 101–154.
Willem Smit (2006)