The National Assessment Governing Board, the federal agency in charge of the NAEP assessments, is aware that the achievement levels (Basic, Proficient, Advanced) are being misused. They are considering tinkering with the definitions of the levels. NAGB has invited the public to express its views. Below is my letter. If you want to weigh in, please write to NAEPALSpolicy@ed.gov and Peggy.Carr@ed.gov. Responses must be received by September 30.
My letter:
Dear NAEP Achievement-Level-Setting Program,
As a former member of the National Assessment Governing Board, I am keenly interested in the improvement and credibility of the NAEP program.
I am writing to express my strong support for a complete rethinking of the NAEP “achievement levels.” I urge the National Assessment Governing Board to abandon the achievement levels, because they are technically unsound and utterly confusing to the public and the media. They serve no purpose other than to mislead the public about the condition of American education.
The achievement levels were adopted in 1992 for political reasons: to make the schools look bad, to convey simplistically to the media and the public that “our schools are failing.”
The public has never understood the levels. The media and prominent public figures regularly report that any proportion of students who score below “NAEP Proficient” is failing, which is absurd. The two Common Core-aligned tests (PARCC and SBAC) adopted “NAEP Proficient” as their passing marks, and the majority of students in every state that uses these tests have allegedly “failed,” because the passing mark is out of reach, as it will always be.
The National Center for Education Statistics (NCES) has stated clearly that “Proficient is not synonymous with grade level performance.” Nonetheless, public figures like Michelle Rhee (former chancellor of the DC public schools) and Campbell Brown (founder of the website “The 74”) have publicly claimed that the proficiency standard of NAEP is the bar that ALL students should attain. They have publicly stated that American public education is a failure because many students have not reached NAEP Proficient.
In reality, there is only one state in the nation, Massachusetts, where as many as 50% of students have attained NAEP Proficient. No state has reached 100% proficient, and no state ever will.
When I served on NAGB for seven years, the board understood very well that Proficient was a high bar, not a pass-fail mark. No member of the board or the staff expected that some day all students would attain “NAEP Proficient.” Yet critics and newspapers consistently treat NAEP Proficient as a standard that “all students” should one day reach. This misperception has been magnified by the No Child Left Behind Act, which declared in law that all students should be “proficient” by the year 2014.
Schools have been closed, and teachers and principals have been fired and lost their careers and their reputations because their students were not on track to reach an impossible goal.
As you well know, panels of technical experts over the years have warned that the achievement levels were not technically sound, and that in fact, they are “fatally flawed.” They continue to be “fatally flawed.” They cannot be fixed because they are in fact arbitrary and capricious. The standards and the process for setting them have been criticized by the General Accounting Office, the National Academy of Sciences, and expert psychometricians.
Whether using the Angoff Method or the Bookmark Method or any other method, there is no way to set achievement levels that are sound, valid, reliable, and reasonable. If the public knew that the standards are set by laypersons using their “best judgment,” they would understand that the standards are arbitrary. It is time to admit that the standard-setting method lacks any scientific validity.
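To make concrete what “best judgment” means in practice, consider a minimal sketch of a modified Angoff round. The panelist ratings below are hypothetical, and operational panels use far more items and multiple rounds, but the mechanism is the same: each panelist estimates the probability that a borderline “proficient” student would answer each item correctly, and the cut score is, at bottom, an average of those guesses.

```python
# Minimal sketch of a modified Angoff standard-setting round.
# All numbers are hypothetical; real panels use far more items,
# multiple rounds of discussion, and impact data between rounds.

# Each panelist estimates, for every item, the probability that a
# borderline "proficient" student would answer it correctly.
ratings = {
    "panelist_1": [0.60, 0.45, 0.80, 0.70, 0.55],
    "panelist_2": [0.70, 0.50, 0.75, 0.65, 0.60],
    "panelist_3": [0.55, 0.40, 0.85, 0.75, 0.50],
}

# A panelist's implied cut score is the sum of their item probabilities:
# the expected raw score of the hypothetical borderline student.
implied_cuts = {p: sum(probs) for p, probs in ratings.items()}

# The operational cut score is typically the mean across panelists.
cut_score = sum(implied_cuts.values()) / len(implied_cuts)

print(implied_cuts)         # per-panelist expected raw scores (~3.1, 3.2, 3.05)
print(round(cut_score, 2))  # ~3.12: a raw cut score built entirely from judgments
```

Every quantity in that chain traces back to a human probability estimate; nothing in the procedure anchors the cut score to any external criterion.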
When they were instituted in 1992, their alleged purpose was to make NAEP results comprehensible to the general public. They have had the opposite effect. They have utterly confused the public and presented a false picture of the condition and progress of American education.
As you know, when Congress approved the achievement levels in 1992, they were considered experimental. They have never been approved by Congress, because of the many critiques of their validity by respected authorities.
My strong recommendation is that the board acknowledge the fatally flawed nature of achievement levels. They should be abolished as a failed experiment.
NAGB should use scale scores as the only valid means of conveying accurate information about the results of NAEP assessments.
Thank you for your consideration,
Diane Ravitch
NAGB, 1997-2004
Ph.D.
New York University
ALSO:
The National Superintendents Roundtable wrote a letter.
I urge you to read this here.
The letter documents the many scholarly studies criticizing the NAEP achievement levels.
Here is an excerpt:
“NAGB hired a team of evaluators in 1990 to study the process involved in developing the three levels. A year later the evaluators were fired after their draft report concluded that the process “must be viewed as insufficiently tested and validated, politically dominated, and of questionable credibility.”
“In 1993, the U.S. General Accounting Office labeled the standard-setting process as “procedurally flawed” producing results of “doubtful accuracy.”
“In 1999, the National Academy of Sciences reported the achievement-level setting procedures were flawed: “difficult and confusing . . . internally inconsistent . . . validity evidence for the cut scores is lacking . . . and the process has produced unreasonable results.”
“Shortly after No Child Left Behind was signed into law in 2001, Robert Linn, past president of the American Educational Research Association and of the National Council on Measurement in Education, and former editor of the Journal of Educational Measurement, said the “target of 100% proficient or above according to the NAEP standards is more like wishful thinking than a realistic possibility.”
“In 2007, researchers concluded that fully a third of high school seniors who completed calculus, the best students with the best teachers in the country, could not clear the proficiency bar. Moreover, they added, fully 50 percent of those who scored “basic” in twelfth grade math had achieved a bachelor’s degree (a proportion comparing favorably with four-year degree rates at public universities).
“The Buros Institute, named after Oscar Buros, the father of the Mental Measurements Yearbook, criticized the lack of a validity framework for NAEP assessment scores in 2009 and recommended continuing “to explore achievement level methodologies.”
“Fully 30 percent of 12th-graders who completed calculus were deemed to be less than proficient, said a Brookings Institution scholar in 2016, a figure that jumped to 69 percent for pre-calculus students and 92 percent for students who completed trigonometry and Algebra I. These data “defy reason” and “refute common sense,” he concluded.
“Finally, the NAS study to which the proposed rule responds took note in 2016 of the “controversy and disagreement” around the achievement levels, observing that Congress has insisted since 1994 that the achievement levels are to be used on a trial basis until an objective evaluation determines them to be “reasonable, reliable, valid, and informative to the public.”
“In the Roundtable’s judgment, such an objective evaluation has yet to be completed and a determination that the achievement levels are “reasonable, reliable, valid, and informative to the public” has yet to be seen.
“Linking studies conclude most students in most nations cannot clear “proficiency” bar
“The Roundtable points also to research studies dating from 2007 to 2018 indicating NAEP’s proficiency bar is beyond the reach of most students in most nations. When Gary Phillips of the American Institutes for Research (and former Acting Commissioner of NCES) asked how students in other nations would perform if their international assessment results were expressed in terms of NAEP achievement levels, the results were sobering: just three nations (Singapore, the Republic of Korea, and Japan) would have a majority of their students clear the NAEP bar in 8th-grade mathematics, while Singapore alone could meet that standard (more than 50% of students clearing the bar) in science.
“Subsequently Hambleton, Sireci, and Smith (2007) and also Lim and Sireci (2017) reached conclusions similar to those of Phillips.”
The fact is that “NAEP proficiency” is an impossible goal for most students. To recognize that does not lower standards. It acknowledges common sense.
Not every runner will ever run a four-minute mile. Some will. Most won’t.
“My strong recommendation is that the board acknowledge the fatally flawed nature of achievement levels AND STANDARDIZED TESTING. They should be abolished as a failed experiment.”
A minor/major correction, eh!
The last three lines of the excerpt sum up the NAEP Proficiency standard succinctly.
A magnificent letter! Wow. Well done!
Second that, Bob. They have to listen to you, Diane. You are entirely correct and knowledgeable (and this issue is key). I suspect that if they don’t listen to you, someone named Bill paid them not to.
Anthem for the New Feudal Order
All hail to our thought leader, Bill,
And his cure for society’s ills:
Just stack rank the proles
By hiring some trolls
To inure them to drills on skills
And sap their impertinent wills.
The Bill & Melinda Gates Foundation:
All your base are belong to us.
“The standards and the process for setting them have been criticized by the General Accounting Office, the National Academy of Sciences, and expert psychometricians.”
First, isn’t “expert psychometricians” an oxymoron?
It’s good that the Johnny-come-latelies have joined in. Where were they when all this nonsense was getting started at the beginning of the century? Better late than never though, eh!
This is nothing new, folks. Noel Wilson showed us all back in 1997 the onto-epistemological errors and falsehoods and psychometric fudgings involved in the standards and testing malpractice regime, which render any conclusions drawn from those malpractices COMPLETELY INVALID. The credit should go to Wilson, Hoffman and others who pointed out those problems long ago. I have posted the following summary of Wilson’s never refuted nor rebutted work many times; however, the summary, clarified by Noel for me, does not even begin to touch everything that he has shown us in his classic “Educational Standards and the Problem of Error” found at: http://epaa.asu.edu/ojs/article/view/577/700
Brief outline of Wilson’s “Educational Standards and the Problem of Error” and some comments of mine. (updated 6/24/13 per Wilson email)
A description of a quality can only be partially quantified. Quantity is almost always a very small aspect of quality. It is illogical to judge/assess a whole category only by a part of the whole. The assessment is, by definition, lacking in the sense that “assessments are always of multidimensional qualities. To quantify them as unidimensional quantities (numbers or grades) is to perpetuate a fundamental logical error” (per Wilson). The teaching and learning process falls in the logical realm of aesthetics/qualities of human interactions. In attempting to quantify educational standards and standardized testing the descriptive information about said interactions is inadequate, insufficient and inferior to the point of invalidity and unacceptability.
A major epistemological mistake is that we attach, with great importance, the “score” of the student, not only onto the student but also, by extension, the teacher, school and district. Any description of a testing event is only a description of an interaction, that of the student and the testing device at a given time and place. The only correct logical thing that we can attempt to do is to describe that interaction (how accurately or not is a whole other story). That description cannot, by logical thought, be “assigned/attached” to the student as it cannot be a description of the student but the interaction. And this error is probably one of the most egregious “errors” that occur with standardized testing (and even the “grading” of students by a teacher).
Wilson identifies four “frames of reference,” each with distinct assumptions (epistemological basis) about the assessment process from which the “assessor” views the interactions of the teaching and learning process: the Judge Frame (think of a college professor who “knows” the students’ capabilities and grades them accordingly), the General Frame (think of standardized testing that claims to have a “scientific” basis), the Specific Frame (think of learning by objectives, as in computer-based learning, where the student gets a correct answer before moving on to the next screen), and the Responsive Frame (think of an apprenticeship in a trade or a medical residency program, where the learner interacts with the “teacher” with constant feedback). Each frame has its own sources of error, and more error enters the process when the assessor confuses and conflates the frames.
Wilson elucidates the notion of “error”: “Error is predicated on a notion of perfection; to allocate error is to imply what is without error; to know error it is necessary to determine what is true. And what is true is determined by what we define as true, theoretically by the assumptions of our epistemology, practically by the events and non-events, the discourses and silences, the world of surfaces and their interactions and interpretations; in short, the practices that permeate the field. . . Error is the uncertainty dimension of the statement; error is the band within which chaos reigns, in which anything can happen. Error comprises all of those eventful circumstances which make the assessment statement less than perfectly precise, the measure less than perfectly accurate, the rank order less than perfectly stable, the standard and its measurement less than absolute, and the communication of its truth less than impeccable.”
In other words all the logical errors involved in the process render any conclusions invalid.
The test makers/psychometricians, through all sorts of mathematical machinations, attempt to “prove” that these tests (based on standards) are valid, that is, errorless or supposedly at least with minimal error [they aren’t]. Wilson turns the concept of validity on its head and focuses on just how invalid the machinations, the tests, and the results are. He is an advocate for the test taker, not the test maker. In doing so he identifies thirteen sources of “error,” any one of which renders the test making/giving/disseminating of results invalid. And a basic logical premise is that once something is shown to be invalid it is just that, invalid, and no amount of “fudging” by the psychometricians/test makers can alleviate that invalidity.
Having shown the invalidity, and therefore the unreliability, of the whole process, Wilson concludes, rightly so, that any result/information gleaned from the process is “vain and illusory”. In other words, start with an invalidity, end with an invalidity (except by sheer chance every once in a while, like a blind and anosmic squirrel who finds the occasional acorn, a result may be “true”) or, to put it in more mundane terms, crap in, crap out.
And so what does this all mean? I’ll let Wilson have the second to last word: “So what does a test measure in our world? It measures what the person with the power to pay for the test says it measures. And the person who sets the test will name the test what the person who pays for the test wants the test to be named.”
In other words it attempts to measure “’something’ and we can specify some of the ‘errors’ in that ‘something’ but still don’t know [precisely] what the ‘something’ is.” The whole process harms many students, as the social rewards for some are not available to others who “don’t make the grade (sic).” Should American public education have the function of sorting and separating students so that some may receive greater benefits than others, especially considering that the sorting and separating devices, educational standards and standardized testing, are so flawed not only in concept but in execution?
My answer is NO!!!!!
One final note with Wilson channeling Foucault and his concept of subjectivization:
“So the mark [grade/test score] becomes part of the story about yourself and with sufficient repetitions becomes true: true because those who know, those in authority, say it is true; true because the society in which you live legitimates this authority; true because your cultural habitus makes it difficult for you to perceive, conceive and integrate those aspects of your experience that contradict the story; true because in acting out your story, which now includes the mark and its meaning, the social truth that created it is confirmed; true because if your mark is high you are consistently rewarded, so that your voice becomes a voice of authority in the power-knowledge discourses that reproduce the structure that helped to produce you; true because if your mark is low your voice becomes muted and confirms your lower position in the social hierarchy; true finally because that success or failure confirms that mark that implicitly predicted the now self-evident consequences. And so the circle is complete.”
In other words students “internalize” what those “marks” (grades/test scores) mean, and since the vast majority of the students have not developed the mental skills to counteract what the “authorities” say, they accept as “natural and normal” that “story/description” of them. Although paradoxical in a sense, the “I’m an “A” student” is almost as harmful as “I’m an ‘F’ student” in hindering students becoming independent, critical and free thinkers. And having independent, critical and free thinkers is a threat to the current socio-economic structure of society.
Please Diane, take the next logical and ethical step and come out against the standardized testing malpractice, not just changing wording, or removing the “high stakes”.
An invalid process is just that and to continue to use such fatally flawed processes that harm all students, well. . . .
“In a rational world the thirteen sources of invalidity, developed in many cases by reframing and repositioning the accepted scholarship in the field of assessment, should be sufficient to halt the conceptual blindness, the blatant suppression of error, the subtle fudges, and the myth of certainty that permeates the “science” and expertise of categorising people. Full acceptance and individual specification of even one of these sources could revolutionise current practice. However, as the study indicates, the world in which assessment resides is far from that rational world to which much of the writing in this thesis appeals.”
Thus, to PREACH dissent (opt out) but PRACTICE obedience (give tests) continues the conceptual blindness and harm to students.
Damn Betsy, Charters, TFA, Repubs, and Rooskies…
Amazing the amount of wisdom one finds in Wilson’s dissertation, eh, NB!
“the world in which assessment resides is far from that rational world to which much of the writing in this thesis appeals.” Ain’t dat da truth!
And in the preach/practice aspect, it is the practice (really all that matters, as preaching usually goes in one ear and out the other, as one can verify from personal experience, eh) that does the most harm. GAGA Good German attitudes thrive in the public education realm.
“First, isn’t “expert psychometricians” an oxymoron?”
Maybe they should be called “psychomeretricians.”
Now that’s one of the best neologisms I’ve come across in years! Love it!
Psychomagicians; all it takes is one wave of the cut-score pen to work their magic.
And as far as the supe adminimals’ recommendations?
Same ol same ol, eh!
Changing the wording does nothing to address the fundamental invalidities involved in NAEP. But I don’t expect anything more from them. Their vested interests (in terms of generous salaries and benefits that far surpass anything teachers get) will always come first, even with their rhetoric of “serving the children” (yeah, serving the students up for carving by data mongers and pretend “reformers”).
I’ll leave it at that with this thought:
“Should we therefore forgo our self-interest? Of course not. But it [self-interest] must be subordinate to justice, not the other way around. . . . To take advantage of a child’s naivete. . . in order to extract from them something [test scores, personal information] that is contrary to their interests, or intentions, without their knowledge [or consent of parents] or through coercion [state mandated testing], is always and everywhere unjust even if in some places and under certain circumstances it is not illegal. . . . Justice is superior to and more valuable than well-being or efficiency; it cannot be sacrificed to them, not even for the happiness of the greatest number [quoting Rawls]. To what could justice legitimately be sacrificed, since without justice there would be no legitimacy or illegitimacy? And in the name of what, since without justice even humanity, happiness and love could have no absolute value?. . . Without justice, values would be nothing more than (self) interests or motives; they would cease to be values or would become values without worth.”—Comte-Sponville [my additions]
Your letter is a “mic drop” to the NAEP board. You should be “officially” known as the “notorious” DSR. Proficiency levels are totally an artificial construct determined by politics, not fact. Even I recall from the reading assessment courses I took many years ago that scale scores were the most statistically reliable results in standardized testing, even if I don’t really know why.
Because something is psychometrically “statistically reliable” doesn’t mean that it is onto-epistemologically valid. NAEP scores suffer all the invalidities pointed out by Wilson who has shown us why. . . his work has never been refuted or rebutted. See above post for link.
The best that can be said about the reliability of NAEP is that it is reliably invalid.
Duane, in the best of all possible worlds there would be no big standardized test. I hope one day we will get there. Until then, NPE just posted this interesting article from Forbes about the false assumption that high test scores equate with a better life. https://www.forbes.com/sites/petergreene/2018/09/20/is-the-big-standardized-test-a-big-standardized-flop/#54ee316d4937
Yes, have read that one before. As usual, what Peter has written is spot on. Thanks for highlighting it so others can read it.
But I can’t agree that “in the best of all possible worlds”. . . mainly because I don’t deal in “possible worlds” but the real one we are in now and in that very real world, very real harms and violations of being are perpetuated on students every single day due to the, as Peter calls it, BS Test corrupting influences on the teaching and learning process.
To put off action while waiting for that “best possible world” is just another sad excuse not to do anything about those very real harms. The students can’t wait; they only get one go-around, and we are allowing abuse after abuse to pile up on their being. And that is, indeed, very sad in any possible world.
What is a mic drop?
Scale scores show whether scores went up or down.
They do not contain a value judgment.
The usual way of stating the difference:
Scale scores show what students know and can do
Achievement levels describe (based on human judgment) what students SHOULD know and be able to do.
The latter is subjective and arbitrary.
A “mic drop” is a pop-culture term for a performance so good it leaves the audience gobsmacked. It’s a compliment, another way to say you did an outstanding job summarizing the problems with the NAEP and the way in which information is misreported to the public.
All scores contain a value judgment.
The purpose of scale scores is to attempt to equate the scores from one test to the scores of another similar test. All the inherent value judgments that went into making each test are still there; the scale scores paper over differences that may or may not be present when one student takes one test and then takes another, and/or when different students take different versions of the supposedly same test and the scores are scaled.
Scale scores are an attempt to make separate tests equal when in reality, by definition they are not. And that isn’t a value judgment???
It sure is a value judgment.
Scale scores are just one of the many psychometric fudges that serve to hide the various invalidities that Wilson discusses in his seminal dissertation.
No, scale scores do not “show what the students know and can do”.
“There are two types of test scores: raw scores and scaled scores. A raw score is a score without any sort of adjustment or transformation, such as the simple number of questions answered correctly. A scaled score is the results of some transformation applied to the raw score.
The purpose of scaled scores is to report scores for all examinees on a consistent scale. Suppose that a test has two forms, and one is more difficult than the other. It has been determined by equating that a score of 65% on form 1 is equivalent to a score of 68% on form 2. Scores on both forms can be converted to a scale so that these two equivalent scores have the same reported scores. For example, they could both be a score of 350 on a scale of 100 to 500.” (from wiki on test scores)
That “some transformation” is chock full of “value judgments”.
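To make that concrete, here is a minimal sketch of the kind of linear transformation the quoted passage describes. The anchor raw scores below are invented purely for illustration, chosen so that 65% on form 1 and 68% on form 2 both report as 350 on a 100-500 scale, matching the quote’s example. Deciding which raw scores count as “equivalent” across forms is exactly where the equating judgments, and the models behind them, enter.

```python
# Minimal sketch of reporting two test forms on one 100-500 scale.
# The anchor raw scores are hypothetical; in practice they come out of
# an equating study, which is where the value judgments are buried.

def make_linear_scaler(raw_lo, raw_hi, scaled_lo=100.0, scaled_hi=500.0):
    """Map a raw percent-correct score linearly onto the reporting scale."""
    slope = (scaled_hi - scaled_lo) / (raw_hi - raw_lo)
    return lambda raw: scaled_lo + slope * (raw - raw_lo)

# Form 2 is assumed easier than form 1, so its anchor raw scores sit
# higher: the same reported score requires more raw points on form 2.
form1 = make_linear_scaler(raw_lo=15.0, raw_hi=95.0)
form2 = make_linear_scaler(raw_lo=18.0, raw_hi=98.0)

print(form1(65.0))  # 350.0
print(form2(68.0))  # 350.0 -- different raw scores, same reported score
```

The arithmetic itself is trivial; the contestable part is the equating claim that 65% on one form and 68% on the other represent the same performance.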
Agree. Mic drop.
Thanks for the clarification, Duane.
If we need to have this type of testing, at least it should be one part of a general investigation of education that includes interviews and discussions with students, teachers, parents, and general citizens.
The least we should do is to remove score designations that oversimplify what the scores mean. I think I will write a letter.
Roy,
Do you really want a federal program to interview parents, students, and teachers? Why? How many?
That’s a gazillion-font-size “IF”, Roy.
I am against large font size.
Diane: it seems to me that anything short of research that goes beyond a test misses many of the intangibles of an education. Whether that means agents of the federal government in the form of interviews by educational researchers doing leg work, or some other way to gauge how people feel about the quality of education, it is necessary to do something to go beyond a test. As an agent of the federal government working for the agriculture department in the 1930s, my Uncle William counted corn ear worms in West Virginia.
Pop IQ tests have been universally rejected by real researchers into human intelligence. Some of them accept assessments that are interview based. They argue about the numbers. In education, we are given numbers based on very limited tests.
I argue in favor of testing that can pinpoint exactly what a child does or does not know or know how to do. If a math test, for example, requires the sequencing of a variety of steps, we cannot say which part of the process went awry to produce a wrong answer. Such a test is of little use to teachers, because it does not give us useful information.
“I argue in favor of testing that can pinpoint exactly what a child does or does not know or know how to do.”
I know of no test that was ever able to do what you desire, not even my own teacher-made classroom tests.
What someone knows is fluid, not static, subject to change from moment to moment with the act of testing itself influencing that knowledge. To attempt to “pinpoint exactly what a child does or does not know or know how to do” is a fool’s errand, an impossibility or as Wilson might put it “vain and illusory”. It cannot be done.
“Such a test is of little use to teachers, because it does not give us useful information.”
Which points to what I consider a fundamental problem in why we assess students. For me, the number one consideration in any student assessment is “How can this assessment help the student to better understand her/his own learning?” Any information gleaned about the student’s learning by the teacher should rightly be seen as a secondary consideration. Unfortunately, the current zeitgeist of assessment has those two functions backward, giving short shrift to the former if it gives it any consideration at all.
By placing the teacher into a diagnostician’s role instead of one of actual teaching, the current testing obsession serves to “take the fun” out of learning by introducing an unnecessary relational aspect of the teacher as a healer, one who cures students’ mental deficiencies, and not as a scholarly guide, mentor and/or instructor in the pedagogical arts, in other words a “true teacher.”
Many here have pointed out that testing correlates more highly with income level than any other factor. Exactly what this means is indefinite. At the very least, it means that those who report higher income levels deliver more acceptable responses on the tests.
At its most insidious, this comes from years of one group of people being consistently rewarded by the system of doing things, adding to the growing gap between the rich and the poor.
It would, therefore, seem that making any score on any test known to the test taker or to others associated with his education is very detrimental.
Tell us about justice, Duane.
Not enough space, HU. As you know, the concept of justice has been discussed, dissected and argued for at least the last two and a half millennia in Western thought.
However (and thanks for the prompt), may I suggest that you read Ch. 3, “Justice Concerns and Educational Malpractices,” in my book “Infidelity to Truth: Education Malpractice in American Public Education”. From that chapter:
“Combining our justice concerns with the fundamental purposes of education as described above we can establish a guiding principle with which to judge educational practices and outcomes: An educational policy and/or practice is just when it promotes the welfare of the individual so that each person may savor the right to life, liberty, the pursuit of happiness, and the fruits of their own industry.”
Well. . . you asked, HU! 🙂
Way back before there was a governing board for NAEP, I worked on the first and second visual arts assessments. NAEP was then under the governance of the Education Commission of the States. Using their resources and networks, the lead staff at ECS recruited an “advisory group” of visual art and music educators who had written, taught about, or developed assessments in those subjects. Music education had a deeper history of testing than the visual arts. The two groups worked independently except for some technical coordination from ECS.
The initial group of designers of items for the visual arts wisely included scholars who had interest and expertise in the arts outside of the Western tradition as well as the commercial arts. Even so, in the early 1970s there were slim pickings among educational researchers in the visual arts who had any familiarity with testing and were also attuned to what Philip Jackson would call “Life in Classrooms.”
The most useful information from the first and second assessments came from some of the background questions. These included the extent to which 9-, 13-, and 17-year-olds had art classes in school, encouragement to make art or visit art venues (museums, galleries), and the like. All of the technical documents for the first assessment were wrapped into the second assessment. The second assessment allowed for some comparisons.
There were no definitions of “proficiency.” Metrics corresponded to the types of items (constructed response; choose the best option; simple art and design tasks). The percentages of students who offered an “acceptable response” were reported by item type along with the criteria for ratings.
The “exercises” were arranged in booklets, and no student encountered the complete range of exercises. As is customary, some items from the first assessment were used again in the second, enabling a comparison of responses at two points in time.
The visual arts and music assessments are not now and have never been a priority for NAEP. Since those early days, cost has been used to justify cuts in the sample (now only grade 8) and assessments are only scheduled every ten years or so. Efforts to assess dance and theater were not pursued after one fateful try to include “all the arts” that mattered to the National Endowment for the Arts. The testing was too costly and there was difficulty in meeting the criteria for sampling.
Below is a link to the text of that long-ago report, to which I was a contributor. This black-and-white version is a bit muddy but it does include many of the questions and some of the exercises that called for drawings by students.
Art and Young Americans, 1974-79: Results from the Second National Art Assessment. https://files.eric.ed.gov/fulltext/ED212538.pdf
If you are interested in more about the history of NAEP testing in the visual arts, I recommend this 2011 peer-reviewed article: NAEP and Policy: Chasing the Tail of the Assessment Tiger. http://tombrewergallery.com/docs/Diket&Brewer10.pdf Secondary analyses of data from NAEP in the visual arts are replete with examples of the influence of parental wealth on opportunities to study and be engaged in art. Here is another article (critical of standards and testing) if you have access to a library: Eisner, Elliot W. 1999. The national assessment in the visual arts. Arts Education Policy Review 100 (6): 16–20.
I sent Diane’s letter to some of my colleagues who have worked on NAEP. I hope some will respond to the invitation to comment.
Classroom teachers have the “real-time” information. Those tests are ridiculous.
YEP!
Our 100,000 public schools
have NEVER
and
will NEVER
have students with . . .
standardized intellects
standardized psyches
standardized talents
standardized abilities
standardized anatomies
standardized health histories
standardized experiences
standardized parents
standardized families
standardized home lives
standardized neighborhoods
standardized friends
standardized influences
standardized opportunities
standardized teachers
standardized curricula
standardized pedagogies
standardized motivations
standardized support systems
standardized goals and aspirations
Yet somehow,
the HOLY GRAIL of
the 21st century
education reform movement
was
standardization.
Any doubt as to why it has
FAILED?
How come more than 50% of Massachusetts students are “proficient” according to NAEP’s standards? Shouldn’t there be an interest in this phenomenon? Sandra Stotsky