Testing companies typically invest in an in-depth review of test questions to ensure that the questions are not biased against any group or subgroup. The testers are very concerned about bias, even hidden bias.
A new study asserts that the “new SAT” is gender biased.
Here is Mercedes Schneider’s take on the SAT’s disadvantaging of girls.
The “new SAT” is David Coleman’s latest project, following the Common Core fiasco.
I wrote a book about the process by which test publishers use “bias and sensitivity reviews” to identify and screen out questions and test items that disadvantage groups. I was critical of the way that the reviewers went overboard, eliminating any word, phrase, or image that anyone might consider insensitive. Feminists wanted to delete any word that identified anyone by gender; religious fundamentalists wanted to remove any reference to witches, pumpkins, Halloween, or disobedient children. The book is called “The Language Police,” and it looked at censorship from left and right.
So, knowing that the SAT is subject to analysis for every sort of bias and disadvantage, what gives?
Any advice for David C?

To answer the first question:
Comparing two onto-epistemologically invalid tests, the old SAT and the new SAT (as proven by Noel Wilson in his 1997 dissertation “Educational Standards and the Problem of Error”), can result only in invalid comparisons. Mental masturbation, all of it. Eliminate the SAT and ACT for the error-filled and falsehood-filled malpractices that give false results, and we won’t have to worry about the supposed effect of those invalidities on anyone.
Second question:
Don’t give a shit about David C, the blowhard who should never have gotten into the supposed ed reform movement to begin with.
Mr. Coleman knew who was writing the checks. LOL
Duane,
Quit relating ANY of this stuff to “masturbation!” It’s not THAT pleasant!! 🙂 🙂 🙂
Duane,
I agree with Mamie.
Please monitor your language on this blog. You have your own blog where you can let the curse words fly.
The cuss word used was meant to be a jab at David C’s usage of said word (which has been discussed many times prior on this blog). Nothing more.
We should let Mamie have the last word, of course, but I don’t read her comment as taking offense with Duane’s word use.
On the contrary, I read it as playing along with it:
In other words, “what people at College Board do with their tests might be pleasurable, but NONE of it is THAT pleasant”
And Duane,
Next time use the proper spelling that starts with “math”
Yes, that is what I meant, Poet. I was joking. However, I do think Duane goes a little overboard with the masturbation references. It takes much more to offend me though. 🙂 🙂
Coleman should just give up on the SAT. It has always been biased against females. Colleges should use a student’s GPA instead. It is a better predictor of college success. Females have traditionally performed lower than males on the SAT. Yet, females tend to outperform males in college.
It’s always been the case that high-school grades were better predictors, past the first year in college, of success in college. The SAT is an obscenity.
In my years of working with very low-income and mostly non-dominant-culture kids, I learned that, along with grades, a reliable predictor of a student’s subsequent success in college was her/his attendance.
Very easy way to produce biased results while still minding the language police: ask questions about fire trucks, race cars, and video games. Since Coleman doesn’t give a _____ about what people feel, I am sure he has everyone reading and answering questions about computer code. Maybe coding robot space warriors that fly monster trucks… well, you get the idea. My advice to Coleman: it’s okay to have feelings. It doesn’t make you less macho.
That should have been past the first semester.
Let me repeat that. Studies have long shown that high-school grades are far better predictors, past the first semester, of success in college than is the SAT. That university administrators continue to use the SAT is laughable, incredibly stupid. Many have wised up and tossed the thing. But others are callow paper pushers who are happy to have any sorting mechanism.
I read the SAT bias report carefully. The new SAT math section reduced the gender gap for the highest scoring students. Eliminating the essay increased it. Is this bias or a balancing act?
Years ago I was chair of the College Board SAT Advisory Committee. We grappled with the differences between the highest scoring males and females then. We grappled with data that showed higher course grades for girls but a lower percentage of girls achieved the highest SAT scores. We wondered whether girls were more ‘responsible’ about their course work and boys were more creative. We wondered whether gender differences were a cultural phenomenon.
There are no real answers to these questions. It’s irksome to think that boys outscore girls at the very top of the score scale. It can have some impact on admission to very competitive schools or programs. If this is bias, it can be reduced by adding the essay back into the score scale.
I have a granddaughter who finds it cool to go with a friend to her math teacher’s room at lunch. They do ‘weird’ math problems together. They love it. Maybe my grandson will play with expressing complex thoughts in essays rather than simply ‘writing papers’?? Or maybe these gender differences are on the whole real. At least these questions relate to how children learn which is what we should all be talking about.
My questions to you sx04:
Will you please address the myriad onto-epistemological errors, falsehoods, and psychometric fudging that Noel Wilson identified in his 1997 dissertation,* which proves that standardized testing is completely invalid? Have you read Wilson’s work? Did the advisory committee ever consider Wilson’s complete destruction of standardized testing? If so, what were the committee members’ responses? As it is, do you know of any rebuttal/refutation of Wilson’s work?
*“Educational Standards and the Problem of Error” found at: http://epaa.asu.edu/ojs/article/view/577/700
This is what I notice at my high school: the National Honor Society is 84% female. The top 10 in each class is about 70-30 female.
Good for them. I am happy to see so many young women doing so well. But I am concerned that the numbers are not more balanced. What are we doing that causes us to lose our boys?
A federal survey of about 9,000 young men and women born during the years 1980 to 1984 shows a big disparity when it comes to higher education, with women a third more likely to have received a bachelor’s degree by age 27. . . .
At 27 years of age, 32 percent of women had received a bachelor’s degree, compared to 24 percent of men, and by the same age, some 70 percent of women had at least attended some college, compared to 61 percent of men.
Digging deeper into the numbers, the survey found that once students enroll in college, women are more likely to don a cap and gown. Of the 70 percent of women who started college, 46 percent completed their bachelor’s degree by age 27. In comparison, of the 61 percent of men who started college, just 39 percent had completed their bachelor’s degree.
In addition, 3 of 5 grad students are now female.
We are losing boys. Our culture conditions them to believe they must be winners, to come out ahead in that stack ranking, that “race to the top,” to use Arne Duncan’s moronic sports metaphor. When that doesn’t happen, they give up. That’s one reason. Another is that our boys can’t and don’t read. All video games all the time.
The stats are from this story: http://www.foxnews.com/us/2014/04/03/college-degree-gender-gap-widens-with-younger-gen-xers.html
The vast majority of boys hate to write, even more than they dislike reading (and thinking). The narrowing of the curriculum under 17 years of test-and-punish culture has placed an inordinate emphasis on writing, in both ELA and math. The ridiculous push to make children explain in writing how or why their math solutions worked has backfired in very harmful ways. The unprecedented expansion of the null curriculum has left fewer of the options that would normally make school more appealing to boys. Combine these with an American culture that rewards and celebrates stupidity, and you get The Jackass Generation: young men who take pride in maximizing buffoonery and do not prioritize academics. Sad.
The Common Core requirement to have students explain how they got their answer — and in N different ways! — is ridiculous and a sure interest killer.
Students end up spending the majority of their homework time writing explanations for simple math problems.
They are also forced to do problems in complicated ways when much simpler straightforward ones suffice.
And on tests they often get penalized if they don’t explain how they got their answer, even if they got the right answer and clearly know how to do the problem.
And because of Common Core, needlessly confusing word problems are now getting far more emphasis than they deserve, and apparently for no other reason than that Jason Zimba, who wrote the CC math standards, likes word problems.
As a physicist, the guy knew basic math but knew nothing about writing standards.
Bob’s stats are borne out in my rural Tennessee community, except that there is more to this than meets the stats. Boys in the culture are not encouraged to pursue intellectual paths in their lives. They are allowed to play around and be lazy by the same parents who made sure an older sister toed the line. In so many families, the boy is the person of privilege, the girl the person of duty.
Another force is the numerical critical mass of students necessary to maintain an academic interest. Often in my career, I have seen groups of academically active girls feed off each other in a wonderful mix of learning. This culture is more rarely manifested in the male population, but has appeared more recently.
Boys are more likely than girls to produce academically as loners. Mixed groups are generally the best, with some boys in society with some girls, somewhat competing and somewhat cooperating toward a goal that is both individual and social.
Apparently, the testing industry can only survive by operating in the dark. The 1980 NYS Truth-in-Testing legislation was aimed at transparency with regard to ETS and college admissions testing. The law, which called for disclosure of research data by the publisher, was spurred by charges of item/test bias. Of course, ETS was against its passage.
Now, all these years later, everything has regressed to the point that psychometricians possess the data and we have to trust their expertise and their words that everything they did to produce the exams was technically sound. Thus, they stand as authorities on the validity of their own work. I do not trust them. They will hide and distort evidence and never let facts stand in the way of profits. A most salient point made in this article is that:
“College Board should share the data it had both before, during, and after the creation of the new SAT. College Board has not gone on record with its findings on the shift in gender inequities on the new SAT and has remained similarly mum—beyond misleading press releases—on how observed results have changed among other subgroups.” After-the-fact is the key.
This comment extends to high stakes NY statewide exams–where 1.2 million children are tested annually, but technical reports on the English and math exams are not issued for more than a year after they have been administered; where the FOIL process is used to thwart independent research into the quality of the instruments; and the state education department in partnership with the publisher, instead, assures us that everything was done properly.
Diane, I would challenge NYSED and the publisher to make the data in their possession subject to outside analysis within weeks of test administration. This would allow an immediate post mortem study of the exams to take place, and we (the public, educators, and researchers) would not have to rely on claims that all test items were selected via a lengthy, elaborately described review process prior to test construction. It would be more important and instructive to see how the items actually functioned and to determine whether, in fact, they yielded biased results.
Until then, everything said to justify the exams is smoke.
I do not understand many of the comments above about women liking writing, boys not reading, etc., and I do not understand the article by Art Sawyer either.
Here is the in-depth version of Sawyer’s article:
https://www.compassprep.com/new-sat-has-disadvantaged-female-testers/
Sawyer claims in the first line of his chart that females are doing worse on the new SAT because of the increased weighting of math and the decreased weighting of writing. In the same chart, you can immediately see the contradiction as you read the 3rd and 4th lines: women are doing better in math (which has increased weight) but worse in writing (which has decreased weight) on the new SAT.
Seeing this puzzle, I read the whole article. Math does indeed weigh more in the new SAT (50% in the new SAT vs. 33% in the old one),
BUT top-scoring women are doing better on the math portion of the new SAT than on the old SAT.
What’s going on?
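One way the two observations can coexist: a composite gap depends on both the per-section gaps and the section weights, so dropping a section that favored one group can widen the overall gap even while the remaining section gaps narrow. A toy calculation sketches the arithmetic; all numbers here are invented for illustration and are not College Board data:

```python
# Hypothetical per-section gender gaps (male minus female, in SD units).
# These values are made up for illustration only, NOT College Board data.
old_gaps = {"math": 0.30, "reading": 0.00, "writing": -0.30}
new_gaps = {"math": 0.20, "reading": 0.00}  # essay dropped; math gap narrower

old_weights = {"math": 1/3, "reading": 1/3, "writing": 1/3}
new_weights = {"math": 1/2, "reading": 1/2}

def composite_gap(gaps, weights):
    """Composite gap = weighted average of the per-section gaps."""
    return sum(gaps[s] * weights[s] for s in gaps)

print(composite_gap(old_gaps, old_weights))  # → 0.0  (writing offsets math)
print(composite_gap(new_gaps, new_weights))  # → 0.1  (nothing offsets math)
```

With these made-up numbers, the math gap shrinks from 0.30 to 0.20 SD, yet the composite gap grows from zero to 0.10 SD, because the section that favored female testers no longer counts at all.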
Are grades gender neutral, in terms of outcomes?
Grades suffer from the same onto-epistemological errors and falsehoods as all assessments that attempt to take a complex process, the teaching and learning process, and describe it with a simple letter grade or a simple word or phrase (proficient, advanced, passing, failing, etc.), as Noel Wilson showed in his 1997 dissertation “Educational Standards and the Problem of Error,” found at:
http://epaa.asu.edu/ojs/article/view/577/700
Brief outline of Wilson’s “Educational Standards and the Problem of Error” and some comments of mine. (updated 6/24/13 per Wilson email)
A description of a quality can only be partially quantified. Quantity is almost always a very small aspect of quality. It is illogical to judge/assess a whole category only by a part of the whole. The assessment is, by definition, lacking in the sense that “assessments are always of multidimensional qualities. To quantify them as unidimensional quantities (numbers or grades) is to perpetuate a fundamental logical error” (per Wilson). The teaching and learning process falls in the logical realm of aesthetics/qualities of human interactions. In attempting to quantify educational standards and standardized testing the descriptive information about said interactions is inadequate, insufficient and inferior to the point of invalidity and unacceptability.
A major epistemological mistake is that we attach, with great importance, the “score” of the student, not only onto the student but also, by extension, the teacher, school and district. Any description of a testing event is only a description of an interaction, that of the student and the testing device at a given time and place. The only correct logical thing that we can attempt to do is to describe that interaction (how accurately or not is a whole other story). That description cannot, by logical thought, be “assigned/attached” to the student as it cannot be a description of the student but the interaction. And this error is probably one of the most egregious “errors” that occur with standardized testing (and even the “grading” of students by a teacher).
Wilson identifies four “frames of reference,” each with distinct assumptions (epistemological basis) about the assessment process from which the “assessor” views the interactions of the teaching and learning process: the Judge (think of a college professor who “knows” the students’ capabilities and grades them accordingly), the General Frame (think of standardized testing that claims to have a “scientific” basis), the Specific Frame (think of learning by objective, as in computer-based learning, where one must get a correct answer before moving on to the next screen), and the Responsive Frame (think of an apprenticeship in a trade or a medical residency program, where the learner interacts with the “teacher” with constant feedback). Each category has its own sources of error, and more error enters the process when the assessor confuses and conflates the categories.
Wilson elucidates the notion of “error”: “Error is predicated on a notion of perfection; to allocate error is to imply what is without error; to know error it is necessary to determine what is true. And what is true is determined by what we define as true, theoretically by the assumptions of our epistemology, practically by the events and non-events, the discourses and silences, the world of surfaces and their interactions and interpretations; in short, the practices that permeate the field. . . Error is the uncertainty dimension of the statement; error is the band within which chaos reigns, in which anything can happen. Error comprises all of those eventful circumstances which make the assessment statement less than perfectly precise, the measure less than perfectly accurate, the rank order less than perfectly stable, the standard and its measurement less than absolute, and the communication of its truth less than impeccable.”
In other words all the logical errors involved in the process render any conclusions invalid.
The test makers/psychometricians, through all sorts of mathematical machinations, attempt to “prove” that these tests (based on standards) are valid, that is, errorless, or supposedly at least with minimal error [they aren’t]. Wilson turns the concept of validity on its head and focuses on just how invalid the machinations, the test, and the results are. He is an advocate for the test taker, not the test maker. In doing so he identifies thirteen sources of “error,” any one of which renders the test making/giving/disseminating of results invalid. And a basic logical premise is that once something is shown to be invalid it is just that, invalid, and no amount of “fudging” by the psychometricians/test makers can alleviate that invalidity.
Having shown the invalidity, and therefore the unreliability, of the whole process, Wilson concludes, rightly so, that any result/information gleaned from the process is “vain and illusory.” In other words, start with an invalidity and end with an invalidity (except by sheer chance every once in a while, like a blind and anosmic squirrel that finds the occasional acorn, a result may be “true”), or, to put it in more mundane terms, crap in, crap out.
And so what does this all mean? I’ll let Wilson have the second to last word: “So what does a test measure in our world? It measures what the person with the power to pay for the test says it measures. And the person who sets the test will name the test what the person who pays for the test wants the test to be named.”
In other words, it attempts to measure “‘something,’ and we can specify some of the ‘errors’ in that ‘something’ but still don’t know [precisely] what the ‘something’ is.” The whole process harms many students, as the social rewards available to some are not available to others who “don’t make the grade (sic).” Should American public education have the function of sorting and separating students so that some may receive greater benefits than others, especially considering that the sorting and separating devices, educational standards and standardized testing, are so flawed not only in concept but in execution?
My answer is NO!!!!!
One final note with Wilson channeling Foucault and his concept of subjectivization:
“So the mark [grade/test score] becomes part of the story about yourself and with sufficient repetitions becomes true: true because those who know, those in authority, say it is true; true because the society in which you live legitimates this authority; true because your cultural habitus makes it difficult for you to perceive, conceive and integrate those aspects of your experience that contradict the story; true because in acting out your story, which now includes the mark and its meaning, the social truth that created it is confirmed; true because if your mark is high you are consistently rewarded, so that your voice becomes a voice of authority in the power-knowledge discourses that reproduce the structure that helped to produce you; true because if your mark is low your voice becomes muted and confirms your lower position in the social hierarchy; true finally because that success or failure confirms that mark that implicitly predicted the now self-evident consequences. And so the circle is complete.”
In other words students “internalize” what those “marks” (grades/test scores) mean, and since the vast majority of the students have not developed the mental skills to counteract what the “authorities” say, they accept as “natural and normal” that “story/description” of them. Although paradoxical in a sense, the “I’m an “A” student” is almost as harmful as “I’m an ‘F’ student” in hindering students becoming independent, critical and free thinkers. And having independent, critical and free thinkers is a threat to the current socio-economic structure of society.
YES to: “Although paradoxical in a sense, the “I’m an “A” student” is almost as harmful as “I’m an ‘F’ student” in hindering students becoming independent, critical and free thinkers. And having independent, critical and free thinkers is a threat to the current socio-economic structure of society.”
Thank you, Duane.
Flerp!,
Boys as a group have lower GPAs than girls as a group. This has been true for at least the last twenty years. See https://www.nationsreportcard.gov/hsts_2009/gender_gpa.aspx
No. They have been favoring the girls.
Looking at a less aggregate measure, boys score higher on ACT math and science while girls score higher in English and social science. Overall, the gender differences in ACT scores are small. Girls, however, have higher high school GPAs in all subjects, and there are large differences in teacher-assigned grades between girls and boys. See http://www.act.org/content/dam/act/unsecured/documents/Info-Brief-2014-12.pdf
If differences between boys’ and girls’ SAT scores are evidence of bias in favor of boys in the SAT, surely the differences in grades are evidence of bias in favor of girls in high school education.
I think the safest and most important conclusion is that both boys and girls have to suffer through these useless exams, while the testmakers and the companies associated with these tests make an insane amount of money.
There is no doubt that useless exams are useless, just as there is no doubt that useful exams are useful.
If we adopt high school grades as the sole measure of academic achievement there will be little hope for boys. The current ratio of 60% female to 40% male in competitive colleges and universities will go to 70% or perhaps 80% female.
Not clear what you are implying. The reason colleges drop the SAT and ACT is that they are not as good a predictor of successful college completion as high school GPA.
So why should the SAT or ACT be kept? So that more boys get into colleges and then fail there?
Mate,
I think schools have not taken into account that the sample of their own students has been censored by their admission process. This is very tricky to get right.
In a selective school, students with low standardized test scores are admitted because the admission folks have a reason to believe that the standardized test does not accurately reflect the student’s academic ability. The students whose low standardized test scores the admissions people believe are accurate reflections of the student’s academic ability are not admitted and thus not part of the pool used by the institution to test if test scores matter for academic success.
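The censoring effect described above, selection on test scores restricting the range of scores observed at the institution, can be sketched with a quick simulation. This is a toy model with invented parameters, not real admissions data:

```python
import random

random.seed(0)

def corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Simulate applicants: test score and college success both reflect a
# common ability factor plus independent noise (all parameters invented).
ability = [random.gauss(0, 1) for _ in range(20000)]
score = [a + random.gauss(0, 1) for a in ability]
success = [a + random.gauss(0, 1) for a in ability]

# Correlation in the full applicant pool...
full = corr(score, success)

# ...versus among "admitted" students only (top 20% of scorers).
cutoff = sorted(score)[int(0.8 * len(score))]
admitted = [(s, y) for s, y in zip(score, success) if s >= cutoff]
restricted = corr([s for s, _ in admitted], [y for _, y in admitted])

print(f"full pool: {full:.2f}, admitted only: {restricted:.2f}")
# The admitted-only correlation comes out noticeably smaller: selecting
# on the score restricts its range and attenuates its apparent
# predictive power, even though the score is genuinely informative.
```

This is the classic restriction-of-range effect: an institution that only ever observes students it admitted on the basis of a test will systematically underestimate how well that test predicts outcomes in the full applicant pool.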
We should also think about the students with low GPAs who are admitted to selective schools. Athletes come to mind as a large group that gains admission despite lower high school GPAs. Perhaps the effectiveness of high school GPA is actually showing that college athletes, because of the huge time commitment for student athletes, tend to have lower college success than non-athletes.
My own institution is nearly open admission, requiring only a 2.0 average over a set of high school courses for automatic admission to the university. The correlation between ACT scores and student success varies across student abilities, but it is strongest for those who would have no chance of admission to a selective university and thus would never be in the population that selective universities use to gauge the effectiveness of standardized exams at predicting student success.
I do not think students should be admitted to a university where they have little chance of thriving. My point is that using the single criterion of high school grades is misleading because of the bias against male students. Let’s take one of my sons as an example. He was not an athlete (he did theater in high school, another very time-consuming extracurricular activity) and did not graduate in the top 10% of his class. If that is all the University of Chicago knew about him, they would have been unlikely to admit him, the UC math department would have had one less freshman in Honors Analysis, and UC one less honors mathematics graduate.
These are good points, thanks. I think we can avoid addressing the problem in terms of gender, which is imo questionable, by focusing on the essential issue: some kids have greater tolerance for busywork, and schools, universities, and the associated tests require more busywork than necessary. Creativity, curiosity, and enthusiasm unfortunately play a lesser role in achieving a good GPA.
I have no idea how to objectively evaluate creativity or curiosity or enthusiasm. I suspect it’s impossible, and so schools that pretend to have objective selection criteria are fooling themselves and their clientele.