Few academic ideas have been as eagerly absorbed into public discourse in recent years as “implicit bias.” Embraced by a president, a would-be president, and the nation’s top law-enforcement official, the implicit-bias conceit has launched a movement to remove the concept of individual agency from the law and spawned a multimillion-dollar consulting industry. The statistical basis on which it rests is now crumbling, but don’t expect its influence to wane anytime soon.
Implicit bias purports to answer the question: Why do racial disparities persist in household income, job status, and incarceration rates, when explicit racism has, by all measures, greatly diminished over the last half-century? The reason, according to implicit-bias researchers, lies deep in our brains, outside the reach of conscious thought. We may consciously embrace racial equality, but almost all of us harbor unconscious biases favoring whites over blacks, the proponents claim. And those unconscious biases, which the implicit-bias project purports to measure scientifically, drive the discriminatory behavior that, in turn, results in racial inequality.
The need to plumb the unconscious to explain ongoing racial gaps arises for one reason: it is taboo in universities and mainstream society to acknowledge intergroup differences in interests, abilities, cultural values, or family structure that might produce socioeconomic disparities.
The implicit-bias idea burst onto the academic scene in 1998 with the rollout of a psychological instrument called the implicit association test (IAT). Created by social psychologists Anthony Greenwald and Mahzarin Banaji, with funding from the National Science Foundation and National Institute of Mental Health, the IAT was announced as a breakthrough in prejudice studies: “The pervasiveness of prejudice, affecting 90 to 95 percent of people, was demonstrated today . . . by psychologists who developed a new tool that measures the unconscious roots of prejudice,” read the press release.
The race IAT (there are non-race varieties) displays a series of black faces and white faces on a computer; the test subject must sort them quickly by race into two categories, represented by the “i” and “e” keys on the keyboard. Next, the subject sorts “good” or “positive” words like “pleasant,” and “bad” or “negative” words like “death,” into good and bad categories, represented by those same two computer keys. The sorting tasks are then intermingled: faces and words appear at random on the screen, and the test-taker has to sort them with the “i” and “e” keys. Next, the sorting protocol is reversed. If, before, a black face was to be sorted using the same key as the key for a “bad” word, now a black face is sorted with the same key as a “good” word and a white face sorted with the reverse key. If a subject takes longer sorting black faces using the computer key associated with a “good” word than he does sorting white faces using the computer key associated with a “good” word, the IAT deems the subject a bearer of implicit bias. The IAT ranks the subject’s degree of implicit bias based on the differences in milliseconds with which he accomplishes the different sorting tasks; at the end of the test, he finds out whether he has a strong, moderate, or weak “preference” for blacks or for whites. A majority of test-takers (including many blacks) are rated as showing a preference for white faces. Additional IATs sort pictures of women, the elderly, the disabled, and other purportedly disfavored groups.
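To make the scoring arithmetic concrete, here is a minimal sketch in Python of how a latency-difference score of this general kind can be computed. The reaction times are invented for illustration, and the function is a stripped-down stand-in rather than the published IAT scoring procedure, which adds steps such as error penalties and trial filtering that are omitted here.

```python
from statistics import mean, stdev

def iat_d_score(compatible_ms, incompatible_ms):
    """Simplified latency-difference score: the mean reaction-time gap between
    the 'incompatible' block (e.g., black faces sharing a key with 'good' words)
    and the 'compatible' block, scaled by the pooled standard deviation of all
    trials. Positive values mean slower responses on the incompatible pairing."""
    gap = mean(incompatible_ms) - mean(compatible_ms)
    pooled_sd = stdev(compatible_ms + incompatible_ms)
    return gap / pooled_sd

# Invented reaction times (milliseconds) for a single test-taker
compatible = [612, 580, 645, 598, 630, 571, 605]
incompatible = [688, 702, 655, 710, 674, 691, 668]
print(round(iat_d_score(compatible, incompatible), 2))  # about 1.69 with these made-up numbers
```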
Greenwald and Banaji did not pioneer such response-time studies; psychologists already used response-time methodology to measure how closely concepts are associated in memory. And the idea that automatic cognitive processes and associations help us navigate daily life is also widely accepted in psychology. But Greenwald and Banaji, now at the University of Washington and Harvard University, respectively, pushed the response-time technique and the implicit-cognition idea into charged political territory. Not only did they confidently assert that any differences in sorting times for black and white faces flow from unconscious prejudice against blacks; they also claimed that such unconscious prejudice, as measured by the IAT, predicts discriminatory behavior. It is “clearly . . . established that automatic race preference predicts discrimination,” they wrote in their 2013 bestseller Blind Spot, which popularized the IAT. And in the final link of their causal chain, they hypothesized that this unconscious predilection to discriminate is a cause of racial disparities: “It is reasonable to conclude not only that implicit bias is a cause of Black disadvantage but also that it plausibly plays a greater role than does explicit bias in explaining the discrimination that contributes to Black disadvantage.”
The implicit-bias conceit spread like wildfire. President Barack Obama denounced “unconscious” biases against minorities and females in science in 2016. NBC anchor Lester Holt asked Hillary Clinton during a September 2016 presidential debate whether “police are implicitly biased against black people.” Clinton answered: “Lester, I think implicit bias is a problem for everyone, not just police.” Then–FBI director James Comey claimed in a 2015 speech that “much research” points to the “widespread existence of unconscious bias.” “Many people in our white-majority culture,” Comey said, “react differently to a white face than a black face.” The Obama Justice Department packed off all federal law-enforcement agents to implicit-bias training. Clinton promised to help fund it for local police departments, many of which had already begun the training following the 2014 fatal police shooting of Michael Brown in Ferguson, Missouri.
A parade of journalists confessed their IAT-revealed preferences, including Malcolm Gladwell in his acclaimed book Blink. Corporate diversity trainers retooled themselves as purveyors of the new “science of bias.” And the legal academy started building the case that the concept of intentionality in the law was scientifically obtuse. Leading the charge was Jerry Kang, a UCLA law professor in the school’s critical race studies program who became UCLA’s fantastically paid vice chancellor for Equity, Diversity and Inclusion in 2015 (starting salary: $354,900, now up to $444,000). “The law has an obligation to respond to changes in scientific knowledge,” Kang said in a 2015 lecture. “Federal anti-discrimination law has been fixated on, and obsessed with, conscious intent.” But the new “behavioral realism,” as the movement to incorporate IAT-inspired concepts into the law calls itself, shows that we “discriminate without the intent and awareness to discriminate.” If we look only for conscious intent, we will “necessarily be blind to a whole bunch of real harm that is painful and consequential,” he concluded. Kang has pitched behavioral realism to law firms, corporations, judges, and government agencies.
A battle is under way regarding the admissibility of IAT research in employment-discrimination lawsuits: plaintiffs’ attorneys regularly offer Anthony Greenwald as an expert witness; the defense tries to disqualify him. Greenwald has survived some defense challenges but has lost others. Kang is philosophical: “It might not matter if Tony’s expert testimony is kicked out now,” he said in his 2015 lecture—in ten years, everyone will know that our brains harbor hidden biases. And if that alleged knowledge becomes legally actionable, then every personnel decision can be challenged as the product of implicit bias. The only way to guarantee equality of opportunity would be to mandate equality of result through quotas, observes the University of Pennsylvania’s Philip Tetlock, a critic of the most sweeping IAT claims.
The potential reach of the behavioral-realism movement, which George Soros’s Open Society Foundations is underwriting, goes far beyond employment-discrimination litigation. Some employers are using the IAT to screen potential workers, diversity consultant Howard Ross says. More and more college administrations require members of faculty-search committees to take the IAT to confront their hidden biases against minority and female candidates. Promotion committees at many corporations undergo the IAT. UCLA law school strongly encourages incoming law students to take the test to confront their implicit prejudice against fellow students; the University of Virginia might incorporate the IAT into its curriculum. Kang has argued for FCC regulation of how the news media portray minorities, to lessen implicit prejudice. If threats to fair treatment “lie in every mind,” as Kang and Banaji argued in a 2006 California Law Review article, then the scope for government intervention in private transactions to overcome those threats is almost limitless.
But though proponents refer to IAT research as “science”—or, in Kang’s words, “remarkable,” “jaw-dropping” science—their claims about its social significance leapfrogged ahead of scientific validation. There is hardly an aspect of IAT doctrine that is not now under methodological challenge.
Any social-psychological instrument must pass two tests to be considered accurate: reliability and validity. A psychological instrument is reliable if the same test subject, taking the test at different times, achieves roughly the same score each time. But IAT bias scores have a lower rate of consistency than is deemed acceptable for use in the real world—a subject could be rated with a high degree of implicit bias on one taking of the IAT and a low or moderate degree the next time around. A recent estimate puts the reliability of the race IAT at half of what is considered usable. No evidence exists, in other words, that the IAT reliably measures anything stable in the test-taker.
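The reliability at issue here is conventionally reported as a test-retest correlation: the same subjects take the instrument twice, and the two sets of scores are correlated. The sketch below, using invented scores, shows how that figure is computed; the benchmark in the comment reflects common psychometric practice rather than anything specific to the IAT literature.

```python
from math import sqrt

def pearson_r(first, second):
    """Test-retest reliability: correlation between the same subjects'
    scores on two administrations of the same instrument."""
    n = len(first)
    mx, my = sum(first) / n, sum(second) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(first, second))
    sx = sqrt(sum((x - mx) ** 2 for x in first))
    sy = sqrt(sum((y - my) ** 2 for y in second))
    return cov / (sx * sy)

# Invented D-scores for five subjects tested on two occasions. A stable
# instrument would yield a correlation near the conventional 0.7-0.8 floor;
# the estimate cited in the article puts the race IAT at roughly half of that.
session_1 = [0.8, 0.2, 0.5, -0.1, 0.6]
session_2 = [0.1, 0.6, 0.3, 0.4, -0.2]
print(round(pearson_r(session_1, session_2), 2))
```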
But the fiercest disputes concern the IAT’s validity. A psychological instrument is deemed “valid” if it actually measures what it claims to be measuring—in this case, implicit bias and, by extension, discriminatory behavior. If the IAT were valid, a high implicit-bias score would predict discriminatory behavior, as Greenwald and Banaji asserted from the start. It turns out, however, that IAT scores have almost no connection to what ludicrously counts as “discriminatory behavior” in IAT research—trivial nuances of body language during a mock interview in a college psychology laboratory, say, or a hypothetical choice to donate to children in Colombian, rather than South African, slums. Oceans of ink have been spilled debating the statistical strength of the correlation between IAT scores and lab-induced “discriminatory behavior” on the part of college students paid to take the test. The actual content of those “discriminatory behaviors” gets mentioned only in passing, if at all, and no one notes how remote those behaviors are from the discrimination that we should be worried about.
Even if we accept at face value that the placement of one’s chair in a mock lab interview or decisions in a prisoner’s-dilemma game are significant “discriminatory behaviors,” the statistical connection between IAT scores and those actions is negligible. A 2009 meta-analysis of 122 IAT studies by Greenwald, Banaji, and two management professors found that IAT scores accounted for only 5.5 percent of the variation in laboratory-induced “discrimination.” Even that low score was arrived at by questionable methods, as Jesse Singal discussed in a masterful review of the IAT literature in New York. A team of IAT skeptics—Fred Oswald of Rice University, Gregory Mitchell of the University of Virginia law school, Hart Blanton of the University of Connecticut, James Jaccard of New York University, and Philip Tetlock—noticed that Greenwald and his coauthors had counted opposite behaviors as validating the IAT. If test subjects scored high on implicit bias via the IAT but demonstrated better behavior toward out-group members (such as blacks) than toward in-group members, that was a validation of the IAT on the theory that the subjects were overcompensating for their implicit bias. But studies that found a correlation between a high implicit-bias score and discriminatory behavior toward out-group members also validated the IAT. In other words: heads, I win; tails, I win.
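For scale, the 5.5 percent figure is a variance-explained (r-squared) value, so the correlation it implies between IAT scores and the lab behaviors is its square root, roughly 0.23. A quick check of the arithmetic:

```python
from math import sqrt

# The 5.5 percent figure is a variance-explained (r-squared) value; the
# correlation it implies between IAT scores and the lab-measured behaviors
# is its square root.
variance_explained = 0.055
print(round(sqrt(variance_explained), 2))  # 0.23
```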
Greenwald and Banaji now admit that the IAT does not predict biased behavior. The psychometric problems associated with the race IAT “render [it] problematic to use to classify persons as likely to engage in discrimination,” they wrote in 2015, just two years after their sweeping claims in Blind Spot. The IAT should not be used, for example, to select a bias-free jury, maintains Greenwald. “We do not regard the IAT as diagnosing something that inevitably results in racist or prejudicial behavior,” he told The Chronicle of Higher Education in January. Their fallback position: though the IAT does not predict individual biased behavior, it predicts discrimination and disadvantage in the aggregate. “Statistically small effects” can have “societally large effects,” they have argued. If a society has higher levels of implicit bias against blacks as measured on the IAT, it will allegedly have higher levels of discriminatory behavior. Hart Blanton, one of the skeptics, dismisses this argument. If you don’t know what an instrument means on an individual level, you don’t know what it means in the aggregate, he told New York’s Singal. In fairness to Greenwald and Banaji, it is true that a cholesterol score, say, is more accurate at predicting heart attacks the larger the sample of subjects. But too much debate exists about what the IAT actually measures for much confidence about large-scale effects.
Initially, most of the psychology profession accepted the startling claim that one’s predilection to discriminate in real life is revealed by millisecond differences in the speed with which one sorts images. But possible alternative meanings of a “pro-white” IAT score are now beginning to emerge. Older test-takers may have cognitive difficulty with the shifting instructions of the IAT. Objective correlations between group membership and socioeconomic outcomes may lead to differences in sorting times, as could greater familiarity with one ethnic-racial group compared with another. These alternative meanings should have been ruled out before the world learned that a new “scientific” test had revealed the ubiquity of prejudice.
The most recent meta-analysis deals another blow to the conventional IAT narrative. This study, not yet formally published, looked at whether changes in implicit bias allegedly measured by the IAT led to changes in “discriminatory behavior”—defined as the usual artificial lab conduct. While small changes in IAT scores can be induced in a lab setting through various psychological priming techniques, they do not produce changes in behavior, the study found. The study’s seven authors propose a radical possibility that would halt the implicit-bias crusade in its tracks: “perhaps automatically retrieved associations really are causally inert”—that is, they have no relationship to how we act in the real world. Instead of “acting as a ‘cognitive monster’ that inevitably leads to bias-consistent thought and behavior,” the researchers propose, “automatically retrieved associations could reflect the residual ‘scar’ of concepts that are frequently paired together within the social environment.” If this is true, they write, there would need to be a “reevaluation of some of the central assumptions that drive implicit bias research.” That is an understatement.
Among the study’s authors are Brian Nosek of the University of Virginia and Calvin Lai of Washington University in St. Louis. Both have collaborated with Greenwald and Banaji in furthering the dominant IAT narrative; Nosek was Banaji’s student and helped put the IAT on the web. It is a testament to their scientific integrity that they have gone where the data have led them. (Greenwald warned me in advance about their meta-analysis: “There has been a recent rash of popular press critique based on a privately circulated ‘research report’ that has not been accepted by any journal, and has been heavily criticized by editor and reviewers of the one journal to which I know it was submitted,” he wrote in an e-mail. But the Nosek, Lai, et al. study was not “privately circulated”; it is available on the web, as part of the open-science initiative that Nosek helped found.)
The fractious debate around the IAT has been carried out exclusively at the micro-level, with hundreds of articles burrowing deep into complicated statistical models to assess minute differences in experimental reaction times. Meanwhile, outside the purview of these debates, two salient features of the world go unnoticed by the participants: the pervasiveness of racial preferences and the behavior that lies behind socioeconomic disparities.
One would have difficulty finding an elite institution today that does not pressure its managers to hire and promote as many blacks and Hispanics as possible. Nearly 90 percent of Fortune 500 companies have some sort of diversity infrastructure, according to Howard Ross. The federal Equal Employment Opportunity Commission requires every business with 100 or more employees to report the racial composition of its workforce. Employers know that empty boxes for blacks and other “underrepresented minorities” can trigger governmental review. Some companies tie manager compensation to the achievement of “diversity,” as Roger Clegg documented before the U.S. Civil Rights Commission in 2006. “If people miss their diversity and inclusion goals, it hurts their bonuses,” the CEO of Abbott Laboratories said in a 2002 interview. Since then, the diversity pressure has only intensified. Google’s “objectives and key results” for managers include increased diversity. Walmart and other big corporations require law firms to put minority attorneys on the legal teams that represent them. “We are terminating a firm right now strictly because of their inability to grasp our diversity expectations,” Walmart’s general counsel announced in 2005. Any reporter seeking a surefire story idea can propose tallying up the minorities in a particular firm or profession; Silicon Valley has become the favorite subject of bean-counting “exposés,” though Hollywood and the entertainment industry are also targets of choice. Organizations will do everything possible to avoid such negative publicity.
In colleges, the mandate to hire more minority (and female) candidates hangs over almost all faculty recruiting. (Asians don’t count as a “minority” or a “person of color” for academic diversity purposes, since they are academically competitive.) Deans have canceled faculty-search results and ordered the hiring committee to go back to the drawing board if the finalists are not sufficiently “diverse.” (See “Multiculti U,” Spring 2013.) Every selective college today admits black and Hispanic students with much weaker academic qualifications than white and Asian students, as any high school senior knows. At the University of Michigan, for example, an Asian with the same GPA and SAT scores as the median black admit had zero chance in 2005 of admission; a white with those same scores had a 1 percent chance of admission. At Arizona State University, a white with the same academic credentials as the average black admit had a 2 percent chance of admission in 2006; that average black had a 96 percent chance of admission. The preferences continue into graduate and professional schools. UCLA and UC Berkeley law schools admit blacks at a 400 percent higher rate than can be explained on race-neutral grounds, though California law in theory bans them from using racial preferences. From 2013 to 2016, medical schools nationally admitted 57 percent of black applicants with low MCATs of 24 to 26 but only 8 percent of whites and 6 percent of Asians with those same low scores, as Frederick Lynch reported in the New York Times. The reason for these racial preferences is administrators’ burning desire to engineer a campus with a “critical mass” of black and Hispanic faces.
Similar pressures exist in the government and nonprofit sectors. In the New York Police Department, blacks and Hispanics are promoted ahead of whites for every position to which promotion is discretionary, as opposed to being determined by an objective exam. In the 1990s, blacks and Hispanics became detectives almost five years earlier than whites and took half the time as whites did to be appointed to deputy inspector or deputy chief.
And yet, we are to believe that alleged millisecond associations between blacks and negative terms are a more powerful determinant of who gets admitted, hired, and promoted than these often explicit and heavy-handed preferences. If a competitively qualified black female PhD in computer engineering walks into Google, say, we are to believe that a recruiter will unconsciously find reasons not to hire her, so as to bring on an inferior white male. The scenario is preposterous on its face—in fact, such a candidate would be snapped up in an instant by every tech firm and academic department across the country. The same is true for competitively qualified black lawyers, accountants, and portfolio managers.