(Cite as: Williams, Damien P. ‘“Any Sufficiently Advanced Neglect is Indistinguishable from Malice”: Assumptions and Bias in Algorithmic Systems;’ talk given at the 21st Conference of the Society for Philosophy and Technology; May 2019)
Now, I’ve got a chapter coming out about this soon, which I can provide as a preprint draft if you ask, and which can be cited as “Constructing Situated and Social Knowledge: Ethical, Sociological, and Phenomenological Factors in Technological Design,” appearing in Philosophy And Engineering: Reimagining Technology And Social Progress, Guru Madhavan, Zachary Pirtle, and David Tomblin, eds., forthcoming from Springer, 2019. But I wanted to get the words I said in this talk up onto some platforms where people can read them, as soon as possible, for a couple of reasons.
First, the Current Occupants of the Oval Office have very recently taken the policy position that algorithms can’t be racist, something which they’ve done in direct response to things like Google’s Hate Speech-Detecting AI being biased against Black people, and Amazon claiming that its facial recognition can identify fear, without ever accounting for, I dunno, cultural and individual differences in fear expression?
All these things taken together are what made me finally go ahead and get the transcript of that talk done and posted, because these are events and policy decisions about which I a) have been speaking and writing for years, and b) have specific inputs and recommendations, and which are c) frankly wrongheaded and outright hateful.
And I want to spend time on it because I think what doesn’t get through in many of our discussions is that it’s not just about how Artificial Intelligence, Machine Learning, or Algorithmic instances get trained, but about the processes by which, and the cultural environments in which, HUMANS are increasingly taught/shown/environmentally encouraged/socialized into what they take to be the “right way” to build and train said systems.
That includes classes and instruction, it includes the institutional culture of the companies, and it includes the policy landscape in which decisions about funding get made, because that drives how people have to talk and write and think about the work they’re doing, and that constrains what they will even attempt to do or understand.
All of this is cumulative, accreting into institutional epistemologies of algorithm creation. It is a structural and institutional problem.
So here are the Slides:
The Audio:
And the Transcript is here below the cut:
0:00
…algorithmic bias, machine learning, artificial intelligence, and it is called “Any Sufficiently Advanced Neglect Is Indistinguishable From Malice.”
This title comes from a repurposing of Clarke’s Law, “any sufficiently advanced technology is indistinguishable from magic,” repurposed by Dr. Debbie Chachra. She’s a materials science engineer at Olin College of Engineering, up in Boston.
This idea that any sufficiently advanced neglect is indistinguishable from malice means that we’re talking about instances in which a hazard was known, or at least was foreseen by certain groups, was warned about—and warned about persistently enough—in relation to either a system put in place, a technology created, or both, but was ignored: the position of those who put this knowledge forward went unheeded. And it then created what were, for those groups, entirely foreseeable harms. Those harms have, in turn, persisted long enough, with enough people raising a cry about them who then also go on to be subsequently ignored, that those who claim ignorance of them are not really meaningfully distinguishable from those who would actively seek to harm.
A harm created through persistent ignorance—through willful ignorance of harm raised—is not necessarily very different from harm intentionally done.
To talk about this, I want to go through a few case studies—a couple of them are going to be very familiar, because we’ve literally just heard about them—but a couple of them will hopefully be new to you. I’m going to present these case studies, and I’m going to go ahead and give some of their background as well.
Various resume-sorting algorithms currently exist which have been trained to sort resumes based on applicant pools that have problems with things like “women-sounding” names or “Black-sounding” names. Those resumes, even when controlled for exactly the same credentials, training, background, and education, go on to be rated lower than resumes with names that are “white-sounding” or “male-sounding.” This has been a persistent problem in resume sorting by humans for a very long time, and when the resume-sorting training was given to various machine learning algorithms, those biases made their way into those systems.
Many are trying to find ways around this—anything from something as simple as just removing names from the applicant pool and doing a more “blind” review, to actively counterweighting that bias with a different kind of bias toward the other end of things. Natural Language Processing has a lot to do with this. There’s a study that came out in 2016 showing that biases in natural language corpora—like the email caches that are used to train natural language processing for machine learning algorithms—pass biases like gender differentials on to those machine learning algorithms and natural language processing systems.
There’s a reason for this.
Do you know what the largest cache of openly available natural language [corpus] is?
[AUDIENCE MEMBER]
Google search?
[DPW]
No. It’s the Enron emails.
So. A group of emails comprising hundreds of thousands, if not millions, of lines of natural language text between a very specific class and category of people, who talk in very specific, gendered, powered, and racialized ways about the topics under discussion. This is what’s used, because the Enron emails entered the public domain when Enron was put on trial, and so they are now publicly and freely available for anyone to use to train their algorithms.
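[A note on how that kind of bias gets measured: the 2016-era embedding studies mostly work by comparing how close words sit to gendered terms in the learned vector space. What follows is a minimal, hypothetical sketch of that kind of check in Python—it assumes you already have some pre-trained word-vector lookup called “vectors” (for example, a gensim KeyedVectors object), and the term lists and example words are purely illustrative, not any study’s actual test set.]

```python
# A minimal sketch (not any study's own code) of how gendered associations
# can be measured in a word embedding trained on a biased corpus.
# Assumes `vectors` is any mapping from word -> numpy array, e.g. a
# gensim KeyedVectors model loaded from pre-trained embeddings.

import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def gender_association(vectors, word,
                       female_terms=("she", "woman", "her"),
                       male_terms=("he", "man", "him")):
    """Positive scores lean 'female', negative lean 'male', in this embedding.

    The score is the mean cosine similarity to the female terms minus the
    mean cosine similarity to the male terms -- a simplified version of the
    association tests used in the 2016-2017 embedding-bias studies.
    """
    f = np.mean([cosine(vectors[word], vectors[t]) for t in female_terms])
    m = np.mean([cosine(vectors[word], vectors[t]) for t in male_terms])
    return f - m

# Example usage (hypothetical): occupation words often come out skewed.
# for job in ["nurse", "engineer", "homemaker", "programmer"]:
#     print(job, gender_association(vectors, job))
```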
Bail and sentencing algorithms we literally just heard about, but I want to talk about a slightly different case. This is from Broward County, Florida’s use of COMPAS. You see pretty much exactly the same things that Clinton was talking about, though. On the left, we have Bernard Parker: one prior offense for resisting arrest without violence—that most likely means he moved his arm while an officer was putting his cuffs on (that is “resisting arrest,” by the way)—zero subsequent offenses until the time at which he was re-entered into the system, and he was rated a high risk. Dylann Fugett: one prior offense for attempted burglary, three subsequent offenses after that for drug possession, rated by COMPAS as a low risk.
Same sentencing county, obviously—Broward County, Florida—a high likelihood of it being the same judge doing the sentencing, and these very drastically different outcomes. You can get this information and see the COMPAS algorithm’s metrics, looking at roughly 11,000 to 12,000 different subjects in Broward County, Florida, in ProPublica’s investigation into the way that COMPAS operates.
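[For readers who want to poke at this themselves: ProPublica published the Broward County data it analyzed, and the core of its finding is a comparison of error rates across racial groups. Below is a rough sketch of that kind of check—the column names (decile_score, race, two_year_recid) and the file name reflect my reading of their released dataset and should be verified against the actual files; this is not ProPublica’s own code.]

```python
# A rough sketch of the kind of error-rate comparison ProPublica ran on the
# Broward County COMPAS data. Column names are assumptions based on their
# published dataset and should be checked against the actual CSV.

import pandas as pd

def false_positive_rates(df, high_risk_threshold=5):
    """False positive rate per racial group.

    A 'false positive' here is someone labeled high risk (decile score at or
    above the threshold) who did NOT reoffend within two years.
    """
    df = df.assign(high_risk=df["decile_score"] >= high_risk_threshold)
    rates = {}
    for group, sub in df.groupby("race"):
        non_recidivists = sub[sub["two_year_recid"] == 0]
        if len(non_recidivists) == 0:
            continue
        # Fraction of people who did not reoffend but were flagged high risk.
        rates[group] = non_recidivists["high_risk"].mean()
    return rates

# Usage (hypothetical file name):
# df = pd.read_csv("compas-scores-two-years.csv")
# print(false_positive_rates(df))
```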
Self-driving cars: machine vision, LIDAR, detection in self-driving cars, and how self-driving cars operate on the road. Self-driving cars and the algorithms that allow them to see the road are trained on pattern-recognition and matching algorithms that are designed to teach them how to see what they see. What self-driving cars cannot see very well…are Black people and wheelchairs. Especially if a person using a wheelchair does what many people do when they use wheelchairs. I don’t know how many of you know wheelchair users in your lives. Sometimes, instead of what we think of as the standard sitting-in-a-chair-and-pushing-the-wheels-forward kind of use, wheelchair users tend to push backwards off of things. Because it increases speed and power and the ability to maneuver.
Yeah, a self-driving car has no idea what to do with that. And the likelihood of a self-driving car hitting a wheelchair user who is using a wheelchair in what it considers to be a “non-standard” way is roughly 90%.
Speaking of imaging systems not being able to see Black people: Google had a very persistent problem of not being able to properly categorize Black people in its image search. In fact, when it was given images of Black individuals, it returned pictures of gorillas or chimpanzees. This is not because somebody went in and taught it that Black people are gorillas or chimpanzees. This is because it was not taught anything better, and it tried to match its “best fit.” Because no one on the team doing the initial training and data collection thought to give it a better way of understanding or seeing…people with darker skin tones. This has a long, long history in image collection and production.
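[To make the “best fit” point concrete: a standard closed-set classifier can only ever answer with one of the labels it was trained on—there is no “none of the above.” The toy sketch below is emphatically not Google’s system; it’s just the general shape of the failure, with hypothetical names throughout.]

```python
# A toy illustration (not Google's system) of the "best fit" problem:
# a closed-set classifier can only ever answer with one of the labels it was
# trained on, however poor the match actually is.

import numpy as np

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

def classify(feature_vector, class_prototypes):
    """Score an input against known class prototypes and pick the 'best fit'.

    `class_prototypes` maps label -> prototype vector. There is no
    "none of the above" option: even an input unlike anything in training
    gets assigned the nearest known label, sometimes with deceptively
    high confidence.
    """
    labels = list(class_prototypes)
    logits = np.array([np.dot(feature_vector, class_prototypes[label])
                       for label in labels])
    probs = softmax(logits)
    best = int(np.argmax(probs))
    return labels[best], float(probs[best])
```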
We’ll talk about that again in a second. But before we get there, I also want to talk about the fact that Nikon cameras…ask questions like this:
Their facial recognition looks at individuals with Asian phenotypes who happen to be smiling in an image and asks, “Did somebody blink?” Which—in case you weren’t aware, somehow—is super racist. But because nobody on that team—nobody in the development, the design, the process, the training—thought, “Hey, there are certain metrics that we are coding for, or assuming to be universal across the board,” this didn’t come up until it was out in the field.
In addition, we have problems like automatic sinks, soap dispensers, and paper towel dispensers not being able to see darker skin tones. I had this problem yesterday in the YMCA building, on the fourth floor. I had to walk downstairs to use the sink.
You have HP developing a motion sensitive camera that’s supposed to track faces—to keep the face in the center of the frame, regardless of where the user sits—not being able to see Black people.
You have a long, racialized history of darker skin tones not being able to be properly rendered in photographic equipment—a history that has been digitized and translated into digital camera technologies, and encoded as tools and techniques that get rendered and used in those technologies.
Photographic technology was designed and developed for the use of affluent white people. That sounds reductive, but it’s just the fact of the matter. When it was developed, that’s who would get to use it. When you were asking, “How do we make sure the details are renderable for the people in this picture,” the details you were looking at were the faces of the white people being photographed. That required the use of certain tools of optics, certain techniques of chemistry, to make sure that the contrast was properly allocated so that those people could be seen. What that turned into was clear detail on light colors… and almost impossible-to-render detail on anything darker.
If you look at pictures—photography from the 19th and early 20th centuries—what you will very often see, if there happens for some reason to be a Black person in that picture, is a dark blur. These tools and techniques were rendered and reinforced and re-inscribed, over and over and over again, until such point as they became the accepted way of doing photography. That “accepted way” of doing photography then became the techniques and tools, the accepted methodology, by which digital camera technologies were trained. Even today, a digital camera will “white balance” on the lightest thing in the frame before it tries to balance anything darker. This comes from Kodak’s Shirley Cards, which were literally pictures of a white woman named Shirley: you would use one to balance the white in the picture, and you would try to, like, get as clear an image of her as you could.
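[The “balance on the lightest thing in the frame” heuristic is close to what imaging people call the “white patch” assumption: treat the brightest region as white and rescale every channel around it. Here is a minimal sketch of that assumption, for illustration only—real camera pipelines are far more elaborate than this.]

```python
# A minimal sketch of the "white patch" assumption described above: the
# brightest region in the frame is treated as white, and each channel is
# rescaled around it. Everything darker is only ever corrected relative to
# that anchor.

import numpy as np

def white_patch_balance(image):
    """Naive white-patch white balance.

    `image` is an H x W x 3 float array with values in [0, 1]. Each channel
    is scaled so that its brightest value maps to 1.0 -- i.e., the lightest
    thing in the frame defines "white," and darker tones are adjusted
    relative to it.
    """
    balanced = image.astype(float).copy()
    for channel in range(3):
        peak = balanced[..., channel].max()
        if peak > 0:
            balanced[..., channel] /= peak
    return np.clip(balanced, 0.0, 1.0)
```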
Again, this is not to say that somebody said, “You know who I don’t want to take pictures of? Black people!” I mean, somebody probably said that, but like, photography as a whole didn’t say that. Right? What happened is that a series of assumptions, a series of biases about the way things are, and the way things probably would continue to be were inscribed as assumed knowledge. And that re-inscription and re-assumption, once again, got encoded for hundreds of years.
It has made its way into surveillance technology.
One of the weird offshoots of facial recognition not being able to see Black people very well is that, ironically and paradoxically, it gets used with greater frequency on communities of color. Communities of color are much more often subject to over-policing: assumed to “fit the description,” brought in by police, and, in many cases, just outright harassed because they, again, “fit the description.” Facial recognition systems that cannot see darker skin tones are still much more likely to be built off of mug shots from databases that include those people who “fit the description” but match no specific features of the description. What this means, ultimately, is that facial recognition is going to be the least accurate on the population on whom it is used the most often.
You can look at the Georgetown Center For Privacy and Technology study, “The Perpetual Lineup,” from 2016; they talk a lot about exactly this. It’s a very long, very detailed and thoroughgoing report. It is fantastic.
However, many people think, “well the answer to this is inclusion, right? We diversify the teams who are doing the training, we diversify the applicant pool, we diversify the training pool.” However. That doesn’t work for everybody. In their 2018 talk called “Don’t Include Us, Thank You,” sarah aoun and Nasma Ahmed look at Simone Browne’s Dark Matters, talking about the history of facial recognition and photographic technologies as we recently discussed them, and they talk about the idea that, even if we were to make these technologies more accurate for the people who are most likely to be subject to them, it’s not going to make them less often used on communities of color; it will in fact, provide an excuse to use them more often on communities of color. Taylor Stone’s talk yesterday about street lights and surveillance made me think a lot about this, when I was thinking about the idea of people who do surveillance on communities that they expect to be trouble. This is that exact same kind of problem.
Moving away from facial recognition, but back to Google, we can talk about Dylann Roof’s Google history. Dylann Roof killed nine people in a church in Charleston, South Carolina. He did so because he had a moment, during the Trayvon Martin/George Zimmerman trial, in which he thought that he was understanding something about the nature of crime and inequality in America. And he was driven, on his own account, for reasons that he couldn’t quite articulate, to search “Black on white crime.” We have no way to know exactly what results he was returned, but given the way that Google Search works (and you can look up exactly how Google Search works), it’s highly likely that the very first thing he saw was—based on his ISP, based on his location, and based on the searches in his area—white supremacist propaganda about “Black on white crime” statistics.
I’ve got a short primer basically on how to change your Google Search settings, by the way, if anybody wants to take a look at that later. You can make it not take your results from your surrounding area, and you can make it remove as much about you as possible.
A more recent, tragic example: the Tree of Life synagogue shooting in Pittsburgh, just this last year. A white supremacist wandered into the synagogue and murdered eleven people. Days—literally days—after this happened, Facebook’s online ad metric architecture, which has been given wide latitude to create categories for advertising on its own remit, generated a category for the white supremacist conspiracy theory “white genocide.” It said to someone who was talking about Jewish life and the hassles and hazards of Jewish life—an investigator from The Intercept—“Hey, this post that you’re making seems to fit with the ideas of these people who are interested in ‘white genocide’; there are about 180,000 people on Facebook who are interested in ‘white genocide,’ and if you add that word or that phrase in, your post will reach a wider audience.”
Facebook’s ad mechanism did this on its own. It was trained to do this—to find those patterns and to generate and develop ad categories on its own. A week later, Amazon did the same thing.
So how does this happen?
Like I said, it happens because of the data sets these things are given, the code that they’re trained with, and assumptions at base—assumptions of things like objectivity or neutrality, or shared knowledge and experience of the world. In each of these cases, again, a community of individuals spoke up in advance and very clearly said, “Hey, maybe don’t do that. Maybe don’t create the algorithm to do these things. Maybe think about the outcome of these technologies. Because these technologies have a history, prior to this, such that it is highly likely they will continue to reproduce systems of oppression, and bias, and bigotry. So maybe rethink what you’re doing.” And in each case, they were not heeded. Why?
Because of what we count as knowledge in the first place. What we think counts as knowledge doesn’t often include things like lived experience, and when it does include lived experience, the person whose lived experience it includes is often not someone who has been marginalized by overarching systems of knowledge, assumption, authority, and expertise. We tend to preference systematized knowledge—but systems based on what? Systems based on what kinds of inputs? It’s a question that we do not ask often enough. And the question that we very rarely ever ask is, “What about both of these things in tandem? What about people who, through their lived experience, have developed systems of knowledge that are not exactly reproducible for anyone who has not had that lived experience directly?”
Who gets to know? Who gets to lay claim to knowledge, to expertise—to have that knowledge and that expertise heeded and recognized by the wider world?
Sorry for the wall of text.
A few fundamental points, here:
“Different phenomenological and post-phenomenological experiences produce different pictures of the world, different systems of knowledge by which to navigate that world.
“Code is not neutral, it is a language and like with any language, translation is an issue; we are translating our knowledge—our lived experience gained from perspective—into technoscientific language [that] systems can understand.
“People inscribe their values, their perspectives, into every single tool and system they create and into how they use them.”
We need to think intersectionally and intersubjectively about the construction of our knowledge. We need to think about those people who have not been included in the conversation about what it is we ought to be thinking about in the first place, what systems we ought, and ought not, to be trying to create, and how we ought, and ought not, to be deploying them.
For this we can think about Donna Haraway’s notion of the subaltern*; we can think about again, Kimberlé Williams Crenshaw’s notion of intersectionality of oppression; we can think about the idea that this is not about some kind of Oppression Olympics, it’s about the idea that different Locuses [sic] of Power, different identities, different subjectivities, will be impressed upon and subjectified in different ways, depending upon the societies in which they live.
I usually ask these questions at the outset of my talks, but I feel like this is a good place for them. These foundational questions of things like, how do you travel home? When you travel home outside of a car, where are your keys? What do you do when a police officer pulls you over? What kinds of things about your body do you struggle with whether and when to tell a new romantic partner? If you are able to stand, for how long? How do you prepare your hair on any given morning? What strategies do you have for keeping yourself out of institutional mental care? Without looking, how many exits to the lobby are there, and how fast can you reach them, encountering the fewest people possible? What’s the highest you can reach, unassisted? What’s the best way to reject someone’s romantic advances such that it is less likely that they will physically assault you?
Each and every one of these questions represents a category of lived experience and a system of knowledge developed around a way of behaving in, interacting with, and predicting the world—developed around real, everyday lived experience of trying to survive and save one’s own life.
It matters who gets to know, to be known, and to translate their knowledge into technoscientific systems and devices.
Thank you.
[Begin Question and Answer Portion] 23:06
I have here at the end, quite a long list of references to all the things that I was talking about, in case you want to look them up. They are increasing every day.
Questions? Matt.
[Dr Matthew Brown] 23:30
So, I sort of feel like the initial framing of the talk and the end of the talk are kind of competing to interpret the examples in the middle. What I heard in the beginning was a kind of discussion about foreseeable harms and the failure to take foreseeable harms into account—I would use the language of, like, “moral recklessness” and “moral negligence” to think about that—and then I would interpret the examples as a sort of straightforward ethical failure, a failure to take the risk into account in an appropriate way. The end of the talk is sort of epistemologically framed, and it’s about situated knowledges, and the failure is, kind of, a failure of the knowledge system, right? Rather than being a sort of values- or ethics-oriented failure. So I was hoping that you could say something to kind of bring it together.
[DPW] 24:46
Not “rather than;” “because of.” The failure in the values and the moral system is because of the failure in knowledge, because of the failure in epistemological reckoning, and a recognition of the epistemological status of those who might otherwise have prevented these moral failures, or at least mitigated them. That, in and of itself, is also a moral failure. It’s a systemic moral failure. It’s a moral failure to recognize those people as holders of knowledge, as caretakers of expertise, to recognize what they have as expertise and systems of knowledge as such, because they are not presented in very specific, structured ways. And by that mechanism—by that metric—they are then discounted as potential sites and sources of knowledge. And in so doing, we lose access to the moral status, the values framework on which we might have made better choices.
[MB]
So foresight is the linking piece?
[DPW]
Yes. Yeah, Gordon.
[Prof. Gordon Hull] 25:57
Yes, there are two ways to construe the claim here, I think—I just want you to tell me if I’m out of it, or which one is more the focal point of the thing. One way is, we say, “Well, there’s a problem with the failure to include different kinds of people in these systems; so, if the training data included Black people, or took into account that there was over-policing, then it would do a lot better,” right? That’s one way. The other way—and I think this is probably it; I don’t remember the Simone Browne book that well—is that somehow the idea of quantification and statistics, itself, is in several ways tied to racism, regarding the slave ship manifests.
[DPW] 26:38
Yes. Yes.
[GH] 26:41
Am I reading you right that you sort of favor the Simone Browne Argument?
[DPW]
Yes. Moreover, it is clear that we can say, “Yes, we could do a better job of measuring and quantifying certain categories of people,” but it is also clear—and this is Browne’s argument, this is Safiya Noble’s, this is Nasma Ahmed’s and sarah aoun’s argument—that, when we have done so in the past, we have used it specifically to oppress and harm others.
Blood quantum for Native Americans; the, you know, measuring of breath and physiognomic capacity for African Americans. All of these have histories of very precise, inscribed, and careful measurement being used to make the lives of certain groups of people hell.
Yeah.
[Unknown Questioner] 27:36
So, following on Gordon’s question, I had a similar question in mind about which of these sorts of critiques you are looking at. Stuff like Joy Buolamwini’s work, for instance, says that technology needs to be more inclusive…
[DPW] 27:49
Yeah, I meant to include Joy’s work in here, because the Algorithm Justice Project [sic] is amazing, in terms of her remedy, here.
[UQ] 27:54
Yeah, it is, though it pitches more toward what Gordon framed as the first horn, right? Where you sort of say, “We need to make these predictions more accurate across the board,” rather than the latter critique that says, “This kind of predictive apparatus is going to reproduce the kinds of projects that we’re worried about, no matter what,” right? I also think both are really interesting critiques; I hold with you in thinking that the latter critique, the more radical critique, is more promising. But then my question is: given what I think you laid out as really useful sets of questions that can prime us to get outside of our normal, normative epistemological ways of solving these problems, what lesson do we take from that for further developing this radical critique? Do we take this as—like, obviously, this is not an argument for, like, more diverse tech teams, right? Like, that’s not going to do anything.
[DPW] 28:52
No, I meant to put up a picture of Stanford’s, like, AI ethics team that was, like, 120 white people.
[UQ] 29:01
Well, even, there are independent arguments for diversifying tech. But like, this is not one of them. So do we think of this as like, helping us articulate constraints that we should place on the kinds of predictions that we think these systems should be allowed to make, do we think of it as an argument for abolishing these kinds of systems entirely, and coming up with new ways of doing this kind of work? Like what’s the normative lesson?
[DPW] 29:26
My normative lesson is, “Heed Marginalized People.” Fundamentally and foundationally. And, like, don’t necessarily include them in your training data—include them in the questions that you ask at the outset, and in who you think to ask about what you ought to do. There isn’t going to be (and this has come up a couple of times in our previous conversations) a one-size-fits-all answer for what we do and when we do it in every single kind of case; there’s going to be a matrix [of] shifting, dynamic engagement of needs, stakeholders, and, ultimately, power dynamics that need to be redressed. And the only way that we’re going to be able to do that in a way that harkens a bit towards justice is going to be some way that allows us to say, “Okay, who have we not included?”
So, to ask that question: “Who have we not thought about? Whose harms, whose needs, whose voice has been, perhaps, speaking, but unheeded, for a very long time? And how do we ensure that the things that they have called out as potential sites of failure don’t go unremarked, don’t go unaddressed, for such a long time that we one day turn around and go, ‘Whoever could have thought that this camera or this facial recognition technology might in some several ways be racist?’”—except for all of the hundreds of people who told you that, for the past several decades.
I do recognize, by the way, that it’s really close up on our break time—it’s actually past the start of our break time, so if y’all want to get some coffee, I definitely understand that and I don’t hold that against you. But if you want to stay here and keep talking, I’m also willing to do that.
Josh.
[Joshua Earle] 31:10
So… are there systems, or, I guess, categories of systems or categories of things in this that we should just, as [unheard name] asked in his question, not do, or not try to do?
[DPW] 31:33
Yeah, I mean, there have been several instances in the very recent past where everybody looks at it and goes, “…Why would you do that? Like, why would you? Why would you try to make that technology? It would fundamentally be used for oppressive structures”—and facial recognition technology in the service of surveillance is one of them, primarily. So, Ruha Benjamin talked about this a couple of weeks ago at the Gender, Bodies, and Technology conference, down in Roanoke, and she was talking about this idea that, you know, looking at the prison industrial complex, looking at the carceral justice system, there are certain technologies that are only ever going to forward the oppressive aims of carceral justice, that are never going to help us to overturn that oppressive use of power, and that facial recognition for surveillance is primary among them.
So I think that, yeah, we can see instances in which things like that—like the facial recognition that purports to be able to tell you, you know, if somebody’s gay… When there are political systems and regimes in the world, right now, who want to be able to better identify gay people so that they can kill them… Why would you make that? Why would you even prove that concept? Or attempt or purport to prove that concept—because, by the way, their methodology is garbage, and they prove nothing; it proves nothing at all. Why would you even move towards something like it, though? Why would you give someone the tool to be able to say, “I have a[n] ‘objective’ mathematical system that proves that certain people are gay”?
[Prof. Chloé S. Georas(?)] 33:41
I think part of what’s interesting about the technologies based on racist biometric assumptions is that they are part of that long cultural history of criminal anthropology. It has no scientific basis—all of that has been debunked, historically—but now it’s re-emerging with this legitimacy, this sort of scientific aura, and being sold and used for “national security,” “border control,” etc. And it’s a reenactment of that history.
[DPW] 34:08
Yes. It’s one of the links that I put forward—and I think I put it in my reference slides—it’s just physiognomy, all over again; it’s physiognomy and phrenology, again. It’s this idea that there is a particular type of bodily metric that we can make “fit.” And that necessarily elides disabled bodies, it necessarily elides fat bodies, it necessarily elides any non-normative body that we want to say is not right—“you’re not the right kind of person,” quote, unquote. And that is exactly what’s being repurposed. But it’s being given this air, this kind of scientific veneer, again.
It’s being said, “Oh, you know, the math says it’s okay.” “The math” is just another system that we’ve built our biases into; you can make it do anything. You just have to know the system well enough to make it do anything. But that doesn’t mean it’s “objective.” It doesn’t mean it’s “bias free,” somehow. There’s no such thing.
[DPW] 35:19
Thank y’all. Really appreciate you being here.
[*This concept was actually first explored, in this way, by Gayatri Spivak, in her 1988 lecture, “Can the Subaltern Speak?”]
Until Next Time.