What Is an Algorithm?

An algorithm is a set of instructions for how a computer should accomplish a particular task. Algorithms are used by many organizations to make decisions and allocate resources based on large datasets. Algorithms are most often compared to recipes, which take a specific set of ingredients and transform them through a series of explainable steps into a predictable output. Combining calculation, processing, and reasoning, algorithms can be exceptionally complex, encoding thousands of variables across millions of data points. Critically, there are few consumer or civil rights protections that limit the types of data used to build data profiles or that require the auditing of algorithmic decision-making. Standards and enforcement for fairness, accountability, and transparency are long overdue for algorithms that allocate housing, healthcare, hiring, banking, social services, and the delivery of goods and services (Eubanks, 2018). Algorithmic accountability is the process of assigning responsibility for harm when algorithmic decision-making results in discriminatory and inequitable outcomes.

How Are Algorithms Used to Make Decisions?

Algorithmic decision-making is becoming more common every day. Increasingly, important decisions that affect people’s lives are governed by datasets too large for any individual to process. People have become accustomed to algorithms making all manner of recommendations, from products to buy, to songs to listen to, to social network connections. But algorithms are not just making recommendations; they are also being used to make major decisions about people’s lives. Among many applications, algorithms are used to:

  • Organize social media feeds;
  • Display ads;
  • Sort résumés for job applications;
  • Allocate social services;
  • Decide who sees advertisements for open positions, housing, and products;
  • Decide who should be promoted or fired;
  • Estimate a person’s risk of committing crimes or the length of a prison term;
  • Assess and allocate insurance and benefits;
  • Determine who can obtain credit and on what terms; and
  • Rank and curate news and information in search engines.

While algorithmic decision-making can offer benefits in terms of speed, efficiency, and even fairness, there is a common misconception that algorithms automatically produce unbiased decisions. Algorithms may appear to be unbiased calculations because they take in objective points of reference and return a standard outcome, but many problems remain with both those inputs and the outputs. As Frank Pasquale, law professor at the University of Maryland, points out, algorithmic decision-making is “black boxed”: while we may know what goes into the computer for processing and what the outcome is, there are currently no external auditing systems or regulations for assessing what happens to the data during processing (Pasquale, 2015).

Algorithms are attractive because they promise neutrality in decision-making: they take in data and deliver results. But algorithms are not “neutral.” In the words of mathematician Cathy O’Neil, an algorithm is an “opinion embedded in mathematics” (O’Neil, 2016). And like opinions, algorithms differ; some privilege one group of people over another. O’Neil argues that across a range of occupations, human decision makers are being encouraged to defer to software systems even when there is evidence that a system is making incorrect, unjust, or harmful decisions.

When an algorithm’s output results in unfairness, we refer to it as bias. Bias can find its way into an algorithm in many ways. It can arise from the social context in which an algorithm is created, from technical constraints, or from the way the algorithm is used in practice (Friedman and Nissenbaum, 1996). When an algorithm is being created, it is structured by the values of its designer, which might not be neutral. And after an algorithm is created, it must be trained—fed large amounts of data on past decisions—to teach it how to make future decisions. If that training data is itself biased, the algorithm can inherit that bias. For these reasons and others, decisions made by computers are not fundamentally more logical or unbiased than decisions made by people.
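To make the training-data problem concrete, here is a minimal sketch, using entirely made-up hiring data, of how a system trained to mimic past decisions reproduces the bias in those decisions. The group names, hiring rates, and lookup-table “model” are illustrative assumptions, not any real hiring algorithm.

```python
# A toy illustration (hypothetical data) of how a model trained on biased
# past decisions reproduces that bias.
import random

random.seed(0)

def historical_decision(group, qualified):
    """Simulated past human decisions: equally qualified candidates in
    group_b were hired far less often than those in group_a."""
    if not qualified:
        return random.random() < 0.05
    return random.random() < (0.80 if group == "group_a" else 0.40)

# "Training data": records of past hiring decisions.
train = [(g, q, historical_decision(g, q))
         for g in ("group_a", "group_b")
         for q in (True, False)
         for _ in range(5000)]

# A naive "model" that learns the historical hire rate for each
# (group, qualified) combination and predicts hire when that rate > 0.5.
rates = {}
for g, q, hired in train:
    n, k = rates.get((g, q), (0, 0))
    rates[(g, q)] = (n + 1, k + hired)
model = {key: (k / n) > 0.5 for key, (n, k) in rates.items()}

# The learned rule hires qualified group_a candidates but rejects equally
# qualified group_b candidates: the historical bias is now automated.
for g in ("group_a", "group_b"):
    print(g, "qualified candidate hired?", model[(g, True)])
```

Note that nothing in this sketch refers to a protected attribute explicitly; the disparity enters entirely through the historical outcomes the system is asked to imitate.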

Black-boxed algorithms can unfairly limit opportunities, restrict services, and even produce “technological redlining.” As Safiya Noble, professor of communication at the University of Southern California, writes, technological redlining occurs when algorithms produce inequitable outcomes and replicate known inequalities, leading to the systematic exclusion of Blacks, Latinos, and Native Americans (Noble, 2018). Technological redlining occurs because we have no control over how data is used to profile us: if bias exists in the data, it is replicated in the outcome. Without enforceable mechanisms of transparency, auditing, and accountability, little can be known about how algorithmic decision-making limits or impedes civil rights. Noble writes,

technological redlining is a form of digital data discrimination, which uses our digital identities and activities to bolster inequality and oppression. It is often enacted without our knowledge, through our digital engagements, which become part of algorithmic, automated, and artificially intelligent sorting mechanisms that can either target or exclude us. It is a fundamental dimension of generating, sustaining, or deepening racial, ethnic, and gender discrimination, and it is centrally tied to the distribution of goods and services in society, like education, housing, and other human and civil rights. Technological redlining is closely tied to longstanding practices of ‘redlining,’ which have been consistently defined as illegal by the United States Congress, but which are increasingly elusive because of their digital deployments through online, internet-based software and platforms, including exclusion from, and control over, individual participation and representation in digital systems.[1]

Important examples of technological redlining were uncovered by ProPublica, which showed how Facebook’s targeted advertising system allowed for discrimination by race and age (Angwin and Tobin, 2017; Angwin et al., 2017). These decisions, embedded in design, have significant ramifications for those who are already marginalized.

[1] Noble wrote this definition of “technological redlining” specifically for this publication.

Example: Racial Bias in Algorithms of Incarceration

One of the most important examples of algorithmic bias comes from the justice system, where an algorithmic risk-assessment system has contributed to stricter jail sentences for black defendants. For decades, the company Northpointe has developed algorithmic systems to make recommendations within the justice system. One such system is the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), which is used across the country to assess the risk of recidivism for defendants in pretrial hearings. The system draws on numerous data points, such as answers to questions about whether a defendant’s parents had separated and how many of their friends had been arrested, to make sentencing recommendations to judges. The goal of the system is to help protect public safety while also eliminating the possible bias of human judges (Christin et al., 2015).

While the exact details of how COMPAS computes scores are proprietary, the system has been built and tested across several dimensions by Northpointe’s own team of computer scientists (Brennan et al., 2007; Brennan et al., 2009) and externally validated by researchers at Florida State University (Blomberg et al., 2010). These analyses consistently showed that the system met a commonly accepted definition of fairness within the field of statistics (Chouldechova, 2016): for defendants of different races, it correctly predicted recidivism at about the same rate (Brennan et al., 2009; Blomberg et al., 2010).

In 2016, however, ProPublica, a nonprofit news organization known for its investigative journalism, ran an analysis of how the system was being used in Broward County, Florida (Angwin et al., 2016). Their analysis revealed that even though the system predicted recidivism equally well for white and black defendants, it made different kinds of systematic mistakes for the two populations. The system was more likely to mistakenly flag black defendants as high risk, and more likely to mistakenly label white defendants as low risk.
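The disagreement can be made concrete with a small, hypothetical calculation. The sketch below uses invented confusion-matrix counts for two groups (not ProPublica’s or Northpointe’s actual figures) to show how a risk score can satisfy one common fairness measure, predictive parity, while failing another, equal false positive rates.

```python
# Hypothetical confusion-matrix counts for two groups; "high risk" is the
# positive prediction, "reoffended" is the true outcome. The numbers are
# illustrative only.
from dataclasses import dataclass

@dataclass
class Group:
    tp: int  # labeled high risk and did reoffend
    fp: int  # labeled high risk but did not reoffend
    fn: int  # labeled low risk but did reoffend
    tn: int  # labeled low risk and did not reoffend

    def ppv(self):
        """Predictive parity: of those labeled high risk, how many reoffended?"""
        return self.tp / (self.tp + self.fp)

    def fpr(self):
        """Error-rate balance: how many non-reoffenders were labeled high risk?"""
        return self.fp / (self.fp + self.tn)

group_a = Group(tp=300, fp=200, fn=200, tn=800)   # higher underlying reoffense rate
group_b = Group(tp=150, fp=100, fn=250, tn=1000)  # lower underlying reoffense rate

for name, g in [("group_a", group_a), ("group_b", group_b)]:
    print(f"{name}: PPV={g.ppv():.2f}  FPR={g.fpr():.2f}")
```

In this made-up example, both groups share the same positive predictive value (0.60), so the score “predicts recidivism equally well” for each, yet group_a’s false positive rate (0.20) is roughly twice group_b’s (0.09). The same tool passes one fairness test and fails the other, which mirrors the structure of the COMPAS dispute.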

This meant that black defendants who would never go on to recidivate were being treated more harshly by the law, while white defendants who would go on to commit more crimes were being treated more leniently. To ProPublica, this was clear evidence of algorithmic bias (Angwin, 2016). Northpointe’s response was to reassert the statistical merit of the COMPAS system. In the end, there were no public announcements made about changes to the COMPAS system, and it continues to be widely used within courts. The COMPAS conflict hinges on two key factors: there are no standard definitions for algorithmic bias, and there is no mechanism for holding stakeholders accountable.
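Part of the reason no single standard has emerged is mathematical, not just political. A compact way to see this, following the analysis cited above (Chouldechova, 2016), is an identity that binds together a binary risk score’s false positive rate (FPR), its positive predictive value (PPV, the measure Northpointe emphasized), its false negative rate (FNR), and a group’s underlying rate of recidivism p:

```latex
% Identity relating error rates for a binary risk score (Chouldechova, 2016);
% p is a group's underlying rate of recidivism.
\mathrm{FPR} \;=\; \frac{p}{1-p}\cdot\frac{1-\mathrm{PPV}}{\mathrm{PPV}}\cdot\bigl(1-\mathrm{FNR}\bigr)
```

When two groups have different underlying rates of recidivism, a score that equalizes PPV and FNR across them cannot also equalize FPR. In other words, the fairness test Northpointe applied and the one ProPublica applied could not both be passed at once, which is exactly why an agreed-upon public definition matters.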

Northpointe and ProPublica both agreed that COMPAS should meet some definition of racial fairness, but they did not agree on what that definition should be. Because there was no public standard, Northpointe was free to create its own definition of fairness. When a challenge was made, Northpointe was not accountable to any particular set of values. Because of this lack of governance around algorithmic risk assessment tools, the courts that continue to use the COMPAS system are not accountable either. Recently, the New York City Council passed a bill to establish a process for auditing the selection, use, and implementation of algorithms used by the city that directly affect people’s lives (“The New York City Council – File #: Int 1696-2017”, 2017). The bill highlights the need for assessment of disproportionate impacts across protected categories as well as a procedure for redress if harms are found.