Complications with Algorithmic Systems

The COMPAS controversy demonstrates just how many different factors can complicate the design, use, assessment, and governance of algorithmic systems. Algorithms can be extraordinarily complex and can create surprising new forms of risk, bias, and harm (Venkatasubramanian, 2015). Here, we lay out how assessing fairness and bias is complicated by stakeholders who keep algorithms intentionally opaque amid calls for transparency. There is also a need for greater reflection on models of power and control, where the subordination of human decision-making to algorithms erodes trust in experts. Under these conditions, regulators and researchers are ill-equipped to audit algorithms or enforce regulation.

Fairness and Bias

Algorithms are often deployed with the goal of correcting a source of bias in decisions made by humans. However, many algorithmic systems either codify existing sources of bias or introduce new ones. Additionally, bias can exist in multiple places within one algorithm.

An algorithmic system can take on unintended values that compete with its designed values (Friedman et al., 2006). In the case of COMPAS, the algorithm delivered discriminatory results because of bias embedded in its training data. Because black people have historically been arrested at a higher rate than white people, COMPAS learned to predict that a black person is more at risk of being re-arrested than a white person. When implemented, the system reflects this learning back into the criminal justice system at scale, injecting a source of racial bias into steps of the judicial process that come after arrest.
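
A minimal sketch of this dynamic, using simulated data rather than the actual (proprietary) COMPAS model: if the training labels come from historically uneven arrest rates, even a simple rule that never sees race can end up flagging one group far more often, because the "prior arrests" feature already encodes past policing patterns.

```python
# Hypothetical illustration (simulated data, not the COMPAS model) of how a
# risk rule trained on uneven arrest records reproduces that unevenness.
# Group membership is never used as a feature; the bias enters through the
# recorded prior arrests, which reflect how heavily each group was policed.
import random

random.seed(0)

def simulate_person(group):
    # Assumption for illustration: group B has been policed more heavily,
    # so the same underlying behavior yields more recorded arrests.
    arrest_rate = 0.2 if group == "A" else 0.4
    priors = sum(random.random() < arrest_rate for _ in range(5))
    # The training label, "re-arrested," also reflects policing, not just behavior.
    rearrested = random.random() < arrest_rate
    return {"group": group, "priors": priors, "rearrested": rearrested}

people = [simulate_person("A") for _ in range(5000)] + \
         [simulate_person("B") for _ in range(5000)]

# A stand-in for a trained risk model: flag anyone with 2+ recorded priors.
def predict_high_risk(person):
    return person["priors"] >= 2

for group in ("A", "B"):
    members = [p for p in people if p["group"] == group]
    flagged = sum(predict_high_risk(p) for p in members) / len(members)
    print(f"Group {group}: flagged high risk {flagged:.0%}")
# Group B is flagged far more often, even though "group" was never an input.
```

The point is not that any particular threshold is wrong, but that once arrest records encode uneven policing, any model trained on them inherits that pattern.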

By transferring values from one particular political and cultural moment into a different context, algorithms create a certain moral rigidity. Unless they are consistently monitored and adjusted as time passes, they reinforce the values they were created with and can rapidly become outdated. For example, in the apportionment of healthcare, service delivery by insurance companies and hospitals depends on algorithmic decision-making, yet some doctors and caregivers disagree with the standardized treatment models because the underlying data are not robust enough to account for variables unavailable to the computer model, such as the unstable living conditions of patients in poverty.

Opacity and Transparency

Many algorithms cannot be scrutinized because the data, processes, or outcomes they rely on are kept behind closed doors. According to Jenna Burrell, this can happen for three reasons:

  • Intentional corporate or state secrecy, such as trade secrets;
  • Inadequate education on the part of auditors; or
  • Overwhelming complexity and scale on the part of the algorithmic system.

The more complex and sophisticated an algorithm is, the harder it is to explain, even for a knowledgeable algorithmic engineer.

Without some level of transparency, it is difficult to know whether an algorithm does what it says it does, whether it is fair, or whether its outcomes are reliable. For example, there is a clear-cut need for transparency around risk assessment tools like COMPAS, but that need runs up against trade secret protections. In some cases, transparency may also lead to groups and individuals “gaming the system.” For example, even the minimal openness surrounding how Twitter’s trending feature surfaces topics has allowed bots and coordinated groups of individuals to manipulate it into covering certain topics. Different contexts may therefore call for different levels of transparency.

Repurposing Data and Repurposing Algorithms

Algorithms are expensive and difficult to build from scratch. Hiring computer scientists, finding training data, specifying the algorithm’s features, testing, refining, and deploying a custom algorithm all cost time and money. There is therefore a temptation to take an algorithm that already exists and either modify it or use it for something it wasn’t designed to do. However, accountability and ethics are context-specific. Standards that were set and ethical issues that were dealt with in an algorithm’s original context may become problems in a new application.

PredPol, a predictive policing service, uses an algorithm designed to predict earthquakes to identify crime hotspots and assign police to them (Huet, 2015). Crime data isn’t the same as earthquake data, though, and civil rights organizations have criticized PredPol for using biased data to overpolice certain areas (Lartey, 2016). For a variety of reasons, crime data, especially arrest data, is racially biased, which affects any algorithm that uses it as training data.

This kind of repurposing also happens at an interpretive level, where the same data is interpreted to apply to a different context. For instance, credit history reports, which are designed to be evidence of financial responsibility, are often used as an input in hiring decisions, even though the connection between credit history and work capability is dubious at best. To deal with such algorithmic creep, we may need new, more cost-effective systems for creating algorithms, or more standards for evaluating when an algorithm can be successfully adapted from one application to another.

Lack of Standards for Auditing

In the financial sphere, independent auditing has been used since the 1970s to detect instances of discrimination. While independent auditing could likewise be used to detect bias in algorithmic systems, it remains underutilized because of a lack of industry standards or guidelines for assessing social impact. One set of standards proposed by the Association for Computing Machinery US Public Policy Council seeks to ensure that automated decision-making is held to the same standards as equivalent human decision-making (“Statement on Algorithmic Transparency and Accountability,” 2017). According to the ACM, these principles should be applied by algorithm designers at every stage of the creation process, putting the primary responsibility for their adoption in the hands of industry. Another set of guidelines, put forward by a coalition of industry and university researchers, advocates for social impact statements to accompany the sale and deployment of algorithmic products (Fairness, Accountability, and Transparency in Machine Learning, n.d.).

In the wake of the Facebook hearings, Russian disinformation campaigns, and the targeted harassment of civil rights organizers, civil society organizations such as Color of Change and Muslim Advocates are calling for independent audits of platforms and internet companies (Simpson, 2018). Data for Black Lives has called for a “data public trust,” asking Facebook to share anonymized datasets for the public good (Milner, 2018). Data for Black Lives is also drafting a data code of ethics that would focus on data protections and limit digital profiling. Facebook, by contrast, reacted to Cambridge Analytica by deleting pages and limiting access to data, which forecloses the possibility of outside review (Facebook, 2018). As a result, it is imperative to create an organizational structure for independent auditing that is open and accessible to researchers and organizations.

Power and Control

One of the primary decisions algorithms make is about relevance: what values, categories, and pieces of information are relevant to customers? Companies? States? Tarleton Gillespie (2014), a professor at Cornell University and principal researcher at Microsoft, observes that algorithms are treated as trusted, objective sources of information. However, their decisions about relevance are choices shaped by a political agenda, whether that agenda is explicit or invisible even to the algorithm’s own designers. This is especially important for algorithms that perform a gatekeeping role. Algorithms not only replicate social values but also embed them into systems, creating new standards and expectations for what is important in a given context. While there are laws prohibiting hospitals and banks from sharing or selling health and financial data, discrimination still occurs because there are few protections in place for consumer data brokering. There, discrete data points act as proxies for protected categories and are assembled into profiles that are then sold, which can lead to technological redlining.
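
A toy sketch (simulated data, not an actual broker’s profile schema) of how a seemingly neutral data point can stand in for a protected category:

```python
# Hypothetical illustration of proxy variables in consumer data brokering:
# a segment is defined using only ZIP code, yet because ZIP code is
# correlated with a protected category in the simulated population, the
# segment divides largely along protected lines.
import random

random.seed(1)

def simulate_profile():
    protected = random.random() < 0.5          # protected category (never sold)
    # Assumption for illustration: residential segregation makes ZIP code a
    # strong proxy; 80% of one group lives in ZIP "10001", 80% of the other
    # in ZIP "20002".
    if protected:
        zip_code = "10001" if random.random() < 0.8 else "20002"
    else:
        zip_code = "20002" if random.random() < 0.8 else "10001"
    return {"protected": protected, "zip": zip_code}

profiles = [simulate_profile() for _ in range(10000)]

# A broker's segment defined purely on ZIP code...
segment = [p for p in profiles if p["zip"] == "10001"]
share = sum(p["protected"] for p in segment) / len(segment)
print(f"Share of protected group in the 'ZIP 10001' segment: {share:.0%}")
# ...ends up roughly 80% protected-group members, so targeting or excluding
# the segment effectively targets or excludes the protected category.
```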

Trust and Expertise

Trust means different things in different disciplines, but one sociological perspective holds that trust is the belief that the necessary conditions for success are in place. Those who are pro-algorithm suggest that humans are too trusting of other humans and that some algorithms can outperform experts. Humans accept error in other humans but hold algorithms to a higher standard. In a series of studies conducted at the University of Chicago, researchers found that subjects’ likelihood of using output from an algorithm dropped significantly after they saw evidence that the algorithm could make errors, even when it was still more accurate than their own responses. From this point of view, humans’ lack of trust in algorithms is irrational. However, as Eubanks’s and Noble’s research shows, algorithms are just as capable of bias as humans, because they are embedded with subjective values.

Who is endowed with trust bears directly on where liability for decision-making should fall. One way of avoiding responsibility is to keep an air of mystery around who is ultimately accountable by leaving it unspecified. In the COMPAS case, it wasn’t clear who was liable for decisions, so no one was held accountable for bias in the system. However, liability can also be misassigned, creating a “moral crumple zone” in which one entity is held legally liable for errors even though it is not in full control of the system (Elish and Hwang, 2015). For example, airplane pilots are held liable for the behavior of planes, even though many decisions are regularly made by computerized systems. Determining who the trusted decision-maker is among algorithmic engineers, algorithms, and users requires careful consideration of what the algorithm claims to do and who suffers the consequences of mistakes. When an algorithm is making decisions, or helping an expert make decisions, it becomes unclear who is ultimately responsible for the effects of those decisions.