Equal Protection Under Algorithms: A New Statistical and Legal Framework
In this Article, we provide a new statistical and legal framework to understand the legality and fairness of predictive algorithms under the Equal Protection Clause. We begin by reviewing the main legal concerns regarding the use of protected characteristics such as race and the correlates of protected characteristics such as criminal history. The use of race and nonrace correlates in predictive algorithms generates direct and proxy effects of race, respectively, that can lead to racial disparities that many view as unwarranted and discriminatory. These effects have led to the mainstream legal consensus that the use of race and nonrace correlates in predictive algorithms is both problematic and potentially unconstitutional under the Equal Protection Clause. This mainstream position is also reflected in practice, with all commonly used predictive algorithms excluding race and many excluding nonrace correlates such as employment and education.
Next, we challenge the mainstream legal position that the use of a protected characteristic always violates the Equal Protection Clause. We develop a statistical framework that formalizes exactly how the direct and proxy effects of race can lead to algorithmic predictions that disadvantage minorities relative to nonminorities. While an overly formalistic solution requires exclusion of race and all potential nonrace correlates, we show that this type of algorithm is unlikely to work in practice because nearly all algorithmic inputs are correlated with race. We then present two simple statistical solutions that can eliminate the direct and proxy effects of race and that remain implementable even when all inputs are correlated with race. We argue that our proposed algorithms uphold the principles of the equal protection doctrine because they ensure that individuals are not treated differently on the basis of membership in a protected class, in stark contrast to commonly used algorithms that unfairly disadvantage minorities despite the exclusion of race.
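To make the notion of a proxy effect concrete, the following is a minimal sketch on synthetic data. It is an illustration only: the variable names (`group`, `x`, `y`), the coefficients, and the particular adjustment shown (estimating with the protected characteristic included, then predicting with it held at its sample mean) are all assumptions made for this example, and this abstract does not say whether that adjustment corresponds to either of the two solutions proposed in the Article.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical synthetic data; every name and coefficient here is an assumption.
group = rng.integers(0, 2, size=n).astype(float)  # 1 = protected group, 0 = otherwise
x = 1.0 * group + rng.normal(size=n)              # nonrace input correlated with group
y = 0.5 * x + 1.0 * group + rng.normal(size=n)    # outcome being predicted


def ols(columns, target):
    """Return least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(target)), *columns])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def group_gap(pred):
    """Difference in mean prediction between the two groups."""
    return pred[group == 1].mean() - pred[group == 0].mean()


# (a) Group excluded: the coefficient on x absorbs part of the group
#     effect, because x is correlated with group (a proxy effect).
a0, a_x = ols([x], y)
pred_excluded = a0 + a_x * x

# (b) Group included when estimating, then held at its sample mean when
#     predicting, so no individual's own group membership enters the prediction.
b0, b_x, b_g = ols([x, group], y)
pred_adjusted = b0 + b_x * x + b_g * group.mean()

print("coefficient on x, group excluded:", a_x)
print("coefficient on x, group included:", b_x)
print("group gap in predictions, group excluded:", group_gap(pred_excluded))
print("group gap in predictions, adjusted:", group_gap(pred_adjusted))
```

In this simulation the coefficient on `x` is larger when `group` is left out of the regression, since `x` then stands in for the omitted characteristic. The adjusted predictions still differ across groups because `x` itself differs across groups, but the portion of the gap attributable to `x` acting as a proxy is removed.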
We conclude by empirically testing our proposed algorithms in the context of the New York City pretrial system. We show that nearly all commonly used algorithms violate certain principles underlying the Equal Protection Clause by including variables that are correlated with race, generating substantial proxy effects that unfairly disadvantage Black individuals relative to white individuals. Both of our proposed algorithms substantially reduce the number of Black defendants detained compared to commonly used algorithms by eliminating these proxy effects. These findings call for a fundamental rethinking of the equal protection doctrine as it applies to predictive algorithms and underscore the folly of relying on commonly used algorithms.