1 Introduction

Computational law and personalized law currently receive a lot of attention from the scientific community (Busch and Franceschi 2020; Deakin and Markou 2020; Verstein 2019; Wagner and Eidenmüller 2019). This research focuses on the ability to encode law into automatically accessible formats and on the effects of such attempts on the legal system and on society at large. While different terms have been used (computable law, technological management, encoded law, legal tech, legal AI; see e.g., Brownsword 2019; Cobbe 2020; Hildebrandt 2015, 2020), one main interest of social science research in this field has been to understand what will occur if law can be interpreted and accessed by computers. In this article, we use the terminology developed in Guitton et al. (2022), referring to automatically processable regulation (APR), as it captures intentional implementations of software and hardware to regulate for a specific purpose. We reuse the same definition of APR, namely as “pieces of regulation that are possibly interconnected semantically and are expressed in a form that makes them accessible to be processed automatically, typically by a computer system that executes a specified algorithm, i.e., a finite series of unambiguous and uncontradictory computer-implementable instructions to solve a specific set of computable problems”.

APR is by no means welcomed by everyone with open arms; we have seen pushbacks against such implementations by policymakers and academics alike. For instance, Brownsword (2019) describes resistance in France against a proposed (and passed) 2019 reform of the Justice Act that banned the evaluation, analysis, comparison, and even prediction of judicial practices. Likewise, academics have pointed out that we need to address and debate questions of legitimacy and appropriateness of APR. Moses (2020) for instance points out that “not every tool that can do something should be used in all circumstances” (p. 216) and Hildebrandt (2018) highlights the necessity of arguing about the application of the law, as it would be a grave mistake to conflate a “mathematical simulation of a legal judgment” with a legal judgment itself (p. 23).

These arguments brought forward by legal scholars and social scientists are key as they point towards the larger question that surrounds the field: Which issues does APR raise? In this article, we address this fundamental question by discussing the attributes of the law that are being challenged through the rise of APR. While a review of this many issues could give readers the sense that we oppose APR, this is not the case; on the contrary, greater clarity on the array of challenges that APR brings is essential to bring about responsible solutions as well as suitable frameworks when implementing APR. To structure these issues, we make use of the typology proposed by Guitton et al. (2022). This typology relies on three dimensions. The first one is the technology supporting the automation of regulation overall. Here, different degrees of human mediation of the technology are distinguished, with no human mediation (i.e., full automation) and full human mediation at either end of the spectrum (see Footnote 1). This dimension is important insofar as APR does not inherently and necessarily lead to removing humans from the loop entirely; rather, this depends on the specific application context. Second, the typology describes potential divergence of interests between actors, i.e., the ones sponsoring a project on automatically accessible regulation (e.g., a ministry), the ones implementing the project (e.g., a government body), and the ones using and hopefully benefitting from it (e.g., citizens). Among these actors, different conflicts of interest can emerge that impede the implementation of APR (e.g., a government body not being willing to implement a project by a ministry), thereby becoming a source of issues. The third dimension distinguishes between different primary aims of automatically accessible regulation, and divides them into two groups: those aiming primarily for accessibility and those aiming primarily for efficiency gains.
Accessibility means rendering legislation more accessible to laypeople as well as practitioners, potentially enabling individuals to query the law without prior knowledge of it. Efficiency means making legal practice and jurisprudence faster and less demanding of (human) time and effort.

We have structured the conceptual part of this article by assigning each issue to one of the three dimensions (Sects. 2–4), depending on which dimension it correlates with the most. Within this structure, we order the issues by their prominence in the literature. To improve readability, we have relegated to “Appendix 1” our argument of why the extent of each issue depends more on the dimension under which it is subsumed than on any other. In Sect. 5 of the article, we discuss the need for a public debate over the properties of APR. From Sect. 6 of the article onwards, we turn to empirical data and show that all three dimensions matter, not only conceptually and qualitatively but also regarding real-life implemented APR projects, and especially that the third dimension has an important impact on the number of issues an application of APR triggers. We discuss these findings at the very end of the article and highlight their implications.

2 How the Degree of Mediation by Computers Impacts Key Attributes of the Law when Law Becomes Encoded

For the purpose of our text, the list of issues correlating with this dimension can be further divided into two sub-dimensions by looking at the source of the issue. On the one hand are issues that relate to the properties of the law that are difficult or even impossible to be encoded (such as issues of representativeness, evolution of the law and code, and legal singularity); these are discussed in Sect. 2.1. On the other hand are issues that relate to the properties of the automation and encoding itself (losing agency, dealing with non-human speed, replacement or displacement of the workforce by APR, transparency of encoding and reasoning); these are discussed in Sect. 2.2.

2.1 Issues Resulting from the Attributes of the Law and Its (In)encodability

2.1.1 Representativeness and Evolutiveness of APR: Capturing Moving Targets of the Law

Legal norms do not always provide inherently clear-cut rules, nor are they interpreted in a vacuum. Both the use of vague terms (such as “reasonableness” or “good faith”) and the evolution of our understanding of law, which reflects changing times, changing interpretations of social norms, and changing needs (in particular through case law), are a feature of law and not a bug (Hildebrandt 2020). Losing this flexibility of the law to account for “moving targets” is an issue that has been raised in the context of APR (Cobbe 2020; Hildebrandt 2015, 2020; Moses 2020).

The first point on vagueness highlights a core difficulty that technical research on implementing APR has faced. In contrast to program code, law is inherently imprecise, as codifying public and social opinions is not trivial (Kroll 2017; Buchholtz 2019) and because law often reflects a political compromise among different stakeholders. Neither the law nor APR is neutral. Rather, law is “contingent on the circumstances of the time” and “imbued with normative assumptions and priorities” (Cobbe 2020, p. 111); and APR is the product of “designers’ understanding of goals for society” (Cobbe 2020, p. 111). It is thus not surprising that legal norms are drafted broadly, i.e., in general terms that make them open to interpretation (Verheij 2020). Such interpretation requires complementary knowledge: not only legal knowhow, but also, depending on the legislation at hand, technical expertise (e.g., in data protection law, see Tamò-Larrieux et al. 2021). Encoding such knowledge would require either to “foresee all potential scenarios and develop sub-rules that hopefully cover all future interactions” (Hildebrandt 2020, p. 71), which seems neither feasible nor sensible, or to develop APR that is able to extract the context and adapt decisions accordingly (Moens 2006; Niklaus et al. 2021; Walton et al. 2021). Research on the latter approach has undeniably provided some promising results on capturing contexts (Tang and Clematide 2021) and on capturing the evolution of the interpretation of the law (Walton et al. 2021). Yet, it remains questionable whether the extent to which we capture the social interpretation and reflexivity of the law (Cobbe 2020), tied to the evolution of the interpretation of the law, is sufficient to parallel non-encoded law.
The social context of law is especially important when thinking about the role of judges, who balance various (human) needs against each other when reaching a decision, as the interpretation of legislation is inherently a “social product” (Buchholtz 2019, p. 183). Moreover, the application of the law also has a reflexive element, meaning that the prosecution of the law affects, alters, and produces the very fabric of society itself (Cobbe 2020), a point that tends to be neglected by research on capturing contexts.

Another topic that falls within the category of representativeness but is more closely tied to the issue of evolutiveness is the question of how APR is able to represent changes in statutes (here assuming that the legal process still issues law in a text-based format). In such instances, and to remain up to date, APR would need to be designed in a manner that keeps track of changes (incl. case law) and adapts to the new regulatory environment. This may, of all the sub-issues within representativeness, be the one where a technical solution could help alleviate concerns the most.
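One way to picture the tracking of statutory changes described above is to keep every enacted version of a rule together with its date of entry into force, rather than overwriting the previous one. The following is a minimal sketch under stated assumptions: the class names, the source references, and the numeric threshold are all hypothetical and do not correspond to any real statute or APR system.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RuleVersion:
    effective_from: date  # first day on which this version applies
    source: str           # hypothetical reference to the enacting or amending text
    threshold: int        # hypothetical parameter fixed by the rule


class VersionedRule:
    """Stores all versions of a rule so that past cases use the law of their time."""

    def __init__(self):
        self._versions = []

    def enact(self, version):
        # Keep versions ordered by their date of entry into force.
        self._versions.append(version)
        self._versions.sort(key=lambda v: v.effective_from)

    def version_on(self, day):
        # Find the latest version whose effective date is on or before `day`.
        dates = [v.effective_from for v in self._versions]
        index = bisect_right(dates, day)
        if index == 0:
            raise LookupError(f"no version of the rule was in force on {day}")
        return self._versions[index - 1]


rule = VersionedRule()
rule.enact(RuleVersion(date(2020, 1, 1), "Hypothetical Act 2019, s 4", 1000))
rule.enact(RuleVersion(date(2022, 7, 1), "Hypothetical Amendment 2022, s 2", 1200))

print(rule.version_on(date(2021, 5, 1)).source)
```

Such a structure only addresses textual amendments with a clear effective date; shifts in interpretation through case law, as discussed above, would not be captured by it.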

Lastly, proponents of APR may at times underplay the inherently challenging task of establishing the interdisciplinary discourse needed to implement APR. In fact, creating and deploying APR requires resource-intensive, interdisciplinary discourse, in particular between legal and technical disciplines, which is difficult (Lutz and Tamò 2015). Root causes of this challenge include differing perceptions, expectations, and languages used across disciplines (which might nevertheless share common assumptions), as well as the difficulty each discipline has in understanding other fields’ rationality and logic (see on APR e.g., Waddington 2019). Illustrating this challenge is the encoding process of the Rates Rebate Act and of the Holidays Act in New Zealand, where interdisciplinary teams were created to translate the law first into pseudocode and then into actual program code (LabPlus 2018). While the New Zealand example might be taken as a successful example of such an interdisciplinary collaboration, projects may not live up to expectations because of a lack of individuals who bridge the disciplines, or because bureaucratic barriers prevent this step (e.g., in Canada, part of a project was discontinued because no technical expert could work with legislative drafters without a security clearance; as such technical experts would have had to come from private companies, no security clearance could be provided; see Footnote 2).
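To give a sense of what the step from pseudocode to program code can look like, the sketch below encodes an invented rebate rule. It is an illustration only: every constant, field name, and calculation step is an assumption made for this example and is not taken from the Rates Rebate Act, the Holidays Act, or the LabPlus project.

```python
from dataclasses import dataclass

# All figures are invented for illustration; they are NOT values from any statute.
MAX_REBATE = 600.0
BASE_AMOUNT = 200.0
INCOME_THRESHOLD = 25_000.0
ALLOWANCE_PER_DEPENDANT = 500.0


@dataclass(frozen=True)
class Application:
    annual_rates: float
    household_income: float
    dependants: int


def rebate(app):
    """Compute a hypothetical rebate, one commented step per drafted clause."""
    # Clause (a): the income threshold rises with each dependant.
    threshold = INCOME_THRESHOLD + ALLOWANCE_PER_DEPENDANT * app.dependants
    # Clause (b): half of the rates above a base amount are eligible.
    rates_component = max(0.0, app.annual_rates - BASE_AMOUNT) * 0.5
    # Clause (c): the rebate shrinks by one tenth of income above the threshold.
    income_reduction = max(0.0, app.household_income - threshold) / 10
    # Clause (d): the rebate is never negative and never exceeds the maximum.
    return min(MAX_REBATE, max(0.0, rates_component - income_reduction))


print(rebate(Application(annual_rates=1000.0, household_income=27_000.0, dependants=2)))
```

Even in this toy case, each line embeds an interpretive choice (for instance, whether the threshold is applied before or after the cap), which is where the legal and technical disciplines need to agree.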

2.1.2 Balanced APR: Singularity Versus Legal Ideals

Tied to the discussed issues with representativeness of APR is the question of how to balance and translate different features of, and interests in, the law into program code. The associated issue here is the fear of legal singularity (i.e., the prevalence of only one right interpretation of the law) and the freezing of the law (i.e., repeating past applications of the law without taking into account the aforementioned social context; see Footnote 3). Both give rise to competing aims: the quest for legal certainty versus the ideas of fairness, contextuality, and overall justice. Legal certainty means being able to reasonably foresee the legal effects of certain actions and thus to plan accordingly. Yet, legal certainty should not be understood so strictly as to lead to a static legal environment; the ideals of fairness and justice must also be given weight.

Legal philosophers have extensively written on the role of the law, and the balance among different, potentially conflicting functionalities of the legal system. Important writings come from Radbruch (2006), who after the Second World War shifted away from a purely positivist approach towards law (Hart 1961 [2012]) to consider the role of the notion of justice (Moore 2020; Bix 2011). Radbruch states (Paulson 1994, p. 317): “Where there is not even an attempt at justice, where equality, the core of justice, is deliberately betrayed in the issuance of positive law, then the statute is not merely ‘false law’, it completely lacks the very nature of law. For law, including positive law, cannot be otherwise defined as a system and an institution whose very meaning is to serve justice”. Without wanting to open a discussion on the distinction between descriptive and normative aspects of the law (see here Shapiro 2011), it is important to highlight the rich legal discourse delving into these properties of legal systems (Hildebrandt 2015; Moore 2020; Radbruch 2006; Waldron 2001). Interestingly, the perception on how to balance such attributes against each other is also tied to the historical environment in which these discussions took place, namely under the fascist regime where legal certainty had a much higher prominence than any ideas of justice (Bix 2011). Thus, the interplay of justice, fairness, suitability for a purpose, and legal certainty are in constant negotiation with each other, and prevalence should be given depending on a given context.

2.2 Issues Resulting from the Attributes of Code and Its Impact on the Legal System, Individuals, and Society

2.2.1 Losing Agency Because of APR: Rule of Algorithms

Notwithstanding how representative and balanced APR is, the encoding of the law—especially when human involvement in the development and application process is highly disintermediated—is accompanied by the fear of algocracy (Brownsword 2019; Cobbe 2020; Diver 2020). Algocracy stands for the rule of code where either the implemented system’s outcome takes precedence over any human’s outcome, or where, in the long run, it impedes human comprehension and involvement when the algorithms (encoding and reasoning) are non-transparent, thereby dismantling public participation in the legal process. The issue, in other words, is that APR would give priority to “instrumental and procedural virtues but sacrifices human control and comprehension” (Danaher 2016). The fear of algocracy is therefore closely tied to the attribute of transparency and the notion that, to be subject to the law, we must be able to comprehend its scope (see the need for debate in Sect. 5 on accessibility of the law).

Algocracy raises the question of when and to what extent citizens should be left with the possibility of disobedience. There are many conceivable scenarios in which citizens disagree with implementations, outcomes, or other aspects of APR. Outside the digital realm, being able to conscientiously refuse to comply (civil disobedience) has evolved as a key tenet of functioning democracies. As Thoreau (1849 [2021]), a seminal author on the topic, put forward, when faced with a conflict with their own morals, people should prioritise their own conscience over complying with the law. Martin Luther King later echoed these words (and referenced Thoreau) when he said in a speech on the matter that “noncooperation with evil is as much a moral obligation as is cooperation with good” (King 1957). Should APR still leave people enough agency to choose whether to do so? And if so, how should this be regulated? Would it contravene certain goals of APR, for instance when it comes to forcing compliance, which in turn results in a loss of agency for whoever comes under the dictate of APR (Tamò-Larrieux et al. 2021)?

Very quickly in such a debate would come the need to define the scope of disobedience or civil disobedience. John Rawls notably defines it as a “public, nonviolent, conscientious yet political act contrary to the law usually done with the aim of bringing about a change in the law or policies of the government” (Rawls 1991). And yet, this might not be fitting. Thoreau’s own example of disobedience was not public, as might be the case for many others who refuse to comply with APR to reflect their own moral preferences. Similarly, does it need to be non-violent, or rather non-coercive? Does the legitimacy of one's own questioning of the law matter and, if so, under which criteria? The lack of publicness, and of a relation to violence and coercion, of acts performed by APR might warrant a different label. Would, for instance, a discussion of “conscientious objection” be more fitting, a term that was widely used when the US involvement in the Vietnam War was being contested (Cohen 1968)? Making assumptions explicit would be beneficial to the society where this debate takes place.

Lastly, the rule of algorithms challenges the rule of law (Leenes 2011), for instance in the context of assisting judges while keeping algorithms as “black boxes” (Greenstein 2021). While laws and judgments are public for all to see, an important pillar of a public system (Shapiro 2011), decisions to interpret the law in one way or another, as well as decisions to encode this interpretation in one way or another, may not necessarily be public. We treat the problem that algorithms and outputs assisting judges are kept opaque separately (in Sect. 3.2, “Transparency and Oversight of the Outcome of APR”), as it emphasizes a different viewpoint from the control-orientation of the current section.

2.2.2 Dealing with the Non-human-Paced Application of APR

The application of the law by humans is time-consuming. With APR, the letter of the law and its enforcement collapse (Hildebrandt and Koops 2010), and the buffer that is naturally present in non-computer-mediated applications of the law disappears. This has consequences on the individual and the societal level. From an individual perspective, the human touch of the application of the law (the ideal of being heard and judged by another human who can show emotions and empathy) gets lost. And yet, we need empathy in legal proceedings, which the “digital administrative state” threatens (Ranchordàs 2022). This is true as much for behind-the-counter processing of benefit applications as for more complex situations heard in courts. Courts in particular are not only a place for judicial outcomes; they can be the theatre of complex psychological developments during which victims have the possibility to revisit facts and make them public by confronting their alleged perpetrator, both of which can help victims make sense of traumatising, at times life-changing, events (Bandes 2000, 2009). Outcomes can serve as a way to obtain closure, even when the ruling does not favour the victim (see for instance Bandes 2009; Hulst et al. 2017); the wait for the process to be closed can be excruciating but also prepares the victim mentally for a specific point in time when they can expect a final decision (Bandes 2000). If this process is collapsed into a few seconds, it is uncertain how human psychology would react when denied this instrument, which raises the question of how and where people will look for closure. Could it lead to them delegitimizing courts as a collective result of feeling alienated? Could it lead to forms of social instability as a response to less trust in institutions, here considering the court system as one such institution? And could it lead to political responses that ultimately weaken the rule of law?
Lower credibility of courts weakens enforcement and societal structure, and is linked with higher levels of criminality, violence, and chaos (Besley and Persson 2011; Soares and Naritomi 2010). For the moment, early research has pointed out that, all else equal, those in the courtroom perceived a human judge as acting more fairly than a robot judge (Chen et al. 2021), which suggests that individuals may find closure more easily with a human judge. Other research on how automated decision-making is perceived adds a further nuance by positing that such perceptions vary depending on the type of decision at hand, with emotional responses differing between mechanical tasks and tasks that involve balancing and emotional elements (Lee 2018). The aforementioned questions, as well as the early answers from this research, point to the need for additional focus on the topic.

Similarly, also on the psychological side of the consequences of APR, fast-paced applications may induce anxiety, especially in the elderly (Ranchordàs and Scarcella 2021). Groups of people who may appreciate striking up a conversation with someone over the counter might feel marginalized by the introduction of faster-paced decision-making (i.e., higher efficiency from another viewpoint). With increasing mediation by computers, we remove this human touch and introduce instead certain negative emotions that differ from those felt when dealing with bureaucratic institutions in person, such as being confronted with one's inability to use the most modern technology and the fear (anxiety) of not being able to complete the required query (be it filing taxes, requesting benefits, etc.) because of a lack of skills. And again, the mediation by computers misses a crucial aspect that should be present especially in administrative law: showing empathy for individual situations and forgiving mistakes (Ranchordàs 2022).

2.2.3 Replaced by APR: Changes in the Workforce

The aforementioned issues point towards a change of gatekeepers to the law, from members of parliament having to legislate with software engineers to judges having to comprehend software (or hardware) specificities (Tamò-Larrieux et al. 2021). This points to the overarching issue of who will be replaced by APR and how it might displace workers. At the time of writing, two scenarios seem worth examining: one where, rather than replacing people, new forms of collaboration and new positions, such as ‘tech-legislators’ (i.e., policymakers combining legal and technical expertise), will be created; and another where more routine tasks currently performed by staff within the different branches of government or legal fields will be replaced by APR (Susskind and Susskind 2015). Automating the application process for public services (such as any type of benefit) might change the current workflows and render (some) current positions redundant (Suksi 2020). But it will also create demand for other types of jobs to maintain such systems, feed data into them, oversee their output, and more. Similar to the digitalization of other branches of society, such as industrial work, there is therefore a more general question regarding the extent to which jobs will simply disappear versus how many jobs will be created at the same time (typically, however, not requiring the same set of skills as the jobs that are no longer needed), or how the interaction between APR and the current workforce handling legal issues will co-evolve (Malone 2018). From a societal point of view and at the macro level, displacement of jobs would appear, a priori, to raise less morally weighty questions than complete removal of jobs. A priori only, because new jobs are likely to raise questions already seen in recent years with the digital economy regarding the social status, protection, and adequate compensation of workers (Bucher et al. 2021; Halvorsen et al. 2021; Hanschke and Hanschke 2021).

2.2.4 Transparency of APR: From the Encoding to Its Reasoning

The transparency of APR spans at least four dimensions: the actual encoding of a regulation, the explainability of how a regulation is being encoded (the reasoning behind the encoding process), the explainability and understanding of the output generated by APR, and the overarching challenge of overseeing and auditing the outcomes of APR (see Sect. 3.2 on “Transparency and Oversight” regarding the latter two). The first two dimensions depend largely on the complexity of the law and the ability to accurately express it in specific sections of program code while keeping that code maintainable (especially in light of the many interdependencies within legal code), while the latter two depend much more on the stakeholders involved in the push for APR and the governance around it.

Necessarily, the issues linked to the transparency of the encoding, the reasoning behind the encoding and its explainability, and the understanding of the output are closely linked. To better understand the arising issues, we can draw from a rich literature that has focused on the role of transparency of algorithmic decision-making and in particular on explainability of such systems (see e.g., Lipton 2016; Selbst and Barocas 2018; Wachter and Mittelstadt 2019; Wachter et al. 2017, 2018; Zerilli et al. 2019). This literature highlights the changes in decision-making processes—without focusing on the legal domain (with some exceptions, see e.g. Esposito 2021)—and how computer programs with low degrees of mediation by humans (e.g., data-driven unsupervised learning systems) challenge the notion of transparency. Technical remedies to address the lack of transparency have been proposed (for instance with Catala, Merigoux et al. 2021), yet not necessarily focused on the field of APR. Even if more transparent APRs exist, transparency should not be reduced to its informational dimension, which presupposes that individuals will understand information and thereby control (i.e., decide) how to interact with automated decision-making systems (Felzmann et al. 2019) or APR. Such a narrow focus would neglect the broader, societal implications of automated decision-making systems and APR in particular. One such important implication is tied to the challenge of oversight and the decision power attached to it. This explains calls that “technology should be brought into the public sphere where it increasingly belongs” (Feenberg 2009, p. 149; Sadowski et al. 
2021), pointing to the need for a public debate not only on specific, discrete instances of how to interpret parts of legislation for its encodability, but rather on the overall way of dealing with interpretability issues as well as how to communicate automated legal decision-making and argumentation (Esposito 2021), on deciding requirements to make technical choices public, and possibly on the extent to which legal professionals as much as laypeople should be able to understand technical implementations of rules. This last point is especially crucial (and reminiscent of Sect. 2.1 on interdisciplinarity and of the need for a debate on the state’s role in making the law accessible): If legal professionals and laypeople do not have the technical skills to understand “technology [that is] brought into the public sphere,” then it substantially reduces the impact of making APR transparent in the first place.

3 How the Potential for Divergence of Interests Impacts APR Implementations

We continue our review of issues by moving to those primarily correlating with the dimension of potential for divergence of interests, a dimension looking at how sponsors, implementers, users, and beneficiaries might not always have aligned goals and priorities. This has consequences for the type of issues it spurs.

3.1 Affordability of Law and Usability of APR

Many APR projects (see Sect. 6 on the empirical analysis) aim at making the law more accessible to laypeople overall. Projects that fall within this aim include systems that are trained to find conflicts between laws (e.g., Regulatory evaluation platform; McNaughton 2020), provide answers to citizens on their eligibility for various schemes (e.g., Mes Aides, Rates Rebate, ElectKB; see respectively Alauzen 2021; LabPlus 2018; Mowbray et al. 2020), help provide the beginning of an answer to a legal question by retrieving relevant legal information, or challenge enforcements of the law (e.g., DoNotPay; Whittaker 2020). These systems have the potential to make access to the law more affordable, mostly by giving laypeople a starting point for answering a legal question and thereby helping them better assess whether they need to pay an (expensive) professional. This could tilt the balance of power towards laypeople. However, this is only true insofar as the cost of such systems remains below that of consulting a legal professional. The tools provided by state institutions have so far been released without requiring upfront payment from users, but this is not the case for tools developed by private companies. These private projects raise the question of how affordable we as a society want to make access to the law, especially if the most affordable and easiest way to access the law is via APR (see Sect. 5). Should the state regulate (through subsidies, price ceilings, or other means) without stymieing innovation? Or will private companies come up with different business models (such as licensing)? These discussions need to be held in the public realm, as we note the ambivalence of tools that are aimed at making law more accessible yet are too expensive to be accessible themselves, and posit that the potential contributions of APR have to be delicately balanced against the cost issues and professional displacements that such implementations trigger.

Related to the issue of affordability is that of the usability of APR. APR needs to be “easy to use”, or at least usable for laypeople, if the goal is to render access to the law, and especially to legal resources and recourse mechanisms, more open and readily available. As the law and case law can be complex, software may quickly overwhelm (lay) users if the implementers do not design it carefully.

3.2 Transparency and Oversight of the Outcome of APR

Many criticisms have already been formulated regarding the use by judges in the US of software to help them decide whether a person will be a recidivist: the Correctional Offender Management Profiling for Alternative Sanctions, abbreviated COMPAS (see for instance Van Den Hoven 2021). One of the issues is that, while the judge will make the outcome of the trial public, the output of the software partially supporting and potentially influencing this decision is not. As a consequence, the person affected by the decision is denied a chance to fully comprehend what is happening (which matters for finding closure, if we are to keep courts as they are currently set up) and to contest the algorithm’s output and the decision based thereon. There is a noteworthy parallel here with speed cameras, which also produce an outcome from an opaque process: the internal technical specificities of speed cameras are often not public but, importantly, are regulated, for instance with regard to a certain tolerance applied to the speed they register. In many cases, the recipient of a ticket from a speed camera might still not accept it, and there is a possibility of contesting it. Yet, unlike with COMPAS, the outcome is known, the process is regulated, and there is a way to contest it. This suggests that COMPAS’s lack of transparency in the outcome is not specific to the technical specifications of the algorithm; rather, it is specific to how processes are set up around the use of the algorithm.
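The contrast drawn with speed cameras can be made concrete with a small sketch of a decision record that exposes the regulated tolerance, the resulting outcome, and a route for contesting it, while saying nothing about the device's internals. The tolerance value, field names, and contest channel below are assumptions chosen for illustration and do not describe any real enforcement system.

```python
from dataclasses import dataclass

# Hypothetical regulated tolerance, in km/h, deducted from every measurement.
TOLERANCE_KMH = 5


@dataclass(frozen=True)
class Decision:
    measured_kmh: int     # raw value registered by the device
    limit_kmh: int        # speed limit that applied at the location
    tolerance_kmh: int    # regulated tolerance disclosed to the recipient
    adjusted_kmh: int     # measurement after deducting the tolerance
    ticket: bool          # outcome communicated to the person concerned
    contest_channel: str  # hypothetical route for challenging the outcome


def assess(measured_kmh, limit_kmh):
    """Apply the disclosed tolerance and return a record the recipient can inspect."""
    adjusted = measured_kmh - TOLERANCE_KMH
    return Decision(
        measured_kmh=measured_kmh,
        limit_kmh=limit_kmh,
        tolerance_kmh=TOLERANCE_KMH,
        adjusted_kmh=adjusted,
        ticket=adjusted > limit_kmh,
        contest_channel="hypothetical appeals office",
    )


print(assess(measured_kmh=56, limit_kmh=50))
```

The point of the sketch is procedural rather than technical: the record makes the outcome and the rule applied to it known and contestable even though how the device measures speed stays undisclosed.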

Similarly, central issues arising from the deployment of APR are: What will the process to oversee equivalences look like? How public will it be? And who becomes the appropriate authority to verify the equivalences of the software and its output, with the choice of candidates ranging from entities housed in parliaments, administrations, executive branches, courts, and others, potentially at times competing with their own interpretation? Oversight over how the law is being applied is already a complex undertaking today, and the introduction of APR will add a layer of complexity to the expertise required to conduct due process over the application, as it will require the combination of legal and technical expertise (Suksi 2020). While technical proposals to enable such dual oversight exist, ultimately it is primarily a question of setting up clear, transparent, and fair processes, and only secondarily a question of the technical expertise needed to evaluate APR (prominently meaning: how to conduct such an evaluation, which requires the combination of legal and technical knowhow). The problem is hence threefold: one of bureaucratic turf wars and competing interests in choosing an arbiter of truth in the blame game guaranteed to follow upon mishaps and contestation (of either encoding or outcome; see below on that too); one of making the agreed-upon process as transparent as possible and communicating it; and one of skills, as current policymakers and administrators in institutions in charge of applying the law are not equipped to oversee or audit APR implementations.

In light of this tripartite challenge, a change in the gatekeepers to the law is to be expected. Current gatekeepers, such as legislators, lawyers, and judges, will be replaced, or their role and work will change, as technical expertise becomes more important and is thus intermediated through software developers. Such changes are bound to result in conflicts of interest. In the extreme, the change in gatekeepers could limit the actual participation of politicians, state officials, and citizens in legislative and judicial processes.

3.3 Responsibility for APR: Who to Blame?

We can distinguish between three broad types of mistakes happening with APR: mistakes related to the encoding of APR per se, to the underlying input data feeding the algorithm, or to the overall implementation (e.g., a vulnerability allowing a confidentiality breach that potentially results in personal data leaks). On top of that, there could be an undesired output (e.g., one that is morally or socially not acceptable), in the sense that the system performed according to specifications but an unforeseen exception occurred (see the previous issue on requiring oversight to catch such issues, Sect. 3.2). Of these three, only two are specific to APR: encoding and overall implementation. A judge, for instance, might as well rely on the wrong facts and issue a “wrong” ruling later overturned by an appeal court. In the case of APR, the lines might become even more blurry than usual, mainly because the responsibility for APR is tied to the lack of established due process to oversee the creation and application of APR as well as to the changes in decision-making powers. Especially in the midst of changing gatekeepers, it is highly unclear who to hold accountable for justifying a mistake that has occurred, who to blame for it, and who to require to correct it. It may be that a certain portion of code has to be amended, that biases in the underlying data feeding into the algorithm have to be redressed, that an overall IT implementation has to be overhauled to account for an error (ranging from security, to configuration, to process flow more generally), or that an outcome has to be retracted.
Regarding the last point on outcome, we contend that mistakes will have serious consequences even in projects seeking to make the law more accessible, similarly to implementations of APR within a decision-making process. For instance, wrong information on eligibility could lead citizens not to file a claim and thus to miss out on opportunities, thereby defeating the original goal of the project. In contrast to the creation and application of the law, established procedures ensuring that checks and balances are upheld in these four different areas are lacking here.

An example that can be briefly mentioned here is a scandal at the UK Post Office, which relied on a private company to develop a fraud detection scheme. Based on this system, 39 workers were convicted of, and jailed for, theft in attempts to defraud the state, as their declared closing accounts did not match what the software was indicating. In the first instance, where it was assumed that the software could not possibly be faulty, many of the accused were convicted, only for their convictions to be overturned, with the judge declaring that “[t]he failures of investigation and disclosure were in our judgment so egregious as to make the prosecution of any of the ‘Horizon cases’ an affront to the conscience of the court” (Siddique and Quinn 2011). This example raises not only the question of who should be held responsible but also that of how responsibility should be distributed among different parties: the sponsor of the project, namely the UK Post Office, the private company providing the software (Horizon), or the allegedly corrupt underlying data (Siddique and Quinn 2011)?

4 How the Aim of APR Impacts the Final Implementations and Effects on Individuals and Society

4.1 Simulation Versus Real-World Impact: What is APR’s Current Impact?

Another issue that arises from the perspective of a user is understanding when APR can be (legally) relied upon and when systems are basically just simulations. In fact, many current APR projects (e.g., DoNotPay in the UK, Mes Aides in France, Rates Rebate in New Zealand, Overtime Regulation in Canada, cf. Guitton et al. 2022) are merely simulations whose output is by no means binding (in fact, in the case of DoNotPay, the system is not even sponsored by the state, and the case of Mes Aides is complicated in this respect; Guitton et al. 2022). For any system, especially when sponsored by the state, there is a risk that citizens will mistake the system for the real one processing their application (when in fact they still need to file their application through another channel), or that they will mistakenly believe that those taking the decision will be using the same system. Often, in the aforementioned projects, those within the decision-making process are in fact relying on completely different systems, with the simulation and reality having little in common (for Mes Aides, see Alauzen 2021). While some have feared the replacement of the workforce because of APR, state officials have rather lamented the opposite: the double fear of an increased workload resulting from more applications because of the easy, affordable access to tools that empower individuals to claim their rights, and of more tiring arguments with disappointed, rejected applicants who strongly believed in a positive outcome following the legal simulation and who struggle to see why there is a gap between the two systems. Arguably, there is also a risk of a psychological toll on both sides: for the overworked staff processing claims and for applicants who are on an emotional rollercoaster.
Furthermore, again especially for projects with an accessibility aim, the critique of relying on legal advice provided by an APR comes with a nuance, namely that APR might present law as simpler than it really is, “leading to less precise advice and potentially inaccurate legal positions”, and consequently potentially “widening the gap between access to legal advice enjoyed by high-income and by low-income individuals” (Blank and Osofsky 2020).

4.2 Contestability of APR: Turning Encoding and Decisions

It needs to be possible to contest APR both on the level of encoding and on the level of the decisions (simulated or not) reached by a system. Therefore, the issue of contestability is necessarily closely linked to the aforementioned ones of transparency (of the code and of the outcome) and responsibility (again, for encoding and for outcomes). Added to this is the perception of correctness: the feeling of being wronged by the algorithm and that this wrong interpretation took precedence over anything else, in other words, the issue of algocracy.

At the same time though, contestability goes beyond transparency, as it includes a justificatory element and due process in which the code and its outcome are being scrutinized. It also differs from responsibility insofar as it tries to apply a redress to the wrong, whereas responsibility seeks to lay the blame (at a stretch, so that the wrong does not come up again). This aligns with arguments made by scholars outside of the field of APR who more broadly tackle issues arising from algorithmic decision-making systems and who have posited a need to be able to justify algorithmic processes and outcomes (Malgieri 2021; Wachter and Mittelstadt 2019). In the context of APR in particular, though, contestability becomes a central issue (Diver 2020; Hildebrandt 2015, p. 218), as the common procedural ways to contest a law and its interpretation cannot be employed (mainly due to the technical expertise needed to understand the APR’s code). This in turn challenges our understanding of the rule of law (Greenstein 2021; Hildebrandt 2020), and links yet again to another issue already thematised, the one of oversight.

The issue of contestability might not be equally prevalent in each implementation of APR. In fact, the argument can be made that the actors involved in issuing the APR and the target group of the APR will influence the severity of potential outcomes and thus the need to contest APR decisions as well as the extent of the redress. For instance, in the context of the state taking a decision in an automated form, the stakes are rather high due to states' coercive capabilities. Decisions can lead to financial support (e.g., with social benefits) or losses (e.g., with fines) for individuals, or in more extreme (futurist) cases to the deprivation of certain liberties. The key question hence becomes: Which authorities could defend, justify or, if needed, retract decisions? Would the appeal processes already in place be sufficient? In administrations, would a human public servant (be it a social worker or a judge) be expected to work backwards from the output to find arguments in support of it, in order to decide whether to leave it as it is or to retract it? Or, if no human is expected to carry out this reverse engineering work, would there be an independent appeal process to contest the output? But if that is the case, it is likely that anyone facing an outcome negative to their own interest will appeal, in which case will the system truly be more efficient? Or will the system merely result in all the appeal cases for retraction being reconsidered from the beginning by a judge and/or public servant, thereby making any use of APR in the first place close to useless? As for other issues, many questions remain open and will depend on many factors, from the project type and the governance processes around it to technical choices.

5 The Need for a Public Debate Over the Properties of APR and Its Impact on the Legal System, Individuals, and Society

The three remaining points differ slightly from the list above in that they are not issues on their own—they become problematic as a challenge to our democratic institutions if a decision has been made without public consultation (Sadowski et al. 2021). At worst, implementers would have taken decisions without understanding of the different possible paths, leaving it as implicit that there is an agreement around the point when in fact, there has not been a public debate.

5.1 On the Role of the State to Foster Accessibility

In order for everyone to be able to follow the law, the law should be accessible to everyone. However, history shows that accessibility has become a cornerstone of what we perceive law should be only in more recent times. In Roman times, the oral tradition of the law meant that only a few had knowledge of the law. In the Middle Ages, kings sought to leverage the lack of literacy by focusing on written text as a basis for the law; kings sought “to control the local normative order” (Herzog 2018, p. 246), arguably to minimise the impression of arbitrariness. The French Revolution changed the way we understand what accessibility to law means, by moving away from having judges and jurists determine the scope of the law as they see fit, to having parliaments act as guardians of the law and create new laws. This change in who was in charge of the law (at least its enactment), coupled with higher literacy rates and technical breakthroughs such as the printing press, meant that law became in theory overall more accessible for citizens. However, making law and case law public (typically in text form) does not mean that law is accessible per se. The lack of systematic publication of court cases or other legal documents can still make it hard nowadays for citizens to actually have access to the law, even if such documents should already be public and obtainable.

The UK, for instance, a common law country in which judicial decisions bear much more weight than in civil law ones, did not until very recently (April 2022) have a system in place that systematically published all judgments (The National Archives 2022). But obtaining comprehensive data on case law in bulk, albeit again data that is supposedly already public, remains one step further, and APR unavoidably relies on the availability of this bulk data. That many public servants still under-appreciate how APR could play a role in making law more accessible (Aidinlis et al. Forthcoming) may explain the slow uptake of change in that direction, although the new policy may be indicative of changing winds too.

The history sketched above leads to an interesting question: What should the role of the state be in making law accessible? That an academic field of “legal design” has emerged around this very question reflects its timeliness and that much more research will probably be published in the near future (Doherty 2022). While this question was relevant for text-based law and case law, it has become highly pertinent for APR. APR does not change only the gatekeepers to the law (as much as it did in the past), but it also goes to the core of the social contract between the state and its residents. And a clear definition of this social contract—whether it includes this view of the state as championing access to techno-legal education as a basic necessity for good citizenry, and which values are fundamental—should come as a result of a democratic debate, in line with the political system in which this debate would be inscribed (Pollicino and Gregorio 2021).

From the point of view of a country’s residents, on the other hand, the benefits of better access to the law are obvious: it reduces the need to consult legal professionals and allows for more independent thinking. From the legislator’s point of view, one major benefit of rendering law more accessible is that the law is more likely to achieve its intended aim (e.g., ensuring compliance, deterring illegal behaviors, giving citizens access to governmental services). For instance, in the case of any legislation giving social benefit rights to residents, the laws will have carved out exceptions and rules; legislators would want to see those applied as they voted them to be. By excluding parts of the population from their rights, especially when this is due to difficulties in accessing and understanding the law, the state strays from its mandate to live by the rule of law that enables and constrains its actions. It prevents the legislators’ will, as expressed in voted laws, from being fully respected.

5.2 Efficiency of the Law: A (Dangerous) Misconception?

Two other questions that still need to be openly debated are: Is a higher efficiency of processes genuinely needed? And if the answer is yes, will a higher efficiency really be achieved through APR (or other forms of technological solutions), or are there alternatives yielding better results? Asking these questions should bring to the forefront trade-offs that will have to be made and which would otherwise be made only implicitly. Also, the quest for a more efficient legal system comes with many promises that in practice fail to materialize (Guitton et al. 2022). For instance, taking human “sentiments” and “emotions” out of judgments is unlikely to lead to less biased outcomes, especially not when the data used to train APR is based on that very “sentimental” and “emotional” data in the first place (Sankin et al. 2021). The promise of minimizing the costs of legal processes is likewise a dangerous misconception, as many failures of implemented APR projects have shown (Henriques-Gomes 2020; The Economist 2021). One may argue that those failed promises are not surprising, as techno-solutionism itself has been shown to fail to address complex issues adequately (Carr 2013).

Even though we are aware of these misconceptions, APR projects pushing for more efficiency in the legal system seem to neglect the need for a thorough public debate on the subject matter which takes into account both the gains and the problems of the automatization process. Such a debate should turn to questions such as: What are the costs of turning regulation into APR, and what are the economic, financial, political, and societal benefits of doing so? Merely asking these questions may lead to the realisation that there is a dearth of research on frameworks to answer them and may spur further incentives to advance such frameworks. Without a proper idea of risks and benefits, it will be difficult to have an evidence-based dialogue on some of these questions. Underpinning these questions are also assumptions that need to be made more explicit: What is an acceptable level of backlogs and waiting time for trial for courts? How much investment is the state capable of and willing to undertake to maintain this acceptable level? The same questions apply to the legislating process (as it will likely be lengthened) and to public services. And to all three: Is there a willingness to trade efficiency for anything else that could get lost on the way? It may be that no threshold can be set, but at least the public would be relying on a shared set of explicit assumptions rather than on ones buried deep in academic debate.

6 Empirical Analysis, Arguments, and Discussion

As a second step after compiling this comprehensive list, we sought to look at how these issues appear in actual implementations of APR. There were four possible ways of doing this empirical analysis, each with its own drawback. First, we could have tried to transcribe what we observed for each project and whether the issues had been raised. However, this would have involved many assumptions and approximations where we did not have the information at hand. Second, we could have asked project members to fill out surveys on the issues. However, there would have been reporting biases, with the main concern that participants could misunderstand survey questions and mis-report. Third, we could have opted for a normative approach, in line with the concepts from above, and looked at whether the issue should have come up. However, this would not have helped to make any kind of argument as to the current state of issues being raised by projects currently being implemented. Lastly, and this is the approach that we chose to follow, we described low- to high-risk criteria that approximate whether the issue occurred throughout the observed implementations. While we acknowledge that the risks would more appropriately be represented as a spectrum, our analysis restricts itself to criteria indicating a high risk (annotated with 1) or a low risk (annotated with 0). When difficulties arose with respect to the accuracy of what could be observed (by reading about the APR implementations, reporting thereof, conducted interviews), we adapted the level of confidence accordingly (see “Appendix 3” for the confidence levels). We evaluated whether each project was exposed to the risk of the issue coming up, hence giving an overview of current risks being raised by actual projects, while still providing a descriptive and empirical complement to the normative and conceptual discussion of issues from above. Each data point carried a weight of only 0.48% in the overall analysis.

We use the 12 categories of issues discussed above, spanning subsets totaling 21 issues, as the basis for an empirical analysis (the category of issues corresponding to a need to debate would have been too normative for us to code and we therefore left it out). We considered the 10 real-life implementations in Guitton et al. (2022), and for each implementation and each issue (turned into a risk) we asked: Was the project exposed to this risk factor? (For more specific questions asked per issue, see “Appendix 3”).

Through these 210 generated data points, it became possible to conduct statistical analyses to see how such theoretical considerations have found their way so far into actual implementations of APR. The binary coding of whether individual issues were raised per case is available in “Appendix 3”. We acknowledge that when attempting to capture contexts, certain nuances were still left out. To reduce the discrepancy between the “reality” we are trying to encode and personal biases, we conducted independent risk assessments for all projects before discussing each discrepancy to iron them out, as much as possible.
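
The arithmetic behind this coding scheme can be made concrete with a minimal Python sketch. It assumes a binary matrix with one row per project and one column per issue. The function names and the two toy coding sheets are hypothetical and serve only as illustration; they do not reproduce the coding reported in “Appendix 3”.

```python
# Illustrative sketch only: toy values, not the coding reported in Appendix 3.

N_PROJECTS = 10  # real-life implementations considered
N_ISSUES = 21    # issues, each turned into a risk factor


def data_point_weight(n_projects: int, n_issues: int) -> float:
    """Share of the whole analysis carried by a single binary data point."""
    return 1 / (n_projects * n_issues)


def issues_per_project(coding: list[list[int]]) -> list[int]:
    """Count the high-risk entries (1) in each project's row."""
    return [sum(row) for row in coding]


def disagreements(coder_a, coder_b):
    """List (project, issue) cells where two independent assessments differ."""
    return [
        (p, i)
        for p, (row_a, row_b) in enumerate(zip(coder_a, coder_b))
        for i, (a, b) in enumerate(zip(row_a, row_b))
        if a != b
    ]


# Hypothetical assessments by two coders for two projects and three issues.
coder_a = [[1, 0, 1], [0, 0, 1]]
coder_b = [[1, 1, 1], [0, 0, 1]]

print(f"{data_point_weight(N_PROJECTS, N_ISSUES):.2%}")  # weight of one data point
print(issues_per_project(coder_a))                       # issue count per project
print(disagreements(coder_a, coder_b))                   # cells to discuss and reconcile
```

With 10 projects and 21 issues, a single cell accounts for 1/210 of the data, which is why reconciling a handful of disputed cells changes the overall picture only marginally.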

We carried out two statistical tests, which give us supporting evidence for two different claims. Firstly, we look at the difference between the average number of issues (x̄) triggered by projects geared towards efficiency and by those geared towards accessibility, the first dimension of the typology from Guitton et al. (2022). Here, we derived the following hypotheses:

$$H_{0}: \bar{x}_{\text{efficiency}} \le \bar{x}_{\text{accessibility}}$$
$$H_{a}: \bar{x}_{\text{efficiency}} > \bar{x}_{\text{accessibility}}$$

Given the small sample size (< 30), we conduct a one-tailed t-test and compare its result tstat to a critical value tcritical,α at a significance level α. For the significance level α = 0.05 and df = N − 1 = 4 degrees of freedom, we have:

$$t_{\text{critical},\alpha} = 2.13$$

We obtain the following results for tstat:

Table 1 Input and result of t-stat

As the tstat is well above tcritical,α, we can reject the null hypothesis and conclude that the number of issues for efficiency projects is significantly (p < 0.01) higher than the number of issues for accessibility projects (Table 1).
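
The decision rule used above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the text does not specify the exact variant of the t-test, so the code assumes an unpooled (Welch-style) standard error and takes the critical value 2.13 directly from the text. The issue counts are hypothetical and are not the values behind Table 1.

```python
from math import sqrt
from statistics import mean, variance


def one_tailed_t(efficiency, accessibility, t_critical):
    """Test H_a: mean(efficiency) > mean(accessibility).

    Returns the t statistic and whether H_0 is rejected at the
    significance level that t_critical corresponds to.
    """
    n_e, n_a = len(efficiency), len(accessibility)
    # Assumption: unpooled standard error built from the sample variances.
    std_err = sqrt(variance(efficiency) / n_e + variance(accessibility) / n_a)
    t_stat = (mean(efficiency) - mean(accessibility)) / std_err
    return t_stat, t_stat > t_critical


# Hypothetical numbers of issues per project (not the data behind Table 1).
efficiency_counts = [14, 12, 15, 13, 16]
accessibility_counts = [6, 8, 5, 7, 9]

# Critical value for alpha = 0.05 and df = 4, as given in the text.
t_stat, reject_h0 = one_tailed_t(efficiency_counts, accessibility_counts, 2.13)
print(t_stat, reject_h0)
```

Because the test is one-tailed, only a large positive t statistic counts as evidence against the null hypothesis; a large negative value would not lead to its rejection.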

Secondly, we conduct a linear regression with the number of issues as the dependent variable and the position of the project in the typology as the independent variable (measured as the Euclidean distance from the origin along the three dimensions: type of project, potential for divergence of interests, and degree of computer mediation). This is especially relevant as neither the design of the typology dimensions nor the scoring of projects in the typology was intended to correlate in any way with a measure of a project’s riskiness. From a research design perspective, the results of the linear regression are therefore meaningful (Fig. 1).

Fig. 1 Trend line of the number of issues with the Euclidean distance of the three dimensions from the typology

The results of the linear regression are in Table 2.

Table 2 Results from the linear regression

The R² indicates how good a fit the regression is, and the value here is high. We interpret this as our second conclusion: the three dimensions of the typology, taken together as expressed by the Euclidean distance, are a good proxy of the number of issues in a project. Also, the p-value of the linear regression indicates a highly significant relation between the independent variable and the dependent one.
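
The regression set-up described above can be sketched in a few lines of Python. The sketch assumes that each project is scored on the three typology dimensions, that the predictor is the Euclidean distance of these scores from the origin, and that the line is fitted by ordinary least squares. The scores and issue counts below are hypothetical and do not reproduce Table 2 or Fig. 1.

```python
from math import sqrt


def euclidean_distance(scores):
    """Distance from the origin for a tuple of dimension scores."""
    return sqrt(sum(s * s for s in scores))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical scores: (project type, divergence of interests, computer mediation).
typology_scores = [(0, 1, 1), (1, 1, 2), (1, 2, 2), (2, 2, 3), (3, 3, 3)]
issue_counts = [4, 7, 9, 13, 16]  # hypothetical number of issues per project

distances = [euclidean_distance(s) for s in typology_scores]
slope, intercept, r_squared = fit_line(distances, issue_counts)
print(slope, intercept, r_squared)
```

Collapsing the three dimensions into one distance keeps the model at a single predictor, which suits a small sample, at the cost of no longer showing which dimension contributes most.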

7 Conclusion

In this comprehensive review of issues that APR triggers, we have highlighted their varied facets, ranging from direct societal implications to those with more nuanced effects on our democratic systems. Much further debate in various public arenas will still be required to reduce many risks.

So far, we have seen that APR, first, could face issues of being representative, in the sense of attempting to define otherwise intentionally vague terms, of possibly not capturing the evolution of either social norms or of the statutes, and of bringing together inter-disciplinary groups (lawyers and software engineers). Second, implementers of APR might give more weight to legal certainty than is warranted, at the cost of certain higher-level legal ideals. Third, depending on the specific implementation, APR could remove agency even though it might be very well warranted in some (or many) situations that humans retain much control; balancing the two is a risky act. Fourth, the psychological impact of having decisions made automatically rather than at much slower manual speed is most likely under-appreciated today. Fifth, the transition to APR, were it to happen as widely as many predict, will impact people’s jobs, with some being required to re-train to fit new job descriptions, and certain categories of jobs disappearing entirely (also causing psychological pressure on individuals). Sixth, the transparency of the encoding of the reasoning is one issue that often finds prominence, and correctly so, in the work of many commentators and scholars, who equate it with “black boxes”. Seventh is a dilemma about the extent to which for-profit companies can genuinely make the law more accessible by providing legal services more cheaply than a professional would. Eighth, similarly to affordability, if the user experience is not at the forefront of usability designs for accessibility-oriented APR to solve the mismatch between the complexity of the law and laypeople’s lack of prior knowledge, then this might undermine the endeavour to make the law more accessible. Ninth, the “black box” issue of transparency comes with an extra nuance, namely that the outcome of an APR may not be known, and that the auditing process trying to counterbalance this very issue might be just as unclear.
Tenth, in a changing world of gatekeepers and amidst a web of coveted interests, attributing responsibility is hard, be it for an incorrect encoding of the law, a bad selection of the underlying data feeding the algorithm, the implementation, or the output. Eleventh, because the changes have only come in progressively, many projects have been presented to the public, possibly confusingly, only as a simulation instead of giving access to the actual decision-making algorithm. Twelfth and last, when a mistake arises, there needs to be a process to contest and reverse decisions. Many of these twelve issues are, once again, risks. They do not have to materialise; whether they do will depend on implementations and on how much careful thought those sponsoring and implementing projects have given to them.

For anyone looking into managing the implementation of an APR system, keeping in mind all of these risks is also in and of itself a challenge. But this article has also shown, qualitatively and quantitatively, how the typology dimensions were relevant in supporting such an assessment. From a qualitative review, we were able to ascertain that all issues are related to at least one dimension more strongly than any other, with all three dimensions playing a role in at least one issue—hence, showing the relevancy of using the typology to analyse issues that APR spans. From our quantitative review, and going much beyond only the relevancy of the dimensions, we went on to make two arguments. First, that efficiency projects trigger a higher number of issues than those aiming at accessibility. And second, when looking at the number of issues, the project type, potential for divergence of interests, and degree of mediation by computers taken all together are a highly relevant predictor of the risk of a project developing the discussed issues.

Two avenues for future research are apparent from this. In our empirical analysis, we looked only at the number of issues, as we felt that assessing the extent of issues would have relied too much on judgment and would not have been replicable. With time, as more projects are implemented and more issues come about, it may become more realistic to assess the extent of such issues and what triggered them. A second area for future research is how to prevent most of these issues from arising. To keep the scope of the paper tight, we have refrained from veering into principles and policy-making recommendations. But the comprehensive review can be leveraged to further build up such a framework, ideally in close discussion with policymakers involved in this field.