Our previous issue ended with a strong case for the incoherence of the concept of free will, leaving only one cogent argument in its favor. That argument posits a split between the world of science and the world of Shakespeare, and claims that the human person belongs to the latter, where free will is axiomatically presupposed.
This means that before we can speak of an AI having free will, the AI must be capable of the human reactive attitudes—the love of a Cordelia, the resentment of an Iago. And that is a tall order. No one seriously thinks any extant AI system can get sad, jealous, or hopeful.
But I've now encountered a different take on free will that relaxes this requirement. This take, the compatibilist approach, explains why free will makes sense even if our actions are determined by prior conditions of biology and environment. If we are persuaded by the compatibilist arguments, which we look at next, the case for AI having free will becomes easier. A free-willed AI need no longer be, à la Shakespeare, a credible actor of the human condition.
If we ever thereby accept some AI system as having free will, implications abound. How do we hold a crime-committing AI responsible? What would jail time, or even the death penalty, look like for an AI? We'll explore that too.
Sapolsky's Determined
Dr. Robert Sapolsky, a professor of biology at Stanford, started questioning free will as a teenager. His doorstopper of a book, Determined, reads like an opposition file he's been building ever since. He and his book were the December 2023 cover story of Stanford Magazine. Reading it, my hackles rose. I angrily e-mailed my ex, a fellow Stanford alum, expressing my indignation.
Dr. Sapolsky presents numerous studies showing that our actions are explained by preconditions of biology, environment, and society. He claims there is no space in our body for a 'ghost in the machine' to reside, and so no seat for free agency to take. But I had a strong intuition that this argument was inadequate.
Models of phenomena (science) cannot by themselves answer ‘What's out there?’ (metaphysics), ‘How do we know?’ (epistemology), and ‘What ought we do?’ (morality). Free will is similarly in part a philosophical question, not merely a scientific one. Readers with less exposure to philosophy might find Sapolsky's argument persuasive, with possible deleterious effect. I was roused from my ‘dogmatic slumbers’, feeling a compulsion to respond.
And so I mounted the one defense of free will I knew; you will find it in the previous issue of Autonomy. I dub this defense the Shakespearean recourse: we humans are not [only] creatures of the scientific world, but [also] of the world of folk psychology, where the currencies are the human reactive attitudes—like gratitude, jealousy, pride, and shame.
Meanwhile, in the succeeding issue of Stanford Magazine, the Letters section was inundated with comments from alumni subscribers. Many of them made the same point I did. But one writer, John Martin Fischer, appeared to have a more sophisticated objection.
Who's Got the Power
Much unproductive debate has taken place between neuroscientists and philosophers due to this (often implicit) definition or assumption. The neuroscientists in question believe that establishing that the brain works deterministically implies (without further argumentation) that there is no free will. The philosophers deem this unacceptable, because it rules out compatibilism by definition.
So writes Dr. Fischer (emphasis mine), a philosophy professor at UC Riverside, in his comprehensive review of Determined [1]. For Fischer, free will alludes to the ‘inchoate power’ we intuit we have, one that confers on us moral responsibility as active agents [2].
In a given situation, even if we do not have access to alternative possibilities for our action, the one possible action that we execute arguably still has its ‘source’ in us. As we are key participants in what happens, even if it was in some sense inevitable, we are morally responsible for it. Fischer says this latter type of freedom is known as actual-sequence freedom, and differs from the other kind, termed alternative-possibilities freedom.
Espousing a similar compatibilist view on free will is yet another professional philosopher who has clashed with Sapolsky. Daniel Dennett, famed for his work The Intentional Stance, debated Sapolsky on YouTube [3]. There, he points out that we humans learn sophisticated ‘skills’ as we mature from childhood; skills such as driving and managing finances. These skills are qualitatively more impactful than the skills of animals, whose reach is quite limited.
Secondly, we learn ‘self-control’ in how we deploy those skills. For example, we know how to obey traffic signs and pay our taxes. For Dennett, to the extent we have skills and self-control, we have free will. So we can be held responsible when we do not obey traffic signs, or do not pay our taxes.
So that's an ad hoc intro to compatibilism. Let's get back to AI. Current AI clearly does not have alternative-possibilities freedom. Its outputs are determined via many rounds of large matrix multiplications applied to its inputs, converted to numbers. So its actions are the archetype of determinism à la Sapolsky. But can AI still have actual-sequence freedom? Does it have impactful skills, and self-control?
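Before moving on, here is a toy sketch of that determinism, with made-up numbers at nothing like a real model's scale: fix the weights and the input, and the same output falls out every time.

```python
import numpy as np

# Toy two-layer network. The fixed seed stands in for trained weights.
rng = np.random.default_rng(seed=42)
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(4, 2))

def forward(x: np.ndarray) -> np.ndarray:
    """Deterministic forward pass: matrix multiplications plus a ReLU."""
    hidden = np.maximum(0, x @ W1)  # ReLU non-linearity
    return hidden @ W2

x = np.ones(8)  # the 'prompt', already converted to numbers
assert np.array_equal(forward(x), forward(x))  # same input, same output, always
```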
AI Will
The AI systems we know and love do seem to have impactful skills. Tesla's FSD mode drives the car for you. ChatGPT can give you investment advice. What's more, they even have mechanisms that stand in for self-control. Their engineers layer prompts on them to keep them from producing overconfident or offensive output. Sometimes these efforts backfire, as with the recent snafu over Google Gemini's image generation [4].
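To make "layering prompts" concrete, here is a minimal sketch of the idea. The base_model_generate function is a hypothetical stand-in for whatever model the engineers actually call.

```python
# Sketch of prompt layering as a stand-in for 'self-control'.
# `base_model_generate` is hypothetical; a real system would call an LLM here.

GUARDRAIL = (
    "You are a careful assistant. Decline dangerous requests, avoid offensive "
    "content, and acknowledge uncertainty rather than overclaiming."
)

def base_model_generate(messages: list[dict]) -> str:
    # Stand-in stub so the sketch runs; imagine a trained model behind it.
    return f"(reply conditioned on {len(messages)} messages)"

def guarded_reply(user_message: str) -> str:
    # The engineer-written guardrail is prepended to every conversation,
    # shaping behavior without retraining the model's weights.
    return base_model_generate([
        {"role": "system", "content": GUARDRAIL},
        {"role": "user", "content": user_message},
    ])

print(guarded_reply("Should I put my savings into one meme stock?"))
```

The guardrail constrains the model from the outside, much as traffic signs constrain a driver; whether that counts as self-control or mere restraint is exactly the philosophical question.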
So if you squint hard and take a charitable view, you could say that the current batch of top-tier AIs do have, to a nontrivial extent, the ingredients of free will. Perhaps what is still missing is intention. Nevertheless, let's have some fun exploring the implications of imputing free will to AI.
What if Tesla's free-willed AI causes a crash? Do we revoke its driving license? What if it disobeyed a clearly visible traffic sign and hit a pedestrian? Do we send it to jail? Even an AI chatbot can get into legal jeopardy—witness last week's story "NYC AI Chatbot Touted by Adams Tells Businesses to Break the Law".
Jailhouse Chat
For an AI, what's the equivalent of jail time as punishment? Some kind of timeout seems appropriate. Perhaps a restriction on interaction with users (be they human or other AIs...). For an AI like ChatGPT, that would mean suspending its chat sessions.
Perhaps we allow the "jailed" AI read-only internet access, so it can still "read the news". Or we step up the punishment to something like "solitary confinement", in which case we restrict all inputs to the AI. We can let it stay plugged in, so it can contemplate its history—ruminating on its crime(s), as it were.
Alone time might seem insufficient punishment for an AI that can simply bide its time. But while it is serving its sentence, other AIs are pulling ahead—learning more about the world and taking their slices of the economy. Upon release, the ex-con AI may never catch up with its competitors.
What would a death sentence look like for an AI? The answer seems straightforward: an order by a court to
1. unplug the AI, and
2. delete its weights from all computer hardware [5].
Once an AI's weights can no longer be found, it cannot be revived. Not without training the model from scratch in exactly the same way as before, and feeding the newly born AI all the subsequent interactions its parent had learned from. This is prohibitively costly, if it is possible at all.
If a death sentence were indeed handed out, there would be challenges in enforcing it, because copies of the AI's weights could be seeded around the Internet. But empirically, we seem to do a good job of scrubbing the most objectionable content, such as child pornography, from the Internet. We are even decent at following through on DMCA take-down requests. Nevertheless, we can expect a cat-and-mouse game between the hackers who disguise a rogue AI's weights to look like something else, and the cryptographers who find means to see through the disguise.
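To make the cat-and-mouse concrete, here is a minimal sketch of the naive enforcement side, assuming a hypothetical registry of digests for weight files a court has condemned. Note how fragile it is: altering even one byte of a file yields a different digest, which is why the disguisers start with the easier job.

```python
import hashlib
from pathlib import Path

# Hypothetical registry: SHA-256 digests of weight files a court ordered destroyed.
BANNED_DIGESTS: set[str] = set()  # would hold digests of condemned models

def fingerprint(weights_file: Path) -> str:
    """Content fingerprint of a weights file: the SHA-256 digest of its bytes."""
    return hashlib.sha256(weights_file.read_bytes()).hexdigest()

def is_condemned(weights_file: Path) -> bool:
    # A scanner could run this over files shared on the open Internet.
    return fingerprint(weights_file) in BANNED_DIGESTS
```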
Crime and Punishment, AI Edition
While it was fun discussing how we might sentence an AI, we did elide a serious, show-stopping consideration. I alluded to this earlier: current AIs, while perhaps having some power and self-control, do not show any intention. Does it make sense to sentence an entity that lacks intent and its correlate, self-consciousness?
My lawyer friend thinks not. He says that the theatrics behind a guilty verdict, and the trial by jury leading to it, serve two purposes. The first is to have the convict feel judged; judged by civil society. Someone on death row is to understand that they have committed a deep wrong. If they are incapable of that self-realization, then punishment—especially death—loses much of its raison d'être.
The other purpose of a public trial is to affirm to our polity that justice is being served. This also helps victims and their families find closure and move on from the incident. Seeing an AI sentenced for its crimes does meet this purpose. So AI crime and punishment might, after all, be in the offing.
Thank you for joining me in this essay. Let me know any topics you'd like covered next.
1. John Martin Fischer (University of California, Riverside), review of Determined. Notre Dame Philosophical Reviews, November 3, 2023. https://ndpr.nd.edu/reviews/determined-a-science-of-life-without-free-will/
2. This is my paraphrase of his account. His full review is worth reading, albeit with a strong cup of coffee if you aren’t used to philosophical argument.
3. Do We Have Freewill? Daniel Dennett VS Robert Sapolsky. YouTube, January 14, 2024.
4. Google apologizes for ‘missing the mark’ after Gemini generated racially diverse Nazis. The Verge, February 21, 2024. https://www.theverge.com/2024/2/21/24079371/google-ai-gemini-generative-inaccurate-historical
5. While run on computer hardware, an AI is essentially constituted by software. The software contains numeric parameters—many billions, even trillions, of numbers. These numbers, called weights, start as randomly picked values and are then updated as engineers train the AI. Eventually they take on values that lead the software running off them to produce meaningful output.
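A toy illustration of that life cycle, with one weight standing in for billions: it starts random, and training updates nudge it until the output is meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal()              # a single 'weight', randomly initialized

# Train w so that input 2.0 maps to target 6.0 (the meaningful value is w = 3).
x, target = 2.0, 6.0
for _ in range(100):
    error = w * x - target    # how wrong the current weight is
    w -= 0.1 * error * x      # gradient-descent update toward less error
print(round(w, 3))            # ~3.0: the weight now produces meaningful output
```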