1. AI judges are more consistent but less discretionary
Many commenters note that the study shows GPT‑5 “adheres to the legally correct outcome… 100% of the time,” while human judges reach the correct outcome only about half the time.
“The LLM adheres to the legally correct outcome significantly more often than human judges” – droidjj
“the LLM makes no errors at all” – thewanderer1983
2. Human judges bring judgment and empathy that AI lacks
A large portion of the discussion argues that the role of a judge is to interpret the law in light of context, values, and the victim’s interests—something an LLM cannot replicate.
“Judges do what their name implies – make judgment calls” – codingdave
“The judge’s decision reflects a moral view that victims should be fully compensated” – tadzikpk
3. Bias, accountability, and the “black‑box” problem
Participants repeatedly warn that AI inherits the biases of its training data and that its decisions cannot be audited or appealed in the same way as a human judge’s.
“The AI is trained on the entire legal history that would bias it toward historical norms” – arctic‑true
“Who controls the computer? It can’t be the government… it can’t be a software company” – arctic‑true
4. Practical concerns about implementation and appeals
The debate also covers how an AI‑first system would fit into existing legal workflows, the need for a human‑in‑the‑loop review, and the risk of undermining public confidence.
“Human‑in‑the‑loop AI doesn’t remove the human corruption factor at all” – scottLobster
“A human judge review with a high bar for analysis if in disagreement with the AI” – vjulian
These four themes capture the core of the discussion: the trade‑off between consistency and discretion, the value of human judgment and empathy, the fear of bias and lack of accountability, and the practical hurdles of integrating AI into the justice system.