Show Me the Receipts
Your AI made a decision that changed someone’s life. Their lawyer asked why. Nobody on your team could answer. In regulated industries, that’s not a glitch. It’s a liability.
In my last piece, I wrote about the proprietary data problem: employees feeding sensitive information into AI tools they don’t control, coding assistants introducing vulnerabilities, agents destroying production databases. I ended by saying the next piece would be about observability as a human right.
Then, six days later, a federal court made the argument for me.
On March 9, 2026, a federal judge in Minnesota ordered UnitedHealth Group to open up its AI playbook. Becker’s Payer Issues reported that the court sided with plaintiffs across six of seven discovery categories. The order didn’t just ask what model the company was using. It demanded internal documents about an algorithm called nH Predict: who built it, how it’s used, who it incentivizes, and how it shapes coverage denials.
The lawsuit tells one man’s story. According to the complaint, Gene Lokken was at a skilled nursing facility after a medical crisis. In July 2022, UnitedHealthcare cut his coverage. They said additional days weren’t medically necessary. Lokken and his physician disagreed. They appealed and lost. Medical Economics reported that his family paid $12,000 to $14,000 a month out of pocket for almost a year. He died on July 17, 2023.
When the family asked why, there was no real answer. The decision came from an algorithm. The reasoning was proprietary. An Optum spokesperson told Becker’s that nH Predict is “a guide” to help inform caregivers. But a STAT investigation cited in the lawsuit found that UnitedHealth pressured employees to keep patient stays within 1% of the algorithm’s prediction.
According to the plaintiffs, appeals data suggest nH Predict may be wrong about 90% of the time: nine out of ten denied claims were reversed when patients challenged them. But a Kaiser Family Foundation report found that only about 0.2% of policyholders ever challenge a denial. The tool kept running because the appeals rarely came.
Your Peers Just Became Case Studies
UnitedHealth isn’t the only cautionary tale.
A 2023 ProPublica investigation found that Cigna used an algorithm called PXDX to deny over 300,000 claims in two months. According to internal spreadsheets reviewed by ProPublica and The Capitol Forum, medical directors were signing off on denials in batches, spending an average of 1.2 seconds per case. One doctor reportedly denied 60,000 claims in a single month. Nobody opened a chart.
The class action filed in Sacramento names specific patients. Suzanne Kisting-Leung had an ultrasound to check for ovarian cancer. It found a cyst. Cigna denied the claim. Another plaintiff had a vitamin D test ordered by her doctor. Denied. According to the complaint, Cigna gave no explanation for either.
Cigna disputes these characterizations. But the allegations themselves are already shaping how regulators and plaintiffs’ attorneys look at every carrier using similar tools.
Regulators and plaintiffs’ attorneys don’t care whether you call these tools “guides,” “assistants,” or “workflows.” If an algorithm narrows the funnel, humans rubber-stamp the output, and patients can’t get a meaningful explanation, then in practice the algorithm is making the decision. That’s the story now on the record. And it’s the lens regulators will apply to every carrier using AI.
The Question Your Board Should Be Asking
I talk to IT directors and VPs in regulated industries, and their executives keep asking the same question: “Is our AI compliant?”
That’s the wrong question.
The right one: if a judge orders you to explain any AI-influenced denial tomorrow, case by case, can you? Not at the aggregate level. For each individual decision. Can your chief medical officer explain why the AI recommended denial in a specific case? Can your CIO show, in a traceable way, how a given data point influenced the outcome? Can your compliance officer demonstrate to regulators that humans still truly own the decision? This is not an IT question. It’s a board-level risk question that happens to be implemented in code.
If the honest answers are “no,” “not really,” and “we hope so,” you have a governance problem hiding under that efficiency costume.
Recent AMA surveys suggest a majority of physicians believe unregulated AI tools are driving more prior authorization denials. The AAPC Knowledge Center reported that at a 2025 industry conference, algorithmic denials dominated every panel and hallway conversation. Attendees cited reversal rates on appeals around 90%. The system was wrong most of the time. It kept running anyway.
What Observability Actually Means
I’ve spent most of my career managing IT infrastructure. In my world, observability means the ability to understand what a system is doing by looking at its outputs. When a server goes down, you check the logs. When a query runs slow, you trace it. If a platform I run for 45,000 researchers goes down and I can’t tell my team why, that’s a career-limiting event.
That is the baseline for a web server. It should not be too much to ask of a system that decides whether a 74-year-old gets to stay in rehab.
Think of it this way. If your building’s fire alarm evacuated everyone and nobody could explain why it went off, you’d call that a broken system. You wouldn’t keep using it. You wouldn’t trust it with people’s safety.
That’s what we’re doing with AI right now. Trusting it with people’s safety. And we can’t tell anyone why it does what it does.
Where Regulation Stands
The EU AI Act is the most ambitious attempt to require explanation. According to a Cogent Infotech analysis, high-risk AI systems in healthcare and credit scoring will need auditable decision logs by August 2026. Penalties run up to €20 million or 4% of global revenue.
In the U.S., states are moving first. California’s Physicians Make Decisions Act, effective January 2025, requires a qualified physician to make final medical necessity decisions. Not an algorithm. Texas followed in June 2025, requiring licensed practitioners to review AI-influenced medical records before clinical decisions are made. Colorado’s AI Act, effective February 2026, requires impact assessments and explicit notice when AI is used in consequential decisions, including insurance.
But the pressure is moving in both directions. As Gunderson Dettmer and Pearl Cohen both reported, a Trump executive order signed in December 2025 directs the Attorney General to challenge state AI regulations deemed obstacles to national competitiveness. If you’re in a regulated industry counting on state laws to clarify your obligations, that ground may shift under you.
When the rules are in flux, courts fall back on a simple standard: did you know what your system was doing, and did you act responsibly? “We deployed a black-box model from a major vendor” will not hold up well on the stand.
How This Blows Up
You don’t need a dramatic scenario. You need one bad week.
The class action week. A plaintiff’s firm realizes a single algorithm touched tens of thousands of denials. They move for class certification, arguing the system applied the same flawed logic to everyone. You’re not defending one denial anymore. You’re defending your entire AI operating model.
The regulator week. A state insurance department picks up a media story, opens an investigation into AI-assisted denials, and discovers you can’t produce meaningful decision logs. Other states notice. Multi-state market conduct exams follow.
The boardroom week. A patient story goes viral. Journalists start asking your CEO for specifics about your algorithms. The board asks: did we approve this risk? Who owns it? How much have we actually saved, and at what exposure?
All three hinge on the same weakness: you can’t show the receipts. In every scenario, your CIO, CMO, chief compliance officer, and general counsel are answering for systems nobody on the team can fully explain.
What Observability Would Actually Look Like
I’m not proposing that every AI decision needs a 20-page report. I’m proposing that when a system makes a decision that materially affects someone’s health, finances, or freedom, that person has the right to a meaningful explanation. Not “based on our analysis.” A real answer.
That means three things at minimum. The inputs should be visible: what data the system used, from which sources, with what known limitations. According to the Lokken complaint, nH Predict analyzes a database of 6 million patients, looking at diagnosis, age, living situation, and physical function. If those factors ended Lokken’s coverage, his family deserved to know which ones mattered and how much weight each carried.
The reasoning should be traceable. Not the full model architecture, but the chain of logic from input to output. California already requires a physician to make the final call on medical necessity. If a doctor has to explain their reasoning, a system that replaces the doctor should too.
And the confidence should be disclosed. Was this a strong call or a close one? If the model was 51% confident, that’s a very different situation than 95%. Both produce the same denial. But they demand very different levels of human review.
If your AI vendors can’t provide all three on demand, that’s not a feature request. That’s a red flag your board should see.
The technology for all of this exists today. Decision logs. Confidence scores. Interpretable model designs. These are production tools, not research concepts. The obstacle is not capability. It’s incentive. Explanation costs compute. It slows processing. It creates records that can be used in court. Every reason Cigna had for processing claims in 1.2 seconds is a reason they wouldn’t want to explain each one.
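To make that concrete, here is a minimal sketch of what one entry in such a decision log could look like. Every field name here is hypothetical, invented for illustration, not drawn from any vendor’s actual schema; the example inputs simply echo the factors the Lokken complaint says nH Predict considers.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

@dataclass
class DecisionRecord:
    """One logged entry per AI-influenced decision (hypothetical schema)."""
    case_id: str
    model_version: str
    # Inputs: what data the system used, with its sources and limitations.
    inputs: dict
    # Reasoning: the chain of logic from input to output, as ordered steps.
    reasoning_trace: list
    # Confidence: was this a strong call or a close one?
    confidence: float
    recommendation: str
    # The human who owns the final decision, as laws like California's
    # Physicians Make Decisions Act require.
    reviewed_by: str = "unassigned"
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def needs_escalation(self, threshold: float = 0.8) -> bool:
        # A 51%-confidence denial demands more human review than a 95% one.
        return self.recommendation == "deny" and self.confidence < threshold

record = DecisionRecord(
    case_id="claim-001",
    model_version="predict-v3",
    inputs={"diagnosis": "hip fracture", "age": 74, "living_situation": "alone"},
    reasoning_trace=["matched cohort of similar patients",
                     "predicted 14-day covered stay"],
    confidence=0.51,
    recommendation="deny",
)
print(json.dumps(asdict(record), indent=2))  # the receipt, on demand
print(record.needs_escalation())             # close calls get a human
```

Nothing in this sketch is exotic: it is a dataclass and a JSON dump, producible by any production system today. The point is that the record exists before the lawyer asks for it.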
The Connection to This Series
If you’ve been following this series, observability connects to every argument I’ve made. In my first piece, I proposed an agent registry with audit trails. Those trails are useless if the reasoning behind each action is a black box. In my second piece, I argued that consent requires transparency. You should be able to ask your agent “Why did you do that?” and get a real answer. In my last piece, I wrote about what happens when your AI runs on someone else’s infrastructure. Observability is the other side of that problem. When you can’t see how a system makes decisions, you can’t tell whether those decisions serve your interests or the platform’s.
Here’s what I keep coming back to: observability is the mechanism that makes accountability possible. Without it, everything else I’ve proposed has no teeth. Accountability without observability is theater.
What I Don’t Have Answers To
I don’t know where to draw the line on what counts as a “material” decision. A healthcare denial is obvious. A product recommendation probably isn’t. But between those poles there’s a lot of gray, and requiring explanation for every automated decision may not be practical.
I don’t know how to make explanations useful to people who aren’t engineers. A decision log full of technical jargon is transparent in theory and useless in practice. Observability that only serves the people who built the system isn’t a right. It’s a feature.
And I worry about explanation becoming its own kind of theater. Companies are already producing “explainability reports” that check a compliance box without telling the affected person anything meaningful. If that’s where this goes, we’ve traded one performance for another.
But the alternative is the world we’re in right now. Algorithms making life-altering decisions. Nobody able to say why. Gene Lokken’s family knows what that costs.
What I’m Asking
If you work in insurance, healthcare, or financial services: look at the AI tools your organization has deployed and ask what they can explain about their own decisions. If the answer is “not much,” bring that to your compliance team, your legal team, and your board. California and Texas have enforceable laws now. Colorado’s took effect in February. The Lokken court just ordered an insurer to open its algorithm to discovery. The direction is clear.
If you can’t explain your AI-driven denials to a judge, don’t expect your shareholders or regulators to be any kinder.
If you manage IT infrastructure in a regulated environment, you’re going to be the person who has to answer the question: can our AI tools show their work? Start asking your vendors now. The ones who can’t answer are telling you something about the risk you’re carrying.
And if you’ve been on the receiving end of a decision that didn’t make sense, a denied claim, a rejected application, a coverage termination your doctor disagreed with: ask for the explanation. You may not get one. But the asking matters. And as the Lokken ruling shows, courts are starting to treat that silence as something worth investigating.
This is the seventh in a series about AI accountability.
If you’re thinking about these questions too, I hope you’ll subscribe.
Rachel Ankerholz is an IT Director and writer exploring the intersection of AI ethics, accessibility, and human-centered technology. She writes about who gets included, and who gets left behind, when we build systems.


