Your Model Thinks Helping People Makes Things Worse

Jake Friedenberg
Last week I wrote about why causal inference has always worked but couldn't scale. The responses, in comments and DMs, were a decent mix of "finally someone said it" and "prove it."
If you want us to prove it, book a demo and bring a dataset. We're not shy.
But let's talk about what happens when you actually can do causal inference at scale. In the real, ugly, operational world where decisions get made on bad data and nobody has time for your elegant math.
You know what industry everyone loves to go into detail about? Insurance! /s
But really, it's a massive industry sitting on an embarrassing gap between what they spend on analytics and what they get back. It's also one of the cleanest examples I've found of an entire sector making important decisions with tools that literally cannot answer the question they're asking.
Averages Are Lying#
A claim comes in. It gets scored by a predictive model (or several), and if the score crosses a threshold, it gets flagged for additional resources. More attention, more expensive handling, more scrutiny, more… bad.
"Based on historical patterns, how expensive is this claim likely to be?" is a fine question if you're passively watching the world happen to you. But carriers don't passively watch claims. They intervene. They assign specialists. They escalate reviews. They authorize treatments. They deploy resources that cost real money.
We were talking to a carrier recently about how they handle resource deployment. Nurse case managers, specialist reviews, early treatment authorization. They knew, intuitively, that putting the right resource on a claim early could keep a $20,000 injury from spiraling into a $100,000 nightmare. But their model told a different story. It saw nurse assignment correlated with expensive claims and concluded: nurses are a cost driver.
Which is insane.
Nurses get assigned to severe claims because they're severe. The model learned the correlation and everyone downstream treated it like a recommendation.
So they were stuck. They couldn't prove whether this kind of intervention had an impact without precise, case-by-case analyses that take too long to be useful (like, a year after the claim closes and all the historical data is in). And right now, the averages are dead wrong.
But that's today's state of "data-driven," and it's the best anyone's got: Resource Deployed → High Cost.
The model confuses the selection effect (resources go where the fire is) with the causal effect (resources reduce costs when deployed at the right time on the right claim). Interventions that could save real money get deployed too late, too broadly, or not at all because the model can't distinguish helping from hurting.
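The confound is easy to reproduce. Here's a minimal, hypothetical simulation (invented numbers, not the carrier's data) in which nurse assignment truly cuts a claim's cost by 20%, yet the naive average says nurses are a cost driver, until you compare like with like:

```python
import random

random.seed(0)

# Toy claims: severity drives BOTH nurse assignment and cost.
# By construction, a nurse cuts a claim's cost by 20%. That is the
# true causal effect we want an analysis to recover.
claims = []
for _ in range(10_000):
    severity = random.random()                  # 0 = minor, 1 = severe
    nurse = random.random() < severity          # nurses go where the fire is
    base_cost = 5_000 + 95_000 * severity       # severe claims cost more
    cost = base_cost * (0.8 if nurse else 1.0)  # nurse effect: -20%
    claims.append((severity, nurse, cost))

def avg(xs):
    return sum(xs) / len(xs)

# Naive comparison: average cost with vs without a nurse.
naive = (avg([c for s, n, c in claims if n])
         - avg([c for s, n, c in claims if not n]))

# Crude adjustment: compare only claims of similar severity.
band = [(n, c) for s, n, c in claims if 0.6 < s < 0.7]
adjusted = (avg([c for n, c in band if n])
            - avg([c for n, c in band if not n]))

print(f"Naive:    nurses 'add'  ${naive:+,.0f} per claim")    # comes out positive
print(f"Adjusted: nurses change ${adjusted:+,.0f} per claim")  # comes out negative
```

The naive estimate is strongly positive (nurses look like a cost driver); stratifying by severity flips the sign and recovers the built-in saving. Real claims data has far more confounders than a single severity score, which is exactly why this takes machinery, not a spreadsheet.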
The Question That Actually Matters#
Every carrier needs to answer, for every claim:
"What specific intervention, applied to this specific claim, at this specific point in time, will change its trajectory?"
That's precision claims management. The math has existed for decades. Pearl, Rubin, the whole causal inference canon. The problem I spent a whole article ranting about was that the operational machinery to do it at scale didn't exist. Combinatorial explosion. Unmeasured confounders. Temporal complexity. The engineering problems that kept causal inference trapped in bespoke, one-off projects.
What RootCause.ai does is learn a causal model of a claims system from its operational data, the cause-and-effect relationships that drive outcomes, not just the correlations, and build a causal digital twin. Ask it "what happens to this specific claim if I deploy a specialist today?" and it gives you the counterfactual: outcome with, outcome without, what drives the difference, and how confident the estimate is.
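As a conceptual sketch only (this is not RootCause.ai's actual method, and every number below is invented), here's the shape of such a counterfactual query: estimate outcome-with and outcome-without from comparable historical claims, and bootstrap a confidence interval on the difference:

```python
import random
import statistics

random.seed(1)

# Invented history: (severity, intervened?, cost). The built-in causal
# effect of the intervention is -20% of claim cost.
history = []
for _ in range(10_000):
    s = random.random()
    iv = random.random() < s
    cost = (5_000 + 95_000 * s) * (0.8 if iv else 1.0)
    history.append((s, iv, cost))

def counterfactual(severity, history, band=0.05, n_boot=1_000):
    """Estimate (cost_with, cost_without, uplift, 95% CI) for one claim
    by comparing similar historical claims with and without the intervention."""
    similar = [(iv, c) for s, iv, c in history if abs(s - severity) < band]
    with_iv = [c for iv, c in similar if iv]
    without = [c for iv, c in similar if not iv]
    cost_with = statistics.mean(with_iv)
    cost_without = statistics.mean(without)
    boots = []
    for _ in range(n_boot):
        bw = statistics.mean(random.choices(with_iv, k=len(with_iv)))
        bo = statistics.mean(random.choices(without, k=len(without)))
        boots.append(bw - bo)
    boots.sort()
    ci = (boots[int(0.025 * n_boot)], boots[int(0.975 * n_boot)])
    return cost_with, cost_without, cost_with - cost_without, ci

cw, cwo, uplift, ci = counterfactual(0.5, history)
print(f"With: ${cw:,.0f}  Without: ${cwo:,.0f}  "
      f"Uplift: ${uplift:+,.0f}  95% CI: ${ci[0]:,.0f}..${ci[1]:,.0f}")
```

Matching on one covariate with a bootstrap is the "hello world" of this. Production systems have to handle dozens of confounders, time-varying treatment, and overlap diagnostics: the scaling problems from the last article.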
It's a flight simulator for claims decisions. And it runs on data you already have.
High-Impact vs. High-Cost#
A high-cost claim is one where you're writing a big check no matter what. Catastrophic event. Flag it. But your ability to change the outcome is minimal.
A high-impact claim is where a specific intervention actually bends the trajectory.
Most carriers spend resources on expensive-but-unrescuable claims while ignoring the "quiet middle." Claims that look fine on paper but are silently creeping toward catastrophe. A soft-tissue injury worsens because nobody intervened early. A claimant's frustration boils into litigation. A two-week recovery becomes a two-year ordeal.
The causal digital twin catches these by asking what's changeable. Simulate the intervention. Decompose the effect. Quantify the ROI before you spend a dime.
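The difference in triage order is easy to see with invented numbers. Suppose a (hypothetical) causal model hands you, per claim, a predicted cost and an estimated intervention effect:

```python
# Hypothetical claims: (id, predicted_cost, estimated_effect_of_intervening).
# Effects are negative when intervening reduces total cost.
claims = [
    ("cat-1", 250_000,  -3_000),  # high-cost: big check no matter what
    ("cat-2", 180_000,  -2_000),
    ("mid-1",  24_000, -70_000),  # quiet middle: creeping toward catastrophe
    ("mid-2",  19_000, -55_000),
    ("low-1",   6_000,       0),
]

by_cost   = sorted(claims, key=lambda c: -c[1])  # today's triage: biggest bills first
by_impact = sorted(claims, key=lambda c: c[2])   # causal triage: most changeable first

print("Cost-based priorities:  ", [c[0] for c in by_cost][:2])    # the catastrophes
print("Impact-based priorities:", [c[0] for c in by_impact][:2])  # the quiet middle
```

Same five claims, opposite priorities: cost-based triage sends resources to the catastrophes it can't change, impact-based triage sends them to the quiet middle it can.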
What This Looks Like at the Desk#
Instead of a risk score (high, medium, low, good luck), an adjuster sees:

"Positive intervention ROI. Deploying a specialist now is projected to reduce total cost by $8,567 with 95% confidence."
That's a counterfactual comparison. Here's what happens if you act, here's what happens if you don't, here's how sure we are. Scale that across a book of business and the carrier stops playing defense against cost and starts playing offense for outcomes.
This Is Just One Thing#
Precision claims is one application of causal inference at scale. One.
The same digital twin that simulates individual claim interventions can stress-test policy changes across the whole book. What happens if we change our specialist assignment threshold? What if we shift resources between regions? Historically the answer was: implement the change, wait 12 months, find out.
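Here's what "wrong for free" means mechanically, sketched with invented numbers: score candidate policies against a causal model of the book instead of against next year's loss run. The assumed model below says intervening helps mid-severity claims and does nothing for catastrophic or trivial ones:

```python
import random

random.seed(2)

# Invented book of business: (severity, base_cost, causal_effect_if_intervened).
# Assumed effect: mid-severity claims are rescuable (-25% of cost);
# catastrophic and trivial ones are not.
book = []
for _ in range(10_000):
    s = random.random()
    base = 5_000 + 95_000 * s
    effect = -0.25 * base if 0.3 < s < 0.7 else 0.0
    book.append((s, base, effect))

def book_cost(book, threshold, intervention_cost=1_500):
    """Total expected cost if we intervene on every claim above `threshold`."""
    total = 0.0
    for s, base, effect in book:
        total += base
        if s > threshold:
            total += effect + intervention_cost
    return total

# Stress-test candidate assignment thresholds offline, before touching the book.
for t in (0.8, 0.5, 0.3):
    print(f"threshold {t}: ${book_cost(book, t):,.0f}")
```

The point isn't these numbers; it's that the comparison happens in simulation. Implement-and-wait becomes query-and-compare.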
With a causal digital twin you can be wrong for free (that’s a good thing, by the way).
Claims is just the door you walk through. Underwriting. Pricing. Fraud. Distribution. Every decision in insurance is an intervention on a complex system. Every one can be decomposed, simulated, and optimized.
The carriers that are figuring this out first will have a compounding structural advantage that gets wider every quarter. And at some point, the ones still running on predictive analytics will find that gap very hard to explain in a board meeting.
Jake Friedenberg is Co-Founder of RootCause.ai, where we build enterprise causal AI. If you want to see precision claims, or any causal decision intelligence, in action, reach out. We like skeptics.
Related Posts
Causal Inference Always Worked. We Just Couldn't Scale the Damn Thing. Until Now.
Combinatorial explosion, hidden confounders, and messy data have killed most enterprise causal AI projects before they started. Here's why those are engineering problems - and how they're finally being solved.
Netflix Spent a Decade Building Causal Infrastructure. Bestie, You Don't Have a Decade.
Netflix proved that observational causal inference works in production - then spent a decade building bespoke pipelines, PhD teams, and custom tooling to make it happen. Here's what that actually cost, and why most enterprises are still waiting for an answer that already exists.
Ouija Boards & Causal Inference
A logistics company spent two years intervening on speeding, weather, and driver profiles - and nothing moved. Causal discovery found the real answer in three hours. Here's what correlation-based analytics gets wrong, and what changes when you find the cause before you act.