Ouija Boards & Causal Inference

Jake Friedenberg
Co-Founder at RootCause.ai | Causal Inference at Scale


A wizard stands before you in a dark tower. Behind him, a scrying pool pulses with light. Charts swirl in the mist. He grips his staff, wheels around, and bellows: "BEHOLD! The omens reveal... speeding correlates with accidents!" The king's court nods. Riders are dispatched to enforce speed penalties across the realm. Six months later the accident rate hasn't budged. The wizard is back at the pool, squinting at new shapes, muttering about sample sizes.

You have this wizard. He works in your analytics department. The pool is your BI dashboard. And no amount of rigor changes the ritual: the wizard will never be able to reliably compute what will happen if you intervene.

The Séance

Alright, nerd sh*t aside.

A major logistics company had a truck accident problem. They wanted to reduce the number of accidents. So they did what any well-funded enterprise does: they threw money and data at it. Bling bling!

Truck telemetry! Speed! Aggressive driving behaviors! Hard braking events! Snow! Rain! Truck type! Driver profiles! Full-time, part-time, contractor, dog, cat, red, blue! Time of day, day of week, route characteristics.

They made a big internal website. A graphic designer clearly helped. It had a branded dashboard showing years of data from dozens of sources. It was an absolute f*cking marvel of data engineering and took the better part of a year to build. Beautiful work.

And every month, they and a half-dozen executives would gather around this thing and do the best they could: sacrifice a chicken on the data altar and try to divine the future by reading its entrails.

Executives speculate that speeding tracks with accidents, and they create a policy to crack down on speeding. Deploy. Wait four months. Nothing moves.

Part-time drivers look riskier. Increase screening. Deploy. Wait four months. Nothing changes.

Weather looks like a factor in Q1. Seasonal protocols. Deploy. Wait four months. Nothing changes.

What Causal Discovery Actually Found

RootCause.ai ran causal discovery across the same messy operational data the team had been staring at for years. It took about three hours.

The answer wasn't speeding, weather, or driver employment type. It was actually completely unremarkable: driver fatigue.

Shifts that exceeded 4.5 continuous hours behind the wheel, in violation of maximum continuous driving regulations, were the upstream cause. Speeding, aggressive cornering, poor weather performance: all downstream. Fatigue degrades every other behavior simultaneously. The dashboard had been surfacing those degraded behaviors as if they were independent problems. The team had been intervening on them one by one. For two years!

The system ingested their operational data, ran automated causal discovery across hundreds of variables, identified where unmeasured factors were influencing relationships, and built a digital twin: a simulation environment where you test interventions before you deploy them.
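To see why the dashboard kept lying, here is a minimal sketch of the confounding pattern described above. Everything in it is invented for illustration: the probabilities, the variable names, and the generative model are assumptions, not the client's telemetry or RootCause.ai's actual method. In the toy model, fatigue causes both speeding and accidents, and speeding has no direct effect at all; yet a naive dashboard view shows speeding "correlating" with accidents.

```python
import random

random.seed(0)

# Hypothetical generative model: fatigue -> speeding, fatigue -> accident.
# Speeding never causes accidents here; it only shares a cause with them.
def simulate(n=100_000):
    rows = []
    for _ in range(n):
        fatigued = random.random() < 0.3
        speeding = random.random() < (0.6 if fatigued else 0.2)
        accident = random.random() < (0.10 if fatigued else 0.02)
        rows.append((fatigued, speeding, accident))
    return rows

def accident_rate(rows, speeding):
    hits = [a for _, s, a in rows if s == speeding]
    return sum(hits) / len(hits)

rows = simulate()

# Marginally, speeders crash far more often (roughly 6.5% vs 3.4%)...
print(accident_rate(rows, True), accident_rate(rows, False))

# ...but within each fatigue stratum the association vanishes,
# because fatigue, the common cause, has been held fixed.
for f in (True, False):
    stratum = [r for r in rows if r[0] == f]
    print(f, accident_rate(stratum, True), accident_rate(stratum, False))
```

This is the simplest possible version of what conditioning on the right upstream variable does: the speeding signal that looked actionable on the dashboard disappears once fatigue is accounted for.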

Before the company changed a single policy, they queried the twin. "What happens if we enforce the 4.5-hour limit?" Here's the expected reduction. Here's the causal chain. Here's the confidence interval. "What about the speed penalty program we ran last year?" Here's why it didn't work: speeding is a symptom of fatigue in this population, and penalizing the symptom doesn't touch the cause.
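The twin query above can be sketched as a do-style intervention on a toy structural model. Again, all numbers, names, and the model itself are illustrative assumptions: a cap on continuous driving hours removes the fatigue state, which collapses the downstream accident risk, while forcing speeding to zero (the old penalty program) changes nothing, because speeding has no causal arrow into accidents in this model.

```python
import random

random.seed(1)

# Hypothetical structural model of one driving shift (invented numbers).
def drive_shift(max_continuous_hours=float("inf"), suppress_speeding=False):
    hours = random.uniform(2.0, 8.0)          # scheduled continuous driving
    hours = min(hours, max_continuous_hours)  # intervention: enforce a cap
    fatigued = hours > 4.5
    speeding = (not suppress_speeding) and random.random() < (0.6 if fatigued else 0.2)
    # Accidents depend on fatigue only; speeding is a symptom, not a cause.
    accident = random.random() < (0.10 if fatigued else 0.02)
    return accident

def accident_rate(n=200_000, **intervention):
    return sum(drive_shift(**intervention) for _ in range(n)) / n

baseline = accident_rate()
capped = accident_rate(max_continuous_hours=4.5)   # enforce the hours limit
penalized = accident_rate(suppress_speeding=True)  # last year's speed crackdown

print(f"baseline {baseline:.3f} | 4.5h cap {capped:.3f} | speed penalty {penalized:.3f}")
```

Under these made-up parameters the cap cuts the accident rate by roughly two thirds, while the speed-penalty intervention leaves it where it started: the same asymmetry the article describes, reproduced in a dozen lines.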

They deployed the driving hour enforcement. The numbers moved. On the first try. Knowing which variable to act on before you act eliminates the sequential experimentation cycle that was eating years of their time.

Burn the Altar

Causal inference is not new, but very few companies can execute causal projects successfully. The math has been right for decades, and it eliminates the need to run certain kinds of investigatory experiments.

RootCause.ai has mathematical breakthroughs that lower the barrier to entry for causal inference. Automated causal discovery across messy enterprise data. Autonomous data ontologies. Latent confounder identification. Digital twin simulation. Point it at the right problem and compress years of experimentation into days.

We'll be publishing our white paper later this year. If you're skeptical in the meantime, you can just get in touch with us.

You don't need to test ten things to find the one that works. You need to find the one that works and then test it. The order matters. The difference in cost is measured in years.

The wizard is talented. Always has been. Maybe it's time to give him something better than a scrying pool.


Full technical details will be published in our white paper later this year. More on how we do causal discovery can be found at https://rootcause.ai