How to trust a black-box artificial intelligence
Outstanding paper award at ICML 2025 (episode 6)
Hello fellow researchers and AI enthusiasts!
Welcome back to The Future of AI! In this sixth and final episode of my ICML review series, we'll tackle uncertainty itself and discuss a novel way to measure confidence in machine learning predictions.
How can we truly know if an AI system will make safe and reliable predictions when deployed in critical environments, without peeking inside its black-box model? And is there a smarter way to evaluate the real risks it might face beyond average uncertainties?
Full reference: Snell, Jake C., and Thomas L. Griffiths. “Conformal Prediction as Bayesian Quadrature.” arXiv preprint arXiv:2502.13228 (2025).
Context
Machine learning is now used in sensitive areas like healthcare and finance, where mistakes can have serious consequences. This makes it essential not only to build accurate models but also to know how reliable their predictions are. The problem is that modern machine learning models (especially deep learning systems) are often “black boxes”. They make predictions but give little information about how certain or uncertain they are about those predictions.
One popular way to measure and control uncertainty is called conformal prediction. It lets us wrap an extra safety net around any machine learning model, so we can say that “the answer is probably within this range”. Importantly, conformal prediction does this without needing to know the inner structure of the model. This is like checking whether a car is safe to drive without needing to know how the engine works.
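To make the “safety net” concrete, here is a minimal sketch of split conformal prediction around a black-box regressor. The toy model, data, and names (`black_box_model`, the noise level, the 90% target) are illustrative assumptions, not from the paper:

```python
# Minimal sketch of split conformal prediction around a black-box
# regressor. The model and data here are toy stand-ins.
import numpy as np

rng = np.random.default_rng(0)

def black_box_model(x):
    # Stand-in for any opaque predictor; we never look inside it.
    return 2.0 * x

# Held-out calibration data the model was not trained on.
x_cal = rng.uniform(0, 1, size=500)
y_cal = 2.0 * x_cal + rng.normal(0, 0.1, size=500)

# Nonconformity scores: absolute residuals on the calibration set.
scores = np.abs(y_cal - black_box_model(x_cal))

# Finite-sample-corrected (1 - alpha) quantile of the scores.
alpha = 0.1  # aim for ~90% coverage
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n)

# Prediction interval for a new point: "probably within this range".
x_new = 0.5
lo, hi = black_box_model(x_new) - q, black_box_model(x_new) + q
print(lo, hi)
```

The guarantee this construction gives is the frequentist, on-average one discussed next: over repeated draws of calibration and test data, the interval contains the true value about 90% of the time.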
However, the current version of conformal prediction is based on a statistical philosophy called frequentism, which relies heavily on long-run averages. This approach sometimes fails in practice because it cannot easily use prior knowledge we might already have, and its guarantees often only hold on average rather than in specific situations.
Key results
This paper proposes a new way forward by bringing in ideas from another statistical philosophy: Bayesian probability. In simple terms, Bayesian methods allow us to incorporate prior information and provide richer, more intuitive statements about uncertainty. The authors combine conformal prediction with a tool called Bayesian quadrature. Quadrature is a mathematical way of estimating totals or averages. In this context, it’s used to better capture the range of possible errors a model can make.
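As a toy illustration of the quadrature idea (not the paper’s construction), here is Bayesian quadrature on a simple one-dimensional integral: place a Gaussian-process prior on the unknown function, evaluate it at a few points, and the weights fall out of the kernel. The kernel choice, length scale, and test function are all assumptions made for the sketch:

```python
# Toy Bayesian quadrature: estimate the integral of sin(pi * x) on [0, 1]
# (true value 2/pi) from a handful of function evaluations, using a
# Gaussian-process prior with an RBF kernel.
import numpy as np
from math import erf, sqrt, pi

l = 0.3  # kernel length scale (an assumption for this sketch)

def k(a, b):
    # RBF (squared-exponential) kernel.
    return np.exp(-(a - b) ** 2 / (2 * l ** 2))

def kernel_integral(xi):
    # Closed form for the integral of k(x, xi) over [0, 1],
    # via the error function.
    s = l * sqrt(2)
    return l * sqrt(pi / 2) * (erf((1 - xi) / s) - erf((0 - xi) / s))

x = np.linspace(0, 1, 8)           # evaluation nodes
y = np.sin(pi * x)                 # observed function values
K = k(x[:, None], x[None, :]) + 1e-9 * np.eye(8)  # jitter for stability
z = np.array([kernel_integral(xi) for xi in x])

w = np.linalg.solve(K, z)          # quadrature weights from the prior
estimate = w @ y                   # posterior mean of the integral
print(estimate)
```

Unlike a plain average, the same machinery also yields a posterior variance over the integral, which is the kind of “whole picture of the risk” the paper exploits.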
The key insight is that by expressing conformal prediction in a Bayesian way, we get a full picture of the possible risks rather than a single average. This is especially helpful when deploying models in real-world settings where uncertainty is not abstract but deeply impactful.
The authors show that two widely used uncertainty methods — split conformal prediction and conformal risk control — actually appear as special cases within the broader Bayesian framework. This means that the new method doesn’t throw away past progress but extends it, offering more flexible and clear guarantees.
In experiments with both synthetic data and real-world image data (MS-COCO), the Bayesian approach consistently provided tighter, more reliable predictions. It reduced the chances of the model producing overly risky results while also keeping prediction ranges smaller and more practical.
To conclude, this paper shows that by adopting the Bayesian approach, we can make machine learning systems safer and more trustworthy. Instead of measuring how often models fail on average, this new approach evaluates how confident we can be, right now, with the data in front of us. This is a notable shift that could make AI deployment more responsible and robust.
My take
Driving a car without knowing how the engine works, even with the “check engine” light on, usually feels fine, because the worst that can happen is the car stops and we’re late to a meeting. Then we can simply take it to a professional — a mechanic — who understands how it works and can fix it. The brakes, on the other hand, are a different story. We don’t need to fully understand how they function, but we do demand complete certainty that when we press the pedal, the car will stop.
We tend to expect the same level of confidence from AI systems, but the reality is very different. In most cases, even the professionals — the researchers and engineers — have little to no idea how the models they’ve built actually work. Put simply, AI models store their knowledge in the form of numbers and weights, often millions or even billions of them, and no human can interpret their meaning or the intricate relationships between them.
So how can we trust a tool that not only regular users but even the experts don’t understand? By adding extra layers of protection and by carefully estimating its errors and uncertainties. That way, when we hit the accelerator, the system accelerates. And when we hit the brakes, it stops. Not the other way around, and with certainty as close as possible to 100%.
Looking ahead
This sixth episode wraps up the review series from this year’s ICML conference.
On Thursday I’ll kick off a new series about one of the biggest conferences in computer vision. If you’d like to follow along and never miss an update, make sure to subscribe to The Future of AI!

