Hello fellow researchers and AI enthusiasts!
Welcome back to The Future of AI! Today we’re diving into the second paper from my ICML review series. If you missed the first installment, here’s a quick reminder: the International Conference on Machine Learning (ICML) is one of the most prestigious gatherings in the field, bringing together cutting-edge research from across academia and industry. This year’s conference took place in Vancouver this July, and it was packed with ideas that could shape the future of AI.
Can large language models truly be creative, or are they just clever parrots? This paper puts today’s chatbots to the test, uncovering surprising tricks that make them more imaginative.
Full reference: V. Nagarajan, C. H. Wu, C. Ding, and A. Raghunathan, “Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction,” arXiv preprint arXiv:2504.15266, 2025.
Context
This paper explores how today’s chatbots, like ChatGPT, handle creative tasks that require more than just predicting the next word (technically, the next token). While these models excel at producing coherent text, they often struggle when a task demands what the Authors call “a leap of imagination”: inventing new math problems, coming up with clever wordplay, or generating novel scientific ideas.
Key results
The Authors proposed a working definition of creativity and designed simple tasks that mimic real-world creative work in a controlled environment. These tasks fall into two types:
Combinational creativity: Connecting unrelated ideas to produce something new, like making a clever joke or drawing a scientific analogy. For example: “What kind of shoes do spies wear? Sneakers.”
Exploratory creativity: Constructing entirely new patterns under a set of rules, like inventing a new logic puzzle or designing a unique protein structure.
The Authors discovered three major insights:
Next-token prediction is short-sighted. LLMs focus only on the immediate next word (or token), and this limits their creativity. Multi-token approaches, such as teacherless training (training without teacher forcing, so the model cannot lean on the ground-truth prefix; sketched at the end of this section) or diffusion models (more on those in a later post), were much better at producing diverse and original outputs.
Randomness is actually good. More surprisingly, input randomness beats output randomness. Instead of injecting randomness only at the output through temperature sampling, adding meaningless seed prefixes to the input (called seed-conditioning; see the sketch after this list) led to surprisingly creative results, even with deterministic generation.
Learning by heart is bad. In other words, LLMs should memorize less of their training data. Multi-token approaches not only increased creativity but also reduced the tendency of models to repeat training examples word for word.
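To make the seed-conditioning idea concrete, here is a minimal sketch contrasting output randomness (temperature sampling) with input randomness (a meaningless random prefix followed by deterministic, greedy decoding). It assumes a Hugging Face causal language model; the gpt2 checkpoint, the prompt, and the seed length are my own illustrative choices, not the Authors’ setup.

```python
# Minimal sketch: output randomness (temperature sampling) vs.
# input randomness (seed-conditioning with greedy decoding).
import random
import string

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Invent a short riddle about time:"

# 1) Output randomness: sample each next token with a temperature.
inputs = tokenizer(prompt, return_tensors="pt")
sampled = model.generate(
    **inputs, do_sample=True, temperature=0.9, max_new_tokens=40,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(sampled[0], skip_special_tokens=True))

# 2) Input randomness (seed-conditioning): prepend a meaningless random
# "seed" string to the prompt, then decode greedily (deterministically).
seed = "".join(random.choices(string.ascii_lowercase, k=12))
seeded_inputs = tokenizer(f"{seed} {prompt}", return_tensors="pt")
greedy = model.generate(
    **seeded_inputs, do_sample=False, max_new_tokens=40,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(greedy[0], skip_special_tokens=True))
```

The only thing that changes between the two calls is where the randomness enters: in the first it comes from sampling the output, in the second it comes entirely from the random prefix on the input.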
These findings highlight the limitations of current LLMs in open-ended creative tasks and suggest new training and inference strategies to unlock more originality. Ultimately, the Authors argue for going beyond next-token prediction to capture the “leap of imagination” that underlies human creativity.
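For readers who want a feel for what “going beyond next-token prediction” can look like in code, below is a rough sketch of a single teacherless training step: the answer positions are filled with blank placeholder tokens, so the model must predict the whole answer without ever conditioning on the ground-truth prefix. This is my own simplification on top of a Hugging Face causal LM, not the Authors’ exact recipe.

```python
# Rough sketch of one teacherless training step (no teacher forcing).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt_ids = tok("Q: Invent a riddle about time. A:", return_tensors="pt").input_ids
answer_ids = tok(" I fly without wings and never come back.", return_tensors="pt").input_ids

# Fill every answer position with a meaningless placeholder (here the EOS token),
# so the model never sees the gold answer prefix in its input.
dummy_ids = torch.full_like(answer_ids, tok.eos_token_id)

input_ids = torch.cat([prompt_ids, dummy_ids], dim=1)                        # prompt + blanks
labels = torch.cat([torch.full_like(prompt_ids, -100), answer_ids], dim=1)   # loss on answer only

# The loss covers all answer tokens at once, each predicted from a context of
# prompt + blanks rather than from the ground-truth answer prefix.
loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()  # an optimizer step would follow in a real training loop
print(f"teacherless loss: {loss.item():.3f}")
```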
My take
This paper touches on a core limitation of today’s large language models: their reliance on next-token prediction. While this approach has brought us remarkably far, its shortcomings are becoming clear. The real question is: can we move beyond it?
In my view, intelligence and creativity are closely connected. History’s greatest minds stood out not just for reasoning, but for their ability to invent, imagine, and create. If we can give machines this capacity, we may unlock something far closer to true Artificial General Intelligence — whether we call it AGI, Superintelligence, or whatever name the future chooses.
Looking ahead
In short, while today’s large language models can generate fluent text, they still fall short when true creativity — the “leap of imagination” — is required.
Next time, we’ll dive into research on how AI learns when information is missing — a key challenge on the path toward more adaptable and resilient systems.
If you’re enjoying this series on cutting-edge AI research, be sure to subscribe so you don’t miss future posts from The Future of AI.