Do AI systems learn the same view of the world?

Two years ago, researchers at MIT proposed a provocative idea: as AI models become more powerful, they begin to see the world in the same way. But not everyone was convinced, and now EPFL scientists have shown that the picture is more nuanced.

Today’s artificial intelligence systems learn by being trained on massive datasets. Language models are trained on text, vision models on images and video, and audio models on sound data. Yet these different data types are often grounded in the same reality. In 2024, researchers at the Massachusetts Institute of Technology argued that as models grow more capable, the way they see the world grows more similar, no matter what data they are trained on (images, text, video or audio).

Their Platonic Representation Hypothesis suggested that, like Plato’s ideal forms, these systems appeared to be discovering the same underlying structure of the world.

The idea quickly captured the imagination of the AI community and raised profound questions. If different AI systems independently arrive at the same internal view of reality, does this reveal something fundamental about intelligence itself?

Measuring distances between concepts

Inside modern AI systems, concepts such as “dog,” “car,” or “tree” are represented as vectors in high-dimensional spaces. To measure similarity among models, researchers can compare these internal representations by looking at the patterns of distances or similarities among many concepts. Based on the findings from such comparisons, the MIT work suggested that, as models become more capable, their internal representations increasingly align across systems, hinting at convergence toward a shared representation of the world. But the EPFL team suspected something was missing.
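One way to make such a comparison concrete is sketched below in Python. It is a minimal illustration, not the metric used in either study: the toy embeddings, the function names and the choice of Pearson correlation over pairwise Euclidean distances are all assumptions made for the example. Because only the pattern of distances is compared, the two models do not need to share a coordinate system.

```python
import math
from itertools import combinations

def pairwise_distances(embeddings):
    """Euclidean distance for every unordered pair of concepts, in a fixed order."""
    names = sorted(embeddings)
    return [math.dist(embeddings[a], embeddings[b]) for a, b in combinations(names, 2)]

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def representation_similarity(model_a, model_b):
    """Correlate the two models' distance patterns over the same concepts."""
    return pearson(pairwise_distances(model_a), pairwise_distances(model_b))

# Hypothetical 2-D embeddings for four concepts (made-up numbers).
model_a = {"dog": [1.0, 0.0], "cat": [0.9, 0.2], "car": [-1.0, 0.5], "tree": [0.0, -1.0]}
# A second toy model: the same layout rotated by 90 degrees and scaled by 3.
model_b = {name: [-3.0 * y, 3.0 * x] for name, (x, y) in model_a.items()}

score = representation_similarity(model_a, model_b)
print(f"similarity of distance patterns: {score:.3f}")
```

In this toy case the second model is a rotated and rescaled copy of the first, so the score is essentially 1 even though no individual vector matches.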

So researchers from the Machine Learning for Biomedicine Laboratory, part of EPFL’s School of Computer and Communication Sciences and School of Life Sciences, revisited the Platonic hypothesis and uncovered a more complex picture. “The original idea was extremely exciting,” explains Assistant Professor Maria Brbic, head of the lab and co-author of the study. “But we kept coming back to the same question: what did the similarity scores actually mean?”

The answer is in the maths

In their new paper, being presented at the 2026 International Conference on Machine Learning in July, Brbic and her colleagues argue that the apparent convergence of different models was misleading: the models looked more similar partly because of problems in the way the similarities were measured. As AI models get bigger and more complex, the similarity scores can naturally go up, even if the models are not actually learning the same underlying structure of the world. Their findings suggest that AI systems do not converge toward a single universal representation of reality in the way many researchers had begun to imagine.

Part of the problem lies in the strange mathematics of high-dimensional spaces. In everyday three-dimensional space, randomness behaves intuitively. In extremely high-dimensional spaces – the kind used by modern neural networks – distances can “concentrate,” meaning that many unrelated points can end up being almost equally distant from one another.
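The concentration effect is easy to reproduce. The following sketch assumes random Gaussian points as a stand-in for unrelated concepts (the function name, sample size and seed are illustrative choices only) and measures how much pairwise distances vary relative to their average in 3 and in 1,000 dimensions.

```python
import math
import random

def relative_spread(dim, n_points=60, seed=0):
    """(max - min) / mean of all pairwise distances between random Gaussian points."""
    rng = random.Random(seed)
    points = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_points)]
    dists = [
        math.dist(points[i], points[j])
        for i in range(n_points)
        for j in range(i + 1, n_points)
    ]
    mean = sum(dists) / len(dists)
    return (max(dists) - min(dists)) / mean

spread_3d = relative_spread(3)
spread_1000d = relative_spread(1000)
print(f"relative spread in 3 dimensions:    {spread_3d:.2f}")
print(f"relative spread in 1000 dimensions: {spread_1000d:.2f}")
```

In three dimensions the closest and farthest pairs differ widely; in 1,000 dimensions the spread shrinks to a small fraction of the average distance, so almost every pair looks about equally far apart.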

“We took ideas from high-dimensional geometry and used them to question the similarity metrics. Once we did that, a lot of things started to fall apart,” said Fabian Gröger, a PhD student at the University of Basel and lead author of the paper, who spent time as a visiting researcher at EPFL with Brbic. He worked on the problem with Shuo Wen, a PhD student in the MLBio lab.

“The simplest question we asked at the start was: if I take two random models that are completely independent, have never been trained, and have never seen data, why should a metric report non-zero similarity? That suggests similarity even when there is no shared learned structure,” said Brbic. “If two random models already appear similar, then the metric may be picking up a mathematical baseline rather than meaningful shared structure.”
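The effect Brbic describes can be illustrated with linear centered kernel alignment (CKA), one widely used similarity score. This is a sketch under stated assumptions rather than the paper's analysis: whether CKA is among the metrics the team examined is not stated here, random Gaussian feature matrices stand in for untrained models, and all names and sizes are hypothetical.

```python
import math
import random

def centered_gram(features):
    """Gram matrix of inner products after centering each feature column."""
    n, d = len(features), len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in centered] for ri in centered]

def linear_cka(feats_a, feats_b):
    """Linear CKA: normalized inner product of the two centered Gram matrices."""
    k, l = centered_gram(feats_a), centered_gram(feats_b)
    dot = sum(x * y for row_k, row_l in zip(k, l) for x, y in zip(row_k, row_l))
    norm_k = math.sqrt(sum(x * x for row in k for x in row))
    norm_l = math.sqrt(sum(x * x for row in l for x in row))
    return dot / (norm_k * norm_l)

def random_features(n_inputs, width, rng):
    """Independent Gaussian features: a stand-in for an untrained model."""
    return [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(n_inputs)]

rng = random.Random(0)
n_inputs = 40  # number of inputs shown to both "models"

# Two pairs of completely independent random representations.
narrow = linear_cka(random_features(n_inputs, 8, rng), random_features(n_inputs, 8, rng))
wide = linear_cka(random_features(n_inputs, 400, rng), random_features(n_inputs, 400, rng))
print(f"independent models, width 8:   CKA = {narrow:.2f}")
print(f"independent models, width 400: CKA = {wide:.2f}")
```

Neither pair shares any learned structure, yet both scores are above zero, and the wider pair scores far higher simply because it has many more dimensions than inputs.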

Yet the EPFL team did not conclude that convergence disappears entirely. Instead, they found that one particular type of similarity remains remarkably stable: local neighborhood relationships between concepts.

In practice, this means AI systems often learn that certain ideas belong together – cars cluster near other cars, animals near animals, and related concepts form stable neighborhoods – even if the larger geometry of the models differs significantly.

“What matters is not necessarily the absolute structure of the space, but who is near whom,” says Brbic.
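A simple way to measure "who is near whom" is to check how often two models agree on each concept's nearest neighbors. The sketch below is an illustration of that general idea, not the framework introduced in the paper; the concept names, coordinates and function names are invented for the example.

```python
import math

def nearest_neighbors(embeddings, k):
    """Map each concept to the set of its k nearest other concepts."""
    neighbors = {}
    for name, vec in embeddings.items():
        others = sorted(
            (other for other in embeddings if other != name),
            key=lambda other: math.dist(vec, embeddings[other]),
        )
        neighbors[name] = set(others[:k])
    return neighbors

def neighborhood_overlap(model_a, model_b, k=1):
    """Average fraction of k nearest neighbors that the two models share."""
    nn_a, nn_b = nearest_neighbors(model_a, k), nearest_neighbors(model_b, k)
    shared = sum(len(nn_a[concept] & nn_b[concept]) for concept in model_a)
    return shared / (k * len(model_a))

# Hypothetical model A: 2-D space with three tight clusters.
model_a = {
    "sedan": [0.0, 0.0], "truck": [0.1, 0.0],
    "dog": [5.0, 5.0], "cat": [5.0, 5.2],
    "oak": [-4.0, 3.0], "pine": [-4.1, 3.1],
}
# Hypothetical model B: 3-D space, clusters placed somewhere else entirely.
model_b = {
    "sedan": [9.0, 1.0, 0.0], "truck": [9.0, 1.3, 0.0],
    "dog": [0.0, 0.0, 7.0], "cat": [0.2, 0.0, 7.0],
    "oak": [1.0, -8.0, 0.0], "pine": [1.0, -8.0, 0.4],
}

overlap = neighborhood_overlap(model_a, model_b, k=1)
print(f"shared nearest neighbors: {overlap:.0%}")
```

The two toy models use different dimensions and arrange the clusters differently, yet every concept has the same nearest neighbor in both.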

Plato versus Aristotle

In response to this insight, the team proposed the Aristotelian Representation Hypothesis. While Plato emphasized universal ideal forms, his student Aristotle focused more on relationships, categories and context. The researchers argue that this relational view better describes how modern AI systems organize knowledge.

The distinction may sound abstract, but it carries important implications for the future of AI. The Platonic hypothesis had encouraged the idea that sufficiently advanced AI systems might naturally converge toward a shared and stable understanding of the world. For some, this raised hopes that advanced AI systems might become easier to compare, combine, or align. The EPFL findings complicate that picture.

“Our work suggests there can still be significant differences in how models represent the world globally,” says Gröger. “That matters for alignment, multimodal systems and ultimately how we understand what these models are actually learning.”

The study also introduces a new framework for evaluating representation similarity that corrects for the biases identified by the researchers. The team tested the approach across language, vision and video models, finding consistent evidence for local convergence rather than global alignment.

Pushing science forward

Importantly, the work has been received positively by the researchers behind the original Platonic hypothesis. Before publishing the paper, the EPFL team contacted the MIT authors to discuss the findings.

“The response was very constructive,” says Brbic. “This is how science progresses. You refine ideas, test assumptions, and gradually build a better understanding together.”

Far from closing the debate, the work opens new questions about what exactly AI systems share and what remains fundamentally different between them. “The next challenge is to understand precisely which local structures converge, and whether that knowledge could help us build more reliable and better aligned AI systems,” concluded Gröger.

Author: Tanya Petersen
Source: EPFL

You might also be interested in

Mice actively seek better views to make visual decisions

A study led by EPFL shows that when objects are difficult to see, mice don’t simply look harder. They move to find better viewpoints, adjusting their behavior according to how much visual information is available.

Nine ERC Advanced Grants awarded to EPFL researchers

The European Research Council (ERC) awarded nine “ERC Advanced Grants” to EPFL researchers. This prestigious funding scheme gives senior researchers the opportunity to pursue ambitious, curiosity-driven projects that could lead to major scientific breakthroughs.

EPFL researchers create an AI model that thinks like we do

An EPFL team has created a new Large Language Model that is structured similarly to a human brain, allowing users more control and moving away from “black box” AI.

When a standard Large Language Model (LLM) is confronted with a problem, it tries to solve it by matching it to similar information it has seen before, and then gives an answer based on those past patterns. But how it decides which information to use and what value it gives to different pieces of information can be somewhat inscrutable from the outside.

The LLM MiCRo (Mixture of Cognitive Reasoners) is architecturally divided into four specialized areas that act like different parts of the human brain, allowing users to have more control over how it approaches a question, and to better understand how it comes to its answers. The model, which was presented at the International Conference on Learning Representations, comes from the NLP Lab, part of the School of Computer and Communication Sciences (IC), and the NeuroAI Lab, part of IC and the School of Life Sciences at EPFL.

The four experts

To create MiCRo, researchers identified four regions of the brain specializing in different functions, which they call ‘experts’: language, logic, social reasoning, and world knowledge.

“The brain is organized into specialized regions, each tuned to handle a specific function. So far, we don’t see this division of labor as clearly in current language models,” says Badr AlKhamissi, a PhD candidate leading this research. “We picked four brain regions that neuroscientists know well and gave the model its own specialized modules, each one trained to be analogous to one of those brain regions.”

An LLM usually functions as a stack of layers that a problem or question can be processed through. In the case of MiCRo, each layer is divided into the four different experts. You give a sentence to the model starting at layer one, for example “The cat is asleep”. Then within this layer, the router can choose one expert for the first word “the”, but a different expert for the second word “cat” and so on, making it modular and highly adaptable.

“Each word of a sentence can go to different experts,” AlKhamissi explains. “So one sentence can actually be processed by multiple experts at each layer.”
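The per-word routing can be pictured with a small sketch. It is a toy illustration only: the token vectors, the router weights and the "pick the highest score" rule are assumptions made for the example, not details of MiCRo's actual router, and only a single layer is shown.

```python
# The four experts named in the article.
EXPERTS = ["language", "logic", "social", "world"]

def route(token_vector, router_weights):
    """Send a token to the expert whose router row scores it highest."""
    logits = [sum(w * x for w, x in zip(row, token_vector)) for row in router_weights]
    return EXPERTS[logits.index(max(logits))]

# Hypothetical 3-number features per token: [function word, entity, state].
tokens = {
    "The": [1.0, 0.0, 0.0],
    "cat": [0.0, 1.0, 0.0],
    "is": [1.0, 0.0, 0.0],
    "asleep": [0.0, 0.0, 1.0],
}

# Hypothetical router weights for one layer: one row per expert.
layer_router = [
    [1.0, 0.2, 0.6],  # language
    [0.1, 0.1, 0.1],  # logic
    [0.0, 0.3, 0.2],  # social
    [0.2, 1.0, 0.4],  # world
]

assignments = [route(vector, layer_router) for vector in tokens.values()]
for word, expert in zip(tokens, assignments):
    print(f"{word!r} -> {expert} expert")
```

With these made-up numbers, "The" goes to the language expert while "cat" goes to the world-knowledge expert. In a full model each layer would have its own router, so the same word could take a different path at every layer.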

Consider a prompt like: “Emma wants to split a CHF 60 dinner bill among three friends, but she knows that Jake lost his job last week and is too proud to say he’s struggling.” A purely mathematical module handles the arithmetic: CHF 60 divided by three is CHF 20 each. But the social reasoning module picks up on something subtler: Emma’s awareness of Jake’s situation, his unspoken pride, and the implicit suggestion that she might quietly cover his share. Both kinds of reasoning are needed to fully understand what’s going on, and in MiCRo, each aspect of the prompt is routed to the expert best equipped to handle it.

“When we see how the model works, we can see that it routes the words that relate to the social aspects to the social expert, and when it does the mathematical part, it routes those numbers to the logic expert.”

This separation makes it easier to see how the model is ‘thinking’ and why it makes certain decisions. It also means decisions can be steered – for example, you can decide to increase the impact of the social expert, or suppress the logic expert, depending on what kind of model you want to use in a certain situation.

“In traditional LLMs, you can do this via prompting by telling the model to make the output more social or make it more related to emotions,” AlKhamissi says. “But here, this is done by intervening in the architecture itself without doing any prompting.”
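One plausible way to picture such an intervention is to nudge the router's scores before an expert is chosen. The article does not describe how steering is implemented in MiCRo, so everything in this sketch is an assumption: the function names, the additive bias, its strength and the example scores are all hypothetical.

```python
import math

EXPERTS = ["language", "logic", "social", "world"]

def softmax(logits):
    """Turn raw router scores into weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def steer(logits, boost=None, suppress=None, strength=5.0):
    """Add or subtract a fixed bias for chosen experts, then renormalize."""
    adjusted = list(logits)
    for i, name in enumerate(EXPERTS):
        if name == boost:
            adjusted[i] += strength
        if name == suppress:
            adjusted[i] -= strength
    return dict(zip(EXPERTS, softmax(adjusted)))

# Hypothetical router scores for one token; logic wins by default.
token_logits = [0.5, 2.0, 1.0, 0.3]

default = steer(token_logits)
steered = steer(token_logits, boost="social", suppress="logic")
print("default top expert:", max(default, key=default.get))
print("steered top expert:", max(steered, key=steered.get))
```

No prompt text changes between the two calls; only the routing weights do, which is the spirit of the intervention described above.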

“A virtuous circle”

To create MiCRo, the EPFL team worked with Greta Tuckute, a neuroscientist from Harvard and MIT, to understand which parts of the human brain are activated by different problems, and then applied that learning to the model.

To identify the region analogous to the ‘logic’ expert in the brain, neuroscientists give humans demanding tasks, such as hard mathematical equations, and less demanding tasks, like easy mathematical equations, and then record their brain activity to find which brain regions are most active for the demanding tasks compared with the non-demanding ones. AlKhamissi’s team then did the same for the model, giving it demanding mathematical equations to see which experts would be most activated.

“The cool thing is we just used exactly what they do in neuroscience, but in the model. And the model was able to identify those experts on its own.”
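The contrast at the heart of this procedure, hard tasks minus easy tasks, can be written in a few lines. The usage figures below are invented purely for illustration and the function name is hypothetical; the sketch shows the logic of the comparison, not the team's actual measurements or code.

```python
EXPERTS = ["language", "logic", "social", "world"]

def localize(hard_runs, easy_runs):
    """Find the expert whose average usage rises most from easy to hard tasks."""
    def mean_usage(runs):
        return {e: sum(run[e] for run in runs) / len(runs) for e in EXPERTS}

    hard, easy = mean_usage(hard_runs), mean_usage(easy_runs)
    contrast = {e: hard[e] - easy[e] for e in EXPERTS}
    return max(contrast, key=contrast.get), contrast

# Made-up fractions of tokens routed to each expert on hard and easy equations.
hard_runs = [
    {"language": 0.3, "logic": 0.5, "social": 0.1, "world": 0.1},
    {"language": 0.2, "logic": 0.6, "social": 0.1, "world": 0.1},
]
easy_runs = [
    {"language": 0.5, "logic": 0.2, "social": 0.1, "world": 0.2},
    {"language": 0.4, "logic": 0.3, "social": 0.2, "world": 0.1},
]

top_expert, contrast = localize(hard_runs, easy_runs)
print("most engaged by demanding tasks:", top_expert)
```

With these invented numbers, the logic expert shows the largest increase on the demanding equations, which is the kind of signature the localizer is looking for.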

While neuroscience informs the model, the model also informs the understanding of the brain, potentially allowing neuroscientists to discover the contributions of different areas for a given problem or question: for example, that a certain sentence activates the language areas 20%, the mathematical areas 50%, and the social reasoning areas 40%.

“For my PhD work, I have been interested in this virtuous circle between neuroscience and AI. In one direction, we use findings and insights from neuroscience about the brain and integrate them into language models,” AlKhamissi says, “and now, with models like MiCRo, we can explore the other direction and ask how we can use AI models to help us understand the brain in a better way.”

Author: Stephanie Parker
Source: EPFL