New AI closes data gaps and shows how extreme weather emerges on Earth


Gaps in data and difficult‑to‑compare datasets limit what climate and weather AI models can reliably predict. Researchers from the ETH Domain have now introduced an AI model that helps close these gaps, reconstructs satellite images, and sheds light on how weather, land and water interact.


The impacts were severe: Within a very short time, tropical storm Doksuri intensified into a super typhoon in July 2023. Exceptionally strong winds tore roofs from houses along the coasts of China and the Philippines, trees were uprooted, and torrential rain flooded streets and residential areas. In many places, everyday life came to a temporary halt.

Extreme events such as Super Typhoon Doksuri are particularly difficult for weather and climate models to predict, as they arise from complex interactions between the atmosphere, the land surface and the water cycle.

Researchers from the ETH Domain have now introduced a new artificial intelligence (AI) model that has learned these interactions and feedback autonomously – without human guidance – and, compared to previous AI models, more precisely captures how air, land and water interact on Earth.

AI understands the key connections within the Earth system

The new Earth System Foundation Model (ESFM) does not treat atmospheric and hydrological (i.e. water-related) processes in isolation, but rather represents them as part of an interconnected Earth system.

“Previous AI weather models have often focussed primarily on the atmosphere. Our model, by contrast, deliberately links atmospheric weather data with hydrological and land-based data. On this basis, the AI identifies key patterns, trends and relationships within the Earth’s weather system and uses them to generate forecasts, even when important data is missing,” explains Fanny Lehmann, mathematician, ETH AI Center Postdoctoral Fellow, and member of the team that developed the new model.

“The true strength of our model lies in its ability to learn the interactions that are crucial for weather from different data sources. This allows ESFM to integrate very different and hard‑to‑compare data types and to analyse them jointly for the first time.”

The researchers tested their model using Super Typhoon Doksuri as a case study. This tropical storm was not part of the training data. Even so, ESFM predicted wind strength with remarkable accuracy over several days and, at the same time, realistically captured where the storm was, how quickly it moved, and how it expanded in space. This demonstrated how effectively the new model can jointly process very large, complex and heterogeneous datasets.

Learning from incomplete and heterogeneous data

The integrative approach of ESFM addresses a need in climate and environmental sciences. In research practice, data often varies considerably: some comes from satellite imagery, some from weather balloons, ground‑based stations, or other sensors. This data ranges from very fine-grained, short‑term measurements to large‑scale, long‑term observations.

Data types also differ markedly. While satellite imagery and climate models provide data in the form of large‑scale raster maps, ground stations or wells record key variables such as temperature, air pressure, wind speed or water levels at specific locations and at defined points in time.

To integrate these different types of environmental data, ESFM follows a multi‑stage approach: rather than forcing all data types into a single format from the outset, it initially treats them separately, depending on their type – whether satellite or station data – and tags them with information on when and where they were measured.

This approach enables the combination of very different data within a common spatial and temporal framework, while preserving its specific information. On this basis, the model learns the typical, recurring process chains and fundamental relationships within the Earth system.
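The idea of handling each data type separately and tagging it with time and place can be illustrated with a minimal Python sketch. It is not ESFM's actual code: the `Token` class, the `"grid"` and `"station"` labels, the variable name `t2m` and all values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One measurement plus the tags saying where and when it was taken."""
    source: str    # "grid" or "station" (hypothetical labels)
    variable: str
    value: float
    lat: float
    lon: float
    time: str      # ISO 8601 timestamp


def tokens_from_grid(variable, grid, lats, lons, time):
    """Turn a 2-D raster (list of rows) into one token per available cell."""
    return [
        Token("grid", variable, value, lats[i], lons[j], time)
        for i, row in enumerate(grid)
        for j, value in enumerate(row)
        if value is not None  # missing pixels are skipped, not invented
    ]


def tokens_from_stations(variable, records):
    """Turn point measurements (lat, lon, time, value) into tokens."""
    return [Token("station", variable, v, lat, lon, t) for lat, lon, t, v in records]


# Made-up temperature raster in kelvin with one missing pixel.
grid = [[281.5, None], [280.9, 282.3]]
grid_tokens = tokens_from_grid("t2m", grid, [47.0, 46.0], [8.0, 9.0], "2023-07-25T00:00Z")
station_tokens = tokens_from_stations("t2m", [(46.95, 7.45, "2023-07-25T00:10Z", 282.0)])

# Both sources now share one representation in a common space-time frame.
combined = sorted(grid_tokens + station_tokens, key=lambda t: (t.time, t.lat, t.lon))
print(len(combined), [t.source for t in combined])
```

Each source keeps its own conversion routine, and only the shared tags (location and time) are needed to place raster cells and station readings side by side.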

Maintaining performance even when data is missing

“Earlier AI models for weather forecasting – unlike ESFM – were often trained on a single type of data or on a few datasets of a similarly formatted nature,” explains Firat Ozdemir, lead developer of the ESFM team and Senior Data Scientist at the joint Swiss Data Science Center of ETH Zurich and EPFL. “Their performance often declines when working with highly heterogeneous or incomplete data. ESFM addresses this challenge by integrating multi-source data and filling data gaps much more efficiently.”

“ESFM is neither a classical climate model nor a weather forecasting or specialised storm‑warning model; rather, it belongs to a distinct category of models that can serve as a flexible foundation for a wide range of tasks in climate and weather research,” says Sebastian Schemm, atmospheric scientist and professor at the University of Cambridge, formerly at ETH Zurich.

“Its advantage lies in a kind of learned systemic understanding that enables it to produce plausible predictions in many cases, even when data is incomplete or patchy.”

Designed to bridge data gaps intelligently

Such data gaps significantly hindered previous AI models in analysing and predicting complex weather and water phenomena. Yet gaps are part of everyday research practice: it is not uncommon for individual measurements to be lost or compromised due to weather conditions or technical issues. Measurement networks, too, often contain gaps, as monitoring stations are unevenly distributed.

ESFM, by contrast, is specifically designed to cope with missing data and to internally reconstruct incomplete observations, such as patchy satellite images. After training, the model succeeds in generating forecasts from satellite observations in which only around 3 percent of the pixels are available.
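To give a sense of how sparse such an input is, the following Python sketch masks a synthetic image so that only 3 percent of its pixels remain. The function `mask_pixels`, the random masking scheme and the 100 x 100 test field are assumptions made for illustration. The article does not describe how the gaps in the real satellite data are distributed.

```python
import random


def mask_pixels(image, keep_fraction, seed=0):
    """Return a copy of `image` in which only `keep_fraction` of the pixels survive."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    n_keep = round(keep_fraction * height * width)
    kept = set(rng.sample(range(height * width), n_keep))
    return [
        [image[i][j] if i * width + j in kept else None for j in range(width)]
        for i in range(height)
    ]


# Synthetic 100 x 100 field standing in for a satellite image.
image = [[float(i + j) for j in range(100)] for i in range(100)]
sparse = mask_pixels(image, keep_fraction=0.03)

available = sum(value is not None for row in sparse for value in row)
print(available, "of", 100 * 100, "pixels available")
```

In this toy setting, 300 of 10,000 pixels remain, and a model would have to infer the other 9,700 from learned relationships.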

The researchers, including Benedikt Soja, Professor of Space Geodesy at ETH Zurich, showed that their model can reliably fill data gaps both in weather station data and in the long-term global ERA5 dataset. On this basis, it is able to generate plausible forecasts of weather conditions.

Building on many learned examples of how the atmosphere, land and water are interconnected, ESFM can plausibly complete patchy satellite images with information such as temperature, humidity, soil type, whether an area is land or sea, and topography.

The model systematically embeds this information within the processes that link, for example, rainfall, soil moisture and groundwater, thereby helping to improve the understanding of droughts and potentially making them easier to predict.

ESFM infers missing measurement data by relating data gaps to other available data sources and to patterns it has learned from similar situations in neighbouring regions, from related variables, and from past observations.

Learning is more than repetition

“Through training on very different types of data, models such as ESFM acquire a form of fundamental knowledge and can therefore flexibly solve a wide range of tasks. In AI research, they are referred to as foundation models,” says Torsten Hoefler, Professor of Computer Science at ETH Zurich, who also serves as Chief AI Architect at the Swiss National Supercomputing Centre (CSCS) in Lugano, where he oversees research on new AI approaches (see box).

Like all foundation models, ESFM can be used for a range of tasks and can also be adapted to specific applications through a process known as finetuning. The team’s research shows that ESFM applies fundamental physical principles consistently and reliably – even when addressing new physical or weather‑related questions or working with variables for which it was not explicitly trained.

In the future, ESFM or specially finetuned versions of it have the potential to provide reliable forecasts of weather and water processes. “We intend to leverage the model’s representational power across diverse domains such as agriculture, biodiversity and hydrology,” says Mathieu Salzmann, Senior Scientist at EPFL and Deputy Chief Data Scientist at the Swiss Data Science Center (SDSC).

The Earth System Foundation Model fills in missing data in satellite images, shown here using data from the MODIS Earth observation sensor, which provides global observations. On the left, an incomplete satellite image is shown. On the right, the satellite image generated by ESFM delivers complete global coverage. (Images: Firat Ozdemir / SDSC)
In addition to satellite images, the Earth System Foundation Model can also fill in and forecast missing point‑based measurements from ground stations, such as temperature or wind observations. (Images: Firat Ozdemir / SDSC)

Swiss AI Initiative, ICAIN and download
ESFM was developed within the Weather and Climate Foundation Models project, which is part of the Swiss AI Initiative. The project also includes ETH Zurich mathematics professor Siddhartha Mishra. Within this framework, researchers from ETH Zurich, EPFL and other partners are developing foundation models addressing key challenges in Switzerland.
ESFM is also supported by the International Computation and AI Network (ICAIN) at ETH Zurich. The network promotes international AI collaboration and works to ensure that such foundation models can be used in the Global South. Within the ESFM project, ICAIN helps identify partners in data‑sparse regions to enable finetuning of the model with local data.
ESFM is freely available on the AI platform Hugging Face and in the Git repository.

Reference
Ozdemir, F., Cheng, Y., Mohebi, S., Lehmann, F., Adamov, S., Trentini, L., Huang, L., Lingsch, L., Zhang, Z., Fuhrer, O., Soja, B., Mishra, S., Hoefler, T., Schemm, S., and Salzmann, M.: ESFM – A foundation model framework for heterogeneous data integration. EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18011. DOI: 10.5194/egusphere-egu26-18011.

Earth System Foundation Model (ESFM): A Unified Framework for Heterogeneous Data Integration and Forecasting: https://arxiv.org/abs/2605.00850

Finetuning a Weather Foundation Model with Lightweight Decoders for Unseen Physical Processes: https://arxiv.org/abs/2506.19088

Article by: Florian Meyer, ETH Zurich


You might also be interested in

EPFL researchers create an AI model that thinks like we do

An EPFL team has created a new Large Language Model that is structured similarly to a human brain, allowing users more control and moving away from “black box” AI.

When a standard Large Language Model (LLM) is confronted with a problem, it tries to solve it by matching it to similar information it has seen before, and then gives an answer based on those past patterns. But how it decides which information to use and what value it gives to different pieces of information can be somewhat inscrutable from the outside.

The LLM MiCRo (Mixture of Cognitive Reasoners) is architecturally divided into four specialized areas that act like different parts of the human brain, allowing users to have more control over how it approaches a question, and to better understand how it comes to its answers. The model, which was presented at the International Conference on Learning Representations, comes from the NLP Lab, part of the School of Computer and Communication Sciences (IC), and the NeuroAI Lab, part of IC and the School of Life Sciences at EPFL.

The four experts

To create MiCRo, researchers identified four regions of the brain specializing in different functions, which they call ‘experts’: language, logic, social reasoning, and world knowledge.

“The brain is organized into specialized regions, each tuned to handle a specific function. So far, we don’t see this division of labor as clearly in current language models,” says Badr AlKhamissi, a PhD candidate leading this research. “We picked four brain regions that neuroscientists know well and gave the model its own specialized modules, each one trained to be analogous to one of those brain regions.”

An LLM usually functions as a stack of layers that a problem or question can be processed through. In the case of MiCRo, each layer is divided into the four different experts. You give a sentence to the model starting at layer one, for example “The cat is asleep”. Then within this layer, the router can choose one expert for the first word “the”, but a different expert for the second word “cat” and so on, making it modular and highly adaptable.

“Each word of a sentence can go to different experts,” AlKhamissi explains. “So one sentence can actually be processed by multiple experts at each layer.”
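The per-word routing described here can be sketched in a few lines of Python. This is a toy illustration and not MiCRo's implementation: the scores in `TOY_SCORES` are invented, and a trained router would compute them from each word's internal representation at every layer. Which expert a real model picks for these words is not stated in the article.

```python
EXPERTS = ["language", "logic", "social", "world"]

# Hypothetical router scores per word, in the order of EXPERTS.
TOY_SCORES = {
    "the":    [2.0, 0.1, 0.2, 0.3],
    "cat":    [0.5, 0.1, 0.3, 1.8],
    "is":     [1.7, 0.4, 0.2, 0.3],
    "asleep": [0.9, 0.2, 0.4, 1.1],
}


def route(words, scores=TOY_SCORES):
    """Send each word to the single expert with the highest score."""
    assignment = []
    for word in words:
        word_scores = scores[word.lower()]
        best = max(range(len(EXPERTS)), key=lambda k: word_scores[k])
        assignment.append((word, EXPERTS[best]))
    return assignment


routing = route("The cat is asleep".split())
print(routing)
```

With these made-up scores, the four words of one sentence are split between two experts within a single layer, which is the modularity the quote refers to.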

Consider a prompt like: “Emma wants to split a CHF 60 dinner bill among three friends, but she knows that Jake lost his job last week and is too proud to say he’s struggling.” A purely mathematical module handles the arithmetic: CHF 60 divided by three is CHF 20 each. But the social reasoning module picks up on something subtler: Emma’s awareness of Jake’s situation, his unspoken pride, and the implicit suggestion that she might quietly cover his share. Both kinds of reasoning are needed to fully understand what’s going on, and in MiCRo, each aspect of the prompt is routed to the expert best equipped to handle it.

“When we see how the model works, we can see that it routes the words that relate to the social aspects to the social expert, and when it does the mathematical part, it routes those numbers to the logic expert.”

This separation makes it easier to see how the model is ‘thinking’ and why it makes certain decisions. It also means decisions can be steered – for example, you can decide to increase the impact of the social expert, or suppress the logic expert, depending on what kind of model you want to use in a certain situation.

“In traditional LLMs, you can do this via prompting by telling the model to make the output more social or make it more related to emotions,” AlKhamissi says. “But here, this is done by intervening in the architecture itself without doing any prompting.”
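One way to picture such an intervention is to adjust the router's scores before it picks an expert. The Python sketch below is an assumption about how steering could work in principle. The article does not specify the mechanism, and the function `pick_expert`, its `boost` and `suppress` parameters and the scores are hypothetical.

```python
EXPERTS = ["language", "logic", "social", "world"]


def pick_expert(scores, boost=None, suppress=()):
    """Pick the top-scoring expert after optional boosting or suppression."""
    adjusted = list(scores)
    for k, name in enumerate(EXPERTS):
        if boost and name in boost:
            adjusted[k] += boost[name]      # raise this expert's influence
        if name in suppress:
            adjusted[k] = float("-inf")     # this expert can no longer be chosen
    return EXPERTS[max(range(len(EXPERTS)), key=adjusted.__getitem__)]


# Hypothetical router scores for one word, in the order of EXPERTS.
scores = [0.3, 1.5, 1.2, 0.4]

default_choice = pick_expert(scores)
boosted_choice = pick_expert(scores, boost={"social": 0.5})
suppressed_choice = pick_expert(scores, suppress={"logic"})
print(default_choice, boosted_choice, suppressed_choice)
```

The prompt stays the same in all three calls. Only the routing changes, which is the contrast with prompt-based steering drawn in the quote.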

“A virtuous circle”

To create MiCRo, the EPFL team worked with Greta Tuckute, a neuroscientist from Harvard and MIT, to understand which parts of the human brain are activated by different problems, and then applied that learning to the model.

To identify the region analogous to the ‘logic’ expert in the brain, neuroscientists give humans demanding tasks, such as hard mathematical equations, and less demanding tasks, like easy mathematical equations, and then record their brain activity to find which brain regions are most active for the demanding tasks compared with the non-demanding ones. AlKhamissi’s team then did the same for the model, giving it demanding mathematical equations to see which experts would be most activated.

“The cool thing is we just used exactly what they do in neuroscience, but in the model. And the model was able to identify those experts on its own.”
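The contrast between demanding and non-demanding tasks can be expressed as a simple difference of average activations. The following Python sketch uses invented activation values. How the team actually measured expert activation is not described in the article, so the data layout and the `contrast` function are assumptions.

```python
from statistics import mean

EXPERTS = ["language", "logic", "social", "world"]

# Hypothetical activation of each expert (order of EXPERTS) on three hard
# and three easy mathematical prompts.
hard = [[0.2, 0.9, 0.1, 0.3], [0.3, 0.8, 0.2, 0.2], [0.2, 0.7, 0.1, 0.4]]
easy = [[0.3, 0.3, 0.1, 0.3], [0.2, 0.4, 0.2, 0.3], [0.3, 0.2, 0.1, 0.4]]


def contrast(hard_runs, easy_runs):
    """Mean activation on hard prompts minus mean activation on easy prompts."""
    return {
        name: mean(run[k] for run in hard_runs) - mean(run[k] for run in easy_runs)
        for k, name in enumerate(EXPERTS)
    }


diffs = contrast(hard, easy)
most_selective = max(diffs, key=diffs.get)
print(most_selective, round(diffs[most_selective], 2))
```

The expert with the largest positive difference is the one that responds most strongly to task difficulty, mirroring how the analogous brain region is localized.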

While neuroscience informs the model, the model also informs the understanding of the brain, potentially allowing neuroscientists to discover the contributions of different areas for a given problem or question; for example that a certain sentence activates the language areas 20%, the mathematical areas 50%, and the social reasoning areas 40%.

“For my PhD work, I have been interested in this virtuous circle between neuroscience and AI. In one direction, we use findings and insights from neuroscience about the brain and integrate them into language models,” AlKhamissi says, “and now, with models like MiCRo, we can explore the other direction and ask how we can use AI models to help us understand the brain in a better way.”

Author: Stephanie Parker
Source: EPFL

Smarter waste sorting with AI

EPFL startup WasteFlow has developed an AI-powered copilot that identifies and measures waste streams, helping sorting facilities work more efficiently. Support from several EPFL entrepreneurship programs helped the company accelerate the development of its technology.


EPFL launches the world’s first fully open medical LLMs

MeditronFO is the first fully open framework for building medical large language models, to make AI in healthcare more transparent and accountable.
