Joint initiative for trustworthy AI

ETH Zurich and EPFL are launching the “Swiss AI Initiative”, whose purpose is to position Switzerland as a leading global hub for the development and implementation of transparent and reliable artificial intelligence (AI). The new Alps supercomputer based at the Swiss National Supercomputing Centre (CSCS) provides the supporting world-class infrastructure.

In February 2024, the new Alps supercomputer goes live at ETH Zurich’s Swiss National Supercomputing Centre (CSCS) in Lugano. Boasting 10,000 next-generation graphics processing units (GPUs), Alps is one of the world’s most powerful computers and has been specially developed to meet the needs of artificial intelligence applications. This new computer gives Swiss scientists access to the sort of computing power otherwise available only to the world’s biggest tech companies.

Technological edge to protect Switzerland’s digital sovereignty

The new supercomputer therefore gives Switzerland a significant competitive advantage over international rivals. This is because the infrastructure for supercomputing is in short supply worldwide due to the rapid development of generative AI and – where available – is mostly owned by a handful of large multinationals. “Through this joint initiative we want to exploit our advantage as a location and make Switzerland’s expertise in artificial intelligence transferable to society as a whole,” explains Christian Wolfrum, ETH Zurich Vice President for Research. “Science must assume a pioneering role in such a forward-looking field, rather than leaving it to a few multinational corporations. Only in this way can we guarantee independent research and Switzerland’s digital sovereignty.”

Transparency and Open Source

The aim of the initiative is to develop and train new large language models (LLMs). These must be transparent, deliver comprehensible results and ensure that legal, ethical and scientific criteria are met. “Unlike the large language models that are usually available in the public domain today, the Swiss AI Initiative strongly emphasises transparency and Open Source. Everyone must be able to understand how the models were trained, the sort of data used, and how results are recovered,” stresses Jan Hesthaven, Provost and Vice President for Academic Affairs at EPFL.

To develop such models, the Swiss AI Initiative will use ten million GPU hours on the new Alps computer over the next 12 months, equivalent to the computing power of a single GPU running at full load for over 1,100 years. With Alps, Switzerland is the first country in the world to operate a research infrastructure based on the next-generation NVIDIA Grace Hopper Superchip.
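The 1,100-year figure can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a 365-day year, and the second function additionally assumes that the ten million GPU hours are spread evenly over the 12 months and measured against the 10,000 GPUs mentioned earlier, neither of which the article states. The function names are made up for this example.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored (assumption)


def single_gpu_years(gpu_hours: float) -> float:
    """Years one GPU would need to deliver the given number of GPU hours."""
    return gpu_hours / HOURS_PER_YEAR


def average_share_of_machine(gpu_hours: float, total_gpus: int, months: int = 12) -> float:
    """Average fraction of the machine kept busy, assuming an even spread."""
    hours_in_period = HOURS_PER_YEAR * months / 12
    return gpu_hours / hours_in_period / total_gpus


years = single_gpu_years(10_000_000)
share = average_share_of_machine(10_000_000, total_gpus=10_000)
print(f"{years:.0f} single-GPU years")
print(f"{share:.1%} of 10,000 GPUs on average")
```

Under these assumptions the result is a little over 1,140 single-GPU years, consistent with “over 1,100 years”, and corresponds to roughly a ninth of a 10,000-GPU machine kept busy around the clock for a year.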

Swiss AI Initiative already up and running

This additional computing capacity will be used to develop new, industry-specific AI base models for use in different areas such as robotics, medicine, climate sciences or diagnostics. The initiative will also explore fundamental questions in the development and use of LLMs, such as: What form will future interaction between humans and AI take? What is the appropriate ethical framework? How do we manage security and data privacy? What new approaches can be used to scale up models and make them more energy efficient?

AI for industry and public administration

The Swiss AI Initiative has set itself the goal of bringing together science, industry and politics to collaborate in shaping and driving forward the development and use of artificial intelligence in Switzerland. Existing partnerships with companies, hospitals and public-sector bodies will be expanded further. Swisscom’s CTO Gerd Niehage comments: “Here at Swisscom we welcome the Swiss AI Initiative, especially as we are convinced this will be an important building block in Switzerland’s digital future. It accelerates the digital transformation and creates new capabilities our country needs to play a dominant role in the area of generative artificial intelligence. For Swisscom, AI solutions like the Swiss AI Initiative are a key element of innovative digital solutions that our customers can trust.”

The software infrastructure, accumulated expertise and base models developed in Switzerland should be transferable as openly and directly as possible to society and industry. To remain competitive, SMEs will also have to rely increasingly on the use of AI in future. Like public services, they will be able to benefit directly from the open Swiss AI Initiative. On top of that, the Swiss AI Initiative is developing a programme for supporting start-ups in the area of artificial intelligence.

Networking researchers all over Switzerland

ETH Zurich and EPFL operate their own AI centres that will work closely together in future, along with the Swiss Data Science Center, to conduct world-class interdisciplinary AI research. This initiative aims to pool the specialist knowledge of around a dozen Swiss universities, technical universities and research institutes. Over the past few months, more than 75 professors from all over Switzerland have signed up to the initiative. International researchers have also been invited to work together on the development of multilingual, cross-border open source LLMs. ETH Zurich and EPFL are already members of ELLIS, the European network of AI excellence, which includes some 40 AI hot spots in Europe.

Author: ETH Zurich / EPFL

Source: EPFL

You might also be interested in

Mice actively seek better views to make visual decisions

A study led by EPFL shows that when objects are difficult to see, mice don’t simply look harder. They move to find better viewpoints, adjusting their behavior according to how much visual information is available.

Nine ERC Advanced Grants awarded to EPFL researchers

The European Research Council (ERC) awarded nine “ERC Advanced Grants” to EPFL researchers. This prestigious funding scheme gives senior researchers the opportunity to pursue ambitious, curiosity-driven projects that could lead to major scientific breakthroughs.

EPFL researchers create an AI model that thinks like we do

An EPFL team has created a new Large Language Model that is structured similarly to a human brain, allowing users more control and moving away from “black box” AI.

When a standard Large Language Model (LLM) is confronted with a problem, it tries to solve it by matching it to similar information it has seen before, and then gives an answer based on those past patterns. But how it decides which information to use and what value it gives to different pieces of information can be somewhat inscrutable from the outside.

The LLM MiCRo (Mixture of Cognitive Reasoners) is architecturally divided into four specialized areas that act like different parts of the human brain, allowing users to have more control over how it approaches a question, and to better understand how it comes to its answers. The model, which was presented at the International Conference on Learning Representations, comes from the NLP Lab, part of the School of Computer and Communication Sciences (IC), and the NeuroAI Lab, part of IC and the School of Life Sciences at EPFL.

The four experts

To create MiCRo, researchers identified four regions of the brain specializing in different functions, which they call ‘experts’: language, logic, social reasoning, and world knowledge.

“The brain is organized into specialized regions, each tuned to handle a specific function. So far, we don’t see this division of labor as clearly in current language models,” says Badr AlKhamissi, a PhD candidate leading this research. “We picked four brain regions that neuroscientists know well and gave the model its own specialized modules, each one trained to be analogous to one of those brain regions.”

An LLM usually functions as a stack of layers that a problem or question is processed through. In the case of MiCRo, each layer is divided into the four different experts. You give a sentence to the model starting at layer one, for example “The cat is asleep”. Then within this layer, the router can choose one expert for the first word “the”, but a different expert for the second word “cat” and so on, making it modular and highly adaptable.

“Each word of a sentence can go to different experts,” AlKhamissi explains. “So one sentence can actually be processed by multiple experts at each layer.”
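The per-word routing described above can be pictured with a toy example. This is a minimal sketch, not MiCRo’s actual code: the scores are invented for illustration, the lookup table stands in for a learned router, and the names `TOY_SCORES`, `pick_expert` and `route_sentence` are hypothetical.

```python
EXPERTS = ["language", "logic", "social", "world"]

# Hypothetical router scores for a single layer: one score per expert, per word.
# In a trained model these would come from a learned router, not a lookup table.
TOY_SCORES = {
    "The":    [2.0, 0.1, 0.3, 0.5],
    "cat":    [0.8, 0.2, 0.4, 1.9],
    "is":     [1.7, 0.6, 0.2, 0.3],
    "asleep": [0.9, 0.1, 0.5, 1.4],
}


def pick_expert(scores):
    """Return the name of the highest-scoring expert (one expert per word)."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return EXPERTS[best]


def route_sentence(words, scores_by_word=TOY_SCORES):
    """Assign every word of the sentence to one expert within this layer."""
    return [(word, pick_expert(scores_by_word[word])) for word in words]


for word, expert in route_sentence("The cat is asleep".split()):
    print(f"{word:>6} -> {expert}")
```

With these made-up scores, “The” and “is” go to the language expert while “cat” and “asleep” go to the world-knowledge expert, so a single sentence is handled by more than one expert in the same layer.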

Consider a prompt like: “Emma wants to split a CHF 60 dinner bill among three friends, but she knows that Jake lost his job last week and is too proud to say he’s struggling.” A purely mathematical module handles the arithmetic: CHF 60 divided by three is CHF 20 each. But the social reasoning module picks up on something subtler: Emma’s awareness of Jake’s situation, his unspoken pride, and the implicit suggestion that she might quietly cover his share. Both kinds of reasoning are needed to fully understand what’s going on, and in MiCRo, each aspect of the prompt is routed to the expert best equipped to handle it.

“When we see how the model works, we can see that it routes the words that relate to the social aspects to the social expert, and when it does the mathematical part, it routes those numbers to the logic expert.”

This separation makes it easier to see how the model is ‘thinking’ and why it makes certain decisions. It also means decisions can be steered – for example, you can decide to increase the impact of the social expert, or suppress the logic expert, depending on what kind of model you want to use in a certain situation.

“In traditional LLMs, you can do this via prompting by telling the model to make the output more social or make it more related to emotions,” AlKhamissi says. “But here, this is done by intervening in the architecture itself without doing any prompting.”
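One way to picture this kind of steering is to adjust the router’s scores before an expert is chosen. The article does not describe the exact mechanism, so the sketch below is an assumption: it adds a bonus to one expert’s score or masks another expert out entirely, then renormalizes with a softmax. The function names, the `strength` parameter and the example scores are all hypothetical.

```python
import math

EXPERTS = ["language", "logic", "social", "world"]


def softmax(logits):
    """Turn raw router scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def steer(logits, boost=None, suppress=None, strength=2.0):
    """Return routing probabilities after boosting or suppressing an expert."""
    adjusted = list(logits)
    if boost is not None:
        adjusted[EXPERTS.index(boost)] += strength  # favour this expert
    if suppress is not None:
        adjusted[EXPERTS.index(suppress)] = float("-inf")  # never pick this expert
    return softmax(adjusted)


def top_expert(probs):
    """Name of the expert with the highest routing probability."""
    return EXPERTS[max(range(len(probs)), key=probs.__getitem__)]


scores = [0.5, 1.5, 1.0, 0.2]  # invented scores for one word
print(top_expert(steer(scores)))
print(top_expert(steer(scores, boost="social")))
print(top_expert(steer(scores, suppress="logic")))
```

With these invented scores the word goes to the logic expert by default, and to the social expert once that expert is boosted or the logic expert is suppressed. No prompt text changes in any of the three cases.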

“A virtuous circle”

To create MiCRo, the EPFL team worked with Greta Tuckute, a neuroscientist from Harvard and MIT, to understand which parts of the human brain are activated by different problems, and then applied that learning to the model.

To identify the region analogous to the ‘logic’ expert in the brain, neuroscientists give humans demanding tasks, such as hard mathematical equations, and less demanding tasks, like easy mathematical equations, and then record their brain activity to find which brain regions are more active during the demanding tasks than during the non-demanding ones. AlKhamissi’s team then did the same for the model, giving it demanding mathematical equations to see which experts would be most activated.

“The cool thing is we just used exactly what they do in neuroscience, but in the model. And the model was able to identify those experts on its own.”
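The hard-versus-easy comparison described above amounts to a contrast between two conditions. The sketch below illustrates the idea under stated assumptions: the activation values are invented, a simple difference of means stands in for whatever statistic the researchers actually used, and the `localize` function is hypothetical.

```python
from statistics import mean


def localize(hard, easy):
    """Find the unit whose activity rises most from easy to hard tasks.

    hard, easy: dicts mapping a unit name to a list of activations, one per trial.
    Returns the best unit and the full table of contrasts.
    """
    contrast = {unit: mean(hard[unit]) - mean(easy[unit]) for unit in hard}
    best = max(contrast, key=contrast.get)
    return best, contrast


# Invented activations for three hard and three easy equations.
hard_trials = {
    "language": [0.30, 0.35, 0.32],
    "logic":    [0.80, 0.85, 0.90],
    "social":   [0.20, 0.22, 0.18],
    "world":    [0.40, 0.38, 0.42],
}
easy_trials = {
    "language": [0.28, 0.33, 0.30],
    "logic":    [0.40, 0.45, 0.35],
    "social":   [0.21, 0.20, 0.19],
    "world":    [0.35, 0.37, 0.36],
}

best_unit, contrasts = localize(hard_trials, easy_trials)
print(best_unit)
for unit, value in contrasts.items():
    print(f"{unit:>8}: {value:+.2f}")
```

In this made-up data the logic expert shows by far the largest increase on the demanding equations, which is the pattern the same procedure looks for in recordings of brain activity.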

While neuroscience informs the model, the model also informs the understanding of the brain, potentially allowing neuroscientists to discover the contributions of different areas for a given problem or question: for example, that a certain sentence activates the language areas 20%, the mathematical areas 50%, and the social reasoning areas 40%.

“For my PhD work, I have been interested in this virtuous circle between neuroscience and AI. In one direction, we use findings and insights from neuroscience about the brain and integrate them into language models,” AlKhamissi says, “and now, with models like MiCRo, we can explore the other direction and ask how we can use AI models to help us understand the brain in a better way.”

Author: Stephanie Parker
Source: EPFL