The limits of AI in materials science

Researchers at Friedrich Schiller University Jena reveal strengths and weaknesses of language-image models in scientific tasks

14-Aug-2025
Nicole Nerger/Universität Jena

Dr. Kevin Jablonka, junior research group leader at the Institute of Organic Chemistry and Macromolecular Chemistry at the University of Jena

Current AI-based language-image models perceive content very well, but reach their limits when more complex scientific reasoning is required. This is shown by a recent study conducted by researchers at Friedrich Schiller University Jena in collaboration with international partners. In this work, the researchers systematically investigated, for the first time, how well modern AI models can process visual and textual information in chemistry and materials science.

Innovative evaluation method for AI

"Our study solves a problem in AI research: how can you evaluate multimodal systems fairly if it is unclear what data the models have already seen during training?" says Dr. Kevin Maik Jablonka, head of a Carl Zeiss Foundation junior research group at Friedrich Schiller University Jena and the Helmholtz Institute for Polymers in Energy Applications (HIPOLE) Jena, explaining the methodological innovation. The newly developed evaluation procedure makes it possible, for the first time, to systematically analyze the strengths and weaknesses of current AI systems in scientific applications.

"Multimodal AI systems that can understand both text and images are seen as the future of scientific assistance systems," explains Jablonka. "We wanted to find out whether these models really have the potential to support researchers in their daily work, from evaluating the literature to analyzing data."

More than a thousand tasks from everyday scientific life

To test the capabilities of multimodal AI, the international team developed the "MaCBench" benchmark (https://macbench.lamalab.org), which comprises more than 1,100 realistic tasks from three central areas of scientific work: extracting data from the literature, understanding laboratory and simulation experiments, and interpreting measurement results. The tasks ranged from analyzing spectroscopy data to evaluating laboratory safety and interpreting crystal structures.

The team examined leading AI models for their ability to understand and link scientific information. "In contrast to pure text models, these systems must be able to process visual and textual information simultaneously, which is a core capability for scientific work," explains Jablonka.

Success with simple tasks, weaknesses with complex thinking

The results of the study paint a nuanced picture: while the AI models reliably recognized laboratory equipment and extracted standardized data almost without error, they showed fundamental weaknesses in spatial analysis and in linking different sources of information. "It was particularly striking that the models processed the same information significantly better when it was presented as text rather than as an image," reports Jablonka. "This indicates that the integration of different types of data is not yet working optimally."

Another striking finding was that the models' performance correlated strongly with how frequently the test materials appear on the internet. "This suggests that the models partly rely on pattern recognition from training data instead of developing real scientific understanding," says the researcher.

Foundations for better AI assistance systems

The findings could benefit the development of future scientific AI assistants: "Before these systems can be used reliably in research, their spatial perception and their ability to link different types of information must be fundamentally improved," summarizes Jablonka. "Our work shows concrete ways in which these challenges can be tackled and AI tools for the natural sciences can be improved."

