AI learns the language of chemistry to predict how to make medicines


TheDigitalArtist, CC0

Machine learning algorithms can have a better understanding of chemistry (symbolic image)

University of Cambridge researchers have shown that an algorithm can predict the outcomes of complex chemical reactions with over 90% accuracy, outperforming trained chemists. The algorithm also shows chemists how to make target compounds, providing the chemical 'map' to the desired destination.

A central challenge in drug discovery and materials science is finding ways to make complicated organic molecules by chemically joining together simpler building blocks. The problem is that those building blocks often react in unexpected ways.

"Making molecules is often described as an art realised with trial-and-error experimentation because our understanding of chemical reactivity is far from complete," said Dr Alpha Lee from Cambridge's Cavendish Laboratory, who led the studies. "Machine learning algorithms can have a better understanding of chemistry because they distil patterns of reactivity from millions of published chemical reactions, something that a chemist cannot do."

The algorithm developed by Lee and his group uses pattern-recognition tools to learn how chemical groups in molecules react. It is trained on millions of reactions published in patents.

The researchers looked at chemical reaction prediction as a machine translation problem. The reacting molecules are considered as one 'language,' while the product is considered as a different language. The model then uses the patterns in the text to learn how to 'translate' between the two languages.
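To make the translation framing concrete, here is a minimal sketch of how a reaction could be turned into a pair of token sequences. It assumes molecules are written as SMILES strings, a common text notation for chemical structures; the article does not say which representation the researchers used, and the token pattern and the `tokenize` helper are illustrative rather than the group's own code.

```python
import re

# Illustrative token pattern (an assumption, not the researchers' tokenizer):
# bracketed atoms such as [NH4+], the two-letter halogens Br and Cl,
# two-digit ring closures such as %12, then any remaining single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into the 'words' a translation model would read."""
    return SMILES_TOKEN.findall(smiles)


# Hypothetical training pair: acetic acid and ethanol give ethyl acetate.
# The '.' separates the two reacting molecules in the source 'sentence'.
source_tokens = tokenize("CC(=O)O.OCC")   # the reactant 'language'
target_tokens = tokenize("CC(=O)OCC")     # the product 'language'
```

A sequence-to-sequence model would then be trained to map `source_tokens` to `target_tokens`, in the same way a translation model maps a sentence in one language to a sentence in another.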

Using this approach, the model achieves 90% accuracy in predicting the correct product of unseen chemical reactions, whereas the accuracy of trained human chemists is around 80%. The researchers say that the model is accurate enough to detect errors in the data and correctly predict a plethora of difficult reactions.

The model also knows what it doesn't know. It produces an uncertainty score, which eliminates incorrect predictions with 89% accuracy. As experiments are time-consuming, accurate prediction is crucial to avoid pursuing expensive experimental pathways that eventually end in failure.
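As a rough illustration of how such a score could be used, the sketch below filters predictions by a confidence value. It assumes the score is the product of the probabilities the model assigned to each output token, which is one common choice for sequence models; the article does not describe how the score is actually computed, and the function names, the example probabilities, and the 0.8 threshold are all hypothetical.

```python
import math


def sequence_confidence(token_probs):
    """Product of per-token probabilities, summed in log space for stability.

    Assumes every probability is strictly greater than zero.
    """
    return math.exp(sum(math.log(p) for p in token_probs))


def keep_confident(predictions, threshold=0.8):
    """Keep only predicted products whose confidence reaches the threshold.

    `predictions` is a list of (product, token_probs) pairs.
    """
    return [
        product
        for product, token_probs in predictions
        if sequence_confidence(token_probs) >= threshold
    ]


# Made-up example values, for illustration only.
predictions = [
    ("CC(=O)OCC", [0.99, 0.98, 0.99]),  # confident prediction
    ("CCO", [0.9, 0.5, 0.7]),           # uncertain prediction
]
confident = keep_confident(predictions)
```

In practice the threshold would be tuned on held-out reactions so that the discarded predictions are mostly the incorrect ones.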

In a second study, Lee and his group, collaborating with the biopharmaceutical company Pfizer, demonstrated the practical potential of the method in drug discovery.

The researchers showed that, when trained on published chemistry research, the model can accurately predict reactions drawn from lab notebooks, indicating that it has learned the rules of chemistry and can apply them in drug discovery settings.

The team also showed that the model can predict sequences of reactions that would lead to a desired product. They applied this methodology to diverse drug-like molecules, showing that the steps it predicts are chemically reasonable. This technology can significantly reduce the time needed for preclinical drug discovery because it gives medicinal chemists a blueprint of where to begin.
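The idea of chaining single reactions into a route can be sketched as a backward search from the target. This is a toy under stated assumptions, not the researchers' method: `one_step` stands in for any model that proposes precursors for a molecule, `in_stock` is a hypothetical set of purchasable starting materials, and the molecule names are placeholders.

```python
def plan_route(target, one_step, in_stock, max_depth=5):
    """Return reaction steps ordered from starting materials to the target.

    Each step is a (product, precursors) pair. Returns None if no route
    is found within `max_depth` steps.
    """
    if target in in_stock:
        return []                    # nothing to make
    if max_depth == 0:
        return None                  # give up on overly long routes
    precursors = one_step(target)    # hypothetical one-step suggestion
    if precursors is None:
        return None
    route = []
    for precursor in precursors:
        sub_route = plan_route(precursor, one_step, in_stock, max_depth - 1)
        if sub_route is None:
            return None
        route.extend(sub_route)
    route.append((target, precursors))
    return route


# Placeholder lookup table standing in for a trained model.
TOY_SUGGESTIONS = {
    "target": ("intermediate", "C"),
    "intermediate": ("A", "B"),
}
route = plan_route("target", TOY_SUGGESTIONS.get, in_stock={"A", "B", "C"})
```

A real planner would consider several candidate disconnections per molecule and rank them, for example by the model's confidence in each step; this sketch follows only one suggestion per molecule.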

"Our platform is like a GPS for chemistry," said Lee, who is also a Research Fellow at St Catharine's College. "It informs chemists whether a reaction is a go or a no-go, and how to navigate reaction routes to make a new molecule."

The Cambridge researchers are currently using this reaction prediction technology to develop a complete platform that bridges the design-make-test cycle in drug discovery and materials discovery: predicting promising bioactive molecules, ways to make those complex organic molecules, and selecting the experiments that are the most informative. The researchers are now working on extracting chemical insights from the model, attempting to understand what it has learned that humans have not.

"We can potentially make a lot of progress in chemistry if we learn what kinds of patterns the model is looking at to make a prediction," said Peter Bolgar, a PhD student in synthetic organic chemistry involved in both studies. "The model and human chemists together would become extremely powerful in designing experiments, more than each would be without the other."

