'Technological revolution': DALL-E inspires biotech labs to invent new drugs

Researchers anticipate that AI models for biology, similar to text-to-image generators, will eventually lead to more potent medications.
Baba Tamim
Development and research of new drugs


Text-to-image AI technology such as OpenAI's DALL-E 2 has caused a stir in biotech labs, which are increasingly adopting the type of generative AI behind it, called a diffusion model, to create new medicines.

Two labs have independently announced programs that use diffusion models to design new proteins more precisely, according to a report by MIT Technology Review on Thursday.

The Boston-based company Generate Biomedicines unveiled Chroma, which it describes as the "DALL-E 2 of biology." At the same time, a team at the University of Washington led by biologist David Baker created a similar program called RoseTTAFold Diffusion.

"We're generating proteins with really no similarity to existing ones," said Brian Trippe, one of the co-developers of RoseTTAFold Diffusion.

The two labs announced powerful new generative models that can create novel proteins, not found in nature, on demand.

"We can discover in minutes what took evolution millions of years," said Gevorg Grigoryan, CEO of Generate Biomedicines.

It is possible to instruct these protein generators to create designs for proteins with particular characteristics, such as structure, size, or function. 

In essence, this enables the development of novel proteins that can be called upon to perform specific tasks.

"What is notable about this work is the generation of proteins according to desired constraints," said Ava Amini, a biophysicist at Microsoft Research in Cambridge, Massachusetts. 

Proteins as therapeutic interventions

Researchers anticipate that this will eventually result in the creation of new, more potent medications.

Many of the newest medications available today are protein-based, and proteins are also prime targets for drugs.

Proteins are the essential building blocks of living systems. Living beings use them to digest food, contract muscles, sense light, activate the immune system, and much more. They also play a crucial role in recovery from illness.

But the component list for drugs currently includes only natural proteins. The aim of protein generation is to add a virtually endless number of computer-designed proteins to that list.

"Nature uses proteins for essentially everything," said Grigoryan. "The promise that offers for therapeutic interventions is really immense."

There is nothing new about computational methods for designing proteins. 

However, prior methods were slow and not very effective at designing large proteins or protein complexes, the molecular machines made up of multiple proteins joined together.

Protein generation through diffusion models

Diffusion models are neural networks trained to remove "noise" (random perturbations added to the data) from their input.

Given a random collection of pixels, for example, an image diffusion model will attempt to turn it into a recognizable image.
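
The noise-and-denoise idea can be illustrated with a toy sketch. This is not code from Chroma or RoseTTAFold Diffusion; the function names, the `keep` parameter, and the five-number "signal" are all assumptions made for illustration. A real diffusion model would use a trained neural network to predict the noise, whereas here the true noise is handed back directly as a stand-in.

```python
import math
import random


def add_noise(clean, keep, rng):
    """Forward step: blend the clean signal with Gaussian noise.

    `keep` (hypothetical name) is the fraction of signal variance
    retained, in the range (0, 1].
    """
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    noisy = [math.sqrt(keep) * c + math.sqrt(1.0 - keep) * n
             for c, n in zip(clean, noise)]
    return noisy, noise


def remove_noise(noisy, predicted_noise, keep):
    """Reverse step: recover a clean estimate from a noise prediction."""
    return [(x - math.sqrt(1.0 - keep) * n) / math.sqrt(keep)
            for x, n in zip(noisy, predicted_noise)]


rng = random.Random(0)
clean = [0.0, 0.5, 1.0, 0.5, 0.0]  # toy stand-in for an image or structure
noisy, true_noise = add_noise(clean, keep=0.2, rng=rng)

# A trained network would have to predict the noise from `noisy` alone.
# Passing in the true noise is a stand-in, so recovery is exact by construction.
recovered = remove_noise(noisy, true_noise, keep=0.2)
```

In practice the network's prediction is imperfect, which is why such models typically remove noise gradually over many small steps rather than in one jump.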

These models have shown promise in a few recent studies by Amini and others, but those were only proof-of-concept models.

Such studies laid the foundation for the first full-fledged programs, Chroma and RoseTTAFold Diffusion, which can generate precise designs for a wide range of proteins.

"It may be fair to say that this is more like DALL-E because of how they've scaled things up," said Namrata Anand, who co-developed one of the earliest diffusion models for protein generation in May 2022.

The main strength of Chroma and RoseTTAFold Diffusion, in Anand's view, is that they have scaled up the technique by training on more data with more computing power.

Chroma introduces noise by unraveling the chains of amino acids that make up proteins. Given a random collection of these chains, it then attempts to assemble them into a protein.

Guided by specified constraints on what the end product should look like, Chroma can produce novel proteins with particular properties.

Baker's team uses a different strategy but achieves comparable results. Its diffusion model begins with a structure that is considerably more scrambled.

Another significant distinction is that RoseTTAFold Diffusion uses information from a separate neural network trained to predict protein structure, as DeepMind's AlphaFold does, to determine how the parts of a molecule fit together.

This information guides the overall generative process.

Creating novel proteins is just the beginning

The creation of novel proteins, according to Grigoryan, is just the beginning. "At the end of the day, what matters is whether we can make medicines that work or not," he said. 

Protein-based medications require mass production, laboratory testing, and finally, human testing. Several years may pass before such a drug is ready for use.

But Grigoryan believes that his company and others will find ways to use AI to accelerate those phases as well.