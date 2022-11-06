“The rods and cones in our eyes that sense light and make it possible for us to see, the molecular sensors that underlie hearing and our sense of touch, the complex molecular machines that convert sunlight into chemical energy in plants, the motors that drive motion in microbes and our muscles, enzymes that break down plastic, antibodies that protect us from disease, and molecular circuits that cause disease when they fail — are all proteins.”

Across the planet and in our human bodies

Metagenomics uses gene sequencing to discover proteins in samples from environments across the planet and even in the human body. A vast number of proteins are known to exist beyond those that have been cataloged and annotated in well-studied organisms, and these proteins are now coming to light.

A map of tens of thousands of high-confidence predictions. Credit: Meta

“Metagenomics is starting to reveal the incredible breadth and diversity of these proteins, uncovering billions of protein sequences that are new to science and cataloged for the first time in large databases compiled by public initiatives such as the NCBI, European Bioinformatics Institute and Joint Genome Institute, incorporating studies from a worldwide community of researchers,” continued the Meta research team.

The structure predictions were made with a program called ESMFold, which relies on a kind of model originally designed for processing human language. The results have been compiled into the open-source ESM Metagenomic Atlas and could one day be used to produce new drugs, characterize unknown microbial functions, and discover evolutionary links between distantly related species.