How a new database of human genes can help discover new biology

Researchers have created Unknome, a database of poorly understood genes, hoping it will help discover new biology.
Rizwan Choudhury


A team of UK-based scientists has developed a new database of genes largely unknown to science, despite being part of our DNA. These genes encode proteins, but their functions remain a mystery.

The database, called Unknome, is a public and customizable resource that can help researchers identify and investigate these neglected genes. It covers humans and other animals commonly used in experiments, such as flies, worms, and mice.

The scientists, from the Dunn School of Pathology at the University of Oxford and the MRC Laboratory of Molecular Biology in Cambridge, say that many of these genes have been overlooked because of various biases in scientific research, such as funding, peer review, and availability of laboratory models.

Important roles in health and disease

They warn that ignoring these genes could be a missed opportunity, as they may have important roles in health and disease.

To create Unknome, the scientists assigned a “knownness” score to every protein encoded by the human genome, based on the information available in the scientific literature about its function, conservation across species, subcellular localization, and other factors. They also included proteins from model organisms in the database.

Each knownness score reflects how well a protein is understood by science. A score of zero means nothing is known about it, while a score of 100 means it has been fully characterized.

According to Unknome, there are thousands of proteins in humans and other animals that have a knownness score of one or less, indicating that they are poorly understood.

The database allows users to customize their own knownness scores by giving different weights to different factors depending on their research interests. This way, they can prioritize the genes that they want to study.
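To make the idea of adjustable weights concrete, here is a minimal Python sketch of how a weighted score of this kind could be computed and used to shortlist poorly understood proteins. It is an illustration only, not Unknome's actual scoring method: the evidence categories, the default weights, the cutoff, the function names, and the three example proteins are all assumptions made up for this example.

```python
from dataclasses import dataclass, field

# Hypothetical evidence categories and weights, chosen for illustration only.
# The factors and weights used by the real Unknome database may differ.
DEFAULT_WEIGHTS = {"function": 1.0, "localization": 0.5, "conservation": 0.25}


@dataclass
class Protein:
    name: str
    # category -> amount of evidence (made-up numbers, not real data)
    evidence: dict = field(default_factory=dict)


def knownness(protein, weights=DEFAULT_WEIGHTS):
    """Weighted sum of evidence: more weighted evidence gives a higher score."""
    return sum(
        weights.get(category, 0.0) * amount
        for category, amount in protein.evidence.items()
    )


def least_known(proteins, weights=DEFAULT_WEIGHTS, cutoff=1.0):
    """Return (score, name) pairs at or below the cutoff, least known first."""
    scored = [(knownness(p, weights), p.name) for p in proteins]
    return sorted(pair for pair in scored if pair[0] <= cutoff)


# Toy proteins with invented names and evidence counts.
proteins = [
    Protein("protein_A", {"function": 40, "localization": 10}),
    Protein("protein_B", {"localization": 2}),
    Protein("protein_C"),
]

for score, name in least_known(proteins):
    print(name, score)
```

Passing a different `weights` dictionary changes the ranking. A researcher who does not count localization data as real understanding could set that weight to zero, which in this toy example would drop protein_B's score to zero as well.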

Sample of 260 unknown genes

To demonstrate the potential of Unknome, the scientists selected 260 genes that are highly unknown in humans and also present in flies. They used RNA interference (RNAi) to knock down each of these genes in flies and observed the effects.

They found that some of these genes were essential for survival, while others were involved in male fertility, development, tissue growth, protein quality control, or stress resistance.

“These uncharacterized genes have not deserved their neglect,” says molecular biologist Sean Munro, one of the authors of the study.

The team hopes that Unknome will inspire other researchers to explore the unknown parts of our genome and discover new aspects of biology.

“Our database provides a powerful, versatile, and efficient platform to identify and select important genes of unknown function for analysis, thereby accelerating the closure of the gap in biological knowledge that the unknome represents,” Munro adds.

He further says that thousands of human proteins have unclear roles, yet research continues to focus on those that are already well understood. To help address this problem, he and his colleagues created the Unknome database, which ranks proteins by how little is known about them, and then ran functional screens on a selection of these mystery proteins to show how ignorance can drive biological discovery.

The study was published in the journal PLOS Biology.

Study abstract:

The human genome encodes approximately 20,000 proteins, many still uncharacterised. It has become clear that scientific research tends to focus on well-studied proteins, leading to a concern that poorly understood genes are unjustifiably neglected. To address this, we have developed a publicly available and customisable “Unknome database” that ranks proteins based on how little is known about them. We applied RNA interference (RNAi) in Drosophila to 260 unknown genes that are conserved between flies and humans. Knockdown of some genes resulted in loss of viability, and functional screening of the rest revealed hits for fertility, development, locomotion, protein quality control, and resilience to stress. CRISPR/Cas9 gene disruption validated a component of Notch signalling and 2 genes contributing to male fertility. Our work illustrates the importance of poorly understood genes, provides a resource to accelerate future research, and highlights a need to support database curation to ensure that misannotation does not erode our awareness of our own ignorance.
