Sci-fi Alert! Scientists Can Now Translate Brain Signals Directly Into Speech

The development has great potential for helping people who cannot speak, and it could even lead to new ways for computers to communicate directly with the brain.
Loukia Papadopoulos

Technology that can read thoughts is a much-discussed topic, both for its potential for good and for fears of its misuse.

In the last few years, scientists have made some rather impressive progress on the technology, and some people even fear our brains could one day be hacked.

Straight out of a sci-fi movie

Now, in a scientific first and quite frankly something straight out of a sci-fi movie, neuroengineers from Columbia University have announced they have designed a system that translates thought directly into speech. Better yet, the outcome is clear, recognizable speech.


They achieve this by monitoring brain activity and then combining speech synthesizers with artificial intelligence to get an intelligible result.

The development could soon help people who cannot speak and could even lead to new ways for computers to communicate directly with the brain.

"Our voices help connect us to our friends, family and the world around us, which is why losing the power of one's voice due to injury or disease is so devastating," said Nima Mesgarani, PhD, the paper's senior author and a principal investigator at Columbia University's Mortimer B. Zuckerman Mind Brain Behavior Institute.

"With today's study, we have a potential way to restore that power. We've shown that, with the right technology, these people's thoughts could be decoded and understood by any listener."

Previous research has shown that when people speak, or even listen, distinct patterns of signals emerge in the brain. However, early attempts to decode these patterns were based on simple computer models that analyzed spectrograms (visual representations of sound frequencies) and were ineffective at producing recognizable speech.
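For readers curious what a spectrogram actually contains, here is a minimal sketch in Python. It is only an illustration of the general idea, not the researchers' method: the function name `spectrogram`, the frame size, the hop length, and the 1,000 Hz test tone are all assumptions made for this example. The code slices a signal into short overlapping frames and measures how much energy each frequency carries in every frame.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per overlapping frame.

    Illustrative helper (hypothetical name and defaults), using a plain
    discrete Fourier transform so that no third-party library is needed.
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window tapers the frame edges to reduce spectral leakage.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        magnitudes = []
        # Only bins 0..frame_size/2 are needed for a real-valued signal.
        for k in range(frame_size // 2 + 1):
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


# Assumed test input: a pure 1,000 Hz tone sampled at 8,000 Hz.
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = spectrogram(signal)
# Find the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(f"{len(spec)} frames, strongest frequency in frame 0: {peak_hz} Hz")
```

Plotting those magnitudes with time on one axis and frequency on the other gives the familiar spectrogram picture. Real speech is far richer than a single tone, which hints at why simple models built on this representation struggled.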

Using vocoders

Therefore, Mesgarani's team decided instead to use vocoders. Vocoders are a category of voice codecs that analyze and synthesize the human voice, and they can produce speech after being trained on recordings of real speech.

"This is the same technology used by Amazon Echo and Apple Siri to give verbal responses to our questions," said Dr. Mesgarani, who is also an associate professor of electrical engineering at Columbia's Fu Foundation School of Engineering and Applied Science.

Dr. Mesgarani then teamed up with Ashesh Dinesh Mehta, MD, PhD, a neurosurgeon at Northwell Health Physician Partners Neuroscience Institute and co-author of the paper, in order to train the vocoders on epilepsy patients who were already undergoing brain surgery.

When in surgery, the patients were asked to listen to speech by different people. The researchers then measured the resulting neural patterns and used them to train the vocoders on real and varied human speech.

The end result was not immediately clear; it had to be cleaned up by artificial intelligence neural networks. But finally, the researchers were able to produce intelligible, albeit robotic-sounding, speech that could be understood rather easily. They further tested the speech to see if people could recognize its content.

"We found that people could understand and repeat the sounds about 75% of the time, which is well above and beyond any previous attempts," said Dr. Mesgarani.

The improvement in intelligibility was especially evident when comparing the new recordings to the earlier, spectrogram-based attempts.

"The sensitive vocoder and powerful neural networks represented the sounds the patients had originally listened to with surprising accuracy."

The findings were published in Scientific Reports.
