Engineers translate brain signals directly into speech




In a scientific first, Columbia neuroengineers have created a system that translates thought into intelligible, recognizable speech. By monitoring someone's brain activity, the technology can reconstruct the words a person hears with unprecedented clarity. This breakthrough, which harnesses the power of speech synthesizers and artificial intelligence, could lead to new ways for computers to communicate directly with the brain. It also lays the groundwork for helping people who cannot speak, such as those living with ALS or recovering from a stroke, regain their ability to communicate with the outside world.

These results were published today in Scientific Reports.

"Our voices help connect us to our friends, family and the world around us, which is why losing the power of one's voice due to injury or illness is so devastating," said Nima Mesgarani, Ph.D., the paper's senior author and a principal investigator at Columbia University's Mortimer B. Zuckerman Mind Brain Behavior Institute. "With today's study, we have a potential way to restore that power. We have shown that, with the right technology, these people's thoughts could be decoded and understood by any listener."

Decades of research have shown that when people speak, or even imagine speaking, telltale patterns of activity appear in their brains. Distinct (but recognizable) patterns of signals also emerge when we listen to someone speak, or imagine listening. Experts trying to record and decode these patterns see a future in which thoughts need not remain hidden inside the brain, but instead could be translated into verbal speech at will.

But accomplishing this feat has proven challenging. Early efforts to decode brain signals by Dr. Mesgarani and others focused on simple computer models that analyzed spectrograms, which are visual representations of sound frequencies.
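To make the term concrete, the following is a minimal sketch of how a spectrogram can be computed from a sound signal. It is an illustration only, not the researchers' code: the function name `spectrogram`, the frame and hop sizes, the sample rate, and the 1,000 Hz test tone are all assumptions chosen for the example. The signal is cut into short overlapping frames, and the strength of each frequency within each frame is measured with a discrete Fourier transform.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency magnitudes per overlapping frame.

    Hypothetical helper for illustration; parameter values are assumptions.
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        # Taper the frame with a Hann window to reduce spectral leakage.
        frame = [
            samples[start + n]
            * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n in range(frame_size)
        ]
        magnitudes = []
        # Only bins 0..frame_size/2 are needed for a real-valued signal.
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            magnitudes.append(abs(coeff))
        frames.append(magnitudes)
    return frames


# Assumed test input: a pure 1,000 Hz tone sampled at 8,000 Hz.
SAMPLE_RATE = 8000
TONE_HZ = 1000
signal = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(signal)
# The strongest bin in the first frame, converted back to a frequency in Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / 64
print(len(spec), "frames; strongest frequency:", peak_hz, "Hz")
```

Plotting the rows of `spec` side by side, with time on one axis and frequency on the other, gives the familiar spectrogram image. Real speech analysis would normally use an optimized FFT library rather than this direct transform.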

But because this approach failed to produce anything resembling intelligible speech, Dr. Mesgarani's team turned instead to a vocoder, a computer algorithm that can synthesize speech after being trained on recordings of people talking.

"This is the same technology used by Amazon Echo and Apple Siri to give verbal responses to our questions," said Dr. Mesgarani, who is also an associate professor of electrical engineering at Columbia's Fu Foundation School of Engineering and Applied Science.

A representation of early approaches to reconstructing speech, which used linear models and spectrograms. Credit: Nima Mesgarani / Columbia's Zuckerman Institute

To teach the vocoder to interpret brain activity, Dr. Mesgarani teamed up with Ashesh Dinesh Mehta, MD, Ph.D., a neurosurgeon at Northwell Health Physician Partners Neuroscience Institute and co-author of today's paper. Dr. Mehta treats patients with epilepsy, some of whom must undergo regular surgeries.

"Working with Dr. Mehta, we asked epilepsy patients already undergoing brain surgery to listen to sentences spoken by different people, while we measured patterns of brain activity," said Dr. Mesgarani. "These neural patterns trained the vocoder."

Next, the researchers asked those same patients to listen to speakers reciting digits between 0 and 9 while recording brain signals that could then be run through the vocoder. The sound produced by the vocoder in response to those signals was analyzed and cleaned up by neural networks, a type of artificial intelligence that mimics the structure of neurons in the biological brain.

Representation of Dr. Mesgarani's new approach, which uses a vocoder and a deep neural network to reconstruct speech. Credit: Nima Mesgarani / Columbia's Zuckerman Institute

The end result was a robotic voice reciting a sequence of numbers. To test the accuracy of the recording, Dr. Mesgarani and his team instructed people to listen to the recording and report what they heard.

"We found that people could understand and repeat the sounds about 75 percent of the time, which is well above and beyond any previous attempts," said Dr. Mesgarani. The improvement in intelligibility was especially evident when comparing the new recordings with the earlier, spectrogram-based attempts. "The sensitive vocoder and powerful neural networks represented the sounds the patients had originally listened to with surprising accuracy."
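As a rough illustration of how a listening test like this can be scored, the sketch below compares the digits that were played with the digits a listener reported and returns the fraction that match. The function name `intelligibility` and both digit lists are made up for the example; they are not data from the study, and the article does not describe the researchers' exact scoring procedure.

```python
def intelligibility(presented, reported):
    """Fraction of presented digits that the listener reported correctly.

    Hypothetical scoring helper; not the study's published method.
    """
    if len(presented) != len(reported):
        raise ValueError("presented and reported must have the same length")
    if not presented:
        raise ValueError("at least one trial is required")
    correct = sum(p == r for p, r in zip(presented, reported))
    return correct / len(presented)


# Made-up example trials: digits played to a listener and what they wrote down.
presented = [3, 7, 0, 9, 1, 4, 6, 2]
reported = [3, 7, 0, 5, 1, 4, 8, 2]

score = intelligibility(presented, reported)
print(f"Correctly identified: {score:.0%}")
```

In these invented trials the listener misses two of eight digits, so the score is 0.75. A real evaluation would average over many listeners and recordings.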

Dr. Mesgarani and his team plan to test more complicated words and sentences next, and they want to run the same tests on brain signals emitted when a person speaks or imagines speaking. Ultimately, they hope their system could be part of an implant, similar to those worn by some epilepsy patients, that translates the wearer's thoughts directly into words.

"In this scenario, if the user thinks 'I need a glass of water,' our system could pick up the brain signals generated by that thought and turn them into synthesized, verbal speech," said Dr. Mesgarani. "This would be a game changer. It would give anyone who has lost their ability to speak, whether through injury or disease, the renewed chance to connect to the world around them."

This paper is titled "Towards reconstructing intelligible speech from the human auditory cortex."


Provided by
Columbia University

Engineers translate brain signals directly into speech (2019, January 29)
retrieved January 29, 2019

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research,
no part may be reproduced without written permission. The content is provided for information purposes only.

