Thursday 24 August 2023

How Researchers Are Using AI to Help a Stroke Victim Speak Again

A team of researchers used AI to demonstrate the first instance in which spoken words and facial expressions have been synthesized from brain signals. The study, conducted with a stroke victim and published Wednesday in Nature, represents a remarkable step toward helping more patients regain their voice.

Scientists used a sophisticated AI to decode Ann Johnson’s brain signals into written and vocalized language through a digital avatar. The AI was trained to recognize phonemes, or sound units, rather than individual words. These sounds, like “ow” and “ah,” can ultimately form any word, which broadens the program’s vocabulary.
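To get a feel for why phoneme-level decoding broadens coverage, here is a minimal, purely illustrative Python sketch. The lexicon and function below are invented for this example and are not part of the study’s system; the point is simply that a small set of sound units can be assembled into words the decoder was never trained on as whole units.

```python
# Illustrative sketch only (not the study's actual decoder): composing a small
# set of phoneme symbols into words shows why a phoneme-level model covers a
# far larger vocabulary than a fixed word list. The phoneme labels and lexicon
# here are hypothetical stand-ins; the real system decodes from brain activity.

# Hypothetical pronunciation lexicon: word -> phoneme sequence
LEXICON = {
    "how": ("HH", "AW"),
    "are": ("AA", "R"),
    "you": ("Y", "UW"),
}

def words_from_phonemes(phonemes):
    """Greedily match a stream of decoded phonemes against the lexicon."""
    words, i = [], 0
    while i < len(phonemes):
        for word, pron in LEXICON.items():
            if tuple(phonemes[i:i + len(pron)]) == pron:
                words.append(word)
                i += len(pron)
                break
        else:
            i += 1  # skip an unmatched phoneme rather than failing outright
    return words

# A decoded phoneme stream can be assembled into words the model was never
# explicitly trained to classify as whole units.
print(words_from_phonemes(["HH", "AW", "AA", "R", "Y", "UW"]))  # ['how', 'are', 'you']
```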

Johnson, who was paralyzed and left unable to speak by a stroke 18 years ago, chose an avatar that matched her appearance and provided researchers with a video of her wedding speech to develop its voice. As she spoke, the avatar displayed expressions such as smiles and pursed lips.

Researchers hope to use the technology to help people who are unable to speak due to strokes or conditions such as cerebral palsy and ALS. They anticipate that in a short time, the technology will allow for real-time conversations where patients use digitized versions of themselves to convey tone, inflection, and complicated emotions. “We’re just trying to restore who people are,” the team’s leader, Dr. Edward Chang, told The New York Times.

The team devised an algorithm that translates brain activity into audio waveforms, producing spoken words. Initially, they didn’t anticipate testing the avatar or the audio on Johnson, but early positive results convinced them to tackle the harder problems. The avatar was programmed with data on muscle movements decoded from Johnson’s brain signals as she made a variety of emotional expressions.
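The study’s synthesizer is a learned model trained toward Johnson’s own voice; the sketch below is only a rough, hypothetical illustration of that final step, assuming a decoder has already produced per-frame pitch and loudness values and simply rendering them as an audio waveform.

```python
# Minimal sketch, not the study's pipeline: assume a decoder emits per-frame
# pitch (Hz) and loudness values from brain activity; a simple oscillator then
# renders them as a mono waveform. All names and numbers here are illustrative.
import numpy as np

SAMPLE_RATE = 16_000   # audio samples per second
FRAME_LEN = 0.02       # each decoded frame covers 20 ms

def frames_to_waveform(frames):
    """Render (pitch_hz, loudness) frames into a continuous waveform."""
    samples_per_frame = int(SAMPLE_RATE * FRAME_LEN)
    chunks, phase = [], 0.0
    for pitch_hz, loudness in frames:
        t = np.arange(samples_per_frame) / SAMPLE_RATE
        chunk = loudness * np.sin(2 * np.pi * pitch_hz * t + phase)
        # carry the oscillator phase across frames to avoid audible clicks
        phase += 2 * np.pi * pitch_hz * samples_per_frame / SAMPLE_RATE
        chunks.append(chunk)
    return np.concatenate(chunks)

# Hypothetical decoded frames: a rising pitch contour at steady loudness.
decoded = [(180 + 5 * i, 0.3) for i in range(25)]   # ~0.5 s of audio
waveform = frames_to_waveform(decoded)
print(waveform.shape)  # (8000,) samples, ready to write out as a WAV file
```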

Johnson’s program currently generates around 78 words per minute; conversational speech runs about 160 WPM. Indicating how quickly the technology is developing, a similar study just two years ago generated 15 to 18 WPM. There were small errors in deciphering Johnson’s speech at times. For instance, “Maybe we lost them” was decoded as “Maybe we that name.” Still, it correctly deciphered every single word in nearly half of her sentences.

For Johnson, who now works as a trauma counselor, the feeling of being able to engage in something resembling a real-time conversation with her family was revitalizing. “It let me feel like I was a whole person again,” Johnson wrote to The New York Times. She hopes to use the technology in her own field: “My shot at the moon was that I would become a counselor and use this technology to talk to my clients.”

You can see Johnson testing the remarkable technology in the video below.



from Men's Journal https://ift.tt/wmTaFsk
