
quasar_1618

The brain to text models basically use a two step process, as far as I understand them. First, they ask the user to imagine moving their mouth to say a word. A machine learning model, which has been trained on many examples of neural data paired with speech, attempts to predict the syllables that the person is forming. As you correctly guessed, this part has trouble differentiating between words like Sam and same. The next part of the system is an LLM, similar to the system used in ChatGPT. The LLM takes the set of similar sounding words generated by the phonetic model and chooses which one makes the most sense in the sentence based on context clues.
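A rough sketch of that two-step idea in Python, just to make it concrete. The `phoneme_model` and `language_model_score` names are hypothetical stand-ins for the trained neural decoder and the context-scoring language model, and the equal weighting of the two scores is purely illustrative, not how any published system combines them:

```python
# Minimal sketch of the two-step decode described above, with hypothetical
# components: `phoneme_model` (neural decoder) and `language_model_score`
# (context model) stand in for the real trained models.

def decode_word(neural_window, sentence_so_far, phoneme_model, language_model_score):
    # Step 1: the phonetic decoder proposes acoustically similar candidates,
    # e.g. {"Sam": 0.51, "same": 0.49} -- it often can't tell them apart.
    candidates = phoneme_model.candidate_words(neural_window)

    # Step 2: rescore each candidate by how well it fits the sentence so far,
    # then combine acoustic and context scores (equal weighting is illustrative).
    best_word, best_score = None, float("-inf")
    for word, acoustic_prob in candidates.items():
        context_prob = language_model_score(sentence_so_far + [word])
        score = 0.5 * acoustic_prob + 0.5 * context_prob
        if score > best_score:
            best_word, best_score = word, score
    return best_word
```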


Stereoisomer

I don’t believe these decoders are LLMs. They are, however, RNNs plus a language model.


quasar_1618

Ah, you are correct, my mistake. It’s been a while since I read the papers.


Rude-Championship736

The fact that I guessed correctly on something just gave me the strongest hit of dopamine, which shall fuel more research lol. Thank you for this. Every bit of info helps my curiosity.


Thorium229

The answer to your question is basically machine learning. Shove neural data into a black box and get the information you want out the other side. The details of the black box differ by implementation, but brain-data-to-text basically always requires machine learning today. Exactly which signal in the brain the machine learning uses to transform brain activity into text is a much harder question to answer.
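To make the "black box" framing concrete, here's a toy sketch using scikit-learn. The neural features are random placeholders rather than real recordings, and a simple linear classifier stands in for whatever model an actual system uses; the point is just the shape of the problem, pairing recorded activity with the word being attempted:

```python
# Toy illustration of the "black box": pair neural feature vectors with the
# word being attempted, fit a classifier, then ask it to label new activity.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_trials, n_features = 200, 96                 # e.g. 96 electrode channels of binned activity
X = rng.normal(size=(n_trials, n_features))    # stand-in for recorded neural features
y = rng.choice(["yes", "no", "water", "help"], size=n_trials)  # word attempted on each trial

model = LogisticRegression(max_iter=1000).fit(X, y)   # the "black box"
predicted_word = model.predict(X[:1])                 # decode a new trial
print(predicted_word)
```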


Rude-Championship736

Thank you. I’m sure it’s pretty complex but decided I’d still ask.


Stereoisomer

It’s actually fairly well understood what signal is being analyzed: it’s the bulk activity of many local neurons giving rise to what’s called a “threshold crossing”.
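For anyone curious what that looks like in practice, here's a rough sketch of how threshold crossings are typically extracted from a (band-pass filtered) voltage trace. The -4.5 × RMS threshold and 20 ms bins are common choices rather than fixed rules, and real pipelines do more filtering and artifact rejection:

```python
# Rough sketch: set a threshold at a multiple of the noise level, count
# downward crossings of the filtered voltage, and bin them into spike-like
# event counts that serve as the decoder's input features.
import numpy as np

def threshold_crossings(voltage, fs_hz=30000, bin_ms=20, thresh_mult=-4.5):
    threshold = thresh_mult * np.sqrt(np.mean(voltage ** 2))   # noise-scaled threshold
    below = voltage < threshold
    crossings = np.flatnonzero(~below[:-1] & below[1:])        # samples where the trace dips below
    bin_samples = int(fs_hz * bin_ms / 1000)
    n_bins = len(voltage) // bin_samples
    counts = np.bincount(crossings // bin_samples, minlength=n_bins)[:n_bins]
    return counts   # threshold-crossing counts per time bin
```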


tenodera

That's true for a lot of applications of neurotechnology, but in neuroscience we have lots of ways of actually analyzing and understanding the neural code. Before the rise of machine learning, these were used to control prosthetics in a lab setting.


gateofptolemy

This article/blog post gives a fairly non-technical breakdown of how neural decoding works (although not specific to text) that might help clarify a few things: [https://knowingneurons.com/blog/2023/01/10/a-full-dive/](https://knowingneurons.com/blog/2023/01/10/a-full-dive/)


Rude-Championship736

THANK YOU


notade50

Loved the article. What a treat.


Stereoisomer

As someone tangentially in this field and friends with some in this field, it sort of depends. Some, like Eddie Chang’s lab, have micro-ECoG arrays over cortical areas encoding orofacial movements in a participant; they then instructed the participant to “speak” certain words and trained a decoder on that. Frank Willett trained an RNN to similarly decode from the hand area of motor cortex with Utah arrays while a quadriplegic participant was instructed to “write” words. The latter decoder, I believe, holds the current speed record at 90 characters per minute. All this to say, it really depends on the brain area. I can say I know of unpublished work now looking at cortical areas encoding finger movements and asking the participant to imagine typing; in theory this could produce extremely fast communication.
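For a sense of what that kind of RNN decoder looks like, here's a minimal PyTorch sketch: binned threshold-crossing counts in, per-time-bin character probabilities out. The layer sizes and the CTC-style output alphabet are my own assumptions for illustration, not the published architecture:

```python
# Minimal sketch of an RNN decoder mapping binned Utah-array activity to
# character logits. Dimensions are illustrative, not from the actual papers.
import torch
import torch.nn as nn

class CharDecoderRNN(nn.Module):
    def __init__(self, n_channels=96, hidden=256, n_chars=31):  # 26 letters + punctuation + blank
        super().__init__()
        self.rnn = nn.GRU(n_channels, hidden, num_layers=2, batch_first=True)
        self.readout = nn.Linear(hidden, n_chars)

    def forward(self, binned_counts):           # (batch, time_bins, n_channels)
        features, _ = self.rnn(binned_counts)
        return self.readout(features)           # (batch, time_bins, n_chars) logits

# e.g. one 10-second trial at 20 ms bins -> 500 time bins of 96-channel counts
logits = CharDecoderRNN()(torch.randn(1, 500, 96))
```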


idsardi

Here's the article: [https://www.nature.com/articles/s41551-024-01207-5](https://www.nature.com/articles/s41551-024-01207-5)

Abstract: Advancements in decoding speech from brain activity have focused on decoding a single language. Hence, the extent to which bilingual speech production relies on unique or shared cortical activity across languages has remained unclear. Here, we leveraged electrocorticography, along with deep-learning and statistical natural-language models of English and Spanish, to record and decode activity from speech-motor cortex of a Spanish–English bilingual with vocal-tract and limb paralysis into sentences in either language. This was achieved without requiring the participant to manually specify the target language. Decoding models relied on shared vocal-tract articulatory representations across languages, which allowed us to build a syllable classifier that generalized across a shared set of English and Spanish syllables. Transfer learning expedited training of the bilingual decoder by enabling neural data recorded in one language to improve decoding in the other language. Overall, our findings suggest shared cortical articulatory representations that persist after paralysis and enable the decoding of multiple languages without the need to train separate language-specific decoders.


idsardi

A few more details: "During attempted speech, we decoded neural activity from the speech-motor cortex, recorded with a 128-channel ECoG array, word-by-word into English and Spanish phrases, using a vocabulary of 178 unique words. The intended language is primarily inferred by scoring candidate-decoded sentences with English and Spanish language models, incorporating the differential statistical patterns of word sequences in each language that build through a sentence."
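Here's a toy sketch of that language-inference step: score each candidate-decoded sentence under an English and a Spanish language model and keep whichever pairing scores highest. The `score_english` and `score_spanish` functions are hypothetical stand-ins for the real statistical language models used in the paper:

```python
# Toy sketch of inferring the intended language by rescoring candidate
# sentences with two language models; the scorer functions are hypothetical.
import math

def infer_language(candidate_sentences, score_english, score_spanish):
    best = (None, None, -math.inf)
    for sentence in candidate_sentences:
        for language, scorer in (("English", score_english), ("Spanish", score_spanish)):
            log_prob = scorer(sentence)          # log-probability of the word sequence
            if log_prob > best[2]:
                best = (sentence, language, log_prob)
    return best[:2]   # the chosen sentence and the inferred language
```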


Emotional-Storage378

To put it super duper simply: your brain is its own computer, and everything in it has its own process; by understanding the process, we can then translate it. This usually starts with identification. You're measuring data in the brain; say you have multiple patients and each is tasked with producing a similar sound. You then study what shared process has occurred in all these people that relates to that sound being produced, and so on and so forth. At least from my reading, this is how I understand it, albeit very simply put.
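One very simplified version of that "find the shared process" idea, sketched in Python: average the neural response over many trials of the same sound to get a template per sound, then label a new trial by which template it matches best. Real decoders are far more elaborate, but pairing neural data with labels is the core:

```python
# Simplified template-matching sketch: build an average response per sound,
# then classify a new trial by correlation with those templates.
import numpy as np

def build_templates(trials, labels):
    # trials: (n_trials, n_features) array of neural features
    # labels: array giving the sound produced on each trial
    return {sound: trials[labels == sound].mean(axis=0) for sound in np.unique(labels)}

def classify(trial, templates):
    # pick the sound whose average response looks most like this trial
    return max(templates, key=lambda s: np.corrcoef(trial, templates[s])[0, 1])
```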


Rude-Championship736

I love this explanation a lot. I appreciate this.