The latest in a series of medical innovations is the semantic decoder, a kind of mind reader. Based on artificial intelligence (AI), it translates brain activity into text and, for the first time, allows a person’s thoughts to be read non-invasively. Using only data from functional magnetic resonance imaging (fMRI), it can reconstruct the gist of speech with remarkable accuracy while people listen to a story or silently imagine telling one. Previous language decoding systems required surgical implants; this decoder raises the prospect of new ways to restore speech to patients who struggle to communicate because a stroke has damaged their speech center or because they suffer from motor neuron disease.
Too Noisy and Too Slow
Neuroscientist Dr. Alexander Huth, who led the research at the University of Texas at Austin that produced the decoder after fifteen years of work, said that everyone was very surprised at how well it works. The achievement overcomes a fundamental limitation of fMRI: although the technique can map brain activity to a specific location with very high spatial resolution, an inherent time lag prevents activity from being tracked in real time. The delay exists because fMRI measures the blood flow response to brain activity, which peaks and returns to baseline over approximately ten seconds, so even the most powerful scanner cannot eliminate it. Dr. Huth called it ‘a noisy and slow substitute for neural activity,’ a limitation that has hindered attempts to interpret brain activity in response to natural speech, because the signal provides ‘a mixture of information’ spread over several seconds.
However, the emergence of large language models, the kind of artificial intelligence that underpins OpenAI’s ChatGPT, has enabled a new approach. These models can represent the semantic meaning of speech, allowing scientists to see which patterns of neural activity correspond to strings of words with specific meanings. The decoder was trained to align brain activity with the meaning of text using the large language model GPT-1, a forerunner of the models behind ChatGPT. The study involved three volunteers, each of whom spent sixteen hours in the scanner listening to podcasts. Later, the same individuals were scanned while listening to a new story or imagining telling a story, and the decoder was used to generate text solely from their brain activity. About half the time, the text closely, and sometimes precisely, matched the intended meaning of the original words.
– Our system operates at the level of ideas, semantics, and meaning. That is why what we extract is not exact words, but the essence – said Dr. Huth.
How It Reads
For example, when participants imagined the sentence ‘I still don’t have a driver’s license’, the decoder interpreted it as ‘He hasn’t even started learning to drive yet’. In another case, the sentence ‘I didn’t know whether to scream, cry, or run away. Instead, I said: “Leave me alone!”’ was translated as ‘He started screaming and crying and said: “Leave me alone!”’ While in the scanner, the volunteers were also shown four short silent videos, and the decoder, reading their brain activity, accurately described part of the content.
The study on the semantic decoder was published in the journal Nature Neuroscience. ‘For a non-invasive method, this is a real step forward compared to what has been done before,’ Dr. Huth said of it. The decoder sometimes makes mistakes because it does not reliably distinguish pronouns and genders, and the scientists do not yet know how to solve this problem. Another challenge lies ahead: the decoder was trained for each individual participant during the research, and when a model was tested on a different person, it did not produce the same results.
Although much refinement is still needed before practical application, the scientists are already thinking about how to prevent misuse. Jerry Tang, a co-author of the decoder and a PhD student at the University of Texas at Austin, stated that they take very seriously the concern that it could be used for malicious purposes.
– We want to ensure that people will use this technology only when it helps them – said Tang.
Brain-Computer Interface
So far, scientific opinion on the semantic decoder is not divided. For example, Professor Tim Behrens, a computational neuroscientist at the University of Oxford who was not involved in the work, described it as ‘technically extremely impressive’ and stated that it opens up many experimental possibilities, including reading the thoughts of someone who is dreaming or exploring how new ideas emerge from the background activity of the brain.
– These generative models allow you to see what is in the brain at a new level. This means that you can really read something deep from fMRI – said Behrens.
Professor Shinji Nishimoto from Osaka University, a pioneer in reconstructing visual images from brain activity, described the decoder as a significant advancement and a foundation for new discoveries.
– It has shown that the brain represents continuous linguistic information during perception and imagination in a compatible way. This discovery could be the basis for the development of a brain-computer interface – said Nishimoto.
Stroke Prevention
The research team at the University of Texas at Austin intends to continue its work and see whether the decoder can be applied to other brain imaging systems, such as functional near-infrared spectroscopy (fNIRS).
How a stroke damages speech, and how its consequences are treated, is explained by Professor Dr. Branko Malojčić, FESO, FWSO, head of the Day Hospital and TIA Center at the Clinical Hospital Center Zagreb, president of the European Society of Neurosonology and Cerebral Hemodynamics (ESNCH), and chair of the Editorial Board of the Educational Portal (eSTEP) of the European Stroke Organization (ESO).
– Brain cells are extremely sensitive to reduced blood flow, i.e., to a reduced supply of nutrients. As little as four minutes of complete interruption of blood flow causes the death (infarction) of the cells supplied by the affected artery, while surrounding cells enter a low-energy state in which they stop functioning but can survive with optimal therapy. Depending on the area, i.e., the artery, affected by the stroke, patients can develop two types of speech disorders: dysarthria is a disorder of speech articulation that arises from loss of control over the muscles involved in speech production, while aphasia is a disorder of understanding or producing the content of speech (language). Simply put, a patient with dysarthria slurs, stumbles over words, or stutters, while a patient with aphasia does not understand what is being said to them or produces sounds that others do not recognize as words – emphasizes Professor Malojčić, stressing that the best treatment is prevention, followed by acute treatment in specialized centers (stroke units).
Possible Recovery
According to him, both approaches significantly reduce the degree of disability caused by a stroke. If dysarthria or aphasia persists after a stroke, speech therapy is carried out, and its results depend on starting early and on persistence.
– Today we believe that with intensive rehabilitation, additional recovery is possible even a year or two after the onset of a stroke. Unfortunately, because of the brain’s sensitivity to a lack of nutrients, some degree of deficit often remains despite rehabilitation, which can mean an increased risk of aspiration pneumonia in dysarthria or difficulty with social contact in aphasia – says Professor Malojčić.
He further states that new technology is playing an increasingly important role in treating stroke and its consequences. Interesting new methods involving robotics and artificial intelligence are becoming more common. Robotics allows limbs to be moved into any position with very precisely defined intensity, which can support standard physical therapy.
– Artificial intelligence helps researchers refine existing methods or points them towards new ones. The research of Dr. Huth and colleagues, who used functional magnetic resonance imaging (fMRI) to investigate how speech information is formed in the brain, is an example of this. The method has existed for years and can track changes in the activity of individual brain regions, but until now it could not reconstruct short-lived, rapid changes in real time. With the help of artificial intelligence, the detection speed achieved made it possible to recognize how certain brain regions respond to specific speech terms, and the computer could then reconstruct the meaning from the list of activated regions alone, without access to the speech that had been presented – explains Professor Malojčić, adding that similar results have so far been achieved only through invasive methods (electrode implantation, in which Elon Musk is investing heavily), so this is certainly a promising tool. However, before it can be fully applied in everyday practice, i.e., in the rehabilitation of patients with speech disorders, the experiment must be confirmed on real patients and in different indications.
