Hearing voices is a hallmark of schizophrenia and other psychotic disorders, occurring in 60-80% of cases. These voices are typically identified as belonging to other people and may be voicing the person’s thoughts, commenting on their actions or ideas, arguing with each other or telling the person to do something. Importantly, these auditory hallucinations are as subjectively real as any external voices. They may in many cases be critical or abusive and are often highly distressing to the sufferer.
However, many perfectly healthy people also regularly hear voices – as many as 1 in 25 according to some studies, and in most cases these experiences are perfectly benign. In fact, we all hear voices “belonging to other people” when we dream – we can converse with these voices, waiting for their responses as if they were derived from external agents. Of course, these percepts are actually generated by the activity of our own brain, but how?
There is good evidence from neuroimaging studies that the same areas that respond to external speech are active when people are having these kinds of auditory hallucinations. In fact, inhibiting such areas using transcranial magnetic stimulation may reduce the occurrence or intensity of heard voices. But why would the networks that normally process speech suddenly start generating outputs by themselves? Why would these outputs be organised in a way that fits speech patterns, as opposed to random noise? And, most importantly, why does this tend to occur in people with schizophrenia? What is it about the pathology of this disorder that makes these circuits malfunction in this specific way?
One interesting way to try to answer these questions has been to model these circuits in artificial neural networks. If you can build a network that processes speech inputs and then find conditions under which it begins to spontaneously generate outputs, you may have an informative model of auditory hallucinations. Using this approach, two studies from the group of Ralph Hoffman (Hoffman & McGlashan, 2001, 2006) have found some interesting clues as to what may be going on, at least on an abstract level.
Their approach was to generate an artificial neural network that could process speech inputs. Artificial neural networks are basically sets of mathematical functions modelled in a computer programme. They are designed to simulate the information-processing functions carried out by individual neurons and, more importantly, the computational functions carried out by an interconnected network of such neurons. They are necessarily highly abstract, but they can recapitulate many of the computational functions of biological neural networks. Their strength lies in revealing unexpected emergent properties of such networks.
The particular network in this case consisted of three layers of neurons – an input layer, an output layer, and a “hidden” layer in between – along with connections between these elements (from input to hidden and from hidden to output, but crucially also between neurons within the hidden layer). “Phonetic” inputs were fed into the input layer – these consisted of models of speech sounds constituting grammatical sentences. The job of the output layer was to report what was heard – representing different sounds by patterns of activation of its forty-three neurons. Seems simple, but it’s not. Deciphering speech sounds is actually very difficult as individual phonetic elements can be both ambiguous and variable. Generally, we use our learned knowledge of the regularities of speech and our working memory of what we have just heard to anticipate and interpret the next phonemes we hear – forcing them into recognisable categories. Mimicking this function of our working memory is the job of the hidden layer in the artificial neural network, which is able to represent the prior inputs by the pattern of activity within this layer, providing a context in which to interpret the next inputs.
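To make the architecture concrete, here is a minimal sketch of a three-layer network with recurrent connections inside the hidden layer. It is not the authors' code: only the forty-three output neurons come from the description above, while the input and hidden layer sizes, the sigmoid activation, the random (untrained) weights and all names are assumptions chosen for illustration.

```python
import math
import random

def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weights (an assumption; a real network would learn them)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Multiply a weight matrix by an activity vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class RecurrentNet:
    """Input -> hidden -> output, plus hidden -> hidden connections."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(n_hidden, n_in, rng)       # input to hidden
        self.w_rec = make_matrix(n_hidden, n_hidden, rng)  # hidden to hidden
        self.w_out = make_matrix(n_out, n_hidden, rng)     # hidden to output
        self.hidden = [0.0] * n_hidden                     # "working memory"

    def step(self, phonetic_input):
        """Process one input; the hidden state carries context forward."""
        drive_in = matvec(self.w_in, phonetic_input)
        drive_rec = matvec(self.w_rec, self.hidden)
        self.hidden = [sigmoid(a + b) for a, b in zip(drive_in, drive_rec)]
        return [sigmoid(v) for v in matvec(self.w_out, self.hidden)]

# Hypothetical sizes, except for the 43 output neurons mentioned in the text.
net = RecurrentNet(n_in=25, n_hidden=40, n_out=43)
same_input = [1.0] + [0.0] * 24
first = net.step(same_input)
second = net.step(same_input)
```

Feeding the identical input twice gives two different outputs, because the second response also depends on what the hidden layer remembers of the first. That dependence on prior context is the working-memory role described above.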
The important thing about neural networks is they can learn. Like biological networks, this learning is achieved by altering the strengths of connections between pairs of neurons. In response to a set of inputs representing grammatical sentences, the network weights change in such a way that when something similar to a particular phoneme in an appropriate context is heard again, the pattern of activation of neurons representing that phoneme is preferentially activated over other possible combinations.
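As a toy illustration of learning by changing connection strengths, the sketch below trains a single layer of logistic units with a simple delta rule. This is a deliberately simplified stand-in, not the training procedure used in the papers, and the two "phoneme" inputs, the learning rate and the function names are all hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def respond(weights, x):
    """Activity of each output unit for input x."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def train_delta_rule(inputs, targets, n_epochs=200, rate=0.5):
    """Nudge each weight in proportion to the output error and its input."""
    n_in, n_out = len(inputs[0]), len(targets[0])
    weights = [[0.0] * n_in for _ in range(n_out)]
    for _ in range(n_epochs):
        for x, target in zip(inputs, targets):
            outputs = respond(weights, x)
            for k in range(n_out):
                error = target[k] - outputs[k]
                for j in range(n_in):
                    weights[k][j] += rate * error * x[j]
    return weights

# Two made-up "phonemes"; the last input element is a constant bias term.
phonemes = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
labels = [[1.0, 0.0], [0.0, 1.0]]
learned = train_delta_rule(phonemes, labels)
heard_first = respond(learned, phonemes[0])
heard_second = respond(learned, phonemes[1])
```

After training, the first output unit responds strongly to the first input and weakly to the second, and vice versa. Nothing was stored except the adjusted weights.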
The network created by these researchers was an able student and readily learned to recognise a variety of words in grammatical contexts. The next thing was to manipulate the parameters of the network in ways that are thought to model what may be happening to biological neuronal networks in schizophrenia.
There are two major hypotheses that were modelled: the first is that networks in schizophrenia are “over-pruned”. This fits with a lot of observations, including neuroimaging data showing reduced connectivity in the brains of people suffering with schizophrenia. It also fits with the age of onset of the florid expression of this disorder, which is usually in the late teens to early twenties. This corresponds to a period of brain maturation characterised by an intense burst of pruning of synapses – the connections between neurons.
In schizophrenia, the network may have fewer synapses to begin with, but not so few that it doesn’t work well. This may, however, make it vulnerable to the normal process of maturation, which may reduce its functionality below a critical threshold. Alternatively, the process of synaptic pruning may itself be overactive in schizophrenia, damaging a previously normal network. (The evidence favours earlier disruptions.)
The second model involves differences in the level of dopamine signalling in these circuits. Dopamine is a neuromodulator – it alters how neurons respond to other signals – and is a key component of active perception. It plays a particular role in signalling whether inputs match top-down expectations derived from our learned experience of the world. There is a wealth of evidence implicating dopamine signalling abnormalities in schizophrenia, particularly in active psychosis. Whether these abnormalities are (i) the primary cause of the disease, (ii) a secondary mechanism causing specific symptoms (like psychosis), or (iii) the brain attempting to compensate for other changes is not clear.
Both over-pruning and alterations to dopamine signalling could be modelled in the artificial neural network, with intriguing results. First, a modest amount of pruning, starting with the weakest connections in the network, was found to actually improve the performance of the network in recognising speech sounds. This can be understood as an improvement in the recognition and specificity of the network for sounds which it had previously learned and probably reflects the improvements seen in human language learners, along with the concomitant loss in ability to process or distinguish unfamiliar sounds (like “l” and “r” for Japanese speakers).
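Pruning "starting with the weakest connections" can be sketched as removing the weights with the smallest absolute values. The function below is a generic illustration under that assumption; the name `prune_weakest`, the `fraction` parameter and the example weights are hypothetical, and the papers may have selected connections differently.

```python
def prune_weakest(weights, fraction):
    """Return a copy of the weight matrix with the weakest connections set to 0.

    `fraction` is the share of connections to remove, weakest first.
    Connections tied in strength at the cut-off are removed together.
    """
    magnitudes = sorted(abs(w) for row in weights for w in row)
    n_prune = int(len(magnitudes) * fraction)
    if n_prune == 0:
        return [row[:] for row in weights]
    threshold = magnitudes[n_prune - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]

example = [[0.9, -0.05, 0.4],
           [0.02, -0.7, 0.1]]
pruned = prune_weakest(example, 0.5)
print(pruned)  # [[0.9, 0.0, 0.4], [0.0, -0.7, 0.0]]
```

Note that strength is judged by magnitude, so a strongly inhibitory connection such as -0.7 survives while a weak excitatory one such as 0.1 is removed.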
However, when the network was pruned beyond a certain level, two interesting things happened. First, its performance got noticeably worse, especially when the phonetic inputs were degraded (i.e., the information was incomplete or ambiguous). This corresponds quite well with another feature of schizophrenia, especially in people who experience auditory hallucinations: sufferers show phonetic processing deficits under challenging conditions, such as a crowded room.
The second effect was even more striking – the network started to hallucinate! It began to produce outputs even in the absence of any inputs (i.e., during “silence”). When not being driven by reliable external sources of information, the network nevertheless settled into a state of activity that represented a word. The reason the output is a word and not just a meaningless pattern of activity is that, as a result of the network's previous learning, patterns representing words act as “attractors” – if some random neurons start to fire, the weighted connections representing real words will rapidly come to dominate the overall pattern of activity in the network, pulling it into the pattern corresponding to a word.
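The idea of an attractor can be demonstrated with a much simpler system than the one in the papers. The sketch below uses a tiny Hopfield-style network, which is a different and far cruder model than the authors' speech network, and stores two hypothetical "word" patterns in its connection weights. Starting from random activity, it always settles into one of the stored patterns (or its mirror image) rather than into noise.

```python
import random

# Two hypothetical "word" patterns over 8 units, each unit on (+1) or off (-1).
WORDS = {
    "word_a": [1, 1, 1, 1, -1, -1, -1, -1],
    "word_b": [1, -1, 1, -1, 1, -1, 1, -1],
}

def hebbian_weights(patterns):
    """Strengthen connections between units that are active together."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j]
    return weights

def settle(weights, state):
    """Update units one at a time until no unit changes any more."""
    state = list(state)
    changed = True
    while changed:
        changed = False
        for i, row in enumerate(weights):
            field = sum(w * s for w, s in zip(row, state))
            new_value = 1 if field > 0 else -1 if field < 0 else state[i]
            if new_value != state[i]:
                state[i] = new_value
                changed = True
    return state

def nearest_word(state):
    """Name of the stored pattern that the state overlaps with most."""
    overlaps = {name: sum(a * b for a, b in zip(pattern, state))
                for name, pattern in WORDS.items()}
    return max(overlaps, key=lambda name: abs(overlaps[name]))

weights = hebbian_weights(list(WORDS.values()))
rng = random.Random(1)
noise = [rng.choice([-1, 1]) for _ in range(8)]  # "silence" plus random firing
final_state = settle(weights, noise)
print(nearest_word(final_state))
```

There is no input here at all, only random initial activity, yet the learned weights drag the network into a stored pattern. In this toy version the mirror image of each pattern is also an attractor, a known quirk of Hopfield-style networks that has no obvious counterpart in the account above.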
Modelling alterations in dopamine signalling also produced both a defect in parsing degraded speech inputs and hallucinations. Too much dopamine signalling produced these effects, but so did a combination of moderate over-pruning and compensatory reductions in dopamine signalling, highlighting the complex interactions that are possible.
The conclusion from these simulations is not necessarily that this is exactly how hallucinations emerge. After all, the artificial neural networks are pretty extreme abstractions of real biological networks, which have hundreds of different types of neurons and synaptic connections and which are many orders of magnitude more complex numerically. But these papers do provide at least a conceptual demonstration of how a circuit designed to process speech sounds can fail in such a specific and apparently bizarre way. They show that auditory hallucinations can be viewed as the outputs of malfunctioning speech-processing circuits.
They also suggest that different types of insult to the system can lead to the same type of malfunction. This is important when considering new genetic data indicating that schizophrenia can be caused by mutations in any of a large number of genes affecting how neural circuits develop. One way that so many different genetic changes could lead to the same effect is if the effect is a natural emergent property of the neural networks involved.
Hoffman, R., & McGlashan, T. (2001). Book Review: Neural Network Models of Schizophrenia. The Neuroscientist, 7 (5), 441-454. DOI: 10.1177/107385840100700513
Hoffman, R., & McGlashan, T. (2006). Using a Speech Perception Neural Network Computer Simulation to Contrast Neuroanatomic versus Neuromodulatory Models of Auditory Hallucinations. Pharmacopsychiatry, 39, 54-64. DOI: 10.1055/s-2006-931496