The murderous brain - can neuroimaging really distinguish murderers?
A new study claims that neuroimaging can be
used to distinguish the brains of murderers from non-murderers. It follows in a
long tradition of attempts to find biological indicators of violent criminality,
from faces to skull bumps to genes to brains. But are the data convincing? Does
this study really accomplish what it claims? Is it even based on a well-founded
question? And what are the ethical implications?
Here is the abstract of the paper, by Ashly
Sajous-Turner and colleagues:
Homicide is a significant societal
problem with economic costs in the billions of dollars annually and
incalculable emotional impact on victims and society. Despite this high burden,
we know very little about the neuroscience of individuals who commit homicide.
Here we examine brain gray matter differences in incarcerated adult males who
have committed homicide (n = 203) compared to other non-homicide offenders (n =
605; total n = 808). Homicide offenders’ show reduced gray matter in brain
areas critical for behavioral control and social cognition compared with
subsets of other violent and non-violent offenders. This demonstrates, for the
first time, that unique brain abnormalities may distinguish offenders who kill
from other serious violent offenders and non-violent antisocial individuals.
Let’s first think about the reasoning
behind the study, which is encapsulated in the clause: “we know very little about
the neuroscience of individuals who commit homicide”. There is not much more
detail given in the paper itself as to the motivation for the study, but this
clause is rather revealing of a number of underlying assumptions:
1. Murderousness is something that there is “a neuroscience of”.
2. There is, moreover, something in common in the brains of murderers – some shared difference that can distinguish them as a group from non-murderers.
3. Structural neuroimaging is a good way to detect such a difference – that is, it should manifest in the volume of some brain regions.
4. Identifying such brain regions will tell us something about the biology of violence.
5. Identifying the pattern of brain differences will allow us to distinguish murderers from non-murderers.
So, the unstated hypothesis is that
murderers differ biologically from non-murderers, and that we can discover the
murderous essence by looking at the size of bits of the brain. In addition, it
is implied that it would be a good thing if we could do that.
Would it be a good thing? Good for whom? As
mentioned above, many people have tried to find some kind of biological marker
to distinguish those inclined to violence and especially to murder. In the
1800s, phrenology was looked to as a way of telling who was bad and who was
mad. As
described in this excellent piece by James Bradley: “Two ideas lay at the heart
of phrenology’s seductive power. First, different areas of the brain were
associated with different mental capacities or faculties. And, as the brain
developed, it shaped the skull.” That
meant that the detailed landscape of bumps and depressions on a person’s skull
gave a window to their personality, including possibly murderous instincts.
Francis Galton, the father of eugenics,
looked to physiognomy for markers of criminality, painstakingly creating
composite photographs of the faces of criminals and comparing them to
upstanding members of Victorian society.
This tradition has been recently revived with the brute force of machine
learning, in a study in China claiming to be able to discriminate criminals
from law-abiding citizens based on pictures of their faces. (Additional analyses suggest a bias in the collection of photos that may be
driving the effect.)
Genetic variants have also been invoked as
underpinning a murderous instinct in some people. The most famous of these is
the gene encoding monoamine oxidase-A, or MAOA, an enzyme involved in serotonin
metabolism. Serious mutations in this gene are indeed associated with a drastic
increase in violent criminality. Fortunately, such mutations are extremely
rare. Unfortunately, the idea was extended to a very common variation in the
same gene, which was associated with all kinds of behavioral outcomes in
candidate gene analyses, which have subsequently been shown to be spurious.
This hasn’t stopped the idea of the “warrior gene” from taking hold in the
public consciousness, however, nor has it prevented information on a
defendant’s MAOA common genotype being used in court.
The premise
All of these approaches assume that
murderers differ biologically from non-murderers (in a way that at least partly
explains their murderousness). Is there any reason to think this is the case? Well,
yes, there is.
First of all, the vast majority of murderers
are men. Worldwide, 96% of homicides are committed by men (and 78% of the
victims are male). There is lots of evidence from other species that suggests
this reflects an innate difference in physical aggressiveness between the sexes,
with an underlying neural basis. So, yes, it seems some brains may be more
murdery than others. This at least establishes the principle that there could
be something biologically different between individuals (in addition to sex) that
influences likelihood to commit homicide.
That idea is further supported by twin and
adoption studies showing that antisocial behaviour, physical aggression,
arrest, and incarceration for violent crimes are all partly heritable. That is,
being genetically related to someone with high levels of such behaviors makes
it more likely that a person will also show such behaviors. This effect is
additional to the significant effects of growing up in the same household with
people showing such behaviors. (For those keeping score, genetic effects, i.e.,
the heritability, tend to explain ~30-40% of the variance, and the effect of the
shared environment tends to be about the same.)
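For a rough sense of where numbers like these come from, here is a minimal sketch using Falconer's formulas, a classic shortcut for twin data (modern studies generally fit more elaborate models). The twin correlations are hypothetical values chosen only to land in the range quoted above; they are not taken from any particular study.

```python
def falconer_estimates(r_mz, r_dz):
    """Rough variance decomposition from twin correlations (Falconer's formulas).

    r_mz: trait correlation between identical (monozygotic) twins
    r_dz: trait correlation between fraternal (dizygotic) twins
    """
    heritability = 2 * (r_mz - r_dz)   # share of variance from genetic effects
    shared_env = 2 * r_dz - r_mz       # share from the shared (family) environment
    nonshared_env = 1 - r_mz           # everything else, including measurement error
    return heritability, shared_env, nonshared_env


# Hypothetical correlations, for illustration only
h2, c2, e2 = falconer_estimates(r_mz=0.70, r_dz=0.525)
print(f"heritability = {h2:.2f}, shared environment = {c2:.2f}, rest = {e2:.2f}")
```

With these made-up inputs, genetic effects and the shared environment each come out at about 35% of the variance, which leaves a sizeable remainder explained by neither.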
So, okay, it’s not crazy to think that some
biological factors affecting an individual’s psychological make-up make a contribution
to likelihood to commit homicide.
But is there any reason to think that
murderers would differ in one particular
way? The group comparison design of the study implies this hypothesis. But
if the Anna Karenina model applies – “happy families are all alike; every
unhappy family is unhappy in its own way” – then there may be little in common
between the biological factors making one person murderous versus those at play
in another. If each murderer were unhappy in his own way (biologically speaking),
you would never detect that in a group average comparison.
What might such biological factors be? What
psychological traits might increase one’s likelihood to commit homicide? (Whether
one actually does or not would be hugely context-dependent, but let’s continue
with the idea that some people may be innately more murderous than others, in
general). You could imagine any or all of the following traits might be
involved: impulsivity, mood stability, aggressiveness, threat sensitivity,
punishment sensitivity, intelligence, executive function, vengefulness, agreeableness,
jealousness, honesty, narcissism, psychopathy, empathy, moral reasoning,
general meanness, sensitivity to alcohol or drugs, and on and on.
So, you could have one murderer with high threat
sensitivity, impulsivity, and aggressiveness under the influence of alcohol,
and another with high narcissism and psychopathy and deficient moral reasoning.
If you took a hundred murderers, you might have a hundred different profiles.
This rather undermines the design of the experiment from the get-go.
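To see why such heterogeneity matters for a group comparison, here is a toy simulation. Everything in it is made up: 20 hypothetical "regions", measures in standard-deviation units, and the extreme assumption that every case has one large deficit in a randomly chosen region. It is a sketch of the argument, not a model of real brains.

```python
import random


def simulate_group_difference(n_per_group=200, n_regions=20, deficit=2.0, seed=0):
    """Each 'case' has a large deficit in ONE randomly chosen region.

    Returns the per-region difference in group means (cases minus controls).
    All numbers are illustrative assumptions, in standard-deviation units.
    """
    rng = random.Random(seed)
    controls = [[rng.gauss(0, 1) for _ in range(n_regions)]
                for _ in range(n_per_group)]
    cases = []
    for _ in range(n_per_group):
        person = [rng.gauss(0, 1) for _ in range(n_regions)]
        person[rng.randrange(n_regions)] -= deficit  # idiosyncratic deficit
        cases.append(person)

    def region_means(group):
        return [sum(p[r] for p in group) / len(group) for r in range(n_regions)]

    return [c - k for c, k in zip(region_means(cases), region_means(controls))]


diffs = simulate_group_difference()
largest = max(abs(d) for d in diffs)
print("deficit carried by every individual case: 2.00 SD")
print(f"largest group-average difference in any region: {largest:.2f} SD")
```

On average, a deficit of 2 SD spread at random over 20 regions shifts each region's group mean by only 2/20 = 0.1 SD, which is about the size of the sampling noise at this sample size. Every individual is strongly affected, yet no single region stands out in the averages.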
The design of the experiment
The next implicit assumption in the
experimental design – that differences in such traits should be manifest in the
size of various brain regions – is also not well justified. Why would we expect
that to be the case? Is that how psychological traits are determined? By the
relative size of bits of the brain?
This is just the modern, neuroimaging “blobology” version
of phrenology. The reasoning behind this seems to be: brain region X is “involved
in” Y (or, in even worse wording, “does” Y). Therefore, if brain region X is
bigger, a person will be better at Y.
It’s 2019 – is this really how we
understand the relationship between the complex functions of the mind and the
neural substrates that carry them out? Just based on the real estate occupied
by supposed functional modules? Can we actually map complex cognitive functions
to little bits of the brain? And is bigger better?
This paper is not unusual in adopting this
logic – it is a vague and unquestioned starting position for many similar
studies. The literature is chock-full of reports of such correlations – that
is, in fact, the main methodology of a lot of what can be called “psychology,
with added neuroimaging”. But to the best of my knowledge, despite thousands of
claims, there are few robust, well-replicated examples correlating the size of
little bits of the brain with variation in psychological traits.
There are, in fact, all kinds of parameters
that could vary that would affect the function or tuning of a circuit (involving
many, distributed regions) that would not be visible by structural
neuroimaging: variation in levels of neurotransmitter receptors, or
distribution of specific types of interneurons, or altered density of dendritic
spines, or differences in many other aspects of synaptic microarchitecture, or
neurochemistry, or connectivity, etc. Size isn’t everything. In fact, we have
little reason to think it’s anything.
So, the experimental design is not, in my
view, well founded. As is quite common, the implicit assumptions underlying it
go completely unexamined in the paper (and presumably unchallenged by the reviewers
or the editor).
The methodology
Now, what about the methodology and the
findings? Clearly, they found something, otherwise we wouldn’t have heard about
it. (This is not a trivial statement – the general existence of publication
bias bears directly on how much weight we should put on the “positive” results,
especially if a study was not pre-registered.)
On the plus side, the sample is very large
and has a good control group: 203 convicted murderers compared with 605 people
convicted of other crimes, all from the same male prison population, all
scanned under the same conditions on the same scanner (as far as I can tell). The
control group was further broken down into violent and non-violent offenders.
The hypothesis being tested – or maybe we
should say the idea being explored – is that the brains of murderers would show
some structural differences to the brains of non-murderers. More particularly,
the researchers stated:
We hypothesize that homicide offenders
will have deficits in areas of executive functioning and limbic control areas
within the prefrontal cortex and anterior temporal cortex compared to
non-homicide offenders.
They didn’t actually specifically test that
hypothesis, however. Instead, the analyses performed were highly exploratory,
looking for effects in any direction, anywhere in the brain. To do that, the
researchers used a method known as voxel-based morphometry (VBM) to look for
differences in size of bits of the brain. Basically, this takes the scan of an
individual’s brain and warps it into a common template space to allow averaging
and comparison across groups. The amount of warping that is required to get the
individual’s scan into the template is recorded on a voxel-by-voxel basis and
that is taken as an estimate of the amount of grey matter in that bit of the
brain in that individual, relative to everyone else.
The next step is to average out those
values in the common template for each group and compare them. That raises a
statistical problem, given that you are performing tests on over a million 1 mm³
voxels in the brain. Various statistical methods can be used to correct for
these multiple comparisons. Here, the authors used the False Discovery Rate. In
addition, a number of possible confounding factors that differed between the
groups were included in the analysis of variance (ANOVA) model, as listed below.
(PCLR refers to the Psychopathy Checklist-Revised, a test for psychopathy.) The idea is that controlling for these
factors in the ANOVA gives you confidence that any differences observed are not
just driven by that factor, making it more likely that they are driven by the
factor of interest – murder.
One way ANOVA was performed on a
voxel-by-voxel basis over the whole brain using SPM12 to evaluate differences
in regional gray matter volumes between Homicide (n = 203), Violent
Non-Homicide (n = 475) and Minimally Violent (n =130) offenders, with all three
groups included as factors in each analysis. The ANOVA model included each
subject’s total brain volume (i.e., gray matter plus white matter), PCLR
total scores, substance use severity, age at time of scan, IQ, and time in
prison variables as covariates. Whole brain analyses using the False Discovery
Rate for control over Type I error, were performed for all comparisons.
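To make the multiple-comparisons step concrete, here is a minimal sketch of the textbook Benjamini-Hochberg procedure for controlling the False Discovery Rate. The passage quoted above does not say exactly how FDR was implemented in SPM12, so this is not necessarily the authors' procedure, and the p-values are made up.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test survives FDR control at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value falls under its own threshold
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    survives = [False] * m
    for idx in order[:cutoff_rank]:
        survives[idx] = True
    return survives


# With a million independent tests and no real effects, an uncorrected
# threshold of 0.05 would flag about this many voxels by chance alone:
expected_false_positives = 1_000_000 * 0.05

# Ten made-up p-values: five pass p < 0.05 uncorrected, fewer survive FDR
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
uncorrected = sum(p < 0.05 for p in p_values)
corrected = sum(benjamini_hochberg(p_values, q=0.05))
print(uncorrected, corrected)
```

The correction is only as good as its assumptions, and it says nothing about whether a surviving difference would show up again in a new sample.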
With that methodology in mind, let’s look
at the findings.
The murdery bits
The authors found dozens of clusters of
voxels (location information is given for 47) showing a difference in grey
matter volume between murderers and non-murderers. All of the differences were
a relative reduction in volume in the murderers. A comparison of murderers to
the subset of violent non-homicide offenders gave largely similar results while
comparisons between the control subsets, violent (non-homicide)
and minimally violent offenders, “yielded mostly null results, and no results survived correction for multiple comparisons”.
So, what are we to make of these findings?
The first question is: should we trust that they are “real” and not a
statistical blip? The answer is very difficult to discern. The statistics have
all been done in the standard way for this kind of study, but is that standard
actually rigorous enough?
The typical approach is to let the software
do the statistical jiggery-pokery, and if some difference comes out as
significant, then it’s taken to be real. I find that a little unconvincing, and
the reason is empirical: the literature is full of studies performing exactly
these kinds of analyses on neuroimaging data – using FDR and controlling for
all kinds of confounds – and claiming some significant differences between
groups, only for them to fail to replicate in subsequent studies.
At some point, you have to ask if the
stats are doing what we think they’re doing. There is, in fact, some debate
about whether the FDR is an appropriate measure and how it should be
implemented – by voxels, or by clusters of a certain size, for example. In
addition, there is considerable doubt as to the possibility of effectively
“controlling for” possible confounding variables that differ between the
subgroups.
Some claim that “control for” should really read: “attempt to adjust for using
unrealistic linear assumptions”, while others argue it is flat-out
impossible to correct for confounds using ANCOVA.
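A toy example shows how linear adjustment can go wrong. In the made-up data below, the outcome depends only on the confound, but non-linearly, and the two groups have the same average confound value with different spreads. The function names and numbers are hypothetical; this illustrates the general problem and is not a reanalysis of the study.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def residuals(x, y):
    """What is left of y after removing its best linear fit on x."""
    slope, intercept = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def adjusted_group_effect(outcome, group, confound):
    """Group coefficient in outcome ~ group + confound, obtained by first
    regressing the confound out of both the outcome and the group label."""
    slope, _ = fit_line(residuals(confound, group), residuals(confound, outcome))
    return slope


# Same mean confound (1.0) in both groups, but a wider spread in group 1
confound = [0.5 + 0.1 * i for i in range(11)] + [0.2 * i for i in range(11)]
group = [0] * 11 + [1] * 11
outcome = [c ** 2 for c in confound]  # driven ONLY by the confound

effect = adjusted_group_effect(outcome, group, confound)
print(f"apparent group effect after 'controlling for' the confound: {effect:.2f}")
```

There is no true group effect here, yet the adjusted model reports one (about 0.3 with these numbers), because a straight line cannot absorb a curved relationship.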
You would like your stats to give you an
indication of how robust and generalizable a finding is – that isn’t actually
what they do, but it is implicitly how they are (mis)interpreted. In this case,
for example, the authors claim to have discovered something about “murderers”,
that is, murderers in general, not just about the murderers in their sample.
But the best way to see if your findings
are robust and generalizable is to directly test whether they replicate in a
separate sample. For this kind of population, this is obviously a tall order,
and it is understandable that the authors were not able to accomplish it. However,
without such replication, and with the caveats about the statistical
methodology, it is hard to know how much trust to place in the findings.
Interpreting the findings
Taking them at face value, the authors focus on prominent differences
in areas of the cingulate cortex, insula, prefrontal cortex, and orbitofrontal
cortex. These structures are implicated in many aspects of behavioral control
that could be seen as relevant to “murderousness” (emotional regulation,
impulse control, weighing possible negative consequences of an action, and
others). However, differences also appear in many other parts of the brain,
including, for example, the somatosensory, auditory, and visual cortices, and
the cerebellum, which are less obviously implicated in the kinds of cognitive
tasks we might expect to be involved.
Prior work has also reported some
differences between the brains of violent criminals or psychopaths and controls
and the authors point to some overlap in the brain regions where such
differences have been found, as well as some variation. This raises an
interesting question as to how we should assess whether an imaging finding
replicates. Should we expect the same cluster of voxels to differ consistently
across different samples? (This may not even be comparable if the template
space is defined from the subjects in each sample). If we just see some
clusters differing in the same broad region, then, is that a replication? How
are we defining a “region”? If any clusters in any regions “replicate” should
we focus on those as consistent signal and ignore the others?
The tendency of course, when faced with a
long list of findings, is to focus on the ones that make the most sense,
according to your prior knowledge (also known as your preconceived biases). The
same dynamic is seen in the analysis and discussion of results from genomics
approaches, such as genome-wide association studies or transcriptomics. Faced
with a list of hundreds of genes that their study has just “implicated”,
researchers tend to pick a few favorites and tell a story about them, while
ignoring the rest. The positive hits in your prior regions of interest can be
taken as supporting its involvement, but is that really how we should update
our hypotheses? By selectively attending to the evidence that confirms our
priors, while ignoring evidence implicating other regions?
There is an additional problem inherent in
VBM analyses, which is that human brains are quite variable. Not just in
overall size or subtle variations in shape – people also show quite a bit of
variation in the layout and shape and size of various functional areas, defined
by which bits are active when we are doing various tasks or which bits tend to
be talking to each other. Our brains are so unique, in fact, that this
distribution of functional activations is referred to as a “neural fingerprint”.
So, when you do VBM, and you are warping
voxels from some little bit of the brain in one individual into a common space,
there is no guarantee that it belongs to a functionally homologous region in
another individual. It’s more likely to than not, perhaps, and an average map
can be made, but there will still be lots of idiosyncratic variability in the
layout, which makes assigning function to regions in the template more
challenging. Indeed, newer approaches are defining these functional regions on
an individual basis first, before performing any kind of group averaging.
Finally, there is the issue of causality.
This kind of observational study can only provide a correlation. Taking the
findings at face value, a plausible interpretation is that the brain
differences cause the murderousness.
But it is certainly also conceivable that these differences arise as a consequence of traumatic and violent
experiences, which people who eventually became murderers are likely to have gone
through. Or maybe they’re due to some other confound
that was not fully corrected for or not anticipated at all. Who knows? Maybe
they’re a marker of guilt and remorse.
To sum up, we have the observation of a
profile of differences across many regions between the brains of murderers and
non-murderers in this sample. There are, I think, legitimate questions about
the statistical robustness of the findings in the first instance. There is also
a question as to whether, if they are “real” and not just statistical blips,
they are really driven by the factor on which the groups were chosen (murder) and not by some known or unknown confounding variable. Finally, even if they are taken
at face value, it is hard to know what the overall profile means or whether it
really tells us anything about the (varied) psychology of murderers that we
didn’t know before.
Practical implications
Research like this has real-world impacts,
whether or not they are intended by the authors or warranted by the strength of
the data. You can bet that reports of these findings will increase the use of
brain scans as supposedly exculpatory evidence in murder trials. This practice
is already happening, based on previous reports of a similar nature, and has
been seen for genetic findings as well, despite the fact that the associations
have been shown to be spurious. Both types of evidence have proven effective in
getting sentences reduced, on the basis that the defendant is biologically
predisposed to violence and thus cannot be held fully responsible for it.
But the flip side of this appeal to
biological essentialism is that such a person may be deemed less likely to be
rehabilitated and more prone to recidivism. Indeed, you could see prosecutors
or parole boards using the same evidence to argue, on a different basis, for
longer sentences. Estimating the likelihood of future criminality or violence
is, of course, a normal part of such decisions – the question is whether a
brain scan of an individual can actually give you any accurate or helpful
information in that regard.
And the answer is no. Group average differences do not necessarily allow prediction of individuals and the question is untested in this study.
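A back-of-the-envelope calculation shows how wide the gap between the two can be. The sketch below assumes a single brain measure, normally distributed with equal variance in two equal-sized groups, and uses hypothetical standardized differences (Cohen's d); none of these values comes from the paper.

```python
from statistics import NormalDist


def best_case_accuracy(cohens_d):
    """Best achievable accuracy for sorting individuals into two equal-sized
    groups from one measure, assuming normal distributions with equal variance."""
    return NormalDist().cdf(abs(cohens_d) / 2)


def overlap(cohens_d):
    """Share of the two distributions that overlaps, under the same assumptions."""
    return 2 * NormalDist().cdf(-abs(cohens_d) / 2)


for d in (0.2, 0.5, 0.8):  # hypothetical group differences
    print(f"d = {d}: accuracy = {best_case_accuracy(d):.0%}, "
          f"overlap = {overlap(d):.0%}")
```

Under these assumptions, even a "medium" difference of d = 0.5 lets you classify individuals correctly only about 60% of the time, where a coin flip gets 50%. In any realistic setting murderers are far rarer than non-murderers, which makes individual prediction harder still.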
In fairness, the authors discuss this
explicitly:
While this report demonstrates aggregate
differences between homicide offenders and other violent offenders that are
highly statistically significant, this should not be mistaken for the ability
to identify individual homicide offenders using brain data alone, nor should
this work be interpreted as predicting future homicidal behavior.
However, that caveat is quarantined to the
final section of the paper on “Limitations and future directions”. Having such
a section is absolutely standard practice and certainly a good one, as it is
used to explicitly lay out limitations of the experimental design and
alternative interpretations of the data. But the practice of corralling those
concerns into one section, and presenting them as an afterthought, frees
authors to blithely ignore them in the way they present and interpret their
findings in the rest of the paper. If challenged, they can always point to the
final section to show how rigorous and objective they have actually been and
how up-front and circumspect about possible weaknesses of their claims.
If you were cynical, you might call it the
“covering your ass” section. Having it there gives licence to make the boldest
claims in all the other sections of the paper, especially the title and the
abstract, and, crucially, the press release. In this case, the authors undermine their
own caveat by ending the abstract with this claim:
This demonstrates, for the first time,
that unique brain abnormalities may distinguish offenders who kill from other
serious violent offenders and non-violent antisocial individuals.
They may argue that this sentence is
intended to mean just that there are aggregate, group average differences
between murderers and non-murderers in their sample. But a reasonable person is
likely to read the word “distinguish” as implying that these “unique brain
abnormalities” can, literally, be used to distinguish individual murderers from individual
non-murderers. Similarly loose language is used in this
tweet by Jean Decety, one of the senior authors of the paper:
“Our study, which
includes 808 incarcerated males, demonstrates unique brain abnormality (in
ventromedial prefrontal cortex and insula) that differentiate offenders who
killed from other violent offenders”
That sounds pretty definitive. The
differences are apparently unique to murderers (though this hasn’t been shown - indeed, the paper itself states that "the localized deficits in gray matter exhibited in this sample of homicide offenders are not necessarily specific to homicidal behavior") and
also highly specific to just a couple of brain regions (though the data do not
show that either). If I’m a defense lawyer, I may go looking for
someone to scan my defendant’s brain and tell me it shows evidence of this
“unique brain abnormality”. If I’m on a parole board, I might be similarly
interested. Indeed, why wait until someone has committed a murder to use such
data to predict future crime? Why not get in there first and identify
“high-risk” individuals?
This may seem far-fetched for brain scans,
as it’s highly impractical, but just wait until genome-wide association studies
find some hits for criminality and see how easy it will be to generate a
polygenic score that supposedly predicts this trait for individuals.
Whether the authors intend it or not, this
study feeds into a narrative of biological essentialism that conveniently lets
us ignore all the messy social factors and complex individual experiences that
may lead a person to commit murder. If we can track some objective indicator of
a biological risk of this behavior and supposedly put a number on it, you can
be sure that that number will be applied to individuals and used in all kinds
of unexpected ways, whether or not it has any actual validity.