If genomics is the answer, what's the question? A commentary on PsychENCODE
There was much excitement in the press and in the psychiatric research community recently as a flurry of papers was published, presenting the work of the PsychENCODE project. This project, involving the work of many labs, aimed to deploy the powerful tools of genomics to dissect the landscape of gene regulation in the human brain, with the ultimate goal of revealing the molecular underpinnings of psychiatric disease.
Genome-wide association studies (GWAS) have revealed hundreds of common genetic variants
that are statistically associated with increased risk of psychiatric disorders,
such as schizophrenia, ADHD, bipolar disorder, and, to a lesser extent, autism.
What they have not revealed is how
such variants increase risk of disease. The PsychENCODE project aimed to
generate a set of data that would allow researchers to answer that question.
There are a number of challenges in going from identification of an associated risk
variant to elucidation of its biological effects. First, the common variants
that are actually assayed are just markers – they tag a bunch of other genetic
variants that tend to be co-inherited with them in little segments of the
chromosome. Identifying the actual causal variant in that segment is not so straightforward.
Second, and relatedly, many of those
variants do not occur in the coding region of a gene – they do not change the
amino acid sequence of a protein. Instead, they alter the DNA sequence of a
regulatory region – a piece of DNA that acts as a binding site for proteins
that regulate the expression of a nearby gene. Or at least it was always
assumed that a nearby gene would be the most likely target of regulation. It
turns out that many regulatory regions regulate genes that are some distance
away on the chromosome, via three-dimensional interactions between proteins
bound to different, often distant sites.
Third, even if one can find the causal variant and the gene it regulates, analysing
the effects of the variation on gene expression is not so easy – especially in
the brain. What you’d like to do is directly measure expression of the gene –
levels of the mRNA transcript (or relative levels of multiple different
transcripts) or of the protein itself, in the relevant cell types – in people
with the risk variant versus those without. As brain tissue is inaccessible in
life, people have tended to use blood cells as a proxy, but this has obvious
and serious limitations, in that gene expression patterns in blood differ
substantially from those in the brain.
Finally – and this is the big one – even if we catalogue all the changes in gene
expression linked to all the associated risk variants, understanding how
these changes collectively contribute to the emergence of a psychiatric
disorder, with some specific profile of attendant cognitive, perceptual and
behavioural symptoms in any individual, remains an enormous challenge.
The PsychENCODE project generates and analyses a huge amount of data that goes some
way to addressing the first three challenges. The results presented are certainly
an important advance compared to our previous knowledge, which was largely
inferred from studies in animals. The papers look in detail at gene expression
in embryonic and fetal human development, in cerebral organoids in culture, in parts of the
adult human cortex (post mortem), and, for comparison, in the macaque brain, among other topics.
They cross-correlate levels of particular transcripts and three-dimensional
chromosomal contacts with underlying genetic variants to give a much better
picture of which variants are functionally important and how various kinds of
genes are regulated in the brain.
These are all useful and informative data about the landscape of variation in gene
expression in the human brain. But they leave almost untouched the final
challenge – understanding how genetic risk ultimately causes pathophysiology
and psychopathology. It is striking, in fact, that this question is completely
glossed over in the set of papers and most of the accompanying commentary.
Examining the hypothesis
The PsychENCODE papers frame psychiatric disorders as problems of altered gene
expression. That is the implicit rationale for the entire approach – that if we
could figure out the profiles of altered gene expression, we would better
understand the nature of the conditions. But it is not clear at all what the
actual hypothesis is that they aim to test, nor are the underlying premises of
the genomic approach made explicit or critically examined. If genomics is the
answer, what is the question?
The lack of a firm conceptual footing is clearest in the studies looking at
patterns of gene expression in post mortem brain samples from patients with
psychiatric illness.
The idea seems to be that we can find in the brains of adult sufferers of
conditions like schizophrenia some recognisable and consistent pattern of gene
expression that will reveal the biology underlying the condition or its
symptoms.
There are multiple technical and conceptual problems with this approach:
1. Clinical heterogeneity: Schizophrenia is not a thing. It is a collection of
things. Like most psychiatric categories, it is a diagnosis of exclusion – if
you show some of a set of symptoms, and if doctors can’t find any specific
“organic” cause for them, then you may get labelled with “schizophrenia”, but
really it is a placeholder. It basically says “we don’t know what’s wrong with
you, but you look like these other folks over here and so we’re going to give
you all this label, for now, until we figure out some better way to
discriminate between you all”.
In addition, two patients may both get a diagnosis of schizophrenia but not share
a symptom in common. Bipolar disorder, autism, and other psychiatric diagnoses
are likewise not natural kinds, showing tremendous clinical heterogeneity in
symptom profiles and course of illness. When we are looking at the post mortem
brain of a patient, are we expecting to see a profile of gene expression that
relates to having the condition (the trait) or to the particular symptoms they
were having, either chronically or acutely at the time of their death (the
state)?
2. Genetic heterogeneity: All of these categories encompass tremendous genetic heterogeneity. This is most obvious in autism, where rare mutations with large
individual effects can be discovered for many patients. But it is true for
schizophrenia and bipolar disorder too, with probably a greater role for a load
of weaker rare mutations. In all cases, the polygenic background of common
variation plays an important part in determining the eventual phenotype. But the
particular profile of common risk variants will also be heterogeneous between
different patients.
This underlying heterogeneity is completely overlooked by the PsychENCODE papers. If
there are so many, and such diverse, underlying genetic origins, why should we
expect convergent profiles of gene expression across patients?
3. Regional heterogeneity: The Gandal et al. study looks at gene expression in the
prefrontal cortex and the temporal cortex. Why? Is this the locus of these
conditions? Are other parts of the brain not also affected? Could altered gene
expression in, say, dopaminergic neurons in the midbrain not be involved in
schizophrenia, for example?
4. Cellular heterogeneity: These studies take what I call the “melon-baller”
approach – they scoop out a little chunk of the brain and analyse gene
expression within it. But each chunk is made up of hundreds of different cell
types, and each cell type will have its own characteristic profile of gene
expression. Any difference in average gene expression levels between bits of
prefrontal cortex in patients and controls will most likely represent a change
in cellular composition.
Is this what we expect underlies the symptoms of these conditions? More or fewer excitatory
or inhibitory neurons or astrocytes or microglia or oligodendrocytes? This
might be a useful method to detect neurodegeneration or inflammation or
gliosis, but there is not really any evidence or reason to suspect that these
kinds of processes are key causal factors. There may well be some inflammation
or other changes as consequences of
these conditions and the lifetime of altered behaviour that accompanies them,
but that is quite another matter.
5. Developmental origins: These psychiatric conditions are neurodevelopmental disorders.
Many lines of converging evidence support the idea that the symptoms have their
origins in altered processes of neural development, and the genes that have
been implicated to date are enriched for those involved in these processes.
That means that the gene expression profiles of interest (if there
are any) would be the ones during development, not in adults. Many genes show
very different profiles of expression in development versus adults – some
switch off entirely, others become much more ubiquitous than at early stages. Patterns
in adults are thus not a good guide to possible differences in development.
Looking in post mortem samples may therefore be many decades too late to detect altered
gene expression of interest.
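To make the cellular heterogeneity point (item 4 above) concrete, here is a minimal sketch of how a bulk tissue measurement behaves when only the mix of cell types changes. Everything in it is an assumption for illustration: the cell-type names, the expression values, the proportions and the `bulk_expression` helper are invented, and real bulk RNA measurements are far messier than a simple weighted average.

```python
# Hypothetical expression of a single gene in each cell type (arbitrary units).
# These numbers are invented for illustration, not taken from any dataset.
EXPRESSION_BY_CELL_TYPE = {
    "excitatory_neuron": 10.0,
    "inhibitory_neuron": 6.0,
    "astrocyte": 2.0,
    "microglia": 1.0,
}


def bulk_expression(composition, expression=EXPRESSION_BY_CELL_TYPE):
    """Return the bulk signal: the proportion-weighted mean across cell types."""
    total = sum(composition.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("cell-type proportions must sum to 1")
    return sum(share * expression[cell] for cell, share in composition.items())


# Two hypothetical tissue chunks. Every cell type expresses the gene at exactly
# the same level in both; only the proportions of cell types differ.
control = {"excitatory_neuron": 0.50, "inhibitory_neuron": 0.20,
           "astrocyte": 0.20, "microglia": 0.10}
patient = {"excitatory_neuron": 0.40, "inhibitory_neuron": 0.20,
           "astrocyte": 0.25, "microglia": 0.15}

control_bulk = bulk_expression(control)
patient_bulk = bulk_expression(patient)
print(f"control bulk: {control_bulk:.2f}")
print(f"patient bulk: {patient_bulk:.2f}")
```

In this toy case the bulk value drops in the "patient" chunk even though no cell has changed how much of the gene it expresses. A difference in the scooped-out average cannot, on its own, distinguish altered regulation from altered composition.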
If all you've got is a hammer...
So, these studies are founded on only the vaguest and most general premise: that
there should exist some kind of convergent pattern of altered gene expression across cell types in certain regions of the brain in adult patients who
have all been given the same clinical diagnosis, and that such a change in gene
expression in the adult brain underlies and in some way can explain the
symptoms.
But consideration of the issues described above shows there is really no logical
justification for this expectation. In fact, it ignores everything we know
about these conditions. Yes, they have genetic origins. That does not mean they
have proximal molecular underpinnings. They have neural underpinnings.
We can’t explain hallucinations or delusions in terms of patterns of gene
expression. We can’t even explain the neural pathophysiological states
underlying such symptoms in terms of patterns of gene expression. It’s just the
wrong level. These states emerge from the trajectories and dynamics of the
neural system, over decades of development and experience. Genetic insults at
the start may increase the probability of one trajectory or another but they do
not underlie or explain the emergent states in any kind of direct or
informative way.
Though they are not presented in this way, the results from the PsychENCODE papers
strongly support this conclusion. They present many statistically significant
findings – mostly of a single type: “enrichment” of genes from one list (e.g.,
GWAS hits) among genes on another list (e.g., those with expression affected in
prefrontal cortex, or those associated with some particular cell type more than
others, or those showing transcript splice form alterations, or those showing
chromatin configuration differences, etc.).
There is such a wealth of data here that these kinds of “omics” analyses can
generate many positive findings, in terms of statistical significance. But what
do they mean? Even by the most charitable reading, it is hard to see how any of
the results presented have advanced our understanding of the biology of these
conditions. While we certainly should not have expected these first analyses to
reveal everything right away, we might reasonably have expected them to reveal something.
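For readers unfamiliar with what an "enrichment" result is, a minimal sketch of one common way to compute it follows: a one-sided hypergeometric test on the overlap between two gene lists. This is an assumed, generic version, not the method used in the PsychENCODE papers, and the counts (a universe of 20,000 genes, lists of 500 and 2,000, an overlap of 75) are hypothetical numbers chosen only for illustration. The function name `enrichment_p_value` is likewise made up.

```python
from math import comb


def enrichment_p_value(universe, list_a, list_b, overlap):
    """One-sided hypergeometric test for overlap between two gene lists.

    Returns the probability of seeing at least `overlap` shared genes if
    `list_b` genes were drawn at random from `universe` genes, of which
    `list_a` belong to the first list.
    """
    upper = min(list_a, list_b)
    tail = sum(
        comb(list_a, k) * comb(universe - list_a, list_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(universe, list_b)


# Hypothetical counts, invented for illustration only.
UNIVERSE = 20_000   # genes considered
GWAS_GENES = 500    # genes on list one (e.g., linked to GWAS hits)
DE_GENES = 2_000    # genes on list two (e.g., altered expression)
SHARED = 75         # genes on both lists

expected = GWAS_GENES * DE_GENES / UNIVERSE   # overlap expected by chance
fold = SHARED / expected                      # fold enrichment
p_value = enrichment_p_value(UNIVERSE, GWAS_GENES, DE_GENES, SHARED)

print(f"expected overlap by chance: {expected:.0f}")
print(f"fold enrichment: {fold:.2f}")
print(f"p-value: {p_value:.2e}")
```

With lists this large, an overlap only 1.5 times the chance expectation can still clear conventional significance thresholds. The test says the overlap is unlikely to be random; it says nothing about which of the shared genes matter, or how.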
Simply put, nothing really definitive comes out. In fact, the strongest result from
these studies is a general one, and it is “negative”: there are no convergent patterns of gene expression in adult brain
that characterise these various psychiatric conditions. That could have been
predicted from the discussion presented above, but at least we can now say it
has strong empirical support.
Now, defenders of this approach might counter by saying that the first tranche of
papers simply present this huge dataset that can now be analysed by many others
and that can generate new hypotheses for further study. And there is no doubt that they will generate many more papers.
But it is not clear to me that they actually do generate new hypotheses – not ones
that are experimentally testable at any rate. The problem relates to
concentrating on the common variant, polygenic component of risk. This involves
so many variants, each with such a minuscule effect, that it is almost
impossible to follow up on in experimental systems (as discussed here).
I guess if all you have is a hammer, everything looks like a nail. But
ultimately, these psychiatric disorders are not a problem genomics alone can
solve. At some stage we have to hand the problem over to neuroscientists. This
means giving them something they can work with experimentally – not just lists
of genes compared to other lists of genes.
My own bet is on rare mutations with
large effects, where it may be much more feasible to identify strong biological
effects and follow the trajectory of events that leads from altered development
or function of some particular cell types and circuits in the developing brain
to the ultimate emergence of particular pathophysiological states. (Not that
that approach doesn’t have its own challenges!)