“If there really were rare, highly penetrant mutations that
cause schizophrenia, linkage would have found them”. This argument is often
trotted out in discussions of the genetic architecture of schizophrenia, which
centre on the question of whether it is caused by rare, single mutations in
most cases or whether it is due to unfortunate combinations of thousands of
common variants segregating in the population. (Those are the two extreme
starting positions.)
It is true that many genetic linkage studies have been
performed to look for mutations that are segregating with schizophrenia across
multiple affected members in families. It is also true that these have been
unsuccessful in identifying specific genes, but what does this tell us? Does it
really rule out or even argue against the idea that most cases are caused by a
single, rare mutation? (That is, a mutation without which the person would not
be expected to have the disorder.)
This depends very much on the details of how these studies
were carried out, their underlying assumptions, their specific findings and the
real genetic architecture of the disorder. The idea of genetic linkage studies
is that if you have a disease segregating in a particular family, you can use
neutral genetic markers across the genome to look at the inheritance of
different segments of chromosomes through the pedigree and track which ones
co-segregate with the disease. For example, maybe all the affected children
inherited a particular segment of chromosome 7 from mom, which can be tracked
back to her dad and which is also carried by two of her brothers, who are
affected, but not her sister, who is unaffected.
The problem is this: for each transmission from parent to
child, 50% of the parent’s DNA is passed on (one copy of each chromosome, which
is a shuffled version of the parent’s two copies of that chromosome – usually
one segment from grandma, one from granddad, though sometimes there is a little
more shuffling). If we only have one such transmission to look at, then we can
only narrow down the region carrying a presumptive mutation to 50% of the
genome – not much help really. In order for linkage studies to have power, you
need to get data from many such transmissions and you therefore need big
pedigrees – really huge pedigrees, actually, with information across multiple
generations and preferably extending to lots of second or third degree
relatives.
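To put rough numbers on this, here is a minimal sketch in Python. It assumes an idealised situation that real pedigrees never quite offer: every transmission is fully informative, phase is known, penetrance is complete and nobody is misdiagnosed. Under those assumptions each transmission halves the portion of the genome that could still be co-segregating with the disease by chance, and contributes at most log10(2) to the LOD score (the log-odds statistic used in linkage analysis, where 3 is the conventional threshold for significance). The function names are my own, for illustration only.

```python
import math

def fraction_of_genome_remaining(n_meioses: int) -> float:
    """Expected fraction of the genome still consistent with the disease
    pattern by chance after n fully informative transmissions (idealised)."""
    return 0.5 ** n_meioses

def max_lod(n_meioses: int) -> float:
    """Best-case LOD score at zero recombination: each fully informative,
    non-recombinant meiosis contributes log10(2)."""
    return n_meioses * math.log10(2)

def meioses_needed(lod_threshold: float = 3.0) -> int:
    """Smallest number of ideal meioses whose best-case LOD reaches the threshold."""
    return math.ceil(lod_threshold / math.log10(2))

if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, fraction_of_genome_remaining(n), round(max_lod(n), 2))
    print("ideal meioses needed for LOD 3:", meioses_needed(3.0))
```

Even in this best case, about ten clean transmissions are needed to reach the conventional threshold. With incomplete penetrance, uncertain diagnoses and uninformative markers, the real requirement is considerably higher.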
Where linkage studies of other diseases have been successful, those are the
kinds of pedigrees that have been analysed. But they are not easy to find. That
is especially true for schizophrenia, for a simple and tragic reason – this is
a devastating disorder that strikes at an early age and causes very substantial
impairment. It is associated with much higher mortality and drastically reduced
fecundity (about a third the number of offspring), on average. The result is
that people with the disorder tend to have fewer children and, if a mutation
causing it is segregating in a pedigree, one would expect the pedigree to be
smaller overall.
So, finding really large pedigrees where schizophrenia is
clearly segregating across multiple generations has not been easy – in fact,
there are very few reported that would be large enough by themselves to allow a
highly powered linkage study. (Here are some exceptions: Lindholm et al., 2001;
Teltsh et al., 2008; Myles-Worsley et al., 2011.)
Another thing that is absolutely imperative for linkage
studies to work is that you know you are looking at the right phenotype – you must
be certain of the affected status of each member of the pedigree. The analyses
can tolerate misassignment of a few people, and can incorporate models of
incomplete penetrance – where not all carriers of the mutation necessarily
develop the disease. But if too many individuals are misassigned, the noise
outweighs the signal. This is a particular problem for neuropsychiatric
disorders, which we are now realising have highly overlapping genetic etiology.
This is seen at the epidemiological level, in terms of shared risk across
clinical categories, but also in the effects of particular, identified
mutations, none of which is specific to a single disorder. All the known
mutations predispose to disease across diagnostic boundaries, manifesting in
some people as schizophrenia, in others as bipolar disorder, autism, epilepsy,
intellectual disability or other conditions.
Thus, schizophrenia does not typically “breed true” – there
are few very large pedigrees where schizophrenia appears across multiple
individuals in the absence of other neuropsychiatric conditions in the
family. Such mixed-diagnosis pedigrees were typically excluded from linkage
studies on the assumption that schizophrenia represents a natural kind at a
genetic level. In fact, they might have been the most useful (and still could
be) if what is tracked is neuropsychiatric disease more broadly.
Given the scarcity of large pedigrees where schizophrenia was
clearly segregating across multiple generations, researchers tried another
approach, which is to find many smaller pedigrees and analyse them together. If
schizophrenia is caused by the same mutation in different families across a
population, this method should find it. That assumption holds for some simple
Mendelian diseases where there is only or predominantly one genetic locus
involved – such as Huntington’s disease or cystic fibrosis. But you can see
what would happen to your study if it does not hold – if the disorder can in
fact be caused by mutations in any of a large number of different genes – any
real signals from specific families would be diluted by noise from all the
other families.
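The dilution effect can be sketched with a toy calculation. This is not how any published study was analysed. It assumes phase-known, fully informative meioses pooled across families, a marker sitting right on the causal locus in the "linked" families (no recombinants), and a 50% recombinant rate in all the other families, where the marker is irrelevant. The standard two-point LOD score is then evaluated at the expected recombinant count. The function name and the example figures are hypothetical.

```python
import math

def expected_pooled_lod(n_meioses: int, linked_fraction: float) -> float:
    """LOD score of a pooled analysis, evaluated at the expected number of
    recombinants, when only `linked_fraction` of the pooled meioses come
    from families whose causal mutation lies at the tested locus."""
    # Linked families contribute no recombinants; unlinked families look
    # like coin flips, so half of their meioses appear recombinant.
    theta = 0.5 * (1.0 - linked_fraction)
    if theta == 0.0:
        # Perfect homogeneity: every meiosis contributes log10(2).
        return n_meioses * math.log10(2)
    # Two-point LOD per meiosis at the maximum-likelihood estimate of theta:
    # log10[theta^theta * (1 - theta)^(1 - theta) / 0.5]
    per_meiosis = (theta * math.log10(theta)
                   + (1.0 - theta) * math.log10(1.0 - theta)
                   + math.log10(2))
    return n_meioses * per_meiosis

if __name__ == "__main__":
    for fraction in (1.0, 0.5, 0.1, 0.01):
        print(fraction, round(expected_pooled_lod(1000, fraction), 2))
```

Under these assumptions, a pool of 1,000 meioses in which only one family in ten carries a mutation at the tested locus yields a LOD score below the conventional threshold of 3, even though the linked families on their own would carry a strong signal. As the linked fraction falls further, the signal all but disappears.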
Many such studies have been published, some combining very
large numbers of families (in the hundreds). These have failed to localise any
clearly consistent linkage regions, never mind specific genes, that harbour
schizophrenia-causing mutations. This leads to one (and only one) very firm
conclusion: schizophrenia is not like cystic fibrosis or Huntington’s disease –
it is not caused by mutations at a single genetic locus or, indeed, at a small
number of loci.
Nothing else can be concluded from these negative
results.
In particular, they do not argue against the possibility
that schizophrenia is indeed caused by specific, single mutations in each individual
or each family where it is segregating, if such mutations can occur at any of a
large number of loci; i.e., if the disorder is characterised by very high
genetic heterogeneity. This is not an outlandish model – one only has to look
at conditions like intellectual disability, epilepsy, congenital deafness or
various kinds of inherited blindness for examples of conditions that can be
caused by mutations in dozens or even hundreds of different genes.
As it happens, the schizophrenia linkage studies have not
necessarily been completely negative – many have found some positive linkage
peaks, pointing to particular regions of the genome. These studies have not had
the power to refine these signals down to a specific mutation, however, and
most specific findings have not been replicated across other studies. It is
therefore hard to tell if the statistical signals represent true or false
positives in each study. But this lack of replication is to be expected under a
model of extreme heterogeneity.
So, we can lay that argument to rest – the absence of
evidence from linkage studies is not evidence of anything – it does not, at
least, bear on the current debate.
It should be stressed, however, that the failure of linkage
also does not provide positive support for the model of extreme genetic
heterogeneity – it is simply consistent with it. There are additional lines of
evidence that argue against the most extreme version of the multiple rare
variants model – the one that says each case is caused by a single mutation. I and
others have argued that that is the best theoretical starting point and that we
should complicate the model as necessary – but not more than necessary – to accommodate
empirical findings (as opposed to jumping immediately to a massively polygenic
model of inheritance, which has some very shaky underlying assumptions).
Such empirical findings include the incomplete penetrance of
schizophrenia-associated mutations (which manifest as schizophrenia in only a
percentage of carriers) and the range of additional phenotypes that they can
cause. These findings suggest a prominent role for genetic modifiers –
additional genetic variants in the background of each individual that modify
the phenotypic expression of the primary mutation. This is to be expected – in
fact, it is observed for even the most classically “Mendelian” disorders. In
some cases, it may be impossible to even identify one mutation as
“primary” – perhaps two or three mutations are required to really cause the
disorder. Alternatively, some families, especially those with a very high
incidence of mental illness, may have more than one causal mutation
segregating, possibly coming from both parental lines (complicating linkage
studies in just those families that look most promising).
One hope for finding causal mutations is the current
technical ease and cost-effectiveness of sequencing the entire genome (or the
part coding for proteins – the exome) of large numbers of individuals. The
first whole-exome-sequencing study of schizophrenia has recently been
published, with results that seem disappointing at first glance. The authors
sequenced the exomes of 166 people with schizophrenia, identifying a couple of
hundred very rare, protein-changing mutations in each person. This is
normal – each of us typically carries that kind of burden of rare, possibly
deleterious mutations. Finding which ones might be causing disease is the
tricky bit – the hope is that multiple hits in the same gene(s) might emerge
across the affected people. (This has been the case recently for similar
studies of autism, with larger sample sizes and looking specifically for de
novo, rather than inherited, mutations.) No clear hits emerged from this study
and follow-up of specific candidate mutations in a much larger sample did not
provide strong support for any of them. (It should be stressed that this was a
test for very particular mutations, not for the possible effects of any
mutations in a given gene.)
Again, we should be cautious about over-extrapolating from
these negative data. The justified conclusion is that there are no moderately
rare mutations segregating in the population that cause this disorder. These
findings do not rule out (or even speak to) the possibility that the disease is
caused by very rare mutations, specific instances of which would not be
replicated in a wider population sample.
Much larger sequencing studies will be required to resolve
this question. If, like intellectual disability, there are hundreds of genetic
loci where mutations can result in schizophrenia, then samples of thousands of
individuals will be required to find enough multiple hits to get good
statistical evidence for any specific gene (allowing for heterogeneity of
mutations at each locus). Such studies will emerge over the next couple of
years and we will then be in a position to see how much more complicated our
model needs to be. If even these larger studies fail to collar specific
culprits, then we will have to figure out ways to resolve more complex genetic
interactions that can explain the heritability of the disorder. For now, there are
no grounds to reject the working model of extreme genetic heterogeneity with a
primary, causal mutation in most cases.
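The sample-size point can be illustrated with a back-of-the-envelope sketch. It makes strong simplifying assumptions that are mine, not findings: every case carries exactly one causal mutation, that mutation is equally likely to fall in any of `n_genes` genes, and every causal mutation is detected. The number of cases hitting a given gene is then treated as Poisson-distributed. The gene count of 300 is a hypothetical stand-in for "hundreds of loci".

```python
import math

def prob_at_least_k_hits(n_cases: int, n_genes: int, k: int) -> float:
    """Poisson probability that a given gene is hit in at least k cases,
    assuming causal mutations are spread evenly over n_genes genes."""
    lam = n_cases / n_genes  # expected number of cases per gene
    p_fewer = sum(math.exp(-lam) * lam ** i / math.factorial(i)
                  for i in range(k))
    return 1.0 - p_fewer

def expected_genes_with_k_hits(n_cases: int, n_genes: int, k: int) -> float:
    """Expected number of genes hit in at least k cases."""
    return n_genes * prob_at_least_k_hits(n_cases, n_genes, k)

if __name__ == "__main__":
    n_genes = 300  # hypothetical number of causal genes
    for n_cases in (166, 1000, 5000):
        print(n_cases,
              round(prob_at_least_k_hits(n_cases, n_genes, 3), 4),
              round(expected_genes_with_k_hits(n_cases, n_genes, 3), 1))
```

Even this is optimistic, because it ignores the background of a couple of hundred rare, protein-changing variants that every person carries, against which any recurrent hits have to stand out statistically.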







