Why have genetic linkage studies of schizophrenia failed?
“If there really were rare, highly penetrant mutations that cause schizophrenia, linkage would have found them”. This argument is often trotted out in discussions of the genetic architecture of schizophrenia, which centre on the question of whether it is caused by rare, single mutations in most cases or whether it is due to unfortunate combinations of thousands of common variants segregating in the population. (Those are the two extreme starting positions).
It is true that many genetic linkage studies have been performed to look for mutations that are segregating with schizophrenia across multiple affected members in families. It is also true that these have been unsuccessful in identifying specific genes, but what does this tell us? Does it really rule out or even argue against the idea that most cases are caused by a single, rare mutation? (In the sense that, if the person did not have that mutation, they would not be expected to have the disorder).
This depends very much on the details of how these studies were carried out, their underlying assumptions, their specific findings and the real genetic architecture of the disorder. The idea of genetic linkage studies is that if you have a disease segregating in a particular family, you can use neutral genetic markers across the genome to look at the inheritance of different segments of chromosomes through the pedigree and track which ones co-segregate with the disease. For example, maybe all the affected children inherited a particular segment of chromosome 7 from mom, which can be tracked back to her dad and which is also carried by two of her brothers, who are affected, but not her sister, who is unaffected.
The problem is this: for each transmission from parent to child, 50% of the parent’s DNA is passed on (one copy of each chromosome, which is a shuffled version of the parent’s two copies of that chromosome – usually one segment from grandma, one from granddad, though sometimes there is a little more shuffling). If we only have one such transmission to look at, then we can only narrow down the region carrying a presumptive mutation to 50% of the genome – not much help really. In order for linkage studies to have power, you need to get data from many such transmissions and you therefore need big pedigrees – really huge pedigrees, actually, with information across multiple generations and preferably extending to lots of second or third degree relatives.
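To put rough numbers on that, here is a minimal sketch of how quickly the candidate region shrinks as informative transmissions accumulate. It assumes, purely for illustration, that each transmission is fully informative and independently halves the part of the genome still compatible with carrying the mutation; real pedigrees are messier than that. The function name is made up for this example.

```python
def candidate_fraction(n_transmissions: int) -> float:
    """Fraction of the genome still compatible with carrying the mutation.

    Simplifying assumption (not from any specific study): every transmission
    is fully informative and independently halves the candidate region.
    """
    if n_transmissions < 0:
        raise ValueError("n_transmissions must be non-negative")
    return 0.5 ** n_transmissions


# One transmission leaves half the genome; ten leave roughly a thousandth.
for n in (1, 2, 5, 10):
    print(f"{n:2d} transmissions -> {candidate_fraction(n):.4%} of the genome")
```

Even under these generous assumptions, a handful of transmissions leaves a large share of the genome in play, which is why small pedigrees have so little power on their own.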
Where linkage studies of other diseases have been successful, those are the kinds of pedigrees that have been analysed. But they are not easy to find. That is especially true for schizophrenia, for a simple and tragic reason – this is a devastating disorder that strikes at an early age and causes very substantial impairment. It is associated with much higher mortality and drastically reduced fecundity (about a third the number of offspring), on average. The result is that people with the disorder tend to have fewer children and, if a mutation causing it is segregating in a pedigree, one would expect the pedigree to be smaller overall.
So, finding really large pedigrees where schizophrenia is clearly segregating across multiple generations has not been easy – in fact, there are very few reported that would be large enough by themselves to allow a highly powered linkage study. (Here are some exceptions: Lindholm et al., 2001; Teltsh et al., 2008; Myles-Worsley et al., 2011.)
Another thing that is absolutely imperative for linkage studies to work is that you know you are looking at the right phenotype – you must be certain of the affected status of each member of the pedigree. The analyses can tolerate misassignment of a few people, and can incorporate models of incomplete penetrance – where not all carriers of the mutation necessarily develop the disease. But if too many individuals are misassigned, the noise outweighs the signal. This is a particular problem for neuropsychiatric disorders, which we are now realising have highly overlapping genetic etiology. This is seen at the epidemiological level, in terms of shared risk across clinical categories, but also in the effects of particular, identified mutations, none of which is specific to a single disorder. All the known mutations predispose to disease across diagnostic boundaries, manifesting in some people as schizophrenia, in others as bipolar disorder, autism, epilepsy, intellectual disability or other conditions.
Thus, schizophrenia does not typically “breed true” – there are few very large pedigrees where schizophrenia appears across multiple individuals in the absence of some other neuropsychiatric conditions in the family. Such mixed diagnosis pedigrees were typically excluded from linkage studies on the assumption that schizophrenia represents a natural kind at a genetic level. In fact, they might have been the most useful (and still could be) if what is tracked is neuropsychiatric disease more broadly.
Given the scarcity of large pedigrees where schizophrenia was clearly segregating across multiple generations, researchers tried another approach, which is to find many smaller pedigrees and analyse them together. If schizophrenia is caused by the same mutation in different families across a population, this method should find it. That assumption holds for some simple Mendelian diseases where there is only or predominantly one genetic locus involved – such as Huntington’s disease or cystic fibrosis. But you can see what would happen to your study if it does not hold – if the disorder can in fact be caused by mutations in any of a large number of different genes – any real signals from specific families would be diluted by noise from all the other families.
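The dilution can be illustrated with a toy calculation. Suppose, as a deliberately crude assumption, that in families where the disorder is linked to a given locus, affected relatives always share the relevant chromosome segment there, while in all other families they share it only at the chance rate of 50%. If causal mutations are spread evenly over some number of loci, the pooled sharing you observe at any one locus is a weighted average of the two. The function name and default values below are hypothetical.

```python
def pooled_sharing(n_loci: int,
                   linked_sharing: float = 1.0,
                   chance_sharing: float = 0.5) -> float:
    """Average sharing at one locus when families are pooled together.

    Toy assumptions: each family's disorder is caused by exactly one of
    `n_loci` equally likely loci; linked families show `linked_sharing`,
    all other families show `chance_sharing`.
    """
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    fraction_linked = 1.0 / n_loci
    return (fraction_linked * linked_sharing
            + (1.0 - fraction_linked) * chance_sharing)


# With one locus the signal is obvious; with a hundred it is barely above chance.
for k in (1, 2, 10, 100):
    print(f"{k:3d} loci -> pooled sharing {pooled_sharing(k):.3f}")
```

With a single locus the pooled value sits at 1.0, far from chance. With a hundred loci it is 0.505, which would be very hard to distinguish from 0.5 in any realistic sample.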
Many such studies have been published, some combining very large numbers of families (in the hundreds). These have failed to localise any clearly consistent linkage regions, never mind specific genes, that harbour schizophrenia-causing mutations. This leads to one (and only one) very firm conclusion: schizophrenia is not like cystic fibrosis or Huntington’s disease – it is not caused by mutations at a single genetic locus or, indeed, at a small number of loci.
Nothing else can be concluded from these negative results.
In particular, they do not argue against the possibility that schizophrenia is indeed caused by specific, single mutations in each individual or each family where it is segregating, if such mutations can occur at any of a large number of loci; i.e., if the disorder is characterised by very high genetic heterogeneity. This is not an outlandish model – one only has to look at conditions like intellectual disability, epilepsy, congenital deafness or various kinds of inherited blindness for examples of conditions that can be caused by mutations in dozens or even hundreds of different genes.
As it happens, the schizophrenia linkage studies have not necessarily been completely negative – many have found some positive linkage peaks, pointing to particular regions of the genome. These studies have not had the power to refine these signals down to a specific mutation, however, and most specific findings have not been replicated across other studies. It is therefore hard to tell if the statistical signals represent true or false positives in each study. But this lack of replication is to be expected under a model of extreme heterogeneity.
So, we can lay that argument to rest – the absence of evidence from linkage studies is not evidence of anything – it does not, at least, bear on the current debate.
It should be stressed, however, that the failure of linkage also does not provide positive support for the model of extreme genetic heterogeneity – it is simply consistent with it. There are additional lines of evidence that argue against the most extreme version of the multiple rare variants model – the one that says each case is caused by a single mutation. I and others have argued that that is the best theoretical starting point and that we should complicate the model as necessary – but not more than necessary – to accommodate empirical findings (as opposed to jumping immediately to a massively polygenic model of inheritance, which has some very shaky underlying assumptions).
Such empirical findings include the incomplete penetrance of schizophrenia-associated mutations (which manifest as schizophrenia in only a percentage of carriers) and the range of additional phenotypes that they can cause. These findings suggest a prominent role for genetic modifiers – additional genetic variants in the background of each individual that modify the phenotypic expression of the primary mutation. This is to be expected – in fact, it is observed for even the most classically “Mendelian” disorders. In some cases, it may be impossible to even identify one mutation as “primary” – perhaps two or three mutations are required to really cause the disorder. Alternatively, some families, especially those with a very high incidence of mental illness, may have more than one causal mutation segregating, possibly coming from both parental lines (complicating linkage studies in just those families that look most promising).
One hope for finding causal mutations is the current technical ease and cost-effectiveness of sequencing the entire genome (or the part coding for proteins – the exome) of large numbers of individuals. The first whole-exome-sequencing study of schizophrenia has recently been published, with results that seem disappointing at first glance. The authors sequenced the exomes of 166 people with schizophrenia, identifying around a couple hundred very rare, protein-changing mutations in each person. This is normal – each of us typically carries that kind of burden of rare, possibly deleterious mutations. Finding which ones might be causing disease is the tricky bit – the hope is that multiple hits in the same gene(s) might emerge across the affected people. (This has been the case recently for similar studies of autism, with larger sample sizes and looking specifically for de novo, rather than inherited, mutations). No clear hits emerged from this study and follow-up of specific candidate mutations in a much larger sample did not provide strong support for any of them. (It should be stressed, this was a test for very particular mutations, not for the possible effects of any mutations in a given gene).
Again, we should be cautious about over-extrapolating from these negative data. The justified conclusion is that there are no moderately rare mutations segregating in the population that cause this disorder. These findings do not rule out (or even speak to) the possibility that the disease is caused by very rare mutations, specific instances of which would not be replicated in a wider population sample.
Much larger sequencing studies will be required to resolve this question. If, like intellectual disability, there are hundreds of genetic loci where mutations can result in schizophrenia, then samples of thousands of individuals will be required to find enough multiple hits to get good statistical evidence for any specific gene (allowing for heterogeneity of mutations at each locus). Such studies will emerge over the next couple of years and we will then be in a position to see how much more complicated our model needs to be. If even these larger studies fail to collar specific culprits, then we will have to figure out ways to resolve more complex genetic interactions that can explain the heritability of the disorder. For now, there are no grounds to reject the working model of extreme genetic heterogeneity with a primary, causal mutation in most cases.
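A back-of-the-envelope sketch shows why sample size matters so much here. It assumes, for the sake of argument, the most extreme version of the model: every case carries one causal mutation, and those mutations are spread evenly across a fixed number of loci. The figure of 500 loci is an arbitrary illustration, not an estimate, and the Poisson approximation is likewise an assumption of this sketch.

```python
import math


def expected_hits_per_gene(n_cases: int, n_loci: int,
                           fraction_explained: float = 1.0) -> float:
    """Expected number of cases with a causal mutation in any one gene.

    Assumes causal mutations are spread evenly over `n_loci` genes and that
    `fraction_explained` of cases carry exactly one such mutation.
    """
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    return n_cases * fraction_explained / n_loci


def prob_at_least_two(expected_hits: float) -> float:
    """Poisson probability of seeing two or more hits in a given gene."""
    return 1.0 - math.exp(-expected_hits) * (1.0 + expected_hits)


# 166 is the exome study's sample size; 500 loci is a hypothetical value.
for n_cases in (166, 1000, 5000):
    lam = expected_hits_per_gene(n_cases, n_loci=500)
    print(f"{n_cases:5d} cases: {lam:.2f} expected hits per gene, "
          f"P(>=2 hits) = {prob_at_least_two(lam):.3f}")
```

Under these assumptions, a sample of 166 yields well under one expected hit per gene, so recurrent hits would be the exception rather than the rule, and they would still have to stand out against the background of a couple hundred rare variants per person. Samples in the thousands change that picture.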