Is a polygenic model of schizophrenia genetics really proven?

Kenneth Kendler’s article on the nature of genetic variation and the nature of schizophrenia claims that theory and empirical evidence have proven the polygenic architecture of this disorder. In fact, both theory and data are entirely consistent with a very different model of high genetic heterogeneity, where the disorder is largely caused in individuals by one or a few mutations in any of a large number of genes, incorporating important and complex effects of genetic background. 

Kendler (KK) provides a scholarly overview of the history of ideas in these intertwined fields1. While historically interesting, the early arguments between biometricians and Mendelians about continuous versus dichotomous traits conflate two distinct questions: (i) what type of genetic variation contributes to the gradual evolution of new species, and (ii) what type of genetic variation causes disease? There is no reason to expect these to have the same answer and many reasons not to.

With regard to the genetic architecture of schizophrenia (SZ), KK presents the history of various models, from those positing a single major locus to those invoking polygenic mechanisms based on the work of Fisher, Falconer and others. Of course, the single major locus model has long since been rejected and the current debate is really between (i) models of extreme genetic heterogeneity, where the disease is largely caused by one or a small number of rare mutations in each affected individual (in any of a large number of different genes), and (ii) polygenic models involving the combined effects of thousands of common variants that “constitute the gene pool of our species”.

The only reference KK makes to models of genetic heterogeneity regrettably repeats a commonly held but mistaken notion, i.e., that the (negative) results of linkage analyses for SZ refute the theory that the disorder is a “common pathway for a large number of rare quasi-Mendelian disorders”, based on the idea that multiple linkage peaks would have been found if that were the case. This is demonstrably false. Most SZ linkage studies bundled together many small families, as large multiplex SZ pedigrees are rare. If the disorder shows a high level of genetic heterogeneity, combining families will necessarily obscure real linkage signals2. Recent simulations bear this out: in cases where a disorder is associated with decreased fitness and high genetic heterogeneity, linkage studies are predicted to fail3.
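The dilution argument can be made concrete with a deliberately crude sketch. It is not the simulation from ref. 3. It assumes that the pooled score at a locus is the sum of per-family LOD scores evaluated under homogeneity and tight linkage, that a family whose mutation lies at that locus contributes a small positive expected score, and that a family whose mutation lies elsewhere contributes a small negative one. The function name and all numbers are hypothetical, chosen only for illustration.

```python
def expected_pooled_lod(n_families, n_genes, lod_linked=0.3, lod_unlinked=-0.1):
    """Expected summed LOD score at one true risk locus when families are pooled.

    Toy assumptions (not from the original article):
    - each family carries a mutation in exactly one of n_genes equally likely genes
    - a family linked to this locus adds lod_linked on average
    - a family linked elsewhere adds lod_unlinked (negative) on average
    """
    frac_linked = 1.0 / n_genes
    linked_part = n_families * frac_linked * lod_linked
    unlinked_part = n_families * (1.0 - frac_linked) * lod_unlinked
    return linked_part + unlinked_part


if __name__ == "__main__":
    # Same 200 small families, increasing numbers of possible causal genes
    for n_genes in (1, 2, 4, 10, 100):
        print(n_genes, round(expected_pooled_lod(200, n_genes), 1))
```

Under these assumptions the expected pooled score falls as the number of possible genes grows and turns negative once linked families are a small minority, even though every family carries a single high-risk mutation.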

KK also presents several lines of positive evidence as supporting – indeed proving – the polygenic model of SZ. First, he argues that the existence of a phenotypic continuum between clinically diagnosable SZ and SZ-like personality disorders in first-degree relatives proves a polygenic model. It does not. Many classical single-gene disorders show incomplete penetrance and variable expressivity. In some cases, these are due to modifier genes in the background, but – as for SZ itself – phenotypes often vary substantially even between monozygotic twins. What these observations really highlight is that psychiatric diagnostic categories do not represent distinct biological phenotypes; each captures only one of a range of possible outcomes. The clinical and etiological overlap between SZ and other neurodevelopmental disorders, including autism, epilepsy and intellectual disability, reinforces this point4.

Second, KK claims that recent genome-wide association studies (GWAS) and related analyses “have shown that for schizophrenia, Fisher’s model is largely correct”. This interpretation is not warranted by the data. A recent, very large-scale GWAS identified 108 loci with common single-nucleotide polymorphisms (SNPs) showing positive association signals with disease (higher frequency in cases than controls)5. However, GWAS signals do not identify the causal variants or indicate their allele frequency. Numerous examples of synthetic associations caused by rare mutations have been demonstrated, and the fact that rare mutations in many of the implicated loci are known to confer high risk for neuropsychiatric diseases supports this possibility5.
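To see how a synthetic association can arise, here is a minimal sketch using a haploid toy model. It assumes a rare high-risk mutation that occurs only on haplotypes carrying one allele of a common marker SNP, while the marker itself has no effect on risk. The function name and all frequencies and risks are hypothetical and are not taken from any study.

```python
def common_allele_odds_ratio(freq_common, freq_rare, risk_carrier, risk_baseline):
    """Odds ratio observed at a common marker allele that has no effect itself.

    Toy assumptions (illustrative only):
    - haploid model: each individual is represented by a single haplotype
    - the rare mutation (frequency freq_rare) sits only on haplotypes
      carrying the common marker allele (frequency freq_common)
    - carriers of the rare mutation have risk risk_carrier, everyone else
      has risk risk_baseline
    """
    p_marker_with_mutation = freq_rare
    p_marker_without_mutation = freq_common - freq_rare
    p_other_allele = 1.0 - freq_common

    cases_marker = (p_marker_with_mutation * risk_carrier
                    + p_marker_without_mutation * risk_baseline)
    cases_other = p_other_allele * risk_baseline
    controls_marker = (p_marker_with_mutation * (1 - risk_carrier)
                       + p_marker_without_mutation * (1 - risk_baseline))
    controls_other = p_other_allele * (1 - risk_baseline)

    return (cases_marker / cases_other) / (controls_marker / controls_other)


if __name__ == "__main__":
    # A mutation at 0.5% frequency with 20% risk, against a 1% baseline risk
    print(common_allele_odds_ratio(0.3, 0.005, 0.2, 0.01))
    # With no rare mutation present, the marker shows no association
    print(common_allele_odds_ratio(0.3, 0.0, 0.2, 0.01))
```

In this setting the common allele shows a modest odds ratio above 1 even though all of the excess risk is carried by the small subset of haplotypes with the rare mutation. The signal at the marker therefore says little about the frequency or effect size of the variant that is actually causal.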

But even if the causal variants are common, this does not imply that the polygenic model is correct. The GWAS signal is a population-level average statistic and does not speak to how these variants act in individuals. Rather than acting in purely polygenic fashion – a hypothetical mechanism never actually demonstrated to cause disease – common variants may instead act as important modifiers of risk due to rare variants or environmental perturbations – a perfectly well-established mechanism (e.g., ref. 6).

Genome-wide Complex Trait Analysis (GCTA) also cannot determine the number of contributing loci per individual, the number of causal variants across the population or the frequency of causal variants. This is stated clearly by Lee et al.: “From the analyses we have performed, we cannot estimate a distribution of the allele frequency of causal variants”7. These analyses merely show (or claim) that extremely small statistical increases in risk can be detected across distant relatedness, presuming the technical assumptions and methods are valid7,8. In any case, GCTA results for SZ show that most genetic risk is NOT associated with common variants.

Genetic epidemiology at the population level can point to loci of interest, but the findings do not constrain, or even really inform, the genetic architecture of the disorder in individuals. The empirical data are perfectly consistent with a model of high genetic heterogeneity, where most cases are associated with one or a small number of high-risk mutations9, and where the phenotypic expression of these mutations is affected by genetic background10.

Finally, it seems strange to draw moral conclusions about how we should think of or treat people with SZ based on the genetic architecture of the disease. There does not need to be a continuum of risk across the population for healthy people to feel sympathy for those affected. It is very clear, from monozygotic twin concordance rates of ~50%, that those who have SZ were at very high risk of developing it, on average, with the corollary that the majority of the population had effectively zero risk. “Liability” may be normally distributed but that is an imaginary statistical construct – actual risk is clearly not continuous, under any model of genetic architecture. No moral conclusions derive from that fact.
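The arithmetic behind the twin argument can be sketched as follows. The sketch assumes that each person has some individual risk, that monozygotic twins share that risk, and that their outcomes are independent given it, so that probandwise concordance is the ratio E[risk²] / E[risk]. The lifetime prevalence of roughly 1% is my assumption for illustration and is not stated above; the function name and risk classes are hypothetical.

```python
def mz_concordance(risk_classes):
    """Probandwise MZ concordance for a list of (population_fraction, risk) classes.

    Assumes twins share the same risk and that, given that risk, each twin's
    outcome is an independent draw (a simplifying assumption).
    """
    prevalence = sum(frac * risk for frac, risk in risk_classes)
    both_affected = sum(frac * risk * risk for frac, risk in risk_classes)
    return both_affected / prevalence


# Everyone shares the same small risk (assumed prevalence of 1%)
uniform = [(1.0, 0.01)]

# Risk concentrated in 2% of people, each at 50% risk; same 1% prevalence
concentrated = [(0.02, 0.5), (0.98, 0.0)]

if __name__ == "__main__":
    print(mz_concordance(uniform))
    print(mz_concordance(concentrated))
```

With the same overall prevalence, evenly spread risk predicts a concordance no higher than the prevalence itself, whereas a concordance near 50% requires the risk to be concentrated in a small fraction of the population. Real risk distributions need not fall into two classes, but any distribution that gives high concordance at low prevalence must be heavily skewed in this way.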

[Postscript: I haven't gone into all the positive evidence for an important role for rare mutations of large effect in the etiology of SZ, but for many examples and more details see: The Genetic Architecture of Neurodevelopmental Disorders.]

10.1017/S003329171000070X (2011).
10.1186/gb-2012-13-1-237 (2012).

* I originally wrote this as a letter to the editor at Molecular Psychiatry, in response to the article by Kenneth Kendler referenced above, but they didn't like it, so I decided to post it here instead.


  1. Hi Kevin,

    Enjoyed the reply to KK (and many of your postings). I have a few related questions. You make a strong case that the common variant hypothesis is not proven and that rare variants could account for a much bigger proportion of schizophrenia than we now understand. Seems reasonable. But aren't there all sorts of other possibilities too? Where do gene-gene interactions fit into all of this? Or gene-environment interactions? What about many gene interactions and other super complex models? I guess there are arguments that many gene effects are additive, but this just seems strange from an intuitive standpoint. Why wouldn't evolution have mixed genetic influences together? I like the idea that modifier genes might act on a phenotype determined by other genes. But this also seems like a slippery slope. Don't modifiers become indistinguishable from interacting genes at some point, when they have a big enough impact on the phenotype? I guess my question comes down to this (possibly a straw man question based on my incomplete understanding): couldn't many additive common variants acting together account for 5% of cases (or some other single digit percentage) and many rare variants acting independently account for another 5%, and much of the remaining genetic influence be hidden by complex, multi-layered interactions that might be pretty near impossible to disentangle?

  2. Thanks Dwight, for your comment. Personally, I think all those mechanisms are in play. I absolutely think gene-gene interactions are crucial in understanding individual risk. I just don't think it's likely that they can all be between thousands of common variants of tiny effect in an individual (the infinitesimal, death-by-a-thousand-cuts model). That is, I think it is likely to take at least one big insult to push development out of its normally robust path.

    Rather than thinking of some cases being caused by collections of common variants and others being caused by single rare mutations, I think it's much more likely that all cases involve a combination of both. And if there are enough alleles involved then you are right - the designation of some as primary and some as modifiers becomes fuzzy. One way to look at it is whether any particular allele increases risk generally (true for many rare mutations) or only in people with certain other mutations. If the latter, then I'll happily call it a modifier. (The Y chromosome is a good example!)

    But really these are all still open questions (which is the main point of my piece).

