Monday, April 14, 2014

The Trouble with Epigenetics, Part 3 – over-fitting the noise

The idea of transgenerational epigenetic inheritance of acquired behaviors is in the news again, this time thanks to a new paper in Nature Neuroscience (who seem to have a liking for this sort of thing).

The paper is provocatively titled: “Implication of sperm RNAs in transgenerational inheritance of the effects of early trauma in mice”. The abstract claims that:

“We found that traumatic stress in early life altered mouse microRNA (miRNA) expression, and behavioral and metabolic responses in the progeny. Injection of sperm RNAs from traumatized males into fertilized wild-type oocytes reproduced the behavioral and metabolic alterations in the resulting offspring.”

Unfortunately, the paper provides no evidence to back up those extraordinary claims. It is, regrettably, a prime example of over-fitting the noise. That is, finding patterns in a mass of messy data, like faces in clouds, and building hypotheses on them after the fact. If any change in any parameter will do, it isn’t hard to find support that “something happens”. I have written about this problem before, exemplified by previous papers from this group. I normally try not to be sarcastic here, but I don’t have time to edit today, so you’re getting raw, unfiltered exasperation this time.

There are some documented examples of transgenerational effects mediated by RNAs in sperm, especially in worms and plants. Almost all of these involve repression of transposon or transgene insertions. This is not believed to be a widespread phenomenon in mammals, however, and you don’t need to (and shouldn’t!) take my word for it – the following is from a very recent review by leaders in this field:

"...epigenetic inheritance is usually—if not always—associated with transposable elements, viruses, or transgenes and may be a byproduct of aggressive germline defense strategies. In mammals, epialleles can also be found but are extremely rare, presumably due to robust germline reprogramming. How epialleles arise in nature is still an open question, but environmentally induced epigenetic changes are rarely transgenerationally inherited, let alone adaptive, even in plants. Thus, although much attention has been drawn to the potential implications of transgenerational inheritance for human health, so far there is little support."

Shutting down a transposon in gametes and the resultant offspring is one thing – it’s a pretty straightforward molecular mechanism, actually. Using such a mechanism to transmit a behavioural change induced by an experience in the previous generation is something else entirely. What that would require is the following sequence of events: animal has an experience, experience is registered by the brain (so far, so good), signal is transmitted to the gametes (hmm, by what?), relevant gene or genes are specifically modified (how? why just those genes?), modification is maintained in the zygote through “genome rebooting” (what, now?), modification is maintained throughout subsequent development of the animal and the brain (really?), but in a selective way so that somehow in the adult it only affects expression in certain brain regions so as to initiate an appropriate behavioural change in the offspring (ah, c’mon, now you’re taking the piss...).

That is why my skepticometer gets pegged by studies that make such claims without documenting or even suggesting a plausible mechanism by which such events could occur. The current paper takes a stab at one part of that, by looking at small non-coding RNAs as a possible mediator. Unfortunately, the paper is… well, let me show you.

The authors use a paradigm which they developed previously (in one of the papers which I criticised here) to induce what they call a traumatic stress. This involves “unpredictable maternal separation combined with unpredictable maternal stress (MSUS) for 3 hours daily from postnatal day 1 through 14 (PND 1–14)”. The pups don’t like that, apparently, and the authors claim they grow up to show “depressive-like behaviours”. I find those behavioural data a bit shaky, but they get much worse in the following generations, when the responses vary in one test, in one sex, in one generation, and then in another test in the other sex in the next. It all looks like noise to me, and the authors neither correct for all these multiple tests nor provide any hypothesis to account for these fluctuating effects.

In their 2010 paper, they looked at DNA methylation of some candidate genes in F1 sperm and F2 brains to see if they could find a molecular mechanism. Here’s the figure:

There are a lot of asterisks on there, indicating some changes that are statistically significant (taken alone), but you’ll notice how many different measurements they have made and, also, I hope, the lack of consistency in the supposed effects from F1 to F2. Importantly, there is no independent replication – just one big experiment with the stats done on the whole lot at once. It is no surprise that some data points come out as significant. I’m thinking of green jelly beans.
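The arithmetic behind the jelly-bean problem is worth spelling out. If each comparison is tested at α = 0.05 and there are no real effects at all, the chance of at least one “significant” result grows rapidly with the number of tests. A minimal illustration (the test counts here are arbitrary, not the paper’s actual numbers):

```python
# With no real effects, each test still has a 5% chance of coming up
# "significant". Across n independent tests, the chance of at least
# one false positive is 1 - 0.95^n.
alpha = 0.05
for n in (1, 10, 20, 40):
    p_any = 1 - (1 - alpha) ** n
    print(f"{n:>2} tests: P(at least one false positive) = {p_any:.2f}")
```

By 20 uncorrected tests, a false positive somewhere is more likely than not – which is why corrections for multiple comparisons, or better still independent replication, matter.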

You can see the same kind of thing in the figure below from this recent paper, which also got a lot of media attention: “Parental olfactory experience influences behavior and neural structure in subsequent generations”.

In both cases, the data look to me like noise.

Now, back to this latest paper. Amidst a load of somewhat peripheral data, the data supporting the two main claims of the paper are the following: First, the authors claim that the maternal separation protocol alters the levels of various small non-coding RNA molecules in the sperm of the F1 mice (the ones whose moms were cruelly taken away). The data for that claim are in Supplemental Figure 2, which I reproduce below. You will see it derives from three pools of mice for the control condition and three pools from the MSUS condition.

I see no consistent pattern of changes here. There looks to be as much variability within conditions as between. (Take MSUS pool 2 out and you wouldn’t be left with much signal, I would wager.) I am sure there is some statistical test that would give you a significant result, but if you torture the data enough, they’re bound to try to tell you something.
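To see how easily “hits” emerge from comparisons of three pools versus three pools, here is a toy simulation (entirely made up, not the paper’s data): both groups are drawn from the same distribution for each of 100 hypothetical miRNAs, yet a crude significance screen still flags some of them.

```python
import random
import statistics

random.seed(1)

def looks_significant(a, b):
    # Crude screen: group means differ by more than 2 pooled SEMs.
    diff = abs(statistics.mean(a) - statistics.mean(b))
    sem = (statistics.stdev(a) ** 2 / len(a)
           + statistics.stdev(b) ** 2 / len(b)) ** 0.5
    return diff > 2 * sem

# 100 hypothetical miRNAs, 3 "control" vs 3 "MSUS" pools each,
# all drawn from the SAME normal distribution (no real effect).
hits = sum(
    looks_significant([random.gauss(0, 1) for _ in range(3)],
                      [random.gauss(0, 1) for _ in range(3)])
    for _ in range(100)
)
print(hits)  # a handful of "significant" miRNAs, despite no real differences
```

With samples this small, the estimated variance is itself so noisy that apparent group differences are cheap to come by.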

Their next figure takes some of these specific miRNAs and examines their expression levels in sperm, serum and various brain regions of F1 and F2 mice. Again, the data are all over the place. They’re up, they’re down, they’re not changed. They’re changed in hippocampus but some go in the opposite direction in hypothalamus and none are changed in cortex (Supplementary Figure 10 for those reading along at home).

That’s some noisy noise right there. Notably, they see no changes in sperm of F2, even though the F3 supposedly still show behavioural changes, rather undermining their own case for the link between these two (non-)events.

Their next step is the one that the main conclusion rests on – to show that injection of small RNAs from the sperm of an MSUS F1 mouse into a fertilised oocyte can induce the suite of behavioural changes they (claim to) see in the F2 generation under normal conditions.

Amazingly, they do not actually show those data. We get summary t statistics claiming there are some differences but are not treated to the actual data themselves. So, we can’t evaluate the effect sizes or the underlying variability of the data. Here’s how it reads in the paper:

They do show a supposed effect on metabolism in the MSUS-RNA-injected animals – a difference in glucose levels seen in one experiment with 8 animals per group, not at baseline but after stress. I have no idea what’s going on here or what the hypothesis is supposed to be – are we supposed to expect higher or lower glucose? At baseline or after a stress? Whatever is happening, it’s not consistent between the F1 and the F2 in the traditional paradigm, nor between the traditional F2 and the MSUS-RNA-injected F2.

Finally, we are shown that the levels of one of the miRNAs differs in the hippocampus of the MSUS-RNA-injected F2. Doesn’t look very convincing to me, by itself – it’s the kind of result one might want replicated before publishing, but more to the point: Why that one? What about all the others whose levels fluctuated so happily in the figure shown above?

Overall, there’s no there there. It’s all sound and fury, signifying nothing. I would give it the ultimate insult by saying it’s not even wrong, but it is.

Nevertheless, this paper is sure to be latched onto by the woo crowd who seem to think that epigenetics is some kind of magic. (Now I have that Queen song running in my head - you're welcome). We can change our genes! They’re not our destiny! Toxins cause autism because epigenetics! Hooray!

Evolution appears to have made us mammals very delicate creatures. If you look sideways at a mouse these days you can permanently alter its genes, it seems, along with those of its kids and grandkids. Of course, you’d think another look might change them back if they're so sensitive, but apparently not. I’m sure your genes (ooh, and brain circuits!) have been changed by reading this, for which I can only apologise.

Monday, March 24, 2014

Gay genes? Yeah, but no, well kind of… but, so what?

Sexual preference is one of the most strongly genetically determined behavioural traits we know of. A single genetic element is responsible for most of the variation in this trait across the population. Nearly all (>95%) of the people who inherit this element are sexually attracted to females, while about the same proportion of people who do not inherit it are attracted to males. This attraction is innate, refractory to change and affects behaviour in stereotyped ways, shaped and constrained by cultural context. It is the commonest and strongest genetic effect on behaviour that we know of in humans (in all mammals, actually). The genetic element is of course the Y chromosome.

The idea that sexual behaviour can be affected by – even largely determined by – our genes is therefore not only not outlandish, it is trivially obvious. Yet claims that differences in sexual orientation may have at least a partly genetic basis seem to provoke howls of scepticism and outrage from many, mostly based not on scientific arguments but political ones.

The term sexual orientation refers to whether your sexual preference matches the typical preference based on whether or not you have a Y chromosome. It is important to realise that it therefore refers to four different states, not two: (i) has Y chromosome, is attracted to females; (ii) has Y chromosome, is attracted to males; (iii) does not have Y chromosome, is attracted to males; (iv) does not have Y chromosome, is attracted to females. We call two of these states heterosexual and two of them homosexual. (This ignores the many individuals whose sexual preferences are not so exclusive or rigid).

A recent twin study confirms that sexual orientation is moderately heritable – that is, that variation in genes contributes to variation in this trait. These effects are detected by looking at pairs of twins and determining how often, when one of them is homosexual, the other one is too. This rate is much higher (30-50%) in monozygotic, or identical, twins (who share all of their DNA sequence), than in dizygotic, or fraternal, twins (who share only half of their DNA), where the rate is 10-20%. If we assume that the environments of pairs of mono- or dizygotic twins are equally similar, then we can infer that the increased similarity in sexual orientation in pairs of monozygotic twins is due to their increased genetic similarity.
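As a rough illustration of how such twin figures translate into a heritability estimate, Falconer’s classic approximation sets H² ≈ 2 × (r_MZ − r_DZ). The inputs below are mid-range stand-ins for the concordance ranges quoted above, not the study’s actual twin correlations (real analyses use liability-threshold models on an underlying continuous scale):

```python
# Falconer's rough approximation: H^2 ~= 2 * (r_MZ - r_DZ).
# Inputs are illustrative mid-range values, not the study's estimates.
def falconer_h2(r_mz, r_dz):
    return 2 * (r_mz - r_dz)

print(falconer_h2(0.40, 0.15))  # roughly 0.5, i.e. about half the variance
```

The point of the formula is simply that the excess similarity of monozygotic over dizygotic pairs is what carries the genetic signal; if the two rates were equal, the estimate would be zero.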

These data are not yet published (or peer reviewed) but were presented by Dr. Michael Bailey at the recent American Association for the Advancement of Science meeting (Feb 12th 2014) and widely reported on. They confirm and extend findings from multiple previous twin studies across several different countries, which have all found fairly similar results (see here for more details). Overall, the conclusion that sexual orientation is partly heritable was already firmly made. 

The reaction to news of this recent study reveals a deep disquiet with the idea that homosexuality may arise due to genetic differences. First, there are those who scoff at the idea that such a complex behaviour could be determined by what may be only a small number of genetic differences – perhaps only one. As I recently discussed, this view is based on a fundamental misunderstanding of what genetic findings really mean. Finding that a trait (a difference in some system) can be affected by a single genetic difference does not mean a single gene is responsible for crafting the entire system – it simply means that the system does not work normally in the absence of that gene. (Just as a car does not work well without its steering wheel).

Others have expressed a variety of personal and political reactions to these findings, ranging from welcoming further evidence of a biological basis for sexual orientation to worry that it will be used to label homosexuality a genetic disorder and even to enable selective abortion based on genetic prediction. The latter possibility may be made more technically feasible by the other aspect of the recently reported study, which was the claim that they have mapped genetic variants affecting sexual orientation to two specific regions of the genome. (This doesn’t mean they have identified specific genetic variants but may be a step towards doing so).

Let’s explore what the data in this case really show and really mean. A variety of conclusions can be drawn from this and previous studies:

1.     Differences in sexual orientation are partly attributable to genetic differences.
2.     Sexual orientation in males and females is controlled by distinct sets of genes. (Dizygotic twins of opposite sex show no increased similarity in sexual orientation compared to unrelated people – if a female twin is gay, there is no increased likelihood that her twin brother will be too, and vice versa).
3.     Male sexual orientation is rather more strongly heritable than female.
4.     The shared family environment has no effect on male sexual orientation but may have a small effect on female sexual orientation.
5.     There must also be non-genetic factors influencing this trait, as monozygotic twins are still often discordant (more often than concordant, in fact).

The fact that sexual orientation in males and females is influenced by distinct sets of genetic variants is interesting and leads to a fundamental insight: heterosexuality is not a single default state. It emerges from distinct biological processes that actively match the brain circuitry of (i) males or (ii) females to their chromosomal and gonadal sex, so that most individuals who carry a Y chromosome are attracted to females and most people who do not are attracted to males. What is being regulated, biologically, is not sexual orientation (whether you are attracted to people of the same or opposite sex), but sexual preference (whether you are attracted to males or females). Given how complex the processes of sexual differentiation of the brain are (involving the actions of many different genes), it is not surprising that they can sometimes be impaired due to variation in those genes, leading to a failure to match sexual preference to chromosomal sex. Indeed, we know of many specific mutations that can lead to exactly such effects in other mammals – it would be surprising if similar events did not occur in humans.

These studies are consistent with the idea that sexual preference is a biological trait – an innate characteristic of an individual, not strongly affected by experience or family upbringing. Not a choice, in other words. We didn’t need genetics to tell us that – personal experience does just fine for most people. But this kind of evidence becomes important when some places in the world (like Uganda, recently) appeal to science to claim (wrongly) that there is evidence that homosexuality is an active choice and use that claim directly to justify criminalisation of homosexual behaviour.

Importantly, the fact that sexual orientation is only partly heritable does not at all undermine the conclusion that it is a completely biological trait. Just because monozygotic twins are not always concordant for sexual orientation does not mean the trait is not completely innate. Typically, geneticists use the term “non-shared environmental variance” to refer to factors that influence a trait outside of shared genes or shared family environment. The non-shared environment term encompasses those effects that explain why monozygotic twins are actually less than identical for many traits (reflecting additional factors that contribute to variance in the trait across the population generally).

This terminology is rather unfortunate because “environmental” does not have its normal colloquial meaning in this context. It does not necessarily mean that some experience that an individual has influences their phenotype. Firstly, it encompasses measurement error (just the difficulty in accurately measuring the trait, which is particularly important for behavioural traits). Secondly, it includes environmental effects prior to birth (in utero), which may be especially important for brain development. And finally, it also includes chance or noise – in this case, intrinsic developmental variation that can have dramatic effects on the end-state or outcome of brain development. This process is incredibly complex and noisy, in engineering terms, and the outcome is, like baking a cake, never the same twice. By the time they are born (when the buns come out of the oven), the brains of monozygotic twins are already unique.

Genetic differences may thus change the probability of an outcome over many instances, without determining the specific outcome in any individual. 

A useful analogy is to handedness. Handedness is only moderately heritable but is effectively completely innate or intrinsic to the individual. This is true even though the preference for using one hand over the other emerges only over time. The harsh experiences of many in the past who were forced (sometimes with deeply cruel and painful methods) to write with their right hands because left-handedness was seen as aberrant – even sinful – attest to the fact that the innate preference cannot readily be overridden. All the evidence suggests this is also the case for sexual preference.

What about concerns that these findings could be used as justification for labelling homosexuality a disorder? These are probably somewhat justified – no doubt some people will use it like that. And that places a responsibility on geneticists to explain that just because something is caused by genetic variants – i.e., mutations – does not mean it necessarily should be considered a disorder. We don’t consider red hair a disorder, or blue eyes, or pale skin, or – any longer – left-handedness. All of those are caused by mutations.

The word mutation is rather loaded, but in truth we are all mutants. Each of us carries hundreds of thousands of genetic variants, and hundreds of those are rare, serious mutations that affect the function of some protein. Many of those cause some kind of difference to our phenotype (the outward expression of our genotype). But a difference is only considered a disorder if it negatively impacts on someone’s life. And homosexuality is only a disorder if society makes it one.

Sunday, February 23, 2014

Reductionism! Determinism! Straw-man-ism!

“Reductionism!” is a charge often flung at geneticists, from accusers in the popular press and also, not infrequently, from many fellow scientists. What is it that leads so many people to so fundamentally misunderstand what genetics is about?

Whenever someone presents results showing that variation in such and such a trait is partly influenced by genetic variation, or even showing more specifically that mutations in a particular gene can predispose to a particular outcome, someone is sure to shout: “Reductionism! Single genes can’t cause complex traits – it’s patent nonsense to say that they can! Biological organisms are complex systems interacting in complex ways in an ever-changing environment – particular behaviours can’t be simply determined by genes”.

Of course they are right, but they’re also arguing against something no one is claiming. A couple of recent examples illustrate this phenomenon. One is the reaction to coverage of a presentation at the American Association for the Advancement of Science meeting by Dr. Michael Bailey, which described results of a very large twin study, confirming that sexual orientation has a strong genetic component (explaining 30-40% of the variance in this trait).

Nick Cohen, writing in The Observer, quoted geneticist Steve Jones’ reaction to the coverage of this story:

“The idea that they could find a reductionist explanation for a phenomenon as complicated as human sexuality was, well, optimistic. All you could say was genetic inheritance probably influenced it. But then you could say the same about anything.”

Cohen goes on to say:

“Suppose researchers claim to identify gay genes. Their discovery would be pseudo-science. A Gordian knot of environmental, cultural and hormonal influences would be as important in determining sexual preference.”

“To put it another way – if you go along with crude reductionism, you can expect to find yourself at the mercy of crude reductionists.”

In fact, the scientists presenting these findings were quite careful to spell out that their findings do not show that sexual orientation is completely determined by genes in general or any specific genes in particular. They simply show that genetic variation contributes to variation in this trait and that certain locations in the genome may contain some of the genetic variants responsible.

There have been similar reactions to the discussion of the effects of genetic variation on intelligence and educational achievement. Again, the authors of these studies are circumspect about the impact of genetic effects, highlighting the important role of the environment, and emphasising the complexity of the genetic effects.

The main problem, it seems to me, is a fundamental misunderstanding of what genetics as a science studies and how it relates to the function of complex systems. The following statements are not contradictory:

1.     The function of a complex system emerges from the complex and dynamic interactions between all of the components of the system, in a context- and experience-dependent manner.
2.     Variation in single components of the system (or in multiple components) can affect how it functions.

Geneticists investigate the second question. Showing that variation in Gene X affects the behaviour or outcome of a system is not the same as saying that Gene X fully determines that behaviour or fully accounts for the entire system. Gene X is just a piece of DNA sitting in a cell somewhere – it doesn’t do anything by itself. But a difference in Gene X can account for a difference in how the system works.

There’s nothing reductionist about that, except from a methodological point of view, in that scientists often focus on individual components of complex systems, one at a time, in order to get a handle on that complexity and figure out how the whole system works. That has been an extraordinarily successful approach, but it does not mean that scientists employing methodological reductionism also subscribe to philosophical reductionism – the idea that the system really can be explained simply from the properties of its lower-level components – that it is no more than the sum of its parts, or that its function can be said to be caused by any one part.

Consider a car – the function of this wonderful piece of machinery depends on the integrity of all the components and emerges from their interactions. To say that any one component is somehow responsible for the function of the whole system is nonsense. If you just have a steering wheel, you’re not going to get very far. But it’s also true that if you don’t have a steering wheel, you’re going to have difficulties driving anywhere. A change to one component can drastically affect how even the most complex system functions.

Genetics as an experimental approach studies how a system varies or fails, when individual components are disrupted and uses that information to infer which components are involved in which processes. By contrast, biochemistry or systems biology study how the components of a system are put together and how those interactions mediate its function. These are two complementary experimental approaches to understanding complex systems. (For a very funny analogy along these lines, see here).

As well as inducing mutations in model organisms to learn how various biological processes work, geneticists also study how naturally occurring genetic variation in a population affects various traits. Again, showing that differences in genes contribute to differences in traits is not the same as claiming those genes alone produce or determine everything about the system in question.

Part of the confusion may arise from the two different meanings of the word gene – one based on heredity and the other on molecular biology. In the molecular biology sense, a gene codes for a protein, which carries out some function as a component of some biochemical or cellular system. This is a productive definition of the gene in relation to the system. By contrast, a gene in the heredity sense really means a variant or mutation in the molecular-biology-gene – something that alters its function and thus alters the system. This is a disruptive definition of a gene. Such variants can be passed on and contribute to variation in some trait in the population.

The relationship between a gene (as a piece of DNA coding for a protein) and its function in a biochemical system is thus very different from the relationship between a gene (as a unit of heredity – i.e., a genetic variant) and its effects on some phenotype. For one thing, the effects of a genetic variant can be extremely indirect, cascading across levels from the molecular and biochemical to cellular, physiological and sometimes behavioural. Trying to understand how the phenotypes caused by disrupting a gene relate to the molecular function of the normal gene product is the main enterprise of experimental genetics.

Of course, many geneticists contribute to this confusion (often aided and abetted by journalists and headline writers) by using the egregious “gene for” construct. So, we end up talking about “genes for” schizophrenia or “genes for” intelligence or other traits or conditions – it sounds like these phrases are referring to the productive molecular biology sense of a gene, but really they are using the disruptive heredity sense. What those phrases really refer to is mutations that alter how genes work and that thereby contribute to variation in a particular trait or the likelihood of developing some condition, in the context of the incredibly complex biological system that is a human being, which develops over many years in varying environments. All those qualifiers don’t make for great headlines, I’ll admit, but that’s what the shorthand “gene for” construction really means.

So, claims of genetic influence on various traits are really much more modest than many people seem to think. All they say is that the function of the system in question can vary due to variation in one or more of its components. Hardly the grand threat to civilisation which some people perceive.

While I am at it, here are some other common and equally misplaced arguments against genetic findings:

“I don’t want Trait X to be genetic or innate, so I will simply refuse to believe those data and counter by playing my Reductionist! card”. Political agendas don’t trump scientific facts, thankfully, but this is still a very popular move.

“If genes underlying Trait X are discovered, people will misuse this knowledge”. This may well be true, but it does not speak to the underlying facts of the matter of whether the trait is really influenced by said genes. (More on this in a later post).

“The effects of genetic variation in Gene X are only probabilistic – not everyone with that mutation develops Condition X – how can you therefore say it is causal?” This is akin to saying that because not everyone who smokes gets lung cancer, you can’t really say that smoking causes lung cancer. (Which is true, if you want to be pedantic about the word “causes”, but we can certainly say it causes a much higher probability of developing lung cancer. That is a valid, informative and useful statement and we can make analogous statements about the effects of mutations).

A related one: “The effects of variation in Gene X are only apparent at the statistical level”. Well, the effect of the Y chromosome on height is also only apparent at the statistical level (by comparing the average height of men versus women) but that doesn’t mean it’s not real or useful information. It does mean that this kind of information can’t be reliably applied to make predictions about individuals, which is something some geneticists claim they can do, so this criticism hits the mark in those cases.

“Geneticists have yet to find any specific genes affecting Trait X, therefore it is not really genetic”. This one really gets my goat, especially because people are often referring to negative results from genome-wide association studies (GWAS) or the failure of such studies to explain all the heritability of a trait or disorder and claiming that this implies that it is not really heritable after all. As GWAS only look at one kind of genetic variation (common genetic variants) the only thing such negative results imply is that GWAS may be looking in the wrong place and that heritability is likely to also involve many rare genetic variants.  

I am not suggesting that geneticists never overstep the mark and claim more than they should on the basis of specific findings – this field is no more immune from hype than any other – some might argue it is more susceptible, in fact. That is all the more reason to reserve valid arguments against over-extrapolation of genetic results for when they are needed, rather than levelling the blanket, straw-man charge of reductionism! at the field as a whole.

Tuesday, January 7, 2014

On genetic causality: forwards and backwards

Genetics is getting more complicated. Previously clear and strong links between particular mutations and particular diseases are becoming muddied and weaker with increasing knowledge. Such mutations were usually initially identified in families with a heavy burden of illness, where the mutation segregated clearly with illness. But with our increasing ability to sequence large numbers of people, we are now seeing that many such mutations have a much more variable presentation.

Even classically “Mendelian” mutations, such as those causing cystic fibrosis and Huntington’s disease, are subject to modifying effects in the genetic background. The same mutation in one person may not cause the same symptoms or disease progression in another. And for more complex “disorders”, such as autism, epilepsy or schizophrenia, these effects are far more endemic. Even in cases where a primary mutation is identifiable, there may often be additional genetic factors that strongly influence the phenotype (not to mention intrinsic developmental variation, environmental factors and personal experiences, which may all also have a very large influence). Many such mutations can often be found in individuals without any clinical diagnosis. And in many cases, a disease may emerge due to non-additive interactions between multiple mutations, none of which can be said to be primary.

Given these complexities (even for Mendelian disorders), several commentators, including Anne Buchanan and Ken Weiss, here, and Gholson Lyon, here, have recently questioned the validity of the whole idea of making definitive, categorical genetic diagnoses based on single mutations. Both pieces make excellent and valid points.
Buchanan and Weiss have argued, convincingly, that the highly variable effects of many specific mutations make them almost useless for prediction of disease based on genotype. While I agree completely about the inherent complexity of relating single genotypes to phenotypes (as discussed here), I think it is important not to throw the baby out with the bathwater. In particular, a clear distinction should be drawn between explanation and prediction, as the probability relationships are entirely different in these two directions.

This can be illustrated with a couple of examples of specific mutations that increase risk of neurodevelopmental disorders. Most mutations associated with these conditions show “incomplete penetrance” – that simply means that not everyone who carries the mutation develops the disease (or, more accurately, not all carriers are given the diagnosis). For example, about 30% of carriers of a chromosomal deletion at 22q11.2 develop psychosis and would meet criteria for a diagnosis of schizophrenia. This is a hugely increased risk over the baseline population rate of ~1%, but obviously still far from a majority of carriers.

[As an aside, it is important to note that the value determined for the penetrance depends entirely on what phenotype we are assessing. If it is whether the individual has been given a diagnosis of schizophrenia, then it is around 30% for 22q11.2 deletions. But if it includes clinically determined intellectual disability, developmental delay or autism, then the penetrance approaches 100%. Indeed, a recent study found general effects on cognition even in clinically “unaffected” carriers of this and many other recurrent chromosomal aberrations only sometimes associated with frank disease].

What can we say, based on these numbers? For prediction, we are asking, given the presence of mutation X, what is the likelihood of disease Y? The only thing we can currently base that on is the frequency of disease in carriers of a given mutation. To follow the example above, given the presence of a 22q11.2 deletion, the risk of developing schizophrenia is 30%. Other known mutations associated with neurodevelopmental disorders have differing penetrance – for example, only ~6% of carriers of a NRXN1 deletion develop schizophrenia and only a third are clinically affected overall (versus nearly 100% of 22q11 deletion carriers).

Those numbers make predictions of the prognosis of individual mutation-carriers pretty fuzzy. With a disease like schizophrenia, this kind of prediction is clinically important as there may be methods to intervene during pre-morbid or prodromal phases of the illness, prior to the onset of frank psychosis and the full clinical syndrome. But current medical interventions in individuals at high risk of developing psychosis employ the crude hammer of antipsychotic medication, with all the attendant downsides and potentially serious side-effects – not something to be taken lightly or administered without strong justification. 

On the other hand, risks of the magnitude referred to above may well represent actionable information in terms of prenatal screening and reproductive decisions.
Nevertheless, predictions based on genetic information will remain drastically underpowered until we reach a point where the risk associated with an individual’s entire genome-type, and not just with a single mutation, can be assessed. Making predictions is hard, especially about the future (Niels Bohr or Yogi Berra, depending on who you ask).

But what about going in the opposite direction? This is really a very different situation. If we find an individual with disease Y and with mutation X, can we infer that the mutation is the cause of the disease? Here, we start with two givens (two rare events) and want to infer the likely relationship between them (based on their known contingency). So, if we have a patient with schizophrenia and a test shows they carry a 22q11.2 deletion, how strongly can we infer that that deletion is the primary cause of their illness?

I suppose there is a fancier statistical way to do this, but, naïvely, we can say that if that person did not have that mutation, their likelihood of having schizophrenia would only have been ~1% (given no other relevant information). So, I think it is right to say, intuitively, that it is roughly 29-fold more likely that their disease was caused by the 22q11 deletion than by some other, unknown factor. We can put more definite numbers on this as follows:

Likelihood of causality = (P(Disease|Mutation) – P(Disease|No information)) / P(Disease|Mutation)

The P(A|B) notation means the probability of A, given B, which we are going to compare to the prior probability of A, given no knowledge of B. Because we take the presence of the mutation as a given, these calculations should be independent of the frequency of the mutation (I think). For 22q11 deletions, this fraction comes to 29/30, which corresponds to about a 96.7% probability. For NRXN1 deletions, the penetrance is much lower – 6.4% vs 1% baseline – but the inference of causality still comes out to 84.4%. (Another way to word this is: if we take 1000 individuals with NRXN1 deletions, we would expect 64 to have schizophrenia. But 10 of those would be expected anyway, so the excess burden in this group, which we can equate to the likelihood that the NRXN1 mutation caused the disease in any individual case, is 54/64 = 84.4%).
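For concreteness, the naive calculation above can be run in a few lines; the penetrance and baseline figures are the ones quoted in the text:

```python
# Naive "likelihood of causality" from the formula above:
# (P(Disease|Mutation) - P(Disease|No information)) / P(Disease|Mutation)

def likelihood_of_causality(penetrance, baseline):
    """Probability that the mutation was the difference-maker, given that
    a person has both the disease and the mutation (naive calculation)."""
    return (penetrance - baseline) / penetrance

# 22q11.2 deletion: ~30% penetrance for schizophrenia vs ~1% baseline risk
print(round(likelihood_of_causality(0.30, 0.01), 3))   # 0.967
# NRXN1 deletion: ~6.4% penetrance vs ~1% baseline risk
print(round(likelihood_of_causality(0.064, 0.01), 3))  # 0.844
```

These are the same 96.7% and 84.4% figures derived in the text.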

I feel like I may have just committed some egregious statistical sin with the way that last statement is worded, but it’s not that important. Those calculations are very naïve (and not something any clinical geneticist actually carries out), but I think they capture the general intuition – if the known penetrance of a mutation for a particular disease is higher, then the inference of causality is stronger when you find someone with both the disease and the mutation. They also illustrate a surprising result: even in cases where predictive power is quite low (only about 6%), post hoc explanatory power may still be quite high – because now we’re given the presence of disease, an otherwise rare event.

[This is somewhat analogous to interpreting medical tests in a Bayesian framework, by comparing the false positive rate to the underlying prevalence of the condition being screened for (the prior probability) – see here for a great example of this counter-intuitive effect, in the context of autism].
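To make that aside concrete, here is a minimal, purely illustrative sketch of the Bayesian effect: a test with seemingly good error rates still has poor positive predictive value when the condition being screened for is rare. The sensitivity, specificity and prevalence figures below are invented for illustration, not taken from any real screening test:

```python
# Positive predictive value: P(condition | positive test result)
def ppv(sensitivity, specificity, prevalence):
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# A 95%-sensitive, 95%-specific test for a condition with 1% prevalence:
# most positives are false positives, because the prior is so low.
print(round(ppv(0.95, 0.95, 0.01), 3))  # 0.161
```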

Now, when we use a word like “cause” we are wading into some treacherous philosophical waters. When I use it here, I do not mean that the presence of the mutation is a sufficient cause of the illness, nor is it a complete explanation of the person’s phenotype. But calculations of the type shown above give a value to the strength of the inference that a particular mutation was a necessary condition for the emergence of illness in that individual. They allow us to assign a probability to the idea that, of all the factors and events that led to illness in this person, the presence of the mutation was a difference-maker. It was the main culprit, even if there were multiple accomplices.

This is not causality in a reductive sense (where a single cause fully explains the entire phenotype), but in a counterfactual sense (where a single difference explains a difference in the phenotype – in this case, developing disease versus not developing it). It says, if cause X had not been the case, then phenotype Y would not have arisen. For cases like cystic fibrosis and Huntington’s disease, this inference is rock solid – these disorders do not arise without mutations in the CFTR gene or the Htt gene (even if the disease symptoms and progression can be affected by modifying mutations in other genes). For examples like the mutations listed above that lead to common neurodevelopmental disorders, where there are multiple causes across the population, the best we can do is assign a probability of causal involvement for any particular potentially pathogenic mutation discovered, based on rates of illness across many carriers of that mutation, compared to the baseline rate.
At least, that’s usually the best we can do for humans – we can do a lot better in animal models that are amenable to experimental manipulation. When worm or fly or mouse geneticists map and identify a mutation that they think is causing a particular phenotype, they can do two different experiments to test that hypothesis. First, they can introduce the same mutation into a different animal and see if it reproduces the phenotype. And second, they can repair the mutation in the initial line of animals and see if it rescues the phenotype.

Obviously we can’t do those kinds of things in humans, but we can approach those kinds of experimental tests of causality in two ways. First, we can introduce the putatively causal mutation into an animal and see if it recapitulates known aspects of the disease phenotype (in an animal sense). This is very indirect and suffers from many caveats (especially in knowing which phenotypes to look for and in interpreting negative results) but a positive result in some validated assay does give some confidence that the suspect mutation is having an important and relevant effect.

The second approach relies on two fairly new technologies – the first is the development of induced pluripotent stem cells (iPS cells) from human patients. These can be differentiated in a dish into many different cell types and tissues, which can be tested for cellular-level phenotypes relevant to the function of the damaged gene. This system is obviously highly simplified and far from ideal, especially for disorders that manifest at a physiological or even psychological level, but even in those cases, they must arise initially from changes in the way cells function and these may be definable if we can assay the right cell types in the right ways.

Testing causality of a particular mutation for any such phenotype in a patient’s cells can now be achieved using an even newer technology: the CRISPR method of genome editing. This uses an RNA guide molecule to direct an enzyme to cut the DNA in the genome at a specific position (with astonishingly, game-changingly high efficiency). If a non-mutant template is supplied, this break will be repaired in such a way as to change the sequence of DNA in that region, providing the means to revert a mutation to the “wild-type” version. Then one can determine whether it was really that single mutation that led to the cellular phenotype or, alternatively, if it was not involved at all or only one of many factors contributing. (Exciting proof of principle of this approach was recently provided in a mouse model of cataracts and in cultured intestinal stem cells from cystic fibrosis patients).
Now, for most diseases, we don’t currently have good animal models or proxies at the cellular level. But there is an analogous approach to the rescue experiment that can be performed in humans for some conditions – that is to treat with a medication that targets the candidate pathogenic molecular mechanism. If the patient improves, then we can conclude that that mutation was in fact making a major contribution to their illness. This is the “House, M.D.” method of confirming a diagnosis (it’s never lupus).

Of course, for most mutations, no such specifically tailored medication currently exists. But there are a few exceptions for neurodevelopmental disorders. Fragile X syndrome is one – this condition is a common cause of autism, accounting for 2-3% of cases. Research over several decades has established the nature of the molecular defect in Fragile X patients and the cellular consequences in how nerve cell synapses work, and is beginning to elucidate the emergent physiological consequences on neural networks and brain systems. This detailed knowledge has led to the identification of candidate cellular components that can be targeted to restore the balance of the biochemical pathway affected by the Fragile X mutation. This approach shows great promise in animal models of the disorder and is currently in clinical trials.

Tuberous sclerosis is another genetic condition also often associated with symptoms of autism. It is caused by mutations in either one of two other genes, which also encode proteins that function in synapses. However, when these genes are mutated the biochemical defect is the opposite of that when the Fragile X gene is mutated. It turns out that if this pathway is either too active or not active enough, the functions of neural synapses are impaired, especially in how they change in response to activity. Either situation can lead to autism. In mice, crossing Fragile X mutants with tuberous sclerosis mutants actually restores the balance of this pathway and the resultant double mutants are much more normal than either single mutant alone.

So, if a child comes into a clinic with symptoms of autism, it is important to know if they have mutations in Fragile X or the tuberous sclerosis genes because the medication that may prove beneficial for Fragile X patients would be likely to exacerbate symptoms in those with tuberous sclerosis mutations. (And, of course, there are hundreds of other potential causes of autistic symptoms that may also respond differently or not at all).

But even for cases where no targeted medication exists, the identification of a putatively pathogenic mutation can still inform clinical treatment. Once a large enough database is generated, clinicians will be able to ask how patients with different mutations respond to currently available medications. Perhaps schizophrenic people with 22q11.2 deletions respond better to typical antipsychotics than people with NRXN1 deletions. Or maybe some medications should be avoided in the presence of certain mutations – that is the case for mutations in a sodium channel gene, which are associated with Dravet syndrome, a common form of epilepsy. Patients with these mutations should not be treated with traditional anticonvulsants as this is known to worsen their seizures.

To avoid semantic arguments, we should probably just not use the term “genetic diagnosis” and replace it with “genetic information”. I agree completely that a genetic diagnosis will often be too categorical and definitive, conferring a label based on only one component of a person’s genetic make-up, which may in turn be only one factor in their disease. But despite these complexities, the identification of major mutations still provides very useful genetic information that will often be relevant to the patient’s prognosis and treatment.

With thanks to Dan Bradley, John McGrath (@John_J_McGrath), Gholson Lyon (@GholsonLyon), Svetlana Molchanova (@Svetadotfi) and Shane McKee (@shanemuk) for useful and stimulating discussions.

The following articles have some interesting philosophical discussion of causality, especially in relation to genetics:

Mackie, J.L. (1965) Causes and conditions. American Philosophical Quarterly, vol. 2, no. 4.

Meehl, P.E. (1977) Specific Etiology and Other Forms of Strong Influence: Some Quantitative Meanings. The Journal of Medicine and Philosophy, vol. 2, no. 1.

Waters, C.K. (2007) Causes that make a difference. Journal of Philosophy, 104 (11): 551-579.

Kendler, K.S. (2012) The dappled nature of causes of psychiatric illness: replacing the organic–functional/hardware–software dichotomy with empirically based pluralism. Molecular Psychiatry, 17(4): 377-88.

Thursday, November 7, 2013

The dark arts of statistical genomics

“Whereof one cannot speak, thereof one must be silent” – Wittgenstein

That’s a maxim to live by, or certainly to blog by, but I am about to break it. Most of the time I try to write about things I feel I have some understanding of (rightly or wrongly) or at least an informed opinion on. But I am writing this post from a position of ignorance and confusion.

I want to discuss a fairly esoteric and technical statistical method recently applied in human genetics, which has become quite influential. The results from recent studies using this approach have a direct bearing on an important question – the genetic architecture of complex diseases, such as schizophrenia and autism. And that, in turn, dramatically affects how we conceptualise these disorders. But this discussion will also touch on a much wider social issue in science, which is how highly specialised statistical claims are accepted (or not) by biologists or clinicians, the vast majority of whom are unable to evaluate the methodology.

Speak for yourself, you say! Well, that is exactly what I am doing.
The technique in question is known as Genome-wide Complex Trait Analysis (or GCTA). It is based on methods developed in animal breeding, which are designed to measure the “breeding quality” of an animal using genetic markers, without necessarily knowing which markers are really linked to the trait(s) in question. The method simply uses molecular markers across the genome to determine how closely an animal is related to some other animals with desirable traits. Its application has led to huge improvements in the speed and efficacy of selection for a wide range of traits, such as milk yield in dairy cows.   

GCTA has recently been applied in human genetics in an innovative way to explore the genetic architecture of various traits or common diseases. The term genetic architecture refers to the type and pattern of genetic variation that affects a trait or a disease across a population. For example, some diseases are caused by mutations in a single gene, like cystic fibrosis. Others are caused by mutations in any of a large number of different genes, like congenital deafness, intellectual disability, retinitis pigmentosa and many others. In these cases, each such mutation is typically very rare – the prevalence of the disease depends on how many genes can be mutated to cause it.

For common disorders, like heart disease, diabetes, autism and schizophrenia, this model of causality by rare, single mutations has been questioned, mainly because such mutations have been hard to find. An alternative model is that those disorders arise due to the inheritance of many risk variants that are actually common in the population, with the idea that it takes a large number of them to push an individual over a threshold of burden into a disease state. Under this model, we would all carry many such risk variants, but people with disease would carry more of them.
That idea can be tested in genome-wide association studies (GWAS). These use molecular methods to look at many, many sites in the genome where the DNA code is variable (it might be an “A” 30% of the time and a “T” 70% of the time). The vast majority of such sites (known as single-nucleotide polymorphisms or SNPs) are not expected to be involved in risk for the disease, but, if one of the two possible variants at a given position is associated with increased risk, then you would expect to see an increased frequency of that variant (say the “A” version) in a cohort of people affected by the disease (cases) versus the frequency in the general population (controls). So, if you look across the whole genome for sites where such frequencies differ between cases and controls, you can pick out risk variants (in the example above, you might see that the “A” version is seen in 33% of cases versus 30% of controls). Since the effect of any one risk variant is very small by itself, you need very large samples to detect statistically significant signals of a real (but small) difference in frequency between cases and controls, amidst all the noise.
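To illustrate that last point about sample size, here is a rough sketch using a simple two-proportion z-test (a stand-in for the association tests GWAS actually use): a 33% vs 30% frequency difference is only nominally detectable with 1,000 cases and 1,000 controls, but becomes obvious at GWAS-scale samples. The numbers are invented for illustration:

```python
import math

# Two-proportion z-test for an allele-frequency difference between cases
# and controls (each person contributes two alleles, so N people = 2N alleles).
def two_prop_z(p1, p2, n1, n2):
    p_pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# 1,000 cases vs 1,000 controls (2,000 alleles each):
# z ~ 2 is nominally significant but far from genome-wide thresholds (z ~ 5.45)
print(round(two_prop_z(0.33, 0.30, 2000, 2000), 2))    # 2.04
# 10,000 cases vs 10,000 controls: comfortably past the genome-wide threshold
print(round(two_prop_z(0.33, 0.30, 20000, 20000), 2))  # 6.46
```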

GWAS have been quite successful in identifying many variants showing a statistical association with various diseases. Typically, each one has a tiny statistical effect on risk by itself, but the idea is that collectively they increase risk a lot. But how much is a lot? That is a key question in the field right now. Perhaps the aggregate effects of common risk variants explain all or the majority of variance in the population in who develops the disease. If that is the case then we should invest more efforts into finding more of them and figuring out the mechanisms underlying their effects.

Alternatively, maybe they play only a minor role in susceptibility to such conditions. For example, the genetic background of such variants might modify the risk of disease but only in persons who inherit a rare, and seriously deleterious mutation. This modifying mechanism might explain some of the variance in the population in who does and does not develop that disease, but it would suggest we should focus more attention on finding those rare mutations than on the modifying genetic background. 

For most disorders studied so far by GWAS, the amount of variance collectively explained by the currently identified common risk variants is quite small, typically on the order of a few percent of the total variance.
But that doesn’t really put a limit on how much of an effect all the putative risk variants could have, because we don’t know how many there are. If there is a huge number of sites where one of the versions increases risk very, very slightly (infinitesimally), then it would require really vast samples to find them all. Is it worth the effort and the expense to try and do that? Or should we be happy with the low-hanging fruit and invest more in finding rare mutations? 

This is where GCTA analyses come in. The idea here is to estimate the total contribution of common risk variants in the population to determining who develops a disease, without necessarily having to identify them all individually first. The basic premise of GCTA analyses is to not worry about picking up the signatures of individual SNPs, but instead to use all the SNPs analysed to simply measure relatedness among people in your study population. Then you can compare that index of (distant) relatedness to an index of phenotypic similarity. For a trait like height, that will be a correlation between two continuous measures. For diseases, however, the phenotypic measure is categorical – you either have been diagnosed with it or you haven’t.

So, for diseases, what you do is take a large cohort of affected cases and a large cohort of unaffected controls and analyse the degree of (distant) genetic relatedness among and between each set. What you are looking for is a signal of greater relatedness among cases than between cases and controls – this is an indication that liability to the disease is: (i) genetic, and (ii) affected by variants that are shared across (very) distant relatives.
The logic here is an inversion of the normal process for estimating heritability, where you take people with a certain degree of genetic relatedness (say monozygotic or dizygotic twins, siblings, parents, etc.) and analyse how phenotypically similar they are (what proportion of them have the disease, given a certain degree of relatedness to someone with the disease). For common disorders like autism and schizophrenia, the proportion of monozygotic twins who have the disease if their co-twin does is much higher than for dizygotic twins. The difference between these rates can be used to estimate how much genetic differences contribute to the disease (the heritability).
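For a continuous trait, the classic version of this twin calculation is Falconer's formula, h² ≈ 2(r_MZ − r_DZ). A sketch is below; note that disease concordance rates like those for schizophrenia first need a liability-threshold transformation before this formula applies (omitted here), and the twin correlations used are invented for illustration:

```python
# Falconer's estimate of heritability from twin correlations: MZ twins share
# essentially all segregating variation, DZ twins about half of it, so the
# gap between the two correlations reflects genetic influence.
def falconer_h2(r_mz, r_dz):
    return 2 * (r_mz - r_dz)

# Illustrative correlations only (not the schizophrenia concordances above):
print(round(falconer_h2(0.80, 0.45), 2))  # 0.7
```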

With GCTA, you do the opposite – you take people with a certain degree of phenotypic similarity (they either are or are not diagnosed with a disease) and then analyse how genetically similar they are.

If a disorder were completely caused by rare, recent mutations, which would be highly unlikely to be shared between distant relatives, then cases with the disease should not be any more closely related to each other than controls are. The most dramatic examples of that would be cases where the disease is caused by de novo mutations, which are not even shared with close relatives (as in Down syndrome). If, on the other hand, the disease is caused by the effects of many common, ancient variants that float through the population, then enrichment for such variants should be heritable, possibly even across distant degrees of relatedness. In that situation, cases will have a more similar SNP profile than controls do, on average.
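The raw material for such an analysis can be sketched as follows (a simplified illustration, not the actual GCTA implementation): build a genetic relatedness matrix from standardized SNP genotypes, then compare mean relatedness among cases with that between cases and controls. The data here are simulated with no structure at all, so both means should hover near zero; in a real cohort with a polygenic signal, the case-case mean would come out very slightly higher:

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_snps = 200, 5000

# Simulate unrelated genotypes: 0/1/2 copies of one allele at each SNP
freqs = rng.uniform(0.05, 0.95, n_snps)
G = rng.binomial(2, freqs, size=(n_people, n_snps)).astype(float)

# Standardize each SNP by its expected mean and variance; the relatedness
# matrix is then the average cross-product over SNPs
Z = (G - 2 * freqs) / np.sqrt(2 * freqs * (1 - freqs))
A = Z @ Z.T / n_snps

cases, controls = np.arange(100), np.arange(100, 200)  # arbitrary labels
off_diag = ~np.eye(100, dtype=bool)
print("mean relatedness among cases:    ", A[np.ix_(cases, cases)][off_diag].mean())
print("mean relatedness cases-controls: ", A[np.ix_(cases, controls)].mean())
```

In the published analyses, this kind of matrix (not just the means) is then fed into a linear mixed model to estimate the SNP-based heritability.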

Now, say you do see some such signal of increased average genetic relatedness among cases. What can you do with that finding? This is where the tricky mathematics comes in and where the method becomes opaque to me. The idea is that the precise quantitative value of the increase in average relatedness among cases compared to that among controls can be extrapolated to tell you how much of the heritability of the disorder is attributable to common variants. How this is achieved with such specificity eludes me.

Let’s consider how this has been done for schizophrenia. A 2012 study by Lee and colleagues analysed multiple cohorts of cases with schizophrenia and controls, from various countries. These had all been genotyped for over 900,000 SNPs in a previous GWAS, which hadn’t been able to identify many individually associated SNPs.

Each person’s SNP profile was compared to each other person’s profile (within and between cohorts), generating a huge matrix. The mean genetic similarity was then computed among all pairs of cases and among all pairs of controls. Though these are the actual main results – the raw findings – of the paper, they are remarkably not presented in the paper. Instead, the results section reads, rather curtly:

Using a linear mixed model (see Online Methods), we estimated the proportion of variance in liability to schizophrenia explained by SNPs (h2) in each of these three independent data subsets. … The individual estimates of h2 for the ISC and MGS subsets and for other samples from the PGC-SCZ were each greater than the estimate from the total combined PGC-SCZ sample of h2 = 23% (s.e. = 1%)

So, some data we are not shown (the crucial data) are fed into a model and out pops a number and a strongly worded conclusion: 23% of the variance in the trait is tagged by common SNPs, mostly functionally attributable to common variants*. *[See important clarification in the comments below - it is really the entire genetic matrix that is fed into the models, not just the mean relatedness as I suggested here. Conceptually, the effect is still driven by the degree of increased genetic similarity amongst cases, however]. This number has already become widely cited in the field and used as justification for continued investment in GWAS to find more and more of these supposed common variants of ever-decreasing effect.

Now I’m not saying that that number is not accurate but I think we are right to ask whether it should simply be taken as an established fact. This is especially so given the history of how similar claims have been uncritically accepted in this field. 

In the early 1990s, a couple of papers came out that supposedly proved, or at least were read as proving, that schizophrenia could not be caused by single mutations. Everyone knew it was obviously not always caused by mutations in one specific gene, in the way that cystic fibrosis is. But these papers went further and rejected the model of genetic heterogeneity that is characteristic of things like inherited deafness and retinitis pigmentosa. This was based on a combination of arguments and statistical modelling.

The arguments were that if schizophrenia were caused by single mutations, they should have been found by the extensive linkage analyses that had already been carried out in the field. If there were a handful of such genes, then this criticism would have been valid, but if that number were very large then one would not expect consistent linkage patterns across different families. Indeed, the way these studies were carried out – by combining multiple families – would virtually ensure you would not find anything. The idea that the disease could be caused by mutations in any one of a very large number (perhaps hundreds) of different genes was, however, rejected out of hand as inherently implausible. [See here for a discussion of why a phenotype like that characterising schizophrenia might actually be a common outcome].
The statistical modelling was based on a set of numbers – the relative risk of disease to various family members of people with schizophrenia. Classic studies found that monozygotic twins of schizophrenia cases had a 48% chance (frequency) of having that diagnosis themselves. For dizygotic twins, the frequency was 17%. Siblings came in at about 10%, half-sibs at about 6%, first cousins at about 2%. These figures compare with the population frequency of ~1%.

The statistical modelling inferred that this pattern of risk, which decreases at a faster than linear pace with respect to the degree of genetic relatedness, was inconsistent with the condition arising due to single mutations. By contrast, these data were shown to be consistent with an oligogenic or polygenic architecture in affected individuals.

There was, however, a crucial (and rather weird) assumption – that singly causal mutations would all have a dominant mode of inheritance. Under that model, risk would decrease linearly with distance of relatedness, as it would depend on just one copy of the mutation being inherited. This contrasts with recessive modes requiring inheritance of two copies of the mutation, where risk to distant relatives drops dramatically. There was also an important assumption of a negligible contribution from de novo mutations. As it happens, it is trivial to come up with some division of cases into dominant, recessive and de novo modes of inheritance that collectively generates a pattern of relative risks similar to that observed. (Examples of all such modes of inheritance have now been identified.) Indeed, there is an infinite number of ways to set the (many) relevant parameters in order to generate the observed distribution of relative risks. It is impossible to infer backwards what the actual parameters are. Not merely difficult or tricky or complex – impossible.
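A toy calculation makes the "faster than linear" point concrete. Under a purely dominant single-mutation model (ignoring, for simplicity, the ~1% population baseline), a relative's risk is the MZ-twin risk scaled by the probability of sharing the mutation, which halves with each degree of relatedness. The observed figures quoted above fall consistently below that line:

```python
# Dominant single-mutation model: risk ~ MZ-twin risk x P(sharing the mutation)
mz_risk = 0.48  # MZ twin concordance quoted in the text
sharing = {"DZ twin": 0.5, "half-sib": 0.25, "first cousin": 0.125}
observed = {"DZ twin": 0.17, "half-sib": 0.06, "first cousin": 0.02}

for relative, p_share in sharing.items():
    predicted = mz_risk * p_share
    print(f"{relative:12s} predicted {predicted:.3f}  observed {observed[relative]:.2f}")
```

The modelling papers took this mismatch as evidence against single-mutation causation; as noted above, mixing dominant, recessive and de novo cases can reproduce the observed decline just as well.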

Despite these limitations, these papers became hugely influential. The conclusion – that schizophrenia could not be caused by mutations in (many different) single genes – became taken as a proven fact in the field. The corollary – that it must be caused instead by combinations of common variants – was similarly embraced as having been conclusively demonstrated.

This highlights an interesting but also troubling cultural aspect of science – that some claims are based on methodology that many of the people in the field cannot evaluate. This is especially true for highly mathematical methods, which most biologists and psychiatrists are ill equipped to judge. If the authors of such claims are generally respected then many people will be happy to take them at their word. In this case, these papers were highly cited, spreading the message beyond those who actually read the papers in any detail.

In retrospect, these conclusions are fatally undermined not by the mathematics of the models themselves but by the simplistic assumptions on which they are based. With that precedent in mind, let’s return to the GCTA analyses and the strong claims derived from them.

Before considering how the statistical modelling works (I don’t know) and the assumptions underlying it (we’ll discuss these), it’s worth asking what the raw findings actually look like.

While the numbers are not provided in this paper (not even in the extensive supplemental information), we can look at similar data from a study by the same authors, using cohorts for several other diseases (Crohn’s disease, bipolar disorder and type 1 diabetes).

Those numbers are a measure of mean genetic similarity (i) among cases, (ii) among controls and (iii) between cases and controls. The important finding is that the mean similarity among cases or among controls is (very, very slightly) greater than between cases and controls. All the conclusions rest on this primary finding. Because the sample sizes are fairly large and especially because all pairwise comparisons are used to derive these figures, this result is highly statistically significant. But what does it mean?
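To see what that primary quantity actually is, here is a minimal simulated sketch of the standard construction: build a genetic relationship matrix (GRM) from standardized genotypes and compare mean pairwise similarity within cases, within controls, and between the groups. All data here are random, so the three means should all hover near zero – the papers' claim is that in real cohorts the within-group means come out very slightly higher:

```python
import numpy as np

# Simulated toy data -- sizes and frequencies are illustrative only.
rng = np.random.default_rng(0)
n, m = 200, 5000                            # individuals, SNPs
p = rng.uniform(0.05, 0.5, size=m)          # allele frequencies
G = rng.binomial(2, p, size=(n, m)).astype(float)

Z = (G - 2 * p) / np.sqrt(2 * p * (1 - p))  # standardize each SNP
A = Z @ Z.T / m                             # genetic relationship matrix

status = np.zeros(n, dtype=bool)
status[:100] = True                         # first 100 labelled "cases"

iu = np.triu_indices(n, k=1)                # off-diagonal pairs only
pair_case = status[iu[0]] & status[iu[1]]
pair_ctrl = ~status[iu[0]] & ~status[iu[1]]
pair_between = ~(pair_case | pair_ctrl)

print("mean similarity, case-case:   ", A[iu][pair_case].mean())
print("mean similarity, ctrl-ctrl:   ", A[iu][pair_ctrl].mean())
print("mean similarity, case-control:", A[iu][pair_between].mean())
```

Note how much information the full matrix contains beyond these three means – the distribution across pairs, possible clustering – which is exactly the information the text argues goes unused.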

The authors remove any persons who are third cousins or closer, so we are dealing with very distant degrees of genetic relatedness in our matrix. One problem with looking just at the mean level of similarity between all pairs is that it tells us nothing about the pattern of relatedness in that sample.

Is the small increase in mean relatedness driven by an increase in relatedness of just some of the pairs (equivalent to an excess of fourth or fifth cousins) or is it spread across all of them? Is there any evidence of clustering of multiple individuals into subpopulations or clans? Does the similarity represent “identity by descent” or “identity by state”? The former derives from real genealogical relatedness while the latter could signal genetic similarity due to chance inheritance of a similar profile of common variants – presumably enriched in cases by those variants causing disease. (That is of course what GWAS look for).  

If the genetic similarity represents real, but distant relatedness, then how is this genetic similarity distributed across the genome, between any two pairs? The expectation is that it would be present mainly in just one or two genomic segments that happen to have been passed down to both people from their distant common ancestor. However, that is likely to track a slight increase in identity by state as well, due to subtle population/deep pedigree structure. Graham Coop put it this way in an email to me: “Pairs of individuals with subtly higher IBS genome-wide are slightly more related to each other, and so slightly more likely to share long blocks of IBD.”

If we are really dealing with members of a huge extended pedigree (with many sub-pedigrees within it) – which is essentially what the human population is – then increased phenotypic similarity could in theory be due to either common or rare variants shared between distant relatives. (They would be different rare variants in different pairs). 

So, overall, it’s very unclear (to me at least) what is driving this tiny increase in mean genetic similarity among cases. It certainly seems like there is a lot more information in those matrices of relatedness (or in the data used to generate them) than is actually used – information that may be very relevant to interpreting what this effect means.

Nevertheless, this figure of slightly increased mean genetic similarity can be fed into models to extrapolate the heritability explained – i.e., how much of the genetic effects on predisposition to this disease can be tracked by that distant relatedness. I don’t know how this model works, mathematically speaking. But there are a number of assumptions that go into it that are interesting to consider.

First, the most obvious explanation for an increased mean genetic similarity among cases is that they are drawn from a slightly different sub-population than controls. This kind of cryptic population stratification is impossible to exclude through ascertainment and must instead be mathematically “corrected for”. So, we can ask, is this correction being applied appropriately? Maybe, maybe not – there certainly is not universal agreement among the Illuminati on how this kind of correction should be implemented or how successfully it can account for cryptic stratification.

The usual approach is to apply principal components analysis to look for global trends that differentiate the genetic profiles of cases and controls and to exclude those effects from the models interpreting real heritability effects. Lee and colleagues go to great lengths to assure us that these effects have been controlled for properly, excluding up to 20 components. Not everyone agrees that these approaches are sufficient, however.
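The standard approach described above can be sketched in a few lines. This is a generic illustration with simulated data, not the authors' pipeline: take the top principal components of the standardized genotype matrix as ancestry axes and residualize the phenotype on them before any heritability modelling:

```python
import numpy as np

# Generic sketch of PC-based stratification correction (simulated data;
# sizes and the choice of 20 components mirror common practice).
rng = np.random.default_rng(1)
n, m, k = 300, 2000, 20
Z = rng.standard_normal((n, m))    # stand-in for standardized genotypes
y = rng.standard_normal(n)         # stand-in phenotype

# Top-k principal components of individuals via SVD of the genotype matrix.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
pcs = U[:, :k] * s[:k]             # n x k ancestry components

# Regress the phenotype on the PCs (plus intercept); keep the residuals
# as the "stratification-corrected" phenotype.
X = np.column_stack([np.ones(n), pcs])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_adj = y - X @ beta
```

The catch the text raises is implicit here: the correction can only remove structure that shows up in the top few global axes; finer-grained or more local substructure can survive it.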
Another major complication is that the relative numbers of cases and controls analysed do not reflect the prevalence of the disease in the population. In these studies there were, in fact, about equal numbers of each, versus a roughly 1:100 ratio of cases to controls in the general population for disorders like schizophrenia or autism. Does this skewed sampling affect the results? One can certainly see how it might. If you are looking to measure an effect whereby, say, the fifth cousin of someone with schizophrenia is very, very slightly more likely to have schizophrenia than an unrelated person, then ideally you should sample all the people in the population who are fifth cousins and see how many of them have schizophrenia. (This effect is expected to be almost negligible, in fact. We already know that even first cousins have only a modestly increased risk of 2%, from a population baseline of 1%. So, going to fifth cousins, the expected effect size would likely only be around 1.0-something, if it exists at all).
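That "1.0-something" figure can be checked with back-of-envelope arithmetic. The assumption here (mine, for illustration) is that the excess risk scales with the coefficient of relationship, as it would under a simple additive model; the 1% and 2% figures are from the text:

```python
# Back-of-envelope: expected fifth-cousin risk if excess risk scales
# with the coefficient of relationship (an illustrative assumption).
baseline = 0.01            # population risk, from the text
first_cousin_risk = 0.02   # first-cousin risk, from the text
r_first = 1 / 8            # coefficient of relationship, first cousins
r_fifth = 1 / 2048         # coefficient of relationship, fifth cousins

excess_first = first_cousin_risk - baseline        # 1% excess risk
excess_fifth = excess_first * (r_fifth / r_first)  # scale by relatedness
risk_fifth = baseline + excess_fifth

print(f"expected fifth-cousin risk ~ {100 * risk_fifth:.4f}%")
# i.e. roughly 1.004% against a 1% baseline -- a tiny effect to detect
```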

You’d need to sample an awful lot of people at that degree of relatedness to detect such an effect, if indeed it exists at all. GCTA analyses work in the opposite direction, but are still trying to detect that tiny effect. But if you start with a huge excess of people with schizophrenia in your sample, then you may be missing all the people with similar degrees of relatedness who did not develop the disease. This could certainly bias your impression of the effect of genetic relatedness across this distance.

Lee and colleagues raise this issue and spend a good deal of time developing new methods to statistically take it into account and correct for it. Again, I cannot evaluate whether their methods really accomplish that goal. Generally speaking, if you have to go to great lengths to develop a novel statistical correction for some inherent bias in your data, then some reservations seem warranted.

So, it seems quite possible, in the first instance, that the signal detected in these analyses is an artefact of cryptic population substructure or ascertainment. But even if we take it as real, it is far from straightforward to divine what it means.

The model used to extrapolate heritability explained has a number of other assumptions. The first is that all genetic effects are additive in nature. [See here for arguments why that is unlikely to reflect biological reality]. Second, it assumes that the relationship between genetic relatedness and phenotypic similarity is linear and can be extrapolated across the entire range of relatedness. After all, what is supposedly being measured is the tiny effect at extremely low genetic relatedness – can this really be extrapolated to effects at close relatedness? We’ve already seen that this relationship is not linear as you go from twins to siblings to first cousins – those were the data used to argue for a polygenic architecture in the first place.

This brings us to the final assumption implicit in the mathematical modelling – that the observed highly discontinuous distribution of risk of schizophrenia actually reflects a quantitative trait that is continuously (and normally) distributed across the whole population. A little sleight of hand can convert this continuous distribution of “liability” into a discontinuous distribution of cases and controls, by invoking a threshold, above which disease arises. While genetic effects are modelled as exclusively linear on the liability scale, the supposed threshold actually represents a sudden explosion of epistasis. With 1,000 risk variants you’re okay, but with, say, 1,010 or 1,020 you develop disease. That’s non-linearity for free and I’m not buying it.

I also don’t buy an even more fundamental assumption – that the diagnostic category we call “schizophrenia” is a unitary condition that defines a singular and valid biological phenotype with a common etiology. Of course we know it isn’t – it is a diagnosis of exclusion. It simply groups patients together based on a similar profile of superficial symptoms, but does not actually imply they all suffer from the same condition. It is a place-holder, a catch-all category of convenience until more information lets us segregate patients by causes. So, the very definition of cases as a singular phenotypic category is highly questionable.

Okay, that felt good.

But still, having gotten those concerns off my chest, I am not saying that the conclusions drawn from the GCTA analyses of disorders like schizophrenia and autism are not valid. As I’ve said repeatedly here, I am not qualified to evaluate the statistical methodology. I do question the assumptions that go into them, but perhaps all those reservations can be addressed. More broadly, I question the easy acceptance in the field of these results as facts, as opposed to the provisional outcome of arcane statistical exercises, the validity of which remains to be established. 

“Facts are stubborn things, but statistics are pliable.” – Mark Twain