Monday, January 4, 2016

Sex on the brain – a tale of two studies

The issue of whether there are biological differences between male and female brains is a fraught one and an area where political positions or prior expectations seem to have a strong influence on the interpretation of scientific data. These trends are illustrated by two papers published in the last couple of years, which, despite fairly comparable findings, were interpreted in almost diametrically opposite ways.

Both studies found strong group differences between male and female brains, one in volume of brain areas, the other in structural connectivity. But the authors of one study went on to (over)interpret these group differences as the basis for sex differences in cognition, while the other downplayed them entirely and instead emphasised the inherent variability within genders to conclude that there was no such thing as a “male brain” or a “female brain”.  Both received extensive coverage in the media, fuelled by the associated press releases, resulting in headlines making hilariously contradictory claims, even in the same newspaper! 

The 2013 study was described with these headlines:

Brain Connectivity Study Reveals Striking Differences Between Men and Women

The 2015 study with these:

The brains of men and women aren’t really that different, study finds

Men are from Mars, women are from Venus? New brain study says not (The Guardian again!)

Let’s look at the more recent one first, to see what the data actually show and how they were analysed and interpreted. Daphna Joel and colleagues analysed MRI scans of 169 females and 112 males, and segmented them into 116 regions using a standard brain atlas. By analysing how much warping was required to map each brain onto a reference template, it was possible to compare the relative grey matter volume of all these regions across the two sexes. From this group comparison the 10 regions showing the largest sex differences were chosen for subsequent analyses.

So far, so good: the primary finding is that there are statistically significant group differences between males and females in grey matter volume across many brain regions. That’s nothing new – a recent meta-analysis of 167 studies confirms consistent group-level sex differences in many brain areas.

The authors went on, however, to ask what could have been a more interesting question: across those 10 regions, how “male” or “female” were the structures of individual brains? This is where the subjectivity comes in – there are many ways to analyse these data and the authors chose arguably the most simplistic and extreme one, which enabled them to draw the conclusion that male and female brains are not categorically different. 

They report that: “35% percent of brains showed substantial variability, and only 6% of brains were internally consistent”. Importantly they chose to classify only those subjects showing extreme male or female values for all 10 regions as “internally consistent”. A quick look at panel E of the figure below shows that while such brains may indeed be rare, most of the female brains showed a mostly female pattern (lots of pink) while most of the male brains showed a mostly male pattern (lots of blue – don’t blame me, I didn’t pick the colours!). 

There is, in fact, nothing at all surprising in their finding of substantial variability within individuals. To explain why, consider the distributions for height for males and females.
These distributions are very wide and mostly overlapping but there is a strong and consistent group difference in the mean – the distribution for males is shifted to the right. For any individual, however, knowing their sex gives almost no predictive power for how tall they are. What the group difference does suggest is the following: if I know how tall a particular woman is, I can say that if she had been a man (but was genetically otherwise identical) she would probably have been a little taller than that. She may happen to fall at the low or high end of the overall spectrum for other reasons, but that prediction remains the same. The existence of the group difference does not suggest that all males should be at the extreme “male” end of the height distribution or they’re not really very manly at all. That would be true if all other things were equal, but they’re not equal, and those other variations, which have nothing to do with sex, have a much bigger effect on final height than the sex effect does.

Now, consider what will happen if we have ten different variables, each showing that same sort of wide distribution with an even smaller group sex effect. If the volumes of different brain regions vary independently within individuals (taking overall brain volume out of the equation - as shown here, for example), then we should expect some of these values to fall more towards the male end and others more towards the female end in any individual simply due to that underlying variation, which has nothing to do with sex. It would be extremely unlikely to end up at the extreme end for all ten regions, by chance, and such individuals should thus be extremely rare, as observed.
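To put rough numbers on that argument, here is a minimal simulation sketch. The effect size, the cutoff for what counts as the "male end", and the assumption that regions vary independently are all illustrative choices of mine, not values or methods taken from the paper.

```python
import random
from statistics import NormalDist

N_REGIONS = 10      # number of regions, as in the study
EFFECT_SIZE = 0.5   # ASSUMED male-female shift, in standard deviations
CUTOFF = 0.25       # ASSUMED boundary: values above it count as "male-end"

def prob_male_end(effect_size=EFFECT_SIZE, cutoff=CUTOFF):
    """Chance that one region of a male brain falls in the male-end zone."""
    return 1.0 - NormalDist(mu=effect_size, sigma=1.0).cdf(cutoff)

def simulate_consistent_fraction(n_brains=20000, seed=1):
    """Fraction of simulated male brains with ALL regions in the male-end zone.

    Each region is an independent draw from a normal distribution whose mean
    is shifted by EFFECT_SIZE relative to the female mean of zero.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_brains):
        if all(rng.gauss(EFFECT_SIZE, 1.0) > CUTOFF for _ in range(N_REGIONS)):
            hits += 1
    return hits / n_brains

p_single = prob_male_end()
p_all = p_single ** N_REGIONS          # independence lets us multiply
simulated = simulate_consistent_fraction()

print(f"one region at the male end: {p_single:.2f}")
print(f"all {N_REGIONS} regions at the male end (analytic): {p_all:.4f}")
print(f"all {N_REGIONS} regions at the male end (simulated): {simulated:.4f}")
```

Under these assumptions each region has about a 60% chance of landing at the male end, yet well under 1% of simulated male brains are "consistent" across all ten, even though every single region carries a real group difference.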

So, the fact that each individual shows this kind of pattern does not mean that each of us has a “mosaic brain” that is partly male and partly female, as claimed by the authors. It is simply exactly what is expected given that sex is only one of the factors affecting the size of each of these regions. We can’t know for each individual what the size of each region would have been if their sex were different (which is really what we’d like to know) – we can only deduce from the group average effects that there would likely have been some effect.

The headlines suggesting that male and female brains are not that different are thus not well supported by these findings at all. The group differences are clear and highly significant. And even if very few of the males or females are at the extreme end of the distribution for all ten of these regions, the overall pattern suggests that you could build a very good classifier from the volumes of these ten regions taken together, which would be quite successful at predicting whether a given brain scan came from a male or a female. Indeed, this would have been a far more objective test of whether MRI volumetric differences between male and female brains are categorical or dimensional.
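As a back-of-envelope check on the classifier point, the sketch below computes the accuracy you would expect in an idealised case: independent, normally distributed regional volumes with equal variance in both sexes, each differing by the same effect size. Those conditions and the effect sizes are assumptions for illustration only; real brain regions are correlated, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def expected_accuracy(effect_size, n_features):
    """Accuracy of the best linear classifier for two equally common groups.

    ASSUMES independent, unit-variance normal features that each differ by
    `effect_size` standard deviations between the groups. The combined
    separation then grows with the square root of the number of features.
    """
    combined_separation = effect_size * sqrt(n_features)
    return NormalDist().cdf(combined_separation / 2.0)

for d in (0.3, 0.5, 0.8):
    one = expected_accuracy(d, 1)
    ten = expected_accuracy(d, 10)
    print(f"effect size {d}: 1 region -> {one:.3f}, 10 regions -> {ten:.3f}")
```

With an assumed effect size of 0.5, a single region is barely better than a coin toss (about 60%), while ten such regions together would classify nearly 80% of brains correctly in this idealised setting.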

Given this, it is interesting to ask why the authors chose to analyse and present their data in the way they did. This is what they say in the introduction to the paper:

"Documented sex/gender* differences in the brain are often taken as support of a sexually dimorphic view of human brains (female brain vs. male brain), and consequently, of a sexually dimorphic view of human behavior, cognition, personality, attitudes, and other gender characteristics (3). Joel (4, 5) has recently argued that the existence of sex/gender differences in the brain is not sufficient to conclude that human brains belong to two distinct categories. Rather, such a distinction requires the fulfillment of two conditions: one, the form of the elements that show sex/gender differences should be dimorphic, that is, with little overlap between the forms of the elements in males and females. Two, there should be a high degree of internal consistency in the form of the different elements of a single brain (e.g., all elements have the male form)."

It seems pretty clear from that that the authors set out to show that male and female brains are not that different, or at least not dimorphic. In particular, they take aim at a paper by Madura Ingalhalikar and colleagues (their reference 3, above), which is the second paper I wish to discuss. These authors found group differences comparable to those of Joel et al (using a different measure of brain structure), yet reached almost opposite conclusions.

They used diffusion tensor imaging to define the structural connectivity networks across the brains of 949 youths (428 males and 521 females). They then analysed these networks using a variety of statistical measures of regional and global connectivity and compared these between males and females. They found that females had greater connectivity between hemispheres than males, on average, while males had greater connectivity within each hemisphere. Males also showed greater local connectivity and concomitantly increased modularity in the network (again, on average).

(In this figure from the paper, the top panel shows connections that are stronger in males, the bottom those that are stronger in females; blue are intrahemispheric, orange are interhemispheric).

Once again, so far, so good – the results look significant and interesting. (It would have been nice to see the analyses done with a discovery and replication sample, instead of one big group but at least it is a large sample). Where these authors got onto shakier ground was in extrapolating their findings as explanations for a variety of group differences in cognition between men and women. The participants in the structural connectivity analysis were part of a larger sample for which cognitive data had already been obtained, showing sex differences in a variety of domains. Such differences have been widely documented and range from quite small to fairly large (see here for a meta-analysis). 

However, the idea that the structural connectivity network differences observed are the cause of such cognitive differences is entirely speculative. I have nothing against speculation, per se, and the discussion section of a paper is a perfect place to explore the possible implications of one’s results. Where this got a bit out of hand was in the associated press release and the consequent media coverage. This is from the press release itself: 

"“These maps show us a stark difference--and complementarity--in the architecture of the human brain that helps provide a potential neural basis as to why men excel at certain tasks, and women at others,” said Verma. [Ragini Verma, senior author]

For instance, on average, men are more likely better at learning and performing a single task at hand, like cycling or navigating directions, whereas women have superior memory and social cognition skills, making them more equipped for multitasking and creating solutions that work for a group. They have a mentalistic approach, so to speak. "

Those kinds of assertive generalisations, and especially the idea that the connectivity findings provide a neural basis for them, are not at all supported by the data and rightly provoked howls of protest from the scientific community. This included commentary by Joel and colleagues, to which Ingalhalikar and colleagues responded. The unfortunate outcome was that the authors’ over-extrapolation ended up undermining trust in their primary findings, which actually look quite solid in themselves.

To my mind, both these studies over-reached in the interpretation of their results, ironically drawing opposite conclusions from what are broadly comparable primary findings. More generally, it also seems that a little more humility is in order in drawing sweeping conclusions from these kinds of studies, given the crudeness of group-wise volumetric and tractography analyses and the very low resolution of MRI scans. Even if such scans showed no consistent group differences between male and female brains, this would not imply that male and female brains are not different. It would only imply such differences could not be detected by MRI. We know there are many differences in the numbers of neurons in small brain regions or numbers of connections between regions in male and female brains that are invisible to MRI, not to mention sex differences in densities of synaptic spines or other subcellular parameters that have also been demonstrated (as in this recent example). 

A final note: why should we care? Why should we investigate sex differences in the brain? And if we find them, what are their implications for public policies? Many people are rightly concerned that demonstrations of biological differences in brain structure between males and females will be used to reinforce the idea of systematic differences in cognitive abilities and justify sexism. Of course, even if such differences were large and consistent across individuals, it would not imply one version is better than the other. But more importantly, the distributions for cognitive domains are so overlapping and the sex effects typically so small that inferring anything about the cognitive profiles of individuals on the basis of these group differences is, simply put, a very bad bet. Sex differences for interests are a little bit bigger, but still by no means categorical and there is likely a strong cultural reinforcement of gender norms in this area.

There are, however, other areas where there are more robust sex differences. The most obvious but also the most commonly over-looked of these is sexual preference – something in the brains of males makes the vast majority of them sexually attracted to females, and vice versa. This is by far the strongest genetic effect on behaviour that we know of in humans (mediated by the SRY gene on the Y chromosome). It would therefore be interesting to find out how that preference is wired into the brain, as an exemplar for how genes can influence innate behaviour. Sex differences in physical aggression are also large and another important topic to understand (as are differences in idiotic behaviour as measured by the Darwin awards!).

Finally, though, a main reason we should care is due to the large sex differences in prevalence of psychiatric conditions, which range from autism, ADHD and Tourette syndrome (much more common in males), to schizophrenia and dyslexia (more common in males), to depression (more common in females) and eating disorders (much more common in females). There is strong and consistent evidence, for example, that females are somewhat protected against the effects of mutations that typically cause autism in males. Females may carry such mutations with relatively little clinical effect; conversely, females who do have autistic symptoms tend to have larger or more severe mutations than affected males (suggesting that it takes a more drastic insult at the genetic level to push a female brain into a clinically autistic state). Understanding how sex influences vulnerability to these conditions is thus a hugely important question.

Too important to let politics, bias or spin affect our interpretation of scientific findings. 

Thursday, December 3, 2015

On literature pollution and cottage-industry science

A few days ago there was a minor Twitterstorm over a particular paper that claimed to have found an imaging biomarker that was predictive of some aspect of outcome in adults with autism. The details actually don’t matter that much and I don’t intend to pick on that study in particular, or even link to it, as it’s no worse than many that get published. What it prompted, though, was more interesting – a debate on research practices in the field of cognitive neuroscience and neuroimaging, particularly relating to the size of studies required to address some research questions and the scale of research operation they might entail.

What kicked off the debate was a question of how likely the result they found was to be “real”; i.e., to represent a robust finding that would replicate across future studies and generalise to other samples of autistic patients. I made a fairly uncompromising prediction that it would not replicate, which was based on the fact that the finding derived from: a small sample (n=31, in this case, but split into two), an exploratory study (i.e., not aimed at or constrained by any specific hypothesis, so that group differences in pretty much any imaging parameter would do) and lack of a replication sample (to test directly, with exactly the same methodology, whether the findings from the study were robust, prior to bothering anyone else with them).

The reason for my cynicism is twofold: first, the study was statistically under-powered, and such studies are theoretically more likely to generate false positives. Second, and more damningly, there have been literally hundreds of similar studies published using neuroimaging measures to try and identify signatures that would distinguish between groups of people or predict the outcome of illness. For psychiatric conditions like autism or schizophrenia I don’t know of any such “findings” that have held up. We still have no diagnostic or prognostic imaging markers, or any other biomarkers for that matter, that have either yielded robust insights into underlying pathogenic mechanisms or been applicable in the clinic.

There is thus strong empirical evidence that the small sample, exploratory, no replication design is a sure-fire way of generating findings that are, essentially, noise.
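The theoretical side of that argument can be sketched with the standard positive predictive value calculation: of all the "significant" results a field produces, what fraction reflect real effects? The prior probability and power values below are assumptions picked for illustration, not estimates for any particular literature.

```python
def positive_predictive_value(power, prior, alpha=0.05):
    """Fraction of significant findings that are true positives.

    power: chance of detecting an effect that really exists
    prior: ASSUMED fraction of tested hypotheses that are actually true
    alpha: significance threshold (false positive rate per test)
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)

ASSUMED_PRIOR = 0.10  # illustrative: 1 in 10 tested hypotheses is real

for power in (0.8, 0.5, 0.2):
    ppv = positive_predictive_value(power, ASSUMED_PRIOR)
    print(f"power {power:.1f}: {ppv:.0%} of significant findings are real")
```

With these assumed inputs, dropping power from 0.8 to 0.2 takes the share of real findings from about two thirds to less than one third, and that is before any multiple testing or publication bias is taken into account.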

This is by no means a problem only for neuroimaging studies; the field of psychology is grappling with similar problems and many key findings in cell biology have similarly failed to replicate. We have seen it before in genetics, too, during the “candidate gene era”, when individual research groups could carry out a small-scale study testing single-nucleotide polymorphisms in a particular gene for association with a particular trait or disorder. The problem was the samples were typically small and under-powered, the researchers often tested multiple SNPs, haplotypes or genotypes but rarely corrected for such multiple tests, and they usually did not include a replication sample. What resulted was an entire body of literature hopelessly polluted by false positives.

This problem was likely heavily compounded by publication bias, with negative findings far less likely to be published. There is evidence that that problem exists in the neuroimaging literature too, especially for exploratory studies. If you are simply looking for some group difference in any of hundreds or thousands of possible imaging parameters, then finding one may be a (misplaced) cause for celebration, but not finding one is hardly worthy of writing up.
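A quick sketch shows how fast the chance of a spurious "finding" grows with the number of parameters examined. It assumes the tests are independent and that no real differences exist at all, which is a simplification (imaging parameters are usually correlated).

```python
def prob_any_false_positive(n_tests, alpha=0.05):
    """Chance of at least one significant result among n_tests null tests.

    ASSUMES the tests are independent and uncorrected for multiple testing.
    """
    return 1.0 - (1.0 - alpha) ** n_tests

for n_tests in (1, 10, 100, 1000):
    chance = prob_any_false_positive(n_tests)
    print(f"{n_tests:>5} parameters tested: {chance:.3f}")
```

Under those assumptions, ten uncorrected tests already give roughly a 40% chance of at least one false positive, and with a hundred it is a near certainty.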

In genetics, the problems with the candidate gene approach were finally realised and fully grappled with. The solution was to perform unbiased tests for SNP associations across the whole genome (GWAS), to correct rigorously for the multiple tests involved, and to always include a separate replication sample prior to publication. Of course, to enable all that required something else: the formation of enormous consortia to generate the sample sizes required to achieve the necessary statistical power (given how many tests were being performed and the small effect sizes expected).

This brings me back to the reaction on Twitter to the criticism of this particular paper. A number of people suggested that if neuroimaging studies were expected to have larger samples and to also include replication samples, then only very large labs would be able to afford to carry them out. What would the small labs do? How would they keep their graduate students busy and train them?

I have to say I have absolutely no sympathy for that argument at all, especially when it comes to allocating funding. We don’t have a right to be funded just so we can be busy. If a particular experiment requires a certain sample size to detect an effect size in the expected and reasonable range, then it should not be carried out without such a sample. And if it is an exploratory study, then it should have a replication sample built in from the start – it should not be left to the field to determine whether the finding is real or not.

You might say, and indeed some people did say, that even if you can’t achieve those goals, because the lab is too small or does not have enough funding, at least doing it on a small scale is better than nothing.

Well, it’s not. It’s worse than nothing.

Such studies just pollute the literature with false positives – obscuring any real signal amongst a mass of surrounding flotsam that future researchers will have to wade through. Sure, they keep people busy, they allow graduate students to be trained (badly), and they generate papers, which often get cited (compounding the pollution). But they are not part of “normal science” – they do not contribute incrementally and cumulatively to a body of knowledge.

We are no further in understanding the neural basis of a condition like autism than we were before the hundreds of small-sample/exploratory-design studies published on the topic. They have not combined to give us any new insights, they don’t build on each other, they don’t constrain each other or allow subsequent research to ask deeper questions. They just sit there as “findings”, but not as facts.

Lest I be accused of being too preachy, I should confess to some of these practices myself. Several years ago, while candidate gene studies were still the norm, we published a paper that included a positive association of semaphorin genes with schizophrenia (prompted by relevant phenotypes in mutant mice). It seems quite likely now that that association was a false positive, as a signal from the gene in question has not emerged in larger genome-wide association studies.

And the one neuroimaging study I have done so far, on synaesthesia, certainly suffered from a small sample size (at the time it was considered decent), and no replication sample. In our defense, our study was itself designed as a replication of previous findings, combining functional and structural neuroimaging. While our structural findings did mirror those previously reported (in general direction and spatial distribution of effects, though not precise regions), our functional results were quite incongruent with previous findings. As we did not have a replication sample built into our own design, I can’t be particularly confident that our findings will generalise – perhaps they were a chance finding in a fairly small sample. (Indeed, the imaging findings in synaesthesia have been generally quite inconsistent and it is difficult to know which findings constitute real results that future research studies could be built on).

If I were designing these kinds of studies now I would use a very different design, with much larger samples and in-built replication (and pre-registration). If that means they are more expensive, so be it. If it means my group can’t do them alone, well that’s just going to be the way it is. No one should fund me, or any lab, to do under-powered studies.

For the neuroimaging field generally that may well mean embracing the idea of larger consortia and adopting common scanning formats that enable combining subjects across centres, or at least subsequent meta-analyses. And it will mean that smaller labs may have to give up on the idea of making a living from studies attempting to find differences between groups of people without enough subjects. You’ll find things – they just won’t be real.

Sunday, November 22, 2015

What do GWAS signals mean?

Genome-wide association studies (GWAS) have been highly successful at linking genetic variation in hundreds of genes to an ever-growing number of traits or diseases. The fact that the genes implicated fit with the known biology for many of these traits or disorders strongly suggests (effectively proves, really) that the findings from GWAS are “real” – they reflect some real biological involvement of those genes in those diseases. (For example, GWAS have implicated skeletal genes in height, immune genes in immune disorders, and neurodevelopmental genes in schizophrenia).

But figuring out the nature of that involvement and the underlying biological mechanisms is much more challenging. In particular, it is not at all straightforward to understand how statistical measures derived at the level of populations relate to effects in individuals. Here, I explore some of the diverse mechanisms in individuals that may underlie GWAS signals.

GWAS take an epidemiological approach to identify genetic variants associated with risk of disease in exactly the same way epidemiologists identify environmental factors associated with risk – they look for factors that are more frequent in cases with a disease than in unaffected controls. For example, smoking is more common in people with lung cancer than in people without lung cancer (even though only a minority of people who smoke get lung cancer). From this we can deduce that smoking may be a risk-modifying factor for lung cancer, and we can measure the strength of that effect. Of course, observational epidemiology cannot prove causation – but it can provide important clues as to the risk architecture of a disease.

For GWAS, the factors in question are not environmental – they are the differences in our DNA that exist at millions of positions across the genome. These “single-nucleotide polymorphisms”, or SNPs, are positions in the genome where the DNA sequence varies between people – sometimes it might be an “A”, sometimes it might be a “T” (or a “G” or a “C”). Of course, any position in the genome can be mutated and likely is mutated in someone on the planet, but such mutations are typically extremely rare. SNPs are different – they are positions where two different versions are both relatively frequent in the population; these versions are thus often referred to as common variants.

GWAS are premised on the simple idea that if any of those common variants at any of those millions of SNPs across the genome is associated with an increased risk of disease, then that variant should be more frequent in cases than in controls. So, if we find variants that are more common in cases than in controls, we can infer that these variants may be causally related to an increased risk of disease.

What that doesn’t tell us is how. How does having one variant over another at that particular site cause an increased risk of that particular disease? I don’t just mean by what biological mechanism; I mean how does risk calculated at the population level relate to effects in individuals?

Statistically, we get two measures out of GWAS for any SNP that is associated. One is the p-value, which is a measure of how unlikely it would be to see a frequency difference of the magnitude we observe, just by chance. You might, for example, find that the “A” version at one SNP is at 25% frequency in controls but 28% frequency in cases. That’s not a big difference, so you’d need a very big sample to make sure it wasn’t noise, which is precisely why GWAS now use sample sizes of tens or even hundreds of thousands of people.
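To see why sample size matters so much, here is a minimal sketch of a simple allelic test (a two-proportion z-test) applied to the 25% versus 28% example. The sample sizes, the equal numbers of cases and controls, and the function name are my assumptions for illustration; real GWAS use more elaborate models.

```python
from math import erfc, sqrt

def allele_frequency_p_value(freq_cases, freq_controls,
                             n_case_alleles, n_control_alleles):
    """Two-sided p-value for a difference in allele frequency.

    Uses a pooled two-proportion z-test with a normal approximation.
    """
    total = n_case_alleles + n_control_alleles
    pooled = (freq_cases * n_case_alleles
              + freq_controls * n_control_alleles) / total
    std_error = sqrt(pooled * (1.0 - pooled)
                     * (1.0 / n_case_alleles + 1.0 / n_control_alleles))
    z = (freq_cases - freq_controls) / std_error
    # erfc keeps precision in the far tail, where 1 - cdf would round to zero
    return erfc(abs(z) / sqrt(2.0))

for people_per_group in (1_000, 5_000, 20_000):
    alleles = 2 * people_per_group  # each person carries two copies
    p = allele_frequency_p_value(0.28, 0.25, alleles, alleles)
    print(f"{people_per_group:>6} cases and controls: p = {p:.2e}")
```

The same three-point frequency difference goes from unremarkable with a thousand people per group to overwhelming with twenty thousand.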

GWAS also apply very rigorous thresholds for statistical significance, in order to correct for the fact that they are testing so many different SNPs. (This follows the logic that, while it is quite unlikely that you will win the lottery yourself, if enough tickets are sold, it won’t be surprising if the lottery is won by somebody). These methods have greatly advanced the trustworthiness of results from the field, far beyond those reported in the benighted “candidate gene era”. But the p-value doesn’t tell us anything about how big of an effect there is – how much of an effect on risk does the difference in frequency between cases and controls reflect?
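The lottery logic can be made concrete with a simple Bonferroni-style correction. The figure of one million independent tests is an assumed round number for illustration.

```python
N_TESTS = 1_000_000  # ASSUMED number of effectively independent SNP tests
ALPHA = 0.05         # conventional single-test significance level

def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the overall false positive rate at alpha."""
    return alpha / n_tests

def expected_false_positives(threshold, n_tests):
    """Expected number of null SNPs passing the threshold purely by chance."""
    return threshold * n_tests

corrected = bonferroni_threshold(ALPHA, N_TESTS)
print(f"corrected per-SNP threshold: {corrected:.0e}")
print(f"chance hits at p < {ALPHA}: "
      f"{expected_false_positives(ALPHA, N_TESTS):,.0f}")
print(f"chance hits at the corrected threshold: "
      f"{expected_false_positives(corrected, N_TESTS):.2f}")
```

With a million tests, an uncorrected threshold of 0.05 would let through tens of thousands of chance hits, while the corrected threshold works out to 5 in 100 million.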

That number is summarised by the other measure we get for each associated SNP, which is the odds ratio. This reflects the size of the difference in frequency of that variant between cases and controls. It is calculated very simply: say your SNP comes in two versions, or “alleles”: “A” and “G”. We want to convert the difference in absolute frequencies in cases versus controls (say 28% vs 25%, or 62% vs 60%, or whatever it is) into a number that tells us how many times more common one version is in cases versus controls. (The reason is that that number is more easily related to the increased risk associated with having that version).

Here’s an example: If we take 28% and 25% as frequencies of the “A” allele at a certain SNP in cases and controls, respectively, then if you were to select an “A” allele at random from the sample, the odds of it coming from a case versus a control is 0.28/0.25 (=1.12). The odds of the alternative “G” allele occurring in a case versus a control is correspondingly lower: 0.72/0.75 (=0.96). The odds ratio is then 1.12/0.96 = 1.167.  Assuming that the cases and controls are representative of the general population, we can infer that individuals with an “A” allele are 1.167 times more likely to be a case, compared to those with the “G” allele, which is the number we’re after. (Note that this approximation of odds ratio to relative risk only holds when the disease is rare).
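The arithmetic above can be written as a short function. This is just a sketch of the calculation as described in the text; the function name is my own.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds ratio for an allele from its frequency in cases and controls.

    Follows the steps in the text: the case-versus-control ratio for the
    allele of interest, divided by the same ratio for the alternative allele.
    Algebraically this equals the usual form
    (p_case / (1 - p_case)) / (p_control / (1 - p_control)).
    """
    ratio_risk_allele = freq_cases / freq_controls
    ratio_other_allele = (1.0 - freq_cases) / (1.0 - freq_controls)
    return ratio_risk_allele / ratio_other_allele

print(f"28% vs 25%: OR = {allelic_odds_ratio(0.28, 0.25):.3f}")
print(f"62% vs 60%: OR = {allelic_odds_ratio(0.62, 0.60):.3f}")
```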

If you do the same calculations for 62% vs 60% it works out to 1.09. These odds ratios are on the order of the typical values obtained from GWAS. For comparison, the odds ratio for smoking and lung cancer is far larger, in the region of 20 to 30. It is calculated in the same way, e.g., from data like these from a study in Spain in the 1980s (where smoking was apparently astronomically common!): this study found that 98.8% of lung cancer patients were smokers, while “only” 80.3% of controls were smokers. Doing the same calculations as above on these percentages gives an OR of about 20, which is in line with many other studies.

Thus, for either genetic or environmental factors, the odds ratio gives an average increased risk of disease. But, biologically, what is actually going on in each individual that collectively gives that signal?

The most straightforward interpretation is that an odds ratio of, say, 1.2 at the population level reflects exactly the same thing at the individual level – each individual who inherits that SNP variant is at 1.2 times greater risk of developing the disease than they would have been otherwise. This is the additive model whereby each SNP acts independently of all other factors – it doesn’t matter what other genetic variants a person has, or indeed what environmental factors they may be exposed to – the added effect on risk of this SNP is the same in all carriers.

That is, I think, a pretty common interpretation of what the odds ratio means in individuals, but it is certainly not the only scenario that could produce that result at the population level. In the diagram below, I illustrate several different scenarios that could all yield the same odds ratio across the population.

The additive scenario is illustrated in A. Every person who inherits the risk allele has a slightly increased risk of disease (small red arrows). [This applies whether the SNP that is genotyped in the GWAS has a functional effect itself or tags another common SNP that is the one doing the damage].

It might seem like the odds ratio can be interpreted directly as a multiplier of the baseline risk across the population, i.e., the prevalence of the disease in question. So, if the baseline rate is say 1%, then people with the “A” allele in our example above would have a risk of 1.167%, all other things being equal. The problem with that interpretation is that all other things are not equal.

For example, a condition like autism affects about 1% of the population. This does not mean, however, that everyone in the population had a 1% risk of being born autistic, and that the ones who actually are autistic were just unlucky (statistically speaking, not judgmentally). That 1% is actually made up of people who were at very high risk of being autistic – we know this because people with the same genotype as those with autism (i.e., their monozygotic twins) have a rate of autism of over 80%. What this implies is that the vast majority of the population were at effectively no risk (not at 1% risk).

This suggests that the effects of any SNP are also likely to be highly unequally distributed across the population*, depending on the genetic background, as illustrated in Scenario B. In some people, the risk variant increases risk a little bit (small red arrows), while in others it increases it a lot (bigger red arrows). In others it may have no effect (flat blue line), while in yet others it may actually decrease risk (green downward arrow).

That last situation may seem far-fetched but is actually well described; for example, two mutations that each independently cause epilepsy may paradoxically cancel each other out if they occur together. Similarly, mutations in the fragile X gene, Fmr1, or in the tuberous sclerosis gene, Tsc2, can each cause autism in humans and various neurological and behavioural symptoms when mutated in mice. However, combining them both in mice leads to a rescue of the symptoms caused by either one alone (because they counteract each other at the biochemical level).

These kinds of “epistatic” (non-additive) interactions are generally very common and can be seen for all kinds of complex traits. In terms of how they would contribute to a GWAS signal, a slight preponderance of increased risk when you average those effects across the population would generate a small odds ratio greater than 1. Based on the odds ratio alone, there is no way to distinguish scenarios A and B.
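As a rough numerical sketch of that last point, the snippet below builds one population resembling scenario A and one resembling scenario B and checks that both give the same odds ratio. The subgroup weights, risks and effect sizes are invented purely for illustration. The 1% baseline and the odds ratio of 1.167 are the figures implied by the example above, and the calculation assumes that carrying the allele is independent of genetic background.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def risk_from_odds(o):
    """Convert odds back into a probability."""
    return o / (1.0 + o)


def marginal_odds_ratio(groups):
    """Population-level odds ratio for carriers versus non-carriers.

    groups: list of (weight, risk_in_non_carriers, risk_in_carriers),
    one entry per genetic background; weights sum to 1.
    """
    p_non = sum(w * r0 for w, r0, _ in groups)
    p_car = sum(w * r1 for w, _, r1 in groups)
    return odds(p_car) / odds(p_non)


BASELINE = 0.01    # assumed risk in non-carriers
TARGET_OR = 1.167  # odds ratio implied by the worked example

# Scenario A: every carrier gets the same small increase in risk.
carrier_risk = risk_from_odds(TARGET_OR * odds(BASELINE))
scenario_a = [(1.0, BASELINE, carrier_risk)]

# Scenario B (hypothetical numbers): most people are at no risk and the
# allele does nothing in them; in the rest its effect is small, large,
# or protective. Third field = relative size and sign of the effect.
backgrounds = [
    (0.95, 0.0, 0.0),   # no risk, no effect
    (0.03, 0.1, 1.0),   # small increase
    (0.01, 0.4, 5.0),   # large increase
    (0.01, 0.3, -2.0),  # decrease
]
# Scale the effects so their population average equals the shift in A.
mean_shift = carrier_risk - BASELINE
scale = mean_shift / sum(w * e for w, _, e in backgrounds)
scenario_b = [(w, r0, r0 + scale * e) for w, r0, e in backgrounds]

or_a = marginal_odds_ratio(scenario_a)
or_b = marginal_odds_ratio(scenario_b)
print(f"Scenario A odds ratio: {or_a:.3f}")
print(f"Scenario B odds ratio: {or_b:.3f}")
```

The two printed values agree, even though in the second population the allele raises risk in some people, lowers it in others and does nothing in most.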

Note that this kind of effect holds for all epidemiological data – the effect sizes obtained are always averages across the population, which may hide substantial variability in effect size across individuals. For example, a high-fat diet may be a much stronger risk factor for cardiovascular disease in some people than in others, depending on their genetic vulnerability.

It is interesting to note that if those kinds of diverse epistatic interactions occur for each SNP, then their aggregate effects will likely always look additive, as these pairwise and higher-order interactions will average out both among and across individuals. That doesn’t mean they could not in principle be decomposed to reveal such effects, as can be done using various genetic techniques in model organisms. So, just because SNP effects seem to combine additively does not rule out multiple epistatic interactions at the biological level.

Scenario C is a special case of epistatic interaction. In this case, the common risk variant has no effect on biological risk at all in most carriers (flat blue lines). However, if it occurs in people with a rare mutation in some specific gene (big purple arrow), which by itself predisposes to the disease with incomplete penetrance (where not everyone with the mutation necessarily develops the disease), then it can have a modifying effect, strongly increasing the likelihood of actual expression of the disease symptoms.

Again, this kind of scenario is well documented and is particularly well illustrated by Hirschsprung disease. This disorder, which affects innervation of the gut, can be caused by mutations in any one of about 18 known genes, one of which encodes the Ret tyrosine kinase. However, mutations in this gene are not completely penetrant – some people who carry one do not develop the disease or have only a mild form. Recent studies have found that simultaneously carrying a common variant in the same gene increases the likelihood that carriers of the rare mutation will show severe disease. The common variant thus modifies the risk of disease substantially, but only in carriers of a rare mutation. (In this case it is in the same gene, but that doesn’t have to be so.)

The last scenario, D, is quite different. Here, the common variant is not doing anything itself. It’s not even linked to another common variant that is doing something. Instead, it is linked to a rare mutation that causes disease with much higher penetrance. Or, to put it better, the rare mutation is linked to it. Any new mutation must arise on a background of some set of common SNPs (a “haplotype”), with which it will tend to be subsequently co-inherited. If a rare mutation that increases risk of disease rises to an appreciable frequency then it will necessarily increase the frequency of the SNPs in that haplotype in people with the disease, giving rise to what has been called a “synthetic association”.

Any one mutation might be too rare to cause such an effect (especially if it is likely to be selected against precisely because it causes disease), but if you have multiple rare mutations at a given locus, and if they happen to occur by chance more on one haplotype than another, then you could get an aggregate effect that could give a tiny difference in frequency of the sort detected by GWAS.
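To make the arithmetic of a synthetic association concrete, here is a minimal sketch using a deliberately simplified haploid model, in which each haplotype is treated as one "person". All of the frequencies, the penetrance and the baseline risk are hypothetical values chosen for illustration, and the function name is my own. The common allele has no effect in this model; any signal at it comes only from rare mutations sitting disproportionately on its haplotype.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def tag_snp_odds_ratio(snp_freq, mut_freq, frac_on_snp, penetrance, base_risk):
    """Odds ratio observed at a common allele that does nothing itself.

    snp_freq:    frequency of the common allele
    mut_freq:    combined frequency of the rare mutations at the locus
    frac_on_snp: fraction of those mutations lying on haplotypes that
                 carry the common allele
    penetrance:  disease risk with a rare mutation
    base_risk:   disease risk without one
    """
    mut_on_snp = mut_freq * frac_on_snp
    mut_off_snp = mut_freq * (1.0 - frac_on_snp)
    risk_with_allele = (
        mut_on_snp * penetrance + (snp_freq - mut_on_snp) * base_risk
    ) / snp_freq
    risk_without_allele = (
        mut_off_snp * penetrance
        + (1.0 - snp_freq - mut_off_snp) * base_risk
    ) / (1.0 - snp_freq)
    return odds(risk_with_allele) / odds(risk_without_allele)


SNP_FREQ = 0.30     # hypothetical common-allele frequency
MUT_FREQ = 0.002    # hypothetical combined rare-mutation frequency
PENETRANCE = 0.30   # hypothetical risk in mutation carriers
BASE_RISK = 0.01    # hypothetical risk in everyone else

# Mutations over-represented on the common allele's haplotype (70% vs 30%).
or_skewed = tag_snp_odds_ratio(SNP_FREQ, MUT_FREQ, 0.70, PENETRANCE, BASE_RISK)
# Mutations spread in proportion to haplotype frequency: no signal expected.
or_even = tag_snp_odds_ratio(SNP_FREQ, MUT_FREQ, SNP_FREQ, PENETRANCE, BASE_RISK)
# Odds ratio of the rare mutations themselves.
or_mutation = odds(PENETRANCE) / odds(BASE_RISK)

print(f"Common allele, skewed mutations: {or_skewed:.2f}")
print(f"Common allele, evenly spread:    {or_even:.2f}")
print(f"Rare mutations themselves:       {or_mutation:.1f}")
```

With these made-up numbers the common allele shows an odds ratio of about 1.1, while the rare mutations responsible have an odds ratio above 40. When the mutations are spread evenly across haplotypes, the signal at the common allele disappears.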

There are now many documented examples where GWAS signals are explained by synthetic associations with rare mutations in the sample, which have much larger odds ratios (e.g., 1, 2, 3, 4). On the other hand, there are also cases where no such rare mutations have been found (e.g., 5, 6), suggesting that such a mechanism is by no means universal. It is difficult indeed to know how prevalent that situation will turn out to be, though large-scale whole-genome sequencing studies currently underway should help address this question. (See here for theoretical discussions: 7, 8, 9, 10).

Both scenarios C and D are congruent with the repeated finding that many of the genes implicated by GWAS (with small effect sizes) are known to sometimes carry rare mutations linked to a high risk of the same disease. That would fit with a mechanism whereby common variants at a given locus increase the penetrance of rare mutations in the same gene, but have little effect otherwise (scenario C). Or it would fit with GWAS signals actually arising from synthetic association with high-penetrance rare mutations in the population (where the common variant tags these haplotypes but has no effect itself whatsoever; scenario D).

Teasing these various scenarios apart is a challenge, especially as, for any given disease, different scenarios may pertain for different SNPs. One method has been to try to find a functional effect of a common SNP at the molecular level. For example, SNPs may affect the expression of a gene, altering binding of regulatory proteins to the parts of DNA that specify how much of the protein to make, in which cells and under which conditions. Multiple such examples have been documented (sometimes with surprising results, as when the gene thus affected is actually quite distant from the SNP itself).

However, finding some effect of a common SNP on expression of a gene at a molecular level does not explain how it affects disease risk. Any of scenarios A, B or C could still pertain, and even scenario D is not ruled out by such findings. Indeed, it is not even clear what kind of molecular-level effect we should expect to explain a tiny odds ratio. Should we expect a small effect at the molecular level, or a big effect at the molecular level that translates to a small effect at the organismal level? Or a big effect at the organismal level, but only in combination with other genetic or environmental insults?

That leaves something of a Catch-22 situation for researchers looking for functional effects of SNPs at the biological level – too small an effect and it will never be detected in messy biological experiments; too big and it will have a rather glaring discrepancy with the epidemiological odds ratio. In the end, it may prove impossible to definitively investigate such small individual epidemiological effects at the biological level, whether from genetic or environmental factors.

This doesn’t mean individual GWAS signals are not useful, of course – they certainly point to loci of interest for further study and have successfully implicated previously unknown biochemical pathways in various diseases (e.g., autophagy in Crohn’s disease). It does mean, however, that the interpretation of individual SNP associations may remain a bit vague.

On the other hand, while the biological effect of any single SNP in isolation may be small, their aggregate effect should be large, at least if the model of disease being caused by a polygenic load of such common risk alleles is correct. Indeed, even if the burden of common alleles is not by itself sufficient to cause disease (e.g., in a scenario where they act collectively as a polygenic modifier of rare mutations, which I consider the most likely scenario), they may still have biological effects in aggregate on relevant traits.

There is now an ever-growing number of studies taking that approach, correlating polygenic scores of risk for various diseases (based on aggregate SNP burden) with a range of biological phenotypes. Whether this approach will really help reveal underlying pathogenic mechanisms remains to be seen. More on that in a later post.

With thanks to John McGrath for helpful comments and edits.

*The usual way around this is to model the effects of a SNP on the liability scale, rather than the observed scale of risk. This is based on the idea that underlying the observed discontinuous distribution of a disease is a normally distributed burden of liability, which effectively remains latent until some threshold of burden is passed, in which case disease results. As a mathematical model to describe risk across the population this works reasonably well, given a host of assumptions. It is a mistake, however, in my mind, to think that the model reflects pathogenic mechanisms in individuals.
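For readers who want to see the liability-threshold model in numbers, here is a minimal sketch using Python's standard library. It assumes a standard normal liability distribution, a prevalence of 1%, and a purely hypothetical shift of 0.05 standard deviations in the mean liability of carriers; it also ignores any change in the variance of liability among carriers. It illustrates the population-level model only, not a mechanism in individuals.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def liability_threshold(prevalence):
    """Liability value above which disease results, for a given prevalence."""
    return STANDARD_NORMAL.inv_cdf(1.0 - prevalence)


def risk_given_shift(prevalence, shift_sd):
    """Risk in a group whose mean liability is shifted by shift_sd (in SD units)."""
    threshold = liability_threshold(prevalence)
    return 1.0 - STANDARD_NORMAL.cdf(threshold - shift_sd)


PREVALENCE = 0.01  # assumed population prevalence
SHIFT = 0.05       # hypothetical effect of one risk allele on liability

threshold = liability_threshold(PREVALENCE)
risk_unshifted = risk_given_shift(PREVALENCE, 0.0)
risk_shifted = risk_given_shift(PREVALENCE, SHIFT)

print(f"Threshold on the liability scale: {threshold:.2f} SD")
print(f"Risk with no shift:    {risk_unshifted:.4f}")
print(f"Risk with a 0.05 SD shift: {risk_shifted:.4f}")
```

A small shift in mean liability moves only a small extra fraction of the group past the threshold, which is how the model turns a tiny effect on the liability scale into a small odds ratio on the observed scale.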

Tuesday, July 28, 2015

The Genetics of Neurodevelopmental Disorders

The Genetics of Neurodevelopmental Disorders is a new book that will be published by Wiley in 2015. It is due out in August (in Europe) and September (in the USA), and is available on Amazon here

I had the pleasure of editing the book, which comprises 14 chapters from world-leading scientists and clinicians. Our aim is to provide a timely synthesis of this fast-moving field where so much exciting progress has been made in recent years. Below I have reproduced the Foreword from the book, which outlines the rationale for writing it and the conceptual principles on which it is based, as well as a summary of the topics covered (giving an overview of the state of the field in the process). There are also links to two chapters that are freely available. On behalf of all the authors, I hope the book will prove useful.

The term “neurodevelopmental disorders” is clinically defined in psychiatry as “a group of conditions with onset in the developmental period… characterized by developmental deficits that produce impairments of personal, social, academic, or occupational functioning” [DSM-5]. This term encompasses the clinical categories of intellectual disability (ID), developmental delay (DD), autism spectrum disorders (ASD), attention-deficit hyperactivity disorder (ADHD), speech and language disorders, specific learning disorders, tic disorders and others.

However, the term can be defined differently, not based on age of onset or clinical presentation, but by an etiological criterion, to mean disorders arising from aberrant neural development. This definition includes many forms of epilepsy (considered either as a distinct disorder or as a co-morbid symptom) as well as disorders like schizophrenia (SZ), which have later onset but which can still be traced back to neurodevelopmental origins. Though the symptoms of SZ itself typically arise only in late teens or early twenties, convergent evidence of epidemiological risk factors during fetal development and very early deficits apparent in longitudinal studies strongly indicate that SZ is a disorder of neural development, though its clinical consequences may remain latent for many years.

Collectively, severe neurodevelopmental disorders affect ~5% of the population (though exact numbers are almost impossible to obtain, due to changing diagnostic criteria and substantial co-morbidity between clinical categories). These disorders impact on the most fundamental aspects of human experience: cognition, language, social interaction, perception, mood, motor control, sense of self. They impair function, often severely, and restrict opportunities for sufferers, as well as placing a heavy burden on families and caregivers. As lifelong illnesses, they also give rise to a substantial economic burden, both in direct healthcare costs and indirect costs due to lost opportunity.

The treatments currently available for neurodevelopmental disorders are very limited and problematic. Intensive educational interventions may help ameliorate some cognitive or behavioural difficulties, such as those associated with ID or ASD, but to a limited extent and without addressing the underlying pathology. With respect to psychiatric symptoms, the mainstays of pharmacotherapy (antipsychotic medication, mood stabilizers, antidepressants and anxiolytics) all emerged between the 1940s and 1960s, with almost no new drugs being developed since. Most of these treatments were discovered serendipitously, and their mechanisms of action remain poorly understood. In most cases, the existing treatments are only partially effective and can induce serious side effects. This is also true for the range of anticonvulsants, and, for all these drugs, it is typically impossible to predict from symptom profiles alone whether individual patients will benefit from a particular drug or possibly be harmed by it. These difficulties and the attendant poor outcomes for many patients arise from not knowing the causes of disease in particular patients and not understanding the underlying pathogenic mechanisms. Genetic research promises to address both these issues.

Neurodevelopmental disorders are predominantly genetic in origin and have often been thought of as falling into two groups. The first includes a very large number of individually rare syndromes with known genetic causes. Examples include Fragile X syndrome, Down syndrome, Rett syndrome and Angelman syndrome but there are literally hundreds of others. Each of these is clearly caused by a single genetic lesion, sometimes involving an entire chromosome or a section of chromosome, sometimes affecting a single gene. Most are characterised by ID, but many also show high rates of epilepsy, ASD or other neuropsychiatric symptoms.

The second group comprises idiopathic cases of ID, ASD, SZ or epilepsy – those with no currently known cause. Despite the lack of an identified genetic lesion, there is still very strong evidence of a genetic etiology across these categories. All of these conditions are highly heritable, showing high levels of twin concordance, much higher in monozygotic than in dizygotic twins, substantially increased risk to relatives and typically zero effect of a shared family environment, indicating strong genetic causation.

What has not been clear is whether these so-called “common disorders” are simply collections of rare genetic syndromes that we cannot yet discriminate, or whether they have a very different genetic architecture. The dominant paradigm in the field has held that the idiopathic, non-syndromic cases of common disorders like ASD or SZ reflect the extreme end of a continuum of risk across the population. This is based on a model involving the segregation of a very large number of genetic variants, each of small effect alone, which can, above a collective threshold of burden in individuals, result in frank disease.

Recent genetic discoveries are prompting a re-evaluation of this model, as well as casting doubt on the biological validity of clinical diagnostic categories. After decades of frustration, the genetic secrets of these conditions are finally yielding to new genomic microarray and sequencing technologies. These are revealing a growing list of rare, single mutations that confer high risk of ASD, ID, SZ or epilepsy, particularly epileptic encephalopathies.

These findings strongly reinforce a model of genetic heterogeneity, whereby common clinical categories do not represent singular biological entities, but rather are umbrella terms for a large number of distinct genetic conditions. These conditions are individually rare but collectively common. Strikingly, almost all of the identified mutations are associated with variable clinical manifestations, conferring risk across traditional diagnostic boundaries. These findings fit with large-scale epidemiological studies that also show shared risk across these disorders. Thus, while current diagnostic categories may reflect more or less distinct clinical states or outcomes, they do not reflect distinct etiologies.

The “genetics of autism” is thus neither singular nor separable from the “genetics of intellectual disability”, the “genetics of schizophrenia” or the “genetics of epilepsy”. The more general term of “developmental brain dysfunction” has been proposed to encompass disorders arising from altered neural development, which can manifest clinically in diverse ways. This book is about the genetics of developmental brain dysfunction.

A lot can go wrong in the development of a human brain. The right numbers of hundreds of distinct types of nerve cells have to be generated in the right places, they have to migrate to form highly organised structures, and they must extend nerve fibres, which navigate their way through the brain to ultimately find and connect with their appropriate partners, avoiding wrong turns and illicit interactions. Once they find their partners they must form synapses, the incredibly complex and diverse cellular structures that mediate communication between nerve cells. These synapses are also highly dynamic, responding to patterns of activity by strengthening or weakening the connection.

The instructions to carry out these processes are encoded in the genome of the developing embryo. Each of these aspects of neural development requires the concerted action of the protein products of thousands of distinct genes. Mutations in any one of them (or sometimes in several at the same time) can lead to developmental brain dysfunction.

The identification of numerous causal mutations has focused attention on the roles of the genes affected, with a number of prominent classes of neurodevelopmental genes emerging. These include genes involved in early brain patterning and proliferation, those mediating later events of cell migration and axon guidance, and a major class involved in synapse formation and subsequent activity-dependent synaptic refinement, pruning and plasticity. Also highlighted are a number of biochemical pathways and networks that appear especially sensitive to perturbation.

Genetic discoveries thus allow an alternate means to classify disorders, based on the underlying neurodevelopmental processes affected. This provides more etiologically valid and arguably more biologically coherent categories than those based on clinical outcome. For individual patients, the application of microarray and sequencing technologies is already changing clinical practice in diagnosis and management of neurodevelopmental disorders. This will only increase as more and more pathogenic mutations are identified.

Such discoveries also provide entry points to enable the elucidation of pathogenic mechanisms, where exciting progress is being made using cellular and animal models. For any given mutation, this involves defining the defects at a cellular level (in the right cells), and working out how such defects propagate to the levels of neural circuits and systems, ultimately producing pathophysiological states that underlie neuropsychiatric symptoms. Definition of these pathways will hopefully lead to a detailed enough understanding of the molecular or circuit-level defects to rationally devise new therapeutics.

The elucidation of the heterogeneous genetic and neurobiological bases of neurodevelopmental disorders should thus enable a much more personalised approach to diagnosis and treatment for individual patients, and a shift in clinical care for these disorders from an approach based on superficial symptoms and generic medicines, to one based on detailed knowledge of specific causes and mechanisms.

The book is organised into several sections:

Chapters 1-6 cover broad conceptual issues relevant to neurodevelopmental disorders in general. These are informed by recent advances in genomic technologies, which have transformed our view of the genetic architecture of both rare and so-called “common” neurodevelopmental disorders. These chapters will consider the genetic heterogeneity of clinical categories like ASD or SZ, the relative importance of different types of mutations (common vs rare; single-gene vs large deletions or duplications; inherited vs de novo), etiological overlap between clinical categories and complex interactions between two or more mutations or between genetic and environmental factors.     

A preprint of Chapter 1, by me, on The Genetic Architecture of Neurodevelopmental Disorders, is available here

Chapters 7-9 present our current understanding of several different types of disorder, grouped by the neurodevelopmental process impacted. Consideration of disorders from this angle provides a more rational and biologically valid approach than consideration from the point of view of clinical symptoms, which can be arrived at through various routes.

Chapters 10-12 deal with the elucidation of pathogenic mechanisms, following genetic discoveries. They include chapters on cellular models (using induced pluripotent stem cells derived from patients) and animal models (recapitulating pathogenic mutations in mice), which are revealing the routes of pathogenesis, from defects in diverse cellular neurodevelopmental processes to resultant alterations in neural circuits and brain systems, which ultimately impinge on behaviour. The manifestation of these defects in humans also depends on processes of learning and experience-dependent development that proceed for many years after birth. Taking this aspect of development seriously is essential as it is a critical period where symptoms can be exacerbated if neglected or potentially improved by intensive interventions.

Chapters 13-14 consider the clinical implications of recent discoveries and of the general principles described in earlier chapters. Foremost among these is the recognition of extreme genetic heterogeneity, meaning that understanding what is going on in any particular patient requires knowledge of the specific underlying genetic cause. The dramatic reductions in cost for whole-genome sequencing mean such diagnoses will become far easier to make, with important implications for clinical genetic practice (including preimplantation or prenatal screening or diagnosis). Finally, the study of cellular and animal models of specific disorders is already suggesting potential therapeutic avenues for some conditions. These advances illustrate a general principle – to treat these conditions we need to identify and understand the underlying biology and design therapies to treat the specific cause in each patient and not just the generic symptoms.

A preprint of Chapter 13, by Gholson Lyon and Jason O’Rawe, on Human Genetics and Clinical Aspects of Neurodevelopmental Disorders, is available here.

The full Table of Contents is shown below:

Foreword
Kevin J. Mitchell

1. The Genetic Architecture of Neurodevelopmental Disorders
Kevin J. Mitchell

2. Overlapping Etiology of Neurodevelopmental Disorders
Eric Kelleher and Aiden Corvin

3. The Mutational Spectrum of Neurodevelopmental Disorders
Nancy D. Merner, Patrick A. Dion and Guy A. Rouleau

4. The Role of Genetic Interactions in Neurodevelopmental Disorders
Jason H. Moore and Kevin J. Mitchell

5. Developmental Instability, Mutation Load, and Neurodevelopmental Disorders
Ronald A. Yeo and Steven W. Gangestad

6. Environmental Factors and Gene-Environment Interactions
John McGrath

7. The Genetics of Brain Malformations
M. Chiara Manzini and Christopher A. Walsh

8. Disorders of Axon Guidance
Heike Blockus and Alain Chédotal

9. Synaptic Disorders
Catalina Betancur and Kevin J. Mitchell

10. Human Stem Cell Models of Neurodevelopmental Disorders
Peter Kirwan and Frederick J. Livesey

11. Animal Models for Neurodevelopmental Disorders
Hala Harony-Nicolas and Joseph D. Buxbaum

12. Cascading Genetic and Environmental Effects on Development: Implications for Intervention
Esha Massand and Annette Karmiloff-Smith

13. Human Genetics and Clinical Aspects of Neurodevelopmental Disorders
Gholson J. Lyon and Jason O’Rawe

14. Progress Toward Therapies and Interventions for Neurodevelopmental Disorders
Ayokunmi Ajetunmobi and Daniela Tropea