Friday, April 27, 2012

Robustness and fragility in neural development

So many things can go wrong in the development of the human brain that it is amazing it ever goes right.  The fact that it usually does – that the majority of people do not suffer from a neurodevelopmental disorder – is due to the property engineers call robustness.  This property has important implications for understanding the genetic architecture of neurodevelopmental disorders – what kinds of insults will the system be able to tolerate and what kinds will it be vulnerable to?

The development of the brain involves many thousands of different gene products acting in hundreds of distinct molecular and cellular processes, all tightly coordinated in space and time – from patterning and proliferation to cell migration, axon guidance, synapse formation and many others.  Large numbers of proteins are involved in the biochemical pathways and networks underlying each cell biological process.  Each of these systems has evolved not just to do a particular job, but to do it robustly – to make sure this process happens even in the face of diverse challenges. 

Robustness is an emergent and highly adaptive property of complex systems that can be selected for in response to particular pressures.  These include extrinsic factors, such as variability in temperature, supply of nutrients, etc., but also intrinsic factors.  A major source of intrinsic variation is noise in gene expression – random fluctuations in the levels of all proteins in all cells.  These fluctuations arise due to the probabilistic nature of gene transcription – whether a messenger RNA is actively being made from a gene at any particular moment.  The system must be able to deal with these fluctuations and it can be argued that the noise in the system actually acts as a buffer.  If the system only worked within a narrow operating range for each component then it would be very vulnerable to failure of any single part. 

Natural selection will therefore favour system architectures that are more robust to environmental and intrinsic variation.  In the process, such systems also indirectly become robust to the other major source of variation – mutations. 

Many individual components can be deleted entirely with no discernible effect on the system (which is why looking exhaustively for a phenotype in mouse mutants can be so frustrating – many gene knockouts are irritatingly normal).  You could say that if the knockout of a gene does not affect a particular process, then the gene product is not actually involved in that process, but that is not always the case.  One can often show that a protein is involved biochemically and even that the system is sensitive to changes in the level of that protein – increased expression can often cause a phenotype even when loss-of-function manipulations do not.

Direct evidence for robustness of neurodevelopmental systems comes from examples of genetic background effects on phenotypes caused by specific mutations.  While many components of the system can be deleted without effect, others do cause a clear phenotype when mutated.  However, such phenotypes are often modified by the genetic background.  This is commonly seen in mouse experiments, for example, where the effect of a mutation may vary widely when it is crossed into various inbred strains.  The implication is that there are some genetic differences between strains that by themselves have no effect on the phenotype, but that are clearly involved in the system or process, as they strongly modify the effect of another mutation.

How is this relevant to understanding so-called complex disorders?  There are two schools of thought on the genetic architecture of these conditions.  One considers the symptoms of, say, autism or schizophrenia or epilepsy as the consequence of mutation in any one of a very large number of distinct genes.  This is the scenario for intellectual disability, for example, and also for many other conditions like inherited blindness or deafness.  There are hundreds of distinct mutations that can result in these symptoms.  The mutations in these cases are almost always ones that have a dramatic effect on the level or function of the encoded protein. 

The other model is that complex disorders arise, in many cases, due to the combined effects of a very large number of common polymorphisms – these are bases in the genome where the sequence is variable in the population (e.g., there might be an “A” in some people but a “G” in others).  The human genome contains millions of such sites and many consider the specific combination of variants that each person inherits at these sites to be the most important determinant of their phenotype.  (I disagree, especially when it comes to disease).  The idea for disorders such as schizophrenia is that at many of these sites (perhaps thousands of them), one of the variants may predispose slightly to the illness.  Each one has an almost negligible effect alone, but if you are unlucky enough to inherit a lot of them, then the system might be pushed over the level of burden that it can tolerate, into a pathogenic state. 
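
The additive threshold idea described above can be put into a rough numerical sketch. Everything here is an assumption made purely for illustration: every site has the same risk-variant frequency, the sites are independent, each risk variant counts equally, and the numbers of sites and the cutoffs are hypothetical. The function name is invented for this example.

```python
from math import exp, lgamma, log

def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the binomial probability of exactly k successes in n trials.

    Computed in log space so that large n does not overflow or underflow.
    Requires 0 < p < 1.
    """
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

def fraction_over_threshold(n_loci: int, risk_freq: float, threshold: int) -> float:
    """Fraction of people carrying more than `threshold` risk variants.

    Assumption (illustrative only): each person has 2 * n_loci allele copies,
    each independently a risk variant with probability `risk_freq`.
    """
    copies = 2 * n_loci
    return sum(exp(log_binom_pmf(k, copies, risk_freq))
               for k in range(threshold + 1, copies + 1))

if __name__ == "__main__":
    # Hypothetical numbers: 1,000 sites, risk variant frequency 0.3 at each,
    # so the average person carries about 600 risk variants.
    for cutoff in (600, 620, 640, 660):
        print(cutoff, fraction_over_threshold(1000, 0.3, cutoff))
```

Under these assumptions everybody carries hundreds of risk variants, and only the tail of the distribution lies beyond whatever burden the system is supposed to tolerate. The sketch simply adds up variants; it does not model any buffering of their effects.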

These are the two most extreme positions – there are also many models that incorporate effects of both rare mutations and common polymorphisms.  Models incorporating common variants as modifiers of the effects of rare mutations make a lot of biological sense.  What I want to consider here is the model that the disease is caused in some individuals purely by the combined effects of hundreds or thousands of common variants (without what I call a “proper mutation”). 

Ironically, robustness has been invoked by both proponents and opponents of this idea.  I have argued that neurodevelopmental systems should be robust to the combined effects of many variants that have only very tiny effects on protein expression or function (which is the case for most common variants).  This is precisely because the system has evolved to buffer fluctuations in many components all the time.  In addition to being an intrinsic, passive property of the architecture of developmental networks, robustness is also actively promoted through homeostatic feedback loops, which can maintain optimal performance in the face of variations, by regulating the levels of other components to compensate.  The effects of such variants should therefore NOT be cumulative – they should be absorbed by the system.  (In fact, you could argue that a certain level of noise in the system is a “design feature” because it enables this buffering).

Others have argued precisely the opposite – that robustness permits cryptic genetic variation to accumulate in populations.  Cryptic genetic variation has no effect in the context in which it arises (allowing it to escape selection) but, in another context – say in a different environment, or a different genetic background – can have a large effect.  This is exactly what robustness allows to happen – indeed, the fact that cryptic genetic variation exists provides some of the best evidence that we have that the systems are robust as it shows directly that mutations in some components are tolerated in most contexts.  But is there any evidence that such cryptic variation comprises hundreds or thousands of common variants? 

To be fair, proving that this is the case would be very difficult.  You could argue from animal breeding experiments that the continuing response to selection of many traits means that there must be a vast pool of genetic variation that can affect them, which can be cumulatively enriched by selective breeding, almost ad infinitum.  However, new mutations are known to make at least some contribution to this continued response to selection.  In addition, in most cases where the genetics of such continuously distributed traits have been unpicked (by identifying the specific factors contributing to strain differences, for example) they come down to perhaps tens of loci showing very strong and complex epistatic interactions (1, 2, 3).  Thus, the fact that variation in a trait is multigenic does not mean that it is affected by mutations of small individual effect – an effectively continuous distribution can emerge from very complex epistatic interactions between a fairly small number of mutations which have surprisingly large effects in isolation.
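
A toy enumeration gives a feel for how few loci are needed before the phenotype looks continuous. This is only a sketch under invented assumptions: the main effects and pairwise interaction weights are random numbers, not estimates from any real trait, and the function name is hypothetical.

```python
from itertools import combinations, product
import random

def phenotype_values(n_loci: int, seed: int = 0) -> list[float]:
    """Phenotype value for every genotype combination at n_loci loci.

    Each locus has genotype 0, 1 or 2 (copies of the variant allele).
    Main effects and pairwise interaction weights are arbitrary random
    numbers, chosen only to illustrate the combinatorics.
    """
    rng = random.Random(seed)
    main = [rng.uniform(-1, 1) for _ in range(n_loci)]
    pair = {(i, j): rng.uniform(-1, 1)
            for i, j in combinations(range(n_loci), 2)}
    values = []
    for genotype in product((0, 1, 2), repeat=n_loci):
        value = sum(m * g for m, g in zip(main, genotype))
        value += sum(w * genotype[i] * genotype[j]
                     for (i, j), w in pair.items())
        values.append(round(value, 6))
    return values

if __name__ == "__main__":
    vals = phenotype_values(6)
    print("genotype combinations:", len(vals))
    print("distinct phenotype values:", len(set(vals)))
```

With only six loci there are already 3^6 = 729 genotype combinations, and with interactions between loci most of them can map to different phenotype values, so a histogram of the trait would look smooth even though very few genes are involved.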

(I would be keen to hear of any examples showing real polygenicity on the level of hundreds or thousands of variants). 

In the case of genetic modifiers of specific mutations – say, where a mutation causes a very different phenotype in different mouse strains – most of the effects that have been identified have been mapped to one or a small number of mutations which have no effect by themselves, but which strongly modify the phenotype caused by another mutation. 

These and other findings suggest that (i) cryptic genetic variation relevant to disease is certainly likely to exist and to have important effects on phenotype, but that (ii) such genetic background effects can most likely be ascribed to one, several, or perhaps tens of mutations, as opposed to hundreds or thousands of common polymorphisms. 

This is already too long, but it raises the question: if neurodevelopmental systems are so robust, then why do we ever get neurodevelopmental disease?  The paradox of systems that are generally robust is that they may be quite vulnerable to large variation in a specific subset of components.  Why specific types of genes are in this set, while others can be completely deleted without effect, is the big question.  More on that in a subsequent post…

Wednesday, April 4, 2012

De novo mutations in autism

A trio of papers in this week’s Nature identifies mutations causing autism in four new genes, demonstrates the importance of de novo mutations in the etiology of this disorder and suggests that there may be 1,000 or more genes in which high-risk, autism-causing mutations can occur.

These studies provide an explanation for what seems like a paradox: on the one hand, twin studies show that autism is very strongly genetic (identical twins are much more likely to share a diagnosis than fraternal twins) – on the other, many cases are sporadic, with no one else in the family affected. How can the condition be “genetic” but not always run in the family? The explanation is that many cases are caused by new mutations – ones that arise in the germline of the parents. (This is similar to conditions like Down syndrome). The studies reported in Nature are trying to find those mutations and see which genes are affected.

They are only possible because of the tremendous advances in our ability to sequence DNA. The first genome cost three billion dollars to sequence and took ten years – we can do one now for a couple thousand dollars in a few days. That means you can scan through the entire genome in any affected individual for mutated genes. The problem is we each carry hundreds of such mutations, making it difficult to recognise the ones that are really causing disease.

The solution is to sequence the DNA of large numbers of people with the same condition and see if the same genes pop up multiple times. That is what these studies aimed to do, with samples of a couple hundred patients each. They also concentrated on families where autism was present in only one child and looked specifically for mutations in that child that were not carried by either parent – so-called de novo mutations, that arise in the generation of sperm or eggs. These are the easiest to detect because they are likely to be the most severe. (Mutations with very severe effects are unlikely to be passed on because the people who carry them are far less likely to have children).

There is already strong evidence that de novo mutations play an important role in the etiology of autism – first, de novo copy number variants (deletions or duplications of chunks of chromosomes) appear at a significantly higher rate in autism patients compared to controls (in 8% of patients compared to 2% of controls). Second, it has been known for a while that the risk of autism increases with paternal age – that is, older fathers are more likely to have a child with autism. (Initial studies suggested the risk was up to five-fold greater in fathers over forty – these figures have been revised downwards with increasing sample sizes, but the effect remains very significant, with risk increasing monotonically with paternal age). This is also true of schizophrenia and, in fact, of dominant Mendelian disorders in general (those caused by single mutations). The reason is that the germ cells generating sperm in men continue to divide throughout their lifetime, leading to an increased chance of a mutation having happened as time goes on.

The three studies in Nature were looking for a different class of mutation – point mutations or changes in single DNA bases. They each provide a list of genes with de novo mutations found in specific patients. Several of these showed a mutation in more than one (unrelated) patient, providing strong evidence that these mutations are likely to be causing autism in those patients. The genes with multiple hits include CHD8, SCN2A, KATNAL2 and NTNG1. Mutations in the last of these, NTNG1, were only found in two patients but have been previously implicated as a rare cause of Rett syndrome. This gene encodes the protein Netrin-G1, which is involved in the guidance of growing nerves and the specification of neuronal connections. CHD8 is a chromatin-remodeling factor and is involved in Wnt signaling, a major neurodevelopmental pathway, as well as interacting with p53, which controls cell growth and division. SCN2A encodes a sodium channel subunit; mutations in this gene are involved in a variety of epilepsies. Not much is known about KATNAL2, except by homology – it is related to the proteins katanin and spastin, which sever microtubules; mutations in spastin are associated with hereditary spastic paraplegia. How the specific mutations observed in these genes cause the symptoms of autism in these patients (or contribute to them) is not clear – these discoveries are just a starting point, but they will greatly aid the quest to understand the biological basis of this disorder.

The fact that these studies only got a few repeat hits also means that there are probably many hundreds or even thousands of genes that can cause autism when mutated (if there were only a small number, we would see more repeat hits). Some of these will be among the other genes on the lists provided by these studies and will no doubt be recognisable as more patients are sequenced. Interestingly, many of the genes on the lists are involved in aspects of nervous system development or function and encode proteins that interact closely with each other – this makes it more likely that they are really involved.
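
The logic of the repeat-hit argument can be sketched with a little arithmetic. The assumptions are deliberately crude and are mine, not the papers': a hypothetical 100 causal de novo mutations, each landing independently and with equal probability on any one of the candidate genes (in reality genes differ in size and mutability). The function name is invented for this illustration.

```python
def expected_repeat_genes(n_mutations: int, n_genes: int) -> float:
    """Expected number of genes hit two or more times.

    Assumes n_mutations causal mutations fall independently and uniformly
    on n_genes genes (an illustrative simplification).
    """
    p = 1.0 / n_genes
    p_zero_hits = (1 - p) ** n_mutations
    p_one_hit = n_mutations * p * (1 - p) ** (n_mutations - 1)
    # A gene is a repeat hit if it is hit neither zero times nor exactly once.
    return n_genes * (1 - p_zero_hits - p_one_hit)

if __name__ == "__main__":
    # Hypothetical: 100 causal mutations spread over different numbers of genes.
    for n_genes in (50, 200, 1000, 5000):
        print(n_genes, round(expected_repeat_genes(100, n_genes), 1))
```

Under these assumptions, a small set of target genes would produce repeat hits in a large share of them, while a target of a thousand or more genes would produce only a handful. Seeing just a few repeats is what a large target would lead you to expect.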

These studies reinforce the fact that autism is not one disorder – not clinically and not genetically either. Like intellectual disability or epilepsy or many other conditions, it can be caused by mutations in any of a very large number of genes. The ones we know about so far make up around 30% of cases – these new studies add to that list and also show how far we have to go to complete it.

We should also recognise that the picture will get more complex – in many cases there may be more than one mutation involved in causing the disease. De novo mutations are likely to be the most severe class and thus most likely to cause disease with high penetrance themselves. But many inherited mutations may cause autism only in combination with one or a few other mutations.

These complexities will emerge over time, but for now we can aim to recognise the simpler cases where a mutation in a particular gene is clearly implicated. Each new gene discovered means that the fraction of cases we can assign to a specific cause increases. As we learn more about the biology of each case, those genetic diagnoses will have important implications for prognosis, treatment and reproductive decisions. We can aim to diagnose and treat the underlying cause in each patient and not just the symptoms.