Lessons for human genetics from genetic screens in model organisms
Why did the axon cross the midline? That seems like a simple enough biological problem to solve. In the developing nervous system, especially in the anatomically simple spinal cord, some nerve cells send a slender nerve fibre (called an axon) across the midline of the nervous system to connect to cells on the other side. The projections of other neurons are restricted to the same side as their own cell bodies. The connections between the two sides are crucial in coordinating movement of the two sides of the body. But, more importantly for this discussion, this system is simple enough to be genetically tractable – at least it seems so.
When I arrived as a graduate student in the lab of Corey Goodman at the University of California at Berkeley, his group had just carried out a genetic screen in fruit flies to try to understand how this developmental decision was controlled. Flies have an equivalent of a spinal cord, called the ventral nerve cord, and Corey and his colleagues had spent many years characterising the cells that make it up and the repeating patterns of simple circuits in each segment. In the developing embryo it is possible to identify specific neurons that either cross or don’t cross the midline. A collection of antibodies to various proteins expressed on the surface of neurons allowed robust visualisation of these projections and was used as a tool for screening for mutations that affected whether neurons project across the midline or not.
The results illustrate some general points about the logic of genetic screens, which, I think, are instructive for our understanding of the genetic architecture of human traits and the types of genes that may be implicated in them.
The design of the screen was pretty straightforward: generate many thousands of mutant lines of flies (each carrying multiple random mutations) and examine the embryos of each line by staining them with an antibody (BP102) that allowed visualisation of the full axonal scaffold in each embryo. This antibody highlights a stereotyped ladder-like pattern of axonal projections in the ventral nerve cord – one big tract extending longitudinally on each side of the midline and two rungs of the ladder in each segment – the “commissures” of axons projecting across the midline. The phenotypes they were looking for were simply any deviation from this normal pattern, especially ones where the commissures were affected.
The logic of this was simple: it was known, in a general sense, that the axonal projections of developing neurons are guided by molecular cues in their environment, which they detect with specialised receptor proteins expressed on their surface. Some of these interactions are attractive and some are repulsive. Given the differential behaviour of specific neurons with respect to the midline, it seemed likely that some neurons were being attracted to it and others repelled by it. There must thus exist some genes encoding proteins whose job it was to direct these processes – guidance cues and receptors. At the time, hardly any such molecules were known in any system and the hope was that this genetic screen would turn up mutations in just those kinds of genes (by screening for alterations in the anatomical structure they produce).
And it did. There were indeed mutations found in genes encoding guidance cues (like Slit) and receptors (like Roundabout (Robo) and Frazzled) and in a protein involved in dynamically regulating guidance receptor expression (Commissureless). These were hugely important for the field – especially as those genes are highly conserved and equally important in wiring the human nervous system.
But the point I want to make is not about them. It’s about all the other mutations that caused defects in the axonal scaffold that were not in genes encoding guidance cues or receptors. There were mutations that affected early nerve cord patterning, the production of neurons, the cellular identity of specific neurons, the specification of cells at the midline that produce the attractive and repulsive cues, the ability of neurons to extend an axon, and on and on. Defects in any of these diverse processes could indirectly lead to an aberrant pattern of axonal projections.
Corey and his colleagues used a series of other antibodies to further characterise each mutant line that showed a defect in the axonal scaffold, in order to exclude ones with defects in all these other processes. Those indirectly acting mutants made up the majority of the lines. Only a handful remained that carried mutations in genes encoding proteins with a direct function in axonal guidance, and it took a huge amount of work and a very detailed understanding of the system to distinguish them from the much larger set of genes that were only indirectly affecting the axonal scaffold.
Genetic screens in humans
Now, it may be becoming apparent where I’m going with this, in relation to human genetics. The midline screen in flies involved “saturation mutagenesis” – creating enough mutant lines such that every gene that could be mutated to cause a defect in the axonal scaffold would be mutated, several times over. Experimentally, this is achieved by feeding male flies a chemical mutagen that induces dozens of mutations per sperm and then establishing thousands of different mutant lines from their offspring.
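The logic of saturation can be put in rough numbers. The sketch below is a back-of-the-envelope Poisson estimate, not a description of how the screen was actually planned. It assumes mutations land on genes uniformly at random (they don't, since genes differ in size), and the function name and all example figures (5,000 lines, 10 gene-disrupting hits per line, 14,000 genes) are illustrative assumptions rather than numbers from the screen.

```python
import math

def fraction_of_genes_hit(n_lines, hits_per_line, n_genes, min_hits=1):
    """Poisson estimate of the fraction of genes mutated at least `min_hits` times.

    Assumes every gene is an equally likely target, which is a simplification.
    """
    mean_hits = n_lines * hits_per_line / n_genes  # expected hits per gene
    # Probability of seeing fewer than `min_hits` hits in a given gene.
    p_too_few = sum(
        math.exp(-mean_hits) * mean_hits**k / math.factorial(k)
        for k in range(min_hits)
    )
    return 1.0 - p_too_few

# Illustrative (assumed) numbers, not taken from the screen itself.
hit_once = fraction_of_genes_hit(5_000, 10, 14_000)
hit_thrice = fraction_of_genes_hit(5_000, 10, 14_000, min_hits=3)
print(f"hit at least once: {hit_once:.2f}; at least three times: {hit_thrice:.2f}")
```

Under these assumptions nearly every gene is hit at least once, but noticeably fewer are hit three or more times, which is why hitting each gene "several times over" takes so many lines.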
In humans, all this work has been done for us. The human population is at saturation mutagenesis. Every time sperm or eggs are generated, new mutations are introduced. Not due to any chemical or environmental mutagen – just because it’s not easy to copy 3 billion letters of DNA with complete accuracy and every time that is done a few mistakes creep past the quality control machinery. Because of the recent explosion in the size of the human population, we can be sure that every base in the genome is mutated in multiple people somewhere on the planet (at least every one that is still compatible with life).
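A rough calculation shows why this claim is plausible. The figures used here are ballpark assumptions that are not given in the text: a new-mutation rate of about 1.2e-8 per base per generation, a genome of 3 billion bases per copy, and a world population of 7.5 billion.

```python
MUTATION_RATE = 1.2e-8   # assumed new mutations per base per generation, per genome copy
GENOME_SIZE = 3e9        # bases in one (haploid) copy of the genome
POPULATION = 7.5e9       # assumed number of living people

# Each person inherits two genome copies, each carrying its own new mutations.
new_mutations_per_person = 2 * GENOME_SIZE * MUTATION_RATE

# Expected number of living people carrying a new mutation at one given base.
carriers_per_base = POPULATION * 2 * MUTATION_RATE

print(f"new mutations per person: about {new_mutations_per_person:.0f}")
print(f"people with a new mutation at any given base: about {carriers_per_base:.0f}")
```

With these inputs, each person carries on the order of 70 new mutations, and any given base is newly mutated in well over a hundred living people. The estimate ignores mutations inherited from earlier generations, which only add to the total.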
So, when we investigate the genetics of any given phenotype in humans, we are really doing a genetic screen – asking what genes can be mutated to cause a particular phenotype (often a clinical disorder) or affect a particular trait. And the same logic applies as we saw above: some of the mutations or genetic variants that we find affecting a trait will be in genes that encode proteins directly involved in the systems underlying that trait. But many more will be having only very indirect effects on the phenotype – often so distantly related that no real functional relationship holds between the system affected and the cellular roles of the encoded protein.
The proportion of each class depends hugely on what kind of phenotype we are looking at and in how much detail we can characterise it.
The genetics of human brain development
If we are looking at tightly defined neurological conditions, where the phenotype is quite specific, then we may expect a pretty direct relationship between the function of the gene and the effect when it is mutated. Microcephaly, for example, is characterised by a smaller than normal head and brain. Neuroimaging shows this is mostly due to a smaller neocortex. And, sure enough, a subset of the genes identified as mutated in this condition encode proteins that are directly involved in controlling neuronal proliferation in the developing neocortex.
However, there are over 1200 entries for ‘microcephaly’ in the OMIM (Online Mendelian Inheritance in Man) database, and the vast majority of implicated genes do not encode proteins that are directly involved in neurogenesis. They affect hundreds of other kinds of processes, which only indirectly impair neurogenesis when compromised. This is directly analogous to the situation in the midline screen in flies. Even for a condition where the phenotype is directly anatomically observable and the underlying cellular processes reasonably well defined, the vast majority of mutations affecting it do so indirectly.
Now consider the nature of that relationship for phenotypes that are much less well defined and more emergent, like psychological traits or psychiatric disorders. Let’s take a personality trait like impulsivity as an example. A lot of pharmacological and neural systems work has implicated the serotonin signaling system in this trait. And severe mutations in genes encoding components of this system (such as enzymes involved in making or breaking down serotonin and a number of different serotonin receptors) have been shown to affect impulsivity in both rodents and humans, often manifesting in risk-taking and physically aggressive behaviour.
Yet genome-wide association studies of risk-taking behaviour have not landed on variants in genes encoding components of the serotonergic pathway. A recent one identified associated common variants in 116 genes. These were enriched for genes expressed in the brain and involved in developmental pathways pretty generically, but did not include many previous candidate genes directly involved in serotonin (or dopamine) signaling.
Based on what we saw in flies, this should not be a surprise. Even if all of the phenotypic variation in risk-taking involved serotonergic neural pathways (an admittedly simplistic hypothesis), we should not expect all of the genetic variation affecting the trait to be directly impacting serotonin-related biochemical pathways. In fact, we shouldn’t even expect most of the genetic variation affecting the trait to do so – quite the opposite. This is just a statistical corollary of the fact that there are thousands of times more ways, genetically speaking, to mess up that system indirectly than directly.
A similar situation holds for psychiatric disorders, such as schizophrenia. This condition is quite highly heritable, meaning the majority of the variation in risk is genetic in origin. The underlying biology of the symptoms of the condition is not well understood, but alterations to dopaminergic signaling are a likely common feature in psychosis. However, of over a hundred genes implicated by common genetic variants, only one is involved directly in dopamine signaling. And of the dozens of genes with identified rare, high-risk mutations, none directly encode components of the dopamine pathway.
Instead, both sets of genes are enriched for ones with neurodevelopmental functions, defined pretty broadly. The common endpoint may involve dysfunction of dopaminergic neural pathways (again, this is simplistic), but the genetic origins are much more diverse and seem to be centred on how the brain develops. These are not “genes for schizophrenia”. They are not genes for working memory, or for veridical perception, or for not being paranoid. They are certainly not genes for dopaminergic signaling. They are genes for building a human brain.
The omnigenic model
A recent paper from Jonathan Pritchard’s group has considered the genetic architecture of a number of complex human traits and disorders, including schizophrenia, and come to what seems like a very surprising and somewhat disconcerting conclusion – that genetic variation in perhaps all of the genes expressed in the relevant tissue can contribute to phenotypic variation in a trait or disorder across the population.
They were looking specifically at the contribution of common genetic variants to these conditions. For schizophrenia, such variants (called SNPs) collectively explain a proportion of overall variance in risk across the population (maybe 25%). Individually, the effects on risk of one version or another at any given SNP are tiny – almost negligible, in fact – but they can be detected statistically by looking at the frequencies of each version in people with the disease versus people without.
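To see how a tiny frequency difference becomes detectable only in very large samples, here is a minimal sketch using a plain chi-square test on allele counts. This is a simplified stand-in, not the method used in the studies discussed (real analyses typically use regression models with covariates). The function name and the SNP frequencies (31% in cases versus 30% in controls) are hypothetical.

```python
import math

def allele_association_p(case_risk, case_total, control_risk, control_total):
    """P-value from a 1-degree-of-freedom chi-square test on a 2x2 allele-count table."""
    a, b = case_risk, case_total - case_risk
    c, d = control_risk, control_total - control_risk
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square tail probability equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical SNP: one version at 31% frequency in cases vs 30% in controls.
# Each person contributes two alleles, so 1,000 people give 2,000 allele counts.
p_small = allele_association_p(620, 2_000, 600, 2_000)
p_large = allele_association_p(62_000, 200_000, 60_000, 200_000)
print(f"1,000 cases and controls: p = {p_small:.2g}")
print(f"100,000 cases and controls: p = {p_large:.2g}")
```

The same one-percentage-point difference is nowhere near significant with a thousand people per group, but falls below the conventional genome-wide threshold of 5e-8 with a hundred thousand per group.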
Given large enough samples, one can look through the pattern of SNP frequencies to see which kinds of SNPs tend to show a statistical association. The hope is that some particular biochemical pathways or cellular processes will be implicated. As mentioned above, for schizophrenia, the main gene sets that show enrichment in associated SNPs are those involved in neural development. But that is by no means exclusive.
What Pritchard and his colleagues showed was that some signal of association could be found for effectively every SNP that was associated with a gene that is expressed in the brain. Moreover, while SNPs associated with genes that are specifically expressed in the brain explain more variance on a one-by-one basis, SNPs in more broadly expressed genes explain more variance collectively because there are so many more of them. Again, indirect effects outnumber direct ones.
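The arithmetic behind that last point is simple enough to sketch. The numbers below are hypothetical, chosen only to illustrate the argument and not taken from the paper, and the calculation assumes each SNP contributes independently and additively.

```python
def collective_variance(n_snps, variance_per_snp):
    """Total variance explained, assuming independent, additive SNP effects."""
    return n_snps * variance_per_snp

# Hypothetical numbers chosen only to illustrate the argument.
brain_specific = collective_variance(n_snps=1_000, variance_per_snp=4e-5)
broadly_expressed = collective_variance(n_snps=20_000, variance_per_snp=1e-5)

print(f"brain-specific genes:    {brain_specific:.1%} of variance")
print(f"broadly expressed genes: {broadly_expressed:.1%} of variance")
```

In this toy case each brain-specific SNP explains four times as much variance as a broadly expressed one, yet the broadly expressed set explains five times more in total, simply because it is twenty times larger.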
Pleiotropy is the norm
Not only will most of the mutational effects be indirect, they will also be largely non-specific. Any given mutation that happens to indirectly affect something like serotonin signaling (e.g., by affecting neural development) will probably be indirectly affecting lots of other things too. (In genetic parlance, its effects are pleiotropic.) And if you look across a bunch of mutations that affect serotonin signaling in the brain, their other effects will likely be quite diverse.
This will be true for any given phenotype that you screen for. So, the fact that a construct like impulsivity is heritable does not mean there are “genes for impulsivity”. It certainly does not mean there is some dedicated genetic module, as often proposed in evolutionary psychology models. The apparent selectivity is an illusion created by viewing the effects of pleiotropic genetic variants from the perspective of a single trait at a time.
Is there any point in doing genetics then?
The main rationale for doing genetic screens in model organisms is to elucidate the molecular basis of some biological process. This approach is incredibly powerful and has been extraordinarily successful. However, it depends on a detailed understanding of the processes being probed, and, as we saw in the midline screen, some secondary means to distinguish mutations in genes directly involved in a process of interest from the much larger set of mutations that affect it indirectly.
For many human traits or disorders, especially ones involving the human mind, that detailed understanding is lacking. Oftentimes the phenotype is simply a word on a form – like “schizophrenia”. Moreover, while in model organisms we can simply screen out the indirect and non-specific mutations and focus on the ones directly involved in the processes of interest, we don’t have that luxury in humans. The indirect and non-specific ones will contribute most of the variance in risk.
At one level, that’s okay – just identifying these genetic risk factors can be tremendously useful in a clinical setting. But it does make getting at the underlying biology much more challenging. Nature is under no obligation to make things simple for us. It is going to take a hell of a lot more work after the initial discovery of genetic variants to unravel the biology of complex traits and disorders.
Seeger M, Tear G, Ferres-Marco D, Goodman CS. Mutations affecting growth cone guidance in Drosophila: genes necessary for guidance toward or away from the midline. Neuron. 1993 Mar;10(3):409-26.
Boyle EA, Li YI, Pritchard JK. An Expanded View of Complex Traits: From Polygenic to Omnigenic. Cell. 2017 Jun 15;169(7):1177-1186.
Here is a report in which natural genetic variation has been fixed to produce a complex trait (elevated blood pressure) in a model organism.
The trait was fixed within 3 generations. And at a very large divergence from the source population mean. This could not have been achieved, in so short an interval, by simultaneous selection of the hundreds/thousands of trait affecting variants Pritchard proposes and you echo here.
So while your proposition may accurately describe some relationships in natural populations between genetic variation and complex traits, it is also clear that natural genetic variation can exist that has a strong combinatorial effect on traits that must arise from a small number of variants.
The question that Pritchard has sought to answer (why don't we find variants with major trait effects in GWAS) then arises again. Such variants seem to exist in outbred populations and can be rapidly fixed by selective breeding. Interestingly, the underlying variants in this specific model remain unidentified:
This suggests to me that there are elements of biology that are at work, but not sampled by the analytical approaches applied to their discovery. Pritchard's solution probably has merit, but it does seem like a rather easy out and brings with it the danger of diverting attention from a more fundamental question which is what element of biology are we missing when we seek and fail to link complex traits that cannot arise from the fixation of hundreds of variants.
Thanks a lot for your comment. Your point is completely valid. I hope I did not give the impression that I think the omnigenic model explains ALL of the genetic variance of traits or disorders - far from it! It relates specifically to the variance that can be ascribed to *common* genetic variants - SNPs that each have a TINY effect by themselves. Collectively, though, their effects may explain a proportion of the overall variance in risk (or in a continuous trait). For something like schizophrenia, this proportion is maybe 25%, as mentioned above.
That means most of the genetic variance is probably due to rare mutations - many of them brand new ones, in fact. The same applies even to continuous traits - a sizeable fraction is NOT explained by common variants. In any individual, both these kinds of variants are likely at play. (I tend to think of the common variants as a modifying background - like strain background in mice). And you are dead right: nonlinear, non-additive interactions will almost certainly also be important in determining any individual's overall risk level. (See here for more on that: http://www.wiringthebrain.com/2013/07/no-gene-is-island.html)
For both rare and common variants, though, the main point I was making is the same - most of the effects on the phenotype of interest will be indirect. That also implies that most mutations will (indirectly) affect lots of other things too.
All of which makes it much more difficult to implicate specific biochemical pathways or cellular processes on the basis of genetic findings (rare or common) alone.
Interesting post. Sounds like direct effects are a needle in a haystack of indirect effects.
Do you think looking at effect sizes could help? Intuitively, I'd think that for any given SNP, its direct effects would have larger effect sizes than its indirect effects, but that's just speculation.
For SNPs, I think maybe. It does feel like ones with bigger effect sizes would be more likely to be having a direct effect, but I don't actually know whether that is the case or not. I don't think it needs to be. If you look at rare mutations causing neurological or psychiatric disorders, I don't think that logic holds up - mutations causing phenylketonuria (PKU), for example, have a very strong effect on cognition, but it is extremely indirect (via disruption of basic cellular metabolism).