There is a common view that the human genome has two different parts – a “constant” part and a “variable” part. According to this view, the bases of DNA in the constant part are the same across all individuals. They are said to be “fixed” in the population. They are what make us all human – they differentiate us from other species. The variable part, in contrast, is made of positions in the DNA sequence that are “polymorphic” – they come in two or more different versions. Some people carry one base at that position and others carry another. The idea is that it is the particular set of such variations that we inherit that makes us each unique (unless we have an identical twin). According to this idea, we each have a hand dealt from the same deck.
The genome sequence (a simple linear code made up of 3 billion bases of DNA in precise order, chopped up onto different chromosomes) is peppered with these polymorphic positions – about 1 in every 1,250 bases. That makes about 2,400,000 polymorphisms in each genome (and we each carry two copies of the genome). That certainly seems like plenty of raw material, with limitless combinations that could explain the richness of human diversity. This interpretation has fuelled massive scientific projects to try and find which common polymorphisms affect which traits. (Not to mention personal genomics companies who will try to tell you your risk of various diseases based on your profile of such polymorphisms).
The problem with this view is that it is wrong. Or at least woefully incomplete.
The reason is it ignores another source of variation: very rare mutations in those bases that are constant across the vast majority of individuals. There is now very good evidence that it is those kinds of mutations that contribute most to our individuality. Certainly, they are much more likely to affect a protein’s function and much more likely to contribute to genetic disease. We each carry hundreds of such rare mutations that can affect protein function or expression and are much more likely to have a phenotypic impact than common polymorphisms.
Indeed, far from most of the genome being effectively constant, it can be estimated that every position in the genome has been mutated many, many times over in the human population. And each of us carries hundreds of new mutations that arose during generation of the sperm and egg cells that fused to form us. New mutations may spread in the pedigree or population in which they arise for some time, depending in part on whether they have a deleterious effect or not. Ones that do will likely be quickly selected against.
A new paper from the 1000 genomes project consortium shows that:
“the vast majority of human variable sites are rare and that the majority of rare variants exhibit, at most, very little sharing among continental populations”.
This is a much more fluid picture of genetic variation than we are used to. We are not all dealt a genetic hand from the same deck – each population, sub-population, kindred, nuclear family has a distinct set of rare genetic variants. And each of these decks contains a lot of jokers – the new mutations that arise each time a hand is dealt.
Why have such rare mutations generally been ignored while the polymorphic sites have been the focus of intense research? There are several reasons, some practical and some theoretical. Practically, it has until recently been almost impossible to systematically find very rare mutations. To do so requires that we sequence the whole genome, which has only recently become feasible. In contrast, methods to survey which bases you carry at all the polymorphic sites across the genome were developed quite some time ago now and are relatively cheap to use. (They rely on sampling about 500,000 such sites around the genome – because of unevenness in the way different bits of chromosomes get swapped when sperm and eggs are made, this sample actually tells you about most of the variable sites across the whole genome). So, there has been a tendency to argue that polymorphic sites will be major contributors to human phenotypes (especially diseases) because those have been the only ones we have been able to look at.
Unfortunately, the results of genome-wide association studies, which aim to identify common variants associated with traits or diseases, have been disappointing. This is especially true for disorders with large effects on fitness, such as schizophrenia or autism. Some variants have been found but their effects, even in combination are very small. Most of the heritability of most of the traits or diseases examined to date remains unexplained. (There are some important exceptions, especially for diseases that strike only late in life and for things like drug responses, where selective pressures to weed out deleterious alleles are not at play).
In contrast, many more rare mutations causing disease are being discovered all the time, and the pace of such discoveries is likely to increase with technological advances. The main message that emerges from these studies has been called by Mary-Claire King the “Anna Karenina principle”, based on Tolstoy’s famous opening line:
“Happy families are all alike; every unhappy family is unhappy in its own way”
But can such rare variants really explain the “missing heritability” of these disorders? Some people have argued that they cannot, but this seems to me to be based on a pervasive misconception of how the heritability of a trait is measured and what it means. According to this misconception, if a trait is heritable across the population, that heritability cannot be accounted for by rare variants. After all, if a mutation only occurs in one or a few individuals, it could only minimally (nearly negligibly) contribute to heritability across the whole population. That is true. However, heritability is not measured across the population – it is measured in families and then averaged across the population.
In humans, it is usually derived by comparing phenotypes between people of different genetic relatedness (identical versus fraternal twins, siblings, parents, cousins, etc.). The values of these comparisons are then averaged across large numbers of pairs to allow estimates of how much genetic variance affects phenotypic variance – the population heritability. While a specific rare mutation may only affect the phenotype within a single family, such mutations could, collectively, explain all of the heritability. Completely different sets of mutations could be affecting the trait or causing the disease in different families.
The next few years will reveal the true impact of rare mutations. We should certainly expect complex genetic interactions and some real effects of common polymorphisms. But the idea that our traits are determined simply by the combination of variants we inherit from a static pool in the population is no longer tenable. We are each far more unique than that.
(And if your personal genomics company isn’t offering to sequence your whole genome, it’s not personal enough).
Gravel S, Henn BM, Gutenkunst RN, Indap AR, Marth GT, Clark AG, Yu F, Gibbs RA, The 1000 Genomes Project, & Bustamante CD (2011). Demographic history and rare allele sharing among human populations. Proceedings of the National Academy of Sciences of the United States of America, 108 (29), 11983-11988 PMID: 21730125
Walsh CA, & Engle EC (2010). Allelic diversity in human developmental neurogenetics: insights into biology and disease. Neuron, 68 (2), 245-53 PMID: 20955932
McClellan, J., & King, M. (2010). Genetic Heterogeneity in Human Disease Cell, 141 (2), 210-217 DOI: 10.1016/j.cell.2010.03.032