Missing heritability found safe and well
The case of the ‘missing heritability’ has
become celebrated, by some, as a supposed indicator of just how abjectly the
Human Genome Project has failed to live up to its promise. We’ve known for a
long time that many human traits and common disorders are quite heritable. The
HGP was supposed to reveal the underlying genetic causes, paving the way for
deeper understanding and new therapies. But genetics seemed to keep coming up
short, finding some causal variants but leaving most of the heritability
unexplained, or ‘missing’. A new study (along with a lot of supporting theory
and other empirical evidence) shows that the answer lies in genetic variants
that are much rarer in the population than those that had typically been
studied.
It is common knowledge that many human
traits run in families, as does risk of common disorders like heart disease,
asthma, or mental illness. People resemble their relatives, not just
physically, but physiologically and even psychologically. Twin and family
studies have consistently found that most of that resemblance is due to shared
genetics rather than shared environment.
If shared genes make people more similar to
each other, then, conversely, genetic differences must contribute to differences
between people across the population. The heritability is a statistic that
estimates how much of a contribution genetic variation makes to the observed
phenotypic variation. It can be estimated from twin or family studies and is
expressed as a proportion of the total variance.
For height, the heritability is about 0.8,
meaning 80% of the variance in height across the population is attributable to
genetic variation. For I.Q., most recent estimates of the heritability put it
at about 50%. [Note this does NOT mean that 80% of a person’s height and 50% of
their I.Q. comes from their genes and the rest from their environment – that is
a meaningless statement. Heritability does not apply to individuals nor does it
apply to the absolute values of a trait – it only applies to the variance in that trait]. Intuitively,
what those heritability figures mean is that if we were all clones, there would
be a lot less variability in our height (as we see in identical twins), and
only half as much variability in I.Q.
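To make the definition concrete, here is a minimal Python sketch of heritability as a ratio of variances. The function names and the variance components are hypothetical, picked only to reproduce the 0.8 and 0.5 figures above, and the sketch assumes that genetic and non-genetic contributions simply add, with no covariance or interaction between them.

```python
def heritability(genetic_variance, environmental_variance):
    """Proportion of total phenotypic variance attributable to genetic variation."""
    total_variance = genetic_variance + environmental_variance
    return genetic_variance / total_variance


def variance_among_clones(total_variance, h2):
    """Variance that would remain if everyone shared one genome.

    Under the simple additive assumption, only the non-genetic share is left.
    """
    return total_variance * (1.0 - h2)


# Hypothetical variance components (arbitrary units), not real measurements.
h2_height = heritability(genetic_variance=0.8, environmental_variance=0.2)
h2_iq = heritability(genetic_variance=0.5, environmental_variance=0.5)

print("height h2:", h2_height)
print("IQ variance left among clones:", variance_among_clones(1.0, h2_iq))
```

Note that nothing in this calculation refers to any one person: the inputs and the output are all properties of a population.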
Finding the genes
If a trait is heritable, then there must
exist across the population some specific genetic variants that cause the observed
differences in phenotype. So, what are these ‘variants’?
The sequence of the human genome – the
complete set of genetic instructions for a human being – is basically a single string
of DNA, comprising a sequence of three billion chemical bases or ‘letters’, A,
C, G, or T, broken into 23 different chromosomes. Starting from the tip of
chromosome 1, we can track the sequence at any given position and compare it
across people. Most sites show very little difference – almost everyone in the
population has the exact same letter at that position. But some sites – a bit
over 1 in every 1,000 – have two or more versions, both at an appreciable
frequency in the population. So, for example, 35% of the time it might be an
“A” and 65% of the time it might be a “C”. Those sites are referred to as
common variants or single-nucleotide polymorphisms (SNPs).
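As a rough illustration of what "appreciable frequency" means in practice, the sketch below tallies the letters seen at one site and checks whether the second most frequent version clears a cutoff. The function names are hypothetical, the 100 observations are made up to match the 35%/65% example, and the 1% cutoff is an assumption (studies choose their own threshold).

```python
from collections import Counter


def allele_frequencies(alleles):
    """Frequency of each letter observed at a single genomic site."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def is_common_variant(alleles, min_minor_frequency=0.01):
    """True if the second most frequent version reaches the cutoff.

    The 1% default is an assumed, illustrative threshold.
    """
    freqs = sorted(allele_frequencies(alleles).values(), reverse=True)
    return len(freqs) > 1 and freqs[1] >= min_minor_frequency


# Made-up observations at one site: 35 copies of "A" and 65 copies of "C".
site = ["A"] * 35 + ["C"] * 65
print(allele_frequencies(site))
print(is_common_variant(site))
```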
These common variants arose through the
process of mutation in some very distant ancestor – an error in copying the
DNA, leading to a change in sequence. That new variant then spread to the
descendants of that individual and, over many, many generations, eventually
came to be common in the population. (Of course, many other such variants
disappeared from the population over the same time period). Across the whole
genome, there are about 5 million such common variants.
In considering the source of the
heritability of general traits and common disorders (as opposed to say, very
rare genetic diseases known to be caused by specific mutations), a pretty
reasonable idea is that common variants will make a major contribution to it.
Given the way these traits are inherited (in a continuous, rather than discrete
fashion), it also makes sense to expect that each trait will be affected by
many such common variants, with combined or cumulative effects.
Genome-wide association studies (GWAS) were
designed to try and identify common variants contributing to specific traits or
disorders. The basic idea is epidemiological: if a particular version of a SNP
increases the value of a trait or increases risk of a disorder, then it should
be more common in people with a high value of the trait or with the disorder.
(Just as smoking is more common in people with lung cancer than in people
without). So, if you compare the frequencies of each of the versions of all the
SNPs in the genome across people with different values of a trait, you can find
ones that show this frequency difference. These are statistically “associated”
with the trait.
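The comparison for a single SNP can be sketched as a test on a 2x2 table of allele counts. This is only an illustration: the counts are invented, the function name is hypothetical, and a plain chi-square test stands in for the regression models that real GWAS pipelines typically use.

```python
import math


def allele_association(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Assumes every row and column total is non-zero.
    """
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented counts: the "alt" version is at 38% in cases and 35% in controls.
chi2, p_value = allele_association(760, 1240, 700, 1300)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

With these invented numbers, a three-percentage-point frequency difference in a thousand cases and a thousand controls is only marginally detectable, which gives a feel for why sample size matters so much.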
In practice, you can look at just a sample of
the SNPs across the genome because ones that are near each other on a chromosome tend
to get inherited together, so if you know which version one of them is, you can
infer the other ones, with some measurable degree of certainty. So, most GWAS
looked at between 500,000 and a million SNPs. That is still an awful lot of
comparisons, which have to be corrected for when you are figuring out which
differences are statistically significant and which are just due to some
randomness in your sample.
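One simple way to see the scale of the problem is a Bonferroni correction, which divides the usual significance level by the number of tests. This is offered as an illustration rather than as the method any particular study used; the function names are hypothetical, and treating a million SNPs as a million independent tests is a simplifying assumption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the overall error rate near alpha."""
    return alpha / n_tests


def expected_false_positives(alpha, n_tests):
    """Expected number of chance 'hits' if no SNP had any real effect."""
    return alpha * n_tests


n_snps = 1_000_000  # upper end of the range of SNPs tested, as quoted above
print(bonferroni_threshold(0.05, n_snps))
print(expected_false_positives(0.05, n_snps))
```

Dividing 0.05 by a million tests gives a threshold of about 5 × 10⁻⁸, which is the figure conventionally used for "genome-wide significance".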
When GWAS first started, the idea was that
there might be a dozen, or a few dozen, such variants affecting any given
trait. Each of them alone would thus be having a fairly sizable effect, which
should be detectable as a big frequency difference, in a modest-sized sample. But
early GWAS did not find any such common variants with big effects. Or even with
medium-sized effects. In fact, the frequency differences were so small that
they did not even reach the threshold for statistical significance in the
sample sizes initially used.
This led to the recognition in the human
genetics community that very large samples would be required to find any SNPs
affecting most traits or common disorders. As a result, huge consortia were
formed to pool resources and generate sample sizes large enough for
well-powered GWAS, along with very important replication samples. (This is,
incidentally, a model now being pursued in neuroimaging and other fields).
The output of GWAS
As sample sizes grew, more and more SNPs
were found by GWAS to be statistically associated with many different traits
and disorders. In some cases, hundreds or even thousands of associated SNPs
have now been found. This has variously been hailed as a monumental success or
a huge disappointment. In some ways, how you see it depends on whether you’re a
glass half full or a glass ninety-five per cent empty kind of person.
One thing that should be said is that the
signals detected clearly seem to be “real”. We can be fairly certain of that
because the types of genes impacted by the associated SNPs differ depending on
the trait or condition involved, in ways that make biological sense. So, SNPs
associated with height affect a lot of skeletal and growth factor gene
pathways, those associated with intelligence or schizophrenia affect
neurodevelopmental genes, those associated with autoimmune disorders affect
immune genes, and so on.
The downside is that so many genes have been associated that the signal does not really
zero in on very specific biochemical pathways within those large categories.
Indeed, a recent model suggests that genetic variants in pretty much ALL the genes expressed in the relevant tissue may make some contribution to the
collective effects of common variants for a given trait.
And that brings up the second
disappointment. Even if the effects of any individual associated SNP were too
small to be of consequence (or even to offer much purchase for experimental
study), there was hope that their collective effects could still be sizeable.
But it turned out even these were far from enough to explain the heritability
of these phenotypes.
Measuring the combined effects of common variants
There are a number of different ways that
one can estimate the combined contribution of common variants to a given
phenotype. The first depends on “polygenic scores”. These use a statistical regression
method to add up the effects of all the associated variants and determine how
much of the variance in the phenotype they collectively explain. For any single
variant, its effect size is simply indexed by the frequency difference between
cases with a disorder and controls (or between ends of the spectrum of a
trait). If you see a really big difference in frequency, you can infer a really
big effect (as with smoking and lung cancer).
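At its simplest, a polygenic score is a weighted sum: for each variant, count how many copies of the relevant version a person carries (0, 1 or 2) and multiply by that variant's estimated effect. The sketch below assumes exactly that additive form; the function name and the three effect sizes are hypothetical and not taken from any study.

```python
def polygenic_score(allele_counts, effect_sizes):
    """Weighted sum of per-variant allele counts (0, 1 or 2 copies each)."""
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("need exactly one effect size per variant")
    return sum(count * beta for count, beta in zip(allele_counts, effect_sizes))


# Hypothetical per-copy effects for three variants, in arbitrary trait units.
effect_sizes = [0.02, -0.01, 0.03]

# One made-up person: two copies at the first variant, one at the second, none at the third.
person = [2, 1, 0]
print(polygenic_score(person, effect_sizes))
```

A real score does the same thing over thousands or millions of variants, which is how individually negligible effects can add up.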
The effect sizes of single SNPs are tiny –
almost negligible, in fact – but as you combine them they start to add up. You
can take all of the SNPs that pass the threshold for genome-wide statistical
significance and make a combined polygenic score. However, these typically
explain only a very small fraction of the heritability of a trait – often in
the range of 2-3%.
But you can dig a little deeper, into those
SNPs that came close to significance but didn’t quite reach the mark, with the
assumption that many of them are also really associated. As you go down the
list, the heritability explained increases, until it reaches a point where
you’re just adding noise and it starts to get a little worse. The trouble is
that the variance explained still remains way off the total heritability, for
pretty much all traits and disorders – below 10% in most cases.
There are two possible explanations for
that: one is that that is really the limit of the combined effects of common
variants on the traits in question. The other is that the method designed to
detect their effects in the output of GWAS cannot fully distinguish the signal
from the noise, and remaining effects may well exist.
That second possibility is supported by
findings from another type of analysis (known as GCTA-GREML or related
approaches), which also uses output from GWAS, but in a very different way. The
idea is an extension of twin and family studies, which look at how closely
phenotypic similarity tracks genetic similarity. The same can be done across
samples from the wider population, as we are all distantly related to each
other, to varying degrees. The data used for GWAS allow one to measure the degree
of genetic relatedness between people in the sample.
If you look across many, many pairs of
people, even a small increase in relatedness can be associated with a small
increase in phenotypic similarity. In a sample of a thousand people, you can
perform roughly half a million pairwise comparisons, yielding enough statistical power to
confidently detect even these small effects. It is then possible to extrapolate
that signal and infer the total heritability tagged by all the common variants
in the genome – the ones that can be shared between people who are only
distantly related.
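The reason this approach has so much statistical power is that the number of pairs grows roughly with the square of the sample size. A quick sketch (the function name is hypothetical) counts each pair of people once:

```python
import math


def unique_pairs(n_people):
    """Number of distinct pairs that can be formed, counting each pair once."""
    return math.comb(n_people, 2)


for n in (1_000, 10_000, 21_620):
    print(f"{n:>6} people -> {unique_pairs(n):>12,} pairs")
```

For a thousand people that is 499,500 pairs, so multiplying the sample by ten multiplies the number of comparisons by about a hundred.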
Those numbers still fall far short of the
total heritability, however, ranging anywhere from 15-50%, depending on the
trait. That means a lot of the heritability is still ‘missing’. So where’s it
at?
Possible causes of the missing heritability
1. It was never there. This line
of thought argues that the heritability reported in twin and family studies was
over-estimated due to things like identical twins being treated more similarly
than fraternal twins, or other cryptic environmental confounds. The apparent
failure of GWAS to identify causal variants or explain much of the heritability
is taken by many as evidence that the traits are not so heritable after all.
2. MOAR common variants! Maybe
we’re not sampling all the SNPs effectively, or we haven’t gone to big enough
sample sizes, or the statistical models have some limitations that prevent us
from seeing all of the effects of common variants.
3. Rare variants. As GWAS only
sample common variants in the genome, perhaps the remaining heritability is
explained by rarer variants.
4. Structural variants. Changes to
single bases of the DNA sequence are not the only type of mutation. Deletions
or duplications of whole chunks of chromosomes also arise and can make large
contributions to phenotypes, but are not always easy to detect by GWAS.
5. De novo mutations. Some
variants affecting traits and disorders arise de novo, in the sperm or egg that
fused to create a new person. These can make a big contribution to the
heritability in traditional twin studies (because they are always shared
between monozygotic twins, but never between dizygotic) but don’t show up at
all in population-based studies.
6. Epistatic interactions. The
regression models underlying GWAS-based methods of explaining variance assume
additive interactions only. If there exists considerable non-additivity in how
the effects of genetic variants combine, this could increase the overall
effect.
Apart from the first one, there is some
evidence that all of these factors are at play. As we will see, the traits and
disorders under investigation really are as heritable as twin and family
studies indicated, with much of the heritability being contributed by rare
variants.
Rare variants
The arguments for an important role for
rare variants have always had strong theoretical support from an evolutionary
perspective. When a new variant arises through de novo mutation, what happens
to it depends on what effect it has and how it is acted on by natural selection.
Most mutations have no effect, partly
because protein-coding genes make up only about 3% of the genome, but also
because many changes even in the business end of the genome are well tolerated.
Of the ones that do have an effect, most of them can be characterised as “bad”.
Natural selection has been acting on the human genome for many millions of
years and has produced a finely tuned genetic program for making functional
human beings. Messing with that program at random is just statistically much
more likely to mess it up than to improve it.
The question is: how bad is bad? If it’s
just a small effect, then natural selection is not going to care that much
about it. The individual carrying the new variant may still thrive and
reproduce and the variant may be passed to their offspring, and to their
offspring, and so on, and eventually may become common in the population (or
not, with the outcome being largely down to chance). But if it’s a big effect,
then the individual carrying it may struggle to survive or at least to breed,
and may have fewer offspring than others in the population. Such a variant may
survive in the population for a few generations but will never become common.
Indeed, really severe mutations may be
selected against immediately – people who carry them may never have offspring.
Such severely deleterious variants are nevertheless observed in individuals in
the population as they repeatedly arise through de novo mutation.
So, if you see a variant in an individual
and it is common in the population, it is very likely not having a big effect,
if any. If it had, it wouldn’t be common. Conversely, rare variants can have
larger phenotypic effects, and the ones with the scope to have the biggest
effects are the ones that have arisen de novo in an individual. (Nicely reviewed here).
Assaying the full spectrum of genetic variation
That is where the new study comes in. It is
authored by Pierrick Wainschtein, along with Jian Yang, Peter Visscher and
colleagues. These researchers obtained whole-genome sequence (WGS)
data on 21,620 unrelated individuals of European ancestry. This gave them
access to not only the common genetic variation typically assayed in GWAS, but all
of the genetic variation, including the rare variants.
When they used just the common variants from this set to estimate the heritability
of height and body-mass index (BMI) in this sample, they got results that
replicated prior findings: 0.49 for height and 0.27 for BMI – substantial, but
well off the twin estimates. But when they put information from all of the genetic variants into their
GCTA-GREML model, they got much higher values: 0.79 for height and 0.40 for
BMI. These are within the range of the heritability estimated from twin studies
(0.7-0.8 for height and 0.4-0.6 for BMI).
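A little arithmetic on the figures just quoted shows how much of the estimate comes from each source. The numbers are the ones given above; the dictionary layout and function names are hypothetical, and the calculation treats the estimates as exact, ignoring their standard errors.

```python
# Heritability estimates as quoted in the text above.
estimates = {
    "height": {"common_only": 0.49, "all_variants": 0.79},
    "BMI": {"common_only": 0.27, "all_variants": 0.40},
}


def added_by_rarer_variants(trait):
    """Extra heritability captured once rarer variants are included."""
    values = estimates[trait]
    return values["all_variants"] - values["common_only"]


def share_from_common_variants(trait):
    """Fraction of the whole-genome estimate already captured by common variants."""
    values = estimates[trait]
    return values["common_only"] / values["all_variants"]


for trait in estimates:
    print(
        f"{trait}: +{added_by_rarer_variants(trait):.2f} from rarer variants, "
        f"{share_from_common_variants(trait):.0%} already captured by common ones"
    )
```

On these point estimates, roughly a third or more of the heritability of each trait only becomes visible once the rarer variants are included.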
Now, there are some potential caveats and technical points to consider, as discussed
here for example, by Alexander Young. The
issue of possible cryptic population stratification is a particularly vexing
one for the field in general these days, with the realisation that the typical
method used to correct for it (which relies on principal components analysis)
is not sufficient to fully eliminate its possible contribution. (See here and here, for example).
Despite those caveats,
I think the main result will stand: incorporating rare variants from whole
genome sequence will allow us to detect practically all of the missing
heritability.
The other main finding was that the variants contributing this extra portion of the
heritability tended to be very rare and in low “linkage disequilibrium” with
neighbouring variants. That is a signature of more recent variants, and
suggests that variants affecting these traits have been under negative selection.
(This has a technical consequence discussed below).
However, that does not necessarily imply that higher or lower values of the trait have
been selected for. We don’t know whether the variants affecting the traits tend
to increase or decrease height or BMI, just that they have some effect on it. In
fact, we don’t even know that they are being selected against because of their
effect on that trait. Most genetic
variants are “pleiotropic” – they affect lots of things. So, the negative
selection on rare variants affecting height and BMI could be due to their
effects on other traits or on general fitness.
Interestingly, a couple of papers (here and here) looking at heritability of various traits in populations
of yeast have also just come out and come to the same conclusion – much of the
heritability is explained by rare variants.
Implications
The first major implication is that these traits (and presumably others) are every
bit as heritable as twin and family studies have indicated. That is not really
much of a surprise when it comes to these physical traits – it fits with common
observations that identical twins are really extremely similar in height and
tend to be very similar in BMI. But it does suggest that the estimates of
heritability for other, less overt traits – such as intelligence, or
personality constructs, or risk for psychiatric disease – are also reasonably reliable.
(Noting that heritability is not a fixed, universal, biological constant but
one that applies to specific populations at specific times).
In addition, the fact that much of the heritability lies in rare variants has
important implications for the predictive value of polygenic scores. If these
scores are derived from profiles of common variants alone, as they typically
are (for example in direct-to-consumer genomics), then they will only capture a fraction of the genetic variance, and
therefore will be much less predictive than they could be.
One hope might be that if we fully sequence a decent-sized reference sample of
people from any given population (in the tens of thousands), and we therefore
see all the rare variants that they have and how they are linked to various
local patterns of common variants, then we might be able to infer or impute
what rare variants other people have merely by analysing their own patterns of
common variants (which is MUCH cheaper than actually fully sequencing everyone).
Wainschtein and colleagues tried that approach in their sample and found that it does
indeed allow them to access a bit more of the genetic variance, but not all of
it. To get all of it they had to actually fully sequence the genomes of all the
people. The reason goes back to that observation that the rare variants that
were making the extra contribution were not tightly linked to specific patterns
of common variants. So there was no way to impute their presence from a given
pattern of common variants. Essentially, the ones making the biggest
contribution were so rare and so recent that many weren’t even in the reference
sample.
Epistasis – when things don’t just add up
The typical model used in the generation of polygenic scores assumes that all of
the individual genetic effects combine linearly. That is, the effect of any
given variant on a trait as measured across the whole population (say, +0.1cm
of height) is the actual effect in every individual who carries it, regardless
of the rest of their genetic make-up. Now, that could be true, but we know that
non-linear (non-additive, or “epistatic”) genetic interactions are actually the norm in biology, not some kind of weird exception.
That said, if a given trait is affected by thousands of genetic variants, then all
those non-linear interactions may actually average out, and you will be left
with a sum of effects that does actually look additive, statistically speaking.
Really, you can model it either way and the data will fit pretty well.
Conversely, however, the fewer variants involved, and the larger their individual effects, the
greater the opportunity for pairwise or higher-order epistatic interactions to
have a big effect and cause a deviation from additivity in individuals.
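A toy two-variant model shows how a population-average effect can differ from the effect in any actual individual. Everything here is hypothetical: the function names, the effect sizes (borrowing the 0.1cm figure used above), and the simplification that a person either carries a variant or does not.

```python
def phenotype(has_a, has_b, effect_a=0.1, effect_b=0.1, interaction=0.0):
    """Trait value (cm, relative to baseline) for a carrier/non-carrier of A and B."""
    return effect_a * has_a + effect_b * has_b + interaction * has_a * has_b


def effect_of_a(has_b, **model):
    """What carrying A does for an individual with a given B status."""
    return phenotype(1, has_b, **model) - phenotype(0, has_b, **model)


def average_effect_of_a(freq_b, **model):
    """Effect of A averaged over a population in which B has frequency freq_b."""
    return (1 - freq_b) * effect_of_a(0, **model) + freq_b * effect_of_a(1, **model)


epistatic = {"interaction": 0.2}  # assumed extra effect when A and B occur together
print("A without B:", effect_of_a(0, **epistatic))
print("A with B:   ", effect_of_a(1, **epistatic))
print("population average:", average_effect_of_a(0.5, **epistatic))
```

In this toy case the average effect of A comes out at about 0.2cm, yet no individual experiences that: carriers get roughly 0.1cm or 0.3cm depending on whether they also carry B. With the interaction set to zero, the individual and average effects coincide, which is the additive assumption.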
This brings us to a crucial point – the distinction between explaining variance in a
trait across the population and predicting individual values in that trait.
These are not the same thing at all. You can in fact have quite good
explanation of the variance – even complete access to all the heritability, as
in the Wainschtein study – and still not be able to predict the genetic effects
in individuals very precisely, especially when epistatic interactions are at play.
Unique and unpredictable
In general then, our individual genetic heritage comprises a large background of
ancient, common variants, which make an important collective contribution to
many traits, and a much more unique profile of rare variants, which have larger
individual effects and which are also likely to show more non-linear genetic
interactions.
Even if we capture most of the genetic variance affecting a trait across the
population, it will remain tricky to make exact predictions of individual
phenotypes from genetic information, because those unique profiles will never
have been seen before in reference samples.
And finally, it should be noted that even if we do have access to all the rare
variants and even if we do know all the epistatic effects, most traits are not
completely heritable. Because much of the rest of the variation may be due to
randomness in development, there will always be a strong limit – in principle,
not just in practice – to how predictive polygenic scores can ever be. Which
personally makes me feel – admittedly rather perversely – a bit better.