Eugenics, statistical hubris, and unknowable unknowns in human genetics

A new paper just out in Nature, by Peter Visscher and colleagues (including bio-ethicist Julian Savulescu) explores the idea of polygenic genome editing of human embryos to reduce the risk of common diseases. This is, to say the least, a controversial idea, and a decidedly fantastical one. The authors present the results of statistical modelling which suggests that editing a small number of risk variants in each embryo’s genome could dramatically reduce the risk of a number of common disorders. But there are good reasons (lots of them) to doubt the assumptions on which this modelling is based and to have serious concerns about possibly deleterious unintended consequences of such interventions.

I co-wrote a commentary on the article with geneticist Shai Carmi and law professor and bio-ethicist Hank Greely, outlining some of the limitations of the modelling and our concerns over the dangers of the proposed approach. I’ll expand on those points here.

 

First, the development of CRISPR genome-editing technologies offers the promise of precise intervention in people’s genetic material. Where they carry some mutation that causes disease, it may be possible to use CRISPR tools to edit that mutation back to its “wild-type” sequence or to intervene in some other way that ameliorates the condition. This technique is already being used to edit the genome in some cells of people suffering from known “monogenic” conditions – ones where the disease is caused by a single, well-defined and well-studied mutation – such as sickle-cell disease. Importantly, the cells that are genetically modified in this case are blood-producing stem cells (taken out of the patient, modified, and put back in), not cells in the germline. As a result, the edits to the genome will not be passed on to offspring of these patients.

 

The idea that you could apply this in an embryo known to carry disease-causing mutations (in an in vitro fertilisation scenario) is an obvious extension of this approach, but one that is fraught with ethical issues. This would “cure” the resultant individual in the sense of removing or reversing the disease-causing mutations from all their cells. But it would also mean that they would pass on whatever edits are made to their offspring. That’s a big step – it’s not just treating a single patient, it’s affecting subsequent generations, and, depending on how widely it is deployed, could change the gene pool of the population at large. 

 

Now, you might argue that, for mutations like those causing sickle-cell anaemia or cystic fibrosis or Huntington’s disease, it would just be a good thing if these mutations were indeed removed from the population, or at least reduced in frequency. But good for whom? Who has a legitimate interest in the genetic health of individuals or the population at large? Is this just a question of whether parents should have the right to edit the genomes of their prospective children? Or should we also be concerned with whether the state should actively support (or possibly even mandate) such interventions? When does individual medicine become public health? And when does public health become eugenics? These questions crop up again below.

 

One immediate concern in these discussions is of course whether the CRISPR methods being used are actually: (i) effective, and (ii) safe. That is, can we be sure that they will really succeed in making the desired edits? And can we also be sure that the approach won’t lead to any “off-target effects” or other unpredictable changes elsewhere in the genome? These are very open questions and many researchers are actively working to increase the precision and reliability of CRISPR methods. So, for the sake of argument, let’s assume we really can make just the genomic edits we want.

 

The next question is whether the edits themselves will necessarily have (only) beneficial effects. Answering this depends on how confident we are in our current understanding of the relationships between genotypes and phenotypes. For rare mutations causing serious diseases, this seems fairly straightforward. However, even there, there is evidence for some conditions – sickle-cell anaemia itself, for example – that mutations which cause disease when present in two copies can have a benefit when present in only one copy – in this case providing some protection against malaria. That said, such situations seem to be very rare and most disease-causing mutations are frankly and uncomplicatedly deleterious – it’s just bad to have them.

 

But there are other situations where the waters get considerably muddier. This is exemplified by the discovery that very rare mutations of the gene CCR5, which encodes a chemokine receptor involved in immune functions, confer resistance to infection by HIV. (This is because CCR5 acts as a co-receptor for HIV – one of the means by which the virus attaches itself to immune cells to enable infection). Here is a rare mutation that seems to have beneficial effects – potentially life-saving, in fact.

 

This is the gene that was infamously targeted by disgraced researcher He Jiankui, who, in 2018, announced that he had edited the genomes of human embryos in order to introduce mutations in the CCR5 gene, and that two babies with such manipulated genomes had been born (discussed in detail in this book by Hank Greely). This intervention was widely condemned, for many reasons, including the fact that informed consent had not been obtained, and that the safety and efficacy of the intervention were entirely unproven and, in essence, untested. It was simply experimenting on human beings. He Jiankui was subsequently jailed, although he has since been released and appears unrepentant and intent on further similar efforts.

 

Apart from the obvious ethical issues, there were two major scientific problems with his “experiments”: first, the desired edits were not in fact made precisely, and it is not known whether other off-target effects may also have occurred. And second, even if those precise edits had been made, there is no guarantee that they would only have had positive effects. The CCR5 protein does not just sit around in our cells waiting to be a receptor for a virus that will kill us. It has normal functions, not just in the immune system, but also unexpected ones, for example in the regulation of bone physiology and even in synaptic plasticity and learning and memory. Knocking it out is thus highly likely to have deleterious effects on other functions – what geneticists call “pleiotropic” effects (from the Greek pleio, meaning “many”, and tropos, meaning “ways” or “turnings”).

 

This case thus illustrates a crucial point: rare mutations that are “protective” for one disease (i.e., statistically associated with reduced risk) are quite likely to have negative pleiotropic effects on other traits. That is, they may be rare for a reason (even if we don’t know what that is). This point is important for the current discussion, because it is precisely these kinds of rare protective mutations that Visscher and colleagues propose to introduce into human embryos.

 

 

Targeting common disorders

 

The target of Visscher and colleagues is not rare monogenic conditions or resistance to infectious agents like viruses. They are concerned, rather, with common disorders, specifically: coronary artery disease, Alzheimer’s disease, major depressive disorder, diabetes, and schizophrenia. These conditions, which collectively affect millions of people worldwide, are substantially, but not completely, heritable. That is, a substantial proportion of the variance in risk for these conditions across the population is due to genetic variation. (Roughly speaking, risk for these conditions “runs in the family”.)

 

But unlike conditions like cystic fibrosis or sickle-cell anaemia, these conditions are not monogenic – they are not caused by mutations in one specific gene. They are polygenic – that is, the genetic risk is due to the combined effects of thousands (or even millions) of genetic variants across the genome. Over the past couple of decades, researchers have used a technique called genome-wide association studies (or GWAS) to identify specific genetic variants (known as single-nucleotide polymorphisms or SNPs) associated with risk for these various conditions.

 

Each SNP is a site in the genome where the DNA sequence comes in a couple of versions (or “alleles”) across the population – it might be a “C” in some people and an “A” in others, for example. It turns out there are millions of such polymorphic sites in the human genome. If we want to look for variants associated with disease, we simply have to ask whether the frequency of one form (say the “C” version) is higher in people with the condition than without. (That’s exactly analogous to asking whether the frequency of smoking is higher in people with lung cancer than without).
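To make that concrete, here is a minimal sketch of the core GWAS comparison at a single SNP: count alleles in cases and controls and test whether the frequencies differ. The counts below are invented purely for illustration; real studies do this at millions of sites, with corrections for confounders such as population structure.

```python
# Minimal sketch of a single-SNP case/control comparison (hypothetical counts).
from scipy.stats import fisher_exact

# Each person contributes two alleles, so 5,000 cases give 10,000 alleles.
cases_C, cases_A = 5300, 4700        # allele counts in cases
controls_C, controls_A = 5000, 5000  # allele counts in controls

table = [[cases_C, cases_A],
         [controls_C, controls_A]]

odds_ratio, p_value = fisher_exact(table)
print(f"C frequency in cases:    {cases_C / (cases_C + cases_A):.3f}")
print(f"C frequency in controls: {controls_C / (controls_C + controls_A):.3f}")
print(f"Odds ratio = {odds_ratio:.2f}, p = {p_value:.2g}")
```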

 

These efforts have been highly successful. Thousands of such “risk variants” have been discovered for the common disorders mentioned above, and for many others. Each such variant typically has a tiny statistical effect on risk when considered singly (which is why samples of tens or even hundreds of thousands of people are needed to detect the tiny differences in frequency between cases and controls). But considered collectively, they can account for a substantial proportion of the total genetic variance in risk.
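One standard way of combining them is a polygenic score – a weighted sum of a person’s risk-allele counts. Here is a minimal sketch, with simulated genotypes and invented effect sizes, of how thousands of individually negligible effects can still spread individuals out along a continuum of genetic risk.

```python
# Minimal polygenic-score sketch with simulated data (illustration only).
import numpy as np

rng = np.random.default_rng(1)
n_people, n_snps = 1_000, 5_000

freqs = rng.uniform(0.05, 0.95, n_snps)                 # risk-allele frequencies
genotypes = rng.binomial(2, freqs, (n_people, n_snps))  # 0/1/2 risk-allele counts per person
betas = rng.normal(0.0, 0.01, n_snps)                   # tiny per-SNP effects (log-odds scale)

scores = genotypes @ betas          # each person's additive polygenic score
print(f"Mean |per-SNP effect|:            {np.abs(betas).mean():.4f}")
print(f"SD of polygenic scores (people):  {scores.std():.2f}")
```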

 

Now, if that’s the situation, you might be wondering how genome editing could possibly ever be profitably applied to these conditions. If each person’s genetic risk is due to thousands of such SNPs, then how could editing only a few of them make any appreciable difference? That is exactly what Visscher and colleagues explore, and they present results of statistical modelling which suggest that editing even 5-10 risk variants could have substantial effects on risk – huge effects, in fact! (This is why the paper is in Nature 😉).

 

They estimate that editing a set of 10 of the variants with largest individual effects (and highest frequency in the population) could reduce the risk of common diseases by anywhere from 3-fold to 60-fold, depending on the condition!
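The kind of arithmetic that produces numbers like these comes from the liability-threshold tradition discussed further below. Here is a minimal back-of-the-envelope sketch – my own illustration with invented parameters (a 1% prevalence, a 0.1 standard deviation liability shift per edit), not the authors’ actual model – of how modest shifts in liability translate into large fold-reductions in risk.

```python
# Minimal liability-threshold sketch (invented parameters, not the paper's model).
from scipy.stats import norm

prevalence = 0.01                      # assume a 1% lifetime risk for the disorder
threshold = norm.ppf(1 - prevalence)   # liability threshold on a standard normal scale

# Assume each edit shifts liability down by 0.1 SD and the shifts simply add up,
# which is the model's key assumption.
per_edit_shift = 0.1
n_edits = 10
total_shift = per_edit_shift * n_edits

new_risk = norm.sf(threshold + total_shift)   # chance of still crossing the threshold
print(f"Baseline risk:          {prevalence:.3%}")
print(f"Risk after {n_edits} edits:    {new_risk:.3%}")
print(f"Fold reduction:         {prevalence / new_risk:.1f}x")
```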

 


The fact that they selected risk variants that were at very high frequency in the population is important. It might seem odd that there could be sites in the genome where by far the most common variant (at frequencies of 80-99%) is a risk variant for a disorder. That is almost like saying that having a typical genome increases your risk of these conditions, which seems a bit weird. But if you look at it from the opposite direction, you can interpret it instead as evidence that the rare allele at each of those sites (the one present at a frequency of only 1-20%) is actually protective against the disorder. Those are just two different ways of expressing the same statistical observation of a frequency difference between cases and controls.

 

Focusing on those relatively rare protective variants is a promising approach, as most people will carry the common versions at most of the 10 sites chosen. (Of course, since we actually have two copies of each chromosome, there are really 20 alleles at issue across those 10 sites). That means that those sites represent a fairly generic target for editing, which is predicted to be applicable to most people in the population. So, no need to customise – this is a one-size-fits-all genetic prophylactic! That’s the theory, at least…

 

The statistical modelling, taken at face value, suggests that this approach could be hugely beneficial to individuals, and, by extension, to the health of the population as a whole (if widely deployed). The question is whether we should take it at face value. The modelling is based on lots of assumptions, which the authors, to their credit, take some pains to spell out. If these assumptions do not hold, the predicted efficacy may be undermined. Moreover, there are additional safety concerns that are just not part of the model at all – they arise from possible pleiotropic and “epistatic” (non-additive) effects that are highly unpredictable.

 

 

Is a simple linear model appropriate?

 

The first assumption with this approach is that we know what the causal variants are at the relevant sites in the genome. It turns out that most of those SNPs are not doing anything themselves – they just act as tags for bits of chromosomes that tend to get co-inherited in chunks. The real causal variant is usually hanging around somewhere nearby, but it is often difficult – I mean really difficult – to identify the true culprit. So, even after GWAS has identified associated SNPs, there’s still a lot of work to be done to identify the causal variants, with no guarantee of success. But, for the sake of argument, let’s imagine a future world in which that work has been done and we have perfect knowledge of the causal variants at each site.

 

Even granting that, a more crucial assumption of the modelling is that the effects of these different genetic variants will simply add up, in a statistical sense. That is, the decrease in risk due to any individual variant will be independent of the presence or absence of other variants in the genome. You can just keep adding protective variants to a genome and you’ll keep seeing the same relative reductions in risk. Is this a reasonable assumption? From one perspective, yes, and from another, not at all.

 

The difference hinges on whether we are talking about “statistical epistasis” or “biological epistasis” (reviewed here and here). Since people first started doing experimental genetics with model organisms – pea plants, fruit flies, bacteria, yeast, nematodes, mice – they’ve recognised that the effects of one mutation can be suppressed or enhanced by the presence of other mutations. They called mutations that suppress the effects of other mutations “epistatic” to them (“standing above”). This is just an extension of the better-known ideas of dominance and recessiveness, which capture the non-additive relationships of mutations at a single gene. Epistasis captures the same kind of non-additive effects of mutations in different genes.  


This phenomenon is commonplace – ubiquitous, really. It is a mainstay of experimental genetics in model organisms, as geneticists use these non-additive relationships to work out functional relationships between different gene products, delineating biochemical pathways and regulatory networks. More generally, it is extremely common to find that the phenotype caused by some mutation is highly dependent on the “genetic background” in which it is found. For example, the phenotype may change, drastically in some cases, if a mutation is crossed into a different strain of flies or mice.

 

Overall, as I discuss at greater length here, the evidence from experimental genetics in model organisms suggests that the relationship between genetic variation and phenotypic variation is highly complex, non-linear, and unpredictable.

 

But now here’s a surprise. When people look for these kinds of non-additive effects in human genetics – at least in the genetics of common traits and disorders – they typically don’t find them. To be more precise, they don’t seem to make much of a statistical contribution at the population level of analysis. The same kinds of effects absolutely do hold when looking at rare mutations with large effects on disease risk. There are loads of examples of digenic interactions that result in disease or that modify the severity of symptoms. And loads of evidence that the effects of such mutations are sensitive to the genetic (polygenic) background (e.g., here, here, and here).

 

However, when we just examine the collective effects of (relatively) common genetic variants of the sort tagged by SNPs in GWAS, we typically see that a simple, linear model of purely additive effects does a pretty good job of capturing most of the genetic effects across the population. That seems weird. If two or three mutations can show big non-additive effects, shouldn’t we expect thousands of them to show all kinds of higher-order effects, so that things might really get massively non-linear? Well, it seems not, for a few reasons.

 

First, the SNPs in question only have small effects by themselves (because mutations with big deleterious effects are rapidly selected against and never rise to an appreciable frequency in the population). This means they’re also less likely to have strongly deleterious epistatic effects – just because they’re not doing much, biologically speaking. So, the class of genetic variants that dominates the genetic architecture of complex traits in natural populations is very different from experimentally induced mutations in model organisms, or from rare (recently arising) mutations in humans that have very large effects by themselves (and in combinations).

 

Second, when there are rarer (but still relatively common) SNPs that do have non-additive pairwise interactions, the statistical contribution of these epistatic effects to the overall genetic variance across the population depends on both the size of these effects and the frequencies of the variants. The rarer they are, the less they’ll be seen in combination. So any specific epistatic effect may only show up in some very small number of individuals, and thus will not make a big contribution to how genetic variation manifests across the whole population. In addition, the statistical method that is used to divide up the variance just prioritises the additive contribution, even though lots of non-additive interactions are included in that term.

 

As Visscher and Wray write: “Because of the dependency of the variance components on allele frequency, a strong deviation from additive gene action can result in mostly additive genetic variation.”
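To see how that can happen, here is a minimal simulation of my own (not from either paper). Gene action at two loci is made completely non-additive – carrying either rare variant shifts the trait, but carrying both adds nothing extra – yet a purely additive linear model fitted across the population captures nearly all of the genetic variance, simply because double-carriers are so rare.

```python
# Minimal illustration: strongly non-additive gene action, mostly additive variance.
import numpy as np

rng = np.random.default_rng(42)
n = 500_000
freq = 0.02                          # two rare variants, each at 2% allele frequency
g1 = rng.binomial(2, freq, n)        # allele counts at locus 1
g2 = rng.binomial(2, freq, n)        # allele counts at locus 2

# Completely non-additive gene action: carrying EITHER variant shifts the
# trait by 1, but carrying both adds nothing extra (one masks the other).
phenotype = ((g1 > 0) | (g2 > 0)).astype(float)

# Fit a purely additive (linear) model on allele counts.
X = np.column_stack([np.ones(n), g1, g2])
beta, *_ = np.linalg.lstsq(X, phenotype, rcond=None)
additive_fit = X @ beta

print(f"Total genetic variance:       {phenotype.var():.5f}")
print(f"Captured by additive model:   {additive_fit.var():.5f}")
print(f"Fraction additive:            {additive_fit.var() / phenotype.var():.2f}")
```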

  

All of these factors mean that simple, linear models of genetic “liability” for common disorders actually do a reasonable job of capturing the collective effects of common genetic variation across the population. It is that tradition that underpins the modelling in the current paper. The problem with this approach is that the proposed interventions would not be done on the population – they’d be done on specific individuals. So, we have to ask, for such individuals: would the intervention be as effective as predicted, and would it be safe?

 

The answer to both of these questions is: we just don’t know. In fact, it may be unknowable, without doing the experiment. For all practical purposes, it may be impossible to predict either efficacy or safety with any certainty. We simply can’t capture all the possible complexities of the biology of (genuinely) complex disorders with simple linear models based on population-averaged data.

 

 

Will the effects be as large as predicted?

 

The variants that Visscher and colleagues propose to introduce are, specifically, relatively rare “protective” ones. Their modelling depends on the idea that their protective effects will simply add up. But now we’re in the space where biological epistasis really matters, because we’re talking about combining rare alleles in ways that don’t normally occur in the population. The only estimate we have of their protective effects is averaged across genomic backgrounds that do not carry many of the other rare variants.

 

Based on experience from model organisms (see here for detailed examples), it seems very possible, even highly likely, that combining these kinds of rare variants would result in diminishing returns – that is, their protective effects would be reduced if other protective variants are already present.
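To get a feel for how much this could matter, here is a purely hypothetical extension of the liability-threshold sketch above: the same ten edits, but with each successive edit contributing only 70% as much as the one before it. The 70% figure is invented solely for illustration, yet it is enough to collapse the predicted fold-reduction in risk.

```python
# Hypothetical comparison: additive gains vs. diminishing returns (illustration only).
from scipy.stats import norm

prevalence = 0.01
threshold = norm.ppf(1 - prevalence)

additive_shift = 0.1 * 10                                  # ten edits, effects simply add
diminishing_shift = sum(0.1 * 0.7**i for i in range(10))   # each edit worth 70% of the last

for label, shift in [("Additive model     ", additive_shift),
                     ("Diminishing returns", diminishing_shift)]:
    risk = norm.sf(threshold + shift)
    print(f"{label}: fold reduction in risk = {prevalence / risk:.1f}x")
```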

 

As Sackton and Hartl wrote in a very relevant article entitled “Genotypic Context and Epistasis in Individuals and Populations” in 2016:

 

“Paradoxically, the effects of genotypic context in individuals and populations are distinct and sometimes contradictory. We argue that predicting genotype from phenotype for individuals based on population studies is difficult and, especially in human genetics, likely to result in underestimating the effects of genotypic context.”

 

Such effects would obviously reduce the efficacy of the intervention, as Visscher and colleagues concede: “Quantitatively, the effect of epistasis would be similar to what was modelled for G x E interactions (Fig. 2), where a reduction in actual outcomes compared to what is predicted in the absence of G x G interactions.” However, one could still argue that even reduced gains would be worth pursuing, as long as there are no appreciable risks. The problem is that the authors do not fully explore the possible dangers of their proposed scheme.

 

 

Would it be safe?

 

As stated above, the first challenge will be to correctly identify the causal variants linked to our rare “protective” SNPs. This is not trivial, and experience has shown it is quite possible to get it wrong. Editing the wrong variant (one that is not in fact protective for the disorder, but which may have other effects) would obviously be a potential risk.

 

But, for the sake of the thought experiment, let’s presume we’ve been able to find the right ones. The causal variants will presumably also be rare, perhaps a good bit rarer in fact. The problem with introducing rare variants, as stated above, is that they may be rare for a reason. That is, a variant that is statistically protective against some common disease may also have negative (pleiotropic) effects on other phenotypes and may consequently be under negative selection. After all, for many of the linked causal variants, we would be talking about replacing a much more common, presumably “wild-type” (or ancestral) version of the DNA sequence with a more recently arising mutant version. They might have some protective effect but they might also influence all kinds of other things. Recent results in humans reinforce this expectation.

 

If all we are basing our models on are statistical data for the diseases themselves, then we will have no window onto these other possible effects – they will be unknown unknowns. But the problem gets even worse when we consider the possibility of epistasis and pleiotropy acting together. Not only do we not know what possible negative effects these “protective” variants might have on their own, we really don’t know what negative effects they might have in combination. And there’s no way we could find out without doing the experiment, because it’s quite likely that no one on earth carries the rare versions at both copies of all ten of the sites being edited. So now we’re talking about unknowable unknowns.

 

Visscher and colleagues acknowledge the potential for negative pleiotropic effects: “Pleiotropy is the norm in genetics. Variants associated with decreased disease can also be associated with other diseases and traits and may increase their risk.” But the way that they assess the possible consequences of such effects is frankly troubling. They consider protection against disease as increasing fitness and other possible negative effects as simply decreasing fitness. They then state that overall gains in fitness will be reduced if negative effects arise:

 

“We modelled a possible deleterious effect of [heritable polygenic editing] on fitness using a model of stabilizing selection (Supplementary Note 2), and the results are shown in Supplementary Fig. 3. These results imply that the fitness of edited genomes can be substantially reduced if the phenotypic change is large and the trait is under a strong stabilizing selection. The consequences of pleiotropy mean that the actual effects of HPE on disease prevalence will be less than those predicted in this model, and the overall effects on quantitative traits not as strong as predicted.” (My emphasis in bold).

 

This framing amounts to saying that there is some level of acceptable casualties (individuals with “reduced fitness”) from these interventions, if the overall population gain is still positive. These complicated and unpredictable negative effects are just seen as reducing the overall efficacy of the intervention across all the edited genomes. This seems very much like a move from concerns of individual welfare to concerns of public health, and even from public health to eugenics.

 

It sounds very like the way animal breeders might talk about the effects of selective breeding, where they can accept undesirable side effects for some individuals, in order to get a change in the average of some desired trait in the next generation. Actually, even some in the animal genetics community are not comfortable with the idea that we should engage in widespread genome editing of livestock animals, for the same reasons – we just don’t know enough about the genetic architecture of complex traits.

 

Visscher and colleagues do consider the issue of safety, among various ethical concerns: “Heritable polygenic editing (HPE) introduces new combinations of variants that can be dangerous or unsafe. It would be unethical to impose this uncertain risk on future generations”.

 

Their response is: “It is vital that any use of HPE be supported by rigorous safety data and have a clear justification through a risk/benefit balance. In addition, natural reproduction generates new combinations of variants. One strategy is to limit HPE to variant combinations already seen in existing populations.”

 

And they go on to conclude that: “Before such applications are considered, it is important to conduct further research on the effect of polygenic variants on individuals in natural populations, including the lifelong consequences of carrying rare protective alleles.”

 

That cautious attitude is certainly warranted, but it is a far cry from the modelling they present and the sensational gains they claim are possible. If you don’t think editing all those sites is a good idea, why present that as the centrepiece of your paper?

 

Sackton and Hartl discuss the relevance of genetic complexities for genome editing specifically. They end their article with this statement: “Probably the best that one can do under the circumstances, using present methods, is to make predictions based on the main, additive effects of alleles, recognize the uncertainty of such predictions, and hope for the best.”

 

 “Let’s edit your baby’s genome and hope for the best” is not exactly a compelling marketing slogan. It’s certainly not a responsible medical attitude, where the dictum “first do no harm” is supposed to hold.

 

 

Eugenics and public attitudes to genetics

 

Overall, Visscher and colleagues present a fantastical scenario, based on all kinds of technical assumptions that may never be realised – primarily that CRISPR editing can be made completely precise and that we will be able to identify the causal variants linked to associated SNPs. They go on to model the effects of gene editing with a simple, linear model that we know will not capture the possible biological complexities in individuals. And – in some places in their paper, at least – they treat possible negative consequences (real harms to real individuals) as simply reducing the efficacy of their proposed interventions across the pool of edited genomes.

 

It’s really not clear to me what the aims of the authors are in publishing this kind of speculative analysis. It seems the height of arrogance and statistical hubris to assume that their simple linear models based on population-averaged data can capture the biological complexities of genotype-phenotype relations in individual human beings. (Though the underlying sentiment certainly aligns with the zeitgeist in the tech world these days). 

 

While in some places Visscher and colleagues strike a cautious tone, in others they explicitly argue for the uptake of this kind of approach, in terms that are frankly eugenic:

 

“In the long term, there may be an obligation to pursue and develop technologies such as HPE. Mildly deleterious mutations that escape natural selection because of better medical care are predicted to accumulate in the gene pool (62). Previously published models suggest that the effect of this ‘genetic load’ might manifest itself as physical and mental deterioration in only a few generations (62). However, this concept is controversial, and the conclusions are debated (63–65). If we take seriously the idea of leaving future generations in a better state than the current generations, then we have reason to provide them with the preconditions for a good life. This includes access to clean water, unpolluted air, education and shelter, and may include the use of HPE to lower the genetic risk of disease.”

 

This concern with ‘genetic load’ is a mainstay of eugenic thinking, but there is no good evidence that it is actually increasing or that we are facing the inevitable degradation of the human genome if we don’t intervene somehow. Appealing to it here in this offhand way, without the nuanced discussion it requires, seems highly irresponsible, especially when eugenics tropes are reappearing with increasing regularity in the public sphere.

 

Genetics will continue to have a greater and greater impact on our societies. We will need to have all kinds of ethical discussions about what it means and, as scientists, do lots of work to explain the complexities and nuances to the wider public. Indulging in the kind of statistical fantasies presented in this paper is unlikely to help these efforts.

 

Note that the opinions expressed above are my own and should not be attributed to the co-authors of our commentary. We conclude that piece in a more measured fashion:

 

“Meanwhile, other technologies in reproductive genetics are already available or are around the corner, including embryo, fetus and newborn whole-genome sequencing. Each comes with profound clinical, ethical and societal questions. Upcoming population-scale genome-sequencing efforts raise further urgent questions of privacy, stigmatization and discrimination. Is it wise to distract stakeholders, including the public, with a technology that is still a long way off at best, and might never actually be safe?”

 

 
