
Tuesday, July 22, 2014

Exciting findings in schizophrenia genetics – but what do they mean?

A paper published today represents a true landmark in psychiatric genetics. It reports results of a genome-wide association study (GWAS) of schizophrenia, involving 36,989 cases and 113,075 controls. Assembling this sample required collaboration on a massive scale, with over 300 authors involved. This huge sample gives unprecedented statistical power to detect genetic variants that predispose to disease, even if their individual effects on risk are tiny. The study reports 108 regions of the genome where genetic differences affect risk of disease. This achievement is rightly being widely celebrated and reported, but what do these results really mean?

GWAS look at sites in the genome where the particular base in the DNA sequence is variable – it might sometimes be an “A”, other times a “T”, for example. There are millions of such sites in the human genome (which comprises over 3 billion bases of sequence). Each such site represents a mutation that happened some time in the distant past, which has since been inherited and spread throughout the population, while not supplanting the previous version completely. This leaves some people with one version and some with another – these different versions are thus called “common variants”. [More correctly, since we each have two copies of each chromosome, each of us carries two copies of each variable site, so the combined genotype could be AA, AT or TT, in the example above].

The idea of a GWAS is to look across the entire genome at over a million such variants for ones at higher frequency in disease cases than in controls. That difference in frequency might be very minor (say, the “A” version might be seen at a frequency of 30% in cases but 27% in controls), but with such a huge sample size, that kind of variation can be statistically significant. In epidemiological terms, the variant that is more common in cases is termed a “risk factor” – if you have it, you are statistically more likely to be in the case group than in the control group. (Just as smoking is more common in people with lung cancer than in people without, although in that case the difference in frequency is massive).
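
To put rough numbers on that, here is a minimal sketch in Python. It assumes a simple two-proportion z-test on allele counts (two alleles per person), which is a simplification of mine and not the analysis pipeline used in the paper. The function names and the 500-versus-500 comparison sample are made up for illustration; the 30% and 27% frequencies are the hypothetical ones from the paragraph above.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two observed proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def p_value(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))

def odds_ratio(p1, p2):
    """Odds of carrying the allele in group 1 relative to group 2."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

# Sample sizes reported for the study; each person contributes two alleles.
cases, controls = 36_989, 113_075
z_large = two_proportion_z(0.30, 2 * cases, 0.27, 2 * controls)

# Hypothetical small study with the same allele frequencies (assumption).
z_small = two_proportion_z(0.30, 2 * 500, 0.27, 2 * 500)

print(f"large sample: z = {z_large:.1f}, p = {p_value(z_large):.1e}")
print(f"small sample: z = {z_small:.2f}, p = {p_value(z_small):.2f}")
print(f"odds ratio   = {odds_ratio(0.30, 0.27):.2f}")
```

Under these assumptions, the same three-point difference in frequency clears the conventional genome-wide significance threshold (p < 5e-8) easily in the large sample, but would not even reach p < 0.05 with 500 cases and 500 controls.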

For any individual common variant, the increased statistical risk is tiny – most increase risk by less than 1.1-fold. But the idea is that the combined risk associated with a large number of such variants could be quite large – large enough to push people into disease. Since the variants are common, each of us will carry many of them, but some people will carry more than others. This will generate a distribution of “risk variant burden” across the population. If there are 108 sites, each in two copies, then the range of that distribution could theoretically be from 0 to 216 risk variants. The actual distribution is far narrower however, with the vast majority of the population carrying somewhere between 90 and 130 risk variants (assuming the relative frequencies of the two variants are around 50:50, on average).
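
The narrowness of that distribution can be checked with a quick sketch. It treats the number of risk variants a person carries as a binomial draw over 216 alleles, assuming (as in the paragraph above) that each risk variant has a frequency of 50% and, additionally, that the loci are inherited independently. Both are simplifications, and the variable names are mine.

```python
from math import comb, sqrt

N_ALLELES = 2 * 108   # 108 loci, two copies each
P_RISK = 0.5          # assumed average risk-allele frequency

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

mean_burden = N_ALLELES * P_RISK
sd_burden = sqrt(N_ALLELES * P_RISK * (1 - P_RISK))

# Share of the population expected to carry between 90 and 130 risk variants.
share_90_130 = sum(binom_pmf(k, N_ALLELES, P_RISK) for k in range(90, 131))

print(f"mean = {mean_burden:.0f}, sd = {sd_burden:.2f}")
print(f"share carrying 90-130 risk variants = {share_90_130:.3f}")
```

With a standard deviation of only about seven variants around a mean of 108, nearly everyone falls within the 90 to 130 range, even though the theoretical range runs from 0 to 216.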

GWAS are premised on the “liability-threshold” model, which suggests that though there is a smooth distribution of genetic burden (or liability) across the population, only those above a certain threshold become ill (say the top 1% in the case of schizophrenia). This is known as a polygenic model of risk because it assumes the causal action of a large number of genes in any individual.
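
As a rough illustration of the liability-threshold idea, the sketch below places a threshold on a standard normal liability scale so that the top 1% are affected, and then asks what happens to risk in a hypothetical subgroup whose average liability is shifted upwards. The `shift` parameter and the function name are mine, and the normal distribution of liability is the model's assumption, not something that can be measured directly.

```python
from statistics import NormalDist

PREVALENCE = 0.01                       # top 1% affected, as in the text
liability = NormalDist(mu=0.0, sigma=1.0)
threshold = liability.inv_cdf(1 - PREVALENCE)

def risk_given_shift(shift):
    """Risk for a subgroup whose mean liability is `shift` SDs above
    the population mean, with the spread left unchanged (assumption)."""
    return 1 - NormalDist(mu=shift, sigma=1.0).cdf(threshold)

print(f"threshold = {threshold:.2f} SD above the mean")
for shift in (0.0, 0.5, 1.0):
    print(f"shift {shift:.1f} SD -> risk {risk_given_shift(shift):.3f}")
```

In this toy version, the threshold sits about 2.3 standard deviations above the mean, and even a subgroup shifted by a full standard deviation still has a risk below 10%.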

An alternative model views common disorders such as schizophrenia as arising mainly due to very rare mutations of large effect, but in different genes in different individuals (and with the possibility of modifying effects of other variants in the genetic background). This scenario is known as genetic heterogeneity. Many such rare, high-risk mutations are known, but the ones we currently know about collectively account for only 10-15% of cases of schizophrenia (and perhaps 30% of cases of autism).

So, with that as background, let’s consider what the GWAS signals mean, individually and collectively. First, GWAS signals are a bit like #Greenfieldisms: they point to a locus and they point to an increased statistical risk of disease – that is all. This is because the common variant that is interrogated is being used as a tag of wider genetic variation at that locus (a locus is just a small region of the genome). Chromosomes tend to be inherited in large chunks without too much mixing (or recombination) between the two copies present in each parent. That means that one common variant at one position will tend to be co-inherited with other common variants nearby. The signal derived from GWAS is associated with one of those (or sometimes several), but tags a lot of additional variation.

Generally, the presumption is that one of the common variants is having a causal effect and the others are merely passengers. However, there are also lots of rare mutations that come along for the ride. These are mutations that arose much more recently and that are therefore present in far fewer individuals. Though GWAS can’t see them directly, any such mutation necessarily arises on the background of a particular set of common variants (called a haplotype). Most people with that haplotype will not carry the rare mutation, but it may be possible that several such mutations in the population (if they are of large effect and thus found mainly in cases) can give an aggregate signal that boosts the frequency of the common haplotype in cases, resulting in a GWAS signal (driven by a “synthetic association”). Many examples of such cases are now found in the literature, for other conditions, though it is not clear if synthetic associations drive any of the signals in the most recent schizophrenia study.
http://www.ncbi.nlm.nih.gov/pubmed/22269335

It is striking, however, that many of the loci implicated by GWAS signals are known to sometimes carry rare mutations that dramatically increase risk of disease. Some of the 108 loci implicated contain only one gene, but some encompass many, while others have no gene in the region or even nearby. Cases where the implicated gene is clear include genes like TCF4, CACNA1C, CACNB2, CNTN4, NLGN4X and multiple others, where rare mutations are known to cause specific genetic syndromes. Moreover, there is substantial enrichment in the GWAS loci for genes in which rare mutations have been discovered in cases with schizophrenia, autism or intellectual disability (including CACNA1I, GRIN2A, LRP1, RIMS1 and many others).

These findings strongly reinforce the validity of the GWAS results and also suggest that many of the loci identified sometimes carry rare, high-risk mutations that should be very informative for follow-up mechanistic studies. Whether the GWAS signals themselves are driven by such rare mutations in the samples under study is an open question. (Another paper just out suggests that signals from the GRM3 locus, which encodes a metabotropic glutamate receptor, may be driven by a rare variant that increases risk of mental illness generally by about 2.7-fold). But there are also many examples of loci where both rare and common variation is known to play a role in disease risk and the GWAS signal could well be driven purely by common variants with direct functional effects.

However, such effects need not be tiny in individuals, even if their overall signal of increased risk across the population is very small. We know of many examples of common variants that strongly modify the effects of rare mutations, at the same locus or at one encoding an interacting protein. In such cases, the common variant may increase risk of expression of a disorder due to a rare mutation, but essentially have no effect in most of the population who do not carry such a rare mutation. This situation is exemplified by Hirschsprung’s disease, a condition affecting innervation of the gut. It can be caused by rare mutations in any of 18 known genes. However, such mutations do not always cause disease and the range of severity is also very wide. Common variants at several of those same risk loci have been found to be much more frequent in people with rare mutations who develop disease than in those with the same mutations who remain healthy. When averaged across the population, as in a GWAS study, such effects would yield only a tiny average increase in risk, but this may reflect a large effect in a small subset of people and no effect in the majority.

This brings us to a larger point – what do the GWAS signals tell us collectively? More specifically, should they be taken as evidence in support of a polygenic model of disease risk, where it is the collective burden of common risk variants that causes the majority of disease cases?

One way to test that is to model the variance of the “liability” to the disease, which is actually an unmeasurable parameter, but which is assumed to be normally distributed in the population. With that and a number of other assumptions in place, one can then ask how much of the variance in this trait is accounted for by the loci identified by the GWAS? The authors state that a combined risk profile score “now explains about 7% of variation on the liability scale to schizophrenia across the samples”. That is an improvement over previous studies (the first 13 loci accounted for about 3%), but certainly not as much as might have been expected under a purely polygenic model.

In fact, the GWAS data are also fully consistent with a more complex model of genetic heterogeneity, which involves common variants interacting with rare variants to determine individual risk. Population averages of their effects remain just that – statistical measures that cannot be applied to individuals. Even combining all the common variants to generate a risk profile score does not generate a predictive measure of risk for individuals. (One reason for that is that non-additive genetic interactions that are likely highly important in individuals are averaged out by population-level signals).

So, the current study points the finger at a large set of new genes, but does not really discriminate between models of genetic architecture. The overlap between the GWAS signals and the genes known to carry rare, high-risk mutations certainly suggests that the GWAS has been successful in identifying important risk loci – a tremendous advance for which the authors should be congratulated (as well as for their willingness to collaborate on this level). This is, however, just a first step in understanding the biology of the disease. The underlying genetic heterogeneity presents a tremendous challenge but also an opportunity, as individual high-risk mutations can be followed up in functional studies to elucidate some of the mechanisms through which a change in some piece of DNA can ultimately produce the particular psychological symptoms of this often-devastating disease.

Tuesday, July 8, 2014

"Common disorders" are really collections of rare genetic conditions


Disorders such as autism, schizophrenia and epilepsy each affect about 1% of the population and are therefore defined as “common disorders”. But are they really? I mean, they are clearly really that common, but are they really “disorders”? Are they natural categories that reflect some shared underlying etiology or are they simply groupings based on sets of shared symptoms? Genetics is providing an answer to that question and demonstrating that so-called “common disorders” are really collections of rare disorders with similar symptoms. This represents a complete paradigm shift in psychiatry, the full ramifications of which have yet to be appreciated.

http://directorsblog.nih.gov/2014/01/28/exploring-the-complex-genetics-of-schizophrenia/
We have known for decades of examples of rare genetic syndromes that can include symptoms of autism spectrum disorder (such as Fragile X syndrome or Rett syndrome) or of schizophrenia (such as velo-cardio-facial syndrome, now called 22q11 deletion syndrome), while epilepsy is a known symptom of many genomic disorders. But such examples were typically thought of as exceptional and distinct from the much larger group of idiopathic cases of ASD, SZ or epilepsy. (Idiopathic simply means of currently unknown cause). Such conditions were often dismissed as not “real autism” or “real schizophrenia”, despite the fact that clinicians could not make any such assessment based on symptoms alone.

Instead, it was widely held to be a proven fact that the genetics of ASD and SZ generally followed a very different mode – rather than being caused by single mutations, as with the syndromes mentioned above, the idea was that the idiopathic cases were caused by combinations of tens or hundreds (or even thousands) of minor genetic differences, each with only a tiny effect on its own, but collectively sufficient to result in disease if enough of them were inherited.

Modern genomic technologies are revealing that this supposed dichotomy between rare and common disorders is artificial – merely a reflection of our current state of knowledge (or, more correctly, our current state of ignorance). Over the past five years, researchers have discovered many more rare genetic conditions that manifest with psychiatric symptoms, and which collectively can account for an ever-growing percentage of patients presenting with ASD or SZ. These include deletions or duplications of whole chunks of chromosomes, often affecting many genes, as well as mutations that affect only one gene.

[The DNA sequence of each gene codes for production of a specific protein. Genes are strung along chromosomes, like the information encoding successive songs on a cassette tape. (I may be showing my age with this reference!) Localised damage to the tape at one specific point can affect just one song, but cutting out a whole section could remove or disrupt multiple songs at the same time. Similarly, changing one letter of the DNA sequence can alter the code for a single protein, while deleting a whole section of chromosome can remove multiple genes and thereby affect production of multiple proteins at once].

Some regions of the genome are particularly prone to errors in DNA replication that result in deletions or duplications. While still rare, these recur at a high enough frequency that many cases with effectively the same genetic lesion can be identified. This has enabled researchers to recognise and characterise a growing number of genomic disorders that carry a high risk of psychiatric or neurological symptoms. In addition to previously known conditions such as 22q11 deletion syndrome, Williams, Angelman and Prader-Willi syndromes, new conditions have been defined involving deletions or duplication at 1q21.1, 3q29, 7q36.2, 15q11.2, 16p11.2, 22q13 and many others, with more being recognised all the time.

All of these mutations have variable effects, sometimes presenting as ASD, sometimes as SZ or epilepsy – often, but not always, with developmental delay or intellectual disability. Because their clinical manifestations are so variable, there was no way to detect or recognise these patients prior to genetic screening (except for conditions with other characteristic symptoms, such as distinct facial morphology). But once a genetic diagnosis can be made, it becomes possible to group patients with the same mutation together and determine whether there are any patterns to their symptoms, their course of illness, how they respond to medications, and other clinical parameters. This is useful information for clinicians and also for patients and their families – indeed, international support groups have been formed for many of these rare genomic conditions.   

New conditions caused by mutations in specific genes are also being defined. Rett syndrome is a classic example – a form of autism and intellectual disability in girls that is caused by mutations in a gene called MeCP2. New genomic sequencing technologies are now revealing many more such conditions, although the pace of discovery here has been slower, for two reasons.

First, if you sequence the entire genome of any individual you will find many serious mutations – severely affecting production or function of a couple hundred proteins (out of ~20,000 in total). Recognising which one of those is causing disease in a particular patient is impossible, unless you have some prior information. That information can come from seeing the same gene mutated in multiple patients with a particular condition. That brings up the second problem – the number of genes in which mutations can cause ASD or SZ or epilepsy is very large, probably on the order of a thousand. So the likelihood that any two patients will have a mutation in the same gene is very low. This means we will need to sequence very large samples of patients to start to see the signal of meaningful repeat hits amongst the background noise of repeats that arise by chance, simply because we all carry many mutations.
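
A back-of-the-envelope sketch shows why the sample sizes need to be so large. It assumes, purely for illustration, that each patient carries exactly one causal mutation and that it is equally likely to fall in any of 1,000 genes. Neither assumption is realistic (and it ignores the background noise of chance hits entirely), but it gives a feel for the scale. The function name is mine.

```python
def expected_recurrent_genes(n_patients, n_genes=1000):
    """Expected number of genes hit in at least two patients, assuming one
    causal gene per patient, drawn uniformly at random (toy assumption)."""
    q = 1 / n_genes
    p_zero_hits = (1 - q) ** n_patients
    p_one_hit = n_patients * q * (1 - q) ** (n_patients - 1)
    return n_genes * (1 - p_zero_hits - p_one_hit)

# Chance that two given patients share the same causal gene.
p_same_gene = 1 / 1000

for n in (100, 1000, 5000):
    print(f"{n:>5} patients -> about {expected_recurrent_genes(n):.0f} "
          f"genes seen more than once")
```

Under these toy assumptions, a sample of 100 patients would be expected to show repeat hits in only a handful of the 1,000 genes, while a sample of several thousand would show repeat hits in most of them.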

Those efforts are underway and are beginning to pay off, with new conditions being defined at an ever-increasing rate. One recent example involves mutations in the gene CHD8. Mutations that disrupt this gene have been observed in multiple patients with diagnoses of developmental delay or ASD (15 independent mutations in 3,730 cases), but never in a sample of 8,792 clinically unaffected controls. You can see how rare these mutations are – accounting for only 4 of every 1000 cases – but the fact that you don’t see such mutations in controls provides strong evidence that they are in fact the cause of disease in those patients. (See here for a much more nuanced discussion of causality in genetic disorders – the phenotypic effects of any single mutation will always be modified, sometimes strongly, by additional genetic variants in the background). 
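
To see why the absence of such mutations in controls is so compelling, one can ask how likely it would be for all 15 mutations to land in cases by chance if carrying a mutation had nothing to do with case status. The sketch below does this with a simple hypergeometric calculation on the counts quoted above. This is my own illustration and not necessarily the test used in the original study, which would also need to account for the number of genes examined.

```python
from math import comb

N_CASES = 3_730
N_CONTROLS = 8_792
N_MUTATIONS = 15      # independent disruptive mutations, all seen in cases

total = N_CASES + N_CONTROLS

# Probability that 15 carriers drawn at random from the pooled sample
# would all happen to be cases (hypergeometric, no mutations in controls).
p_all_in_cases = comb(N_CASES, N_MUTATIONS) / comb(total, N_MUTATIONS)

rate_per_1000_cases = 1000 * N_MUTATIONS / N_CASES

print(f"mutations per 1000 cases = {rate_per_1000_cases:.1f}")
print(f"P(all 15 in cases by chance) = {p_all_in_cases:.1e}")
```

The probability comes out at roughly one in a hundred million for this single gene, which is why a mutation found in only about 4 of every 1000 cases can still be strongly implicated.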

By finding multiple patients with mutations in the same gene, clinicians were able to define a new syndrome that was previously unrecognisable. In this case, patients with CHD8 mutations display macrocephaly (increased head size), distinct faces and gastrointestinal problems (the CHD8 protein has independent functions in both the brain and the nervous system innervating the gut). The genetic information is thus directly and immediately relevant to the clinical management and treatment of these cases.

Now, one might say that such mutations are so rare that they don’t really tell us anything about the generality of conditions like ASD or SZ. But the point is, there is no reason to think such a thing exists. As more and more mutations causing high risk of psychiatric conditions are discovered, the percentage of cases remaining idiopathic decreases. Those diagnostic categories are not founded on knowledge but on ignorance of underlying cause, by definition.

Known, high-risk mutations can now be identified in >10% of cases of SZ, 25-30% of cases of ASD, and over 60% of cases of severe intellectual disability. Those numbers represent a vast increase from even a few years ago and are sure to increase rapidly in the very near future. Even if the genetic effects in many cases are more complicated (involving more than one mutation at a time, with contributions from common variants), the major message remains the same: these conditions are incredibly genetically heterogeneous. It is probably far more appropriate to think of “autistic symptoms” or “schizophrenic symptoms” as a common consequence of many distinct genetic conditions, than to think of “autism” or “schizophrenia” as monolithic disorders.

That has hugely important implications not just for clinical practice but also for research. If you take a hundred patients with ASD, you might have 70-80 distinct genetic causes. That’s something to consider in the context of, say, neuroimaging studies that look for commonalities across groups of ASD or SZ patients. Any time I see a study reporting some difference in brain structure “in autism” or “in schizophrenia”, I replace that phrase with “in intellectual disability” and see if it still makes any sense. (It doesn’t, given the well-accepted heterogeneity of ID). Of course, there may be some commonalities in the final outcome in these patients, given they end up with similar symptoms, but research purporting to look at causes should bear the genetic heterogeneity in mind.

Genetics is increasingly providing the means to distinguish the underlying causes in different patients and hopefully develop a far more personalised approach to care. Fortunately, new technologies of genome editing are making it much easier to recapitulate disease-causing mutations in animals so that pathogenic mechanisms can be elucidated. Just in the past couple weeks, very exciting results have been published that help localise the primary effects of particular mutations (in the genes SYNGAP1 and NLGN3) to specific cell types in specific regions of the developing brain in mouse models. 

The recognition that these common diagnostic categories are really collections of very rare conditions will necessitate a shift in approaches aimed at developing new treatments. The economics of drug development for rare conditions are obviously very different from the search for the new blockbuster. The next big challenge is to elucidate the biological mechanisms leading to disease across many different mutations to determine if there are any shared pathways or common pathophysiological endpoints that might be targeted in large groups of patients or if individualised treatments can be (or need to be) developed for very small and specific sets of patients, as is happening in other areas of medicine.




Monday, April 14, 2014

The Trouble with Epigenetics, Part 3 – over-fitting the noise


The idea of transgenerational epigenetic inheritance of acquired behaviors is in the news again, this time thanks to a new paper in Nature Neuroscience (who seem to have a liking for this sort of thing).

The paper is provocatively titled: “Implication of sperm RNAs in transgenerational inheritance of the effects of early trauma in mice”. The abstract claims that:

“We found that traumatic stress in early life altered mouse microRNA (miRNA) expression, and behavioral and metabolic responses in the progeny. Injection of sperm RNAs from traumatized males into fertilized wild-type oocytes reproduced the behavioral and metabolic alterations in the resulting offspring.”

Unfortunately, the paper provides no evidence to back up those extraordinary claims. It is, regrettably, a prime example of over-fitting the noise. That is, finding patterns in a mass of messy data, like faces in clouds, and building hypotheses on them after the fact. If any change in any parameter will do, it isn’t hard to find support that “something happens”. I have written about this problem before, exemplified by previous papers from this group. I normally try not to be sarcastic here, but I don’t have time to edit today, so you’re getting raw, unfiltered exasperation this time.

There are some documented examples of transgenerational effects mediated by RNAs in sperm, especially in worms and plants. Almost all of these involve repression of transposon or transgene insertions. This is not believed to be a widespread phenomenon in mammals, however, and you don’t need to (and shouldn’t!) take my word for it – the following is from a very recent review by leaders in this field:

"...epigenetic inheritance is usually—if not always—associated with transposable elements, viruses, or transgenes and may be a byproduct of aggressive germline defense strategies. In mammals, epialleles can also be found but are extremely rare, presumably due to robust germline reprogramming. How epialleles arise in nature is still an open question, but environmentally induced epigenetic changes are rarely transgenerationally inherited, let alone adaptive, even in plants. Thus, although much attention has been drawn to the potential implications of transgenerational inheritance for human health, so far there is little support."

Shutting down a transposon in gametes and the resultant offspring is one thing – it’s a pretty straightforward molecular mechanism, actually. Using such a mechanism to transmit a behavioural change induced by an experience in the previous generation is something else entirely. What that would require is the following sequence of events: animal has an experience, experience is registered by the brain (so far, so good), signal is transmitted to the gametes (hmm, by what?), relevant gene or genes are specifically modified (how? why just those genes?), modification is maintained in the zygote through “genome rebooting” (what, now?), modification is maintained throughout subsequent development of the animal and the brain (really?), but in a selective way so that somehow in the adult it only affects expression in certain brain regions so as to initiate an appropriate behavioural change in the offspring (ah, c’mon, now you’re taking the piss...).

That is why my skepticometer gets pegged by studies that make such claims without documenting or even suggesting a plausible mechanism by which such events could occur. The current paper takes a stab at one part of that, by looking at small non-coding RNAs as a possible mediator. Unfortunately, the paper is… well, let me show you.

The authors use a paradigm which they developed previously (in one of the papers which I criticised here), to induce what they call a traumatic stress. This involves “unpredictable maternal separation combined with unpredictable maternal stress (MSUS) for 3 hours daily from postnatal day 1 through 14 (PND 1–14)”. The pups don’t like that, apparently, and the authors claim they grow up to show “depressive-like behaviours”. I find those behavioural data a bit shaky, but they get much worse in the following generations, when the responses vary in one test, in one sex in one generation and then in another test in the other sex in the next. It all looks like noise to me, and the authors neither correct for all these multiple tests nor provide any hypothesis to account for these fluctuating effects.

In their 2010 paper, they looked at DNA methylation of some candidate genes in F1 sperm and F2 brains to see if they could find a molecular mechanism. Here’s the figure:




There are a lot of asterisks on there, indicating some changes that are statistically significant (alone), but you’ll notice how many different measurements they have made and, also, I hope, the lack of consistency in the supposed effects from F1 to F2. Importantly, there is no independent replication – just one big experiment with the stats done on the whole lot at once. It is no surprise that some data points come out as significant. I’m thinking of green jelly beans.
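
The jelly bean problem is easy to quantify. The sketch below computes the chance of getting at least one “significant” result from a batch of tests when there is no real effect anywhere, assuming the tests are independent and each uses the usual 0.05 threshold. The figure of 20 tests is an arbitrary illustrative number, not a count of the measurements in the paper, and the function names are mine.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests
    when every null hypothesis is actually true."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the overall error rate near alpha."""
    return alpha / n_tests

n_tests = 20  # illustrative only
print(f"P(at least one false positive) = {familywise_error_rate(n_tests):.2f}")
print(f"expected false positives       = {n_tests * 0.05:.1f}")
print(f"Bonferroni per-test threshold  = {bonferroni_threshold(n_tests):.4f}")
```

With 20 uncorrected tests, the odds are roughly two to one that something will come up with an asterisk on it, which is why a scattering of isolated significant results across many measurements is not persuasive by itself.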

You can see the same kind of thing in the figure below from this recent paper, which also got a lot of media attention: “Parental olfactory experience influences behavior and neural structure in subsequent generations”




In both cases, the data look to me like noise.

Now, back to this latest paper. Amidst a load of somewhat peripheral data, the data supporting the two main claims of the paper are the following: First, the authors claim that the maternal separation protocol alters the levels of various small non-coding RNA molecules in the sperm of the F1 mice (the ones whose moms were cruelly taken away). The data for that claim are in Supplemental Figure 2, which I reproduce below. You will see it derives from three pools of mice for the control condition and three pools from the MSUS condition.


I see no consistent pattern of changes here. There looks to be as much variability within conditions as between. (Take MSUS pool 2 out and you wouldn’t be left with much signal, I would wager). I am sure there is some statistical test that would give you a significant result, but if you torture the data enough, they’re bound to try to tell you something.

Their next figure takes some of these specific miRNAs and examines their expression levels in sperm, serum and various brain regions of F1 and F2 mice. Again, the data are all over the place. They’re up, they’re down, they’re not changed. They’re changed in hippocampus but some go in the opposite direction in hypothalamus and none are changed in cortex (Supplementary Figure 10 for those reading along at home).


That’s some noisy noise right there. Notably, they see no changes in sperm of F2, even though the F3 supposedly still show behavioural changes, rather undermining their own case for the link between these two (non-)events.

Their next step is the one that the main conclusion rests on – to show that injection of small RNAs from the sperm of an MSUS F1 mouse into a fertilised oocyte can induce the suite of behavioural changes they (claim to) see in the F2 generation under normal conditions.

Amazingly, they do not actually show those data. We get summary t statistics claiming there are some differences but are not treated to the actual data themselves. So, we can’t evaluate the effect sizes or the underlying variability of the data. Here’s how it reads in the paper:


They do show a supposed effect on metabolism in the MSUS-RNA-injected animals, which is a difference seen in one experiment with 8 animals per group in glucose levels, not at baseline, but after stress. I have no idea what’s going on here or what the hypothesis is supposed to be – are we supposed to expect greater or lower glucose? At baseline or after a stress? Whatever is happening, it’s not consistent between the F1 and the F2 in the traditional paradigm, nor between the traditional F2 and the MSUS-RNA-injected F2. 




Finally, we are shown that the levels of one of the miRNAs differ in the hippocampus of the MSUS-RNA-injected F2. Doesn’t look very convincing to me, by itself – it’s the kind of result one might want replicated before publishing, but more to the point: Why that one? What about all the others whose levels fluctuated so happily in the figure shown above?



Overall, there’s no there there. It’s all sound and fury, signifying nothing. I would give it the ultimate insult by saying it’s not even wrong, but it is.

Nevertheless, this paper is sure to be latched onto by the woo crowd who seem to think that epigenetics is some kind of magic. (Now I have that Queen song running in my head - you're welcome). We can change our genes! They’re not our destiny! Toxins cause autism because epigenetics! Hooray!

Evolution appears to have made us mammals very delicate creatures. If you look sideways at a mouse these days you can permanently alter its genes, it seems, along with those of its kids and grandkids. Of course, you’d think another look might change them back if they're so sensitive, but apparently not. I’m sure your genes (ooh, and brain circuits!) have been changed by reading this, for which I can only apologise.



Monday, March 24, 2014

Gay genes? Yeah, but no, well kind of… but, so what?


Sexual preference is one of the most strongly genetically determined behavioural traits we know of. A single genetic element is responsible for most of the variation in this trait across the population. Nearly all (>95%) of the people who inherit this element are sexually attracted to females, while about the same proportion of people who do not inherit it are attracted to males. This attraction is innate, refractory to change and affects behaviour in stereotyped ways, shaped and constrained by cultural context. It is the commonest and strongest genetic effect on behaviour that we know of in humans (in all mammals, actually). The genetic element is of course the Y chromosome.

http://mathbionerd.blogspot.ie/2013/05/accessible-research-gene-survival-and.html
The idea that sexual behaviour can be affected by – even largely determined by – our genes is therefore not only not outlandish, it is trivially obvious. Yet claims that differences in sexual orientation may have at least a partly genetic basis seem to provoke howls of scepticism and outrage from many, mostly based not on scientific arguments but political ones.

The term sexual orientation refers to whether your sexual preference matches the typical preference based on whether or not you have a Y chromosome. It is important to realise that it therefore refers to four different states, not two: (i) has Y chromosome, is attracted to females; (ii) has Y chromosome, is attracted to males; (iii) does not have Y chromosome, is attracted to males; (iv) does not have Y chromosome, is attracted to females. We call two of these states heterosexual and two of them homosexual. (This ignores the many individuals whose sexual preferences are not so exclusive or rigid).

A recent twin study confirms that sexual orientation is moderately heritable – that is, that variation in genes contributes to variation in this trait. These effects are detected by looking at pairs of twins and determining how often, when one of them is homosexual, the other one is too. This rate is much higher (30-50%) in monozygotic, or identical, twins (who share all of their DNA sequence), than in dizygotic, or fraternal, twins (who share, on average, only half of their variable DNA), where the rate is 10-20%. If we assume that the environments of pairs of mono- or dizygotic twins are equally similar, then we can infer that the increased similarity in sexual orientation in pairs of monozygotic twins is due to their increased genetic similarity.
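The logic of the twin comparison can be sketched with Falconer's classic approximation, which splits trait variance into genetic, shared-environment and non-shared components. This is a rough illustration only, not the analysis used in the study: the function name is made up for this example, the input correlations are hypothetical, and the formula applies to twin correlations in a trait (or in an underlying liability), not directly to the concordance rates quoted above. It also relies on the equal-environments assumption just mentioned.

```python
def falconer_components(r_mz, r_dz):
    """Rough variance components from twin correlations (Falconer's formula).

    r_mz, r_dz: trait correlations in monozygotic and dizygotic twin pairs.
    Returns (h2, c2, e2): heritability, shared environment, non-shared remainder.
    """
    h2 = 2 * (r_mz - r_dz)   # MZ twins share about twice the genetic similarity of DZ twins
    c2 = 2 * r_dz - r_mz     # similarity not accounted for by genes
    e2 = 1 - r_mz            # whatever makes even identical twins differ
    return h2, c2, e2


# Hypothetical correlations, chosen only to illustrate the arithmetic.
h2, c2, e2 = falconer_components(r_mz=0.5, r_dz=0.3)
print(round(h2, 2), round(c2, 2), round(e2, 2))
```

The last term is the "non-shared" component, which lumps together measurement error, prenatal effects and developmental noise.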

These data are not yet published (or peer reviewed) but were presented by Dr. Michael Bailey at the recent American Association for the Advancement of Science meeting (Feb 12th 2014) and widely reported on. They confirm and extend findings from multiple previous twin studies across several different countries, which have all found fairly similar results (see here for more details). Overall, the conclusion that sexual orientation is partly heritable was already firmly made. 

The reaction to news of this recent study reveals a deep disquiet with the idea that homosexuality may arise due to genetic differences. First, there are those who scoff at the idea that such a complex behaviour could be determined by what may be only a small number of genetic differences – perhaps only one. As I recently discussed, this view is based on a fundamental misunderstanding of what genetic findings really mean. Finding that a trait (a difference in some system) can be affected by a single genetic difference does not mean a single gene is responsible for crafting the entire system – it simply means that the system does not work normally in the absence of that gene. (Just as a car does not work well without its steering wheel).

Others have expressed a variety of personal and political reactions to these findings, ranging from welcoming further evidence of a biological basis for sexual orientation to worry that it will be used to label homosexuality a genetic disorder and even to enable selective abortion based on genetic prediction. The latter possibility may be made more technically feasible by the other aspect of the recently reported study, which was the claim that they have mapped genetic variants affecting sexual orientation to two specific regions of the genome. (This doesn’t mean they have identified specific genetic variants but may be a step towards doing so).

Let’s explore what the data in this case really show and really mean. A variety of conclusions can be drawn from this and previous studies:

1.     Differences in sexual orientation are partly attributable to genetic differences.
2.     Sexual orientation in males and females is controlled by distinct sets of genes. (Dizygotic twins of opposite sex show no increased similarity in sexual orientation compared to unrelated people – if a female twin is gay, there is no increased likelihood that her twin brother will be too, and vice versa).
3.     Male sexual orientation is rather more strongly heritable than female.
4.     The shared family environment has no effect on male sexual orientation but may have a small effect on female sexual orientation.
5.     There must also be non-genetic factors influencing this trait, as monozygotic twins are still often discordant (more often than concordant, in fact).

The fact that sexual orientation in males and females is influenced by distinct sets of genetic variants is interesting and leads to a fundamental insight: heterosexuality is not a single default state. It emerges from distinct biological processes that actively match the brain circuitry of (i) males or (ii) females to their chromosomal and gonadal sex so that most individuals who carry a Y chromosome are attracted to females and most people who do not are attracted to males.

http://sites.sinauer.com/levay4e/webtopic05.04.html
What is being regulated, biologically, is not sexual orientation (whether you are attracted to people of the same or opposite sex), but sexual preference (whether you are attracted to males or females). Given how complex the processes of sexual differentiation of the brain are (involving the actions of many different genes), it is not surprising that they can sometimes be impaired due to variation in those genes, leading to a failure to match sexual preference to chromosomal sex. Indeed, we know of many specific mutations that can lead to exactly such effects in other mammals – it would be surprising if similar events did not occur in humans.

These studies are consistent with the idea that sexual preference is a biological trait – an innate characteristic of an individual, not strongly affected by experience or family upbringing. Not a choice, in other words. We didn’t need genetics to tell us that – personal experience does just fine for most people. But this kind of evidence becomes important when some places in the world (like Uganda, recently) appeal to science to claim (wrongly) that there is evidence that homosexuality is an active choice and use that claim directly to justify criminalisation of homosexual behaviour.

Importantly, the fact that sexual orientation is only partly heritable does not at all undermine the conclusion that it is a completely biological trait. Just because monozygotic twins are not always concordant for sexual orientation does not mean the trait is not completely innate. Typically, geneticists use the term “non-shared environmental variance” to refer to factors that influence a trait outside of shared genes or shared family environment. The non-shared environment term encompasses those effects that explain why monozygotic twins are actually less than identical for many traits (reflecting additional factors that contribute to variance in the trait across the population generally).

http://scousebunintheoven.blogspot.ie/
The terminology is rather unfortunate because “environmental” does not have its normal colloquial meaning in this context. It does not necessarily mean that some experience that an individual has influences their phenotype. Firstly, it encompasses measurement error (just the difficulty in accurately measuring the trait, which is particularly important for behavioural traits). Secondly, it includes environmental effects prior to birth (in utero), which may be especially important for brain development. And finally, it also includes chance or noise – in this case, intrinsic developmental variation that can have dramatic effects on the end-state or outcome of brain development. This process is incredibly complex and noisy, in engineering terms, and the outcome is, like baking a cake, never the same twice. By the time they are born (when the buns come out of the oven), the brains of monozygotic twins are already highly unique.

Genetic differences may thus change the probability of an outcome over many instances, without determining the specific outcome in any individual. 
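A toy simulation makes the point concrete. It is a minimal sketch under invented assumptions: each individual's outcome depends on a fixed genetic contribution plus random developmental noise, and the trait appears only if the sum crosses a threshold. The function name, the size of the genetic shift, the threshold and the noise model are all hypothetical and are not estimates for any real trait.

```python
import random


def outcome_rate(genetic_shift, n=100_000, threshold=2.0, seed=0):
    """Fraction of n simulated individuals whose liability crosses the threshold.

    Liability = genetic_shift + Gaussian noise (mean 0, sd 1). The noise stands
    in for chance developmental variation; all values are illustrative only.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if genetic_shift + rng.gauss(0.0, 1.0) > threshold)
    return hits / n


baseline = outcome_rate(0.0)   # individuals without the hypothetical variant
carrier = outcome_rate(0.5)    # individuals carrying it
print(f"baseline rate: {baseline:.3f}, carrier rate: {carrier:.3f}")
```

In this sketch carriers show the outcome roughly three times as often as non-carriers, yet the large majority of carriers still do not show it. The variant shifts the odds across the population without settling the outcome for any one person.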

A useful analogy is to handedness. Handedness is only moderately heritable but is effectively completely innate or intrinsic to the individual. This is true even though the preference for using one hand over the other emerges only over time. The harsh experiences of many in the past who were forced (sometimes with deeply cruel and painful methods) to write with their right hands because left-handedness was seen as aberrant – even sinful – attest to the fact that the innate preference cannot readily be overridden. All the evidence suggests this is also the case for sexual preference.

http://www.nytimes.com/2011/03/08/health/views/08klass.html?_r=0
What about concerns that these findings could be used as justification for labelling homosexuality a disorder? These are probably somewhat justified – no doubt some people will use it like that. And that places a responsibility on geneticists to explain that just because something is caused by genetic variants – i.e., mutations – does not mean it necessarily should be considered a disorder. We don’t consider red hair a disorder, or blue eyes, or pale skin, or – any longer – left-handedness. All of those are caused by mutations.

The word mutation is rather loaded, but in truth we are all mutants. Each of us carries hundreds of thousands of genetic variants, and hundreds of those are rare, serious mutations that affect the function of some protein. Many of those cause some kind of difference to our phenotype (the outward expression of our genotype). But a difference is only considered a disorder if it negatively impacts on someone’s life. And homosexuality is only a disorder if society makes it one.

Sunday, February 23, 2014

Reductionism! Determinism! Straw-man-ism!


“Reductionism!” is a charge often flung at geneticists, from accusers in the popular press and also, not infrequently, from many fellow scientists. What is it that leads so many people to so fundamentally misunderstand what genetics is about?

Whenever someone presents results showing that variation in such and such a trait is partly influenced by genetic variation, or even showing more specifically that mutations in a particular gene can predispose to a particular outcome, someone is sure to shout: “Reductionism! Single genes can’t cause complex traits – it’s patent nonsense to say that they can! Biological organisms are complex systems interacting in complex ways in an ever-changing environment – particular behaviours can’t be simply determined by genes”. 

http://thesciencedog.wordpress.com/2013/12/28/beware-the-straw-man/

Of course they are right, but they’re also arguing against something no one is claiming. A couple of recent examples illustrate this phenomenon. One is the reaction to coverage of a presentation at the American Association for the Advancement of Science meeting by Dr. Michael Bailey, which described results of a very large twin study, confirming that sexual orientation has a strong genetic component (explaining 30-40% of the variance in this trait).

Nick Cohen, writing in The Observer, quoted geneticist Steve Jones’ reaction to the coverage of this story:

“The idea that they could find a reductionist explanation for a phenomenon as complicated as human sexuality was, well, optimistic. All you could say was genetic inheritance probably influenced it. But then you could say the same about anything.”

Cohen goes on to say:

“Suppose researchers claim to identify gay genes. Their discovery would be pseudo-science. A Gordian knot of environmental, cultural and hormonal influences would be as important in determining sexual preference.”

“To put it another way – if you go along with crude reductionism, you can expect to find yourself at the mercy of crude reductionists.”

In fact, the scientists presenting these findings were quite careful to spell out that their findings do not show that sexual orientation is completely determined by genes in general or any specific genes in particular. They simply show that genetic variation contributes to variation in this trait and that certain locations in the genome may contain some of the genetic variants responsible.

There have been similar reactions to the discussion of the effects of genetic variation on intelligence and educational achievement. Again, the authors of these studies are circumspect about the impact of genetic effects, highlighting the important role of the environment, and emphasising the complexity of the genetic effects.

The main problem, it seems to me, is a fundamental misunderstanding of what genetics as a science studies and how it relates to the function of complex systems. The following statements are not contradictory:

1.     The function of a complex system emerges from the complex and dynamic interactions between all of the components of the system, in a context- and experience-dependent manner.
2.     Variation in single components of the system (or in multiple components) can affect how it functions.

Geneticists investigate the second of these. Showing that variation in Gene X affects the behaviour or outcome of a system is not the same as saying that Gene X fully determines that behaviour or fully accounts for the entire system. Gene X is just a piece of DNA sitting in a cell somewhere – it doesn’t do anything by itself. But a difference in Gene X can account for a difference in how the system works.

There’s nothing reductionist about that, except from a methodological point of view, in that scientists often focus on individual components of complex systems, one at a time, in order to get a handle on that complexity and figure out how the whole system works. That has been an extraordinarily successful approach, but it does not mean that scientists employing methodological reductionism also subscribe to philosophical reductionism – the idea that the system really can be explained simply from the properties of its lower-level components – that it is no more than the sum of its parts, or that its function can be said to be caused by any one part.

http://www.visualphotos.com/image/2x6178912/mans_hands_holding_steering_wheel
Consider a car – the function of this wonderful piece of machinery depends on the integrity of all the components and emerges from their interactions. To say that any one component is somehow responsible for the function of the whole system is nonsense. If you just have a steering wheel, you’re not going to get very far. But it’s also true that if you don’t have a steering wheel, you’re going to have difficulties driving anywhere. A change to one component can drastically affect how even the most complex system functions.

Genetics as an experimental approach studies how a system varies or fails when individual components are disrupted, and uses that information to infer which components are involved in which processes. By contrast, biochemistry and systems biology study how the components of a system are put together and how those interactions mediate its function. These are two complementary experimental approaches to understanding complex systems. (For a very funny analogy along these lines, see here).

As well as inducing mutations in model organisms to learn how various biological processes work, geneticists also study how naturally occurring genetic variation in a population affects various traits. Again, showing that differences in genes contribute to differences in traits is not the same as claiming those genes alone produce or determine everything about the system in question.

Part of the confusion may arise from the two different meanings of the word gene – one based on heredity and the other on molecular biology. In the molecular biology sense, a gene codes for a protein, which carries out some function as a component of some biochemical or cellular system. This is a productive definition of the gene in relation to the system. By contrast, a gene in the heredity sense really means a variant or mutation in the molecular-biology-gene – something that alters its function and thus alters the system. This is a disruptive definition of a gene. Such variants can be passed on and contribute to variation in some trait in the population.

The relationship between a gene (as a piece of DNA coding for a protein) and its function in a biochemical system is thus very different from the relationship between a gene (as a unit of heredity – i.e., a genetic variant) and its effects on some phenotype. For one thing, the effects of a genetic variant can be extremely indirect, cascading across levels from the molecular and biochemical to cellular, physiological and sometimes behavioural. Trying to understand how the phenotypes caused by disrupting a gene relate to the molecular function of the normal gene product is the main enterprise of experimental genetics.

Of course, many geneticists contribute to this confusion (often aided and abetted by journalists and headline writers) by using the egregious “gene for” construct. So, we end up talking about “genes for” schizophrenia or “genes for” intelligence or other traits or conditions – it sounds like these phrases are referring to the productive molecular biology sense of a gene, but really they are using the disruptive heredity sense. What those phrases really refer to is mutations that alter how genes work and that thereby contribute to variation in a particular trait or the likelihood of developing some condition, in the context of the incredibly complex biological system that is a human being, which develops over many years in varying environments. All those qualifiers don’t make for great headlines, I’ll admit, but that’s what the shorthand “gene for” construction really means.

So, claims of genetic influence on various traits are really much more modest than many people seem to think. All they say is that the function of the system in question can vary due to variation in one or more of its components. Hardly the grand threat to civilisation which some people perceive.


While I am at it, here are some other common and equally misplaced arguments against genetic findings:

“I don’t want Trait X to be genetic or innate, so I will simply refuse to believe those data and counter by playing my Reductionist! card”. Political agendas don’t trump scientific facts, thankfully, but this is still a very popular move.

“If genes underlying Trait X are discovered, people will misuse this knowledge”. This may well be true, but it does not speak to the underlying facts of the matter of whether the trait is really influenced by said genes. (More on this in a later post).

“The effects of genetic variation in Gene X are only probabilistic – not everyone with that mutation develops Condition X – how can you therefore say it is causal?” This is akin to saying that because not everyone who smokes gets lung cancer, you can’t really say that smoking causes lung cancer. (Which is true, if you want to be pedantic about the word “causes”, but we can certainly say it causes a much higher probability of developing lung cancer. That is a valid, informative and useful statement and we can make analogous statements about the effects of mutations).

A related one: “The effects of variation in Gene X are only apparent at the statistical level”. Well, the effect of the Y chromosome on height is also only apparent at the statistical level (by comparing the average height of men versus women) but that doesn’t mean it’s not real or useful information. It does mean that this kind of information can’t be reliably applied to make predictions about individuals, which is something some geneticists claim they can do, so this criticism hits the mark in those cases.

“Geneticists have yet to find any specific genes affecting Trait X, therefore it is not really genetic”. This one really gets my goat, especially because people are often referring to negative results from genome-wide association studies (GWAS), or to the failure of such studies to explain all the heritability of a trait or disorder, and claiming that this implies that it is not really heritable after all. As GWAS only look at one kind of genetic variation (common genetic variants), the only thing such negative results imply is that GWAS may be looking in the wrong place and that heritability is likely to also involve many rare genetic variants.

I am not suggesting that geneticists never overstep the mark and claim more than they should on the basis of specific findings – this field is no more immune from hype than any other – some might argue it is more susceptible, in fact. That is all the more reason to reserve valid arguments against over-extrapolation of genetic results for when they are needed, rather than levelling the blanket, straw-man charge of reductionism! at the field as a whole.

