Sunday, November 22, 2015

What do GWAS signals mean?

Genome-wide association studies (GWAS) have been highly successful at linking genetic variation in hundreds of genes to an ever-growing number of traits or diseases. The fact that the genes implicated fit with the known biology for many of these traits or disorders strongly suggests (effectively proves, really) that the findings from GWAS are “real” – they reflect some real biological involvement of those genes in those diseases. (For example, GWAS have implicated skeletal genes in height, immune genes in immune disorders, and neurodevelopmental genes in schizophrenia).

But figuring out the nature of that involvement and the underlying biological mechanisms is much more challenging. In particular, it is not at all straightforward to understand how statistical measures derived at the level of populations relate to effects in individuals. Here, I explore some of the diverse mechanisms in individuals that may underlie GWAS signals.

GWAS take an epidemiological approach to identify genetic variants associated with risk of disease in exactly the same way epidemiologists identify environmental factors associated with risk – they look for factors that are more frequent in cases with a disease than in unaffected controls. For example, smoking is more common in people with lung cancer than in people without lung cancer (even though only a minority of people who smoke get lung cancer). From this we can deduce that smoking may be a risk-modifying factor for lung cancer, and we can measure the strength of that effect. Of course, observational epidemiology cannot prove causation – but it can provide important clues as to the risk architecture of a disease.

For GWAS, the factors in question are not environmental – they are the differences in our DNA that exist at millions of positions across the genome. These “single-nucleotide polymorphisms”, or SNPs, are positions in the genome where the DNA sequence varies between people – sometimes it might be an “A”, sometimes it might be a “T” (or a “G” or a “C”). Of course, any position in the genome can be mutated and likely is mutated in someone on the planet, but such mutations are typically extremely rare. SNPs are different – they are positions where two different versions are both relatively frequent in the population; these versions are thus often referred to as common variants.

GWAS are premised on the simple idea that if any of those common variants at any of those millions of SNPs across the genome is associated with an increased risk of disease, then that variant should be more frequent in cases than in controls. So, if we find variants that are more common in cases than in controls, we can infer that these variants may be causally related to an increased risk of disease.

What that doesn’t tell us is how. How does having one variant over another at that particular site cause an increased risk of that particular disease? I don’t just mean by what biological mechanism; I mean how does risk calculated at the population level relate to effects in individuals?

Statistically, we get two measures out of GWAS for any SNP that is associated. One is the p-value, which is a measure of how unlikely it would be to see a frequency difference of the magnitude we observe, just by chance. You might, for example, find that the “A” version at one SNP is at 25% frequency in controls but 28% frequency in cases. That’s not a big difference, so you’d need a very big sample to make sure it wasn’t noise, which is precisely why GWAS now use sample sizes of tens or even hundreds of thousands of people.

GWAS also apply very rigorous thresholds for statistical significance, in order to correct for the fact that they are testing so many different SNPs. (This follows the logic that, while it is quite unlikely that you will win the lottery yourself, if enough tickets are sold, it won’t be surprising if the lottery is won by somebody). These methods have greatly advanced the trustworthiness of results from the field, far beyond those reported in the benighted “candidate gene era”. But the p-value doesn’t tell us anything about how big of an effect there is – how much of an effect on risk does the difference in frequency between cases and controls reflect?
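To put rough numbers on both points (the significance threshold and the sample sizes), here is a back-of-the-envelope sketch in Python. The 25% vs 28% frequencies are the hypothetical ones from above; the assumption of roughly one million independent common SNPs, and the 80% power target, are my own illustrative choices:

```python
from statistics import NormalDist

# Bonferroni correction for ~1 million independent common SNPs gives the
# standard "genome-wide significance" threshold:
alpha = 0.05 / 1_000_000          # = 5e-8

# Rough sample size needed to detect a 25% vs 28% allele-frequency
# difference at that threshold (two-proportion z-test, 80% power, two-sided):
p1, p2 = 0.25, 0.28
z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~5.45
z_beta = NormalDist().inv_cdf(0.80)             # ~0.84

n = (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2
print(f"threshold: {alpha:.0e}; ~{n:,.0f} allele observations per group")
```

Even this relatively generous 3% frequency difference demands thousands of samples per group; with the sub-1% differences more typical of real GWAS hits, the same formula pushes the requirement into the tens or hundreds of thousands.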

That number is summarised by the other measure we get for each associated SNP, which is the odds ratio. This reflects the size of the difference in frequency of that variant between cases and controls. It is calculated very simply: say your SNP comes in two versions, or “alleles”: “A” and “G”. We want to convert the difference in absolute frequencies in cases versus controls (say 28% vs 25%, or 62% vs 60%, or whatever it is) into a number that tells us how many times more common one version is in cases than in controls. (The reason is that that number is more easily related to the increased risk associated with having that version).

Here’s an example: If we take 28% and 25% as frequencies of the “A” allele at a certain SNP in cases and controls, respectively, then if you were to select an “A” allele at random from the sample (assuming equal numbers of cases and controls), the odds of it coming from a case versus a control are 0.28/0.25 (=1.12). The odds of the alternative “G” allele coming from a case versus a control are correspondingly lower: 0.72/0.75 (=0.96). The odds ratio is then 1.12/0.96 = 1.167. Assuming that the cases and controls are representative of the general population, we can infer that individuals with an “A” allele are 1.167 times as likely to be a case as those with the “G” allele, which is the number we’re after. (Note that this approximation of the odds ratio to the relative risk only holds when the disease is rare).

If you do the same calculations for 62% vs 60% it works out to 1.09. These odds ratios are on the order of the typical values obtained from GWAS. For comparison, the odds ratio for smoking and lung cancer is in the double digits. It is calculated in the same way, e.g., from data like these from a study in Spain in the 1980’s (where smoking was apparently astronomically common!): this study found that 98.8% of lung cancer patients were smokers, while “only” 80.3% of controls were smokers. Doing the same calculations as above gives an OR of about 20, which is consistent in magnitude with many other studies.
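For what it’s worth, the whole calculation fits in a few lines of Python (using the hypothetical SNP frequencies from the examples above):

```python
def odds_ratio(freq_cases, freq_controls):
    """Odds ratio for an allele with the given frequencies in cases and controls."""
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls

# The two SNP examples from the text:
print(round(odds_ratio(0.28, 0.25), 3))   # 28% vs 25% -> ~1.167
print(round(odds_ratio(0.62, 0.60), 2))   # 62% vs 60% -> ~1.09
```

This one-step version (odds in cases divided by odds in controls) is algebraically identical to the two-step calculation above (1.12/0.96).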

Thus, for either genetic or environmental factors, the odds ratio gives an average increased risk of disease. But, biologically, what is actually going on in each individual that collectively gives that signal?

The most straightforward interpretation is that an odds ratio of, say, 1.2 at the population level reflects exactly the same thing at the individual level – each individual who inherits that SNP variant is at 1.2 times greater risk of developing the disease than they would have been otherwise. This is the additive model whereby each SNP acts independently of all other factors – it doesn’t matter what other genetic variants a person has, or indeed what environmental factors they may be exposed to – the added effect on risk of this SNP is the same in all carriers.

That is, I think, a pretty common interpretation of what the odds ratio means in individuals, but it is certainly not the only scenario that could produce that result at the population level. In the diagram below, I illustrate several different scenarios that could all yield the same odds ratio across the population.

The additive scenario is illustrated in A. Every person who inherits the risk allele has a slightly increased risk of disease (small red arrows). [This applies whether the SNP that is genotyped in the GWAS has a functional effect itself or tags another common SNP that is the one doing the damage].

It might seem like the odds ratio can be interpreted directly as a multiplier of the baseline risk across the population, i.e., the prevalence of the disease in question. So, if the baseline rate is say 1%, then people with the “A” allele in our example above would have a risk of 1.167%, all other things being equal. The problem with that interpretation is that all other things are not equal.

For example, a condition like autism affects about 1% of the population. This does not mean, however, that everyone in the population had a 1% risk of being born autistic, and that the ones who actually are autistic were just unlucky (statistically speaking, not judgmentally). That 1% is actually made up of people who were at very high risk of being autistic – we know this because people with the same genotype as those with autism (i.e., their monozygotic twins) have a rate of autism of over 80%. What this implies is that the vast majority of the population were at effectively no risk (not at 1% risk).
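A toy calculation makes the point (the two-class model and its numbers are mine, purely for illustration):

```python
prevalence = 0.01          # population prevalence of autism (~1%)
mz_concordance = 0.80      # rate of autism in monozygotic co-twins of affected people

# Two-class model: a fraction f of the population is at high risk r, and
# everyone else is at ~zero risk. If that risk is essentially genetic, the
# MZ co-twin of an affected person is in the high-risk class, so the
# concordance rate estimates r, and prevalence = f * r.
r = mz_concordance
f = prevalence / r
print(f"high-risk fraction: {f:.2%} at risk {r:.0%}; the other {1 - f:.2%} at ~0% risk")
```

Under a uniform-risk model, by contrast, the expected rate in co-twins would simply be the 1% population prevalence, nothing like the observed 80%.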

This suggests that the effects of any SNP are also likely to be highly unequally distributed across the population*, depending on the genetic background, as illustrated in Scenario B. In some people, the risk variant increases risk a little bit (small red arrows), while in others it increases it a lot (bigger red arrows). In others it may have no effect (flat blue line), while in yet others it may actually decrease risk (green downward arrow).

That last situation may seem far-fetched but is actually well described; for example, two mutations that each independently cause epilepsy may paradoxically cancel each other out if they occur together. Similarly, mutations in the fragile X gene, Fmr1, or in the tuberous sclerosis gene, Tsc2, can each cause autism in humans and various neurological and behavioural symptoms when mutated in mice. However, combining them both in mice leads to a rescue of the symptoms caused by either one alone (because they counteract each other at the biochemical level).

These kinds of “epistatic” (non-additive) interactions are generally very common and can be seen for all kinds of complex traits. In terms of how they would contribute to a GWAS signal, a slight preponderance of increased risk when you average those effects across the population would generate a small odds ratio greater than 1. Based on the odds ratio alone, there is no way to distinguish scenarios A and B.
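To illustrate how a mixture of individually diverse effects collapses into a single modest odds ratio, here is a toy model (the backgrounds and relative risks are invented for illustration, and allele copies are treated independently for simplicity):

```python
# Fraction of allele carriers on each genetic background, and the relative
# risk the variant confers there (no effect, modest, large, and protective):
backgrounds = [0.50, 0.30, 0.15, 0.05]     # fractions of the population
rel_risk    = [1.0,  1.2,  2.0,  0.7]      # effect of the risk allele on each

p = 0.25            # risk-allele frequency
baseline = 0.01     # disease risk for non-carriers on any background

# Average risk for carriers, across backgrounds:
risk_carrier = sum(w * rr * baseline for w, rr in zip(backgrounds, rel_risk))
risk_noncarrier = baseline

# Resulting allele frequencies among cases and controls, then the odds ratio:
f_cases = p * risk_carrier / (p * risk_carrier + (1 - p) * risk_noncarrier)
f_ctrls = p * (1 - risk_carrier) / (p * (1 - risk_carrier) + (1 - p) * (1 - risk_noncarrier))
odds_ratio = (f_cases / (1 - f_cases)) / (f_ctrls / (1 - f_ctrls))
print(round(odds_ratio, 2))   # a single modest OR, despite effects from 0.7x to 2x
```

The population-level result is simply the average of the background-specific effects, so scenarios A and B are indistinguishable from the odds ratio alone.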

Note that this kind of effect holds for all epidemiological data – the effect sizes obtained are always averages across the population which may hide substantial variability in effect size across individuals. For example, a high-fat diet may be a much higher risk factor for cardiovascular disease in some people than in others, based on their genetic vulnerability.

It is interesting to note that if those kinds of diverse epistatic interactions occur for each SNP, then their aggregate effects will likely always look additive, as these pairwise and higher-order interactions will average out within and across individuals. That doesn’t mean they could not in principle be decomposed to reveal such effects, as can be done using various genetic techniques in model organisms. So, just because SNP effects seem to combine additively does not rule out multiple epistatic interactions at the biological level.

Scenario C is a special case of epistatic interaction. In this case, the common risk variant has no effect on biological risk at all in most carriers (flat blue lines). However, if it occurs in people with a rare mutation in some specific gene (big purple arrow), which by itself predisposes to the disease with incomplete penetrance (where not everyone with the mutation necessarily develops the disease), then it can have a modifying effect, strongly increasing the likelihood of actual expression of the disease symptoms.

Again, this kind of scenario is well documented and is particularly well illustrated by Hirschsprung disease. This disorder, which affects innervation of the gut, can be caused by mutations in any one of about 18 known genes, one of which encodes the Ret tyrosine kinase. However, mutations in this gene are not completely penetrant – some people with it do not develop disease or have only a mild form. Recent studies have found that simultaneously carrying a common variant in the same gene increases the likelihood that carriers of the rare mutation will show severe disease. The common variant thus modifies the risk of disease substantially, but only in carriers of a rare mutation. (In this case it is in the same gene, but that doesn’t have to be the case). 

The last scenario, D, is quite different. Here, the common variant is not doing anything itself. It’s not even linked to another common variant that is doing something. Instead, it is linked to a rare mutation that causes disease with much higher penetrance. Or, to put it better, the rare mutation is linked to it. Any new mutation must arise on a background of some set of common SNPs (a “haplotype”), with which it will tend to be subsequently co-inherited. If a rare mutation that increases risk of disease rises to an appreciable frequency then it will necessarily increase the frequency of the SNPs in that haplotype in people with the disease, giving rise to what has been called a “synthetic association”.

Any one mutation might be too rare to cause such an effect (especially if it is likely to be selected against precisely because it causes disease), but if you have multiple rare mutations at a given locus, and if they happen to occur by chance more on one haplotype than another, then you could get an aggregate effect that could give a tiny difference in frequency of the sort detected by GWAS.
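Putting illustrative numbers on scenario D (all figures invented for the purpose) shows how a SNP with no effect whatsoever can still pick up a small signal:

```python
p_A = 0.25          # frequency of the tagged "A" haplotype
q = 0.005           # total frequency of rare risk mutations at the locus
on_A = 0.80         # fraction of those mutations that happened to arise on "A" haplotypes

baseline = 0.01     # disease risk without a rare mutation
rr_rare = 5.0       # relative risk conferred by carrying a rare mutation

# Per-haplotype disease risk (the common variant itself does nothing; only
# the rare mutations riding on each haplotype class matter):
frac_mut_A = q * on_A / p_A               # mutation rate among "A" haplotypes
frac_mut_G = q * (1 - on_A) / (1 - p_A)   # ...and among "G" haplotypes
risk_A = frac_mut_A * rr_rare * baseline + (1 - frac_mut_A) * baseline
risk_G = frac_mut_G * rr_rare * baseline + (1 - frac_mut_G) * baseline

f_cases = p_A * risk_A / (p_A * risk_A + (1 - p_A) * risk_G)
f_ctrls = p_A * (1 - risk_A) / (p_A * (1 - risk_A) + (1 - p_A) * (1 - risk_G))
odds_ratio = (f_cases / (1 - f_cases)) / (f_ctrls / (1 - f_ctrls))
print(round(odds_ratio, 2))   # a small GWAS-like signal from a SNP with no effect
```

The tagging SNP inherits a diluted version of the rare mutations’ much larger effect, which is exactly the signature of a synthetic association.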

There are now many documented examples where GWAS signals are explained by synthetic associations with rare mutations in the sample, which have much larger odds ratios (e.g., 1, 2, 3, 4). On the other hand, there are also cases where no such rare mutations have been found (e.g., 5, 6), suggesting that such a mechanism is by no means universal. It is difficult indeed to know how prevalent that situation will turn out to be, though large-scale whole-genome sequencing studies currently underway should help address this question. (See here for theoretical discussions: 7, 8, 9, 10).

Both scenarios C and D are congruent with the repeated finding that many of the genes implicated by GWAS (with small effect sizes) are known to sometimes carry rare mutations linked to a high risk of the same disease. That would fit with a mechanism whereby common variants at a given locus increase the penetrance of rare mutations in the same gene, but have little effect otherwise (scenario C). Or it would fit with GWAS signals actually arising from synthetic association with high-penetrance rare mutations in the population (where the common variant tags these haplotypes but has no effect itself whatsoever; scenario D).

Teasing these various scenarios apart is a challenge, especially as, for any given disease, different scenarios may pertain for different SNPs. One method has been to try and find a functional effect of a common SNP at the molecular level. For example, SNPs may affect the expression of a gene, altering binding of regulatory proteins to the parts of DNA that specify how much of the protein to make, in which cells and under which conditions. Multiple such examples have been documented (sometimes with surprising results, as when the gene thus affected is actually quite distant to the SNP itself).

However, finding some effect of a common SNP on expression of a gene at a molecular level does not explain how it affects disease risk. Any of scenarios A, B or C could still pertain, and even scenario D is not ruled out by such findings. Indeed, it is not even clear what kind of molecular-level effect we should expect to explain a tiny odds ratio. Should we expect a small effect at the molecular level, or a big effect at the molecular level that translates to a small effect at the organismal level? Or a big effect at the organismal level, but only in combination with other genetic or environmental insults?

That leaves something of a Catch-22 situation for researchers looking for functional effects of SNPs at the biological level – too small an effect and it will never be detected in messy biological experiments; too big and it will have a rather glaring discrepancy with the epidemiological odds ratio. In the end, it may prove impossible to definitively investigate such small individual epidemiological effects at the biological level, whether from genetic or environmental factors.

This doesn’t mean individual GWAS signals are not useful, of course – they certainly point to loci of interest for further study and have successfully implicated previously unknown biochemical pathways in various diseases (e.g., autophagy in Crohn’s disease). It does mean, however, that the interpretation of individual SNP associations may remain a bit vague.

On the other hand, while the biological effect of any single SNP in isolation may be small, their aggregate effect should be large, at least if the model of disease being caused by a polygenic load of such common risk alleles is correct. Indeed, even if the burden of common alleles is not by itself sufficient to cause disease (e.g., in a scenario where they act collectively as a polygenic modifier of rare mutations, which I consider the most likely scenario), they may still have biological effects in aggregate on relevant traits.

There is now an ever-growing number of studies taking that approach, correlating polygenic scores of risk for various diseases (based on aggregate SNP burden) with a range of biological phenotypes. Whether this approach will really help reveal underlying pathogenic mechanisms remains to be seen. More on that in a later post.
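For concreteness, a polygenic score of this kind is typically just a weighted count of risk alleles, with each SNP weighted by the log of its odds ratio. A minimal sketch (the SNPs, odds ratios and genotype here are entirely hypothetical):

```python
import math

# Per-SNP odds ratios from a (hypothetical) GWAS, and one person's
# genotype as risk-allele counts (0, 1 or 2 copies at each SNP):
snp_odds_ratios = [1.10, 1.05, 1.15, 0.95, 1.08]
genotype = [2, 1, 0, 2, 1]

# Standard additive polygenic score: allele counts weighted by log(OR).
weights = [math.log(or_) for or_ in snp_odds_ratios]
score = sum(g * w for g, w in zip(genotype, weights))
print(round(score, 3))
```

Real scores sum over thousands to millions of SNPs rather than five, but the additive logic is the same, which is itself worth noting given the earlier point that additivity at this level need not reflect additivity of the underlying biology.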

With thanks to John McGrath for helpful comments and edits.

*The usual way around this is to model the effects of a SNP on the liability scale, rather than the observed scale of risk. This is based on the idea that underlying the observed discontinuous distribution of a disease is a normally distributed burden of liability, which effectively remains latent until some threshold of burden is passed, in which case disease results. As a mathematical model to describe risk across the population this works reasonably well, given a host of assumptions. It is a mistake, however, in my mind, to think that the model reflects pathogenic mechanisms in individuals.
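To put illustrative numbers on the liability-threshold model (these figures are my own, using the 1% prevalence and the example SNP from above):

```python
from statistics import NormalDist

norm = NormalDist()
prevalence = 0.01
threshold = norm.inv_cdf(1 - prevalence)      # ~2.33 SD above mean liability

# Liability shift that would raise an individual's risk from 1% to 1.167%
# (the odds-ratio-as-risk-multiplier reading of our example SNP):
risk_with_allele = 0.01167
shift = threshold - norm.inv_cdf(1 - risk_with_allele)
print(f"threshold: {threshold:.2f} SD; liability shift: {shift:.3f} SD")
```

On the liability scale, then, a typical GWAS hit corresponds to a shift of a few hundredths of a standard deviation, which may be part of why such effects are so hard to pin down biologically.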

Tuesday, July 28, 2015

The Genetics of Neurodevelopmental Disorders

The Genetics of Neurodevelopmental Disorders is a new book that will be published by Wiley in 2015. It is due out in August (in Europe) and September (in the USA), and is available on Amazon here.

I had the pleasure of editing the book, which comprises 14 chapters from world-leading scientists and clinicians. Our aim is to provide a timely synthesis of this fast-moving field where so much exciting progress has been made in recent years. Below I have reproduced the Foreword from the book, which outlines the rationale for writing it and the conceptual principles on which it is based, as well as a summary of the topics covered (giving an overview of the state of the field in the process). There are also links to two chapters that are freely available. On behalf of all the authors, I hope the book will prove useful.

The term “neurodevelopmental disorders” is clinically defined in psychiatry as “a group of conditions with onset in the developmental period… characterized by developmental deficits that produce impairments of personal, social, academic, or occupational functioning” [DSM-5]. This term encompasses the clinical categories of intellectual disability (ID), developmental delay (DD), autism spectrum disorders (ASD), attention-deficit hyperactivity disorder (ADHD), speech and language disorders, specific learning disorders, tic disorders and others.

However, the term can be defined differently, not based on age of onset or clinical presentation, but by an etiological criterion, to mean disorders arising from aberrant neural development. This definition includes many forms of epilepsy (considered either as a distinct disorder or as a co-morbid symptom) as well as disorders like schizophrenia (SZ), which have later onset but which can still be traced back to neurodevelopmental origins. Though the symptoms of SZ itself typically arise only in late teens or early twenties, convergent evidence of epidemiological risk factors during fetal development and very early deficits apparent in longitudinal studies strongly indicate that SZ is a disorder of neural development, though its clinical consequences may remain latent for many years.

Collectively, severe neurodevelopmental disorders affect ~5% of the population (though exact numbers are almost impossible to obtain, due to changing diagnostic criteria and substantial co-morbidity between clinical categories). These disorders impact on the most fundamental aspects of human experience: cognition, language, social interaction, perception, mood, motor control, sense of self. They impair function, often severely, and restrict opportunities for sufferers, as well as placing a heavy burden on families and caregivers. As lifelong illnesses, they also give rise to a substantial economic burden, both in direct healthcare costs and indirect costs due to lost opportunity.

The treatments currently available for neurodevelopmental disorders are very limited and problematic. Intensive educational interventions may help ameliorate some cognitive or behavioural difficulties, such as those associated with ID or ASD, but to a limited extent and without addressing the underlying pathology. With respect to psychiatric symptoms, the mainstays of pharmacotherapy (antipsychotic medication, mood stabilizers, antidepressants and anxiolytics) all emerged between the 1940’s and 1960’s with almost no new drugs being developed since. Most of these treatments were discovered serendipitously, and their mechanisms of action remain poorly understood. In most cases, the existing treatments are only partially effective and can induce serious side effects. This is also true for the range of anticonvulsants, and, for all these drugs, it is typically impossible to predict from symptom profiles alone whether individual patients will benefit from a particular drug or possibly be harmed by it. These difficulties and the attendant poor outcomes for many patients arise from not knowing the causes of disease in particular patients and not understanding the underlying pathogenic mechanisms. Genetic research promises to address both these issues.

Neurodevelopmental disorders are predominantly genetic in origin and have often been thought of as falling into two groups. The first includes a very large number of individually rare syndromes with known genetic causes. Examples include Fragile X syndrome, Down syndrome, Rett syndrome and Angelman syndrome but there are literally hundreds of others. Each of these is clearly caused by a single genetic lesion, sometimes involving an entire chromosome or a section of chromosome, sometimes affecting a single gene. Most are characterised by ID, but many also show high rates of epilepsy, ASD or other neuropsychiatric symptoms.

The second group comprises idiopathic cases of ID, ASD, SZ or epilepsy – those with no currently known cause. Despite the lack of an identified genetic lesion, there is still very strong evidence of a genetic etiology across these categories. All of these conditions are highly heritable, showing high levels of twin concordance, much higher in monozygotic than in dizygotic twins, substantially increased risk to relatives and typically zero effect of a shared family environment, indicating strong genetic causation.

What has not been clear is whether these so-called “common disorders” are simply collections of rare genetic syndromes that we cannot yet discriminate, or whether they have a very different genetic architecture. The dominant paradigm in the field has held that the idiopathic, non-syndromic cases of common disorders like ASD or SZ reflect the extreme end of a continuum of risk across the population. This is based on a model involving the segregation of a very large number of genetic variants, each of small effect alone, which can, above a collective threshold of burden in individuals, result in frank disease.

Recent genetic discoveries are prompting a re-evaluation of this model, as well as casting doubt on the biological validity of clinical diagnostic categories. After decades of frustration, the genetic secrets of these conditions are finally yielding to new genomic microarray and sequencing technologies. These are revealing a growing list of rare, single mutations that confer high risk of ASD, ID, SZ or epilepsy, particularly epileptic encephalopathies.

These findings strongly reinforce a model of genetic heterogeneity, whereby common clinical categories do not represent singular biological entities, but rather are umbrella terms for a large number of distinct genetic conditions. These conditions are individually rare but collectively common. Strikingly, almost all of the identified mutations are associated with variable clinical manifestations, conferring risk across traditional diagnostic boundaries. These findings fit with large-scale epidemiological studies that also show shared risk across these disorders. Thus, while current diagnostic categories may reflect more or less distinct clinical states or outcomes, they do not reflect distinct etiologies.

The “genetics of autism” is thus neither singular nor separable from the “genetics of intellectual disability”, the “genetics of schizophrenia” or the “genetics of epilepsy”. The more general term of “developmental brain dysfunction” has been proposed to encompass disorders arising from altered neural development, which can manifest clinically in diverse ways. This book is about the genetics of developmental brain dysfunction.

A lot can go wrong in the development of a human brain. The right numbers of hundreds of distinct types of nerve cells have to be generated in the right places, they have to migrate to form highly organised structures, and they must extend nerve fibres, which navigate their way through the brain to ultimately find and connect with their appropriate partners, avoiding wrong turns and illicit interactions. Once they find their partners they must form synapses, the incredibly complex and diverse cellular structures that mediate communication between nerve cells. These synapses are also highly dynamic, responding to patterns of activity by strengthening or weakening the connection.

The instructions to carry out these processes are encoded in the genome of the developing embryo. Each of these aspects of neural development requires the concerted action of the protein products of thousands of distinct genes. Mutations in any one of them (or sometimes in several at the same time) can lead to developmental brain dysfunction.

The identification of numerous causal mutations has focused attention on the roles of the genes affected, with a number of prominent classes of neurodevelopmental genes emerging. These include genes involved in early brain patterning and proliferation, those mediating later events of cell migration and axon guidance, and a major class involved in synapse formation and subsequent activity-dependent synaptic refinement, pruning and plasticity. Also highlighted are a number of biochemical pathways and networks that appear especially sensitive to perturbation.

Genetic discoveries thus allow an alternate means to classify disorders, based on the underlying neurodevelopmental processes affected. This provides more etiologically valid and arguably more biologically coherent categories than those based on clinical outcome. For individual patients, the application of microarray and sequencing technologies is already changing clinical practice in diagnosis and management of neurodevelopmental disorders. This will only increase as more and more pathogenic mutations are identified.

Such discoveries also provide entry points to enable the elucidation of pathogenic mechanisms, where exciting progress is being made using cellular and animal models. For any given mutation, this involves defining the defects at a cellular level (in the right cells), and working out how such defects propagate to the levels of neural circuits and systems, ultimately producing pathophysiological states that underlie neuropsychiatric symptoms. Definition of these pathways will hopefully lead to a detailed enough understanding of the molecular or circuit-level defects to rationally devise new therapeutics.

The elucidation of the heterogeneous genetic and neurobiological bases of neurodevelopmental disorders should thus enable a much more personalised approach to diagnosis and treatment for individual patients, and a shift in clinical care for these disorders from an approach based on superficial symptoms and generic medicines, to one based on detailed knowledge of specific causes and mechanisms.

The book is organised into several sections:

Chapters 1-6 cover broad conceptual issues relevant to neurodevelopmental disorders in general. These are informed by recent advances in genomic technologies, which have transformed our view of the genetic architecture of both rare and so-called “common” neurodevelopmental disorders. These chapters will consider the genetic heterogeneity of clinical categories like ASD or SZ, the relative importance of different types of mutations (common vs rare; single-gene vs large deletions or duplications; inherited vs de novo), etiological overlap between clinical categories and complex interactions between two or more mutations or between genetic and environmental factors.     

A preprint of Chapter 1, by me, on The Genetic Architecture of Neurodevelopmental Disorders, is available here.

Chapters 7-9 present our current understanding of several different types of disorder, grouped by the neurodevelopmental process impacted. Consideration of disorders from this angle provides a more rational and biologically valid approach than consideration from the point of view of clinical symptoms, which can be arrived at through various routes.

Chapters 10-12 deal with the elucidation of pathogenic mechanisms, following genetic discoveries. They include chapters on cellular models (using induced pluripotent stem cells derived from patients) and animal models (recapitulating pathogenic mutations in mice), which are revealing the routes of pathogenesis, from defects in diverse cellular neurodevelopmental processes to resultant alterations in neural circuits and brain systems, which ultimately impinge on behaviour. The manifestation of these defects in humans also depends on processes of learning and experience-dependent development that proceed for many years after birth. Taking this aspect of development seriously is essential, as it represents a critical period during which symptoms can be exacerbated if neglected or potentially improved by intensive interventions.

Chapters 13-14 consider the clinical implications of recent discoveries and of the general principles described in earlier chapters. Foremost among these is the recognition of extreme genetic heterogeneity, meaning that understanding what is going on in any particular patient requires knowledge of the specific underlying genetic cause. The dramatic reductions in cost for whole-genome sequencing mean such diagnoses will become far easier to make, with important implications for clinical genetic practice (including preimplantation or prenatal screening or diagnosis). Finally, the study of cellular and animal models of specific disorders is already suggesting potential therapeutic avenues for some conditions. These advances illustrate a general principle – to treat these conditions we need to identify and understand the underlying biology and design therapies to treat the specific cause in each patient and not just the generic symptoms.

A preprint of Chapter 13, by Gholson Lyon and Jason O'Rawe, on Human genetics and clinical aspects of neurodevelopmental disorders is available here.

The full Table of Contents is shown below:

           Edited by Kevin J. Mitchell

1.     The Genetic Architecture of Neurodevelopmental Disorders
Kevin J. Mitchell

2.     Overlapping Etiology of Neurodevelopmental Disorders
Eric Kelleher and Aiden Corvin

3.     The Mutational Spectrum of Neurodevelopmental Disorders
Nancy D. Merner, Patrick A. Dion and Guy A. Rouleau

4.     The Role of Genetic Interactions in Neurodevelopmental Disorders
Jason H. Moore and Kevin J. Mitchell

5.     Developmental Instability, Mutation Load, and Neurodevelopmental Disorders
Ronald A. Yeo and Steven W. Gangestad

6.     Environmental Factors and Gene-Environment Interactions
John McGrath

7.     The Genetics of Brain Malformations
M. Chiara Manzini and Christopher A. Walsh

8.     Disorders of Axon Guidance
Heike Blockus and Alain Chédotal

9.     Synaptic Disorders
Catalina Betancur and Kevin J. Mitchell

10.  Human Stem Cell Models of Neurodevelopmental Disorders
Peter Kirwan and Frederick J. Livesey

11.  Animal Models for Neurodevelopmental Disorders
Hala Harony-Nicolas and Joseph D. Buxbaum

12.  Cascading Genetic and Environmental Effects on Development: Implications for Intervention
Esha Massand and Annette Karmiloff-Smith

13.  Human Genetics and Clinical Aspects of Neurodevelopmental Disorders
Gholson J. Lyon and Jason O’Rawe

14.  Progress Toward Therapies and Interventions for Neurodevelopmental Disorders
Ayokunmi Ajetunmobi and Daniela Tropea

Thursday, April 30, 2015

Genetics in Modern Medicine – the Future is Now

The Human Genome Project was founded on the premise that it would unlock the secrets of disease and lead to new cures for many disorders. While the new cures have mostly yet to materialise, the secrets of disease are indeed being revealed, in ways that will transform medicine over the coming years. Both our knowledge of the genetic causes of disease and our ability to test for those causes have increased exponentially in recent years. These advances will place genetic testing at the front line of diagnostics, not just for the relatively small number of already well-known inherited disorders, but for an ever-widening array of conditions, both rare and common.

The lifetime prevalence of rare disorders in European populations is estimated at 6-8% of the population (National Rare Disease Plan for Ireland, 2014-2018). Over 6,000 distinct genetic disorders are already defined and more are being discovered at an increasing pace. For many patients with such disorders, their experience with the health system involves a long and frustrating diagnostic odyssey. They are typically seen by various specialists for various symptoms, but the connections between them are not always recognised. A referral for genetic testing may be made eventually, but usually as a last resort rather than a first option.

In a growing proportion of such cases, genetic testing can reveal the underlying cause of the condition, bringing certainty and insight to the diagnosis. While specific medications may not exist that target each condition, a genetic diagnosis can often provide useful predictions of prognosis and treatment responsiveness. This is especially true for the hundreds of metabolic disorders, which may be treatable by dietary interventions or supplements.

But even in cases where there are no direct medical implications, just receiving a specific diagnosis can be highly beneficial in helping patients and their families cope with the situation. In addition, many international support groups have arisen relating to specific disorders, or for rare diseases in general, such as NORD (U.S.), GRDO (Ireland) and Rare Disease UK. These organisations are helping patients, parents and clinicians share information, compare experiences and improve outcomes. Genetic information can also inform future reproductive decisions, including possibilities such as pre-implantation genetic screening.

Rare mutations can cause common disorders

The effects of genetic mutations are not restricted to what we typically think of as rare disorders, however. Discoveries over the last several years are illustrating their central role in much more common disorders, such as epilepsy, autism, schizophrenia, Alzheimer’s and Parkinson’s disease, many cancers and other conditions. Indeed, many of those diagnostic categories may in fact be umbrella terms for a multiplicity of rare disorders that manifest with similar symptoms.

For neuropsychiatric conditions, it has long been known that such disorders are highly heritable, but it had not been possible to identify causal genes. That has changed, with the development of new DNA sequencing technologies, yielding insights that overturn our conception of such disorders. Rather than reflecting a single entity, broad clinical categories like autism or epilepsy obscure an extreme diversity of underlying conditions. Each of these conditions may be quite rare but there are so many of them that manifest in similar ways that collectively they result in highly prevalent disorders. Genetics now provides the tools to distinguish them.

The causal mutations in patients with these conditions can disrupt single genes or can delete or duplicate small sections of chromosomes, affecting multiple genes at once. For very severe cases, the mutations will often have arisen de novo, in the generation of egg or, more commonly, sperm cells. But others are inherited, often from parents who are clinically unaffected, despite carrying the mutation. This highlights the complexity in relating genotypes to phenotypes – the clinical presentation of such mutations is quite variable and often depends on other genetic or environmental factors. Nevertheless, in a patient showing symptoms, the identification of a major mutation can reveal important information as to the primary cause.

For example, for patients with a diagnosis of autism – a diagnosis based on symptoms alone – genetic testing for specific conditions like Fragile X syndrome or Rett syndrome has been in place for some time. This is now being expanded to include testing for a growing number of chromosomal disorders or single-gene mutations, which collectively can now explain ~15% of cases – a huge increase from just a few years ago. This percentage is growing all the time as causal mutations in new genes are identified (reaching 20-25% in recent studies). The successes for autism are likely to be duplicated for other conditions as the number of sequenced patient genomes increases.

Genome sequencing now an affordable front-line option

The pace of technological change in this field is simply staggering. We are moving from a position of being able to test a few specific genes implicated in any particular disorder to one where it will be cheaper and faster, as well as more informative, to sequence the patient’s entire genome. It took thousands of researchers over ten years to sequence the reference Human Genome, at a total cost of about $3,000,000,000. Today, a human genome can be sequenced for under $2,000, in about a day, maybe two.

Those sequencing costs and times are still falling as new technologies are developed and economies of scale brought to bear. This brings genome sequencing into the cost range of many blood tests, radiological scans, or other investigative procedures and suggests it may soon become a front-line test for many patients with idiopathic disease. Indeed, it may become cheaper for doctors to order a genome sequence than to spend any of their own time wondering about whether to order it.

But genomic data are only useful if someone can interpret them, a far greater task than simply checking for the presence of a mutation in a specific gene. As it happens, each of us carries a couple hundred mutations in our genome that seriously impact on gene function. Most of these do not cause disease, however, and it is therefore a challenge to recognise a pathogenic mutation amongst this background burden of mutations we all carry. That job will be made easier as genetic information becomes available for more and more patients.

A national strategy for genetic services

The enormous potential benefits of such information have been recognised in several countries, most recently in the UK where the NHS has launched a project to sequence 100,000 genomes, including those of thousands of patients with diverse disorders. The genetic heritage of each population is different, however, with some pathogenic mutations at much higher frequencies in specific populations, as with mutations causing cystic fibrosis in Ireland. Characterising the genetic heritage of the Irish population is thus an important goal as a necessary foundation for clinical genetic testing. 

The health and economic benefits of this genetic revolution will only be realised if there is adequate provision and funding of genetic testing and genetic counselling services. In Ireland we currently lag far behind most other developed countries in the provision of these services, a situation exacerbated by the recent decision to downgrade what was the National Centre for Medical Genetics at Our Lady’s Hospital in Crumlin to a department within the hospital. Quite the opposite is needed: if the health service in Ireland is to keep pace with international developments and provide the best care for patients, the role of genetics services will have to be greatly expanded in the future.

[This piece was written for "The Consultant" - the magazine of the Irish Consultants Association and appears in the Spring 2015 issue. It is reproduced here with their consent.]

Monday, November 24, 2014

Top-down causation and the emergence of agency

There is a paradox at the heart of modern neuroscience. As we succeed in explaining more and more cognitive operations in terms of patterns of electrical activity of specific neural circuits, it seems we move ever farther from bridging the gap between the physical and the mental. Indeed, each advance seems to further relegate mental activity to the status of epiphenomenon – something that emerges from the physical activity of the brain but that plays no part in controlling it. It seems difficult to reconcile the reductionist, reverse-engineering approach to brain function with the idea that we human beings have thoughts, desires, goals and beliefs that influence our actions. If actions are driven by the physical flow of ions through networks of neurons, then is there any room or even any need for psychological explanations of behaviour?

How vs Why
To me, that depends on what level of explanation is being sought. If you want to understand how an organism behaves, it is perfectly possible to describe the mechanisms by which it processes sensory inputs, infers a model of the outside world, integrates that information with its current state, weights a variety of options for actions based on past experience and predicted consequences, inhibits all but one of those options, conveys commands to the motor system and executes the action. If you fill in the details of each of those steps, that might seem to be a complete explanation of the causal mechanisms of behaviour.

If, on the other hand, you want to know why it behaves a certain way, then an explanation at the level of neural circuits (and ultimately at the level of molecules, atoms and sub-atomic particles) is missing something. It’s missing meaning and purpose. Those are not physical things but they can still have causal power in physical systems.

Why are why questions taboo?
Aristotle articulated a theory of causality, which defined four causes or types of explanation for how natural objects or systems (including living organisms) behave. The material cause deals with the physical identity of the components of a system – what it is made of. On a more abstract level, the formal cause deals with the form or organisation of those components. The efficient cause concerns the forces outside the object that induce some change. And – finally – the final cause refers to the end or intended purpose of the thing. He saw these as complementary and equally valid perspectives that can be taken to provide explanations of natural phenomena.

However, Francis Bacon, the father of the scientific method, argued that scientists should concern themselves only with material and efficient causes in nature – also known as matter and motion. Formal and final causes he consigned to Metaphysics, or what he called “magic”! Those attitudes remain prevalent among scientists today, and for good reason – that focus has ensured the phenomenal success of reductionist approaches that study matter and motion and deduce mechanism.

Scientists are trained to be suspicious of “why questions” – indeed, they are usually told explicitly that science cannot answer such questions and shouldn’t try. And for most things in nature, that is an apt admonition – the warning is really against anthropomorphising, or ascribing human motives to inanimate objects, like single cells or molecules, or even to organisms with less complicated nervous systems and, presumably, less sophisticated inner mental lives. Ironically, though, some people seem to think we shouldn’t even anthropomorphise humans!

Causes of behaviour can be described both at the level of mechanisms and at the level of reasons. There is no conflict between those two levels of explanation nor is one privileged over the other – both are active at the same time. Discussion of meaning does not imply some mystical or supernatural force that over-rides physical causation. It’s not that non-physical stuff pushes physical stuff around in some dualist dance. (After all, “non-physical stuff” is a contradiction in terms). It’s that the higher-order organisation of physical stuff – which has both informational content and meaning for the organism – constrains and directs how physical stuff moves, because it is directed towards a purpose.

Purpose is incorporated in artificial things by design – the washing machine that is currently annoying me behaves the way it does because it is designed to do so (though it could probably have been designed to be quieter). I could explain how it works in purely physical terms relating to the activity and interactions of all its components, but the reason it behaves that way would be missing from such a description – the components are arranged the way they are so that the machine can carry out its designed function. In living things, purpose is not designed but is cumulatively incorporated in hindsight by natural selection. The over-arching goals of survival and reproduction, and the subsidiary goals of feeding, mating, avoiding predators, nurturing young, etc., come pre-wired in the system through millions of years of evolution. 

Now, there’s a big difference between saying higher-order design principles and evolutionary imperatives constrain the arrangements of neural systems over long timeframes and claiming that top-down meaning directs the movements of molecules on a moment-to-moment basis. Most bottom-up reductionists would admit the former but challenge the latter. How can something abstract like meaning push molecules around?

Determinism, randomness and causal slack
The whole premise of neuroscientific materialism is that all of the activities of the mind emerge from the actions and interactions of the physical components of the brain – and nothing else. If you were transported, Star Trek-style, so that all of your molecules and atoms were precisely recreated somewhere else, the resultant being would be you – it would have all the knowledge and memories, the personality traits and psychological characteristics you have. In short, precisely duplicating your brain down to the last physical detail would duplicate your mind. All those immaterial things that make your mind yours must be encoded in the physical arrangement of molecules in your brain right at this moment, as you read this.

To some (see examples below, in footnote), this implies a kind of neural determinism. The idea is that, given a certain arrangement of atoms in your brain right at this second, the laws of physics that control how such particles interact (the strong and weak nuclear forces and the gravitational and electromagnetic forces), will lead, inevitably, to a specific subsequent state of the brain. In this view, it doesn’t matter what the arrangements of atoms mean, the individual atoms will behave how they will behave regardless.

To me, this deterministic model of the brain falls at the first hurdle, for one simple reason – we know that the universe is not deterministic. If it were, then everything that happened since the Big Bang and everything that will happen in the future would have been predestined by the specific arrangements and states of all the molecules in the universe at that moment. Thankfully, the universe doesn’t work that way – there is substantial randomness at all levels, from quantum uncertainty to thermal fluctuations to emergent noise in complex systems, such as living organisms. I don’t mean just that things are so complex or chaotic that they are unpredictable in practice – that is a statement about us, not about the world. I am referring to the true randomness that demonstrably exists in the universe, which makes nature essentially non-deterministic.

Now, if you are looking for something to rescue free will from determinism, randomness by itself does not do the job – after all, random “choices” are hardly freely willed. But that randomness, that lack of determinacy, does introduce some room, some causal slack, for top-down forces to causally influence the outcome. It means that the next lower-level state of all of the components of your brain (which will entail your next action) is not completely determined merely by the individual states of all the molecular and atomic components of your brain right at this second. There is therefore room for the higher-order arrangements of the components to also have causal power, precisely because those arrangements represent things (percepts, beliefs, goals) – they have meaning that is not captured in lower-order descriptions.

Information and Meaning
In information theory, a message (a string or sequence of digits, letters, beeps, atoms, anything at all really) has a quantifiable amount of information proportional to how unlikely that particular arrangement is. So, there’s more information in knowing that a roll of a six-sided die ended up a four than in knowing that a flip of a coin ended up heads. That’s important for signal transmission because it determines how compressible a message is and how efficiently it can be encoded and transmitted, especially under imperfect or noisy conditions.

Interestingly, that measure is analogous to the thermodynamic property of entropy, which can be thought of as an inverse measure of how much order there is in a system. This reflects how likely it is to be in the state that it’s in, relative to the total number of such states that it could have been in (the coin could only have been in two states, while the die could have been in six). In physical terms, the entropy of a gas, for example, corresponds to how many different organisations or microstates of its molecules would correspond to the same macrostate, as characterised by specific temperature and pressure.
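
The coin-and-die comparison above can be made concrete with a few lines of Python. This is just an illustrative calculation of Shannon surprisal – nothing here is specific to the brain:

```python
import math

def surprisal_bits(p):
    """Shannon information (surprisal) of an outcome with probability p, in bits."""
    return -math.log2(p)

# A fair coin has two equally likely states; a fair die has six.
coin_flip = surprisal_bits(1 / 2)   # learning that the coin came up heads
die_roll = surprisal_bits(1 / 6)    # learning that the die came up four

print(coin_flip)  # 1.0 bit
print(die_roll)   # ~2.585 bits - the less likely outcome carries more information
```

The same function also recovers the entropy connection discussed below: averaging the surprisal over all possible outcomes of a source gives its entropy.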

Actually, this analogy is not merely metaphorical – it is literally true that information and entropy measure the same thing. That is because information can’t just exist by itself in some ethereal sense – it has to be instantiated in the physical arrangement of some substrate. Landauer recognised that “any information that has a physical representation must somehow be embedded in the statistical mechanical degrees of freedom of a physical system”.

However, “entropy only takes into account the probability of observing a specific event, so the information it encapsulates is information about the underlying probability distribution, not the meaning of the events themselves.” In fact, the information theory sense of information is not concerned at all with the semantic content of the message. For sentences in a language or for mathematical expressions, for example, information theory doesn’t care if the string is well-formed or whether it is true or not.

So, the string: “your mother was a hamster” has the same information content as its anagram “warmth or easy-to-use harm”, but only the former has semantic content – i.e., it means something. However, that meaning is not solely inherent in the string itself – it relies on the receiver’s knowledge of the language and their resultant ability to interpret what the individual words mean, what the phrase means and, further, to be aware that it is intended as an insult. The string only means something in the context of that knowledge.
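
As it happens, the two strings really are anagrams, and the point can be checked directly. The sketch below (purely illustrative) also computes the first-order letter entropy of each, confirming that the two strings are indistinguishable at this level of description, even though only one of them means anything:

```python
import math
from collections import Counter

def letter_counts(s):
    """Multiset of letters, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in s.lower() if ch.isalpha())

def entropy_bits(counts):
    """First-order Shannon entropy of a letter distribution, in bits per letter."""
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in sorted(counts.values()))

insult = letter_counts("your mother was a hamster")
anagram = letter_counts("warmth or easy-to-use harm")

print(insult == anagram)                              # True - same letters, rearranged
print(entropy_bits(insult) == entropy_bits(anagram))  # True - identical information content
```

The semantic difference between the two strings lives entirely outside this calculation – in the reader, not the message.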

In the nervous system, information is physically carried in the arrangements of molecules at the cellular level and in the patterns of electrical activity of neurons. For sensory information, this pattern is imposed by physical objects or forces from the environment (e.g., photons, sound waves, odor molecules) impinging on sensory neurons and directly inducing molecular changes and neuronal activity. The resultant patterns of activity thus form a representation of something in the world and therefore have information – order is enforced on the system, driving one particular pattern of activity from an enormous possible set of microstates. This is true not just for information about sensory stimuli but also for representations of internal states, emotions, goals, actions, etc. All of these are physically encoded in patterns of nerve cell activity.

These patterns carry information in different ways: in gradients of electrical potential in dendrites (an analog signal), in the firing of action potentials (a digital signal), in the temporal sequence of spikes from individual neurons (a temporally integrated signal), in the spatial patterns of coincident firing across an ensemble (a spatially integrated signal), and even in the trajectory of a network through state-space over some period of time (a spatiotemporally integrated signal!). The operations that carry out the spatial and temporal integration occur in the process of transmitting the information from one set of neurons to another. It is thus the higher-order patterns that encode information rather than the lower-order details of the arrangements of all the molecules in the relevant neurons at any given time-point.
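
As a toy illustration of the difference between a temporally and a spatially integrated signal, consider counting the spikes of a single neuron within a time window versus counting how many neurons in an ensemble fire coincidentally. The spike times below are invented purely for illustration:

```python
def temporal_code(spike_times, window=(0.0, 0.1)):
    """Temporally integrated signal: one neuron's spike count within a window."""
    return sum(window[0] <= t < window[1] for t in spike_times)

def spatial_code(ensemble_spikes, t, tol=0.005):
    """Spatially integrated signal: how many neurons fired together near time t."""
    return sum(any(abs(s - t) < tol for s in spikes) for spikes in ensemble_spikes)

neuron = [0.012, 0.034, 0.071, 0.140]                  # spike times (s) of one neuron
ensemble = [[0.010], [0.011, 0.090], [0.052], [0.0115]]  # spike times of four neurons

print(temporal_code(neuron))          # 3 spikes fall within the first 100 ms
print(spatial_code(ensemble, 0.011))  # 3 neurons fire within 5 ms of t = 11 ms
```

In both cases the downstream "reader" extracts a single number from a richer pattern – which is exactly the sense in which integration discards lower-level detail.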

But we’re not done yet. Just like that sentence about your mother (yeah, I went there), for that semantic content to mean anything to the organism it has to be interpreted, and that can only occur in the much broader context of everything the organism knows. (That’s why the French provocateur spoke in English to the stupid English knights, instead of saying “votre mère était un hamster”. Not much point insulting someone if they don’t know what it means).

The brain has two ways of representing information – one for transmission and one for storage. While information is transmitted in the flow of electrical activity in networks of neurons, as described above, it is stored at a biochemical and cellular level, through changes to the neural network, especially to the synaptic connections between neurons. Unlike a computer, the brain stores memory by changing its own hardware.

Electrical signals are transformed into chemical signals at synapses, where neurotransmitters are released by one neuron and detected by another, in turn inducing a change in the electrical activity of the receiving neuron. But synaptic transmission also induces biochemical changes, which can act as a short-term or a long-term record of activity. Those changes can alter the strength or dynamics of the synapse, so that the next time the presynaptic neuron fires an electrical signal, the output of the postsynaptic neuron will be different.

When such changes are implemented across a network of neurons, they can make some patterns of activity easier to activate (or reactivate) than others. This is thought to be the cellular basis of memory – not just of overt, conscious memories, but also the implicit, subconscious memories of all the patterns of activity that have happened in the brain. Because these patterns comprise representations of external stimuli and internal states, their history reflects the history of an organism’s experience.
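
A minimal sketch of this idea is the classic Hebbian rule ("cells that fire together wire together"): coincident pre- and post-synaptic activity strengthens a synapse, so repeated experience leaves a physical trace in the weights. This is a cartoon of the principle, not a model of any real circuit:

```python
def hebbian_update(weights, pre, post, lr=0.1):
    """Strengthen each synapse in proportion to coincident pre/post activity."""
    return [[w + lr * pre[j] * post[i] for j, w in enumerate(row)]
            for i, row in enumerate(weights)]

# Two postsynaptic neurons, each receiving three presynaptic inputs; start naive.
weights = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0]]
pre, post = [1, 0, 1], [1, 0]    # one particular pattern of co-activation

for _ in range(5):               # repeated experience leaves a physical trace
    weights = hebbian_update(weights, pre, post)

print(weights)  # only synapses between co-active neurons have been strengthened
```

After training, the stored pattern is easier to reactivate than any other – the network's "hardware" now embodies its history, as described above.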

So, each of our brains has been literally physically shaped by the events that have happened to us. That arrangement of weighted synaptic connections constitutes the physical instantiation of our past experience and accumulated knowledge and provides the context in which new information is interpreted.

But I think there are still a couple elements missing to really give significance to information. The first is salience – some things are more important for the organism to pay attention to at any given moment than others. The brain has systems to attribute salience to various stimuli, based on things like novelty, relevance to a current goal (food is more salient when you are hungry, for example), current threat sensitivity and recent experience (e.g., a loud noise is less salient if it has been preceded by several quieter ones).

The second is value – our brains assign positive or negative value to things, in a way that reflects our goals and our evolutionary imperatives. Painful things are bad; things that smell of bacteria are bad; things that taste of bitter/likely poisonous compounds are bad; social defeat is bad; missing Breaking Bad is bad. Food is good; unless you’re dieting in which case not eating is good; an opportunity to mate is (very) good; a pay raise is good; finally finishing a blogpost is good.

The value of these things is not intrinsic to them – it is a response of the organism, which reflects both evolutionary imperatives and current states and goals (i.e., purpose). This isn’t done by magic – salience and value are attributed by neuromodulatory systems that help set the responsiveness of other circuits to various types of stimuli. They effectively change the weights of synaptic connections and reconfigure neuronal networks, but they do it on the fly, like a sound engineer increasing or decreasing the volume through different channels.
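
The sound-engineer analogy can be sketched as multiplicative gain control: the same inputs, through the same wiring, produce a larger response when a neuromodulatory "volume knob" is turned up. All the numbers below are arbitrary:

```python
def circuit_response(inputs, weights, gain=1.0):
    """A weighted sum with multiplicative gain - neuromodulation as a volume knob."""
    return gain * sum(w * x for w, x in zip(weights, inputs))

inputs = [1.0, 1.0, 2.0]    # the same stimulus on both occasions
weights = [1.0, 2.0, 0.5]   # the same synaptic wiring on both occasions

baseline = circuit_response(inputs, weights)            # 4.0
aroused = circuit_response(inputs, weights, gain=2.5)   # 10.0 - same circuit, turned up
```

Nothing in the wiring changed between the two calls; the reconfiguration is done on the fly, which is the point of the analogy.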

Top-down control and the emergence of agency
The hierarchical, multi-level structure of the brain is the essential characteristic that allows this meaning to emerge and have causal power. Information from lower-level brain areas is successively integrated by higher-level areas, which eventually propose possible actions based on the expected value of the predicted outcomes. The whole point of this design is that higher levels do not care about the minutiae at lower levels. In fact, the connections between sets of neurons are often explicitly designed to act as filters, actively excluding information outside of a specific spatial or temporal frequency. Higher-level neurons extract symbolic, higher-order information inherent in the patterned, dynamic activity of the lower level (typically integrated over space and time) in a way that does not depend on the state of every atom or the position of every molecule or even the activity of every neuron at any given moment.

There may be infinite arrangements of all those components at the lower level that mean the same thing (that represent the same higher-order information) and that would give rise to the same response in the higher-level group of neurons. Another way to think about this is to assess causality in a counterfactual sense: instead of asking whether state A necessarily leads to state B, we can ask: if state A had been different, would state B still have arisen? If there are cases where that is true, then the full explanation of why state A leads to state B does not inhere solely in its lower-level properties. Note that this does not violate physical laws or conflict with them at all – it simply adds another level of causation that is required to explain why state A led to state B. The answer to that question lies in what state A means to the organism.
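
The multiple-realisability point can be made concrete with a toy readout that cares only about a population rate: of all the possible microstates of a small ensemble, a large fraction "mean" exactly the same thing to the next level. This is a purely illustrative enumeration:

```python
from itertools import product

def readout(spikes, threshold=5):
    """Higher-level 'meaning' of an ensemble state: did the population rate cross threshold?"""
    return sum(spikes) >= threshold

# Every possible microstate of a ten-neuron ensemble (each neuron spiking or silent).
microstates = list(product([0, 1], repeat=10))
same_meaning = [m for m in microstates if readout(m)]

print(len(microstates))    # 1024 distinct lower-level states
print(len(same_meaning))   # 638 of them drive the same higher-level response
```

Perturb any one of those 638 states into another of the 638 and the higher level is unmoved – the counterfactual test described above.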

To reiterate, the meaning of any pattern of neural activity is given not just by the information it carries but by the implications of that information for the organism. Those implications arise from the experiences of the individual, from the associations it has made, the contingencies it has learned from and the values it has assigned to past or predicted outcomes. This is what the brain is for – learning from past experience and abstracting the most general possible principles in order to assign value to predicted outcomes of various possible actions across the widest possible range of new situations.

This is how true agency can emerge. The organism escapes from a passive, deterministic stimulus-response mode and ceases to be an automaton. Instead, it becomes an active and autonomous entity. It chooses actions based on the meaning of the available information, for that organism, weighted by values based on its own experiences and its own goals and motives. In short, it ceases to be pushed around, offering no resistance to every causal force, and becomes a cause in its own right.

This kind of emergence doesn’t violate physical law. The system is still built of atoms and molecules and cells and circuits. And changes to those components will still affect how the system works. But that’s not all the system is. Complex, hierarchical and recursive systems that incorporate information and meaning and purpose produce astonishing and still-mysterious (but non-magical) emergent properties, like life, like consciousness, like will.

Just because it’s turtles all the way down, doesn’t mean it’s turtles all the way up.

Footnote: Here are some examples of prominent scientists and others who support the idea of a deterministic universe and who infer that free will is therefore an illusion (except Dennett and other compatibilists):

Stephen Hawking: "…the molecular basis of biology shows that biological processes are governed by the laws of physics and chemistry and therefore are as determined as the orbits of the planets. Recent experiments in neuroscience support the view that it is our physical brain, following the known laws of science, that determines our actions and not some agency that exists outside those laws…so it seems that we are no more than biological machines and that free will is just an illusion." (Hawking and Mlodinow, 2010, emphasis added)

Patrick Haggard: "As a neuroscientist, you've got to be a determinist. There are physical laws, which the electrical and chemical events in the brain obey. Under identical circumstances, you couldn't have done otherwise; there's no 'I' which can say 'I want to do otherwise'. It's richness of the action that you do make, acting smart rather than acting dumb, which is free will."

Sam Harris: "How can we be “free” as conscious agents if everything that we consciously intend is caused by events in our brain that we do not intend and of which we are entirely unaware?"

Jerry Coyne: "Your decisions result from molecular-based electrical impulses and chemical substances transmitted from one brain cell to another. These molecules must obey the laws of physics, so the outputs of our brain—our "choices"—are dictated by those laws."

Daniel Dennett: concedes that physical determinism is true but sees free will as compatible with it. This is a move whose logic I have never fully understood or found at all convincing, yet apparently some form of compatibilism is the majority view among philosophers these days.

Further reading:

Baumeister RF, Masicampo EJ, Vohs KD (2011) Do conscious thoughts cause behavior? Annu Rev Psychol 62:331-361.

Björn Brembs (2011) Towards a scientific concept of free will as a biological trait: spontaneous actions and decision-making in invertebrates. Proc Biol Sci 278(1707):930-939.

Bob Doyle (2010) Jamesian Free Will, the Two-Stage Model of William James. William James Studies 5:1-28.

Buschman TJ, Miller EK (2014) Goal-direction and top-down control. Philos Trans R Soc Lond B Biol Sci 369(1655).

Antonio Damasio (1994) Descartes' Error: Emotion, Reason, and the Human Brain. HarperCollins, New York.

George Ellis (2009) Top-Down Causation and the Human Brain. In: Murphy N, Ellis GFR, O'Connor T (eds) Downward Causation and the Neurobiology of Free Will. Springer-Verlag, Berlin Heidelberg.

Friston K (2010) The free-energy principle: a unified brain theory? Nat Rev Neurosci 11(2):127-138.

James Gleick (2011) The Information: A History, a Theory, a Flood.

Paul Glimcher (2005) Indeterminacy in brain and behavior. Annu Rev Psychol 56:25-56.

Douglas Hofstadter (1979) Gödel, Escher, Bach.

Douglas Hofstadter (2007) I Am a Strange Loop.

William James (1884) The Dilemma of Determinism.

Roger Sperry (1965) Mind, brain and humanist values. In: Platt JR (ed) New Views of the Nature of Man. University of Chicago Press, Chicago.

Roger Sperry (1991) In defense of mentalism and emergent interaction. Journal of Mind and Behavior 12:221-245.