Summary, discussion and future perspectives

Leukodystrophies are rare, inherited, neurological disorders with exclusive or predominant, primary involvement of the white matter of the central nervous system (CNS).1 Leukodystrophies are most prevalent in children and often result in severe neurological deficits and early death.2 While leukodystrophies were initially diagnosed primarily by pathology, since the introduction of MRI into medicine in the 1980s the first evidence for a leukodystrophy typically comes from MRI showing extensive, prominent signal abnormalities in the CNS white matter. Pathology confirmation is usually lacking.
MRI has a very high sensitivity for white matter abnormalities, but its specificity for underlying histopathologic changes is low.3,4 This low specificity sometimes causes problems in differentiating primary from secondary white matter involvement. Cortical neuronal degeneration leads to secondary white matter abnormalities due to Wallerian degeneration with loss of axons and myelin within the white matter.5 Especially in infantile onset neuronal disorders, in which the neuronal degeneration disrupts the normal myelination process, MRI may show prominent cerebral white matter abnormalities. Examples are infantile onset GM1 and GM2 gangliosidoses, which present with an MRI pattern suggestive of a leukodystrophy, whereas later onset variants show the MRI picture of a neuronal degenerative disorder, dominated by cerebral atrophy and only slight white matter signal abnormalities.2 The same is true for the early onset variants of SLC19A3- and ITPA-related encephalopathies, discussed in this thesis (chapter 3 and chapter 10). Because elucidation of the genetic defect has made clear that the white matter abnormalities are secondary to neuronal cell death, these disorders may not be regarded as leukodystrophies. However, because of the prominent brain white matter abnormalities, these MRIs were part of the database of leukoencephalopathies in Amsterdam, which was the source of the MRIs for this study. Most leukodystrophy experts include such cases in their scope of interest (see GLIA consortium consensus statement).6

Gene discovery for leukodystrophies
A molecular diagnosis is extremely important for patients and families, because it provides information on prognosis, options for family planning, and a basis for exploring potential treatment options. Patients have frequently faced, and often still face, a long diagnostic odyssey before reaching a definitive diagnosis. The rarity of these disorders and the lack of multiple multiplex families precluded traditional gene-discovery approaches or hampered their success. In 2010, 50% of the patients with a leukodystrophy were found to remain without a specific diagnosis.7
The advent of whole-exome sequencing (WES) in 2009 has created a paradigm shift in our approach to gene discovery for rare Mendelian disorders. Numerous genes discovered with WES are genes previously associated with a known Mendelian phenotype (and assigned an OMIM number), but now also found in association with other disease entities or with an expansion of the phenotypic spectrum.8 In a large international cohort study of 8,838 families with a Mendelian phenotype, 50% of the discovered genes were known genes associated with a known phenotype and 30% of the gene discoveries involved known genes associated with a novel phenotype or expansion of the phenotypic spectrum. The remaining 20% of the gene discoveries represented novel genes.8 The number of genes associated with a new or previously defined Mendelian disorder identified in the last five years using WES has been enormous. To illustrate this, from 2009 to 2011, the application of WES led to the identification of 30 novel genes associated with a Mendelian disorder and eight new clinical phenotypes linked to a known gene.9 In the following two years this number expanded to an additional 133 novel genes and 43 new phenotypes linked to known genes, and it is still growing exponentially.9 Specifically for leukodystrophies, a total of 18 additional novel genes associated with a leukodystrophy have been identified so far, and four new phenotypes have been added to known genes previously associated with a non-leukodystrophy phenotype.10-32 In addition, ten genes (eight novel, two new phenotypes) were identified for neuronal disorders with variable white matter abnormalities.29,33-40

The studies for this thesis started in 2011, shortly after the introduction of WES. The aim of this thesis was to identify the molecular cause of novel, MRI-defined leukodystrophies by WES analysis. MRI pattern recognition was used to identify and define novel leukodystrophies in our large database of unclassified patients. This thesis reflects the overall success of the application of WES in identifying the molecular cause of leukodystrophies and shows that the combined approach of phenotypic definition by MRI and WES is extremely powerful for gene discovery. In our studies 77% of the molecular diagnoses concerned genes already associated with a known OMIM-annotated disease, either expanding the phenotypic spectrum (4 genes: PLP1, SLC19A3, LAMA2 and EARS2) or adding a novel disease entity (3 genes: AARS2, HMBS and CTSA); the remaining 23% (2 genes) represented novel disorders associated with a gene not previously linked to a human disorder or clear phenotype (NUBPL and ITPA).
In this chapter, the results described in the previous chapters of this thesis are summarized and discussed in light of other published research.

Expansion of the phenotypic spectrum of a known disorder

The identification of the first gene associated with expansion of the phenotypic spectrum is described in chapter 2. In this chapter the search for the genetic cause of disease in 16 patients from 10 families with the X-linked disorder 'hypomyelination of early myelinating structures' (HEMS) is presented. This novel disorder was previously identified and defined from our database of unclassified white matter disorders by Steenweg et al.41 In HEMS, brain MRI shows an intriguing pattern of abnormalities: brain structures that normally myelinate early (e.g., brainstem, hilus of the dentate nucleus, posterior limb of the internal capsule, optic tracts and tracts to the pericentral cortex) are poorly myelinated, in contrast to structures that normally myelinate at a later developmental stage, which show better myelination.41,42 By using X-chromosome exome sequencing we found that all patients had unusual hemizygous variants of PLP1 located in exon 3B or deep in intron 3 outside the splice consensus site. In silico analysis of the variants predicted an effect on PLP1/DM20 alternative splicing, which we confirmed by demonstrating a decreased PLP1/DM20 ratio in patients' fibroblasts and transfected immature immortalized oligodendrocytic cells. Our findings expand the phenotypic spectrum of PLP1-associated disorders and also provide new insights into the possible pathomechanisms underlying the PLP1-related disorders.
PLP1 mutations are known to be associated with a broad continuum of neurological phenotypes ranging from connatal Pelizaeus-Merzbacher disease (PMD) with severe hypomyelination to pure spastic paraplegia type 2.43-45 An association is seen between the nature of the genetic alteration and the phenotype.46 It is hypothesized that the different genetic abnormalities result in distinct cellular defects depending on their effect and location in the PLP1 gene.44 PLP1 duplications may cause accumulation of the protein in the late endosome or lysosome, interfering with oligodendrocyte functions such as myelination.44,45 Missense variants result in misfolding of the protein and its accumulation in the endoplasmic reticulum, leading to activation of the unfolded protein response and inducing oligodendrocyte apoptosis.44,45 PLP1-specific missense variants that lead only to aberrant PLP1 and not DM20 are expected to induce less oligodendrocyte apoptosis than missense variants that impact both PLP1 and DM20.44 Null variants result in a lack of synthesis of the protein, leading to formation of compact myelin lacking PLP.45 Our results indicate that specific variants resulting in a decreased PLP1 to DM20 ratio should be added to the cellular defects that may underlie PLP1-related disorders and that, in those cases, specifically affect the developmental regulation of myelination.
In chapter 3, we identified a cohort of seven patients with an early-infantile lethal encephalopathy sharing the same novel MRI pattern characterized by extremely rapid subtotal brain degeneration, including the cerebral white matter, and elevated lactate on magnetic resonance spectroscopy.35 WES analysis performed in one patient revealed recessive variants in SLC19A3, encoding the second thiamine transporter, hTHTR2.47 All other patients were also found to have recessive SLC19A3 variants, confirming the association of mutated SLC19A3 with this phenotype.35 Brain pathology findings showed similarities to what is observed in patients with Leigh syndrome.48 Recessive variants in SLC19A3 were previously described in patients with three distinct clinical phenotypes: biotin-thiamine-responsive basal ganglia disease (BTBGD) (MIM 607483), Wernicke-like encephalopathy (MIM 607483) and a more generalized encephalopathy.49-53 Overall these patients have a later-onset disease, milder clinical signs and symptoms, and present with more limited lesions on MRI, evidently primarily involving the gray matter.49-53 Although 30 additional patients with different recessive variants in SLC19A3 (e.g. missense, nonsense, insertions and deletions, and intronic variants) have been reported since our report, no evident genotype-phenotype correlation could be established.54-62 In our cohort we did, however, observe a similar phenotype within families, suggesting that the genotype indeed co-determines the phenotype. Other influences, such as environmental, epigenetic and additional genetic factors acting on the availability of thiamine in the brain and decompensation of the oxidative phosphorylation system (OXPHOS), could account for the more severe phenotype seen in our patients.35
In chapter 4 we report on a family in which we identified with WES compound heterozygous variants in LAMA2, a known gene associated with congenital muscular dystrophy (now referred to as MDC1A, MIM 607855).63,64 Although the brain abnormalities fitted with the diagnosis of MDC1A, the neurological signs of predominantly distal weakness were atypical for the disease.65 Specific LAMA2 variants that result in partial laminin-α2 chain deficiency can be associated with an adult-onset muscular dystrophy.64 However, these patients mainly have weakness of the proximal muscles.64 Immunohistochemical staining of muscle using an anti-α2-chain antibody did not show (partial) deficiency of laminin-α2 in our patients, although we cannot exclude the possibility that we would have detected partial deficiency of laminin-α2 using a more sensitive antibody. This study is an example of phenotypic expansion of the disease spectrum as a result of WES findings.
In chapter 5 we report a patient with a homozygous in-frame deletion in EARS2, encoding glutamyl-tRNA synthetase and associated with the leukodystrophy 'Leukoencephalopathy with thalamus and brainstem involvement and high lactate' (LTBL). This disorder was first described by our group in 2012.18,66 Patients have one episode of deterioration with a typical MRI pattern consisting of signal abnormalities in the thalamus, brainstem, and deep cerebral white matter that improve over time in the majority of the cases. A small group of patients have an earlier, neonatal onset with stabilization, but no improvement.18,66 The patient reported in chapter 5 had an antenatal onset of the disease; his MRI showed absence of the thalami with the configuration of a developmental anomaly, without evidence of a lesion. This finding has impact on the interpretation of previous MRI findings in patients with LTBL; in several reported patients the posterior part and the rostrum of the corpus callosum were absent and this was ascribed to a developmental anomaly.18 We now hypothesize that this feature is more likely the result of an early injury rather than a developmental anomaly per se. This interpretation implies that the phenotype associated with EARS2 variants can be associated with more than one episode of deterioration. So far no genotype-phenotype association has been established for LTBL. In fact, in the group published by Steenweg et al., two brothers with the same genotype had opposing phenotypes, one belonging to the group that shows improvement, and the other one to the early-onset group without improvement.18

Novel disease entity associated with a known gene

In chapter 6 two unrelated patients with similar white matter abnormalities on MRI were independently subjected to WES. Using MRI pattern recognition an additional cohort of 11 patients was subsequently selected as potential candidates. A total of six patients (including the two probands) were found to have recessive variants in the gene AARS2, encoding the mitochondrial alanyl-tRNA synthetase (mtAlaRS) enzyme. The main clinical phenotype of these patients was a childhood to adult-onset neurological deterioration, with ovarian failure in all females, and on MRI a leukoencephalopathy with involvement of left-right connections, descending tracts, and cerebellar atrophy (described as '(ovario)leukodystrophy').25 So far, variants in AARS2 had been found only in a small number of patients with an early-onset fatal cardiomyopathy or myopathy (MIM 614096), also all identified recently with WES.67-70 To elucidate the striking differences between the two disease entities we investigated the effect of the variants present in one of our patients (p.Phe22Cys and p.Val500*) as well as variants present in cardiomyopathy patients (p.Leu55Arg) on OXPHOS functioning in a Saccharomyces cerevisiae yeast model.25 The other cardiomyopathy-associated variant (p.Arg592Trp) could not be investigated because this residue is not conserved in yeast.25 The cardinal difference observed was that the cardiomyopathy-associated missense variant behaved like a null allele, while the (ovario)leukodystrophy-associated missense variant was only deleterious in stress conditions, suggesting a less severe effect of the (ovario)leukodystrophy variant on OXPHOS.25 Besides a differential effect of the variants on OXPHOS, it was hypothesized that the variant located in the editing domain, leading to mistranslation, would result in cardiomyopathy.
However, recently, additional evidence for the first hypothesis was provided by Euro et al., who used a full-length mtAlaRS homology model to perform a structural analysis of all published AARS2 variants to predict their effects on synthetase structure and function.68 A differential effect of the variants on aminoacylation activity was found between the two groups. The cardiomyopathy phenotype was associated with variants that were predicted to result in a severely compromised aminoacylation activity, while the combined variants in the (ovario)leukodystrophy patients were predicted to retain partial activity.68 The tissue-specific difference might be explained by the vulnerability of the heart to severely reduced mtAlaRS activity in early life, resulting in a severe cardiomyopathy and premature death before any brain abnormalities can be observed.68 If this hypothesis is correct, one would expect that infants who survive the infantile period would develop white matter abnormalities. However, the cardiomyopathy phenotype has thus far only been associated with a single founder variant, p.Arg592Trp, in homozygous or compound heterozygous state.68 Another hypothesis could be that this specific variant acts on a possible noncanonical function of mtAlaRS (discussed later) that could have a differential effect on heart and brain.
In chapter 7 and chapter 8 two different leukodystrophies are described, each representing a different disease entity with a different inheritance mode from that of the known disorder associated with the respective gene. It has now been recognized that for some genes both recessive and dominant variants may be associated with a disease, either with a rather similar clinical phenotype that is milder for the dominant variants than for the recessive variants, or with completely different disease entities. For example, recessive GLIALCAM variants are found in patients with classical megalencephalic leukoencephalopathy with subcortical cysts (MLC, MIM 604004) with an early-onset macrocephaly and delayed-onset neurological deterioration, while patients with dominant GLIALCAM variants have benign familial macrocephaly or a mild clinical phenotype with initially a leukoencephalopathy and subsequent normalization of the white matter abnormalities on MRI (MIM 613926).71 Another example is Bethlem myopathy (MIM 158810), a slowly progressive mild myopathy caused by dominant COL6A1, COL6A2 or COL6A3 variants, while recessive COL6A1, COL6A2 or COL6A3 variants are associated with Ullrich congenital muscular dystrophy (MIM 254090), a severe myopathy presenting directly after birth.72 Recently two large unrelated families with adult-onset dominant axonal polyneuropathy were described with a dominant variant in the gene NAGLU, while recessive NAGLU variants are associated with a severe lysosomal childhood-onset disease, mucopolysaccharidosis type IIIB (MIM 252920).73
In chapter 7 we report three siblings from one family with bi-allelic variants in the gene HMBS, encoding the enzyme hydroxymethylbilane synthase (previously called porphobilinogen deaminase (PBGD)), the third enzyme in heme biosynthesis.74 Autosomal dominant variants are known to cause acute intermittent porphyria (AIP, MIM 176000).74 Our patients do not have the acute AIP-related symptoms seen in patients with heterozygous HMBS variants, but present with a childhood-onset, very slowly progressive neurological disorder with a distinct leukoencephalopathy on MRI. In our patients, HMBS enzyme activity and porphyrin precursors in body fluids were in the range of heterozygous HMBS variant carriers. In the literature five other patients with bi-allelic HMBS variants have been reported. These patients also had neurological symptoms, but the onset of the disease was earlier, the course more severe and the outcome less favorable.75-79 One of these cases was reported to have brain white matter abnormalities on MRI.79 In these previously reported cases, HMBS enzyme activity levels ranged from 1-17% of normal, with an excessive excretion of porphyrin precursors.75-79 Strikingly, these five patients also had no acute AIP-related episodes. We reason that our family and the other five patients reported previously represent a separate clinical phenotype caused by bi-allelic HMBS variants, in which our patients represent the more benign end of the phenotypic spectrum.
In chapter 8 we describe two families with the same novel adult-onset autosomal dominant vascular leukoencephalopathy, referred to as CARASAL. The disease is characterized by therapy-resistant hypertension, strokes, and slow and late cognitive deterioration. The MRI pattern shows a diffuse, progressive leukoencephalopathy preceding the onset of strokes and disproportionate to the degree of clinical severity. Genetic studies, including WES, revealed one heterozygous variant, c.973C>T, p.(Arg325Cys) (NM_000308.2), in CTSA, encoding Cathepsin A (CathA), segregating with CARASAL in both families. A 1145 kb genomic region encompassing this gene on chromosome 20q13.12 was shared by both families, suggesting that this variant originates from a common ancestor. Recessive CTSA variants cause the lysosomal storage disorder galactosialidosis due to deficiency of β-galactosidase and neuraminidase-1 (MIM 256540), and heterozygosity had not been associated with disease so far. We explored a potential toxic effect of the variant, but we did not find evidence of misfolding of mutant CathA in non-reducing SDS-PAGE on patients' white matter lysates. Hence, the possible functional role of the CTSA variant in CARASAL is as yet unexplained. Potential clues for the pathogenesis are the unusual pathologic changes of white matter small arterioles, including the vasa vasorum, and the striking increase in endothelin-1 expression in CARASAL white matter astrocytes, also compared to other vascular leukoencephalopathies, with hampered oligodendrocyte precursor cell differentiation. Galactosialidosis patients and their parents, who are obligate carriers, have no clinical overlap with the CARASAL phenotype. However, since hypertension and hypertension-related brain white matter lesions are common and galactosialidosis is very rare, such an association might easily have been overlooked. It could also be envisioned that only this specific CTSA variant is associated with the CARASAL phenotype.
Until now 31 recessive CTSA variants associated with galactosialidosis have been described and the p.Arg325Cys CARASAL variant is not one of them (public version of the Human Gene Mutation Database (HGMD®)). Single variants associated with a specific phenotype have been reported before. For example, one particular de novo variant, c.959G>A (p.Arg320His), with a dominant-negative effect was found to cause progressive myoclonus epilepsy.81 Finding additional CARASAL patients is important for the confirmation of this hypothesis. Interestingly, a recent publication of a French family with a comparable autosomal dominant vascular leukoencephalopathy with similar MRI and pathology findings showed linkage with an 11.2 Mb interval on chromosome 20q13,82 encompassing the 1145 kb region of the CTSA variant, a strong argument for the same genetic disease.

Novel disorders associated with a gene previously not linked to a human disorder or clear phenotype

In our studies two groups of patients were found to have variants in an OMIM-annotated gene not known to be associated with a well-defined phenotype or overt disease. In chapter 9 we reported six patients with a respiratory chain complex I deficiency without molecular cause and a similar, novel MRI pattern that is characterized by predominant abnormalities of the cerebellar cortex, deep cerebral white matter and corpus callosum. On follow-up, the corpus callosum and cerebral white matter abnormalities improved, while the cerebellar abnormalities were progressive and brainstem abnormalities appeared. WES revealed recessive variants in NUBPL, encoding an iron-sulfur cluster assembly factor for complex I. One case with NUBPL mutations was previously reported but without information on phenotype or MRI findings.83 In line with the supposed function of the gene we provided evidence that NUBPL is involved in the early assembly of the peripheral arm of complex I, detected a decreased amount of NUBPL protein and fully assembled complex I, and found a decreased complex I activity. It is noteworthy that all patients had a c.166G>A missense variant in cis with an intronic branch-site variant (c.815-27T>C) with a relatively high carrier frequency (1.2% of European haplotypes). A previous study showed that overexpression of NUBPL protein carrying the missense variant is able to fully complement complex I activity in a NUBPL-deficient cell line, while the intronic branch-site variant results in aberrant splicing of NUBPL mRNA,84 which we could also confirm. 
Recently, more evidence for the pathogenicity of the intronic branch-site variant was provided.85 In an in vivo yeast model it was shown that this intronic branch-site variant leads to a severely decreased IND1 (NUBPL homologue in yeast) protein level, an 80% decreased complex I enzyme function, and the same slow cell growth as a knock-out strain of IND1.85 These observations highlight the challenges faced in interpreting WES data: based on initial in silico pathogenicity prediction models, intronic variants outside the splice consensus site would be omitted, especially if the carrier frequency is rather high (>1%).

In chapter 10 we describe a group of seven patients, from four unrelated families, with a novel disorder characterized by a progressive microcephaly, seizures, variable cardiac defects and an early death. The MRI pattern showed white matter abnormalities that would be compatible with early-onset neuronal degeneration and Wallerian degeneration of the associated white matter tracts. Using WES combined with single-nucleotide polymorphism (SNP)-array we found homozygous variants in the gene ITPA, encoding the enzyme inosine triphosphate pyrophosphatase (ITPase). All variants were predicted to cause a loss of enzyme function, which we confirmed by severely decreased ITPase activity in patients' erythrocytes or fibroblasts. ITPase has a key function in purine metabolism by removing noncanonical triphosphate purines (inosine triphosphate (ITP) and xanthosine triphosphate (XTP) and their deoxy forms (dITP/dXTP)) from the cellular pool.86-88 Toxicity of accumulated noncanonical nucleotides, leading to neuronal apoptosis and interference with proteins normally using ATP and GTP, can play a role in the disease pathogenesis.89-91 Our report is the first associating ITPA variants with a human disorder. Previously, a specific homozygous missense variant (p.Pro32Thr) in ITPA associated with abolished ITPase enzyme activity in human erythrocytes has been linked to adverse reactions to specific drugs.89,92-96 We hypothesized that the latter variant does not abolish ITPase activity in all cell types and that the apparent discrepancy between the similar loss of ITPase activity in erythrocytes as seen in our patients and dissimilar clinical consequences may be explained by different effects of the variants on enzyme function and structure and variability of ITPA expression between tissues.
A knock-out itpa mouse model shows a phenotype similar to that observed in our patients, supporting our hypothesis.97 ITPA-related disease is most likely a spectrum, with the clinical problems related to the p.Pro32Thr variant at the mildest end of the spectrum and this novel disorder at the most severe end. Recognition of future ITPA-encephalopathy patients will be difficult, because the specific MRI pattern is only present for a few months and assessment of ITPase activity and ITP accumulation in erythrocytes does not discriminate between benign and severe variants.

Strategies for gene discovery

Since the first reports of WES application for gene discovery for rare Mendelian disorders in 2009/2010, a very large number of new genes and new disease entities associated with known genes have been published using next-generation sequencing (NGS) technologies.98 Although extremely successful, one of the biggest challenges of WES is interpretation of the data. From a massive pool of approximately 20,000-25,000 variants identified in each individual exome, a single candidate gene has to be selected.9,99,100 Various approaches to gene identification have been explored. The choice of approach is influenced by the mode of inheritance, the availability of other affected or unaffected family members or unrelated patients, the presumed additional value of other genetic techniques, and a priori knowledge of the disease phenotype and the possible function of the encoded protein.
Our experience is that in most situations combined approaches are helpful for successful gene finding (Figure 1). For example, in chapter 8 and chapter 10 we additionally used genetic techniques like homozygosity mapping and microsatellite marker analysis to identify a disease-linked region and so narrow down the candidate genomic search area. For the disorder described in chapter 9 we had strong clues for a mitochondrial etiology, so we focused first on mitochondrial genes (using the MitoCarta database)101 to reduce the number of candidate variants. It is important to realize that every filtering step used can also discard the pathogenic variant. The rarity of leukodystrophies justifies filtering for rare variants. Our experience is that with a maximal minor allele frequency threshold set at 1%, all of the variants in this study would have been detected, except the intronic branch-site variant described in chapter 9.
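The trade-off between filtering steps can be made concrete with a minimal sketch. It is an illustration only, not the pipeline used in these studies: the class and function names, the placeholder genes GENE_A and GENE_B, and all allele frequencies are assumptions. The value of 0.012 for the NUBPL branch-site variant is loosely based on the 1.2% haplotype frequency quoted in this chapter, and the one-gene panel merely stands in for a MitoCarta-derived gene list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    population_af: float  # allele frequency in a reference database (illustrative)


def filter_rare(variants, max_af=0.01):
    """Keep only variants at or below the allele-frequency threshold."""
    return [v for v in variants if v.population_af <= max_af]


def filter_gene_panel(variants, panel):
    """Keep only variants located in genes on an a priori gene list."""
    return [v for v in variants if v.gene in panel]


# Hypothetical calls from one exome; only the NUBPL variant name is from the text.
calls = [
    Variant("NUBPL", "c.815-27T>C", 0.012),  # assumed frequency, just above 1%
    Variant("GENE_A", "c.100A>G", 0.25),     # common polymorphism (placeholder)
    Variant("GENE_B", "c.200C>T", 0.0001),   # rare variant (placeholder)
]
mito_panel = {"NUBPL"}  # stand-in for a mitochondrial gene list

rare = filter_rare(calls)                         # the 1% filter drops NUBPL
panel_hits = filter_gene_panel(calls, mito_panel) # the panel filter retains it

print([v.gene for v in rare])        # ['GENE_B']
print([v.gene for v in panel_hits])  # ['NUBPL']
```

In this toy example the frequency filter alone would have discarded the branch-site variant, whereas the phenotype-driven gene list keeps it, which is why the order and stringency of filtering steps deserve explicit consideration.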

Figure 1. Overview of different inheritance patterns with different exome analysis approaches used for each gene identified. Circles represent females; squares males. Blue solid symbols indicate affected individuals; symbols with a dot indicate unaffected carriers. (A) Several families with ITPA mutations were consanguineous. Therefore, the candidate region could be narrowed down by searching for overlapping homozygous regions using a SNP-array. Whole-exome sequencing (WES) was performed in multiple unrelated patients to allow intra-group comparison. (B) The families with AARS2, HMBS, LAMA2, NUBPL and SLC19A3 mutations showed a possible autosomal recessive inheritance pattern, without reported consanguinity. For the identification of HMBS and LAMA2 sibpair analysis was sufficient for detecting the gene. For the identification of AARS2, NUBPL, and SLC19A3, comparison of multiple unrelated patients was needed to confirm the associated phenotype. (C) The two families with CTSA mutations had an autosomal dominant inheritance pattern. The gene was identified using linkage analysis to narrow down the candidate region, sequencing multiple affected individuals within one family and comparison of the two families. (D) All the families with PLP1 mutations showed a strong X-linked recessive inheritance pattern. The gene was identified by focusing on the X-chromosome, sibpair analysis and comparison of multiple unrelated patients.

However, filtering against public databases (e.g. dbSNP, 1000 Genomes and Exome Variant Server, NHLBI GO Exome Sequencing Project (ESP)) should be performed with caution, especially in the case of late-onset leukodystrophies, because such patients could have been included in these databases while their disease was still subclinical. For most studies, the greatest reduction of variants was accomplished by the application of 'intra-group comparison'. By a priori selecting groups of patients based on their specific, distinctive MRI pattern we were able to define multiple homogeneous patient cohorts with presumably the same disorder. This approach was both extremely reliable and powerful, as we have so far had a success rate of 80-90% for gene identification by WES in small, homogeneous patient groups from the Amsterdam database of unclassified leukoencephalopathy cases, whereas larger WES studies have reported success rates of 42% for mixed leukodystrophy cases102 and 16% to 53% in several large studies.69,103-108 Another advantage of using multiple unrelated patients over single cases for WES analysis is that by identification of the gene a new clinical disease entity can be described, facilitating the recognition of future patients and making the diagnostic process for these patients easier and faster. The impact of the availability of multiple unrelated patients within one group is reflected in almost every study reported in this thesis. For example, in chapter 9 we found recessive variants in the gene NUBPL in six patients. In 2010 a single patient with NUBPL variants had already been published, but with little clinical information and no MR images.70 The consequence was that the disease phenotype could not be recognized. Our study provides a clear description of the disease entity associated with NUBPL variants, which will preclude this issue and make diagnosis for future patients much easier.
Furthermore, the ITPA variants described in chapter 10 might not have been considered pathogenic if identified in a single case because of the known lack of clinical effect of previously described variants.89,96 In chapter 2 the presence of multiple patients with HEMS contributed greatly to our success, because only after sequencing additional HEMS patients who had genomic variants in regions that were covered with WES were we able to identify the candidate gene PLP1.
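The logic of intra-group comparison under a recessive model can be sketched as follows. This is a simplified illustration under stated assumptions rather than the analysis actually performed: the function names, patient identifiers and gene names other than NUBPL are hypothetical, each patient is represented only by rare candidate variants that survived earlier filtering, and two heterozygous variants in one gene are counted as a possible compound heterozygous pair without checking phase.

```python
from collections import Counter


def recessive_candidate_genes(variants):
    """Return genes in which one patient carries at least two variant alleles.

    `variants` is a list of (gene, zygosity) tuples with zygosity 'hom' or 'het'.
    Phase is not checked, so two 'het' calls may in reality lie in cis.
    """
    alleles = Counter()
    for gene, zygosity in variants:
        alleles[gene] += 2 if zygosity == "hom" else 1
    return {gene for gene, n in alleles.items() if n >= 2}


def shared_candidates(cohort, min_patients=None):
    """Return genes that are recessive candidates in at least `min_patients` patients."""
    if min_patients is None:
        min_patients = len(cohort)  # default: the gene must be shared by everyone
    counts = Counter()
    for variants in cohort.values():
        counts.update(recessive_candidate_genes(variants))
    return sorted(gene for gene, n in counts.items() if n >= min_patients)


# Hypothetical cohort of unrelated patients grouped by a shared MRI pattern.
cohort = {
    "patient_1": [("NUBPL", "het"), ("NUBPL", "het"), ("GENE_A", "hom")],
    "patient_2": [("NUBPL", "hom"), ("GENE_B", "het")],
    "patient_3": [("NUBPL", "het"), ("NUBPL", "het"), ("GENE_A", "het")],
}

print(shared_candidates(cohort))                  # ['NUBPL']
print(shared_candidates(cohort, min_patients=1))  # ['GENE_A', 'NUBPL']
```

Requiring the gene in every patient is the strictest setting; lowering `min_patients` tolerates a cohort that is not fully homogeneous or a variant missed because of incomplete exome coverage.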

It should be mentioned that for all studies we used multiple singleton patients instead of trios (parents and patient), but if other patients are not available, sequencing trios is more successful than sequencing singleton patients.105 This appears to be especially true for disorders caused by de novo variants, since these variants can only be promptly detected by the trio approach. Sequencing parents along with the patients is also helpful for reducing the number of potential compound heterozygous variants in cases where a recessive disorder is suspected. Still, sequencing single cases or families is challenging because the function of most genes in the genome is poorly understood; less than 10% of the genes are annotated with an OMIM disease association.109 The main bottleneck of the filtering pipeline is our ability to interpret the effect of the variants, which is very challenging for synonymous and intronic variants, but also for missense variants. Without other patients and without knowledge about the function of the protein encoded by the gene, conclusions on the potential pathogenicity of a variant are based on in silico prediction software programs (e.g. MutationTaster,110 PolyPhen-2111 and SIFT112), information on conservation at the nucleotide (PhyloP and GERP scores) and amino acid (GVGD) level, and the predicted effect on splicing (e.g. the splice site prediction tools Human Splice Finder,113 MaxEntScan,114 NetGene2,115 NNSplice116 and SpliceSiteFinder-like, and tools for prediction of splice enhancers and inhibitors such as ESEfinder 3.0117). These predictions are often inconclusive and make functional follow-up studies imperative.
In the recently published FORGE project the disorder remained unsolved in 28 (20%) of the 174 unclassified families even though a single candidate gene was found that could be causally related, because of a lack of knowledge of the function of the protein encoded by the gene and because no additional families with the same disorder were available.108 Finding another case with a presumably pathogenic variant in the same gene would provide sufficient evidence for causality.
One of the advantages of WES is the unbiased approach of gene finding; the genes that are found in association with a disease may be completely unexpected. Prior knowledge about the gene is therefore not imperative. It is important to realize that filtering based on a priori knowledge (like evidence of mitochondrial dysfunction, as described in chapter 9) may facilitate identification of the gene mutated in the disease of interest, but that this stringent filtering approach may also exclude the causal gene. The reason may be that it is not yet known that a particular protein is of importance in a specific process, but it may also be that the hypothesis concerning the disease mechanism was not correct. In chapter 3 the patient's clinical phenotype, biochemical profile (e.g. elevated lactate), and neuropathologic features of massive neuronal cell death strongly suggested an underlying mitochondrial disorder. WES analysis revealed variants in SLC19A3, which encodes a thiamine transporter. Defects of this transporter indirectly impact mitochondrial function, but the gene is absent from the MitoCarta database.101
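The benefit of trio sequencing discussed in this section, namely the detection of de novo variants and the reduction of candidate compound heterozygous variants, can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis used in our studies: the record format (alternative allele counts 0, 1 or 2 for child, mother and father), the variant and gene names, and both function names are hypothetical, and genotype quality and coverage are ignored.

```python
from collections import defaultdict


def de_novo_variants(variants):
    """Return IDs of variants present in the child but in neither parent."""
    return [
        v["id"]
        for v in variants
        if v["child"] > 0 and v["mother"] == 0 and v["father"] == 0
    ]


def compound_het_genes(variants):
    """Return genes with at least one maternally and one paternally inherited
    heterozygous variant in the child (variants in trans)."""
    maternal = defaultdict(list)
    paternal = defaultdict(list)
    for v in variants:
        if v["child"] != 1:
            continue  # only heterozygous variants in the child are relevant
        if v["mother"] == 1 and v["father"] == 0:
            maternal[v["gene"]].append(v["id"])
        elif v["father"] == 1 and v["mother"] == 0:
            paternal[v["gene"]].append(v["id"])
    # A gene qualifies only if both parents contribute a variant.
    return {
        gene: (maternal[gene], paternal[gene])
        for gene in maternal
        if gene in paternal
    }


# Hypothetical trio genotypes (alternative allele counts).
trio_variants = [
    {"id": "var1", "gene": "GENE_A", "child": 1, "mother": 0, "father": 0},
    {"id": "var2", "gene": "GENE_B", "child": 1, "mother": 1, "father": 0},
    {"id": "var3", "gene": "GENE_B", "child": 1, "mother": 0, "father": 1},
    {"id": "var4", "gene": "GENE_C", "child": 1, "mother": 1, "father": 0},
    {"id": "var5", "gene": "GENE_C", "child": 1, "mother": 1, "father": 0},
]

print(de_novo_variants(trio_variants))
print(compound_het_genes(trio_variants))
```

In this example the two GENE_C variants are both inherited from the mother and are therefore excluded as a compound heterozygous pair. This distinction cannot be made when only the patient is sequenced.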

Pitfalls and challenges of WES

As discussed before, the success rate of WES in providing a molecular diagnosis for unselected groups of single patients with a rare Mendelian disorder ranges from 16% to 53%.69,103-108 That a substantial number of patients still remains without a molecular diagnosis is due to several factors. The first is incomplete sequencing coverage of the exome. Typically, a minimum mean coverage of 20-30 reads per base is required for sufficient accuracy of variant detection (depending on whether short or long reads are used). In 2011 different exome capture kits all showed an overall exome capture of around 80% (minimal coverage of 30X).118 However, in the last four years this has significantly improved and the latest exome capture kits are able to cover up to 95% of the exome at 20X,119 creating a higher potential diagnostic yield. The second reason is that certain variants are elusive to the technology itself, including trinucleotide repeats (e.g. fragile X syndrome), inversions, a certain size range of duplications and deletions, and structural variants (chromosomal translocations and aneuploidy).120 Ongoing improvements of the bioinformatics pipelines, including the software for mapping and the algorithms developed for variant calling, may resolve some of these false negatives in the future. For detection of copy number variations (CNVs) and small duplications and deletions in WES data numerous tools have been developed in the last few years, but although advances have been made, accurate detection of CNVs in WES data remains challenging.121 It is important to realize that such variants can be missed when filtering the data. This is illustrated by the gene finding process in chapter 9. In this case one of the pathogenic variants in NUBPL (c.677_688insCCTTGTGCTG (p.Glu223Alafs*4)) was not detected by our bioinformatics pipeline at that time.
Although we assumed a recessive inheritance mode, we fortunately filtered for genes with either one or two heterozygous variants to circumvent this possibility. Variants that by definition are refractory to WES are variants in non-coding regions (outside the splice consensus site). In chapter 2 we encountered this problem. Initial WES analysis of two siblings with HEMS, their mother and an unrelated patient with HEMS was unrevealing. After the inclusion of four additional patients and the application of targeted X-chromosome exome sequencing we were able to identify the candidate gene, PLP1. In hindsight the negative WES results can be explained by the lack of coverage of the intronic variants. The second approach was successful because some of the additionally sequenced patients had exonic variants. Although most disease-causing variants (roughly 85%) are located in functional and coding regions of the genome, the remaining (non-coding causal) variants are still refractory to WES analysis. The non-coding variants identified in this disorder were located in a conserved regulatory intronic region affecting alternative splicing of PLP1.42,122 Splicing is a complex mechanism regulated by the strength of the canonical splice donor and acceptor sites and branch sites, by exonic or intronic enhancers and silencers, and by the formation of RNA structures.123,124 It is estimated that >90% of the human protein-coding genes are subject to alternative splicing, producing multiple mRNA isoforms.125 The contribution of deep intronic variants leading to splicing alterations is likely to be underestimated, as these variants are not detected by routine DNA sequencing approaches or WES.
As discussed in the previous section on strategies for gene discovery, the main bottleneck of the filtering pipeline is our ability to interpret the effect of the variants. At present, for leukodystrophies, the best approach is WES in a group of unrelated patients with the same disease, so that conclusions on the pathogenicity of a variant do not depend solely on knowledge of the function of a gene or the predicted effect of a variant. For leukodystrophies MRI offers an excellent tool for disease definition due to its high sensitivity and accuracy. We took advantage of this in all chapters and I would recommend it for future studies. As the unclassified leukodystrophies are becoming rarer, collaboration and pooling of data are increasingly important, as already shown by our studies, which were all based on world-wide collaborations.

Implications for patients

The presence of a (molecular) diagnosis has major implications for the patients and their families. All patients described in this thesis had a very long diagnostic work-up before WES was performed. Some of the families had already been known to us for over 10 years. In some families, three children had died without a diagnosis and without prenatal testing ever being an option. After the application of WES the diagnosis was established rapidly for all cases. In addition to a definitive diagnosis, we were now able to inform these families about recurrence risks, options for prenatal testing and, for some cases, prognosis. Occasionally a genetic diagnosis can have important therapeutic implications. For example, in a study performed by Sawyer and colleagues, six (26%) of the 105 families with a WES-identified molecular diagnosis had a dramatic change in treatment.108 In chapter 3 (SLC19A3-related encephalopathy) we describe a disorder for which a fast (molecular) diagnosis can result in a specific therapy that might be life-saving in some cases.
The most commonly reported phenotype associated with SLC19A3 variants, 'BTBGD', can be successfully treated with thiamine and biotin.49,51,53 Haack et al. reported that more severely affected patients with loss-of-function variants, presenting with an early-infantile Leigh-like syndrome, would also benefit from thiamine and biotin supplementation.57 In the last two years several additional reports have been published regarding successful administration of thiamine to patients with SLC19A3 variants, although these mainly concerned the BTBGD or Leigh-like syndrome phenotype.54,55,57-60,62 In the case of early administration of thiamine (and biotin) the clinical and radiological abnormalities can be reversed, improving neurological outcome.54,57,59,62 However, we are rather pessimistic about a potential positive effect of thiamine in our infantile group, because due to the very early onset and rapid disease course, the brain damage is beyond repair within one or a few days. SLC19A3-related disorders might be treatable if thiamine is administered early enough, before significant damage has occurred, but even then the definitive outcome remains uncertain, as many studies published so far have a short follow-up.54,57,59,62
As illustrated by the study described in chapter 7, the unbiased approach of WES can lead to unexpected results that have direct implications for the health of siblings. In this particular case, siblings were found to be carriers of a heterozygous pathogenic HMBS gene variant that is associated with a lifetime risk of 10% for acute porphyric attacks.126 Although this could be regarded as an incidental finding for these siblings, this knowledge has major implications for their future health, because these individuals can now take preventive measures to limit the risk of potentially life-threatening porphyric attacks, hypertension and liver carcinoma.126,127

Insight from gene discovery for leukodystrophies
The original definition of leukodystrophies was highly focused on myelin. Initially it was thought that myelin and oligodendrocyte-specific proteins are involved in all leukodystrophies. In this thesis nine different genes were identified, each associated with a different leukodystrophy. Intriguingly, none of the identified genes, except for PLP1, encodes a myelin-specific protein. Due to the unbiased approach of NGS analysis, genes were found that were totally unexpected. Strikingly, a defect in certain common pathways was found to underlie several leukodystrophies. As shown by our results in chapter 11, many genes associated both with hypomyelination and other leukodystrophies are involved in mRNA translation and protein synthesis. Protein synthesis is a highly complex process in which numerous proteins are involved (e.g. proteins mediating the activation, initiation, elongation or termination of mRNA translation, aminoacyl tRNA synthetases (aaRSes), cofactors and modifying proteins and ribosomal proteins).128,129 More specifically, our results described in chapters 5, 6 and 11 show that a large group of patients had defects in aaRSes or RNA polymerase III. Hence, it has become evident that most genes associated with leukodystrophies do not encode intrinsic myelin proteins.

Aminoacyl tRNA synthetases

Human cells contain 17 cytoplasmic aaRSes, 18 mitochondrial aaRSes and two aaRSes present both in the cytoplasm and mitochondria.130 AaRSes are ancient housekeeping enzymes that attach amino acids to their cognate tRNA molecules as an essential step in protein synthesis.130 Before the advent of WES in 2010 only seven disorders were known to be associated with an aaRS defect. To date, 26 of 37 aaRSes have been implicated in human disease, of which the vast majority (19) were found in the last four years, predominantly using WES.29,36-38,130-132
Although a wide variety of disorders is each associated with a different aaRS enzyme defect, it has become clear that several of these aaRSes, both cytosolic and mitochondrial, are mutated in patients with leukodystrophies (in 2008 DARS2,133 in 2010 AIMP1,134 in 2011 EARS2,18 in 2012 MARS2,16 in 2013 DARS,19 in 2013 LARS2135 and AARS2,25 and in 2014 RARS26). Although the essential canonical function of these enzymes is the same (mRNA translation), there are striking differences between the different leukodystrophies concerning tissue and cell type specificity and clinical phenotype, such as the onset, course and outcome of the disorder. For example, patients with DARS2 variants typically have a childhood or adolescent-onset slowly progressive pyramidal, cerebellar and dorsal column dysfunction,133,136 whilst patients with AARS2 variants experience a rapid neurological deterioration in adulthood that leads to severe disabilities and sometimes death after a long period of clinical stability.25 Intriguingly, for both disorders the MRI pattern shows abnormalities in very specific white matter tracts.25,133,136 On the other hand, it is also striking that hypomyelination with brainstem and spinal cord involvement and leg spasticity (HBSL) and LBSL show a major overlap in specifically affected white matter structures, while the involved aaRSes (encoded by DARS and DARS2) have a different subcellular localization (in the cytoplasm and mitochondria, respectively).19,133 It is still unclear what determines the selective vulnerability of different tissues in this group of disorders, and what, within the subgroup of leukodystrophies, determines the selective vulnerability of specific structures within the nervous system, while the canonical function of the affected enzymes is the same.
It has been suggested that noncanonical functions of these aaRSes could determine the phenotypes and that the explanation for differences and similarities in phenotypes could be found there.19,133 Indeed, some of the cytosolic aaRSes are reported to have noncanonical functions in biological processes such as regulation of gene transcription and RNA splicing,137 angiogenesis, tumorigenesis and inflammation.130,138 For aaRS defects, as for other defects in housekeeping genes involved in protein synthesis and associated with a leukodystrophy, like the POLR3-related disorders (described below) and VWM, no patients have been reported with two functional null alleles, suggesting that complete abolishment of the function of the affected enzymes is not compatible with life.

RNA polymerase III
POLR3 is a DNA-directed RNA polymerase, which is involved in the transcription of small non-coding RNAs, including tRNAs.139 Recessive variants in POLR3A and POLR3B, encoding the largest and second largest subunit of POLR3, and in POLR1C, encoding a shared subunit of RNA polymerase I and POLR3, were found to be the cause of three overlapping leukodystrophy phenotypes: tremor-ataxia with central hypomyelination (TACH), leukodystrophy with oligodontia (MIM 607694) and 'hypomyelination, hypogonadotropic hypogonadism and hypodontia syndrome', also referred to as 4H syndrome.10-12,28,140,141 Since the discovery of these genes it has become clear, as shown by our study in chapter 11, that this is currently one of the most prevalent hypomyelinating disorders with a known molecular defect.142

Future perspectives
The diagnosis and classification of leukodystrophies have changed dramatically since the 19th century.143-146 Although pathology and biochemical findings still play a role in the diagnostic process, they are often not mandatory for a definitive diagnosis, but rather supportive. Although one could expect that with WES the diagnostic process would be reversed (referred to as 'reversed phenotyping': first genotyping, then phenotyping), this has not become a common approach. Indeed, it has been shown by us and many others that good a priori phenotyping is and will remain very important for successful WES analysis, at least for leukodystrophies.
The unbiased approach of WES has created huge new research opportunities to be explored. Novel genes are now discovered that would otherwise not have been studied because of the lack of correlation with the phenotype or lack of knowledge of the function of the gene. The rapidly increasing insight into genes mutated in leukodystrophies creates novel, unprecedented insight into proteins that are apparently essential for the maintenance of white matter health. It is becoming increasingly apparent that, in contrast to what was thought for many years, most leukodystrophies are not caused by variants in genes encoding myelin-specific proteins, like PLP, but by defects in diverse fundamental biological processes, such as mitochondrial respiratory function and processes of mRNA translation and protein synthesis. At present we do not understand why defects in such basic, housekeeping functions specifically affect the maintenance of brain white matter health and function. This should be the subject of future studies. It is to be expected that in the coming years 90-95% of the leukodystrophy patients will receive a molecular diagnosis. With that we will have an almost complete map of proteins that are involved in one way or another in the brain white matter.
The new insights have major implications for our understanding and definition of 'leukodystrophies'. The original definition of a 'leukodystrophy' was entirely focused on myelin and assumed a progressive disorder. With the current knowledge of the underlying genetic defects of leukodystrophies and the variable clinical phenotype, this definition should be revised. As argued in chapter 11, we propose that all genetic disorders primarily and predominantly involving CNS white matter structures (and not only myelin) should be referred to as leukodystrophies, independent of the disease course. This new insight has therapeutic implications: the aim should be not only remyelination, but repair of all affected white matter components.