2021

Barton AR, Sherman M, Mukamel R, Loh PR. Whole-exome imputation within UK Biobank powers rare coding variant association and fine-mapping analyses. Nat Genet. 2021;53(8):1260–1269.
Exome association studies to date have generally been underpowered to systematically evaluate the phenotypic impact of very rare coding variants. We leveraged extensive haplotype sharing between 49,960 exome-sequenced UK Biobank participants and the remainder of the cohort (total n ≈ 500,000) to impute exome-wide variants with accuracy R² > 0.5 down to minor allele frequency (MAF) ~0.00005. Association and fine-mapping analyses of 54 quantitative traits identified 1,189 significant associations (P < 5 × 10⁻⁸) involving 675 distinct rare protein-altering variants (MAF < 0.01) that passed stringent filters for likely causality. Across all traits, 49% of associations (578/1,189) occurred in genes with two or more hits; follow-up analyses of these genes identified allelic series containing up to 45 distinct 'likely-causal' variants. Our results demonstrate the utility of within-cohort imputation in population-scale genome-wide association studies, provide a catalog of likely-causal, large-effect coding variant associations and foreshadow the insights that will be revealed as genetic biobank studies continue to grow.
Zekavat S, Lin SH, Bick A, Liu A, Paruchuri K, Wang C, Uddin MM, Ye Y, Yu Z, Liu X, Kamatani Y, Bhattacharya R, Pirruccello J, Pampana A, Loh PR, Kohli P, McCarroll S, Kiryluk K, Neale B, Ionita-Laza I, Engels E, Brown D, Smoller J, Green R, Karlson E, Lebo M, Ellinor P, Weiss S, Daly M, BioBank Japan Project, FinnGen Consortium, Terao C, Zhao H, Ebert B, Reilly M, Ganna A, Machiela M, Genovese G, Natarajan P. Hematopoietic mosaic chromosomal alterations increase the risk for diverse types of infection. Nat Med. 2021;27(6):1012–1024.
Age is the dominant risk factor for infectious diseases, but the mechanisms linking age to infectious disease risk are incompletely understood. Age-related mosaic chromosomal alterations (mCAs), detected from genotyping of blood-derived DNA, are structural somatic variants indicative of clonal hematopoiesis, and are associated with aberrant leukocyte cell counts, hematological malignancy, and mortality. Here, we show that mCAs predispose to diverse types of infections. We analyzed mCAs from 768,762 individuals without hematological cancer at the time of DNA acquisition across five biobanks. Expanded autosomal mCAs were associated with diverse incident infections (hazard ratio (HR) 1.25; 95% confidence interval (CI) = 1.15-1.36; P = 1.8 × 10⁻⁷), including sepsis (HR 2.68; 95% CI = 2.25-3.19; P = 3.1 × 10⁻²⁸), pneumonia (HR 1.76; 95% CI = 1.53-2.03; P = 2.3 × 10⁻¹⁵), digestive system infections (HR 1.51; 95% CI = 1.32-1.73; P = 2.2 × 10⁻⁹) and genitourinary infections (HR 1.25; 95% CI = 1.11-1.41; P = 3.7 × 10⁻⁴). A genome-wide association study of expanded mCAs identified 63 loci, which were enriched at transcriptional regulatory sites for immune cells. These results suggest that mCAs are a marker of impaired immunity and confer increased predisposition to infections.
Sheppard B, Rappoport N, Loh PR, Sanders S, Zaitlen N, Dahl A. A model and test for coordinated polygenic epistasis in complex traits. Proc Natl Acad Sci U S A. 2021;118(15):e1922305118.
Interactions between genetic variants (epistasis) are pervasive in model systems and can profoundly impact evolutionary adaptation, population disease dynamics, genetic mapping, and precision medicine efforts. In this work, we develop a model for structured polygenic epistasis, called coordinated epistasis (CE), and prove that several recent theories of genetic architecture fall under the formal umbrella of CE. Unlike standard epistasis models that assume epistasis and main effects are independent, CE captures systematic correlations between epistasis and main effects that result from pathway-level epistasis, on balance skewing the penetrance of genetic effects. To test for the existence of CE, we propose the even-odd (EO) test and prove it is calibrated in a range of realistic biological models. Applying the EO test in the UK Biobank, we find evidence of CE in 18 of 26 traits spanning disease, anthropometric, and blood categories. Finally, we extend the EO test to tissue-specific enrichment and identify several plausible tissue-trait pairs. Overall, CE is a dimension of genetic architecture that can capture structured, systemic forms of epistasis in complex human traits.
Ziyatdinov A, Kim J, Prokopenko D, Privé F, Laporte F, Loh PR, Kraft P, Aschard H. Estimating the effective sample size in association studies of quantitative traits. G3. 2021;11(6):jkab057.
The effective sample size (ESS) is a metric used to summarize in a single term the amount of correlation in a sample. It is of particular interest when predicting the statistical power of genome-wide association studies (GWAS) based on linear mixed models. Here, we introduce an analytical form of the ESS for mixed-model GWAS of quantitative traits and relate it to empirical estimators recently proposed. Using our framework, we derived approximations of the ESS for analyses of related and unrelated samples and for both marginal genetic and gene-environment interaction tests. We conducted simulations to validate our approximations and to provide a quantitative perspective on the statistical power of various scenarios, including power loss due to family relatedness and power gains due to conditioning on the polygenic signal. Our analyses also demonstrate that the power of gene-environment interaction GWAS in related individuals strongly depends on the family structure and exposure distribution. Finally, we performed a series of mixed-model GWAS on data from the UK Biobank and confirmed the simulation results. We notably found that the expected power drop due to family relatedness in the UK Biobank is negligible.
Sherman MA, Rodin R, Genovese G, Dias C, Barton A, Mukamel R, Berger B, Park P, Walsh C, Loh PR. Large mosaic copy number variations confer autism risk. Nat Neurosci. 2021;24(2):197–203.
Although germline de novo copy number variants (CNVs) are known causes of autism spectrum disorder (ASD), the contribution of mosaic (early-developmental) copy number variants (mCNVs) has not been explored. In this study, we assessed the contribution of mCNVs to ASD by ascertaining mCNVs in genotype array intensity data from 12,077 probands with ASD and 5,500 unaffected siblings. We detected 46 mCNVs in probands and 19 mCNVs in siblings, affecting 2.8-73.8% of cells. Probands carried a significant burden of large (>4-Mb) mCNVs, which were detected in 25 probands but only one sibling (odds ratio = 11.4, 95% confidence interval = 1.5-84.2, P = 7.4 × 10⁻⁴). Event size positively correlated with severity of ASD symptoms (P = 0.016). Surprisingly, we did not observe mosaic analogues of the short de novo CNVs recurrently observed in ASD (e.g., 16p11.2). We further experimentally validated two mCNVs in postmortem brain tissue from 59 additional probands. These results indicate that mCNVs contribute a previously unexplained component of ASD risk.
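As a rough check on the burden statistic quoted above, the sketch below recomputes an odds ratio and a 95% confidence interval from the counts given in the abstract (25 of 12,077 probands and 1 of 5,500 siblings carrying a large mCNV). The Woolf log-odds interval and the function name `odds_ratio_with_ci` are assumptions made for illustration; the abstract does not state which interval method the authors used.

```python
import math

def odds_ratio_with_ci(case_hits, case_total, control_hits, control_total, z=1.959964):
    """Odds ratio and Woolf (log-scale) confidence interval from a 2x2 table.

    Assumption: a simple unadjusted 2x2 analysis, not necessarily the
    procedure used in the paper.
    """
    a = case_hits                      # carriers among probands
    b = case_total - case_hits         # non-carriers among probands
    c = control_hits                   # carriers among siblings
    d = control_total - control_hits   # non-carriers among siblings
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

or_, lo, hi = odds_ratio_with_ci(25, 12077, 1, 5500)
print(f"OR = {or_:.1f}, 95% CI = {lo:.1f}-{hi:.1f}")  # OR = 11.4, 95% CI = 1.5-84.2
```

With only one carrier among siblings, the interval is very wide, which is why the lower bound sits close to 1 despite the large point estimate.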

2020

Brown D, Lin SH, Loh PR, Chanock S, Savage S, Machiela M. Genetically predicted telomere length is associated with clonal somatic copy number alterations in peripheral leukocytes. PLoS Genet. 2020;16(10):e1009078.
Telomeres are DNA-protein structures at the ends of chromosomes essential in maintaining chromosomal stability. Observational studies have identified associations between telomeres and elevated cancer risk, including hematologic malignancies, but biologic mechanisms relating telomere length to cancer etiology remain unclear. Our study sought to better understand the relationship between telomere length and cancer risk by evaluating genetically-predicted telomere length (gTL) in relation to the presence of clonal somatic copy number alterations (SCNAs) in peripheral blood leukocytes. Genotyping array data were acquired from 431,507 participants in the UK Biobank and used to detect SCNAs from intensity information and infer telomere length using a polygenic risk score (PRS) of variants previously associated with leukocyte telomere length. In total, 15,236 individuals (3.5%) had a detectable clonal SCNA on an autosomal chromosome. Overall, higher gTL value was positively associated with the presence of an autosomal SCNA (OR = 1.07, 95% CI = 1.05-1.09, P = 1.61 × 10⁻¹⁵). There was high consistency in effect estimates across strata of chromosomal event location (e.g., telomeric ends, interstitial or whole chromosome event; Phet = 0.37) and strata of copy number state (e.g., gain, loss, or neutral events; Phet = 0.05). Higher gTL value was associated with a greater cellular fraction of clones carrying autosomal SCNAs (β = 0.004, 95% CI = 0.002-0.007, P = 6.61 × 10⁻⁴). Our population-based examination of gTL and SCNAs suggests inherited components of telomere length do not preferentially impact autosomal SCNA event location or copy number status, but rather likely influence cellular replicative potential.
Loh PR, Genovese G, McCarroll S. Monogenic and polygenic inheritance become instruments for clonal selection. Nature. 2020;584(7819):136–141.
Clonally expanded blood cells that contain somatic mutations (clonal haematopoiesis) are commonly acquired with age and increase the risk of blood cancer. The blood clones identified so far contain diverse large-scale mosaic chromosomal alterations (deletions, duplications and copy-neutral loss of heterozygosity (CN-LOH)) on all chromosomes, but the sources of selective advantage that drive the expansion of most clones remain unknown. Here, to identify genes, mutations and biological processes that give selective advantage to mutant clones, we analysed genotyping data from the blood-derived DNA of 482,789 participants from the UK Biobank. We identified 19,632 autosomal mosaic chromosomal alterations and analysed these for relationships to inherited genetic variation. We found 52 inherited, rare, large-effect coding or splice variants in 7 genes that were associated with greatly increased vulnerability to clonal haematopoiesis with specific acquired CN-LOH mutations. Acquired mutations systematically replaced the inherited risk alleles (at MPL) or duplicated them to the homologous chromosome (at FH, NBN, MRE11, ATM, SH2B3 and TM2D3). Three of the genes (MRE11, NBN and ATM) encode components of the MRN-ATM pathway, which limits cell division after DNA damage and telomere attrition; another two (MPL and SH2B3) encode proteins that regulate the self-renewal of stem cells. In addition, we found that CN-LOH mutations across the genome tended to cause chromosomal segments with alleles that promote the expansion of haematopoietic cells to replace their homologous (allelic) counterparts, increasing polygenic drive for blood-cell proliferation traits. Readily acquired mutations that replace chromosomal segments with their homologous counterparts seem to interact with pervasive inherited variation to create a challenge for lifelong cytopoiesis.
Terao C, Suzuki A, Momozawa Y, Akiyama M, Ishigaki K, Yamamoto K, Matsuda K, Murakami Y, McCarroll S, Kubo M, Loh PR, Kamatani Y. Chromosomal alterations among age-related haematopoietic clones in Japan. Nature. 2020;584(7819):130–135.
The extent to which the biology of oncogenesis and ageing are shaped by factors that distinguish human populations is unknown. Haematopoietic clones with acquired mutations become common with advancing age and can lead to blood cancers. Here we describe shared and population-specific patterns of genomic mutations and clonal selection in haematopoietic cells on the basis of 33,250 autosomal mosaic chromosomal alterations that we detected in 179,417 Japanese participants in the BioBank Japan cohort and compared with analogous data from the UK Biobank. In this long-lived Japanese population, mosaic chromosomal alterations were detected in more than 35.0% (s.e.m., 1.4%) of individuals older than 90 years, which suggests that such clones trend towards inevitability with advancing age. Japanese and European individuals exhibited key differences in the genomic locations of mutations in their respective haematopoietic clones; these differences predicted the relative rates of chronic lymphocytic leukaemia (which is more common among European individuals) and T cell leukaemia (which is more common among Japanese individuals) in these populations. Three different mutational precursors of chronic lymphocytic leukaemia (including trisomy 12, loss of chromosomes 13q and 13q, and copy-neutral loss of heterozygosity) were between two and six times less common among Japanese individuals, which suggests that the Japanese and European populations differ in selective pressures on clones long before the development of clinically apparent chronic lymphocytic leukaemia. Japanese and British populations also exhibited very different rates of clones that arose from B and T cell lineages, which predicted the relative rates of B and T cell cancers in these populations. We identified six previously undescribed loci at which inherited variants predispose to mosaic chromosomal alterations that duplicate or remove the inherited risk alleles, including large-effect rare variants at NBN, MRE11 and CTU2 (odds ratio, 28-91). We suggest that selective pressures on clones are modulated by factors that are specific to human populations. Further genomic characterization of clonal selection and cancer in populations from around the world is therefore warranted.
Hujoel M, Gazal S, Loh PR, Patterson N, Price A. Liability threshold modeling of case-control status and family history of disease increases association power. Nat Genet. 2020;52(5):541–547.
Family history of disease can provide valuable information in case-control association studies, but it is currently unclear how to best combine case-control status and family history of disease. We developed an association method based on posterior mean genetic liabilities under a liability threshold model, conditional on case-control status and family history (LT-FH). Analyzing 12 diseases from the UK Biobank (average N = 350,000), we compared LT-FH to genome-wide association without using family history (GWAS) and a previous proxy-based method incorporating family history (GWAX). LT-FH was 63% (standard error (s.e.) 6%) more powerful than GWAS and 36% (s.e. 4%) more powerful than the trait-specific maximum of GWAS and GWAX, based on the number of independent genome-wide-significant loci across all diseases (for example, 690 loci for LT-FH versus 423 for GWAS); relative improvements were similar when applying BOLT-LMM to GWAS, GWAX and LT-FH phenotypes. Thus, LT-FH greatly increases association power when family history of disease is available.
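To illustrate the idea of a posterior mean genetic liability conditional on case-control status and family history, here is a minimal Monte Carlo sketch. It is a simplification and not the LT-FH implementation: it assumes a single parent, no siblings, a standard-normal total liability, and a parent-offspring genetic covariance of h2/2. The function name `posterior_mean_liability` and the parameter values (h2 = 0.5, prevalence = 0.05) are hypothetical choices for the example.

```python
import math
import random
from statistics import NormalDist

def posterior_mean_liability(h2, prevalence, n_draws=100_000, seed=0):
    """Estimate E[genetic liability | own status, parent status] by simulation.

    Assumptions (illustrative only): liability = genetic + environmental,
    total variance 1; disease occurs when liability exceeds a threshold
    set by the prevalence; one parent with genetic covariance h2 / 2.
    """
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    sums, counts = {}, {}
    for _ in range(n_draws):
        g = rng.gauss(0.0, math.sqrt(h2))                  # individual's genetic liability
        own = g + rng.gauss(0.0, math.sqrt(1.0 - h2))      # individual's total liability
        # Parent's genetic liability: variance h2, covariance h2 / 2 with g
        g_parent = 0.5 * g + rng.gauss(0.0, math.sqrt(0.75 * h2))
        parent = g_parent + rng.gauss(0.0, math.sqrt(1.0 - h2))
        key = (own > threshold, parent > threshold)        # (is_case, parent_affected)
        sums[key] = sums.get(key, 0.0) + g
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}

means = posterior_mean_liability(h2=0.5, prevalence=0.05)
for (is_case, parent_affected), value in sorted(means.items(), reverse=True):
    print(f"case={is_case!s:5} parent_affected={parent_affected!s:5} mean liability={value:+.3f}")
```

Under these assumptions, each of the four status combinations maps to a different continuous phenotype, so controls with an affected parent are distinguished from controls without one. That extra information is the intuition behind the power gain described above.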

2019

Thompson D, Genovese G, Halvardson J, Ulirsch J, Wright D, Terao C, Davidsson O, Day F, Sulem P, Jiang Y, Danielsson M, Davies H, Dennis J, Dunlop M, Easton D, Fisher V, Zink F, Houlston R, Ingelsson M, Kar S, Kerrison N, Kinnersley B, Kristjansson R, Law P, Li R, Loveday C, Mattisson J, McCarroll S, Murakami Y, Murray A, Olszewski P, Rychlicka-Buniowska E, Scott R, Thorsteinsdottir U, Tomlinson I, Moghadam BT, Turnbull C, Wareham N, Gudbjartsson D, Kamatani Y, Hoffmann E, Jackson S, Stefansson K, Auton A, Ong K, Machiela M, Loh PR, Dumanski J, Chanock S, Forsberg L, Perry J. Genetic predisposition to mosaic Y chromosome loss in blood. Nature. 2019;575(7784):652–657.
Mosaic loss of chromosome Y (LOY) in circulating white blood cells is the most common form of clonal mosaicism, yet our knowledge of the causes and consequences of this is limited. Here, using a computational approach, we estimate that 20% of the male population represented in the UK Biobank study (n = 205,011) has detectable LOY. We identify 156 autosomal genetic determinants of LOY, which we replicate in 757,114 men of European and Japanese ancestry. These loci highlight genes that are involved in cell-cycle regulation and cancer susceptibility, as well as somatic drivers of tumour growth and targets of cancer therapy. We demonstrate that genetic susceptibility to LOY is associated with non-haematological effects on health in both men and women, which supports the hypothesis that clonal haematopoiesis is a biomarker of genomic instability in other tissues. Single-cell RNA sequencing identifies dysregulated expression of autosomal genes in leukocytes with LOY and provides insights into why clonal expansion of these cells may occur. Collectively, these data highlight the value of studying clonal mosaicism to uncover fundamental mechanisms that underlie cancer and other ageing-related diseases.