
2018

Reshef Y, Finucane H, Kelley D, Gusev A, Kotliar D, Ulirsch J, Hormozdiari F, Nasser J, O’Connor L, Geijn B, Loh PR, Grossman S, Bhatia G, Gazal S, Palamara PF, Pinello L, Patterson N, Adams R, Price A. Detecting genome-wide directional effects of transcription factor binding on polygenic disease risk. Nat Genet. 2018;50(10):1483–1493.
Biological interpretation of genome-wide association study data frequently involves assessing whether SNPs linked to a biological process, for example, binding of a transcription factor, show unsigned enrichment for disease signal. However, signed annotations quantifying whether each SNP allele promotes or hinders the biological process can enable stronger statements about disease mechanism. We introduce a method, signed linkage disequilibrium profile regression, for detecting genome-wide directional effects of signed functional annotations on disease risk. We validate the method via simulations and application to molecular quantitative trait loci in blood, recovering known transcriptional regulators. We apply the method to expression quantitative trait loci in 48 Genotype-Tissue Expression tissues, identifying 651 transcription factor-tissue associations including 30 with robust evidence of tissue specificity. We apply the method to 46 diseases and complex traits (average n = 290 K), identifying 77 annotation-trait associations representing 12 independent transcription factor-trait associations, and characterize the underlying transcriptional programs using gene-set enrichment analyses. Our results implicate new causal disease genes and new disease mechanisms.
Loh PR, Genovese G, Handsaker R, Finucane H, Reshef YA, Palamara PF, Birmann B, Talkowski M, Bakhoum S, McCarroll S, Price A. Insights into clonal haematopoiesis from 8,342 mosaic chromosomal alterations. Nature. 2018;559(7714):350–355.
The selective pressures that shape clonal evolution in healthy individuals are largely unknown. Here we investigate 8,342 mosaic chromosomal alterations, from 50 kb to 249 Mb long, that we uncovered in blood-derived DNA from 151,202 UK Biobank participants using phase-based computational techniques (estimated false discovery rate, 6-9%). We found six loci at which inherited variants associated strongly with the acquisition of deletions or loss of heterozygosity in cis. At three such loci (MPL, TM2D3-TARSL2, and FRA10B), we identified a likely causal variant that acted with high penetrance (5-50%). Inherited alleles at one locus appeared to affect the probability of somatic mutation, and at three other loci to be objects of positive or negative clonal selection. Several specific mosaic chromosomal alterations were strongly associated with future haematological malignancies. Our results reveal a multitude of paths towards clonal expansions with a wide range of effects on human health.
Hormozdiari F, Gazal S, Geijn B, Finucane H, Ju C, Loh PR, Schoech A, Reshef Y, Liu X, O’Connor L, Gusev A, Eskin E, Price A. Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits. Nat Genet. 2018;50(7):1041–1047.
There is increasing evidence that many risk loci found using genome-wide association studies are molecular quantitative trait loci (QTLs). Here we introduce a new set of functional annotations based on causal posterior probabilities of fine-mapped molecular cis-QTLs, using data from the Genotype-Tissue Expression (GTEx) and BLUEPRINT consortia. We show that these annotations are more strongly enriched for heritability (5.84× for eQTLs; P = 1.19 × 10) across 41 diseases and complex traits than annotations containing all significant molecular QTLs (1.80× for expression (e)QTLs). eQTL annotations obtained by meta-analyzing all GTEx tissues generally performed best, whereas tissue-specific eQTL annotations produced stronger enrichments for blood- and brain-related diseases and traits. eQTL annotations restricted to loss-of-function intolerant genes were even more enriched for heritability (17.06×; P = 1.20 × 10). All molecular QTLs except splicing QTLs remained significantly enriched in joint analysis, indicating that each of these annotations is uniquely informative for disease and complex trait architectures.
Zhu Z, Lee P, Chaffin M, Chung W, Loh PR, Lu Q, Christiani D, Liang L. A genome-wide cross-trait analysis from UK Biobank highlights the shared genetic architecture of asthma and allergic diseases. Nat Genet. 2018;50(6):857–864.
Clinical and epidemiological data suggest that asthma and allergic diseases are associated and may share a common genetic etiology. We analyzed genome-wide SNP data for asthma and allergic diseases in 33,593 cases and 76,768 controls of European ancestry from UK Biobank. Two publicly available independent genome-wide association studies were used for replication. We have found a strong genome-wide genetic correlation between asthma and allergic diseases (r = 0.75, P = 6.84 × 10). Cross-trait analysis identified 38 genome-wide significant loci, including 7 novel shared loci. Computational analysis showed that shared genetic loci are enriched in immune/inflammatory systems and tissues with epithelium cells. Our work identifies common genetic architectures shared between asthma and allergy and will help to advance understanding of the molecular mechanisms underlying co-morbid asthma and allergic diseases.
Finucane H, Reshef YA, Anttila V, Slowikowski K, Gusev A, Byrnes A, Gazal S, Loh PR, Lareau C, Shoresh N, Genovese G, Saunders A, Macosko E, Pollack S, Brainstorm Consortium, Perry J, Buenrostro J, Bernstein B, Raychaudhuri S, McCarroll S, Neale B, Price A. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat Genet. 2018;50(4):621–629.
We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.

2017

Weng LC, Choi SH, Klarin D, Smith G, Loh PR, Chaffin M, Roselli C, Hulme O, Lunetta K, Dupuis J, Benjamin E, Newton-Cheh C, Kathiresan S, Ellinor P, Lubitz S. Heritability of Atrial Fibrillation. Circ Cardiovasc Genet. 2017;10(6).
BACKGROUND: Previous reports have implicated multiple genetic loci associated with AF, but the contributions of genome-wide variation to AF susceptibility have not been quantified. METHODS AND RESULTS: We assessed the contribution of genome-wide single-nucleotide polymorphism variation to AF risk (single-nucleotide polymorphism heritability, h2g) using data from 120,286 unrelated individuals of European ancestry (2,987 with AF) in the population-based UK Biobank. We ascertained AF based on self-report, medical record billing codes, procedure codes, and death records. We estimated h2g using a variance components method with variants having a minor allele frequency ≥1%. We evaluated h2g in age, sex, and genomic strata of interest. The h2g for AF was 22.1% (95% confidence interval, 15.6%-28.5%) and was similar for early- versus older-onset AF (≤65 versus >65 years of age), as well as for men and women. The proportion of AF variance explained by genetic variation was mainly accounted for by common (minor allele frequency, ≥5%) variants (20.4%; 95% confidence interval, 15.1%-25.6%). Only 6.4% (95% confidence interval, 5.1%-7.7%) of AF variance was attributed to variation within known AF susceptibility, cardiac arrhythmia, and cardiomyopathy gene regions. CONCLUSIONS: Genetic variation contributes substantially to AF risk. The risk for AF conferred by genomic variation is similar to that observed for several other cardiovascular diseases. Established AF loci only explain a moderate proportion of disease risk, suggesting that further genetic discovery, with an emphasis on common variation, is warranted to understand the causal genetic basis of AF.
Márquez-Luna C, Loh PR, Price A. Multiethnic polygenic risk scores improve risk prediction in diverse populations. Genet Epidemiol. 2017;41(8):811–823.
Methods for genetic risk prediction have been widely investigated in recent years. However, most available training data involves European samples, and it is currently unclear how to accurately predict disease risk in other populations. Previous studies have used either training data from European samples in large sample size or training data from the target population in small sample size, but not both. Here, we introduce a multiethnic polygenic risk score that combines training data from European samples and training data from the target population. We applied this approach to predict type 2 diabetes (T2D) in a Latino cohort using both publicly available European summary statistics in large sample size (Neff = 40k) and Latino training data in small sample size (Neff = 8k). Here, we attained a >70% relative improvement in prediction accuracy (from R² = 0.027 to 0.047) compared to methods that use only one source of training data, consistent with large relative improvements in simulations. We observed a systematically lower load of T2D risk alleles in Latino individuals with more European ancestry, which could be explained by polygenic selection in ancestral European and/or Native American populations. We predicted T2D in a South Asian UK Biobank cohort using European (Neff = 40k) and South Asian (Neff = 16k) training data and attained a >70% relative improvement in prediction accuracy; application to predict height in an African UK Biobank cohort using European (N = 113k) and African (N = 2k) training data attained a 30% relative improvement. Our work reduces the gap in polygenic risk prediction accuracy between European and non-European target populations.
Gazal S, Finucane H, Furlotte N, Loh PR, Palamara PF, Liu X, Schoech A, Bulik-Sullivan B, Neale B, Gusev A, Price A. Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nat Genet. 2017;49(10):1421–1427.
Recent work has hinted at the linkage disequilibrium (LD)-dependent architecture of human complex traits, where SNPs with low levels of LD (LLD) have larger per-SNP heritability. Here we analyzed summary statistics from 56 complex traits (average N = 101,401) by extending stratified LD score regression to continuous annotations. We determined that SNPs with low LLD have significantly larger per-SNP heritability and that roughly half of this effect can be explained by functional annotations negatively correlated with LLD, such as DNase I hypersensitivity sites (DHSs). The remaining signal is largely driven by our finding that more recent common variants tend to have lower LLD and to explain more heritability (P = 2.38 × 10⁻¹⁰⁴); the youngest 20% of common SNPs explain 3.9 times more heritability than the oldest 20%, consistent with the action of negative selection. We also inferred jointly significant effects of other LD-related annotations and confirmed via forward simulations that they jointly predict deleterious effects.
Willems S, Wright D, Day F, Trajanoska K, Joshi P, Morris J, Matteini A, Garton F, Grarup N, Oskolkov N, Thalamuthu A, Mangino M, Liu J, Demirkan A, Lek M, Xu L, Wang G, Oldmeadow C, Gaulton K, Lotta L, Miyamoto-Mikami E, Rivas M, White T, Loh PR, . , Rivadeneira F, Langenberg C, Perry J, Wareham N, Scott R. Large-scale GWAS identifies multiple loci for hand grip strength providing biological insights into muscular fitness. Nat Commun. 2017;8:16015.
Hand grip strength is a widely used proxy of muscular fitness, a marker of frailty, and predictor of a range of morbidities and all-cause mortality. To investigate the genetic determinants of variation in grip strength, we perform a large-scale genetic discovery analysis in a combined sample of 195,180 individuals and identify 16 loci associated with grip strength (P < 5 × 10⁻⁸) in combined analyses. A number of these loci contain genes implicated in structure and function of skeletal muscle fibres (ACTG1), neuronal maintenance and signal transduction (PEX14, TGFA, SYT1), or monogenic syndromes with involvement of psychomotor impairment (PEX14, LRPPRC and KANSL1). Mendelian randomization analyses are consistent with a causal effect of higher genetically predicted grip strength on lower fracture risk. In conclusion, our findings provide new biological insight into the mechanistic underpinnings of grip strength and the causal role of muscular strength in age-related morbidities and mortality.