Publications

2013

Xie G, Roshandel D, Sherva R, Monach P, Lu EY, Kung T, Carrington K, Zhang S, Pulit S, Ripke S, Carette S, Dellaripa P, Edberg J, Hoffman G, Khalidi N, Langford C, Mahr A, St Clair W, Seo P, Specks U, Spiera R, Stone J, Ytterberg S, Raychaudhuri S, Bakker P, Farrer L, Amos C, Merkel P, Siminovitch K. Association of granulomatosis with polyangiitis (Wegener’s) with HLA-DPB1*04 and SEMA6A gene variants: evidence from genome-wide analysis. Arthritis Rheum. 2013;65(9):2457–68.
OBJECTIVE: To identify genetic determinants of granulomatosis with polyangiitis (Wegener's) (GPA). METHODS: We carried out a genome-wide association study (GWAS) of 492 GPA cases and 1,506 healthy controls (white subjects of European descent), followed by replication analysis of the most strongly associated signals in an independent cohort of 528 GPA cases and 1,228 controls. RESULTS: Genome-wide significant associations were identified in 32 single-nucleotide polymorphic (SNP) markers across the HLA region, the majority of which were located in the HLA-DPB1 and HLA-DPA1 genes encoding the class II major histocompatibility complex (MHC) DPβ chain 1 and DPα chain 1 proteins, respectively. Peak association signals in these 2 genes, emanating from SNPs rs9277554 (for DPβ chain 1) and rs9277341 (DPα chain 1) were strongly replicated in an independent cohort (in the combined analysis of the initial cohort and the replication cohort, P = 1.92 × 10⁻⁵⁰ and 2.18 × 10⁻³⁹, respectively). Imputation of classic HLA alleles and conditional analyses revealed that the SNP association signal was fully accounted for by the classic HLA-DPB1*04 allele. An independent single SNP, rs26595, near SEMA6A (the gene for semaphorin 6A) on chromosome 5, was also associated with GPA, reaching genome-wide significance in a combined analysis of the GWAS and replication cohorts (P = 2.09 × 10⁻⁸). CONCLUSION: We identified the SEMA6A and HLA-DP loci as significant contributors to risk for GPA, with the HLA-DPB1*04 allele almost completely accounting for the MHC association. These two associations confirm the critical role of immunogenetic factors in the development of GPA.
Consortium CDGPG, Lee H, Ripke S, Neale B, Faraone S, Purcell S, Perlis R, Mowry B, Thapar A, Goddard M, Witte J, Absher D, Agartz I, Akil H, Amin F, Andreassen O, Anjorin A, Anney R, Anttila V, Arking D, Asherson P, Azevedo M, Backlund L, Badner J, Bailey A, Banaschewski T, Barchas J, Barnes M, Barrett T, Bass N, Battaglia A, Bauer M, Bayés M, Bellivier F, Bergen S, Berrettini W, Betancur C, Bettecken T, Biederman J, Binder E, Black D, Blackwood D, Bloss C, Boehnke M, Boomsma D, Breen G, Breuer R, Bruggeman R, Cormican P, Buccola N, Buitelaar J, Bunney W, Buxbaum J, Byerley W, Byrne E, Caesar S, Cahn W, Cantor R, Casas M, Chakravarti A, Chambert K, Choudhury K, Cichon S, Cloninger R, Collier D, Cook E, Coon H, Cormand B, Corvin A, Coryell W, Craig D, Craig I, Crosbie J, Cuccaro M, Curtis D, Czamara D, Datta S, Dawson G, Day R, De Geus E, Degenhardt F, Djurovic S, Donohoe G, Doyle A, Duan J, Dudbridge F, Duketis E, Ebstein R, Edenberg H, Elia J, Ennis S, Etain B, Fanous A, Farmer A, Ferrier N, Flickinger M, Fombonne E, Foroud T, Frank J, Franke B, Fraser C, Freedman R, Freimer N, Freitag C, Friedl M, Frisén L, Gallagher L, Gejman P, Georgieva L, Gershon E, Geschwind D, Giegling I, Gill M, Gordon S, Gordon-Smith K, Green E, Greenwood T, Grice D, Gross M, Grozeva D, Guan W, Gurling H, De Haan L, Haines J, Hakonarson H, Hallmayer J, Hamilton S, Hamshere M, Hansen T, Hartmann A, Hautzinger M, Heath A, Henders A, Herms S, Hickie I, Hipolito M, Hoefels S, Holmans P, Holsboer F, Hoogendijk W, Hottenga JJ, Hultman C, Hus V, Ingason A, Ising M, Jamain S, Jones E, Jones I, Jones L, Tzeng JY, Kähler A, Kahn R, Kandaswamy R, Keller M, Kennedy J, Kenny E, Kent L, Kim Y, Kirov G, Klauck S, Klei L, Knowles J, Kohli M, Koller D, Konte B, Korszun A, Krabbendam L, Krasucki R, Kuntsi J, Kwan P, Landén M, Långström N, Lathrop M, Lawrence J, Lawson W, Leboyer M, Ledbetter D, Lee P, Lencz T, Lesch KP, Levinson D, Lewis C, Li J, Lichtenstein P, Lieberman J, Lin DY, Linszen D, Liu C, 
Lohoff F, Loo S, Lord C, Lowe J, Lucae S, MacIntyre D, Madden P, Maestrini E, Magnusson P, Mahon P, Maier W, Malhotra A, Mane S, Martin C, Martin N, Mattheisen M, Matthews K, Mattingsdal M, McCarroll S, McGhee K, McGough J, McGrath P, McGuffin P, McInnis M, McIntosh A, McKinney R, McLean A, McMahon F, McMahon W, McQuillin A, Medeiros H, Medland S, Meier S, Melle I, Meng F, Meyer J, Middeldorp C, Middleton L, Milanova V, Miranda A, Monaco A, Montgomery G, Moran J, Moreno-De-Luca D, Morken G, Morris D, Morrow E, Moskvina V, Muglia P, Mühleisen T, Muir W, Müller-Myhsok B, Murtha M, Myers R, Myin-Germeys I, Neale M, Nelson S, Nievergelt C, Nikolov I, Nimgaonkar V, Nolen W, Nöthen M, Nurnberger J, Nwulia E, Nyholt D, O’Dushlaine C, Oades R, Olincy A, Oliveira G, Olsen L, Ophoff R, Osby U, Owen M, Palotie A, Parr J, Paterson A, Pato C, Pato M, Penninx B, Pergadia M, Pericak-Vance M, Pickard B, Pimm J, Piven J, Posthuma D, Potash J, Poustka F, Propping P, Puri V, Quested D, Quinn E, Ramos-Quiroga JA, Rasmussen H, Raychaudhuri S, Rehnström K, Reif A, Ribasés M, Rice J, Rietschel M, Roeder K, Roeyers H, Rossin L, Rothenberger A, Rouleau G, Ruderfer D, Rujescu D, Sanders A, Sanders S, Santangelo S, Sergeant J, Schachar R, Schalling M, Schatzberg A, Scheftner W, Schellenberg G, Scherer S, Schork N, Schulze T, Schumacher J, Schwarz M, Scolnick E, Scott L, Shi J, Shilling P, Shyn S, Silverman J, Slager S, Smalley S, Smit J, Smith E, Sonuga-Barke E, St Clair D, State M, Steffens M, Steinhausen HC, Strauss J, Strohmaier J, Stroup S, Sutcliffe J, Szatmari P, Szelinger S, Thirumalai S, Thompson R, Todorov A, Tozzi F, Treutlein J, Uhr M, Oord E, Van Grootheest G, Os J, Vicente A, Vieland V, Vincent J, Visscher P, Walsh C, Wassink T, Watson S, Weissman M, Werge T, Wienker T, Wijsman E, Willemsen G, Williams N, Willsey J, Witt S, Xu W, Young A, Yu T, Zammit S, Zandi P, Zhang P, Zitman F, Zöllner S, Devlin B, Kelsoe J, Sklar P, Daly M, O’Donovan M, Craddock N, Sullivan P, Smoller J, 
Kendler K, Wray N, International Inflammatory Bowel Disease Genetics Consortium (IIBDGC). Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs. Nat Genet. 2013;45(9):984–94.
Most psychiatric disorders are moderately to highly heritable. The degree to which genetic variation is unique to individual disorders or shared across disorders is unclear. To examine shared genetic etiology, we use genome-wide genotype data from the Psychiatric Genomics Consortium (PGC) for cases and controls in schizophrenia, bipolar disorder, major depressive disorder, autism spectrum disorders (ASD) and attention-deficit/hyperactivity disorder (ADHD). We apply univariate and bivariate methods for the estimation of genetic variation within and covariation between disorders. SNPs explained 17-29% of the variance in liability. The genetic correlation calculated using common SNPs was high between schizophrenia and bipolar disorder (0.68 ± 0.04 s.e.), moderate between schizophrenia and major depressive disorder (0.43 ± 0.06 s.e.), bipolar disorder and major depressive disorder (0.47 ± 0.06 s.e.), and ADHD and major depressive disorder (0.32 ± 0.07 s.e.), low between schizophrenia and ASD (0.16 ± 0.06 s.e.) and non-significant for other pairs of disorders as well as between psychiatric disorders and the negative control of Crohn's disease. This empirical evidence of shared genetic etiology for psychiatric disorders can inform nosology and encourages the investigation of common pathophysiologies for related disorders.
Seddon J, Yu Y, Miller E, Reynolds R, Tan P, Gowrisankar S, Goldstein J, Triebwasser M, Anderson H, Zerbib J, Kavanagh D, Souied E, Katsanis N, Daly M, Atkinson J, Raychaudhuri S. Rare variants in CFI, C3 and C9 are associated with high risk of advanced age-related macular degeneration. Nat Genet. 2013;45(11):1366–70.
To define the role of rare variants in advanced age-related macular degeneration (AMD) risk, we sequenced the exons of 681 genes within all reported AMD loci and related pathways in 2,493 cases and controls. We first tested each gene for increased or decreased burden of rare variants in cases compared to controls. We found that 7.8% of AMD cases compared to 2.3% of controls are carriers of rare missense CFI variants (odds ratio (OR) = 3.6; P = 2 × 10⁻⁸). There was a predominance of dysfunctional variants in cases compared to controls. We then tested individual variants for association with disease. We observed significant association with rare missense alleles in genes other than CFI. Genotyping in 5,115 independent samples confirmed associations with AMD of an allele in C3 encoding p.Lys155Gln (replication P = 3.5 × 10⁻⁵, OR = 2.8; joint P = 5.2 × 10⁻⁹, OR = 3.8) and an allele in C9 encoding p.Pro167Ser (replication P = 2.4 × 10⁻⁵, OR = 2.2; joint P = 6.5 × 10⁻⁷, OR = 2.2). Finally, we show that the allele of C3 encoding Gln155 results in resistance to proteolytic inactivation by CFH and CFI. These results implicate loss of C3 protein regulation and excessive alternative complement activation in AMD pathogenesis, thus informing both the direction of effect and mechanistic underpinnings of this disorder.
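As a quick arithmetic check on the CFI result above, the reported carrier frequencies (7.8% of cases, 2.3% of controls) can be turned into a crude odds ratio. This is only a sketch: the helper name `odds_ratio` is hypothetical, and the calculation is unadjusted, so it need not match however the published OR was actually estimated.

```python
def odds_ratio(p_cases: float, p_controls: float) -> float:
    """Crude (unadjusted) odds ratio from two carrier proportions."""
    odds_cases = p_cases / (1.0 - p_cases)          # odds of carrying a variant among cases
    odds_controls = p_controls / (1.0 - p_controls)  # odds of carrying a variant among controls
    return odds_cases / odds_controls


# Carrier frequencies quoted in the abstract: 7.8% of cases, 2.3% of controls.
crude_or = odds_ratio(0.078, 0.023)
print(round(crude_or, 1))  # 3.6
```

The crude value rounds to the OR of 3.6 quoted in the abstract.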
Liao K, Cai T, Gainer V, Cagan A, Murphy S, Liu C, Churchill S, Shaw S, Kohane I, Solomon D, Plenge R, Karlson E. Lipid and lipoprotein levels and trend in rheumatoid arthritis compared to the general population. Arthritis Care Res (Hoboken). 2013;65(12):2046–50.
OBJECTIVE: Differences in lipid levels associated with cardiovascular (CV) risk between rheumatoid arthritis (RA) patients and the general population remain unclear. Determining these differences is important in understanding the role of lipids in CV risk in RA. METHODS: We studied 2,005 RA subjects from 2 large academic medical centers. We extracted electronic medical record data on the first low-density lipoprotein (LDL) measurement, and total cholesterol and high-density lipoprotein (HDL) measurements within 1 year of the LDL measurement. Subjects with an electronic statin prescription prior to the first LDL measurement were excluded. We compared lipid levels in RA patients to recently published levels from the general US population using the t-test and stratifying by published parameters, i.e., 2007-2010, and women. We determined lipid trends using separate linear regression models for total cholesterol, LDL cholesterol, and HDL cholesterol, testing the association between year of measurement (1989-2010) and lipid level, adjusted by age and sex. Lipid trends in RA were qualitatively compared to the published general population trends. RESULTS: Women with RA had a significantly lower total cholesterol (186 versus 200 mg/dl; P = 0.002) and LDL cholesterol (105 versus 118 mg/dl; P = 0.001) compared to the general population (2007-2010). HDL cholesterol was not significantly different in the 2 groups. In the RA cohort, total cholesterol and LDL cholesterol significantly decreased each year, while HDL cholesterol increased (all with P < 0.0001), consistent with overall trends observed in a previous study. CONCLUSION: RA patients appear to have an overall lower total cholesterol and LDL cholesterol than the general population despite the general overall increased risk of CV disease in RA from observational studies.
Chen Y, Carroll R, Hinz EM, Shah A, Eyler A, Denny J, Xu H. Applying active learning to high-throughput phenotyping algorithms for electronic health records data. J Am Med Inform Assoc. 2013;20(e2):e253–9.
OBJECTIVES: Generalizable, high-throughput phenotyping methods based on supervised machine learning (ML) algorithms could significantly accelerate the use of electronic health records data for clinical and translational research. However, they often require large numbers of annotated samples, which are costly and time-consuming to review. We investigated the use of active learning (AL) in ML-based phenotyping algorithms. METHODS: We integrated an uncertainty sampling AL approach with support vector machines-based phenotyping algorithms and evaluated its performance using three annotated disease cohorts including rheumatoid arthritis (RA), colorectal cancer (CRC), and venous thromboembolism (VTE). We investigated performance using two types of feature sets: unrefined features, which contained at least all clinical concepts extracted from notes and billing codes; and a smaller set of refined features selected by domain experts. The performance of the AL was compared with a passive learning (PL) approach based on random sampling. RESULTS: Our evaluation showed that AL outperformed PL on three phenotyping tasks. When unrefined features were used in the RA and CRC tasks, AL reduced the number of annotated samples required to achieve an area under the curve (AUC) score of 0.95 by 68% and 23%, respectively. AL also achieved a reduction of 68% for VTE with an optimal AUC of 0.70 using refined features. As expected, refined features improved the performance of phenotyping classifiers and required fewer annotated samples. CONCLUSIONS: This study demonstrated that AL can be useful in ML-based phenotyping methods. Moreover, AL and feature engineering based on domain knowledge could be combined to develop efficient and generalizable phenotyping methods.
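The uncertainty sampling step described above can be illustrated with a minimal sketch. It assumes a margin-based classifier (such as an SVM) that exposes a signed decision value, and picks the unlabeled samples closest to the decision boundary for annotation. The names `pick_most_uncertain` and `toy_decision_function`, the weights, and the data are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence


def pick_most_uncertain(
    pool: Sequence[Sequence[float]],
    decision_function: Callable[[Sequence[float]], float],
    batch_size: int = 1,
) -> List[int]:
    """Return indices of the unlabeled samples closest to the decision boundary."""
    # A small absolute decision value means the classifier is least certain.
    ranked = sorted(range(len(pool)), key=lambda i: abs(decision_function(pool[i])))
    return ranked[:batch_size]


def toy_decision_function(x: Sequence[float]) -> float:
    """Hypothetical linear classifier standing in for a trained SVM."""
    weights = (1.0, -1.0)
    return sum(w * v for w, v in zip(weights, x))


# Unlabeled pool of two-feature samples (made-up values).
unlabeled_pool = [(2.0, 0.0), (0.6, 0.5), (0.0, 3.0), (1.0, 1.2)]
to_annotate = pick_most_uncertain(unlabeled_pool, toy_decision_function, batch_size=2)
print(to_annotate)  # [1, 3]
```

In an active learning loop, the selected samples would be sent for annotation, added to the training set, and the classifier retrained before the next selection round. Passive learning would replace the ranking with a random draw from the pool.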
Li G, Diogo D, Wu D, Spoonamore J, Dancik V, Franke L, Kurreeman F, Rossin E, Duclos G, Hartland C, Zhou X, Li K, Liu J, De Jager P, Siminovitch K, Zhernakova A, Raychaudhuri S, Bowes J, Eyre S, Padyukov L, Gregersen P, Worthington J, Rheumatoid Arthritis Consortium International (RACI), Gupta N, Clemons P, Stahl E, Tolliday N, Plenge R. Human genetics in rheumatoid arthritis guides a high-throughput drug screen of the CD40 signaling pathway. PLoS Genet. 2013;9(5):e1003487.
Although genetic and non-genetic studies in mouse and human implicate the CD40 pathway in rheumatoid arthritis (RA), there are no approved drugs that inhibit CD40 signaling for clinical care in RA or any other disease. Here, we sought to understand the biological consequences of a CD40 risk variant in RA discovered by a previous genome-wide association study (GWAS) and to perform a high-throughput drug screen for modulators of CD40 signaling based on human genetic findings. First, we fine-map the CD40 risk locus in 7,222 seropositive RA patients and 15,870 controls, together with deep sequencing of CD40 coding exons in 500 RA cases and 650 controls, to identify a single SNP that explains the entire signal of association (rs4810485, P = 1.4 × 10⁻⁹). Second, we demonstrate that subjects homozygous for the RA risk allele have ∼33% more CD40 on the surface of primary human CD19+ B lymphocytes than subjects homozygous for the non-risk allele (P = 10⁻⁹), a finding corroborated by expression quantitative trait loci (eQTL) analysis in peripheral blood mononuclear cells from 1,469 healthy control individuals. Third, we use retroviral shRNA infection to perturb the amount of CD40 on the surface of a human B lymphocyte cell line (BL2) and observe a direct correlation between amount of CD40 protein and phosphorylation of RelA (p65), a subunit of the NF-κB transcription factor. Finally, we develop a high-throughput NF-κB luciferase reporter assay in BL2 cells activated with trimerized CD40 ligand (tCD40L) and conduct an HTS of 1,982 chemical compounds and FDA-approved drugs. After a series of counter-screens and testing in primary human CD19+ B cells, we identify 2 novel chemical inhibitors not previously implicated in inflammation or CD40-mediated NF-κB signaling.
Our study demonstrates proof-of-concept that human genetics can be used to guide the development of phenotype-based, high-throughput small-molecule screens to identify potential novel therapies in complex traits such as RA.
Raj T, Kuchroo M, Replogle J, Raychaudhuri S, Stranger B, De Jager P. Common risk alleles for inflammatory diseases are targets of recent positive selection. Am J Hum Genet. 2013;92(4):517–29.
Genome-wide association studies (GWASs) have identified hundreds of loci harboring genetic variation influencing inflammatory-disease susceptibility in humans. It has been hypothesized that present day inflammatory diseases may have arisen, in part, due to pleiotropic effects of host resistance to pathogens over the course of human history, with significant selective pressures acting to increase host resistance to pathogens. The extent to which genetic factors underlying inflammatory-disease susceptibility have been influenced by selective processes can now be quantified more comprehensively than previously possible. To understand the evolutionary forces that have shaped inflammatory-disease susceptibility and to elucidate functional pathways affected by selection, we performed a systems-based analysis to integrate (1) published GWASs for inflammatory diseases, (2) a genome-wide scan for signatures of positive selection in a population of European ancestry, (3) functional genomics data comprising protein-protein interaction networks, and (4) a genome-wide expression quantitative trait locus (eQTL) mapping study in peripheral blood mononuclear cells (PBMCs). We demonstrate that loci for inflammatory-disease susceptibility are enriched for genomic signatures of recent positive natural selection, with selected loci forming a highly interconnected protein-protein interaction network. Further, we identify 21 loci for inflammatory-disease susceptibility that display signatures of recent positive selection, of which 13 also show evidence of cis-regulatory effects on genes within the associated locus. Thus, our integrated analyses highlight a set of susceptibility loci that might subserve a shared molecular function and have experienced selective pressure over the course of human history; today, these loci play a key role in influencing susceptibility to multiple different inflammatory diseases, in part through alterations of gene expression in immune cells.
Cui J, Stahl E, Saevarsdottir S, Miceli C, Diogo D, Trynka G, Raj T, Mirkov MU, Canhao H, Ikari K, Terao C, Okada Y, Wedrén S, Askling J, Yamanaka H, Momohara S, Taniguchi A, Ohmura K, Matsuda F, Mimori T, Gupta N, Kuchroo M, Morgan A, Isaacs J, Wilson A, Hyrich K, Herenius M, Doorenspleet M, Tak PP, Crusius B, Horst-Bruinsma I, Wolbink GJ, Riel P, Laar M, Guchelaar HJ, Shadick N, Allaart C, Huizinga T, Toes R, Kimberly R, Bridges L, Criswell L, Moreland L, Fonseca JE, Vries N, Stranger B, De Jager P, Raychaudhuri S, Weinblatt M, Gregersen P, Mariette X, Barton A, Padyukov L, Coenen MJ, Karlson E, Plenge R. Genome-wide association study and gene expression analysis identifies CD84 as a predictor of response to etanercept therapy in rheumatoid arthritis. PLoS Genet. 2013;9(3):e1003394.
Anti-tumor necrosis factor alpha (anti-TNF) biologic therapy is a widely used treatment for rheumatoid arthritis (RA). It is unknown why some RA patients fail to respond adequately to anti-TNF therapy, which limits the development of clinical biomarkers to predict response or new drugs to target refractory cases. To understand the biological basis of response to anti-TNF therapy, we conducted a genome-wide association study (GWAS) meta-analysis of more than 2 million common variants in 2,706 RA patients from 13 different collections. Patients were treated with one of three anti-TNF medications: etanercept (n = 733), infliximab (n = 894), or adalimumab (n = 1,071). We identified a SNP (rs6427528) at the 1q23 locus that was associated with change in disease activity score (ΔDAS) in the etanercept subset of patients (P = 8 × 10⁻⁸), but not in the infliximab or adalimumab subsets (P > 0.05). The SNP is predicted to disrupt transcription factor binding site motifs in the 3' UTR of an immune-related gene, CD84, and the allele associated with better response to etanercept was associated with higher CD84 gene expression in peripheral blood mononuclear cells (P = 1 × 10⁻¹¹ in 228 non-RA patients and P = 0.004 in 132 RA patients). Consistent with the genetic findings, higher CD84 gene expression correlated with lower cross-sectional DAS (P = 0.02, n = 210) and showed a non-significant trend for better ΔDAS in a subset of RA patients with gene expression data (n = 31, etanercept-treated). A small, multi-ethnic replication showed a non-significant trend towards an association among etanercept-treated RA patients of Portuguese ancestry (n = 139, P = 0.4), but no association among patients of Japanese ancestry (n = 151, P = 0.8). Our study demonstrates that an allele associated with response to etanercept therapy is also associated with CD84 gene expression, and further that CD84 expression correlates with disease activity.
These findings support a model in which CD84 genotypes and/or expression may serve as a useful biomarker for response to etanercept treatment in RA patients of European ancestry.
Diogo D, Kurreeman F, Stahl E, Liao K, Gupta N, Greenberg J, Rivas M, Hickey B, Flannick J, Thomson B, Guiducci C, Ripke S, Adzhubey I, Barton A, Kremer J, Alfredsson L, America CRRN, Rheumatoid Arthritis Consortium International, Sunyaev S, Martin J, Zhernakova A, Bowes J, Eyre S, Siminovitch K, Gregersen P, Worthington J, Klareskog L, Padyukov L, Raychaudhuri S, Plenge R. Rare, low-frequency, and common variants in the protein-coding sequence of biological candidate genes from GWASs contribute to risk of rheumatoid arthritis. Am J Hum Genet. 2013;92(1):15–27.
The extent to which variants in the protein-coding sequence of genes contribute to risk of rheumatoid arthritis (RA) is unknown. In this study, we addressed this issue by deep exon sequencing and large-scale genotyping of 25 biological candidate genes located within RA risk loci discovered by genome-wide association studies (GWASs). First, we assessed the contribution of rare coding variants in the 25 genes to the risk of RA in a pooled sequencing study of 500 RA cases and 650 controls of European ancestry. We observed an accumulation of rare nonsynonymous variants exclusive to RA cases in IL2RA and IL2RB (burden test: p = 0.007 and p = 0.018, respectively). Next, we assessed the aggregate contribution of low-frequency and common coding variants to the risk of RA by dense genotyping of the 25 gene loci in 10,609 RA cases and 35,605 controls. We observed a strong enrichment of coding variants with a nominal signal of association with RA (p < 0.05) after adjusting for the best signal of association at the loci (p(enrichment) = 6.4 × 10⁻⁴). For one locus containing CD2, we found that a missense variant, rs699738 (c.798C>A [p.His266Gln]), and a noncoding variant, rs624988, reside on distinct haplotypes and independently contribute to the risk of RA (p = 4.6 × 10⁻⁶). Overall, our results indicate that variants (distributed across the allele-frequency spectrum) within the protein-coding portion of a subset of biological candidate genes identified by GWASs contribute to the risk of RA. Further, we have demonstrated that very large sample sizes will be required for comprehensively identifying the independent alleles contributing to the missing heritability of RA.