Publications

2011

Hu X, Kim H, Stahl E, Plenge R, Daly M, Raychaudhuri S. Integrating autoimmune risk loci with gene-expression data identifies specific pathogenic immune cell subsets. Am J Hum Genet. 2011;89(4):496–506.
Although genome-wide association studies have implicated many individual loci in complex diseases, identifying the exact causal alleles and the cell types within which they act remains greatly challenging. To ultimately understand disease mechanism, researchers must carefully conceive functional studies in relevant pathogenic cell types to demonstrate the cellular impact of disease-associated genetic variants. This challenge is highlighted in autoimmune diseases, such as rheumatoid arthritis, where any of a broad range of immunological cell types might potentially be impacted by genetic variation to cause disease. To this end, we developed a statistical approach to identify potentially pathogenic cell types in autoimmune diseases by using a gene-expression data set of 223 murine-sorted immune cells from the Immunological Genome Consortium. We found enrichment of transitional B cell genes in systemic lupus erythematosus (p = 5.9 × 10⁻⁶) and epithelial-associated stimulated dendritic cell genes in Crohn disease (p = 1.6 × 10⁻⁵). Finally, we demonstrated enrichment of CD4+ effector memory T cell genes within rheumatoid arthritis loci (p < 10⁻⁶). To further validate the role of CD4+ effector memory T cells within rheumatoid arthritis, we identified 436 loci that were not yet known to be associated with the disease but that had a statistically suggestive association in a recent genome-wide association study (GWAS) meta-analysis (p(GWAS) < 0.001). Even among these putative loci, we noted a significant enrichment for genes specifically expressed in CD4+ effector memory T cells (p = 1.25 × 10⁻⁴). These cell types are primary candidates for future functional studies to reveal the role of risk alleles in autoimmunity. Our approach has application in other phenotypes, outside of autoimmunity, where many loci have been discovered and high-quality cell-type-specific gene expression is available.
Raychaudhuri S. Mapping rare and common causal alleles for complex human diseases. Cell. 2011;147(1):57–69.
Advances in genotyping and sequencing technologies have revolutionized the genetics of complex disease by locating rare and common variants that influence an individual's risk for diseases, such as diabetes, cancers, and psychiatric disorders. However, capitalizing on these data for prevention and therapies requires the identification of causal alleles and a mechanistic understanding of how these variants contribute to the disease. After discussing the strategies currently used to map variants for complex diseases, this Primer explores how variants may be prioritized for follow-up functional studies and the challenges and approaches for assessing the contributions of rare and common variants to disease phenotypes.
Raychaudhuri S. VIZ-GRAIL: visualizing functional connections across disease loci. Bioinformatics. 2011;27(11):1589–90.
MOTIVATION: As disease loci are rapidly discovered, an emerging challenge is to identify common pathways and biological functionality across loci. Such pathways might point to potential disease mechanisms. One strategy is to look for functionally related or interacting genes across genetic loci. Previously, we defined a statistical strategy, Gene Relationships Across Implicated Loci (GRAIL), to identify whether pair-wise gene relationships defined using PubMed text similarity are enriched across loci. Here, we have implemented VIZ-GRAIL, a software tool to display those relationships and to depict the underlying biological patterns. RESULTS: Our tool can seamlessly interact with the GRAIL web site to obtain the results of analyses and create easy-to-read visual displays. To most clearly display results, VIZ-GRAIL arranges genes and genetic loci to minimize intersecting pair-wise gene connections. VIZ-GRAIL can be easily applied to other types of functional connections, beyond those from GRAIL. This method should help investigators appreciate the presence of potentially important common functions across loci. AVAILABILITY: The GRAIL algorithm is implemented online at http://www.broadinstitute.org/mpg/grail/grail.php. VIZ-GRAIL source code is available at http://www.broadinstitute.org/mpg/grail/vizgrail.html.

2010

Beroukhim R, Mermel C, Porter D, Wei G, Raychaudhuri S, Donovan J, Barretina J, Boehm J, Dobson J, Urashima M, McHenry K, Pinchback R, Ligon A, Cho YJ, Haery L, Greulich H, Reich M, Winckler W, Lawrence M, Weir B, Tanaka K, Chiang D, Bass A, Loo A, Hoffman C, Prensner J, Liefeld T, Gao Q, Yecies D, Signoretti S, Maher E, Kaye F, Sasaki H, Tepper J, Fletcher J, Tabernero J, Baselga J, Tsao MS, Demichelis F, Rubin M, Jänne P, Daly M, Nucera C, Levine R, Ebert B, Gabriel S, Rustgi A, Antonescu C, Ladanyi M, Letai A, Garraway L, Loda M, Beer D, True L, Okamoto A, Pomeroy S, Singer S, Golub T, Lander E, Getz G, Sellers W, Meyerson M. The landscape of somatic copy-number alteration across human cancers. Nature. 2010;463(7283):899–905.
A powerful way to discover key genes with causal roles in oncogenesis is to identify genomic regions that undergo frequent alteration in human cancers. Here we present high-resolution analyses of somatic copy-number alterations (SCNAs) from 3,131 cancer specimens, belonging largely to 26 histological types. We identify 158 regions of focal SCNA that are altered at significant frequency across several cancer types, of which 122 cannot be explained by the presence of a known cancer target gene located within these regions. Several gene families are enriched among these regions of focal SCNA, including the BCL2 family of apoptosis regulators and the NF-κB pathway. We show that cancer cells containing amplifications surrounding the MCL1 and BCL2L1 anti-apoptotic genes depend on the expression of these genes for survival. Finally, we demonstrate that a large majority of SCNAs identified in individual cancer types are present in several cancer types.
Neale B, Fagerness J, Reynolds R, Sobrin L, Parker M, Raychaudhuri S, Tan P, Oh E, Merriam J, Souied E, Bernstein P, Li B, Frederick J, Zhang K, Brantley M, Lee A, Zack D, Campochiaro B, Campochiaro P, Ripke S, Smith T, Barile G, Katsanis N, Allikmets R, Daly M, Seddon J. Genome-wide association study of advanced age-related macular degeneration identifies a role of the hepatic lipase gene (LIPC). Proc Natl Acad Sci U S A. 2010;107(16):7395–400.
Advanced age-related macular degeneration (AMD) is the leading cause of late-onset blindness. We present results of a genome-wide association study of 979 advanced AMD cases and 1,709 controls using the Affymetrix 6.0 platform with replication in seven additional cohorts (totaling 5,789 unrelated cases and 4,234 unrelated controls). We also present a comprehensive analysis of copy-number variations and polymorphisms for AMD. Our discovery data implicated the association between AMD and a variant in the hepatic lipase gene (LIPC) in the high-density lipoprotein cholesterol (HDL) pathway (discovery P = 4.53e-05 for rs493258). Our LIPC association was strongest for a functional promoter variant, rs10468017 (P = 1.34e-08), that influences LIPC expression and serum HDL levels with a protective effect of the minor T allele (HDL increasing) for advanced wet and dry AMD. The association we found with LIPC was corroborated by the Michigan/Penn/Mayo genome-wide association study; the locus near the tissue inhibitor of metalloproteinase 3 was corroborated by our replication cohort for rs9621532 with P = 3.71e-09. We observed weaker associations with other HDL loci (ABCA1, P = 9.73e-04; cholesteryl ester transfer protein, P = 1.41e-03; FADS1-3, P = 2.69e-02). Based on a lack of consistent association between HDL increasing alleles and AMD risk, the LIPC association may not be the result of an effect on HDL levels, but it could represent a pleiotropic effect of the same functional component. Results implicate different biologic pathways than previously reported and provide new avenues for prevention and treatment of AMD.
Orozco G, Eyre S, Hinks A, Ke X, Wellcome Trust Case Control Consortium, YEAR Consortium, Wilson A, Bax D, Morgan A, Emery P, Steer S, Hocking L, Reid D, Wordsworth P, Harrison P, Thomson W, Barton A, Worthington J. Association of CD40 with rheumatoid arthritis confirmed in a large UK case-control study. Ann Rheum Dis. 2010;69(5):813–6.
OBJECTIVE: A recent meta-analysis of published genome-wide association studies (GWAS) in populations of European descent reported novel associations of markers mapping to the CD40, CCL21 and CDK6 genes with rheumatoid arthritis (RA) susceptibility, while a large-scale, case-control association study in a Japanese population identified association with multiple single nucleotide polymorphisms (SNPs) in the CD244 gene. The aim of the current study was to validate these potential RA susceptibility markers in a UK population. METHODS: A total of 4 SNPs (rs4810485 in CD40, rs2812378 in CCL21, rs42041 in CDK6 and rs6682654 in CD244) were genotyped in a UK cohort comprising 3962 UK patients with RA and 3531 healthy controls using the Sequenom iPlex platform. Genotype counts in patients and controls were analysed with the χ² test using Stata. RESULTS: Association to the CD40 gene was robustly replicated (p = 2 × 10⁻⁴, OR 0.86, 95% CI 0.79 to 0.93) and modest evidence was found for association with the CCL21 locus (p = 0.04, OR 1.08, 95% CI 1.01 to 1.16). However, there was no evidence for association of rs42041 (CDK6) and rs6682654 (CD244) with RA susceptibility in this UK population. Following a meta-analysis including the original data, association to CD40 was confirmed (p = 7.8 × 10⁻⁸, OR 0.87, 95% CI 0.83 to 0.92). CONCLUSION: In this large UK cohort, strong association of the CD40 gene with susceptibility to RA was found, and weaker evidence for association with RA in the CCL21 locus.
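The genotype-count comparison named in the METHODS above can be illustrated with a minimal sketch. It assumes a 2 × 3 table (patients and controls by three genotype classes) and a Pearson χ² test with 2 degrees of freedom. The function name and all counts are hypothetical, not taken from the study, and the study itself used Stata rather than this code.

```python
import math

def chi_square_2x3(case_counts, control_counts):
    """Pearson chi-square test on a 2 x 3 genotype table (hypothetical helper).

    Each argument holds three genotype counts, e.g. [AA, Aa, aa].
    Returns (chi2, p_value) for 2 degrees of freedom.
    """
    rows = [case_counts, control_counts]
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected

    # For exactly 2 degrees of freedom the chi-square survival
    # function has the closed form exp(-x / 2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value

# Made-up genotype counts, for illustration only.
cases = [30, 50, 20]
controls = [20, 50, 30]
chi2, p_value = chi_square_2x3(cases, controls)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

The closed-form p-value holds only for 2 degrees of freedom; a test on allele counts (a 2 × 2 table) would need the 1-degree-of-freedom distribution instead.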
Karlson E, Chibnik L, Kraft P, Cui J, Keenan B, Ding B, Raychaudhuri S, Klareskog L, Alfredsson L, Plenge R. Cumulative association of 22 genetic variants with seropositive rheumatoid arthritis risk. Ann Rheum Dis. 2010;69(6):1077–85.
BACKGROUND: Recent discoveries of risk alleles have made it possible to define genetic risk profiles for patients with rheumatoid arthritis (RA). This study examined whether a cumulative score based on 22 validated genetic risk alleles for seropositive RA would identify high-risk, asymptomatic individuals who might benefit from preventive interventions. METHODS: Eight human leucocyte antigen (HLA) alleles and 14 single-nucleotide polymorphisms representing 13 validated RA risk loci were genotyped among 289 white seropositive cases and 481 controls from the US Nurses' Health Studies (NHS) and 629 white cyclic-citrullinated peptide antibody-positive cases and 623 controls from the Swedish Epidemiologic Investigation of Rheumatoid Arthritis (EIRA). A weighted genetic risk score (GRS) was created, in which the weight for each risk allele is the log of the published odds ratio (OR). Logistic regression was used to study associations with incident RA. Area under the curve (AUC) statistics were compared from a clinical-only model and clinical plus genetic model in each cohort. RESULTS: Patients with GRS >1.25 SD of the mean had a significantly higher OR of seropositive RA in both NHS (OR 2.9, 95% CI 1.8 to 4.6) and EIRA (OR 3.4, 95% CI 2.3 to 5.0) referent to the population average. In NHS, the AUC for a clinical model was 0.57 and for a clinical plus genetic model was 0.66, and in EIRA was 0.63 and 0.75, respectively. CONCLUSION: The combination of 22 risk alleles into a weighted GRS significantly stratifies individuals for RA risk beyond clinical risk factors alone. Given the low incidence of RA, the clinical utility of a weighted GRS is limited in the general population.
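The weighting rule stated in the METHODS above (each risk allele weighted by the log of its published odds ratio) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name, the use of the natural logarithm, the odds ratios, and the allele counts are all assumptions made for the example.

```python
import math

def weighted_grs(risk_allele_counts, odds_ratios):
    """Weighted genetic risk score (hypothetical helper).

    risk_allele_counts: copies of each risk allele carried (0, 1, or 2).
    odds_ratios: published odds ratio for each allele, in the same order.
    Each allele contributes count * ln(OR); natural log is assumed here.
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("counts and odds ratios must have the same length")
    return sum(count * math.log(odds_ratio)
               for count, odds_ratio in zip(risk_allele_counts, odds_ratios))

# Made-up values for three alleles, for illustration only.
example_odds_ratios = [2.0, 1.5, 1.2]
example_counts = [2, 1, 0]
score = weighted_grs(example_counts, example_odds_ratios)
print(f"GRS = {score:.4f}")
```

With these made-up inputs the score is 2·ln(2.0) + 1·ln(1.5) + 0·ln(1.2) = ln(6). Because the weights are log odds ratios, alleles with larger published effects contribute more to the score than alleles with modest effects.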
Voight B, Scott L, Steinthorsdottir V, Morris A, Dina C, Welch R, Zeggini E, Huth C, Aulchenko Y, Thorleifsson G, McCulloch L, Ferreira T, Grallert H, Amin N, Wu G, Willer C, Raychaudhuri S, McCarroll S, Langenberg C, Hofmann O, Dupuis J, Qi L, Segrè A, Hoek M, Navarro P, Ardlie K, Balkau B, Benediktsson R, Bennett A, Blagieva R, Boerwinkle E, Bonnycastle L, Bengtsson Boström K, Bravenboer B, Bumpstead S, Burtt N, Charpentier G, Chines P, Cornelis M, Couper D, Crawford G, Doney A, Elliott K, Elliott A, Erdos M, Fox C, Franklin C, Ganser M, Gieger C, Grarup N, Green T, Griffin S, Groves C, Guiducci C, Hadjadj S, Hassanali N, Herder C, Isomaa B, Jackson A, Johnson P, Jørgensen T, Kao W, Klopp N, Kong A, Kraft P, Kuusisto J, Lauritzen T, Li M, Lieverse A, Lindgren C, Lyssenko V, Marre M, Meitinger T, Midthjell K, Morken M, Narisu N, Nilsson P, Owen K, Payne F, Perry J, Petersen AK, Platou C, Proença C, Prokopenko I, Rathmann W, Rayner W, Robertson N, Rocheleau G, Roden M, Sampson M, Saxena R, Shields B, Shrader P, Sigurdsson G, Sparsø T, Strassburger K, Stringham H, Sun Q, Swift A, Thorand B, Tichet J, Tuomi T, Dam R, Haeften T, Herpt T, Vliet-Ostaptchouk J, Walters B, Weedon M, Wijmenga C, Witteman J, Bergman R, Cauchi S, Collins F, Gloyn A, Gyllensten U, Hansen T, Hide W, Hitman G, Hofman A, Hunter D, Hveem K, Laakso M, Mohlke K, Morris A, Palmer C, Pramstaller P, Rudan I, Sijbrands E, Stein L, Tuomilehto J, Uitterlinden A, Walker M, Wareham N, Watanabe R, Abecasis G, Boehm B, Campbell H, Daly M, Hattersley A, Hu F, Meigs J, Pankow J, Pedersen O, Wichmann HE, Barroso I, Florez J, Frayling T, Groop L, Sladek R, Thorsteinsdottir U, Wilson J, Illig T, Froguel P, Duijn C, Stefansson K, Altshuler D, Boehnke M, McCarthy M, MAGIC investigators, GIANT Consortium. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat Genet. 2010;42(7):579–89.
By combining genome-wide association data from 8,130 individuals with type 2 diabetes (T2D) and 38,987 controls of European descent and following up previously unidentified meta-analysis signals in a further 34,412 cases and 59,925 controls, we identified 12 new T2D association signals with combined P < 5 × 10⁻⁸. These include a second independent signal at the KCNQ1 locus; the first report, to our knowledge, of an X-chromosomal association (near DUSP9); and a further instance of overlap between loci implicated in monogenic and multifactorial forms of diabetes (at HNF1A). The identified loci affect both beta-cell function and insulin action, and, overall, T2D association signals show evidence of enrichment for genes involved in cell cycle regulation. We also show that a high proportion of T2D susceptibility loci harbor independent association signals influencing apparently unrelated complex traits.
Raychaudhuri S, Korn J, McCarroll S, International Schizophrenia Consortium, Altshuler D, Sklar P, Purcell S, Daly M. Accurately assessing the risk of schizophrenia conferred by rare copy-number variation affecting genes with brain function. PLoS Genet. 2010;6(9):e1001097.
Investigators have linked rare copy number variants (CNVs) to neuropsychiatric diseases, such as schizophrenia. One hypothesis is that CNV events cause disease by affecting genes with specific brain functions. Under these circumstances, we expect that CNV events in cases should impact brain-function genes more frequently than those events in controls. Previous publications have applied "pathway" analyses to genes within neuropsychiatric case CNVs to show enrichment for brain functions. While such analyses have been suggestive, they often have not rigorously compared the rates of CNVs impacting genes with brain function in cases to controls, and therefore do not address important confounders such as the large size of brain genes and overall differences in rates and sizes of CNVs. To demonstrate the potential impact of confounders, we genotyped rare CNV events in 2,415 unaffected controls with Affymetrix 6.0; we then applied standard pathway analyses using four sets of brain-function genes and observed an apparently highly significant enrichment for each set. The enrichment is simply driven by the large size of brain-function genes. Instead, we propose a case-control statistical test, cnv-enrichment-test, to compare the rate of CNVs impacting specific gene sets in cases versus controls. With simulations, we demonstrate that cnv-enrichment-test is robust to case-control differences in CNV size, CNV rate, and systematic differences in gene size. Finally, we apply cnv-enrichment-test to rare CNV events published by the International Schizophrenia Consortium (ISC). This approach reveals nominal evidence of case-association in the neuronal-activity and the learning gene sets, but not the other two examined gene sets. The neuronal-activity genes have been associated in a separate set of schizophrenia cases and controls; however, testing in independent samples is necessary to definitively confirm this association. Our method is implemented in the PLINK software package.