Publications by Year: 2019

Korsunsky I, Millard N, Fan J, Slowikowski K, Zhang F, Wei K, Baglaenko Y, Brenner M, Loh PR, Raychaudhuri S. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods. 2019;16(12):1289–1296.
The emerging diversity of single-cell RNA-seq datasets allows for the full transcriptional characterization of cell types across a wide variety of biological and clinical conditions. However, it is challenging to analyze them together, particularly when datasets are assayed with different technologies, because biological and technical differences are interspersed. We present Harmony (https://github.com/immunogenomics/harmony), an algorithm that projects cells into a shared embedding in which cells group by cell type rather than dataset-specific conditions. Harmony simultaneously accounts for multiple experimental and biological factors. In six analyses, we demonstrate the superior performance of Harmony to previously published algorithms while requiring fewer computational resources. Harmony enables the integration of ~10^6 cells on a personal computer. We apply Harmony to peripheral blood mononuclear cells from datasets with large experimental differences, five studies of pancreatic islet cells, mouse embryogenesis datasets and the integration of scRNA-seq with spatial transcriptomics data.
Zhang F, Wei K, Slowikowski K, Fonseka C, Rao D, Kelly, Goodman S, Tabechian, Hughes L, Salomon-Escoto, Watts G, Jonsson A, Rangel-Moreno, Meednu, Rozo, Apruzzese, Eisenhaure T, Lieb D, Boyle D, Mandelin A, Accelerating Medicines Partnership Rheumatoid Arthritis and Systemic Lupus Erythematosus (AMP RA/SLE) Consortium, Boyce B, DiCarlo, Gravallese E, Gregersen P, Moreland, Firestein G, Hacohen, Nusbaum C, Lederer J, Perlman, Pitzalis C, Filer, Holers V, Bykerk V, Donlin L, Anolik J, Brenner M, Raychaudhuri S. Defining Inflammatory Cell States in Rheumatoid Arthritis Joint Synovial Tissues by Integrating Single-cell Transcriptomics and Mass Cytometry. Nat Immunol. 2019;20:928–942.

To define the cell populations that drive joint inflammation in rheumatoid arthritis (RA), we applied single-cell RNA sequencing (scRNA-seq), mass cytometry, bulk RNA sequencing (RNA-seq) and flow cytometry to T cells, B cells, monocytes, and fibroblasts from 51 samples of synovial tissue from patients with RA or osteoarthritis (OA). Utilizing an integrated strategy based on canonical correlation analysis of 5,265 scRNA-seq profiles, we identified 18 unique cell populations. Combining mass cytometry and transcriptomics revealed cell states expanded in RA synovia: THY1(CD90)+HLA-DRAhi sublining fibroblasts, IL1B+ pro-inflammatory monocytes, ITGAX+TBX21+ autoimmune-associated B cells and PDCD1+ peripheral helper T (TPH) cells and follicular helper T (TFH) cells. We defined distinct subsets of CD8+ T cells characterized by GZMK+, GZMB+, and GNLY+ phenotypes. We mapped inflammatory mediators to their source cell populations; for example, we attributed IL6 expression to THY1+HLA-DRAhi fibroblasts and IL1B production to pro-inflammatory monocytes. These populations are potentially key mediators of RA pathogenesis.

Amariuta T, Luo Y, Gazal S, Davenport EE, Geijn B, Ishigaki K, Westra HJ, Teslovich N, Okada Y, Yamamoto K, RACI Consortium, GARNET consortium, Price A, Raychaudhuri S. IMPACT: Genomic annotation of cell-state-specific regulatory elements inferred from the epigenome of bound transcription factors. Am J Hum Genet. 2019;104(5):879–895.
Despite significant progress in annotating the genome with experimental methods, much of the regulatory noncoding genome remains poorly defined. Here we assert that regulatory elements may be characterized by leveraging local epigenomic signatures at sites where specific transcription factors (TFs) are bound. To link these two identifying features, we introduce IMPACT, a genome annotation strategy which identifies regulatory elements defined by cell-state-specific TF binding profiles, learned from 515 chromatin and sequence annotations. We validate IMPACT using multiple compelling applications. First, IMPACT predicts TF motif binding with high accuracy (average AUC 0.92, s.e. 0.03; across 8 TFs), a significant improvement (all p<6.9e-15) over intersecting motifs with open chromatin (average AUC 0.66, s.e. 0.11). Second, an IMPACT annotation trained on RNA polymerase II is more enriched for peripheral blood cis-eQTL variation (N=3,754) than sequence-based annotations, such as promoters and regions around the TSS (permutation p<1e-3, 25% average increase in enrichment). Third, integration with rheumatoid arthritis (RA) summary statistics from European (N=38,242) and East Asian (N=22,515) populations revealed that the top 5% of CD4+ Treg IMPACT regulatory elements capture 85.7% (s.e. 19.4%) of RA h2 (p<1.6e-5) and that the top 9.8% of Treg IMPACT regulatory elements, consisting of all SNPs with a non-zero annotation value, capture 97.3% (s.e. 18.2%) of RA h2 (p<7.6e-7), the most comprehensive explanation for RA h2 to date. In comparison, the average RA h2 captured by compared CD4+ T histone marks is 42.3% and by CD4+ T specifically expressed gene sets is 36.4%. Finally, integration with RA fine-mapping data (N=27,345) revealed a significant enrichment (2.87, p<8.6e-3) of putatively causal variants across 20 RA-associated loci in the top 1% of CD4+ Treg IMPACT regulatory regions. Overall, we find that IMPACT generalizes well to other cell types in identifying complex-trait-associated regulatory elements.
Spiliopoulou A, Colombo M, Plant D, Nair N, Cui J, Coenen MJ, Ikari K, Yamanaka H, Saevarsdottir S, Padyukov L, Bridges L, Kimberly R, Okada Y, Riel PC, Wolbink GJ, Horst-Bruinsma I, Vries N, Tak P, Ohmura K, Canhao H, Guchelaar HJ, Huizinga T, Criswell L, Raychaudhuri S, Weinblatt M, Wilson A, Mariette X, Isaacs J, Morgan A, Pitzalis C, Barton A, McKeigue P. Association of response to TNF inhibitors in rheumatoid arthritis with quantitative trait loci for CD40 and CD39. Ann Rheum Dis. 2019;78(8):1055–1061.
OBJECTIVES: We sought to investigate whether genetic effects on response to TNF inhibitors (TNFi) in rheumatoid arthritis (RA) could be localised by considering known genetic susceptibility loci for relevant traits and to evaluate the usefulness of these genetic loci for stratifying drug response. METHODS: We studied the relation of TNFi response, quantified by change in swollen joint counts (ΔSJC) and erythrocyte sedimentation rate (ΔESR), with locus-specific scores constructed from genome-wide association study summary statistics in 2938 genotyped individuals: 37 scores for RA; scores for 19 immune cell traits; scores for expression or methylation of 93 genes with previously reported associations between transcript level and drug response. Multivariate associations were evaluated in penalised regression models by cross-validation. RESULTS: We detected a statistically significant association between ΔSJC and the RA score at the CD40 locus (p=0.0004) and an inverse association between ΔSJC and the score for expression of CD39 on CD4 T cells (p=0.00005). A previously reported association between CD39 expression on regulatory T cells and response to methotrexate was in the opposite direction. In stratified analysis by concomitant methotrexate treatment, the inverse association was stronger in the combination therapy group and dissipated in the TNFi monotherapy group. Overall, ability to predict TNFi response from genotypic scores was limited, with models explaining less than 1% of phenotypic variance. CONCLUSIONS: The association with the CD39 trait is difficult to interpret because patients with RA are often prescribed TNFi after failing to respond to methotrexate. The CD39 and CD40 pathways could be relevant for targeting drug therapy.
eMERGE Consortium. Harmonizing Clinical Sequencing and Interpretation for the eMERGE III Network. Am J Hum Genet. 2019;105(3):588–605.
The advancement of precision medicine requires new methods to coordinate and deliver genetic data from heterogeneous sources to physicians and patients. The eMERGE III Network enrolled >25,000 participants from biobank and prospective cohorts of predominantly healthy individuals for clinical genetic testing to determine clinically actionable findings. The network developed protocols linking together the 11 participant collection sites and 2 clinical genetic testing laboratories. DNA capture panels targeting 109 genes were used for testing of DNA, and sample collection, data generation, interpretation, reporting, delivery, and storage were each harmonized. A compliant and secure network enabled ongoing review and reconciliation of clinical interpretations, while maintaining communication and data sharing between clinicians and investigators. A total of 202 individuals had positive diagnostic findings relevant to the indication for testing and 1,294 had additional/secondary findings of medical significance deemed to be returnable, establishing data return rates for other testing endeavors. This study accomplished integration of structured genomic results into multiple electronic health record (EHR) systems, setting the stage for clinical decision support to enable genomic medicine. Further, the established processes enable different sequencing sites to harmonize technical and interpretive aspects of sequencing tests, a critical achievement toward global standardization of genomic testing. The eMERGE protocols and tools are available for widespread dissemination.
Terao C, Brynedal B, Chen Z, Jiang X, Westerlind H, Hansson M, Jakobsson PJ, Lundberg K, Skriner K, Serre G, Rönnelid J, Mathsson-Alm L, Brink M, Dahlqvist SR, Padyukov L, Gregersen P, Barton A, Alfredsson L, Klareskog L, Raychaudhuri S. Distinct HLA Associations with Rheumatoid Arthritis Subsets Defined by Serological Subphenotype. Am J Hum Genet. 2019;105(3):616–624.
Rheumatoid arthritis (RA) is the most common immune-mediated arthritis. Anti-citrullinated peptide antibodies (ACPA) are highly specific to RA and assayed with the commercial CCP2 assay. Genetic drivers of RA within the MHC are different for CCP2-positive and -negative subsets of RA, particularly at HLA-DRB1. However, aspartic acid at amino acid position 9 in HLA-B (Bpos-9) increases risk to both RA subsets. Here we explore how individual serologies associated with RA drive associations within the MHC. To define MHC differences for specific ACPA serologies, we quantified a total of 19 separate ACPAs in RA-affected case subjects from four cohorts (n = 6,805). We found a cluster of tightly co-occurring antibodies (canonical serologies, containing CCP2), along with several independently expressed antibodies (non-canonical serologies). After imputing HLA variants into 6,805 case subjects and 13,467 control subjects, we tested associations between the HLA region and RA subgroups based on the presence of canonical and/or non-canonical serologies. We examined CCP2(+) and CCP2(-) RA-affected case subjects separately. In CCP2(-) RA, we observed that the association between CCP2(-) RA and Bpos-9 was derived from individuals who were positive for non-canonical serologies (omnibus_p = 9.2 × 10). Similarly, we observed in CCP2(+) RA that associations between subsets of CCP2(+) RA and Bpos-9 were negatively correlated with the number of positive canonical serologies (p = 0.0096). These findings suggest unique genetic characteristics underlying fine-specific ACPAs, suggesting that RA may be further subdivided beyond simply seropositive and seronegative.
Nathan A, Baglaenko Y, Fonseka C, Beynor J, Raychaudhuri S. Multimodal single-cell approaches shed light on T cell heterogeneity. Curr Opin Immunol. 2019;61:17–25.
Single-cell methods have revolutionized the study of T cell biology by enabling the identification and characterization of individual cells. This has led to a deeper understanding of T cell heterogeneity by generating functionally relevant measurements (such as gene expression, surface markers, chromatin accessibility, and T cell receptor sequences) in individual cells. While these methods are independently valuable, they can be augmented when applied jointly, either on separate cells from the same sample or on the same cells. Multimodal approaches are already being deployed to characterize T cells in diverse disease contexts and demonstrate the value of having multiple insights into a cell's function. However, these data sets pose new statistical challenges for integration and joint analysis.
Pouget J, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Han B, Wu Y, Mignot E, Ollila H, Barker J, Spain S, Dand N, Trembath R, Martin J, Mayes M, Bossini-Castillo L, López-Isac E, Jin Y, Santorico S, Spritz R, Hakonarson H, Polychronakos C, Raychaudhuri S, Knight J. Cross-disorder analysis of schizophrenia and 19 immune-mediated diseases identifies shared genetic risk. Hum Mol Genet. 2019;28(20):3498–3513.
Many immune diseases occur at different rates among people with schizophrenia compared to the general population. Here, we evaluated whether this phenomenon might be explained by shared genetic risk factors. We used data from large genome-wide association studies to compare the genetic architecture of schizophrenia to 19 immune diseases. First, we evaluated the association with schizophrenia of 581 variants previously reported to be associated with immune diseases at genome-wide significance. We identified five variants with potentially pleiotropic effects. While colocalization analyses were inconclusive, functional characterization of these variants provided the strongest evidence for a model in which genetic variation at rs1734907 modulates risk of schizophrenia and Crohn's disease via altered methylation and expression of EPHB4, a gene whose protein product guides the migration of neuronal axons in the brain and the migration of lymphocytes towards infected cells in the immune system. Next, we investigated genome-wide sharing of common variants between schizophrenia and immune diseases using cross-trait LD score regression. Of the 11 immune diseases with available genome-wide summary statistics, we observed genetic correlation between six immune diseases and schizophrenia: inflammatory bowel disease (rg = 0.12 ± 0.03, P = 2.49 × 10^-4), Crohn's disease (rg = 0.097 ± 0.06, P = 3.27 × 10^-3), ulcerative colitis (rg = 0.11 ± 0.04, P = 4.05 × 10^-3), primary biliary cirrhosis (rg = 0.13 ± 0.05, P = 3.98 × 10^-3), psoriasis (rg = 0.18 ± 0.07, P = 7.78 × 10^-3) and systemic lupus erythematosus (rg = 0.13 ± 0.05, P = 3.76 × 10^-3). With the exception of ulcerative colitis, the degree and direction of these genetic correlations were consistent with the expected phenotypic correlation based on epidemiological data. Our findings suggest shared genetic risk factors contribute to the epidemiological association of certain immune diseases and schizophrenia.