Publications

2022

Kang J, Nathan A, Weinand K, Zhang F, Millard N, Rumker L, Moody DB, Korsunsky I, Raychaudhuri S. Efficient and precise single-cell reference atlas mapping with Symphony. Nat Commun. 2022;12(5890).
Recent advances in single-cell technologies and integration algorithms make it possible to construct comprehensive reference atlases encompassing many donors, studies, disease states, and sequencing platforms. Much like mapping sequencing reads to a reference genome, it is essential to be able to map query cells onto complex, multimillion-cell reference atlases to rapidly identify relevant cell states and phenotypes. We present Symphony (https://github.com/immunogenomics/symphony), an algorithm for building large-scale, integrated reference atlases in a convenient, portable format that enables efficient query mapping within seconds. Symphony localizes query cells within a stable low-dimensional reference embedding, facilitating reproducible downstream transfer of reference-defined annotations to the query. We demonstrate the power of Symphony in multiple real-world datasets, including (1) mapping a multi-donor, multi-species query to predict pancreatic cell types, (2) localizing query cells along a developmental trajectory of fetal liver hematopoiesis, and (3) inferring surface protein expression with a multimodal CITE-seq atlas of memory T cells.
Lopez B, Kohale I, Du Z, Korsunsky I, Abdelmoula W, Dai Y, Stopka S, Gaglia G, Randall E, Regan M, Basu S, Clark A, Marin BM, Mladek A, Burgenske D, Agar J, Supko J, Grossman S, Nabors L, Raychaudhuri S, Ligon K, Wen P, Alexander B, Lee E, Santagata S, Sarkaria J, White F, Agar N. Multimodal platform for assessing drug distribution and response in clinical trials. Neuro Oncol. 2022;24(1):64–77.

BACKGROUND: Response to targeted therapy varies between patients for largely unknown reasons. Here, we developed and applied an integrative platform using mass spectrometry imaging (MSI), phosphoproteomics, and multiplexed tissue imaging for mapping drug distribution, target engagement, and adaptive response to gain insights into heterogeneous response to therapy.

METHODS: Patient-derived xenograft (PDX) lines of glioblastoma were treated with adavosertib, a Wee1 inhibitor, and tissue drug distribution was measured with MALDI-MSI. Phosphoproteomics was measured in the same tumors to identify biomarkers of drug target engagement and cellular adaptive response. Multiplexed tissue imaging was performed on sister sections to evaluate spatial co-localization of drug and cellular response. The integrated platform was then applied to clinical specimens from glioblastoma patients enrolled in a phase 1 clinical trial.

RESULTS: PDX tumors exposed to different doses of adavosertib revealed intra- and inter-tumoral heterogeneity of drug distribution, and integrating this heterogeneous drug distribution with phosphoproteomics and multiplexed tissue imaging revealed new markers of molecular response to adavosertib. Analysis of paired clinical specimens from patients enrolled in the phase 1 clinical trial informed the translational potential of the identified biomarkers for studying patients' responses to adavosertib.

CONCLUSIONS: The multimodal platform identified a signature of drug efficacy and patient-specific adaptive responses applicable to preclinical and clinical drug development. The information generated by the approach may inform mechanisms of success and failure in future early phase clinical trials, providing information for optimizing clinical trial design and guiding future application into clinical practice.

Lagattuta K, Kang J, Nathan A, Pauken K, Jonsson AH, Rao D, Sharpe A, Ishigaki K, Raychaudhuri S. Repertoire analyses reveal T cell antigen receptor sequence features that influence T cell fate. Nat Immunol. 2022;23(3):446–457.
T cells acquire a regulatory phenotype when their T cell antigen receptors (TCRs) experience an intermediate- to high-affinity interaction with a self-peptide presented via the major histocompatibility complex (MHC). Using TCRβ sequences from flow-sorted human cells, we identified TCR features that promote regulatory T cell (Treg) fate. From these results, we developed a scoring system to quantify TCR-intrinsic regulatory potential (TiRP). When applied to the tumor microenvironment, TiRP scoring helped to explain why only some T cell clones maintained the conventional T cell (Tconv) phenotype through expansion. To elucidate drivers of these predictive TCR features, we then examined the two elements of the Treg TCR ligand separately: the self-peptide and the human MHC class II molecule. These analyses revealed that hydrophobicity in the third complementarity-determining region (CDR3β) of the TCR promotes reactivity to self-peptides, while TCR variable gene (TRBV gene) usage shapes the TCR's general propensity for human MHC class II-restricted activation.
Pauken K, Lagattuta K, Lu B, Lucca L, Daud A, Hafler D, Kluger H, Raychaudhuri S, Sharpe A. TCR-sequencing in cancer and autoimmunity: barcodes and beyond. Trends Immunol. 2022;43(3):180–194.
The T cell receptor (TCR) endows T cells with antigen specificity and is central to nearly all aspects of T cell function. Each naïve T cell has a unique TCR sequence that is stably maintained during cell division. In this way, the TCR serves as a molecular barcode that tracks processes such as migration, differentiation, and proliferation of T cells. Recent technological advances have enabled sequencing of the TCR from single cells alongside deep molecular phenotypes on an unprecedented scale. In this review, we discuss strengths and limitations of TCR sequences as molecular barcodes and their application to study immune responses following Programmed Death-1 (PD-1) blockade in cancer. Additionally, we consider applications of TCR data beyond use as a barcode.
Reshef Y, Rumker L, Kang J, Nathan A, Korsunsky I, Asgari S, Murray M, Moody DB, Raychaudhuri S. Co-varying neighborhood analysis identifies cell populations associated with phenotypes of interest from single-cell transcriptomics. Nat Biotechnol. 2022;40(3):355–363.
As single-cell datasets grow in sample size, there is a critical need to characterize cell states that vary across samples and associate with sample attributes, such as clinical phenotypes. Current statistical approaches typically map cells to clusters and then assess differences in cluster abundance. Here we present co-varying neighborhood analysis (CNA), an unbiased method to identify associated cell populations with greater flexibility than cluster-based approaches. CNA characterizes dominant axes of variation across samples by identifying groups of small regions in transcriptional space (termed neighborhoods) that co-vary in abundance across samples, suggesting shared function or regulation. CNA performs statistical testing for associations between any sample-level attribute and the abundances of these co-varying neighborhood groups. Simulations show that CNA enables more sensitive and accurate identification of disease-associated cell states than a cluster-based approach. When applied to published datasets, CNA captures a Notch activation signature in rheumatoid arthritis, identifies monocyte populations expanded in sepsis and identifies a novel T cell population associated with progression to active tuberculosis.
Ishigaki K, Lagattuta K, Luo Y, James E, Buckner J, Raychaudhuri S. HLA autoimmune risk alleles restrict the hypervariable region of T cell receptors. Nat Genet. 2022;54(4):393–402.
Polymorphisms in the human leukocyte antigen (HLA) genes strongly influence autoimmune disease risk. HLA risk alleles may influence thymic selection to increase the frequency of T cell receptors (TCRs) reactive to autoantigens (central hypothesis). However, research in human autoimmunity has provided little evidence supporting the central hypothesis. Here we investigated the influence of HLA alleles on TCR composition at the highly diverse complementarity determining region 3 (CDR3), which confers antigen recognition. We observed unexpectedly strong HLA-CDR3 associations. The strongest association was found at HLA-DRB1 amino acid position 13, the position that mediates genetic risk for multiple autoimmune diseases. We identified multiple CDR3 amino acid features enriched by HLA risk alleles. Moreover, the CDR3 features promoted by the HLA risk alleles are more enriched in candidate pathogenic TCRs than control TCRs (for example, citrullinated epitope-specific TCRs in patients with rheumatoid arthritis). Together, these results provide genetic evidence supporting the central hypothesis.
Guan S, Mehta B, Slater D, Thompson J, DiCarlo E, Pannellini T, Pearce-Fisher D, Zhang F, Raychaudhuri S, Hale C, Jiang C, Goodman S, Orange D. Rheumatoid Arthritis Synovial Inflammation Quantification Using Computer Vision. ACR Open Rheumatol. 2022;4(4):322–331.
OBJECTIVE: We quantified inflammatory burden in rheumatoid arthritis (RA) synovial tissue by using computer vision to automate the process of counting individual nuclei in hematoxylin and eosin images. METHODS: We adapted and applied computer vision algorithms to quantify nuclei density (count of nuclei per unit area of tissue) on synovial tissue from arthroplasty samples. A pathologist validated algorithm results by labeling nuclei in synovial images that were mislabeled or missed by the algorithm. Nuclei density was compared with other measures of RA inflammation such as semiquantitative histology scores, gene-expression data, and clinical measures of disease activity. RESULTS: The algorithm detected a median of 112,657 (range 8,160-821,717) nuclei per synovial sample. Based on pathologist-validated results, the sensitivity and specificity of the algorithm were 97% and 100%, respectively. The mean nuclei density calculated by the algorithm was significantly higher (P < 0.05) in synovium with increased histology scores for lymphocytic inflammation, plasma cells, and lining hyperplasia. Analysis of RNA sequencing identified 915 significantly differentially expressed genes in correlation with nuclei density (false discovery rate < 0.05). Mean nuclei density was significantly higher (P < 0.05) in patients with elevated levels of C-reactive protein, erythrocyte sedimentation rate, rheumatoid factor, and cyclized citrullinated protein antibody. CONCLUSION: Nuclei density is a robust measurement of inflammatory burden in RA and correlates with multiple orthogonal measurements of inflammation.
Maurits M, Korsunsky I, Raychaudhuri S, Murphy S, Smoller J, Weiss S, Huizinga T, Reinders M, Karlson E, Akker E, Knevel R. A framework for employing longitudinally collected multicenter electronic health records to stratify heterogeneous patient populations on disease history. J Am Med Inform Assoc. 2022;29(5):761–769.
OBJECTIVE: To facilitate patient disease subset and risk factor identification by constructing a pipeline which is generalizable, provides easily interpretable results, and allows replication by overcoming electronic health records (EHRs) batch effects. MATERIAL AND METHODS: We used 1872 billing codes in EHRs of 102 880 patients from 12 healthcare systems. Using tools borrowed from single-cell omics, we mitigated center-specific batch effects and performed clustering to identify patients with highly similar medical history patterns across the various centers. Our visualization method (PheSpec) depicts the phenotypic profile of clusters, applies a novel filtering of noninformative codes (Ranked Scope Pervasion), and indicates the most distinguishing features. RESULTS: We observed 114 clinically meaningful profiles, for example, linking prostate hyperplasia with cancer and diabetes with cardiovascular problems and grouping pediatric developmental disorders. Our framework identified disease subsets, exemplified by 6 "other headache" clusters, where phenotypic profiles suggested different underlying mechanisms: migraine, convulsion, injury, eye problems, joint pain, and pituitary gland disorders. Phenotypic patterns replicated well, with high correlations of ≥0.75 to an average of 6 (2-8) of the 12 different cohorts, demonstrating the consistency with which our method discovers disease history profiles. DISCUSSION: Costly clinical research ventures should be based on solid hypotheses. We repurpose methods from single-cell omics to build these hypotheses from observational EHR data, distilling useful information from complex data. CONCLUSION: We establish a generalizable pipeline for the identification and replication of clinically meaningful (sub)phenotypes from widely available high-dimensional billing codes. This approach overcomes datatype problems and produces comprehensive visualizations of validation-ready phenotypes.
Fava A, Rao D, Mohan C, Zhang T, Rosenberg A, Fenaroli P, Belmont M, Izmirly P, Clancy R, Trujillo JM, Fine D, Arazi A, Berthier C, Davidson A, James J, Diamond B, Hacohen N, Wofsy D, Raychaudhuri S, Apruzzese W, Accelerating Medicines Partnership in Rheumatoid Arthritis and Systemic Lupus Erythematosus Network, Buyon J, Petri M. Urine Proteomics and Renal Single-Cell Transcriptomics Implicate Interleukin-16 in Lupus Nephritis. Arthritis Rheumatol. 2022;74(5):829–839.
OBJECTIVE: Current lupus nephritis (LN) treatments are effective in only 30% of patients, emphasizing the need for novel therapeutic strategies. We undertook this study to develop mechanistic hypotheses and explore novel biomarkers by analyzing the longitudinal urinary proteomic profiles in LN patients undergoing treatment. METHODS: We quantified 1,000 urinary proteins in 30 patients with LN at the time of the diagnostic renal biopsy and after 3, 6, and 12 months. The proteins and molecular pathways detected in the urine proteome were then analyzed with respect to baseline clinical features and longitudinal trajectories. The intrarenal expression of candidate biomarkers was evaluated using single-cell transcriptomics of renal biopsy sections from LN patients. RESULTS: Our analysis revealed multiple biologic pathways, including chemotaxis, neutrophil activation, platelet degranulation, and extracellular matrix organization, which could be noninvasively quantified and monitored in the urine. We identified 237 urinary biomarkers associated with LN, as compared to controls without systemic lupus erythematosus. Interleukin-16 (IL-16), CD163, and transforming growth factor β mirrored intrarenal nephritis activity. Response to treatment was paralleled by a reduction in urinary IL-16, a CD4 ligand with proinflammatory and chemotactic properties. Single-cell RNA sequencing independently demonstrated that IL16 is the second most expressed cytokine by most infiltrating immune cells in LN kidneys. IL-16-producing cells were found at key sites of kidney injury. CONCLUSION: Urine proteomics may profoundly change the diagnosis and management of LN by noninvasively monitoring active intrarenal biologic pathways. These findings implicate IL-16 in LN pathogenesis, designating it as a potentially treatable target and biomarker.
Mysore V, Tahir S, Furuhashi K, Arora J, Rosetti F, Cullere X, Yazbeck P, Sekulic M, Lemieux M, Raychaudhuri S, Horwitz B, Mayadas T. Monocytes transition to macrophages within the inflamed vasculature via monocyte CCR2 and endothelial TNFR2. J Exp Med. 2022;219(5).
Monocytes undergo phenotypic and functional changes in response to inflammatory cues, but the molecular signals that drive different monocyte states remain largely undefined. We show that monocytes acquire macrophage markers upon glomerulonephritis and may be derived from CCR2+CX3CR1+ double-positive monocytes, which are preferentially recruited, dwell within glomerular capillaries, and acquire proinflammatory characteristics in the nephritic kidney. Mechanistically, the transition to immature macrophages begins within the vasculature and relies on CCR2 in circulating cells and TNFR2 in parenchymal cells, findings that are recapitulated in vitro with monocytes cocultured with TNF-TNFR2-activated endothelial cells generating CCR2 ligands. Single-cell RNA sequencing of cocultures defines a CCR2-dependent monocyte differentiation path associated with the acquisition of immune effector functions and generation of CCR2 ligands. Immature macrophages are detected in the urine of lupus nephritis patients, and their frequency correlates with clinical disease. In conclusion, CCR2-dependent functional specialization of monocytes into macrophages begins within the TNF-TNFR2-activated vasculature and may establish a CCR2-based autocrine, feed-forward loop that amplifies renal inflammation.