Publications

2023

Jiang MZ, Aguet F, Ardlie K, Chen J, Cornell E, Cruz D, et al. Canonical correlation analysis for multi-omics: Application to cross-cohort analysis. PLoS genetics. 2023;19(5):e1010517.

Integrative approaches that simultaneously model multi-omics data have gained increasing popularity because they provide holistic system biology views of multiple or all components in a biological system of interest. Canonical correlation analysis (CCA) is a correlation-based integrative method designed to extract latent features shared between multiple assays by finding the linear combinations of features, referred to as canonical variables (CVs), within each assay that achieve maximal across-assay correlation. Although widely acknowledged as a powerful approach for multi-omics data, CCA has not been systematically applied to multi-omics data in large cohort studies, which have only recently become available. Here, we adapted sparse multiple CCA (SMCCA), a widely-used derivative of CCA, to proteomics and methylomics data from the Multi-Ethnic Study of Atherosclerosis (MESA) and Jackson Heart Study (JHS). To tackle challenges encountered when applying SMCCA to MESA and JHS, our adaptations include the incorporation of the Gram-Schmidt (GS) algorithm with SMCCA to improve orthogonality among CVs, and the development of Sparse Supervised Multiple CCA (SSMCCA) to allow supervised integration analysis for more than two assays. Effective application of SMCCA to the two real datasets reveals important findings. Applying our SMCCA-GS to MESA and JHS, we identified strong associations between blood cell counts and protein abundance, suggesting that adjustment of blood cell composition should be considered in protein-based association studies. Importantly, CVs obtained from two independent cohorts also demonstrate transferability across the cohorts. For example, proteomic CVs learned from JHS, when transferred to MESA, explain similar amounts of blood cell count phenotypic variance in MESA, explaining 39.0%-50.0% of the variation in JHS and 38.9%-49.1% in MESA. Similar transferability was observed for other omics-CV-trait pairs. This suggests that biologically meaningful and cohort-agnostic variation is captured by CVs. We anticipate that applying our SMCCA-GS and SSMCCA on various cohorts would help identify cohort-agnostic biologically meaningful relationships between multi-omics data and phenotypic traits.
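
As a rough illustration of the orthogonalization step named in the abstract above, the sketch below applies a textbook (modified) Gram-Schmidt pass to a small set of vectors. It is not the authors' SMCCA-GS implementation, which the abstract does not describe in detail; the function name `gram_schmidt` and the toy vectors are assumptions made purely for illustration.

```python
import math


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis spanning the input vectors.

    Hypothetical helper: each vector stands in for one canonical
    variable (CV) evaluated on the same samples. Uses the modified
    Gram-Schmidt update (projecting the running residual).
    """
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the component of w along every direction found so far.
        for q in basis:
            proj = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - proj * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        # Skip vectors that are (numerically) linearly dependent.
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


# Toy example: two correlated "CVs" measured on three samples.
cvs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
orthogonal_cvs = gram_schmidt(cvs)
```

After the pass, the returned vectors have unit length and zero pairwise inner product, which is the sense in which Gram-Schmidt can improve orthogonality among a set of derived variables.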

Reynolds KM, Horimoto ARR V, Lin BM, Zhang Y, Kurniansyah N, Yu B, et al. Ancestry-driven metabolite variation provides insights into disease states in admixed populations. Genome medicine. 2023;15(1):52.

BACKGROUND: Metabolic pathways are related to physiological functions and disease states and are influenced by genetic variation and environmental factors. Hispanic/Latino individuals have ancestry-derived genomic regions (local ancestry) from their recent admixture that have been less characterized for associations with metabolite abundance and disease risk.

METHODS: We performed admixture mapping of 640 circulating metabolites in 3887 Hispanic/Latino individuals from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). Metabolites were quantified in fasting serum through non-targeted mass spectrometry (MS) analysis using ultra-performance liquid chromatography-MS/MS. Replication was performed in 1856 nonoverlapping HCHS/SOL participants with metabolomic data.

RESULTS: By leveraging local ancestry, this study identified significant ancestry-enriched associations for 78 circulating metabolites at 484 independent regions, including 116 novel metabolite-genomic region associations that replicated in an independent sample. Among the main findings, we identified Native American enriched genomic regions at chromosomes 11 and 15, mapping to FADS1/FADS2 and LIPC, respectively, associated with reduced long-chain polyunsaturated fatty acid metabolites implicated in metabolic and inflammatory pathways. An African-derived genomic region at chromosome 2 was associated with N-acetylated amino acid metabolites. This region, mapped to ALMS1, is associated with chronic kidney disease, a disease that disproportionately burdens individuals of African descent.

CONCLUSIONS: Our findings provide important insights into differences in metabolite quantities related to ancestry in admixed populations including metabolites related to regulation of lipid polyunsaturated fatty acids and N-acetylated amino acids, which may have implications for common diseases in populations.

STUDY OBJECTIVES: Shift work is a risk factor for cardiometabolic disease, possibly through effects on sleep-wake rhythms. We hypothesized that evening (afternoon and night combined) and irregular (irregular/on-call or rotating combined) shift work during pregnancy is associated with increased odds of preeclampsia, preterm birth, and gestational diabetes mellitus (GDM), mediated by irregular sleep timing.

METHODS: The Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-be (nuMoM2b) is a prospective cohort study (n = 10,038) designed to investigate risk factors for adverse pregnancy outcomes. Medical outcomes were determined with medical record abstraction and/or questionnaires; sleep midpoint was measured in a subset of participants with ≥5-day wrist actigraphy (ActiWatch). We estimated the association of evening and irregular shift work during pregnancy with preeclampsia, preterm birth, and GDM using logistic regression, adjusted for adversity (cumulative variable for poverty, education, health insurance, and partner status), smoking, self-reported race/ethnicity, and age. Finally, we explored whether the association between shift work and GDM was mediated by variability in sleep timing.

RESULTS: Evening shift work is associated with approximately 75% increased odds of developing GDM (adjusted OR = 1.75, 95% CI: 1.12-2.66); we did not observe associations with irregular shifts, preterm birth, or preeclampsia after adjustment. Pregnant evening shift workers were found to have approximately 45 minutes greater variability in sleep timing compared to day workers (p < .005); sleep-timing variability explained 25% of the association between evening shift work and GDM in a mediation analysis.

CONCLUSIONS: Evening shift work was associated with GDM, and this relationship may be mediated by variability in sleep timing.

Zhang Y, Liu X, Wiggins KL, Kurniansyah N, Guo X, Rodrigue AL, et al. Association of Mitochondrial DNA Copy Number With Brain MRI Markers and Cognitive Function: A Meta-analysis of Community-Based Cohorts. Neurology. 2023;100(18):e1930-e1943.

BACKGROUND AND OBJECTIVES: Previous studies suggest that lower mitochondrial DNA (mtDNA) copy number (CN) is associated with neurodegenerative diseases. However, whether mtDNA CN in whole blood is related to endophenotypes of Alzheimer disease (AD) and AD-related dementia (AD/ADRD) needs further investigation. We assessed the association of mtDNA CN with cognitive function and MRI measures in community-based samples of middle-aged to older adults.

METHODS: We included dementia-free participants from 9 diverse community-based cohorts with whole-genome sequencing in the Trans-Omics for Precision Medicine (TOPMed) program. Circulating mtDNA CN was estimated as twice the ratio of the average coverage of mtDNA to nuclear DNA. Brain MRI markers included total brain, hippocampal, and white matter hyperintensity volumes. General cognitive function was derived from distinct cognitive domains. We performed cohort-specific association analyses of mtDNA CN with AD/ADRD endophenotypes assessed within ±5 years (i.e., cross-sectional analyses) or 5-20 years after blood draw (i.e., prospective analyses) adjusting for potential confounders. We further explored associations stratified by sex and age (<60 vs ≥60 years). Fixed-effects or sample size-weighted meta-analyses were performed to combine results. Finally, we performed mendelian randomization (MR) analyses to assess causality.

RESULTS: We included up to 19,152 participants (mean age 59 years, 57% women). Higher mtDNA CN was cross-sectionally associated with better general cognitive function (β = 0.04; 95% CI 0.02-0.06) independent of age, sex, batch effects, race/ethnicity, time between blood draw and cognitive evaluation, cohort-specific variables, and education. Additional adjustment for blood cell counts or cardiometabolic traits led to slightly attenuated results. We observed similar significant associations with cognition in prospective analyses, although of reduced magnitude. We found no significant associations between mtDNA CN and brain MRI measures in meta-analyses. MR analyses did not reveal a causal relation between mtDNA CN in blood and cognition.

DISCUSSION: Higher mtDNA CN in blood is associated with better current and future general cognitive function in large and diverse communities across the United States. Although MR analyses did not support a causal role, additional research is needed to assess causality. Circulating mtDNA CN could serve nevertheless as a biomarker of current and future cognitive function in the community.
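
The METHODS paragraph above defines circulating mtDNA CN as twice the ratio of the average mtDNA coverage to the average nuclear DNA coverage. The sketch below shows that arithmetic only; the function name `mtdna_copy_number` and the example coverage values are illustrative assumptions, not values or code from the study. The factor of 2 reflects the two copies of the nuclear genome in a diploid cell.

```python
def mtdna_copy_number(mean_mt_coverage: float, mean_nuclear_coverage: float) -> float:
    """Estimate mtDNA copy number per cell from sequencing coverage.

    Implements CN = 2 * (mean mtDNA coverage / mean nuclear coverage),
    as stated in the abstract. Inputs are assumed to be average read
    depths from the same whole-genome sequencing sample.
    """
    if mean_nuclear_coverage <= 0:
        raise ValueError("mean_nuclear_coverage must be positive")
    if mean_mt_coverage < 0:
        raise ValueError("mean_mt_coverage must be non-negative")
    return 2.0 * mean_mt_coverage / mean_nuclear_coverage


# Hypothetical sample: 6000x average mtDNA depth, 30x average nuclear depth.
estimate = mtdna_copy_number(6000.0, 30.0)
```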

Granot-Hershkovitz E, Xia R, Yang Y, Spitzer B, Tarraf W, Vásquez PM, et al. Interaction analysis of ancestry-enriched variants with APOE-ɛ4 on MCI in the Study of Latinos-Investigation of Neurocognitive Aging. Scientific reports. 2023;13(1):5114.

APOE-ɛ4 risk on Mild Cognitive Impairment (MCI) and Alzheimer's Disease (AD) differs between race/ethnic groups, presumably due to ancestral genomic background surrounding the APOE locus. We studied whether African and Amerindian ancestry-enriched genetic variants in the APOE region modify the effect of the APOE-ɛ4 alleles on Mild Cognitive Impairment (MCI) in Hispanics/Latinos. We defined African and Amerindian ancestry-enriched variants as those common in one Hispanic/Latino parental ancestry and rare in the other two. We identified such variants in the APOE region with a predicted moderate impact based on the SnpEff tool. We tested their interaction with APOE-ɛ4 on MCI in the Study of Latinos-Investigation of Neurocognitive Aging (SOL-INCA) population and African Americans from the Atherosclerosis Risk In Communities (ARIC) study. We identified 5 Amerindian and 14 African enriched variants with an expected moderate effect. A suggestive significant interaction (p-value = 0.01) was found for one African-enriched variant, rs8112679, located in the ZNF222 gene fourth exon. Our results suggest there are no ancestry-enriched variants with large effect sizes of interaction effects with APOE-ɛ4 on MCI in the APOE region in the Hispanic/Latino population. Further studies are needed in larger datasets to identify potential interactions with smaller effect sizes.

Mills EW, Cassidy M, Sofer T, Tadros T, Zei P, Sauer W, et al. Evaluation of obstructive sleep apnea among consecutive patients with all patterns of atrial fibrillation using WatchPAT home sleep testing. American heart journal. 2023;261:95-103.

BACKGROUND: Atrial fibrillation (AF) is the most common arrhythmia encountered in clinical practice and is associated with significant morbidity, mortality, and financial burden. Obstructive sleep apnea (OSA) is more common in individuals with AF and may impair the efficacy of rhythm control strategies including catheter ablation. However, the prevalence of undiagnosed OSA in all-comers with AF is unknown.

DESIGN: This pragmatic, phase IV prospective cohort study will test 250-300 consecutive ambulatory AF patients with all patterns of atrial fibrillation (paroxysmal, persistent, and long-term persistent) and no prior sleep testing for OSA using the WatchPAT system, a disposable home sleep test (HST). The primary outcome of the study is the prevalence of undiagnosed OSA in all-comers with atrial fibrillation.

RESULTS: Preliminary results from the initial pilot enrollment of approximately 15% (N = 38) of the planned sample size demonstrate a 79.0% prevalence of at least mild OSA (AHI≥5) in consecutively enrolled patients with all patterns of AF.

CONCLUSIONS: We report the design, methodology, and preliminary results of our study to define the prevalence of OSA in AF patients. This study will help inform approaches to OSA screening in patients with AF for which there is currently little practical guidance.

CLINICAL TRIAL REGISTRATION: NCT05155813.

Suzuki K, Hatzikotoulas K, Southam L, Taylor HJ, Yin X, Lorenz KM, et al. Multi-ancestry genome-wide study in >2.5 million individuals reveals heterogeneity in mechanistic pathways of type 2 diabetes and complications. medRxiv : the preprint server for health sciences. 2023;

Type 2 diabetes (T2D) is a heterogeneous disease that develops through diverse pathophysiological processes. To characterise the genetic contribution to these processes across ancestry groups, we aggregate genome-wide association study (GWAS) data from 2,535,601 individuals (39.7% non-European ancestry), including 428,452 T2D cases. We identify 1,289 independent association signals at genome-wide significance (P < 5×10⁻⁸) that map to 611 loci, of which 145 loci are previously unreported. We define eight non-overlapping clusters of T2D signals characterised by distinct profiles of cardiometabolic trait associations. These clusters are differentially enriched for cell-type specific regions of open chromatin, including pancreatic islets, adipocytes, endothelial, and enteroendocrine cells. We build cluster-specific partitioned genetic risk scores (GRS) in an additional 137,559 individuals of diverse ancestry, including 10,159 T2D cases, and test their association with T2D-related vascular outcomes. Cluster-specific partitioned GRS are more strongly associated with coronary artery disease and end-stage diabetic nephropathy than an overall T2D GRS across ancestry groups, highlighting the importance of obesity-related processes in the development of vascular outcomes. Our findings demonstrate the value of integrating multi-ancestry GWAS with single-cell epigenomics to disentangle the aetiological heterogeneity driving the development and progression of T2D, which may offer a route to optimise global access to genetically-informed diabetes care.

Pirzada A, Cai J, Heiss G, Sotres-Alvarez D, Gallo LC, Youngblood ME, et al. Evolving Science on Cardiovascular Disease Among Hispanic/Latino Adults: JACC International. Journal of the American College of Cardiology. 2023;81(15):1505-20.

The landmark, multicenter HCHS/SOL (Hispanic Community Health Study/Study of Latinos) is the largest, most comprehensive, longitudinal community-based cohort study to date of diverse Hispanic/Latino persons in the United States. The HCHS/SOL aimed to address the dearth of comprehensive data on risk factors for cardiovascular disease (CVD) and other chronic diseases in this population and has expanded considerably in scope since its inception. This paper describes the aims/objectives and data collection of the HCHS/SOL and its ancillary studies to date and highlights the critical and sizable contributions made by the study to understanding the prevalence of and changes in CVD risk/protective factors and the burden of CVD and related chronic conditions among adults of diverse Hispanic/Latino backgrounds. The continued follow-up of this cohort will allow in-depth investigations on cardiovascular and pulmonary outcomes in this population, and data from the ongoing ancillary studies will facilitate generation of new hypotheses and study questions.

Weinstock JS, Gopakumar J, Burugula BB, Uddin MM, Jahn N, Belk JA, et al. Aberrant activation of TCL1A promotes stem cell expansion in clonal haematopoiesis. Nature. 2023;616(7958):755-63.

Mutations in a diverse set of driver genes increase the fitness of haematopoietic stem cells (HSCs), leading to clonal haematopoiesis. These lesions are precursors for blood cancers, but the basis of their fitness advantage remains largely unknown, partly owing to a paucity of large cohorts in which the clonal expansion rate has been assessed by longitudinal sampling. Here, to circumvent this limitation, we developed a method to infer the expansion rate from data from a single time point. We applied this method to 5,071 people with clonal haematopoiesis. A genome-wide association study revealed that a common inherited polymorphism in the TCL1A promoter was associated with a slower expansion rate in clonal haematopoiesis overall, but the effect varied by driver gene. Those carrying this protective allele exhibited markedly reduced growth rates or prevalence of clones with driver mutations in TET2, ASXL1, SF3B1 and SRSF2, but this effect was not seen in clones with driver mutations in DNMT3A. TCL1A was not expressed in normal or DNMT3A-mutated HSCs, but the introduction of mutations in TET2 or ASXL1 led to the expression of TCL1A protein and the expansion of HSCs in vitro. The protective allele restricted TCL1A expression and expansion of mutant HSCs, as did experimental knockdown of TCL1A expression. Forced expression of TCL1A promoted the expansion of human HSCs in vitro and mouse HSCs in vivo. Our results indicate that the fitness advantage of several commonly mutated driver genes in clonal haematopoiesis may be mediated by TCL1A activation.

Wong WJ, Emdin C, Bick AG, Zekavat SM, Niroula A, Pirruccello JP, et al. Clonal haematopoiesis and risk of chronic liver disease. Nature. 2023;616(7958):747-54.

Chronic liver disease is a major public health burden worldwide. Although different aetiologies and mechanisms of liver injury exist, progression of chronic liver disease follows a common pathway of liver inflammation, injury and fibrosis. Here we examined the association between clonal haematopoiesis of indeterminate potential (CHIP) and chronic liver disease in 214,563 individuals from 4 independent cohorts with whole-exome sequencing data (Framingham Heart Study, Atherosclerosis Risk in Communities Study, UK Biobank and Mass General Brigham Biobank). CHIP was associated with an increased risk of prevalent and incident chronic liver disease (odds ratio = 2.01, 95% confidence interval (95% CI) [1.46, 2.79]; P < 0.001). Individuals with CHIP were more likely to demonstrate liver inflammation and fibrosis detectable by magnetic resonance imaging compared to those without CHIP (odds ratio = 1.74, 95% CI [1.16, 2.60]; P = 0.007). To assess potential causality, Mendelian randomization analyses showed that genetic predisposition to CHIP was associated with a greater risk of chronic liver disease (odds ratio = 2.37, 95% CI [1.57, 3.6]; P < 0.001). In a dietary model of non-alcoholic steatohepatitis, mice transplanted with Tet2-deficient haematopoietic cells demonstrated more severe liver inflammation and fibrosis. These effects were mediated by the NLRP3 inflammasome and increased levels of expression of downstream inflammatory cytokines in Tet2-deficient macrophages. In summary, clonal haematopoiesis is associated with an elevated risk of liver inflammation and chronic liver disease progression through an aberrant inflammatory response.