Publications

2024

Suzuki K, Hatzikotoulas K, Southam L, Taylor HJ, Yin X, Lorenz KM, et al. Genetic drivers of heterogeneity in type 2 diabetes pathophysiology. Nature. 2024;627(8003):347-5.

Type 2 diabetes (T2D) is a heterogeneous disease that develops through diverse pathophysiological processes [1,2] and molecular mechanisms that are often specific to cell type [3,4]. Here, to characterize the genetic contribution to these processes across ancestry groups, we aggregate genome-wide association study data from 2,535,601 individuals (39.7% not of European ancestry), including 428,452 cases of T2D. We identify 1,289 independent association signals at genome-wide significance (P < 5 × 10⁻⁸) that map to 611 loci, of which 145 loci are, to our knowledge, previously unreported. We define eight non-overlapping clusters of T2D signals that are characterized by distinct profiles of cardiometabolic trait associations. These clusters are differentially enriched for cell-type-specific regions of open chromatin, including pancreatic islets, adipocytes, endothelial cells and enteroendocrine cells. We build cluster-specific partitioned polygenic scores [5] in a further 279,552 individuals of diverse ancestry, including 30,288 cases of T2D, and test their association with T2D-related vascular outcomes. Cluster-specific partitioned polygenic scores are associated with coronary artery disease, peripheral artery disease and end-stage diabetic nephropathy across ancestry groups, highlighting the importance of obesity-related processes in the development of vascular outcomes. Our findings show the value of integrating multi-ancestry genome-wide association study data with single-cell epigenomics to disentangle the aetiological heterogeneity that drives the development and progression of T2D. This might offer a route to optimize global access to genetically informed diabetes care.

Nagarajan P, Winkler TW, Bentley AR, Miller CL, Kraja AT, Schwander K, et al. A Large-Scale Genome-Wide Study of Gene-Sleep Duration Interactions for Blood Pressure in 811,405 Individuals from Diverse Populations. medRxiv : the preprint server for health sciences. 2024;.

Although both short and long sleep duration are associated with elevated hypertension risk, our understanding of their interplay with biological pathways governing blood pressure remains limited. To address this, we carried out genome-wide cross-population gene-by-short-sleep and long-sleep duration interaction analyses for three blood pressure traits (systolic, diastolic, and pulse pressure) in 811,405 individuals from diverse population groups. We discover 22 novel gene-sleep duration interaction loci for blood pressure, mapped to genes involved in neurological, thyroidal, bone metabolism, and hematopoietic pathways. Non-overlap between short sleep (12) and long sleep (10) interactions underscores the plausibility of distinct influences of both sleep duration extremes in cardiovascular health. With several of our loci reflecting specificity towards population background or sex, our discovery sheds light on the importance of embracing granularity when addressing heterogeneity entangled in gene-environment interactions, and in therapeutic design approaches for blood pressure management.

2023

Granot-Hershkovitz E, He S, Bressler J, Yu B, Tarraf W, Rebholz CM, et al. Plasma metabolites associated with cognitive function across race/ethnicities affirming the importance of healthy nutrition. Alzheimer’s & dementia : the journal of the Alzheimer’s Association. 2023;19(4):1331-42.

INTRODUCTION: We studied the replication and generalization of previously identified metabolites potentially associated with global cognitive function in multiple race/ethnicities and assessed the contribution of diet to these associations.

METHODS: We tested metabolite-cognitive function associations in U.S.A. Hispanic/Latino adults (n = 2222) from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL) and in European (n = 1365) and African (n = 478) Americans from the Atherosclerosis Risk In Communities (ARIC) Study. We applied Mendelian Randomization (MR) analyses to assess causal associations between the metabolites and cognitive function and between Mediterranean diet and cognitive function.

RESULTS: Six metabolites were consistently associated with lower global cognitive function across all studies. Of these, four were sugar-related (e.g., ribitol). MR analyses provided weak evidence for a potential causal effect of ribitol on cognitive function and bi-directional effects of cognitive performance on diet.

DISCUSSION: Several diet-related metabolites were associated with global cognitive function across studies with different race/ethnicities.

HIGHLIGHTS: Metabolites associated with cognitive function in Puerto Rican adults were recently identified. We demonstrate the generalizability of these associations across diverse race/ethnicities. Most identified metabolites are related to sugars. Mendelian Randomization (MR) provides weak evidence for a causal effect of ribitol on cognitive function. Beta-cryptoxanthin and other metabolites highlight the importance of a healthy diet.

Zhou LY, Sofer T, Horimoto ARR V, Talavera GA, Lash JP, Cai J, et al. Polygenic risk scores and kidney traits in the Hispanic/Latino population: The Hispanic Community Health Study/Study of Latinos. HGG advances. 2023;4(2):100177.

Estimated glomerular filtration rate (eGFR) is used to evaluate kidney function and determine the presence of chronic kidney disease (CKD), a highly prevalent disease in the US [1,2,3] that varies among subgroups of Hispanic/Latino individuals [4,5]. The polygenic risk score (PRS) is a popular method that uses large genome-wide association studies (GWASs) to provide a strong estimate of disease risk [7]. However, due to the limited availability of summary statistics from GWAS meta-analyses based on Hispanic/Latino populations, PRSs can only be computed using GWASs of different ancestries. The performance of eGFR PRSs derived from other GWAS reference populations has not been examined in the Hispanic/Latino population. We compared PRS constructions for eGFR prediction in Hispanic/Latino individuals using GWAS-significant variants, clumping and thresholding (C&T) [8], and PRS-CS [22], as well as a combination of PRSs calculated with different reference GWAS meta-analyses from European and multi-ethnic studies in Hispanic/Latino individuals from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). All eGFR PRSs were highly associated with eGFR (p < 1E-20). Additionally, eGFR PRSs were significantly associated with lower risk of prevalent CKD at visit 1 or 2 and incident CKD at visit 2, with the combined PRSs having the best performance. These PRS findings were replicated in an additional dataset of Hispanic/Latino individuals using data from the Women's Health Initiative SNP Health Association Resource (WHI-SHARe) [17].
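As a rough illustration of the idea behind a PRS, the sketch below computes a score as a weighted sum of risk-allele dosages and then combines scores built from two reference GWASs. The abstract does not say how the combined PRS was formed; the unweighted sum of standardized scores used here is an assumption, and all function names, genotypes, and effect sizes are hypothetical toy values.

```python
from statistics import mean, pstdev


def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Center and scale scores to mean 0 and standard deviation 1."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]


def combine_scores(score_sets):
    """Assumed combination rule: unweighted sum of standardized PRSs."""
    standardized = [standardize(scores) for scores in score_sets]
    return [sum(vals) for vals in zip(*standardized)]


# Hypothetical toy data: 3 individuals genotyped at 4 variants.
genotypes = [[0, 1, 2, 1], [2, 0, 1, 0], [1, 1, 0, 2]]
weights_european = [0.10, -0.05, 0.20, 0.02]     # made-up effect sizes
weights_multiethnic = [0.08, -0.02, 0.15, 0.05]  # made-up effect sizes

prs_eur = [polygenic_score(g, weights_european) for g in genotypes]
prs_multi = [polygenic_score(g, weights_multiethnic) for g in genotypes]
prs_combined = combine_scores([prs_eur, prs_multi])
```

Variant selection (GWAS-significant variants, C&T, or PRS-CS) would happen before this step and determines which variants and weights enter the sum; it is not shown here.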

Fernández-Rhodes L, McArdle CE, Rao H, Wang Y, Martinez-Miller EE, Ward JB, et al. A Gene-Acculturation Study of Obesity Among US Hispanic/Latinos: The Hispanic Community Health Study/Study of Latinos. Psychosomatic medicine. 2023;85(4):358-65.

OBJECTIVE: In the United States, Hispanic/Latino adults face a high burden of obesity; yet, not all individuals are equally affected, due in part to this ethnic group's marked sociocultural diversity. We sought to analyze the modification of body mass index (BMI) genetic effects in Hispanic/Latino adults by their level of acculturation, a complex biosocial phenomenon that remains understudied.

METHODS: Among 11,747 Hispanic/Latino adults in the Hispanic Community Health Study/Study of Latinos aged 18 to 76 years from four urban communities (2008-2011), we a) tested our hypothesis that the effect of a genetic risk score (GRS) for increased BMI may be exacerbated by higher levels of acculturation and b) examined if GRS-by-acculturation interactions varied by gender or Hispanic/Latino background group. All genetic modeling controlled for relatedness, age, gender, principal components of ancestry, center, and complex study design within a generalized estimating equation framework.

RESULTS: We observed a GRS increase of 0.34 kg/m² per risk allele in weighted mean BMI. The estimated main effect of GRS on BMI varied both across acculturation level and across gender. The difference between high and low acculturation ranged from 0.03 to 0.23 kg/m² per risk allele, but varied across acculturation measure and gender.

CONCLUSIONS: These results suggest the presence of effect modification by acculturation, with stronger effects on BMI among highly acculturated individuals and female immigrants. Future studies of obesity in the Hispanic/Latino community should account for sociocultural environments and consider their intersection with gender to better target obesity interventions.

Seyerle AA, Laurie CA, Coombes BJ, Jain D, Conomos MP, Brody J, et al. Whole Genome Analysis of Venous Thromboembolism: the Trans-Omics for Precision Medicine Program. Circulation. Genomic and precision medicine. 2023;16(2):e003532.

BACKGROUND: Risk for venous thromboembolism has a strong genetic component. Whole genome sequencing from the TOPMed program (Trans-Omics for Precision Medicine) allowed us to look for new associations, particularly rare variants missed by standard genome-wide association studies.

METHODS: The 3793 cases and 7834 controls (11.6% of cases were individuals of African, Hispanic/Latino, or Asian ancestry) were analyzed using a single variant approach and an aggregate gene-based approach using our primary filter (included only loss-of-function and missense variants predicted to be deleterious) and our secondary filter (included all missense variants).

RESULTS: Single variant analyses identified associations at 5 known loci. Aggregate gene-based analyses identified only PROC (odds ratio, 6.2 for carriers of rare variants; P=7.4×10⁻¹⁴) when using our primary filter. Employing our secondary variant filter led to a smaller effect size at PROC (odds ratio, 3.8; P=1.6×10⁻¹⁴), while excluding variants found only in rare isoforms led to a larger one (odds ratio, 7.5). Different filtering strategies improved the signal for 2 other known genes: PROS1 became significant (minimum P=1.8×10⁻⁶ with the secondary filter), while SERPINC1 did not (minimum P=4.4×10⁻⁵ with minor allele frequency <0.0005). Results were largely the same when restricting the analyses to include only unprovoked cases; however, one novel gene, MS4A1, became significant (P=4.4×10⁻⁷ using all missense variants with minor allele frequency <0.0005).

CONCLUSIONS: Here, we have demonstrated the importance of using multiple variant filtering strategies, as we detected additional genes when filtering variants based on their predicted deleteriousness, frequency, and presence on the most expressed isoforms. Our primary analyses did not identify new candidate loci; thus larger follow-up studies are needed to replicate the novel MS4A1 locus and to identify additional rare variation associated with venous thromboembolism.

Sofer T, Kurniansyah N, Murray M, Ho YL, Abner E, Esko T, et al. Genome-wide association study of obstructive sleep apnoea in the Million Veteran Program uncovers genetic heterogeneity by sex. EBioMedicine. 2023;90:104536.

BACKGROUND: Genome-wide association studies (GWAS) for obstructive sleep apnoea (OSA) are limited due to the underdiagnosis of OSA, leading to misclassification of OSA, which consequently reduces statistical power. We performed a GWAS of OSA in the Million Veteran Program (MVP) of the U.S. Department of Veterans Affairs (VA) healthcare system, where OSA prevalence is close to its true population prevalence.

METHODS: We performed GWAS of 568,576 MVP participants, stratified by biological sex and by harmonized race/ethnicity and genetic ancestry (HARE) groups of White, Black, Hispanic, and Asian individuals. We considered both BMI adjusted (BMI-adj) and unadjusted (BMI-unadj) models. We replicated associations in independent datasets, and analysed the heterogeneity of OSA genetic associations across HARE and sex groups. We finally performed a larger meta-analysis GWAS of MVP, FinnGen, and the MGB Biobank, totalling 916,696 individuals.

FINDINGS: MVP participants are 91% male. OSA prevalence is 21%. In MVP there were 18 and 6 genome-wide significant loci in BMI-unadj and BMI-adj analyses, respectively, corresponding to 21 association regions. Of these, 17 were not previously reported in association with OSA, and 13 replicated in FinnGen (False Discovery Rate p-value < 0.05). There were widespread significant differences in genetic effects between men and women, but less so across HARE groups. Meta-analysis of MVP, FinnGen, and the MGB Biobank revealed 17 additional, previously unreported, genome-wide significant regions.

INTERPRETATION: Sex differences in genetic associations with OSA are widespread, likely associated with multiple OSA risk factors. OSA shares genetic underpinnings with several sleep phenotypes, suggesting shared aetiology and causal pathways.

FUNDING: Described in acknowledgements.

Jiang MZ, Aguet F, Ardlie K, Chen J, Cornell E, Cruz D, et al. Canonical correlation analysis for multi-omics: Application to cross-cohort analysis. PLoS genetics. 2023;19(5):e1010517.

Integrative approaches that simultaneously model multi-omics data have gained increasing popularity because they provide holistic system biology views of multiple or all components in a biological system of interest. Canonical correlation analysis (CCA) is a correlation-based integrative method designed to extract latent features shared between multiple assays by finding the linear combinations of features, referred to as canonical variables (CVs), within each assay that achieve maximal across-assay correlation. Although widely acknowledged as a powerful approach for multi-omics data, CCA has not been systematically applied to multi-omics data in large cohort studies, which have only recently become available. Here, we adapted sparse multiple CCA (SMCCA), a widely-used derivative of CCA, to proteomics and methylomics data from the Multi-Ethnic Study of Atherosclerosis (MESA) and Jackson Heart Study (JHS). To tackle challenges encountered when applying SMCCA to MESA and JHS, our adaptations include the incorporation of the Gram-Schmidt (GS) algorithm with SMCCA to improve orthogonality among CVs, and the development of Sparse Supervised Multiple CCA (SSMCCA) to allow supervised integration analysis for more than two assays. Effective application of SMCCA to the two real datasets reveals important findings. Applying our SMCCA-GS to MESA and JHS, we identified strong associations between blood cell counts and protein abundance, suggesting that adjustment of blood cell composition should be considered in protein-based association studies. Importantly, CVs obtained from two independent cohorts also demonstrate transferability across the cohorts. For example, proteomic CVs learned from JHS, when transferred to MESA, explain similar amounts of blood cell count phenotypic variance in MESA, explaining 39.0% to 50.0% variation in JHS and 38.9% to 49.1% in MESA. Similar transferability was observed for other omics-CV-trait pairs. This suggests that biologically meaningful and cohort-agnostic variation is captured by CVs. We anticipate that applying our SMCCA-GS and SSMCCA on various cohorts would help identify cohort-agnostic biologically meaningful relationships between multi-omics data and phenotypic traits.
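To make the orthogonality point concrete, here is a minimal sketch of the generic Gram-Schmidt procedure applied to a small set of vectors standing in for canonical variables. The abstract does not describe how GS is wired into SMCCA, so this shows only the textbook orthogonalization step; the function names and the toy vectors are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Return orthonormal vectors spanning the same space as the inputs.

    Uses the modified Gram-Schmidt update: each vector has its projection
    onto every previously accepted basis vector removed, one at a time.
    Vectors that become (numerically) zero are dropped.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


# Hypothetical toy "canonical variables" that are correlated with each other.
cvs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
orthogonal_cvs = gram_schmidt(cvs)
```

After this step, the inner product between any two returned vectors is zero up to floating-point error, which is the property the abstract refers to as improved orthogonality among CVs.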

Reynolds KM, Horimoto ARR V, Lin BM, Zhang Y, Kurniansyah N, Yu B, et al. Ancestry-driven metabolite variation provides insights into disease states in admixed populations. Genome medicine. 2023;15(1):52.

BACKGROUND: Metabolic pathways are related to physiological functions and disease states and are influenced by genetic variation and environmental factors. Hispanic/Latino individuals have ancestry-derived genomic regions (local ancestry) from their recent admixture that have been less characterized for associations with metabolite abundance and disease risk.

METHODS: We performed admixture mapping of 640 circulating metabolites in 3887 Hispanic/Latino individuals from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). Metabolites were quantified in fasting serum through non-targeted mass spectrometry (MS) analysis using ultra-performance liquid chromatography-MS/MS. Replication was performed in 1856 nonoverlapping HCHS/SOL participants with metabolomic data.

RESULTS: By leveraging local ancestry, this study identified significant ancestry-enriched associations for 78 circulating metabolites at 484 independent regions, including 116 novel metabolite-genomic region associations that replicated in an independent sample. Among the main findings, we identified Native American-enriched genomic regions at chromosomes 11 and 15, mapping to FADS1/FADS2 and LIPC, respectively, associated with reduced long-chain polyunsaturated fatty acid metabolites implicated in metabolic and inflammatory pathways. An African-derived genomic region at chromosome 2 was associated with N-acetylated amino acid metabolites. This region, mapped to ALMS1, is associated with chronic kidney disease, a disease that disproportionately burdens individuals of African descent.

CONCLUSIONS: Our findings provide important insights into differences in metabolite quantities related to ancestry in admixed populations including metabolites related to regulation of lipid polyunsaturated fatty acids and N-acetylated amino acids, which may have implications for common diseases in populations.