Matching heterogeneous cohorts by projected principal components reveals two novel Alzheimer's disease-associated genes in the Hispanic population.

Willett, Julian Daniel Sunday, Mohamad Waqas, Serhiy Naumenko, Kristina Mullin, Julian Hecker, Lars Bertram, Christoph Lange, Ioannis Vlachos, Winston Hide, Rudolph E Tanzi, and Dmitry Prokopenko. 2026. “Matching Heterogeneous Cohorts by Projected Principal Components Reveals Two Novel Alzheimer’s Disease-Associated Genes in the Hispanic Population.” Alzheimer’s & Dementia: The Journal of the Alzheimer’s Association 22(2):e71189.

Abstract

INTRODUCTION: Alzheimer's disease (AD) is the most common form of dementia. Studies have suggested prevalence is greater in individuals self-identifying as Hispanic. Population-specific results enable personalized and equitable interventions. Ethnicity as a stratifier co-occurs with genomic inflation due to heterogeneity.

METHODS: We conducted genome-wide association studies (GWAS) and meta-analyses among subjects from the Alzheimer's Disease Sequencing Project (ADSP) Umbrella whole genome sequencing (WGS) dataset who self-identified as Hispanic and All of Us (AoU) sub-cohorts matched to that cohort, using projected genetically-derived principal components.
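
The matching idea in METHODS can be illustrated with a minimal sketch. The abstract does not describe the actual pipeline, so everything here is an assumption made for illustration: the function names (`fit_pcs`, `project`, `match_by_pcs`), the toy genotype matrices, the plain mean-centered SVD, and the Euclidean distance cutoff are all hypothetical. The sketch fits principal components on a small reference cohort, projects a second cohort onto those same components, and keeps the projected samples that fall close to a reference sample.

```python
import numpy as np

def fit_pcs(ref_genotypes, n_components):
    """Fit principal components on a reference cohort (samples x variants)."""
    mean = ref_genotypes.mean(axis=0)
    centered = ref_genotypes - mean
    # Rows of vt are variant loadings, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components].T  # variants x components
    return mean, loadings

def project(genotypes, mean, loadings):
    """Project samples onto components fitted on the reference cohort."""
    return (genotypes - mean) @ loadings

def match_by_pcs(ref_scores, target_scores, max_distance):
    """Indices of target samples within max_distance of any reference sample."""
    diffs = target_scores[:, None, :] - ref_scores[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return np.flatnonzero(dists.min(axis=1) <= max_distance)

# Toy genotype dosages (0/1/2); purely illustrative, not real data.
reference = np.array([[0, 0, 1, 1],
                      [0, 1, 1, 2],
                      [1, 0, 2, 1],
                      [1, 1, 2, 2]], dtype=float)
biobank = np.array([[0, 1, 1, 1],
                    [2, 2, 2, 2]], dtype=float)

mean, loadings = fit_pcs(reference, n_components=2)
ref_scores = project(reference, mean, loadings)
biobank_scores = project(biobank, mean, loadings)
matched = match_by_pcs(ref_scores, biobank_scores, max_distance=0.8)
```

In this toy setup only the first biobank sample lies near the reference cohort in the projected space. A real analysis would typically use many more variants and components, and the distance rule would be a study-specific choice.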

RESULTS: We identified a common variant in PIEZO2 on chromosome 18 protective for AD in ADSP subjects, with a p-value just beyond genome-wide significance ($p = 5.4 \times 10^{-8}$). Meta-analyses with genetically-matched AoU participants yielded three (two novel) genome-wide significant AD-associated loci based on rare lead variants: rs374043832 (RGS6/PSEN1), rs192423465 (ASPSCR1), and rs935208076 (GDAP2), which were nominally significant in AoU sub-cohorts.
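
How per-cohort results might be combined and compared against a significance threshold can be sketched as follows. The abstract does not state which meta-analysis method was used, so the inverse-variance weighted fixed-effect model below is an assumption, as are the function name `fixed_effect_meta` and the input effect sizes, which are made-up numbers rather than results from the study. The cutoff of 5 × 10⁻⁸ is the conventional genome-wide significance threshold.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # conventional GWAS significance cutoff

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-cohort effect estimates (e.g. log odds ratios)
    ses:   per-cohort standard errors
    Returns the pooled estimate, its standard error, and a two-sided p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, p

# Hypothetical inputs for two cohorts; NOT values from the paper.
pooled_beta, pooled_se, pooled_p = fixed_effect_meta([0.5, 0.5], [0.1, 0.1])
is_significant = pooled_p < GENOME_WIDE_THRESHOLD
```

Under this convention, the reported single-cohort p-value of 5.4 × 10⁻⁸ sits slightly above the 5 × 10⁻⁸ cutoff, which is why pooling evidence across matched cohorts can matter for loci near the threshold.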

DISCUSSION: We demonstrate a way to match subjects between large biobanks and small disease-specific cohorts, enabling novel findings.

Last updated on 04/02/2026