Publications

2021

Chen J, Spracklen CN, Marenne G, Varshney A, Corbin LJ, Luan J, et al. The trans-ancestral genomic architecture of glycemic traits. Nature genetics. 2021;53(6):840-6.

Glycemic traits are used to diagnose and monitor type 2 diabetes and cardiometabolic health. To date, most genetic studies of glycemic traits have focused on individuals of European ancestry. Here we aggregated genome-wide association studies comprising up to 281,416 individuals without diabetes (30% non-European ancestry) for whom fasting glucose, 2-h glucose after an oral glucose challenge, glycated hemoglobin and fasting insulin data were available. Trans-ancestry and single-ancestry meta-analyses identified 242 loci (99 novel; P < 5 × 10⁻⁸), 80% of which had no significant evidence of between-ancestry heterogeneity. Analyses restricted to individuals of European ancestry with equivalent sample size would have led to 24 fewer new loci. Compared with single-ancestry analyses, equivalent-sized trans-ancestry fine-mapping reduced the number of estimated variants in 99% credible sets by a median of 37.5%. Genomic-feature, gene-expression and gene-set analyses revealed distinct biological signatures for each trait, highlighting different underlying biological pathways. Our results increase our understanding of diabetes pathophysiology by using trans-ancestry studies for improved power and resolution.

Sofer T, Zheng X, Laurie CA, Gogarten SM, Brody JA, Conomos MP, et al. Variant-specific inflation factors for assessing population stratification at the phenotypic variance level. Nature communications. 2021;12(1):3506.

In modern Whole Genome Sequencing (WGS) epidemiological studies, participant-level data from multiple studies are often pooled and results are obtained from a single analysis. We consider the impact of differential phenotype variances by study, which we term 'variance stratification'. Left unaccounted for, variance stratification can lead to both decreased statistical power and increased false-positive rates, depending on how allele frequencies, sample sizes, and phenotypic variances vary across the studies that are pooled. We develop a procedure to compute variant-specific inflation factors, and show how it can be used for diagnosis of genetic association analyses on pooled individual-level data from multiple studies. We describe a WGS-appropriate analysis approach, implemented in freely available software, which allows study-specific variances and thereby improves performance in practice. We illustrate the variance stratification problem, its solutions, and the proposed diagnostic procedure in simulations and in data from the Trans-Omics for Precision Medicine Whole Genome Sequencing Program (TOPMed), used in association tests for hemoglobin concentrations and BMI.

Keramati AR, Chen MH, Rodriguez BAT, Yanek LR, Bhan A, Gaynor BJ, et al. Genome sequencing unveils a regulatory landscape of platelet reactivity. Nature communications. 2021;12(1):3626.

Platelet aggregation at the site of atherosclerotic vascular injury is the underlying pathophysiology of myocardial infarction and stroke. To build upon prior GWAS, here we report on 16 loci identified through a whole genome sequencing (WGS) approach in 3,855 NHLBI Trans-Omics for Precision Medicine (TOPMed) participants deeply phenotyped for platelet aggregation. We identify the RGS18 locus, which encodes a myeloerythroid lineage-specific regulator of G-protein signaling that co-localizes with expression quantitative trait loci (eQTL) signatures for RGS18 expression in platelets. Gene-based approaches implicate the SVEP1 gene, a known contributor to coronary artery disease risk. Sentinel variants at RGS18 and PEAR1 are associated with thrombosis risk and increased gastrointestinal bleeding risk, respectively. Our WGS findings add to previously identified GWAS loci, provide insights regarding the mechanism(s) by which genetics may influence cardiovascular disease risk, and underscore the importance of rare variant and regulatory approaches to identifying loci contributing to complex phenotypes.

Li R, Rueschman M, Gottlieb DJ, Redline S, Sofer T. A composite sleep and pulmonary phenotype predicting hypertension. EBioMedicine. 2021;68:103433.

BACKGROUND: Multiple aspects of sleep and Sleep Disordered Breathing (SDB) have been linked to hypertension. However, the standard measure of SDB, the Apnoea Hypopnea Index (AHI), has not identified patients likely to experience large improvements in blood pressure with SDB treatment.

METHODS: To use machine learning to select sleep and pulmonary measures associated with hypertension development when considered jointly, we applied feature screening followed by Elastic Net penalized regression in association with incident hypertension using a wide array of polysomnography measures, and lung function, derived for the Sleep Heart Health Study (SHHS).

FINDINGS: At baseline, n=860 SHHS individuals with complete data were age 61 years, on average. Of these, 291 developed hypertension 5 years later. A combination of pulmonary function and 18 sleep phenotypes predicted incident hypertension (OR=1.43, 95% confidence interval [1.14, 1.80] per 1 standard deviation (SD) of the phenotype), while the apnoea-hypopnea index (AHI) had low evidence of association with incident hypertension (OR=1.13, 95% confidence interval [0.97, 1.33] per 1 SD). In a generalization analysis in 923 individuals from the Multi-Ethnic Study of Atherosclerosis, aged 65 on average with 615 individuals with hypertension, the new phenotype was cross-sectionally associated with hypertension (OR=1.26, 95% CI [1.10, 1.45]).

INTERPRETATION: A unique combination of sleep and pulmonary function measures better predicts hypertension compared to the AHI. The composite measure included indices capturing apnoea and hypopnea event durations, with shorter event lengths associated with increased risk of hypertension.

FUNDING: This research was supported by National Heart, Lung, and Blood Institute (NHLBI) contracts HHSN268201500003I, N01-HC-95159, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, and N01-HC-95169 and by National Center for Advancing Translational Sciences grants UL1-TR-000040, UL1-TR-001079, and UL1-TR-001420. The MESA Sleep ancillary study was supported by NHLBI grant HL-56984. Pulmonary phenotyping in MESA was funded by NHLBI grants R01-HL077612 and R01-HL093081. This work was supported by NHLBI grant R35HL135818 to Susan Redline.

Justice AE, Young K, Gogarten SM, Sofer T, Graff M, Love SAM, et al. Genome-wide association study of body fat distribution traits in Hispanics/Latinos from the HCHS/SOL. Human molecular genetics. 2021;30(22):2190-204.

Central obesity is a leading health concern with a great burden carried by ethnic minority populations, especially Hispanics/Latinos. Genetic factors contribute to the obesity burden overall and to inter-population differences. We aimed to identify the loci associated with central adiposity measured as waist-to-hip ratio (WHR), waist circumference (WC) and hip circumference (HIP) adjusted for body mass index (adjBMI) by using the Hispanic Community Health Study/Study of Latinos (HCHS/SOL); determine whether associations differ by background group within HCHS/SOL; and determine whether previously reported associations generalize to HCHS/SOL. Our analyses included 7472 women and 5200 men of mainland (Mexican, Central and South American) and Caribbean (Puerto Rican, Cuban and Dominican) background residing in the USA. We performed genome-wide association analyses stratified and combined across sexes using linear mixed-model regression. We identified 16 variants for waist-to-hip ratio adjusted for body mass index (WHRadjBMI), 22 for waist circumference adjusted for body mass index (WCadjBMI) and 28 for hip circumference adjusted for body mass index (HIPadjBMI), which reached suggestive significance (P < 1 × 10⁻⁶). Many loci exhibited differences in strength of associations by ethnic background and sex. We brought a total of 66 variants forward for validation in cohorts (N = 34 161) with participants of Hispanic/Latino, African and European descent. We confirmed four novel loci (P < 0.05 and consistent direction of effect, and P < 5 × 10⁻⁸ after meta-analysis), including two for WHRadjBMI (rs13301996, rs79478137); one for WCadjBMI (rs3168072) and one for HIPadjBMI (rs28692724). Also, we generalized previously reported associations to HCHS/SOL (8 for WHRadjBMI, 10 for WCadjBMI and 12 for HIPadjBMI). Our study highlights the importance of large-scale genomic studies in ancestrally diverse Hispanic/Latino populations for identifying and characterizing central obesity susceptibility that may be ancestry-specific.

STUDY OBJECTIVES: In an older African-American sample (n = 231) we tested associations of the household environment and in-bed behaviors with sleep duration, efficiency, and wakefulness after sleep onset (WASO).

METHODS: Older adult participants completed a household-level sleep environment questionnaire, a sleep questionnaire, and underwent 7-day wrist actigraphy for objective measures of sleep. Perceived household environment (self-reported) was evaluated using questions regarding safety, physical comfort, temperature, noise, and light disturbances. In-bed behaviors included watching television, listening to radio/music, use of computer/tablet/phone, playing video games, reading books, and eating. To estimate the combined effect of the components in each domain (perceived household environment and in-bed behaviors), we calculated and standardized a weighted score per sleep outcome (e.g. duration, efficiency, WASO), with a higher score indicating worse conditions. The weights were derived from the coefficients of each component estimated from linear regression models predicting each sleep outcome while adjusting for covariates.

RESULTS: A standard deviation increase in an adverse household environment score was associated with lower self-reported sleep duration (β = -13.9 min, 95% confidence interval: -26.1, -1.7) and actigraphy-based sleep efficiency (β = -0.7%, -1.4, 0.0). A standard deviation increase in the in-bed behaviors score was associated with lower actigraphy-based sleep duration (β = -9.7 min, -18.0, -1.3), sleep efficiency (β = -1.2%, -1.9, -0.6), and higher WASO (5.3 min, 2.1, 8.6).

CONCLUSION: Intervening on the sleep environment, including healthy sleep practices, may improve sleep duration and continuity among African-Americans.

Bryan MS, Sofer T, Afshar M, Mossavar-Rahmani Y, Hosgood D, Punjabi NM, et al. Mendelian randomization analysis of arsenic metabolism and pulmonary function within the Hispanic Community Health Study/Study of Latinos. Scientific reports. 2021;11(1):13470.

Arsenic exposure has been linked to poor pulmonary function, and inefficient arsenic metabolizers may be at increased risk. Dietary rice has recently been identified as a possible substantial route of exposure to arsenic, and it remains unknown whether it can provide a sufficient level of exposure to affect pulmonary function in inefficient metabolizers. Within 12,609 participants of HCHS/SOL, asthma diagnoses and spirometry-based measures of pulmonary function were assessed, and rice consumption was inferred from grain intake via a food frequency questionnaire. After stratifying by smoking history, the relationship between arsenic metabolism efficiency [percentages of inorganic arsenic (%iAs), monomethylarsenate (%MMA), and dimethylarsinate (%DMA) species in urine] and the measures of pulmonary function was estimated in a two-sample Mendelian randomization approach (genotype information from an Illumina HumanOmni2.5-8v1-1 array), focusing on participants with high inferred rice consumption. Among never-smoking high inferred consumers of rice (n = 1395), inefficient metabolism was associated with past asthma diagnosis and forced vital capacity below the lower limit of normal (LLN) (OR 1.40, p = 0.0212 and OR 1.42, p = 0.0072, respectively, for each percentage-point increase in %iAs; OR 1.26, p = 0.0240 and OR 1.24, p = 0.0193 for %MMA; OR 0.87, p = 0.0209 and OR 0.87, p = 0.0123 for the marker of efficient metabolism, %DMA). Among ever-smoking high inferred consumers of rice (n = 1127), inefficient metabolism was associated with peak expiratory flow below LLN (OR 1.54, p = 0.0108 per percentage-point increase in %iAs; OR 1.37, p = 0.0097 for %MMA; and OR 0.83, p = 0.0093 for %DMA). Less efficient arsenic metabolism was associated with indicators of pulmonary dysfunction among those with high inferred rice consumption, suggesting that reductions in dietary arsenic could improve respiratory health.

Sofer T, Lee J, Kurniansyah N, Jain D, Laurie CA, Gogarten SM, et al. BinomiRare: A robust test for association of a rare genetic variant with a binary outcome for mixed models and any case-control proportion. HGG advances. 2021;2(3).

Whole-genome sequencing (WGS) and whole-exome sequencing studies have become increasingly available and are being used to identify rare genetic variants associated with health and disease outcomes. Investigators routinely use mixed models to account for genetic relatedness or other clustering variables (e.g., family or household) when testing genetic associations. However, no existing tests of the association of a rare variant with a binary outcome in the presence of correlated data control the type 1 error where there are (1) few individuals harboring the rare allele, (2) a small proportion of cases relative to controls, and (3) covariates to adjust for. Here, we address all three issues in developing a framework for testing rare variant association with a binary trait in individuals harboring at least one risk allele. In this framework, we estimate outcome probabilities under the null hypothesis and then use them, within the individuals with at least one risk allele, to test variant associations. We extend the BinomiRare test, which was previously proposed for independent observations, and develop the Conway-Maxwell-Poisson (CMP) test and study their properties in simulations. We show that the BinomiRare test always controls the type 1 error, while the CMP test sometimes does not. We then use the BinomiRare test to test the association of rare genetic variants in target genes with small-vessel disease (SVD) stroke, short sleep, and venous thromboembolism (VTE), in whole-genome sequence data from the Trans-Omics for Precision Medicine (TOPMed) program.
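As an illustration of the carrier-only idea described in this abstract, the following is a minimal sketch of an exact test based on the Poisson-binomial distribution: given null outcome probabilities for the individuals who carry the rare allele, it computes how surprising the observed number of cases among them is. This is not the authors' implementation. The function names are hypothetical, the null probabilities are assumed to be supplied by some previously fitted model, and the one-sided upper-tail p-value is a simplification chosen for brevity.

```python
from typing import List, Sequence


def poisson_binomial_pmf(probs: Sequence[float]) -> List[float]:
    """Return P(X = k) for k = 0..n, where X is a sum of independent
    Bernoulli(p_i) variables (one per carrier)."""
    pmf = [1.0]  # with zero carriers, X = 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this carrier is a control
            nxt[k + 1] += mass * p       # this carrier is a case
        pmf = nxt
    return pmf


def carrier_upper_tail_pvalue(null_probs: Sequence[float],
                              observed_cases: int) -> float:
    """One-sided p-value P(X >= observed_cases) under the null.

    null_probs: assumed null probability of being a case for each carrier
    (hypothetical input, e.g. from a model fitted without the variant).
    """
    if not 0 <= observed_cases <= len(null_probs):
        raise ValueError("observed_cases must lie between 0 and the number of carriers")
    pmf = poisson_binomial_pmf(null_probs)
    return min(1.0, sum(pmf[observed_cases:]))


if __name__ == "__main__":
    # Five hypothetical carriers with low null case probabilities, three observed cases.
    print(carrier_upper_tail_pvalue([0.05, 0.10, 0.02, 0.08, 0.05], 3))
```

Because only carriers enter the calculation, the cost grows with the square of the number of carriers rather than with the full sample size, which suits variants carried by few individuals.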

Cade BE, Lee J, Sofer T, Wang H, Zhang M, Chen H, et al. Whole-genome association analyses of sleep-disordered breathing phenotypes in the NHLBI TOPMed program. Genome medicine. 2021;13(1):136.

BACKGROUND: Sleep-disordered breathing is a common disorder associated with significant morbidity. The genetic architecture of sleep-disordered breathing remains poorly understood. Through the NHLBI Trans-Omics for Precision Medicine (TOPMed) program, we performed the first whole-genome sequence analysis of sleep-disordered breathing.

METHODS: The study sample comprised 7988 individuals of diverse ancestry. Common-variant and pathway analyses included an additional 13,257 individuals. We examined five complementary traits describing different aspects of sleep-disordered breathing: the apnea-hypopnea index, average oxyhemoglobin desaturation per event, average and minimum oxyhemoglobin saturation across the sleep episode, and the percentage of sleep with oxyhemoglobin saturation < 90%. We adjusted for age, sex, BMI, study, and family structure using MMSKAT and EMMAX mixed linear model approaches. Additional bioinformatics analyses were performed with MetaXcan, GIGSEA, and ReMap.

RESULTS: We identified a multi-ethnic set-based rare-variant association (p = 3.48 × 10⁻⁸) on chromosome X with ARMCX3. Additional rare-variant associations include ARMCX3-AS1, MRPS33, and C16orf90. Novel common-variant loci were identified in the NRG1 and SLC45A2 regions, and previously associated loci in the IL18RAP and ATP2B4 regions were associated with novel phenotypes. Transcription factor binding site enrichment identified associations with genes implicated in respiratory and craniofacial traits. Additional analyses identified significantly associated pathways.

CONCLUSIONS: We have identified the first gene-based rare-variant associations with objectively measured sleep-disordered breathing traits. Our results increase the understanding of the genetic architecture of sleep-disordered breathing and highlight associations in genes that modulate lung development, inflammation, respiratory rhythmogenesis, and HIF1A-mediated hypoxic response.

Luo Y, Kanai M, Choi W, Li X, Sakaue S, Yamamoto K, et al. A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response. Nature genetics. 2021;53(10):1504-16.

Fine-mapping to plausible causal variation may be more effective in multi-ancestry cohorts, particularly in the MHC, which has population-specific structure. To enable such studies, we constructed a large (n = 21,546) HLA reference panel spanning five global populations based on whole-genome sequences. Despite population-specific long-range haplotypes, we demonstrated accurate imputation at G-group resolution (94.2%, 93.7%, 97.8% and 93.7% in admixed African (AA), East Asian (EAS), European (EUR) and Latino (LAT) populations). Applying HLA imputation to genome-wide association study data for HIV-1 viral load in three populations (EUR, AA and LAT), we obviated effects of previously reported associations from population-specific HIV studies and discovered a novel association at position 156 in HLA-B. We pinpointed the MHC association to three amino acid positions (97, 67 and 156) marking three consecutive pockets (C, B and D) within the HLA-B peptide-binding groove, explaining 12.9% of trait variance.