Publications

2017

Jain D, Hodonsky CJ, Schick UM, Morrison J V, Minnerath S, Brown L, et al. Genome-wide association of white blood cell counts in Hispanic/Latino Americans: the Hispanic Community Health Study/Study of Latinos. Human molecular genetics. 2017;26(6):1193-204.

Circulating white blood cell (WBC) counts (neutrophils, monocytes, lymphocytes, eosinophils, basophils) differ by ethnicity. The genetic factors underlying basal WBC traits in Hispanics/Latinos are unknown. We performed a genome-wide association study of total WBC and differential counts in a large, ethnically diverse US population sample of Hispanics/Latinos ascertained by the Hispanic Community Health Study and Study of Latinos (HCHS/SOL). We demonstrate that several previously known WBC-associated genetic loci (e.g. the African Duffy antigen receptor for chemokines null variant for neutrophil count) are generalizable to WBC traits in Hispanics/Latinos. We identified and replicated common and rare germ-line variants at FLT3 (a gene often somatically mutated in leukemia) associated with monocyte count. The common FLT3 variant rs76428106 has a large allele frequency differential between African and non-African populations. We also identified several novel genetic loci involving or regulating hematopoietic transcription factors (CEBPE-SLC7A7, CEBPA and CRBN-TRNT1) associated with basophil count. The minor allele of the CEBPE variant associated with lower basophil count has been previously associated with Amerindian ancestry and higher risk of acute lymphoblastic leukemia in Hispanics. Together, these data suggest that germline genetic variation affecting transcriptional and signaling pathways that underlie WBC development and lineage specification can contribute to inter-individual as well as ethnic differences in peripheral blood cell counts (normal hematopoiesis) in addition to susceptibility to leukemia (malignant hematopoiesis).

Sofer T, Heller R, Bogomolov M, Avery CL, Graff M, North KE, et al. A powerful statistical framework for generalization testing in GWAS, with application to the HCHS/SOL. Genetic epidemiology. 2017;41(3):251-8.

In genome-wide association studies (GWAS), "generalization" is the replication of genotype-phenotype association in a population with different ancestry than the population in which it was first identified. Current practices for declaring generalizations rely on testing associations while controlling the family-wise error rate (FWER) in the discovery study, then separately controlling error measures in the follow-up study. This approach does not guarantee control over the FWER or false discovery rate (FDR) of the generalization null hypotheses. It also fails to leverage the two-stage design to increase power for detecting generalized associations. We provide a formal statistical framework for quantifying the evidence of generalization that accounts for the (in)consistency between the directions of associations in the discovery and follow-up studies. We develop the directional generalization FWER (FWERg) and FDR (FDRg) controlling r-values, which are used to declare associations as generalized. This framework extends to generalization testing when applied to a published list of single nucleotide polymorphism (SNP)-trait associations. Our methods control FWERg or FDRg under various SNP selection rules based on P-values in the discovery study. We find that it is often beneficial to use a more lenient P-value threshold than the genome-wide significance threshold. In a GWAS of total cholesterol in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), when testing all SNPs with P-values < 5×10⁻⁸ (15 genomic regions) for generalization in a large GWAS of whites, we generalized SNPs from 15 regions. But when testing all SNPs with P-values < 6.6×10⁻⁵ (89 regions), we generalized SNPs from 27 regions.

Sofer T, Cornelis MC, Kraft P, Tchetgen EJT. CONTROL FUNCTION ASSISTED IPW ESTIMATION WITH A SECONDARY OUTCOME IN CASE-CONTROL STUDIES. Statistica Sinica. 2017;27(2):785-804.

Case-control studies are designed towards studying associations between risk factors and a single, primary outcome. Information about additional, secondary outcomes is also collected, but association studies targeting such secondary outcomes should account for the case-control sampling scheme, or otherwise results may be biased. Often, one uses inverse probability weighted (IPW) estimators to estimate population effects in such studies. IPW estimators are robust, as they only require correct specification of the mean regression model of the secondary outcome on covariates, and knowledge of the disease prevalence. However, IPW estimators are inefficient relative to estimators that make additional assumptions about the data generating mechanism. We propose a class of estimators for the effect of risk factors on a secondary outcome in case-control studies that combine IPW with an additional modeling assumption: specification of the disease outcome probability model. We incorporate this model via a mean zero control function. We derive the class of all regular and asymptotically linear estimators corresponding to our modeling assumption, when the secondary outcome mean is modeled using either the identity or the log link. We find the efficient estimator in our class of estimators and show that it reduces to standard IPW when the model for the primary disease outcome is unrestricted, and is more efficient than standard IPW when the model is either parametric or semiparametric.
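The standard IPW baseline that the abstract above compares against can be sketched as follows. This is a minimal illustration, not the control function estimator proposed in the paper: it assumes a known disease prevalence, a single covariate, and an identity-link mean model, and the function names `ipw_weights` and `weighted_linear_fit` are hypothetical.

```python
from typing import List, Sequence, Tuple


def ipw_weights(disease: Sequence[int], prevalence: float) -> List[float]:
    """Return one weight per subject: population share / sample share of its group.

    `disease` holds 1 for cases and 0 for controls; `prevalence` is the
    (assumed known) population disease prevalence.
    """
    n = len(disease)
    n_cases = sum(disease)
    n_controls = n - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    w_case = prevalence / (n_cases / n)
    w_control = (1.0 - prevalence) / (n_controls / n)
    return [w_case if d == 1 else w_control for d in disease]


def weighted_linear_fit(
    x: Sequence[float], y: Sequence[float], w: Sequence[float]
) -> Tuple[float, float]:
    """Weighted least squares for y = intercept + slope * x (closed form)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Toy data: 4 cases and 4 controls, secondary outcome exactly linear in x.
disease = [1, 1, 1, 1, 0, 0, 0, 0]
x = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
y = [2.0 + 3.0 * xi for xi in x]
weights = ipw_weights(disease, prevalence=0.1)
intercept, slope = weighted_linear_fit(x, y, weights)
```

With a 10% prevalence and a balanced sample, each case is down-weighted and each control up-weighted, so the weighted sample resembles the source population before the secondary-outcome regression is fitted.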

Hodonsky CJ, Jain D, Schick UM, Morrison J V, Brown L, McHugh CP, et al. Genome-wide association study of red blood cell traits in Hispanics/Latinos: The Hispanic Community Health Study/Study of Latinos. PLoS genetics. 2017;13(4):e1006760.

Prior GWAS have identified loci associated with red blood cell (RBC) traits in populations of European, African, and Asian ancestry. These studies have not included individuals with an Amerindian ancestral background, such as Hispanics/Latinos, nor evaluated the full spectrum of genomic variation beyond single nucleotide variants. Using a custom genotyping array enriched for Amerindian ancestral content and 1000 Genomes imputation, we performed GWAS in 12,502 participants of Hispanic Community Health Study and Study of Latinos (HCHS/SOL) for hematocrit, hemoglobin, RBC count, RBC distribution width (RDW), and RBC indices. Approximately 60% of previously reported RBC trait loci generalized to HCHS/SOL Hispanics/Latinos, including African ancestral alpha- and beta-globin gene variants. In addition to the known 3.8kb alpha-globin copy number variant, we identified an Amerindian ancestral association in an alpha-globin regulatory region on chromosome 16p13.3 for mean corpuscular volume and mean corpuscular hemoglobin. We also discovered and replicated three genome-wide significant variants in previously unreported loci for RDW (SLC12A2 rs17764730, PSMB5 rs941718), and hematocrit (PROX1 rs3754140). Among the proxy variants at the SLC12A2 locus we identified rs3812049, located in a bi-directional promoter between SLC12A2 (which encodes a red cell membrane ion-transport protein) and an upstream anti-sense long-noncoding RNA, LINC01184, as the likely causal variant. We further demonstrate that disruption of the regulatory element harboring rs3812049 affects transcription of SLC12A2 and LINC01184 in human erythroid progenitor cells. Together, these results reinforce the importance of genetic study of diverse ancestral populations, in particular Hispanics/Latinos.

Yan Q, Brehm J, Pino-Yanes M, Forno E, Lin J, Oh SS, et al. A meta-analysis of genome-wide association studies of asthma in Puerto Ricans. The European respiratory journal. 2017;49(5).

Puerto Ricans are disproportionately affected by asthma in the USA. In this study, we aim to identify genetic variants that confer susceptibility to asthma in Puerto Ricans. We conducted a meta-analysis of genome-wide association studies (GWAS) of asthma in Puerto Ricans, including participants from the Genetics of Asthma in Latino Americans (GALA) I-II, the Hartford-Puerto Rico Study and the Hispanic Community Health Study. Moreover, we examined whether susceptibility loci identified in previous meta-analyses of GWAS are associated with asthma in Puerto Ricans. The only locus to achieve genome-wide significance was chromosome 17q21, as evidenced by our top single nucleotide polymorphism (SNP), rs907092 (OR 0.71, p=1.2×10⁻¹²) at IKZF3. Similar to results in non-Puerto Ricans, SNPs in genes in the same linkage disequilibrium block as IKZF3 (e.g. ZPBP2, ORMDL3 and GSDMB) were significantly associated with asthma in Puerto Ricans. With regard to results from a meta-analysis in Europeans, we replicated findings for rs2305480 at GSDMB, but not for SNPs in any other genes. On the other hand, we replicated results from a meta-analysis of North American populations for SNPs at IL1RL1, TSLP and GSDMB but not for IL33. Our findings suggest that common variants on chromosome 17q21 have the greatest effects on asthma in Puerto Ricans.

Few genome-wide association studies (GWAS) of type 2 diabetes (T2D) have been conducted in U.S. Hispanics/Latinos of diverse backgrounds who are disproportionately affected by diabetes. We conducted a GWAS in 2,499 T2D case subjects and 5,247 control subjects from six Hispanic/Latino background groups in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). Our GWAS identified two known loci (TCF7L2 and KCNQ1) reaching genome-wide significance levels. Conditional analysis on known index single nucleotide polymorphisms (SNPs) indicated an additional independent signal at KCNQ1, represented by an African ancestry-specific variant, rs1049549 (odds ratio 1.49 [95% CI 1.27-1.75]). This association was consistent across Hispanic/Latino background groups and replicated in the MEta-analysis of type 2 DIabetes in African Americans (MEDIA) Consortium. Among 80 previously known index SNPs at T2D loci, 66 SNPs showed consistency with the reported direction of associations and 14 SNPs significantly generalized to the HCHS/SOL. A genetic risk score based on these 80 index SNPs was significantly associated with T2D (odds ratio 1.07 [1.06-1.09] per risk allele), with a stronger effect observed in nonobese than in obese individuals. Our study identified a novel independent signal suggesting an African ancestry-specific allele at KCNQ1 for T2D. Associations between previously identified loci and T2D were generally shown in a large cohort of U.S. Hispanics/Latinos.
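The per-allele odds ratio reported for the genetic risk score can be made concrete with a small sketch. It assumes an unweighted score (a plain count of risk alleles across the index SNPs) and a log-additive model; the abstract does not say how the score was constructed, so both are assumptions, and the function names are hypothetical.

```python
from typing import Sequence


def unweighted_grs(risk_allele_counts: Sequence[int]) -> int:
    """Sum risk-allele counts (0, 1 or 2 per SNP) into a single score."""
    if any(c not in (0, 1, 2) for c in risk_allele_counts):
        raise ValueError("each genotype must carry 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts)


def implied_odds_ratio(score_difference: int, per_allele_or: float = 1.07) -> float:
    """Odds ratio between two people whose scores differ by `score_difference`.

    Under a log-additive model the per-allele odds ratio compounds
    multiplicatively, so the result is per_allele_or ** score_difference.
    """
    return per_allele_or ** score_difference


# Two hypothetical individuals genotyped at five index SNPs.
person_a = unweighted_grs([2, 1, 0, 1, 2])
person_b = unweighted_grs([0, 1, 0, 0, 1])
or_a_vs_b = implied_odds_ratio(person_a - person_b)
or_ten_alleles = implied_odds_ratio(10)
```

Under these assumptions, a difference of ten risk alleles corresponds to roughly a doubling of the odds of T2D (1.07 raised to the tenth power), which shows how a modest per-allele effect accumulates across many loci.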

Liang J, Le TH, Edwards DRV, Tayo BO, Gaulton KJ, Smith JA, et al. Single-trait and multi-trait genome-wide association analyses identify novel loci for blood pressure in African-ancestry populations. PLoS genetics. 2017;13(5):e1006728.

Hypertension is a leading cause of global disease, mortality, and disability. While individuals of African descent suffer a disproportionate burden of hypertension and its complications, they have been underrepresented in genetic studies. To identify novel susceptibility loci for blood pressure and hypertension in people of African ancestry, we performed both single and multiple-trait genome-wide association analyses. We analyzed 21 genome-wide association studies comprised of 31,968 individuals of African ancestry, and validated our results with an additional 54,395 individuals from multi-ethnic studies. These analyses identified nine loci with eleven independent variants which reached genome-wide significance (P < 1.25×10⁻⁸) for systolic or diastolic blood pressure, hypertension, or combined traits. Single-trait analyses identified two loci (TARID/TCF21 and LLPH/TMBIM4) and multiple-trait analyses identified one novel locus (FRMD3) for blood pressure. At these three loci, as well as at GRP20/CDH17, associated variants had alleles common only in African-ancestry populations. Functional annotation showed enrichment for genes expressed in immune and kidney cells, as well as in heart and vascular cells/tissues. Experiments driven by these findings and using angiotensin-II induced hypertension in mice showed altered kidney mRNA expression of six genes, suggesting their potential role in hypertension. Our study provides new evidence for genes related to hypertension susceptibility, and the need to study African-ancestry populations in order to identify biologic factors contributing to hypertension.

Genetic variants contribute to normal variation of iron-related traits and may also cause clinical syndromes of iron deficiency or excess. Iron overload and deficiency can adversely affect human health. For example, elevated iron storage is associated with increased diabetes risk, although mechanisms are still being investigated. We conducted the first genome-wide association study of serum iron, total iron binding capacity (TIBC), transferrin saturation, and ferritin in a Hispanic/Latino cohort, the Hispanic Community Health Study/Study of Latinos (>12 000 participants) and also assessed the generalization of previously known loci to this population. We then evaluated whether iron-associated variants were associated with diabetes and glycemic traits. We found evidence for a novel association between TIBC and a variant near the gene for protein phosphatase 1, regulatory subunit 3B (PPP1R3B; rs4841132, β = -0.116, P = 7.44 × 10⁻⁸). The effect strengthened when iron-deficient individuals were excluded (β = -0.121, P = 4.78 × 10⁻⁹). Ten of sixteen variants previously associated with iron traits generalized to HCHS/SOL, including variants at the transferrin (TF), hemochromatosis (HFE), fatty acid desaturase 2 (FADS2)/myelin regulatory factor (MYRF), transmembrane protease, serine 6 (TMPRSS6), transferrin receptor (TFR2), N-acetyltransferase 2 (arylamine N-acetyltransferase) (NAT2), ABO blood group (ABO), and GRB2 associated binding protein 3 (GAB3) loci. In examining iron variant associations with glucose homeostasis, an iron-raising variant of TMPRSS6 was associated with lower HbA1c levels (P = 8.66 × 10⁻¹⁰). This association was attenuated upon adjustment for iron measures. In contrast, the iron-raising allele of PPP1R3B was associated with higher levels of fasting glucose (P = 7.70 × 10⁻⁷) and fasting insulin (P = 4.79 × 10⁻⁶), but these associations were not attenuated upon adjustment for TIBC, so iron is not likely a mediator. These results provide new genetic information on iron traits and their connection with glucose homeostasis.

Nolte IM, Munoz L, Tragante V, Amare AT, Jansen R, Vaez A, et al. Genetic loci associated with heart rate variability and their effects on cardiac disease risk. Nature communications. 2017;8:15805.

Reduced cardiac vagal control reflected in low heart rate variability (HRV) is associated with greater risks for cardiac morbidity and mortality. In two-stage meta-analyses of genome-wide association studies for three HRV traits in up to 53,174 individuals of European ancestry, we detect 17 genome-wide significant SNPs in eight loci. HRV SNPs tag non-synonymous SNPs (in NDUFA11 and KIAA1755), expression quantitative trait loci (eQTLs) (influencing GNG11, RGS6 and NEO1), or are located in genes preferentially expressed in the sinoatrial node (GNG11, RGS6 and HCN4). Genetic risk scores account for 0.9 to 2.6% of the HRV variance. Significant genetic correlation is found for HRV with heart rate (-0.74<rg<-0.55) and blood pressure (-0.35<rg<-0.20). These findings provide clinically relevant biological insight into heritable variation in vagal heart rhythm regulation, with a key role for genetic variants (GNG11, RGS6) that influence G-protein heterotrimer action in GIRK-channel induced pacemaker membrane hyperpolarization.

Sanders AE, Sofer T, Wong Q, Kerr KF, Agler C, Shaffer JR, et al. Chronic Periodontitis Genome-wide Association Study in the Hispanic Community Health Study / Study of Latinos. Journal of dental research. 2017;96(1):64-72.

Chronic periodontitis (CP) has a genetic component, particularly its severe forms. Evidence from genome-wide association studies (GWASs) has highlighted several potential novel loci. Here, the authors report the first GWAS of CP among a large community-based sample of Hispanics/Latinos. The authors interrogated a quantitative trait of CP (mean interproximal clinical attachment level determined by full-mouth periodontal examinations) among 10,935 adult participants (mean age: 45 y, range: 18 to 76 y) from the Hispanic Community Health Study / Study of Latinos. Genotyping was done with a custom Illumina Omni2.5M array, and imputation to approximately 20 million single-nucleotide polymorphisms was based on the 1000 Genomes Project phase 1 reference panel. Analyses were based on linear mixed models adjusting for sex, age, study design features, ancestry, and kinship and employed a conventional P < 5 × 10⁻⁸ statistical significance threshold. The authors identified a genome-wide significant association signal in the 1q42.2 locus (TSNAX-DISC1 noncoding RNA, lead single-nucleotide polymorphism: rs149133391, minor allele [C] frequency = 0.01, P = 7.9 × 10⁻⁹) and 4 more loci with suggestive evidence of association (P < 5 × 10⁻⁶): 1q22 (rs13373934), 5p15.33 (rs186066047), 6p22.3 (rs10456847), and 11p15.1 (rs75715012). We tested these loci for replication in independent samples of European-American (n = 4,402) and African-American (n = 908) participants of the Atherosclerosis Risk in Communities study. There was no replication among the European Americans; however, the TSNAX-DISC1 locus replicated in the African-American sample (rs149133391, minor allele frequency = 0.02, P = 9.1 × 10⁻³), while the 1q22 locus was directionally concordant and nominally significant (rs13373934, P = 4.0 × 10⁻²).
This discovery GWAS of interproximal clinical attachment level, a measure of lifetime periodontal tissue destruction, was conducted in a large, community-based sample of Hispanics/Latinos. It identified a genome-wide significant locus that was independently replicated in an African-American population. Identifying this genetic marker offers direction for interrogation in subsequent genomic and experimental studies of CP.