Mounting evidence suggests that malignant tumors are initiated and maintained by a subpopulation of cancerous cells with biological properties similar to those of normal stem cells. However, descriptions of stem-like gene and pathway signatures in cancers are inconsistent across experimental systems. Driven by a need to improve our understanding of molecular processes that are common and unique across cancer stem cells (CSCs), we have developed the Stem Cell Discovery Engine (SCDE), an online database of curated CSC experiments coupled to the Galaxy analytical framework. The SCDE allows users to consistently describe, share and compare CSC data at the gene and pathway level. Our initial focus has been on carefully curating tissue and cancer stem cell-related experiments from blood, intestine and brain to create a high-quality resource containing 53 public studies and 1098 assays. The experimental information is captured and stored in the multi-omics Investigation/Study/Assay (ISA-Tab) format and can be queried in the data repository. A linked Galaxy framework provides a comprehensive, flexible environment populated with novel tools for gene list comparisons against molecular signatures in GeneSigDB and MSigDB, curated experiments in the SCDE and pathways in WikiPathways. The SCDE is available at http://discovery.hsci.harvard.edu.
Publications
2012
African Americans are disproportionately affected by type 2 diabetes (T2DM) yet few studies have examined T2DM using genome-wide association approaches in this ethnicity. The aim of this study was to identify genes associated with T2DM in the African American population. We performed a Genome Wide Association Study (GWAS) using the Affymetrix 6.0 array in 965 African-American cases with T2DM and end-stage renal disease (T2DM-ESRD) and 1029 population-based controls. The most significant SNPs (n = 550 independent loci) were genotyped in a replication cohort and 122 SNPs (n = 98 independent loci) were further tested through genotyping three additional validation cohorts followed by meta-analysis in all five cohorts totaling 3,132 cases and 3,317 controls. Twelve SNPs had evidence of association in the GWAS (P<0.0071), were directionally consistent in the Replication cohort and were associated with T2DM in subjects without nephropathy (P<0.05). Meta-analysis in all cases and controls revealed a single SNP reaching genome-wide significance (P<2.5×10(-8)). SNP rs7560163 (P = 7.0×10(-9), OR (95% CI) = 0.75 (0.67-0.84)) is located intergenically between RND3 and RBM43. Four additional loci (rs7542900, rs4659485, rs2722769 and rs7107217) were associated with T2DM (P<0.05) and reached more nominal levels of significance (P<2.5×10(-5)) in the overall analysis and may represent novel loci that contribute to T2DM. We have identified novel T2DM-susceptibility variants in the African-American population. Notably, T2DM risk was associated with the major allele and implies an interesting genetic architecture in this population. These results suggest that multiple loci underlie T2DM susceptibility in the African-American population and that these loci are distinct from those identified in other ethnic populations.
Circulating levels of adiponectin, a hormone produced predominantly by adipocytes, are highly heritable and are inversely associated with type 2 diabetes mellitus (T2D) and other metabolic traits. We conducted a meta-analysis of genome-wide association studies in 39,883 individuals of European ancestry to identify genes associated with metabolic disease. We identified 8 novel loci associated with adiponectin levels and confirmed 2 previously reported loci (P = 4.5×10(-8)-1.2×10(-43)). Using a novel method to combine data across ethnicities (N = 4,232 African Americans, N = 1,776 Asians, and N = 29,347 Europeans), we identified two additional novel loci. Expression analyses of 436 human adipocyte samples revealed that mRNA levels of 18 genes at candidate regions were associated with adiponectin concentrations after accounting for multiple testing (p<3×10(-4)). We next developed a multi-SNP genotypic risk score to test the association of adiponectin-decreasing risk alleles with metabolic traits and diseases using consortia-level meta-analytic data. This risk score was associated with increased risk of T2D (p = 4.3×10(-3), n = 22,044), increased triglycerides (p = 2.6×10(-14), n = 93,440), increased waist-to-hip ratio (p = 1.8×10(-5), n = 77,167), increased glucose two hours post oral glucose tolerance testing (p = 4.4×10(-3), n = 15,234), and increased fasting insulin (p = 0.015, n = 48,238), but with lower HDL-cholesterol concentrations (p = 4.5×10(-13), n = 96,748) and decreased BMI (p = 1.4×10(-4), n = 121,335). These findings identify novel genetic determinants of adiponectin levels, which, taken together, influence risk of T2D and markers of insulin resistance.
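As an illustration of the multi-SNP genotypic risk score mentioned in the abstract above, the following is a minimal sketch only. The abstract does not say whether the score was weighted, so both the unweighted sum and the optional per-SNP weights are assumptions, and the SNP identifiers and the function name `genotypic_risk_score` are hypothetical.

```python
def genotypic_risk_score(dosages, weights=None):
    """Sum risk-allele dosages across SNPs, optionally weighted per SNP.

    dosages: dict mapping a SNP identifier to the number of risk alleles
             carried (0 to 2; fractional values allow imputed dosages).
    weights: optional dict mapping the same identifiers to a per-SNP weight
             (for example an effect size). Assumption: when omitted, every
             SNP counts equally.
    """
    if weights is None:
        weights = {snp: 1.0 for snp in dosages}
    missing = set(dosages) - set(weights)
    if missing:
        raise KeyError(f"no weight for SNPs: {sorted(missing)}")
    score = 0.0
    for snp, dosage in dosages.items():
        if not 0 <= dosage <= 2:
            raise ValueError(f"dosage for {snp} must lie between 0 and 2")
        score += weights[snp] * dosage
    return score


# Hypothetical individual carrying 2, 1 and 0 risk alleles at three SNPs.
example_dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
unweighted = genotypic_risk_score(example_dosages)
weighted = genotypic_risk_score(
    example_dosages, weights={"snp_a": 0.5, "snp_b": 0.25, "snp_c": 1.0}
)
```

A score of this kind collapses many small per-allele effects into one number per individual, which can then be tested for association with each trait.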
Gene expression quantitative trait loci (eQTL) are useful for identifying single nucleotide polymorphisms (SNPs) associated with diseases. At times, a genetic variant may be associated with a master regulator involved in the manifestation of a disease. The downstream target genes of the master regulator are typically co-expressed and share biological function. Therefore, it is practical to screen for eQTLs by identifying SNPs associated with the targets of a transcript-regulator (TR). We used a multivariate regression with the gene expression of known targets of TRs and SNPs to identify TReQTLs in European (CEU) and African (YRI) HapMap populations. A nominal p-value of <1×10(-6) revealed 234 SNPs in CEU and 154 in YRI as TReQTLs. These represent 36 independent (tag) SNPs in CEU and 39 in YRI affecting the downstream targets of 25 and 36 TRs respectively. At a false discovery rate (FDR) = 45%, one cis-acting tag SNP (within 1 kb of a gene) in each population was identified as a TReQTL. In CEU, the SNP (rs16858621) in Pcnxl2 was found to be associated with the genes regulated by CREM whereas in YRI, the SNP (rs16909324) was linked to the targets of miRNA hsa-miR-125a. To infer the pathways that regulate expression, we ranked TReQTLs by connectivity within the structure of biological process subtrees. One TReQTL SNP (rs3790904) in CEU maps to Lphn2 and is associated (nominal p-value = 8.1×10(-7)) with the targets of the X-linked breast cancer suppressor Foxp3. The structure of the biological process subtree and a gene interaction network of the TReQTL revealed that tumor necrosis factor, NF-kappaB and variants in G-protein coupled receptors signaling may play a central role as communicators in Foxp3 functional regulation. The potential pleiotropic effect of the Foxp3 TReQTLs was gleaned from integrating mRNA-Seq data and SNP-set enrichment into the analysis.
To make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open 'data commoning' culture. Here we describe the prerequisites for data commoning and present an established and growing ecosystem of solutions using the shared 'Investigation-Study-Assay' framework to support that vision.
BACKGROUND & AIMS: The precise mechanisms by which IFN exerts its antiviral effect against HCV have not yet been elucidated. We sought to identify host genes that mediate the antiviral effect of IFN-α by conducting a whole-genome siRNA library screen.
METHODS: High throughput screening was performed using an HCV genotype 1b replicon, pRep-Feo. Those pools with replicate robust Z scores ≥2.0 entered secondary validation in full-length OR6 replicon cells. Huh7.5.1 cells infected with JFH1 were then used to validate the rescue efficacy of selected genes for HCV replication under IFN-α treatment.
RESULTS: We identified and confirmed 93 human genes involved in the IFN-α anti-HCV effect using a whole-genome siRNA library. Gene ontology analysis revealed that mRNA processing (23 genes, p=2.756e-22), translation initiation (nine genes, p=2.42e-6), and IFN signaling (five genes, p=1.00e-3) were the most enriched functional groups. Nine genes were components of U4/U6.U5 tri-snRNP. We confirmed that silencing squamous cell carcinoma antigen recognized by T cells (SART1), a specific factor of tri-snRNP, abrogates IFN-α's suppressive effects against HCV in both replicon cells and JFH1 infectious cells. We further found that SART1 was not IFN-α inducible, and that its anti-HCV effect in the JFH1 infectious model was mediated through regulation of interferon stimulated genes (ISGs) with or without IFN-α.
CONCLUSIONS: We identified 93 genes that mediate the anti-HCV effect of IFN-α through genome-wide siRNA screening; 23 and nine genes were involved in mRNA processing and translation initiation, respectively. These findings reveal an unexpected role for mRNA processing in generation of the antiviral state, and suggest a new avenue for therapeutic development in HCV.
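The hit-selection step in the METHODS above (replicate robust Z scores ≥2.0) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses the common median/MAD definition of a robust Z score with the usual 1.4826 scaling constant, it requires both replicates to pass the threshold, and the function names and readout values are hypothetical.

```python
from statistics import median

# Scaling constant that makes the MAD comparable to a standard deviation
# for normally distributed data (assumed convention, not from the abstract).
MAD_SCALE = 1.4826


def robust_z_scores(values):
    """Return (x - median) / (1.4826 * MAD) for each readout."""
    values = list(values)
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; robust Z scores are undefined")
    return [(v - center) / (MAD_SCALE * mad) for v in values]


def select_hits(replicate_a, replicate_b, threshold=2.0):
    """Indices of siRNA pools whose robust Z score meets the threshold
    in both replicates (assumption: both replicates must pass)."""
    z_a = robust_z_scores(replicate_a)
    z_b = robust_z_scores(replicate_b)
    return [
        i for i, (a, b) in enumerate(zip(z_a, z_b))
        if a >= threshold and b >= threshold
    ]


# Hypothetical normalized readouts for seven pools in two replicates.
rep_a = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 3.0]
rep_b = [1.0, 1.0, 1.1, 0.9, 0.8, 1.2, 2.5]
hits = select_hits(rep_a, rep_b)
```

Using the median and MAD rather than the mean and standard deviation keeps a few strong hits from inflating the scale and masking themselves, which is the usual reason screens report robust Z scores.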
BACKGROUND: Animal studies suggest that early-life lead exposure influences gene expression and production of proteins associated with Alzheimer's disease (AD).
OBJECTIVES: We attempted to assess the relationship between early-life lead exposure and potential biomarkers for AD among young men and women. We also attempted to assess whether early-life lead exposure was associated with changes in expression of AD-related genes.
METHODS: We used sandwich enzyme-linked immunosorbent assays (ELISA) to measure plasma concentrations of amyloid β proteins Aβ40 and Aβ42 among 55 adults who had participated as newborns and young children in a prospective cohort study of the effects of lead exposure on development. We used RNA microarray techniques to analyze gene expression.
RESULTS: Mean plasma Aβ42 concentrations were lower among 13 participants with high umbilical cord blood lead concentrations (≥ 10 μg/dL) than in 42 participants with lower cord blood lead concentrations (p = 0.08). Among 10 participants with high prenatal lead exposure, we found evidence of an inverse relationship between umbilical cord lead concentration and expression of ADAM metallopeptidase domain 9 (ADAM9), reticulon 4 (RTN4), and low-density lipoprotein receptor-related protein associated protein 1 (LRPAP1) genes, whose products are believed to affect Aβ production and deposition. Gene network analysis suggested enrichment in gene sets involved in nerve growth and general cell development.
CONCLUSIONS: Data from our exploratory study suggest that prenatal lead exposure may influence Aβ-related biological pathways that have been implicated in AD onset. Gene network analysis identified further candidates to study the mechanisms of developmental lead neurotoxicity.
Little is known about the mechanisms of persistent airflow obstruction that result from chronic occupational endotoxin exposure. We sought to analyze the inflammatory response underlying persistent airflow obstruction as a result of chronic occupational endotoxin exposure. We developed a murine model of daily inhaled endotoxin for periods of 5 days to 8 weeks. We analyzed physiologic lung dysfunction, lung histology, bronchoalveolar lavage fluid and total lung homogenate inflammatory cell and cytokine profiles, and pulmonary gene expression profiles. We observed an increase in airway hyperresponsiveness as a result of chronic endotoxin exposure. After 8 weeks, the mice exhibited an increase in bronchoalveolar lavage and lung neutrophils that correlated with an increase in proinflammatory cytokines. Detailed analyses of inflammatory cell subsets revealed an expansion of dendritic cells (DCs), and in particular, proinflammatory DCs, with a reduced percentage of macrophages. Gene expression profiling revealed the up-regulation of a panel of genes that was consistent with DC recruitment, and lung histology revealed an accumulation of DCs in inflammatory aggregates around the airways in 8-week-exposed animals. Repeated, low-dose LPS inhalation, which mirrors occupational exposure, resulted in airway hyperresponsiveness, associated with a failure to resolve the proinflammatory response, an inverted macrophage to DC ratio, and a significant rise in the inflammatory DC population. These findings point to a novel underlying mechanism of airflow obstruction as a result of occupational LPS exposure, and suggest molecular and cellular targets for therapeutic development.
2011
The endoplasmic reticulum (ER) is the main site of protein and lipid synthesis, membrane biogenesis, xenobiotic detoxification and cellular calcium storage, and perturbation of ER homeostasis leads to stress and the activation of the unfolded protein response. Chronic activation of ER stress has been shown to have an important role in the development of insulin resistance and diabetes in obesity. However, the mechanisms that lead to chronic ER stress in a metabolic context in general, and in obesity in particular, are not understood. Here we comparatively examined the proteomic and lipidomic landscape of hepatic ER purified from lean and obese mice to explore the mechanisms of chronic ER stress in obesity. We found suppression of protein but stimulation of lipid synthesis in the obese ER without significant alterations in chaperone content. Alterations in ER fatty acid and lipid composition result in the inhibition of sarco/endoplasmic reticulum calcium ATPase (SERCA) activity and ER stress. Correcting the obesity-induced alteration of ER phospholipid composition or hepatic Serca overexpression in vivo both reduced chronic ER stress and improved glucose homeostasis. Hence, we established that abnormal lipid and calcium metabolism are important contributors to hepatic ER stress in obesity.
A simple biochemical method to isolate mRNAs pulled down with a transfected, biotinylated microRNA was used to identify direct target genes of miR-34a, a tumor suppressor gene. The method reidentified most of the known miR-34a regulated genes expressed in K562 and HCT116 cancer cell lines. Transcripts for 982 genes were enriched in the pull-down with miR-34a in both cell lines. Despite this large number, validation experiments suggested that 90% of the genes identified in both cell lines can be directly regulated by miR-34a. Thus miR-34a is capable of regulating hundreds of genes. The transcripts pulled down with miR-34a were highly enriched for their roles in growth factor signaling and cell cycle progression. These genes form a dense network of interacting gene products that regulate multiple signal transduction pathways that orchestrate the proliferative response to external growth stimuli. Multiple candidate miR-34a-regulated genes participate in RAS-RAF-MAPK signaling. Ectopic miR-34a expression reduced basal ERK and AKT phosphorylation and enhanced sensitivity to serum growth factor withdrawal, while cells genetically deficient in miR-34a were less sensitive. Fourteen new direct targets of miR-34a were experimentally validated, including genes that participate in growth factor signaling (ARAF and PIK3R2) as well as genes that regulate cell cycle progression at various phases of the cell cycle (cyclins D3 and G2, MCM2 and MCM5, PLK1 and SMAD4). Thus miR-34a tempers the proliferative and pro-survival effect of growth factor stimulation by interfering with growth factor signal transduction and downstream pathways required for cell division.