Publications

2014

Tan, Shen Mynn, Gabriel Altschuler, Tian Yun Zhao, Haw Siang Ang, Henry Yang, Bing Lim, Leah Vardy, Winston Hide, Andrew M Thomson, and Ricky R Lareu. 2014. “Divergent LIN28-mRNA Associations Result in Translational Suppression Upon the Initiation of Differentiation.” Nucleic Acids Research 42(12):7997-8007. doi: 10.1093/nar/gku430.

LIN28 function is fundamental to the activity and behavior of human embryonic stem cells (hESCs) and induced pluripotent stem cells. Its main roles in these cell types are the regulation of translational efficiency and let-7 miRNA maturation. However, LIN28-associated mRNA cargo shifting and resultant regulation of translational efficiency upon the initiation of differentiation remain unknown. An RNA-immunoprecipitation and microarray analysis protocol, eRIP, that has high specificity and sensitivity was developed to test endogenous LIN28-associated mRNA cargo shifting. A combined eRIP and polysome analysis of early stage differentiation of hESCs with two distinct differentiation cues revealed close similarities between the dynamics of LIN28 association and translational modulation of genes involved in the Wnt signaling, cell cycle, RNA metabolism and proteasomal pathways. Our data demonstrate that change in translational efficiency is a major contributor to early stages of differentiation of hESCs, in which LIN28 plays a central role. This implies that eRIP analysis of LIN28-associated RNA cargoes may be used for rapid functional quality control of pluripotent stem cells under manufacture for therapeutic applications.

2013

White, Richard A, Jørgen V Bjørnholt, Donna D Baird, Tore Midtvedt, Jennifer R Harris, Marcello Pagano, Winston Hide, Knut Rudi, Birgitte Moen, Nina Iszatt, Shyamal D Peddada, and Merete Eggesbø. 2013. “Novel Developmental Analyses Identify Longitudinal Patterns of Early Gut Microbiota That Affect Infant Growth.” PLoS Computational Biology 9(5):e1003042. doi: 10.1371/journal.pcbi.1003042.

It is acknowledged that some obesity trajectories are set early in life, and that rapid weight gain in infancy is a risk factor for later development of obesity. Identifying modifiable factors associated with early rapid weight gain is a prerequisite for curtailing the growing worldwide obesity epidemic. Recently, much attention has been given to findings indicating that gut microbiota may play a role in obesity development. We aim at identifying how the development of early gut microbiota is associated with expected infant growth. We developed a novel procedure that allows for the identification of longitudinal gut microbiota patterns (corresponding to the gut ecosystem developing), which are associated with an outcome of interest, while appropriately controlling for the false discovery rate. Our method identified developmental pathways of Staphylococcus species and Escherichia coli that were associated with expected growth, and traditional methods indicated that the detection of Bacteroides species at day 30 was associated with growth. Our method should have wide future applicability for studying gut microbiota, and is particularly important for translational considerations, as it is critical to understand the timing of microbiome transitions prior to attempting to manipulate gut microbiota in early life.

Sui, Shannan Ho, Emily Merrill, Nils Gehlenborg, Psalm Haseley, Ilya Sytchev, Richard Park, Philippe Rocca-Serra, Stephane Corlosquet, Alejandra Gonzalez-Beltran, Eamonn Maguire, Oliver Hofmann, Peter Park, Sudeshna Das, Susanna-Assunta Sansone, and Winston Hide. 2013. “The Stem Cell Commons: An Exemplar for Data Integration in the Biomedical Domain Driven by the ISA Framework.” AMIA Joint Summits on Translational Science Proceedings. AMIA Joint Summits on Translational Science 2013:70.

Comparisons of stem cell experiments at both molecular and semantic levels remain challenging due to inconsistencies in results, data formats, and descriptions among biomedical research discoveries. The Harvard Stem Cell Institute (HSCI) has created the Stem Cell Commons (stemcellcommons.org), an open, community-based approach to data sharing. Experimental information is integrated using the Investigation-Study-Assay tabular format (ISA-Tab) used by over 30 organizations (ISA Commons, isacommons.org). The early adoption of this format permitted the novel integration of three independent systems to facilitate stem cell data storage, exchange and analysis: the Blood Genomics Repository, the Stem Cell Discovery Engine, and the new Refinery platform that links the Galaxy analytical engine to data repositories.

Lai, Peggy S, Oliver Hofmann, Rebecca M Baron, Manuela Cernadas, Quanxin Ryan Meng, Herbert S Bresler, David M Brass, Ivana V Yang, David A Schwartz, David C Christiani, and Winston Hide. 2013. “Integrating Murine Gene Expression Studies to Understand Obstructive Lung Disease Due to Chronic Inhaled Endotoxin.” PLoS One 8(5):e62910. doi: 10.1371/journal.pone.0062910.

RATIONALE: Endotoxin is a near-ubiquitous environmental exposure that has been associated with both asthma and chronic obstructive pulmonary disease (COPD). These obstructive lung diseases have a complex pathophysiology, making them difficult to study comprehensively in the context of endotoxin. Genome-wide gene expression studies have been used to identify a molecular snapshot of the response to environmental exposures. Identification of differentially expressed genes shared across all published murine models of chronic inhaled endotoxin will provide insight into the biology underlying endotoxin-associated lung disease.

METHODS: We identified three published murine models with gene expression profiling after repeated low-dose inhaled endotoxin. All array data from these experiments were re-analyzed, annotated consistently, and tested for shared genes found to be differentially expressed. Additional functional comparison was conducted by testing for significant enrichment of differentially expressed genes in known pathways. The importance of this gene signature in smoking-related lung disease was assessed using hierarchical clustering in an independent experiment where mice were exposed to endotoxin, smoke, and endotoxin plus smoke.

RESULTS: A 101-gene signature was detected in three murine models, more than expected by chance. The three model systems exhibit additional similarity beyond shared genes when compared at the pathway level, with increasing enrichment of inflammatory pathways associated with longer duration of endotoxin exposure. Genes and pathways important in both asthma and COPD were shared across all endotoxin models. Mice exposed to endotoxin, smoke, and smoke plus endotoxin were accurately classified with the endotoxin gene signature.

CONCLUSIONS: Despite the differences in laboratory, duration of exposure, and strain of mouse used in three experimental models of chronic inhaled endotoxin, surprising similarities in gene expression were observed. The endotoxin component of tobacco smoke may play an important role in disease development.

Altschuler, Gabriel M, Oliver Hofmann, Irina Kalatskaya, Rebecca Payne, Shannan J Ho Sui, Uma Saxena, Andrei V Krivtsov, Scott A Armstrong, Tianxi Cai, Lincoln Stein, and Winston A Hide. 2013. “Pathprinting: An Integrative Approach to Understand the Functional Basis of Disease.” Genome Medicine 5(7):68. doi: 10.1186/gm472.

New strategies to combat complex human disease require systems approaches to biology that integrate experiments from cell lines, primary tissues and model organisms. We have developed Pathprint, a functional approach that compares gene expression profiles in a set of pathways, networks and transcriptionally regulated targets. It can be applied universally to gene expression profiles across species. Integration of large-scale profiling methods and curation of the public repository overcomes platform, species and batch effects to yield a standard measure of functional distance between experiments. We show that pathprints combine mouse and human blood developmental lineage, and can be used to identify new prognostic indicators in acute myeloid leukemia. The code and resources are available at http://compbio.sph.harvard.edu/hidelab/pathprint.

Petrocca, Fabio, Gabriel Altschuler, Shen Mynn Tan, Marc L Mendillo, Haoheng Yan, Joseph Jerry, Andrew L Kung, Winston Hide, Tan A Ince, and Judy Lieberman. 2013. “A Genome-Wide siRNA Screen Identifies Proteasome Addiction As a Vulnerability of Basal-Like Triple-Negative Breast Cancer Cells.” Cancer Cell 24(2):182-96. doi: 10.1016/j.ccr.2013.07.008.

Basal-like triple-negative breast cancers (TNBCs) have poor prognosis. To identify basal-like TNBC dependencies, a genome-wide siRNA lethality screen compared two human breast epithelial cell lines transformed with the same genes: basal-like BPLER and myoepithelial HMLER. Expression of the screen's 154 BPLER dependency genes correlated with poor prognosis in breast, but not lung or colon, cancer. Proteasome genes were overrepresented hits. Basal-like TNBC lines were selectively sensitive to proteasome inhibitor drugs relative to normal epithelial, luminal, and mesenchymal TNBC lines. Proteasome inhibition reduced growth of established basal-like TNBC tumors in mice and blocked tumor-initiating cell function and macrometastasis. Proteasome addiction in basal-like TNBCs was mediated by NOXA and linked to MCL-1 dependence.

Sandberg, Cecilie Jonsgar, Gabriel Altschuler, Jieun Jeong, Kirsten Kierulf Strømme, Biljana Stangeland, Wayne Murrell, Unn-Hilde Grasmo-Wendler, Ola Myklebost, Eirik Helseth, Einar Osland Vik-Mo, Winston Hide, and Iver A Langmoen. 2013. “Comparison of Glioma Stem Cells to Neural Stem Cells from the Adult Human Brain Identifies Dysregulated Wnt-Signaling and a Fingerprint Associated With Clinical Outcome.” Experimental Cell Research 319(14):2230-43. doi: 10.1016/j.yexcr.2013.06.004.

Glioblastoma is the most common brain tumor. Median survival in unselected patients is <10 months. The tumor harbors stem-like cells that self-renew and propagate upon serial transplantation in mice, although the clinical relevance of these cells has not been well documented. We have performed the first genome-wide analysis that directly relates the gene expression profile of nine enriched populations of glioblastoma stem cells (GSCs) to five identically isolated and cultivated populations of stem cells from the normal adult human brain. Although the two cell types share common stem- and lineage-related markers, GSCs show more heterogeneous gene expression. We identified a number of pathways that are dysregulated in GSCs. A subset of these pathways has previously been identified in leukemic stem cells, suggesting that cancer stem cells of different origin may have common features. Genes upregulated in GSCs were also highly expressed in embryonic and induced pluripotent stem cells. We found that canonical Wnt-signaling plays an important role in GSCs, but not in adult human neural stem cells. We also identified a 30-gene signature highly overexpressed in GSCs. The expression of these signature genes correlates with clinical outcome and demonstrates the clinical relevance of GSCs.

Sompallae, Ramakrishna, Oliver Hofmann, Christopher A Maher, Craig Gedye, Andreas Behren, Morana Vitezic, Carsten O Daub, Sylvie Devalle, Otavia L Caballero, Piero Carninci, Yoshihide Hayashizaki, Elizabeth R Lawlor, Jonathan Cebon, and Winston Hide. 2013. “A Comprehensive Promoter Landscape Identifies a Novel Promoter for CD133 in Restricted Tissues, Cancers, and Stem Cells.” Frontiers in Genetics 4:209. doi: 10.3389/fgene.2013.00209.

PROM1 is the gene encoding prominin-1 or CD133, an important cell surface marker for the isolation of both normal and cancer stem cells. PROM1 transcripts initiate at a range of transcription start sites (TSS) associated with distinct tissue and cancer expression profiles. Using high resolution Cap Analysis of Gene Expression (CAGE) sequencing we characterize TSS utilization across a broad range of normal and developmental tissues. We identify a novel proximal promoter (P6) within CD133(+) melanoma cell lines and stem cells. Additional exon array sampling finds P6 to be active in populations enriched for mesenchyme, neural stem cells and within CD133(+) enriched Ewing sarcomas. The P6 promoter is enriched with respect to previously characterized PROM1 promoters for a HMGI/Y (HMGA1) family transcription factor binding site motif and exhibits different epigenetic modifications relative to the canonical promoter region of PROM1.

Huang, Hsuan-Ting, Katie L Kathrein, Abby Barton, Zachary Gitlin, Yue-Hua Huang, Thomas P Ward, Oliver Hofmann, Anthony Dibiase, Anhua Song, Svitlana Tyekucheva, Winston Hide, Yi Zhou, and Leonard I Zon. 2013. “A Network of Epigenetic Regulators Guides Developmental Haematopoiesis in Vivo.” Nature Cell Biology 15(12):1516-25. doi: 10.1038/ncb2870.

The initiation of cellular programs is orchestrated by key transcription factors and chromatin regulators that activate or inhibit target gene expression. To generate a compendium of chromatin factors that establish the epigenetic code during developmental haematopoiesis, a large-scale reverse genetic screen was conducted targeting orthologues of 425 human chromatin factors in zebrafish. A set of chromatin regulators was identified that target different stages of primitive and definitive blood formation, including factors not previously implicated in haematopoiesis. We identified 15 factors that regulate development of primitive erythroid progenitors and 29 factors that regulate development of definitive haematopoietic stem and progenitor cells. These chromatin factors are associated with SWI/SNF and ISWI chromatin remodelling, SET1 methyltransferase, CBP-p300-HBO1-NuA4 acetyltransferase, HDAC-NuRD deacetylase, and Polycomb repressive complexes. Our work provides a comprehensive view of how specific chromatin factors and their associated complexes play a major role in the establishment of haematopoietic cells in vivo.

2012

Taniya, Takayuki, Susumu Tanaka, Yumi Yamaguchi-Kabata, Hideki Hanaoka, Chisato Yamasaki, Harutoshi Maekawa, Roberto A Barrero, Boris Lenhard, Milton W Datta, Mary Shimoyama, Roger Bumgarner, Ranajit Chakraborty, Ian Hopkinson, Libin Jia, Winston Hide, Charles Auffray, Shinsei Minoshima, Tadashi Imanishi, and Takashi Gojobori. 2012. “A Prioritization Analysis of Disease Association by Data-Mining of Functional Annotation of Human Genes.” Genomics 99(1):1-9. doi: 10.1016/j.ygeno.2011.10.002.

Complex diseases result from contributions of multiple genes that act in concert through pathways. Here we present a method to prioritize novel candidates of disease-susceptibility genes depending on the biological similarities to the known disease-related genes. The extent of disease-susceptibility of a gene is prioritized by analyzing seven features of human genes captured in H-InvDB. Taking rheumatoid arthritis (RA) and prostate cancer (PC) as two examples, we evaluated the efficiency of our method. Highly scored genes obtained included TNFSF12 and OSM as candidate disease genes for RA and PC, respectively. Subsequent characterization of these genes based upon an extensive literature survey reinforced the validity of these highly scored genes as possible disease-susceptibility genes. Our approach, Prioritization ANalysis of Disease Association (PANDA), is an efficient and cost-effective method to narrow down a large set of genes into smaller subsets that are most likely to be involved in the disease pathogenesis.