Publications

2008

Hofmann, Oliver, Otavia L Caballero, Brian J Stevenson, Yao-Tseng Chen, Tzeela Cohen, Ramon Chua, Christopher A Maher, Sumir Panji, Ulf Schaefer, Adele Kruger, Minna Lehvaslaiho, Piero Carninci, Yoshihide Hayashizaki, Victor Jongeneel, Andrew J G Simpson, Lloyd J Old, and Winston Hide. 2008. “Genome-Wide Analysis of Cancer/Testis Gene Expression.” Proceedings of the National Academy of Sciences of the United States of America 105(51):20422-7. doi: 10.1073/pnas.0810777105.

Cancer/Testis (CT) genes, normally expressed in germ line cells but also activated in a wide range of cancer types, often encode antigens that are immunogenic in cancer patients, and present potential for use as biomarkers and targets for immunotherapy. Using multiple in silico gene expression analysis technologies, including twice the number of expressed sequence tags used in previous studies, we have performed a comprehensive genome-wide survey of expression for a set of 153 previously described CT genes in normal and cancer expression libraries. We find that although they are generally highly expressed in testis, these genes exhibit heterogeneous gene expression profiles, allowing their classification into testis-restricted (39), testis/brain-restricted (14), and a testis-selective (85) group of genes that show additional expression in somatic tissues. The chromosomal distribution of these genes confirmed the previously observed dominance of X chromosome location, with CT-X genes being significantly more testis-restricted than non-X CT. Applying this core classification in a genome-wide survey we identified >30 CT candidate genes; 3 of them, PEPP-2, OTOA, and AKAP4, were confirmed as testis-restricted or testis-selective using RT-PCR, with variable expression frequencies observed in a panel of cancer cell lines. Our classification provides an objective ranking for potential CT genes, which is useful in guiding further identification and characterization of these potentially important diagnostic and therapeutic targets.

2007

Kruger, Adele, Oliver Hofmann, Piero Carninci, Yoshihide Hayashizaki, and Winston Hide. 2007. “Simplified Ontologies Allowing Comparison of Developmental Mammalian Gene Expression.” Genome Biology 8(10):R229.

Model organisms represent an important resource for understanding the fundamental aspects of mammalian biology. Mapping of biological phenomena between model organisms is complex and if it is to be meaningful, a simplified representation can be a powerful means for comparison. The Developmental eVOC ontologies presented here are simplified orthogonal ontologies describing the temporal and spatial distribution of developmental human and mouse anatomy. We demonstrate the ontologies by identifying genes showing a bias for developmental brain expression in human and mouse.

Stevenson, Brian J, Christian Iseli, Sumir Panji, Monique Zahn-Zabal, Winston Hide, Lloyd J Old, Andrew J Simpson, and Victor Jongeneel. 2007. “Rapid Evolution of Cancer/Testis Genes on the X Chromosome.” BMC Genomics 8:129.

BACKGROUND: Cancer/testis (CT) genes are normally expressed only in germ cells, but can be activated in the cancer state. This unusual property, together with the finding that many CT proteins elicit an antigenic response in cancer patients, has established a role for this class of genes as targets in immunotherapy regimes. Many families of CT genes have been identified in the human genome, but their biological function for the most part remains unclear. While it has been shown that some CT genes are under diversifying selection, this question has not been addressed before for the class as a whole.

RESULTS: To shed more light on this interesting group of genes, we exploited the generation of a draft chimpanzee (Pan troglodytes) genomic sequence to examine CT genes in an organism that is closely related to human, and generated a high-quality, manually curated set of human:chimpanzee CT gene alignments. We find that the chimpanzee genome contains homologues to most of the human CT families, and that the genes are located on the same chromosome and at a similar copy number to those in human. Comparison of putative human:chimpanzee orthologues indicates that CT genes located on chromosome X are diverging faster and are undergoing stronger diversifying selection than those on the autosomes or than a set of control genes on either chromosome X or autosomes.

CONCLUSION: Given their high level of diversifying selection, we suggest that CT genes are primarily responsible for the observed rapid evolution of protein-coding genes on the X chromosome.

Seoighe, Cathal, Farahnaz Ketwaroo, Visva Pillay, Konrad Scheffler, Natasha Wood, Rodger Duffet, Marketa Zvelebil, Neil Martinson, James McIntyre, Lynn Morris, and Winston Hide. 2007. “A Model of Directional Selection Applied to the Evolution of Drug Resistance in HIV-1.” Molecular Biology and Evolution 24(4):1025-31.

Understanding how pathogens acquire resistance to drugs is important for the design of treatment strategies, particularly for rapidly evolving viruses such as HIV-1. Drug treatment can exert strong selective pressures and sites within targeted genes that confer resistance frequently evolve far more rapidly than the neutral rate. Rapid evolution at sites that confer resistance to drugs can be used to help elucidate the mechanisms of evolution of drug resistance and to discover or corroborate novel resistance mutations. We have implemented standard maximum likelihood methods that are used to detect diversifying selection and adapted them for use with serially sampled reverse transcriptase (RT) coding sequences isolated from a group of 300 HIV-1 subtype C-infected women before and after single-dose nevirapine (sdNVP) to prevent mother-to-child transmission. We have also extended the standard models of codon evolution for application to the detection of directional selection. Through simulation, we show that the directional selection model can provide a substantial improvement in sensitivity over models of diversifying selection. Five of the sites within the RT gene that are known to harbor mutations that confer resistance to nevirapine (NVP) strongly supported the directional selection model. There was no evidence that other mutations that are known to confer NVP resistance were selected in this cohort. The directional selection model, applied to serially sampled sequences, also had more power than the diversifying selection model to detect selection resulting from factors other than drug resistance. Because inference of selection from serial samples is unlikely to be adversely affected by recombination, the methods we describe may have general applicability to the analysis of positive selection affecting recombining coding sequences when serially sampled data are available.

Lombard, Zane, Nicki Tiffin, Oliver Hofmann, Vladimir B Bajic, Winston Hide, and Michele Ramsay. 2007. “Computational Selection and Prioritization of Candidate Genes for Fetal Alcohol Syndrome.” BMC Genomics 8:389.

BACKGROUND: Fetal alcohol syndrome (FAS) is a serious global health problem and is observed at high frequencies in certain South African communities. Although in utero alcohol exposure is the primary trigger, there is evidence for genetic and other susceptibility factors in FAS development. No genome-wide association or linkage studies have been performed for FAS, making computational selection and prioritization of candidate disease genes an attractive approach.

RESULTS: A total of 10,174 candidate genes were initially selected from the whole genome using a previously described method, which selects candidate genes according to their expression in disease-affected tissues. Candidates were then prioritized for experimental investigation by assessing criteria pertinent to FAS and applying binary filtering. In total, 29 criteria were assessed by mining various database sources to populate criteria-specific gene lists. Candidate genes were prioritized using a binary system that assessed the criteria gene lists against the candidate list, and candidate genes were scored accordingly. A group of 87 genes was prioritized as candidates for future experimental validation. The validity of the binary prioritization method was assessed by investigating the protein-protein interactions, functional enrichment and common promoter element binding sites of the top-ranked genes.

CONCLUSION: This analysis highlighted a list of strong candidate genes from the TGF-beta, MAPK and Hedgehog signalling pathways, which are all integral to fetal development and potential targets for alcohol's teratogenic effect. We conclude that this novel bioinformatics approach effectively prioritizes credible candidate genes for further experimental analysis.

Schwegmann, Anita, Reto Guler, Antony J Cutler, Berenice Arendse, William G C Horsnell, Alexandra Flemming, Andreas H Kottmann, Gregory Ryan, Winston Hide, Michael Leitges, Cathal Seoighe, and Frank Brombacher. 2007. “Protein Kinase C Delta Is Essential for Optimal Macrophage-Mediated Phagosomal Containment of Listeria Monocytogenes.” Proceedings of the National Academy of Sciences of the United States of America 104(41):16251-6.

Activation of macrophages and subsequent "killing" effector functions against infectious pathogens are essential for the establishment of protective immunity. NF-IL6 is a transcription factor downstream of IFN-gamma and TNF in the macrophage activation pathway required for bacterial killing. Comparison of microarray expression profiles of Listeria monocytogenes (LM)-infected macrophages from WT and NF-IL6-deficient mice enabled us to identify candidate genes downstream of NF-IL6 involved in the unknown pathways of LM killing independent of reactive oxygen intermediates and reactive nitrogen intermediates. One differentially expressed gene, PKCdelta, had higher mRNA levels in the LM-infected NF-IL6-deficient macrophages as compared with WT. To define the role of PKCdelta during listeriosis, we infected PKCdelta-deficient mice with LM. PKCdelta-deficient mice were highly susceptible to LM infection with increased bacterial burden and enhanced histopathology despite enhanced NF-IL6 mRNA expression. Subsequent studies in PKCdelta-deficient macrophages demonstrated that, despite elevated levels of proinflammatory cytokines and NO production, increased escape of LM from the phagosome into the cytoplasm and uncontrolled bacterial growth occurred. Taken together these data identified PKCdelta as a critical factor for confinement of LM within macrophage phagosomes.

2006

Mehrle, Alexander, Heiko Rosenfelder, Ingo Schupp, Coral del Val, Dorit Arlt, Florian Hahne, Stephanie Bechtel, Jeremy Simpson, Oliver Hofmann, Winston Hide, Karl-Heinz Glatting, Wolfgang Huber, Rainer Pepperkok, Annemarie Poustka, and Stefan Wiemann. 2006. “The LIFEdb Database in 2006.” Nucleic Acids Research 34(Database issue):D415-8.

LIFEdb (http://www.LIFEdb.de) integrates data from large-scale functional genomics assays and manual cDNA annotation with bioinformatics gene expression and protein analysis. New features of LIFEdb include (i) an updated user interface with enhanced query capabilities, (ii) a configurable output table and the option to download search results in XML, (iii) the integration of data from cell-based screening assays addressing the influence of protein overexpression on cell proliferation and (iv) the display of the relative expression ('Electronic Northern') of the genes under investigation using curated gene expression ontology information. LIFEdb enables researchers to systematically select and characterize genes and proteins of interest, and presents data and information via its user-friendly web-based interface.

Bajic, Vladimir B, Sin Lam Tan, Alan Christoffels, Christian Schönbach, Leonard Lipovich, Liang Yang, Oliver Hofmann, Adele Kruger, Winston Hide, Chikatoshi Kai, Jun Kawai, David A Hume, Piero Carninci, and Yoshihide Hayashizaki. 2006. “Mice and Men: Their Promoter Properties.” PLoS Genetics 2(4):e54.

Using the two largest collections of Mus musculus and Homo sapiens transcription start sites (TSSs) determined based on CAGE tags, ditags, full-length cDNAs, and other transcript data, we describe the compositional landscape surrounding TSSs with the aim of gaining better insight into the properties of mammalian promoters. We classified TSSs into four types based on compositional properties of regions immediately surrounding them. These properties highlighted distinctive features in the extended core promoters that helped us delineate boundaries of the transcription initiation domain space for both species. The TSS types were analyzed for associations with initiating dinucleotides, CpG islands, TATA boxes, and an extensive collection of statistically significant cis-elements in mouse and human. We found that different TSS types show preferences for different sets of initiating dinucleotides and cis-elements. Through Gene Ontology and eVOC categories and tissue expression libraries we linked TSS characteristics to expression. Moreover, we show a link of TSS characteristics to very specific genomic organization in an example of immune-response-related genes (GO:0006955). Our results shed light on the global properties of the two transcriptomes not revealed before and therefore provide the framework for better understanding of the transcriptional mechanisms in the two species, as well as a framework for development of new and more efficient promoter- and gene-finding tools.

Tiffin, Nicki, Euan Adie, Frances Turner, Han G Brunner, Marc A van Driel, Martin Oti, Nuria Lopez-Bigas, Christos Ouzounis, Carolina Perez-Iratxeta, Miguel A Andrade-Navarro, Adebowale Adeyemo, Mary Elizabeth Patti, Colin A M Semple, and Winston Hide. 2006. “Computational Disease Gene Identification: A Concert of Methods Prioritizes Type 2 Diabetes and Obesity Candidate Genes.” Nucleic Acids Research 34(10):3067-81.

Genome-wide experimental methods to identify disease genes, such as linkage analysis and association studies, generate increasingly large candidate gene sets for which comprehensive empirical analysis is impractical. Computational methods employ data from a variety of sources to identify the most likely candidate disease genes from these gene sets. Here, we review seven independent computational disease gene prioritization methods, and then apply them in concert to the analysis of 9556 positional candidate genes for type 2 diabetes (T2D) and the related trait obesity. We generate and analyse a list of nine primary candidate genes for T2D and five for obesity. Two genes, LPL and BCKDHA, are common to these two sets. We also present a set of secondary candidates for T2D (94 genes) and for obesity (116 genes) with 58 genes in common to both diseases.

2005

Aksoy, Serap, Matt Berriman, Neil Hall, Masahira Hattori, Winston Hide, and Michael J Lehane. 2005. “A Case for a Glossina Genome Project.” Trends in Parasitology 21(3):107-11.

Given the medical and agricultural significance of Glossina, knowledge of the genomic aspects of the vector and of vector-pathogen interactions is a high priority. In preparation for a full genome sequence initiative, an extensive set of expressed sequence tags (ESTs) has been generated from tissue-specific normalized libraries. In addition, bacterial artificial chromosome (BAC) libraries are being constructed, and information on the genome structure and size from different species has been obtained. An international consortium is now in place to advance efforts toward a full genome project.