Publications

2009

Lal, Ashish, Francisco Navarro, Christopher A Maher, Laura E Maliszewski, Nan Yan, Elizabeth O’Day, Dipanjan Chowdhury, Derek M Dykxhoorn, Perry Tsai, Oliver Hofmann, Kevin G Becker, Myriam Gorospe, Winston Hide, and Judy Lieberman. 2009. “MiR-24 Inhibits Cell Proliferation by Targeting E2F2, MYC, and Other Cell-Cycle Genes via Binding to ‘seedless’ 3’UTR MicroRNA Recognition Elements.” Molecular Cell 35(5):610-25. doi: 10.1016/j.molcel.2009.08.020.

miR-24, upregulated during terminal differentiation of multiple lineages, inhibits cell-cycle progression. Antagonizing miR-24 restores postmitotic cell proliferation and enhances fibroblast proliferation, whereas overexpressing miR-24 increases the G1 compartment. The 248 mRNAs downregulated upon miR-24 overexpression are highly enriched for DNA repair and cell-cycle regulatory genes that form a direct interaction network with prominent nodes at genes that enhance (MYC, E2F2, CCNB1, and CDC2) or inhibit (p27Kip1 and VHL) cell-cycle progression. miR-24 directly regulates MYC and E2F2 and some genes that they transactivate. Enhanced proliferation from antagonizing miR-24 is abrogated by knocking down E2F2, but not MYC, and cell proliferation, inhibited by miR-24 overexpression, is rescued by miR-24-insensitive E2F2. Therefore, E2F2 is a critical miR-24 target. The E2F2 3'UTR lacks a predicted miR-24 recognition element. In fact, miR-24 regulates expression of E2F2, MYC, AURKB, CCNA2, CDC2, CDK4, and FEN1 by recognizing seedless but highly complementary sequences.

2008

Genome Information Integration Project and H-Invitational 2, Chisato Yamasaki, Katsuhiko Murakami, Yasuyuki Fujii, Yoshiharu Sato, Erimi Harada, Jun-ichi Takeda, Takayuki Taniya, Ryuichi Sakate, Shingo Kikugawa, Makoto Shimada, Motohiko Tanino, Kanako O Koyanagi, Roberto A Barrero, Craig Gough, Hong-Woo Chun, Takuya Habara, Hideki Hanaoka, Yosuke Hayakawa, Phillip B Hilton, Yayoi Kaneko, Masako Kanno, Yoshihiro Kawahara, Toshiyuki Kawamura, Akihiro Matsuya, Naoki Nagata, Kensaku Nishikata, Akiko Ogura Noda, Shin Nurimoto, Naomi Saichi, Hiroaki Sakai, Ryoko Sanbonmatsu, Rie Shiba, Mami Suzuki, Kazuhiko Takabayashi, Aiko Takahashi, Takuro Tamura, Masayuki Tanaka, Susumu Tanaka, Fusano Todokoro, Kaori Yamaguchi, Naoyuki Yamamoto, Toshihisa Okido, Jun Mashima, Aki Hashizume, Lihua Jin, Kyung-Bum Lee, Yi-Chueh Lin, Asami Nozaki, Katsunaga Sakai, Masahito Tada, Satoru Miyazaki, Takashi Makino, Hajime Ohyanagi, Naoki Osato, Nobuhiko Tanaka, Yoshiyuki Suzuki, Kazuho Ikeo, Naruya Saitou, Hideaki Sugawara, Claire O’Donovan, Tamara Kulikova, Eleanor Whitfield, Brian Halligan, Mary Shimoyama, Simon Twigger, Kei Yura, Kouichi Kimura, Tomohiro Yasuda, Tetsuo Nishikawa, Yutaka Akiyama, Chie Motono, Yuri Mukai, Hideki Nagasaki, Makiko Suwa, Paul Horton, Reiko Kikuno, Osamu Ohara, Doron Lancet, Eric Eveno, Esther Graudens, Sandrine Imbeaud, Marie Anne Debily, Yoshihide Hayashizaki, Clara Amid, Michael Han, Andreas Osanger, Toshinori Endo, Michael A Thomas, Mika Hirakawa, Wojciech Makalowski, Mitsuteru Nakao, Nam-Soon Kim, Hyang-Sook Yoo, Sandro J De Souza, Maria de Fatima Bonaldo, Yoshihito Niimura, Vladimir Kuryshev, Ingo Schupp, Stefan Wiemann, Matthew Bellgard, Masafumi Shionyu, Libin Jia, Danielle Thierry-Mieg, Jean Thierry-Mieg, Lukas Wagner, Qinghua Zhang, Mitiko Go, Shinsei Minoshima, Masafumi Ohtsubo, Kousuke Hanada, Peter Tonellato, Takao Isogai, Ji Zhang, Boris Lenhard, Sangsoo Kim, Zhu Chen, Ursula Hinz, Anne Estreicher, Kenta Nakai, Izabela Makalowska, Winston Hide, Nicola Tiffin, Laurens Wilming, Ranajit Chakraborty, Marcelo Bento Soares, Maria Luisa Chiusano, Yutaka Suzuki, Charles Auffray, Yumi Yamaguchi-Kabata, Takeshi Itoh, Teruyoshi Hishiki, Satoshi Fukuchi, Ken Nishikawa, Sumio Sugano, Nobuo Nomura, Yoshio Tateno, Tadashi Imanishi, and Takashi Gojobori. 2008. “The H-Invitational Database (H-InvDB), a Comprehensive Annotation Resource for Human Genes and Transcripts.” Nucleic Acids Research 36(Database issue):D793-9.

Here we report the new features and improvements in our latest release of the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/), a comprehensive annotation resource for human genes and transcripts. H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6. We mapped those human transcripts onto the human genome sequences (NCBI build 36.1) and determined 34 699 human gene clusters, which could define 34 057 (98.1%) protein-coding and 642 (1.9%) non-protein-coding loci; 858 (2.5%) transcribed loci overlapped with predicted pseudogenes. For all these transcripts and genes, we provide comprehensive annotation including gene structures, gene functions, alternative splicing variants, functional non-protein-coding RNAs, functional domains, predicted subcellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs, co-localization with orphan diseases, gene expression profiles, orthologous genes, protein-protein interactions (PPI) and annotation for gene families. The current H-InvDB annotation resources consist of two main views, Transcript view and Locus view, and eight sub-databases: the DiseaseInfo Viewer, H-ANGEL, the Clustering Viewer, G-integra, the TOPO Viewer, Evola, the PPI view and the Gene family/group.

Mathivanan, Suresh, Mukhtar Ahmed, Natalie G Ahn, Alexandre Hainard, Ramars Amanchy, Philip C Andrews, Joel S Bader, Brian M Balgley, Marcus Bantscheff, Keiryn L Bennett, Erik Björling, Blagoy Blagoev, Ron Bose, Samir K Brahmachari, Alma S Burlingame, Xosé R Bustelo, Gerard Cagney, Greg T Cantin, Helene L Cardasis, Julio E Celis, Raghothama Chaerkady, Feixia Chu, Philip A Cole, Catherine E Costello, Robert J Cotter, David Crockett, James P DeLany, Angelo M De Marzo, Leroi V DeSouza, Eric W Deutsch, Eric Dransfield, Gerard Drewes, Arnaud Droit, Michael J Dunn, Kojo Elenitoba-Johnson, Rob M Ewing, Jennifer Van Eyk, Vitor Faca, Jayson Falkner, Xiangming Fang, Catherine Fenselau, Daniel Figeys, Pierre Gagné, Cecilia Gelfi, Kris Gevaert, Jeffrey M Gimble, Florian Gnad, Renu Goel, Pavel Gromov, Samir M Hanash, William S Hancock, H C Harsha, Gerald Hart, Faith Hays, Fuchu He, Prashantha Hebbar, Kenny Helsens, Heiko Hermeking, Winston Hide, Karin Hjernø, Denis F Hochstrasser, Oliver Hofmann, David M Horn, Ralph H Hruban, Nieves Ibarrola, Peter James, Ole N Jensen, Pia Hønnerup Jensen, Peter Jung, Kumaran Kandasamy, Indu Kheterpal, Reiko F Kikuno, Ulrike Korf, Roman Körner, Bernhard Kuster, Min-Seok Kwon, Hyoung-Joo Lee, Young-Jin Lee, Michael Lefevre, Minna Lehvaslaiho, Pierre Lescuyer, Fredrik Levander, Megan S Lim, Christian Löbke, Joseph A Loo, Matthias Mann, Lennart Martens, Juan Martinez-Heredia, Mark McComb, James McRedmond, Alexander Mehrle, Rajasree Menon, Christine A Miller, Harald Mischak, Sujatha Mohan, Riaz Mohmood, Henrik Molina, Michael F Moran, James D Morgan, Robert Moritz, Martine Morzel, David C Muddiman, Anuradha Nalli, Daniel Navarro, Thomas A Neubert, Osamu Ohara, Rafael Oliva, Gilbert S Omenn, Masaaki Oyama, Young-Ki Paik, Kyla Pennington, Rainer Pepperkok, Balamurugan Periaswamy, Emanuel F Petricoin, Guy G Poirier, T S Keshava Prasad, Samuel O Purvine, Abdul Rahiman, Prasanna Ramachandran, Y L Ramachandra, Robert H Rice, Jens Rick, Ragna H Ronnholm, Johanna Salonen, Jean-Charles Sanchez, Thierry Sayd, Beerelli Seshi, Kripa Shankari, Shi Jun Sheng, Vivekananda Shetty, K Shivakumar, Richard J Simpson, Ravi Sirdeshmukh, K W Michael Siu, Jeffrey C Smith, Richard D Smith, David J States, Sumio Sugano, Matthew Sullivan, Giulio Superti-Furga, Maarit Takatalo, Visith Thongboonkerd, Jonathan C Trinidad, Mathias Uhlen, Joël Vandekerckhove, Julian Vasilescu, Timothy D Veenstra, José-Manuel Vidal-Taboada, Mauno Vihinen, Robin Wait, Xiaoyue Wang, Stefan Wiemann, Billy Wu, Tao Xu, John R Yates, Jun Zhong, Ming Zhou, Yunping Zhu, Petra Zurbig, and Akhilesh Pandey. 2008. “Human Proteinpedia Enables Sharing of Human Protein Data.” Nature Biotechnology 26(2):164-7. doi: 10.1038/nbt0208-164.

Chopera, Denis R, Zenda Woodman, Koleka Mlisana, Mandla Mlotshwa, Darren P Martin, Cathal Seoighe, Florette Treurnicht, Debra Assis de Rosa, Winston Hide, Salim Abdool Karim, Clive M Gray, Carolyn Williamson, and CAPRISA 002 Study Team. 2008. “Transmission of HIV-1 CTL Escape Variants Provides HLA-Mismatched Recipients With a Survival Advantage.” PLoS Pathogens 4(3):e1000033. doi: 10.1371/journal.ppat.1000033.

One of the most important genetic factors known to affect the rate of disease progression in HIV-infected individuals is the genotype at the Class I Human Leukocyte Antigen (HLA) locus, which determines the HIV peptides targeted by cytotoxic T-lymphocytes (CTLs). Individuals with HLA-B*57 or B*5801 alleles, for example, target functionally important parts of the Gag protein. Mutants that escape these CTL responses may have lower fitness than the wild-type and can be associated with slower disease progression. Transmission of the escape variant to individuals without these HLA alleles is associated with rapid reversion to wild-type. However, the question of whether infection with an escape mutant offers an advantage to newly infected hosts has not been addressed. Here we investigate the relationship between the genotypes of transmitted viruses and prognostic markers of disease progression and show that infection with HLA-B*57/B*5801 escape mutants is associated with lower viral load and higher CD4+ counts.

Hazelhurst, Scott, Winston Hide, Zsuzsanna Lipták, Ramon Nogueira, and Richard Starfield. 2008. “An Overview of the wcd EST Clustering Tool.” Bioinformatics (Oxford, England) 24(13):1542-6. doi: 10.1093/bioinformatics/btn203.

The wcd system is an open source tool for clustering expressed sequence tags (EST) and other DNA and RNA sequences. wcd allows efficient all-versus-all comparison of ESTs using either the d(2) distance function or edit distance, improving existing implementations of d(2). It supports merging, refinement and reclustering of clusters. It is 'drop in' compatible with the StackPack clustering package. wcd supports parallelization under both shared memory and cluster architectures. It is distributed with an EMBOSS wrapper allowing wcd to be installed as part of an EMBOSS installation (and so provided by a web server).

AVAILABILITY: wcd is distributed under a GPL licence and is available from http://code.google.com/p/wcdest.

SUPPLEMENTARY INFORMATION: Additional experimental results. The wcd manual, a companion paper describing underlying algorithms, and all datasets used for experimentation can also be found at www.bioinf.wits.ac.za/~scott/wcdsupp.html.

Howe, Doug, Maria Costanzo, Petra Fey, Takashi Gojobori, Linda Hannick, Winston Hide, David P Hill, Renate Kania, Mary Schaeffer, Susan St Pierre, Simon Twigger, Owen White, and Seung Yon Rhee. 2008. “Big Data: The Future of Biocuration.” Nature 455(7209):47-50. doi: 10.1038/455047a.

Kaur, Mandeep, Sebastian Schmeier, Cameron R MacPherson, Oliver Hofmann, Winston A Hide, Stephen Taylor, Nick Willcox, and Vladimir B Bajic. 2008. “Prioritizing Genes of Potential Relevance to Diseases Affected by Sex Hormones: An Example of Myasthenia Gravis.” BMC Genomics 9:481. doi: 10.1186/1471-2164-9-481.

BACKGROUND: About 5% of western populations are afflicted by autoimmune diseases many of which are affected by sex hormones. Autoimmune diseases are complex and involve many genes. Identifying these disease-associated genes contributes to development of more effective therapies. Also, association studies frequently imply genomic regions that contain disease-associated genes but fall short of pinpointing these genes. The identification of disease-associated genes has always been challenging and to date there is no universal and effective method developed.

RESULTS: We have developed a method to prioritize disease-associated genes for diseases affected strongly by sex hormones. Our method uses various types of information available for the genes, but no information that directly links genes with the disease. It generates a score for each of the considered genes and ranks genes based on that score. We illustrate our method on early-onset myasthenia gravis (MG) using genes potentially controlled by estrogen and localized in a genomic segment (which contains the MHC and surrounding region) strongly associated with MG. Based on the considered genomic segment 283 genes are ranked for their relevance to MG and responsiveness to estrogen. The top three ranked genes, HLA-G, TAP2 and HLA-DRB1, are implicated in autoimmune diseases, while TAP2 is associated with SNPs characteristic for MG. Within the top 35 prioritized genes our method identifies 90% of the 10 already known MG-associated genes from the considered region without using any information that directly links genes to MG. Among the top eight genes we identified HLA-G and TUBB as new candidates. We show that our ab-initio approach outperforms the other methods for prioritizing disease-associated genes.

CONCLUSION: We have developed a method to prioritize disease-associated genes under the potential control of sex hormones. We demonstrate the success of this method by prioritizing the genes localized in the MHC and surrounding region and evaluating the role of these genes as potential candidates for estrogen control as well as MG. We show that our method outperforms the other methods. The method has a potential to be adapted to prioritize genes relevant to other diseases.

Hofmann, Oliver, Otavia L Caballero, Brian J Stevenson, Yao-Tseng Chen, Tzeela Cohen, Ramon Chua, Christopher A Maher, Sumir Panji, Ulf Schaefer, Adele Kruger, Minna Lehvaslaiho, Piero Carninci, Yoshihide Hayashizaki, Victor Jongeneel, Andrew J G Simpson, Lloyd J Old, and Winston Hide. 2008. “Genome-Wide Analysis of Cancer/Testis Gene Expression.” Proceedings of the National Academy of Sciences of the United States of America 105(51):20422-7. doi: 10.1073/pnas.0810777105.

Cancer/Testis (CT) genes, normally expressed in germ line cells but also activated in a wide range of cancer types, often encode antigens that are immunogenic in cancer patients, and present potential for use as biomarkers and targets for immunotherapy. Using multiple in silico gene expression analysis technologies, including twice the number of expressed sequence tags used in previous studies, we have performed a comprehensive genome-wide survey of expression for a set of 153 previously described CT genes in normal and cancer expression libraries. We find that although they are generally highly expressed in testis, these genes exhibit heterogeneous gene expression profiles, allowing their classification into testis-restricted (39), testis/brain-restricted (14), and a testis-selective (85) group of genes that show additional expression in somatic tissues. The chromosomal distribution of these genes confirmed the previously observed dominance of X chromosome location, with CT-X genes being significantly more testis-restricted than non-X CT. Applying this core classification in a genome-wide survey we identified >30 CT candidate genes; 3 of them, PEPP-2, OTOA, and AKAP4, were confirmed as testis-restricted or testis-selective using RT-PCR, with variable expression frequencies observed in a panel of cancer cell lines. Our classification provides an objective ranking for potential CT genes, which is useful in guiding further identification and characterization of these potentially important diagnostic and therapeutic targets.

2007

Kruger, Adele, Oliver Hofmann, Piero Carninci, Yoshihide Hayashizaki, and Winston Hide. 2007. “Simplified Ontologies Allowing Comparison of Developmental Mammalian Gene Expression.” Genome Biology 8(10):R229.

Model organisms represent an important resource for understanding the fundamental aspects of mammalian biology. Mapping of biological phenomena between model organisms is complex and if it is to be meaningful, a simplified representation can be a powerful means for comparison. The Developmental eVOC ontologies presented here are simplified orthogonal ontologies describing the temporal and spatial distribution of developmental human and mouse anatomy. We demonstrate the ontologies by identifying genes showing a bias for developmental brain expression in human and mouse.

Stevenson, Brian J, Christian Iseli, Sumir Panji, Monique Zahn-Zabal, Winston Hide, Lloyd J Old, Andrew J Simpson, and Victor Jongeneel. 2007. “Rapid Evolution of Cancer/Testis Genes on the X Chromosome.” BMC Genomics 8:129.

BACKGROUND: Cancer/testis (CT) genes are normally expressed only in germ cells, but can be activated in the cancer state. This unusual property, together with the finding that many CT proteins elicit an antigenic response in cancer patients, has established a role for this class of genes as targets in immunotherapy regimes. Many families of CT genes have been identified in the human genome, but their biological function for the most part remains unclear. While it has been shown that some CT genes are under diversifying selection, this question has not been addressed before for the class as a whole.

RESULTS: To shed more light on this interesting group of genes, we exploited the generation of a draft chimpanzee (Pan troglodytes) genomic sequence to examine CT genes in an organism that is closely related to human, and generated a high-quality, manually curated set of human:chimpanzee CT gene alignments. We find that the chimpanzee genome contains homologues to most of the human CT families, and that the genes are located on the same chromosome and at a similar copy number to those in human. Comparison of putative human:chimpanzee orthologues indicates that CT genes located on chromosome X are diverging faster and are undergoing stronger diversifying selection than those on the autosomes or than a set of control genes on either chromosome X or autosomes.

CONCLUSION: Given their high level of diversifying selection, we suggest that CT genes are primarily responsible for the observed rapid evolution of protein-coding genes on the X chromosome.