Publications

2010

Rocca-Serra, Philippe, Marco Brandizi, Eamonn Maguire, Nataliya Sklyar, Chris Taylor, Kimberly Begley, Dawn Field, Stephen Harris, Winston Hide, Oliver Hofmann, Steffen Neumann, Peter Sterk, Weida Tong, and Susanna-Assunta Sansone. 2010. “ISA Software Suite: Supporting Standards-Compliant Experimental Annotation and Enabling Curation at the Community Level.” Bioinformatics (Oxford, England) 26(18):2354-6. doi: 10.1093/bioinformatics/btq415.

The first open source software suite for experimentalists and curators that (i) assists in the annotation and local management of experimental metadata from high-throughput studies employing one or a combination of omics and other technologies; (ii) empowers users to uptake community-defined checklists and ontologies; and (iii) facilitates submission to international public repositories.

AVAILABILITY AND IMPLEMENTATION: Software, documentation, case studies and implementations at http://www.isa-tools.org.

2009

Koscielny, Gautier, Vincent Le Texier, Chellappa Gopalakrishnan, Vasudev Kumanduri, Jean-Jack Riethoven, Francesco Nardone, Eleanor Stanley, Christine Fallsehr, Oliver Hofmann, Meelis Kull, Eoghan Harrington, Stéphanie Boué, Eduardo Eyras, Mireya Plass, Fabrice Lopez, William Ritchie, Virginie Moucadel, Takeshi Ara, Heike Pospisil, Alexander Herrmann, Jens G Reich, Roderic Guigó, Peer Bork, Magnus von Knebel Doeberitz, Jaak Vilo, Winston Hide, Rolf Apweiler, Thangavel Alphonse Thanaraj, and Daniel Gautheret. 2009. “ASTD: The Alternative Splicing and Transcript Diversity Database.” Genomics 93(3):213-20. doi: 10.1016/j.ygeno.2008.11.003.

The Alternative Splicing and Transcript Diversity database (ASTD) gives access to a vast collection of alternative transcripts that integrate transcription initiation, polyadenylation and splicing variant data. Alternative transcripts are derived from the mapping of transcribed sequences to the complete human, mouse and rat genomes using an extension of the computational pipeline developed for the ASD (Alternative Splicing Database) and ATD (Alternative Transcript Diversity) databases, which are now superseded by ASTD. For the human genome, ASTD identifies splicing variants, transcription initiation variants and polyadenylation variants in 68%, 68% and 62% of the gene set, respectively, consistent with current estimates for transcription variation. Users can access ASTD through a variety of browsing and query tools, including expression state-based queries for the identification of tissue-specific isoforms. Participating laboratories have experimentally validated a subset of ASTD-predicted alternative splice forms and alternative polyadenylation forms that were not previously reported. The ASTD database can be accessed at http://www.ebi.ac.uk/astd.

FANTOM Consortium, Harukazu Suzuki, Alistair R R Forrest, Erik van Nimwegen, Carsten O Daub, Piotr J Balwierz, Katharine M Irvine, Timo Lassmann, Timothy Ravasi, Yuki Hasegawa, Michiel J L de Hoon, Shintaro Katayama, Kate Schroder, Piero Carninci, Yasuhiro Tomaru, Mutsumi Kanamori-Katayama, Atsutaka Kubosaki, Altuna Akalin, Yoshinari Ando, Erik Arner, Maki Asada, Hiroshi Asahara, Timothy Bailey, Vladimir B Bajic, Denis Bauer, Anthony G Beckhouse, Nicolas Bertin, Johan Björkegren, Frank Brombacher, Erika Bulger, Alistair M Chalk, Joe Chiba, Nicole Cloonan, Adam Dawe, Josee Dostie, Pär G Engström, Magbubah Essack, Geoffrey J Faulkner, Lynn Fink, David Fredman, Ko Fujimori, Masaaki Furuno, Takashi Gojobori, Julian Gough, Sean M Grimmond, Mika Gustafsson, Megumi Hashimoto, Takehiro Hashimoto, Mariko Hatakeyama, Susanne Heinzel, Winston Hide, Oliver Hofmann, Michael Hörnquist, Lukasz Huminiecki, Kazuho Ikeo, Naoko Imamoto, Satoshi Inoue, Yusuke Inoue, Ryoko Ishihara, Takao Iwayanagi, Anders Jacobsen, Mandeep Kaur, Hideya Kawaji, Markus C Kerr, Ryuichiro Kimura, Syuhei Kimura, Yasumasa Kimura, Hiroaki Kitano, Hisashi Koga, Toshio Kojima, Shinji Kondo, Takeshi Konno, Anders Krogh, Adele Kruger, Ajit Kumar, Boris Lenhard, Andreas Lennartsson, Morten Lindow, Marina Lizio, Cameron Macpherson, Norihiro Maeda, Christopher A Maher, Monique Maqungo, Jessica Mar, Nicholas A Matigian, Hideo Matsuda, John S Mattick, Stuart Meier, Sei Miyamoto, Etsuko Miyamoto-Sato, Kazuhiko Nakabayashi, Yutaka Nakachi, Mika Nakano, Sanne Nygaard, Toshitsugu Okayama, Yasushi Okazaki, Haruka Okuda-Yabukami, Valerio Orlando, Jun Otomo, Mikhail Pachkov, Nikolai Petrovsky, Charles Plessy, John Quackenbush, Aleksandar Radovanovic, Michael Rehli, Rintaro Saito, Albin Sandelin, Sebastian Schmeier, Christian Schönbach, Ariel S Schwartz, Colin A Semple, Miho Sera, Jessica Severin, Katsuhiko Shirahige, Cas Simons, George St Laurent, Masanori Suzuki, Takahiro Suzuki, Matthew J Sweet, Ryan J Taft, Shizu Takeda, Yoichi Takenaka, Kai Tan, Martin S Taylor, Rohan D Teasdale, Jesper Tegnér, Sarah Teichmann, Eivind Valen, Claes Wahlestedt, Kazunori Waki, Andrew Waterhouse, Christine A Wells, Ole Winther, Linda Wu, Kazumi Yamaguchi, Hiroshi Yanagawa, Jun Yasuda, Mihaela Zavolan, David A Hume, Riken Omics Science Center, Takahiro Arakawa, Shiro Fukuda, Kengo Imamura, Chikatoshi Kai, Ai Kaiho, Tsugumi Kawashima, Chika Kawazu, Yayoi Kitazume, Miki Kojima, Hisashi Miura, Kayoko Murakami, Mitsuyoshi Murata, Noriko Ninomiya, Hiromi Nishiyori, Shohei Noma, Chihiro Ogawa, Takuma Sano, Christophe Simon, Michihira Tagami, Yukari Takahashi, Jun Kawai, and Yoshihide Hayashizaki. 2009. “The Transcriptional Network That Controls Growth Arrest and Differentiation in a Human Myeloid Leukemia Cell Line.” Nature Genetics 41(5):553-62. doi: 10.1038/ng.375.

Using deep sequencing (deepCAGE), the FANTOM4 study measured the genome-wide dynamics of transcription-start-site usage in the human monocytic cell line THP-1 throughout a time course of growth arrest and differentiation. Modeling the expression dynamics in terms of predicted cis-regulatory sites, we identified the key transcription regulators, their time-dependent activities and target genes. Systematic siRNA knockdown of 52 transcription factors confirmed the roles of individual factors in the regulatory network. Our results indicate that cellular states are constrained by complex networks involving both positive and negative regulatory interactions among substantial numbers of transcription factors and that no single transcription factor is both necessary and sufficient to drive the differentiation process.

Lal, Ashish, Francisco Navarro, Christopher A Maher, Laura E Maliszewski, Nan Yan, Elizabeth O’Day, Dipanjan Chowdhury, Derek M Dykxhoorn, Perry Tsai, Oliver Hofmann, Kevin G Becker, Myriam Gorospe, Winston Hide, and Judy Lieberman. 2009. “MiR-24 Inhibits Cell Proliferation by Targeting E2F2, MYC, and Other Cell-Cycle Genes via Binding to ‘seedless’ 3’UTR MicroRNA Recognition Elements.” Molecular Cell 35(5):610-25. doi: 10.1016/j.molcel.2009.08.020.

miR-24, upregulated during terminal differentiation of multiple lineages, inhibits cell-cycle progression. Antagonizing miR-24 restores postmitotic cell proliferation and enhances fibroblast proliferation, whereas overexpressing miR-24 increases the G1 compartment. The 248 mRNAs downregulated upon miR-24 overexpression are highly enriched for DNA repair and cell-cycle regulatory genes that form a direct interaction network with prominent nodes at genes that enhance (MYC, E2F2, CCNB1, and CDC2) or inhibit (p27Kip1 and VHL) cell-cycle progression. miR-24 directly regulates MYC and E2F2 and some genes that they transactivate. Enhanced proliferation from antagonizing miR-24 is abrogated by knocking down E2F2, but not MYC, and cell proliferation, inhibited by miR-24 overexpression, is rescued by miR-24-insensitive E2F2. Therefore, E2F2 is a critical miR-24 target. The E2F2 3'UTR lacks a predicted miR-24 recognition element. In fact, miR-24 regulates expression of E2F2, MYC, AURKB, CCNA2, CDC2, CDK4, and FEN1 by recognizing seedless but highly complementary sequences.

2008

Genome Information Integration Project and H-Invitational 2, Chisato Yamasaki, Katsuhiko Murakami, Yasuyuki Fujii, Yoshiharu Sato, Erimi Harada, Jun-ichi Takeda, Takayuki Taniya, Ryuichi Sakate, Shingo Kikugawa, Makoto Shimada, Motohiko Tanino, Kanako O Koyanagi, Roberto A Barrero, Craig Gough, Hong-Woo Chun, Takuya Habara, Hideki Hanaoka, Yosuke Hayakawa, Phillip B Hilton, Yayoi Kaneko, Masako Kanno, Yoshihiro Kawahara, Toshiyuki Kawamura, Akihiro Matsuya, Naoki Nagata, Kensaku Nishikata, Akiko Ogura Noda, Shin Nurimoto, Naomi Saichi, Hiroaki Sakai, Ryoko Sanbonmatsu, Rie Shiba, Mami Suzuki, Kazuhiko Takabayashi, Aiko Takahashi, Takuro Tamura, Masayuki Tanaka, Susumu Tanaka, Fusano Todokoro, Kaori Yamaguchi, Naoyuki Yamamoto, Toshihisa Okido, Jun Mashima, Aki Hashizume, Lihua Jin, Kyung-Bum Lee, Yi-Chueh Lin, Asami Nozaki, Katsunaga Sakai, Masahito Tada, Satoru Miyazaki, Takashi Makino, Hajime Ohyanagi, Naoki Osato, Nobuhiko Tanaka, Yoshiyuki Suzuki, Kazuho Ikeo, Naruya Saitou, Hideaki Sugawara, Claire O’Donovan, Tamara Kulikova, Eleanor Whitfield, Brian Halligan, Mary Shimoyama, Simon Twigger, Kei Yura, Kouichi Kimura, Tomohiro Yasuda, Tetsuo Nishikawa, Yutaka Akiyama, Chie Motono, Yuri Mukai, Hideki Nagasaki, Makiko Suwa, Paul Horton, Reiko Kikuno, Osamu Ohara, Doron Lancet, Eric Eveno, Esther Graudens, Sandrine Imbeaud, Marie Anne Debily, Yoshihide Hayashizaki, Clara Amid, Michael Han, Andreas Osanger, Toshinori Endo, Michael A Thomas, Mika Hirakawa, Wojciech Makalowski, Mitsuteru Nakao, Nam-Soon Kim, Hyang-Sook Yoo, Sandro J De Souza, Maria de Fatima Bonaldo, Yoshihito Niimura, Vladimir Kuryshev, Ingo Schupp, Stefan Wiemann, Matthew Bellgard, Masafumi Shionyu, Libin Jia, Danielle Thierry-Mieg, Jean Thierry-Mieg, Lukas Wagner, Qinghua Zhang, Mitiko Go, Shinsei Minoshima, Masafumi Ohtsubo, Kousuke Hanada, Peter Tonellato, Takao Isogai, Ji Zhang, Boris Lenhard, Sangsoo Kim, Zhu Chen, Ursula Hinz, Anne Estreicher, Kenta Nakai, Izabela Makalowska, Winston Hide, Nicola Tiffin, Laurens Wilming, Ranajit Chakraborty, Marcelo Bento Soares, Maria Luisa Chiusano, Yutaka Suzuki, Charles Auffray, Yumi Yamaguchi-Kabata, Takeshi Itoh, Teruyoshi Hishiki, Satoshi Fukuchi, Ken Nishikawa, Sumio Sugano, Nobuo Nomura, Yoshio Tateno, Tadashi Imanishi, and Takashi Gojobori. 2008. “The H-Invitational Database (H-InvDB), a Comprehensive Annotation Resource for Human Genes and Transcripts.” Nucleic Acids Research 36(Database issue):D793-9.

Here we report the new features and improvements in our latest release of the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/), a comprehensive annotation resource for human genes and transcripts. H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6. We mapped those human transcripts onto the human genome sequences (NCBI build 36.1) and determined 34 699 human gene clusters, which could define 34 057 (98.1%) protein-coding and 642 (1.9%) non-protein-coding loci; 858 (2.5%) transcribed loci overlapped with predicted pseudogenes. For all these transcripts and genes, we provide comprehensive annotation including gene structures, gene functions, alternative splicing variants, functional non-protein-coding RNAs, functional domains, predicted subcellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs, co-localization with orphan diseases, gene expression profiles, orthologous genes, protein-protein interactions (PPI) and annotation for gene families. The current H-InvDB annotation resources consist of two main views, Transcript view and Locus view, and eight sub-databases: the DiseaseInfo Viewer, H-ANGEL, the Clustering Viewer, G-integra, the TOPO Viewer, Evola, the PPI view and the Gene family/group.

Mathivanan, Suresh, Mukhtar Ahmed, Natalie G Ahn, Hainard Alexandre, Ramars Amanchy, Philip C Andrews, Joel S Bader, Brian M Balgley, Marcus Bantscheff, Keiryn L Bennett, Erik Björling, Blagoy Blagoev, Ron Bose, Samir K Brahmachari, Alma S Burlingame, Xosé R Bustelo, Gerard Cagney, Greg T Cantin, Helene L Cardasis, Julio E Celis, Raghothama Chaerkady, Feixia Chu, Philip A Cole, Catherine E Costello, Robert J Cotter, David Crockett, James P DeLany, Angelo M De Marzo, Leroi DeSouza V, Eric W Deutsch, Eric Dransfield, Gerard Drewes, Arnaud Droit, Michael J Dunn, Kojo Elenitoba-Johnson, Rob M Ewing, Jennifer Van Eyk, Vitor Faca, Jayson Falkner, Xiangming Fang, Catherine Fenselau, Daniel Figeys, Pierre Gagné, Cecilia Gelfi, Kris Gevaert, Jeffrey M Gimble, Florian Gnad, Renu Goel, Pavel Gromov, Samir M Hanash, William S Hancock, H C Harsha, Gerald Hart, Faith Hays, Fuchu He, Prashantha Hebbar, Kenny Helsens, Heiko Hermeking, Winston Hide, Karin Hjernø, Denis F Hochstrasser, Oliver Hofmann, David M Horn, Ralph H Hruban, Nieves Ibarrola, Peter James, Ole N Jensen, Pia Hønnerup Jensen, Peter Jung, Kumaran Kandasamy, Indu Kheterpal, Reiko F Kikuno, Ulrike Korf, Roman Körner, Bernhard Kuster, Min-Seok Kwon, Hyoung-Joo Lee, Young-Jin Lee, Michael Lefevre, Minna Lehvaslaiho, Pierre Lescuyer, Fredrik Levander, Megan S Lim, Christian Löbke, Joseph A Loo, Matthias Mann, Lennart Martens, Juan Martinez-Heredia, Mark McComb, James McRedmond, Alexander Mehrle, Rajasree Menon, Christine A Miller, Harald Mischak, Sujatha Mohan, Riaz Mohmood, Henrik Molina, Michael F Moran, James D Morgan, Robert Moritz, Martine Morzel, David C Muddiman, Anuradha Nalli, Daniel Navarro, Thomas A Neubert, Osamu Ohara, Rafael Oliva, Gilbert S Omenn, Masaaki Oyama, Young-Ki Paik, Kyla Pennington, Rainer Pepperkok, Balamurugan Periaswamy, Emanuel F Petricoin, Guy G Poirier, T S Keshava Prasad, Samuel O Purvine, Abdul Rahiman, Prasanna Ramachandran, Y L Ramachandra, Robert H Rice, Jens Rick, Ragna H Ronnholm, Johanna Salonen, Jean-Charles Sanchez, Thierry Sayd, Beerelli Seshi, Kripa Shankari, Shi Jun Sheng, Vivekananda Shetty, K Shivakumar, Richard J Simpson, Ravi Sirdeshmukh, K W Michael Siu, Jeffrey C Smith, Richard D Smith, David J States, Sumio Sugano, Matthew Sullivan, Giulio Superti-Furga, Maarit Takatalo, Visith Thongboonkerd, Jonathan C Trinidad, Mathias Uhlen, Joël Vandekerckhove, Julian Vasilescu, Timothy D Veenstra, José-Manuel Vidal-Taboada, Mauno Vihinen, Robin Wait, Xiaoyue Wang, Stefan Wiemann, Billy Wu, Tao Xu, John R Yates, Jun Zhong, Ming Zhou, Yunping Zhu, Petra Zurbig, and Akhilesh Pandey. 2008. “Human Proteinpedia Enables Sharing of Human Protein Data.” Nature Biotechnology 26(2):164-7. doi: 10.1038/nbt0208-164.

Chopera, Denis R, Zenda Woodman, Koleka Mlisana, Mandla Mlotshwa, Darren P Martin, Cathal Seoighe, Florette Treurnicht, Debra Assis de Rosa, Winston Hide, Salim Abdool Karim, Clive M Gray, Carolyn Williamson, and CAPRISA 002 Study Team. 2008. “Transmission of HIV-1 CTL Escape Variants Provides HLA-Mismatched Recipients With a Survival Advantage.” PLoS Pathogens 4(3):e1000033. doi: 10.1371/journal.ppat.1000033.

One of the most important genetic factors known to affect the rate of disease progression in HIV-infected individuals is the genotype at the Class I Human Leukocyte Antigen (HLA) locus, which determines the HIV peptides targeted by cytotoxic T-lymphocytes (CTLs). Individuals with HLA-B*57 or B*5801 alleles, for example, target functionally important parts of the Gag protein. Mutants that escape these CTL responses may have lower fitness than the wild-type and can be associated with slower disease progression. Transmission of the escape variant to individuals without these HLA alleles is associated with rapid reversion to wild-type. However, the question of whether infection with an escape mutant offers an advantage to newly infected hosts has not been addressed. Here we investigate the relationship between the genotypes of transmitted viruses and prognostic markers of disease progression and show that infection with HLA-B*57/B*5801 escape mutants is associated with lower viral load and higher CD4+ counts.

Hazelhurst, Scott, Winston Hide, Zsuzsanna Lipták, Ramon Nogueira, and Richard Starfield. 2008. “An Overview of the wcd EST Clustering Tool.” Bioinformatics (Oxford, England) 24(13):1542-6. doi: 10.1093/bioinformatics/btn203.

The wcd system is an open source tool for clustering expressed sequence tags (EST) and other DNA and RNA sequences. wcd allows efficient all-versus-all comparison of ESTs using either the d(2) distance function or edit distance, improving existing implementations of d(2). It supports merging, refinement and reclustering of clusters. It is 'drop in' compatible with the StackPack clustering package. wcd supports parallelization under both shared memory and cluster architectures. It is distributed with an EMBOSS wrapper allowing wcd to be installed as part of an EMBOSS installation (and so provided by a web server).

AVAILABILITY: wcd is distributed under a GPL licence and is available from http://code.google.com/p/wcdest.

SUPPLEMENTARY INFORMATION: Additional experimental results. The wcd manual, a companion paper describing underlying algorithms, and all datasets used for experimentation can also be found at www.bioinf.wits.ac.za/~scott/wcdsupp.html.
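
For readers unfamiliar with the d(2) distance named in the wcd abstract above, the following is a minimal illustrative sketch and not wcd's implementation. It assumes the common formulation in which d(2) is the sum of squared differences in k-word counts between two sequence windows, minimized over all window pairs. The function names, the default word length of 6 and the default window length of 100 are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def word_counts(seq, k):
    """Count every overlapping word (k-mer) of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2_words(x, y, k=6):
    """Sum of squared differences in k-word counts between x and y."""
    cx, cy = word_counts(x, k), word_counts(y, k)
    # A Counter returns 0 for words it has not seen.
    return sum((cx[w] - cy[w]) ** 2 for w in cx.keys() | cy.keys())


def windows(seq, length):
    """All substrings of the given length, or seq itself if it is shorter."""
    if len(seq) <= length:
        return [seq]
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def d2_windowed(x, y, k=6, window=100):
    """Minimum word-count distance over all pairs of windows (brute force)."""
    return min(
        d2_words(a, b, k)
        for a, b in product(windows(x, window), windows(y, window))
    )


if __name__ == "__main__":
    # Two short sequences that share the window "ACGT".
    print(d2_windowed("AAAACGTA", "TTTACGTT", k=2, window=4))
```

This brute-force version recomputes word counts for every window pair, so it is only suitable for short sequences; an efficient tool would update the counts incrementally as the windows slide.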

Howe, Doug, Maria Costanzo, Petra Fey, Takashi Gojobori, Linda Hannick, Winston Hide, David P Hill, Renate Kania, Mary Schaeffer, Susan St Pierre, Simon Twigger, Owen White, and Seung Yon Rhee. 2008. “Big Data: The Future of Biocuration.” Nature 455(7209):47-50. doi: 10.1038/455047a.

Kaur, Mandeep, Sebastian Schmeier, Cameron R MacPherson, Oliver Hofmann, Winston A Hide, Stephen Taylor, Nick Willcox, and Vladimir B Bajic. 2008. “Prioritizing Genes of Potential Relevance to Diseases Affected by Sex Hormones: An Example of Myasthenia Gravis.” BMC Genomics 9:481. doi: 10.1186/1471-2164-9-481.

BACKGROUND: About 5% of western populations are afflicted by autoimmune diseases, many of which are affected by sex hormones. Autoimmune diseases are complex and involve many genes. Identifying these disease-associated genes contributes to the development of more effective therapies. Association studies frequently implicate genomic regions that contain disease-associated genes but fall short of pinpointing these genes. Identifying disease-associated genes has always been challenging, and to date no universal and effective method has been developed.

RESULTS: We have developed a method to prioritize disease-associated genes for diseases affected strongly by sex hormones. Our method uses various types of information available for the genes, but no information that directly links genes with the disease. It generates a score for each of the considered genes and ranks genes based on that score. We illustrate our method on early-onset myasthenia gravis (MG) using genes potentially controlled by estrogen and localized in a genomic segment (which contains the MHC and surrounding region) strongly associated with MG. Based on the considered genomic segment 283 genes are ranked for their relevance to MG and responsiveness to estrogen. The top three ranked genes, HLA-G, TAP2 and HLA-DRB1, are implicated in autoimmune diseases, while TAP2 is associated with SNPs characteristic for MG. Within the top 35 prioritized genes our method identifies 90% of the 10 already known MG-associated genes from the considered region without using any information that directly links genes to MG. Among the top eight genes we identified HLA-G and TUBB as new candidates. We show that our ab-initio approach outperforms the other methods for prioritizing disease-associated genes.

CONCLUSION: We have developed a method to prioritize disease-associated genes under the potential control of sex hormones. We demonstrate the success of this method by prioritizing the genes localized in the MHC and surrounding region and evaluating the role of these genes as potential candidates for estrogen control as well as MG. We show that our method outperforms the other methods. The method has a potential to be adapted to prioritize genes relevant to other diseases.