Publications

2015

Karp, Peter D, Bonnie Berger, Diane Kovats, Thomas Lengauer, Michal Linial, Pardis Sabeti, Winston Hide, and Burkhard Rost. 2015. “ISCB Ebola Award for Important Future Research on the Computational Biology of Ebola Virus.” F1000Research 4:12. doi: 10.12688/f1000research.6038.1.

Speed is of the essence in combating Ebola; thus, computational approaches should form a significant component of Ebola research. As for the development of any modern drug, computational biology is uniquely positioned to contribute through comparative analysis of the genome sequences of Ebola strains as well as 3-D protein modeling. Other computational approaches to Ebola may include large-scale docking studies of Ebola proteins with human proteins and with small-molecule libraries, computational modeling of the spread of the virus, computational mining of the Ebola literature, and creation of a curated Ebola database. Taken together, such computational efforts could significantly accelerate traditional scientific approaches. In recognition of the need for important and immediate solutions from the field of computational biology against Ebola, the International Society for Computational Biology (ISCB) announces a prize for an important computational advance in fighting the Ebola virus. ISCB will confer the ISCB Fight against Ebola Award, along with a prize of US$2,000, at its July 2016 annual meeting (ISCB Intelligent Systems for Molecular Biology (ISMB) 2016, Orlando, Florida).

Lizio, Marina, Jayson Harshbarger, Hisashi Shimoji, Jessica Severin, Takeya Kasukawa, Serkan Sahin, Imad Abugessaisa, Shiro Fukuda, Fumi Hori, Sachi Ishikawa-Kato, Christopher J Mungall, Erik Arner, Kenneth Baillie, Nicolas Bertin, Hidemasa Bono, Michiel de Hoon, Alexander D Diehl, Emmanuel Dimont, Tom C Freeman, Kaori Fujieda, Winston Hide, Rajaram Kaliyaperumal, Toshiaki Katayama, Timo Lassmann, Terrence F Meehan, Koro Nishikata, Hiromasa Ono, Michael Rehli, Albin Sandelin, Erik A Schultes, Peter A C ’t Hoen, Zuotian Tatum, Mark Thompson, Tetsuro Toyoda, Derek W Wright, Carsten O Daub, Masayoshi Itoh, Piero Carninci, Yoshihide Hayashizaki, Alistair R R Forrest, Hideya Kawaji, and FANTOM Consortium. 2015. “Gateways to the FANTOM5 Promoter Level Mammalian Expression Atlas.” Genome Biology 16(1):22. doi: 10.1186/s13059-014-0560-6.

The FANTOM5 project investigates transcription initiation activities in more than 1,000 human and mouse primary cells, cell lines and tissues using CAGE. Based on manual curation of sample information and development of an ontology for sample classification, we assemble the resulting data into a centralized data resource (http://fantom.gsc.riken.jp/5/). This resource contains web-based tools and data-access points for the research community to search and extract data related to samples, genes, promoter activities, transcription factors and enhancers across the FANTOM5 atlas.

Dimont, Emmanuel, Jiantao Shi, Rory Kirchner, and Winston Hide. 2015. “edgeRun: An R Package for Sensitive, Functionally Relevant Differential Expression Discovery Using an Unconditional Exact Test.” Bioinformatics (Oxford, England) 31(15):2589-90. doi: 10.1093/bioinformatics/btv209.

Next-generation sequencing platforms for measuring digital expression such as RNA-Seq are displacing traditional microarray-based methods in biological experiments. The detection of differentially expressed genes between groups of biological conditions has led to the development of numerous bioinformatics tools, but so far, few exploit the expanded dynamic range afforded by the new technologies. We present edgeRun, an R package that implements an unconditional exact test that is a more powerful version of the exact test in edgeR. This increase in power is especially pronounced for experiments with as few as two replicates per condition, for genes with low total expression and with large biological coefficient of variation. In comparison with a panel of other tools, edgeRun consistently captures functionally similar differentially expressed genes.

Karp, Peter D, Bonnie Berger, Diane Kovats, Thomas Lengauer, Michal Linial, Pardis Sabeti, Winston Hide, and Burkhard Rost. 2015. “Message from the ISCB: ISCB Ebola Award for Important Future Research on the Computational Biology of Ebola Virus.” Bioinformatics (Oxford, England) 31(4):616-7. doi: 10.1093/bioinformatics/btv019.

Speed is of the essence in combating Ebola; thus, computational approaches should form a significant component of Ebola research. As for the development of any modern drug, computational biology is uniquely positioned to contribute through comparative analysis of the genome sequences of Ebola strains and three-dimensional protein modeling. Other computational approaches to Ebola may include large-scale docking studies of Ebola proteins with human proteins and with small-molecule libraries, computational modeling of the spread of the virus, computational mining of the Ebola literature and creation of a curated Ebola database. Taken together, such computational efforts could significantly accelerate traditional scientific approaches. In recognition of the need for important and immediate solutions from the field of computational biology against Ebola, the International Society for Computational Biology (ISCB) announces a prize for an important computational advance in fighting the Ebola virus. ISCB will confer the ISCB Fight against Ebola Award, along with a prize of US$2000, at its July 2016 annual meeting (ISCB Intelligent Systems for Molecular Biology 2016, Orlando, FL).

CONTACT: dkovats@iscb.org or rost@in.tum.de.

Nishi, Yuichi, Xiaoxiao Zhang, Jieun Jeong, Kevin A Peterson, Anastasia Vedenko, Martha L Bulyk, Winston A Hide, and Andrew P McMahon. 2015. “A Direct Fate Exclusion Mechanism by Sonic Hedgehog-Regulated Transcriptional Repressors.” Development (Cambridge, England) 142(19):3286-93. doi: 10.1242/dev.124636.

Sonic hedgehog (Shh) signaling patterns the vertebrate spinal cord by activating a group of transcriptional repressors in distinct neural progenitors of somatic motor neuron and interneuron subtypes. To identify the action of this network, we performed a genome-wide analysis of the regulatory actions of three key ventral determinants in mammalian neural tube patterning: Nkx2.2, Nkx6.1 and Olig2. Previous studies have demonstrated that each factor acts predominantly as a transcriptional repressor, at least in part, to inhibit alternative progenitor fate choices. Here, we reveal broad and direct repression of multiple alternative fates as a general mechanism of repressor action. Additionally, the repressor network targets multiple Shh signaling components providing negative feedback to ongoing Shh signaling. Analysis of chromatin organization around Nkx2.2-, Nkx6.1- and Olig2-bound regions, together with co-analysis of engagement of the transcriptional activator Sox2, indicate that repressors bind to, and probably modulate the action of, neural enhancers. Together, the data suggest a model for neural progenitor specification downstream of Shh signaling, in which Nkx2.2 and Olig2 direct repression of alternative neural progenitor fate determinants, an action augmented by the overlapping activity of Nkx6.1 in each cell type. Integration of repressor and activator inputs, notably activator inputs mediated by Sox2, is probably a key mechanism in achieving cell type-specific transcriptional outcomes in mammalian neural progenitor fate specification.

Singh, Nakul, Arun D Singh, and Winston Hide. 2015. “Inferring an Evolutionary Tree of Uveal Melanoma From Genomic Copy Number Aberrations.” Investigative Ophthalmology & Visual Science 56(11):6801-9. doi: 10.1167/iovs.15-16822.

PURPOSE: To study the genomic evolution of primary uveal melanoma.

METHODS: Primary uveal melanoma genomic DNA was assayed on the Illumina Human660W-Quad v1.0 DNA Analysis BeadChip. Raw signal intensity data were quantile normalized to estimate copy number aberration with the Genome Alteration Print algorithm. Distance between samples was calculated as the Manhattan distance between the copy number profiles of the tumors. From the distance matrix, a phylogenetic network (evolutionary relationship inference) was estimated using SplitsTree4.

RESULTS: Of the 57 tumors, one (1.8%) was discarded because of a failed assay, and seven (12.3%) were revealed to be mixtures of several cell populations that could not be resolved. Three clades of tumor were identified (A [59.2%], B [32.7%], and C [6.1%]), each following a distinct evolutionary path and each associated with metastatic status (P = 0.01). One tumor (2.0%) did not fit into any clade. From a normal diploid melanocyte, a few tumors (clade C) lose a large portion of chromosome 6q, but do not develop any mutations on 8q. In an alternate path, the vast majority of tumors (clade A and clade B [91.9%]) gain a copy of the telomeric half of 8q. A majority of these tumors (clade A) subsequently lose a copy of chromosome 3, as well as gain the centromeric half of 8q. The other tumors (clade B) gain copies of 6p, as well as regions on 11p and 22q.

CONCLUSIONS: Our data suggest that there is little overlap in the subtypes of uveal melanoma after divergence (identified as clades A and B) and that these distinct subtypes are not likely to cross over or transform from one major clade to another.
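The distance step named in the METHODS of this abstract (Manhattan distance between tumor copy number profiles, collected into a distance matrix) can be illustrated with a minimal sketch. The sample names, the five-segment profiles, and the helper functions `manhattan` and `distance_matrix` are hypothetical and chosen only for illustration; this is not the authors' code, and the normalization, Genome Alteration Print, and SplitsTree4 steps are not reproduced here.

```python
from itertools import combinations


def manhattan(p, q):
    """Sum of absolute per-segment copy-number differences between two profiles."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same segments")
    return sum(abs(a - b) for a, b in zip(p, q))


def distance_matrix(profiles):
    """Symmetric matrix of pairwise Manhattan distances, keyed by sample name."""
    names = sorted(profiles)
    dist = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = manhattan(profiles[a], profiles[b])
        dist[a][b] = dist[b][a] = d
    return dist


# Hypothetical copy-number calls over five genomic segments (2 = diploid).
profiles = {
    "tumor_1": [2, 2, 3, 1, 2],
    "tumor_2": [2, 2, 3, 2, 2],
    "tumor_3": [2, 1, 2, 2, 3],
}

matrix = distance_matrix(profiles)
for name in sorted(matrix):
    print(name, [matrix[name][other] for other in sorted(matrix)])
```

A matrix of this shape is the kind of input that phylogenetic network tools take; tumors with similar aberrations end up close together, which is what lets clades emerge.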

2014

Zook, Justin M, Brad Chapman, Jason Wang, David Mittelman, Oliver Hofmann, Winston Hide, and Marc Salit. 2014. “Integrating Human Sequence Data Sets Provides a Resource of Benchmark SNP and Indel Genotype Calls.” Nature Biotechnology 32(3):246-51. doi: 10.1038/nbt.2835.

Clinical adoption of human genome sequencing requires methods that output genotypes with known accuracy at millions or billions of positions across a genome. Because of substantial discordance among calls made by existing sequencing methods and algorithms, there is a need for a highly accurate set of genotypes across a genome that can be used as a benchmark. Here we present methods to make high-confidence, single-nucleotide polymorphism (SNP), indel and homozygous reference genotype calls for NA12878, the pilot genome for the Genome in a Bottle Consortium. We minimize bias toward any method by integrating and arbitrating between 14 data sets from five sequencing technologies, seven read mappers and three variant callers. We identify regions for which no confident genotype call could be made, and classify them into different categories based on reasons for uncertainty. Our genotype calls are publicly available on the Genome Comparison and Analytic Testing website to enable real-time benchmarking of any method.

Liu, Jing, Michaela Krautzberger, Shannan H Sui, Oliver M Hofmann, Ying Chen, Manfred Baetscher, Ivica Grgic, Sanjeev Kumar, Benjamin D Humphreys, Winston A Hide, and Andrew P McMahon. 2014. “Cell-Specific Translational Profiling in Acute Kidney Injury.” The Journal of Clinical Investigation 124(3):1242-54. doi: 10.1172/JCI72126.

Acute kidney injury (AKI) promotes an abrupt loss of kidney function that results in substantial morbidity and mortality. Considerable effort has gone toward identification of diagnostic biomarkers and analysis of AKI-associated molecular events; however, most studies have adopted organ-wide approaches and have not elucidated the interplay among different cell types involved in AKI pathophysiology. To better characterize AKI-associated molecular and cellular events, we developed a mouse line that enables the identification of translational profiles in specific cell types. This strategy relies on CRE recombinase-dependent activation of an EGFP-tagged L10a ribosomal protein subunit, which allows translating ribosome affinity purification (TRAP) of mRNA populations in CRE-expressing cells. Combining this mouse line with cell type-specific CRE-driver lines, we identified distinct cellular responses in an ischemia reperfusion injury (IRI) model of AKI. Twenty-four hours following IRI, distinct translational signatures were identified in the nephron, kidney interstitial cell populations, vascular endothelium, and macrophages/monocytes. Furthermore, TRAP captured known IRI-associated markers, validating this approach. Biological function annotation, canonical pathway analysis, and in situ analysis of identified response genes provided insight into cell-specific injury signatures. Our study provides a deep, cell-based view of early injury-associated molecular events in AKI and documents a versatile, genetic tool to monitor cell-specific and temporal-specific biological processes in disease modeling.

Li, Jonathan Z, Brad Chapman, Patrick Charlebois, Oliver Hofmann, Brian Weiner, Alyssa J Porter, Reshmi Samuel, Saran Vardhanabhuti, Lu Zheng, Joseph Eron, Babafemi Taiwo, Michael C Zody, Matthew R Henn, Daniel R Kuritzkes, Winston Hide, ACTG A5262 Study Team, Cara C Wilson, Baiba I Berzins, Edward P Acosta, Barbara Bastow, Peter S Kim, Sarah W Read, Jennifer Janik, Debra S Meres, Michael M Lederman, Lori Mong-Kryspin, Karl E Shaw, Louis G Zimmerman, Randi Leavitt, Guy De La Rosa, and Amy Jennings. 2014. “Comparison of Illumina and 454 Deep Sequencing in Participants Failing Raltegravir-Based Antiretroviral Therapy.” PLoS One 9(3):e90485. doi: 10.1371/journal.pone.0090485.

BACKGROUND: The impact of raltegravir-resistant HIV-1 minority variants (MVs) on raltegravir treatment failure is unknown. Illumina sequencing offers greater throughput than 454, but sequence analysis tools for viral sequencing are needed. We evaluated Illumina and 454 for the detection of HIV-1 raltegravir-resistant MVs.

METHODS: A5262 was a single-arm study of raltegravir and darunavir/ritonavir in treatment-naïve patients. Pre-treatment plasma was obtained from 5 participants with raltegravir resistance at the time of virologic failure. A control library was created by pooling integrase clones at predefined proportions. Multiplexed sequencing was performed with Illumina and 454 platforms at comparable costs. Illumina sequence analysis was performed with the novel snp-assess tool and 454 sequencing was analyzed with V-Phaser.

RESULTS: Illumina sequencing resulted in significantly higher sequence coverage and a 0.095% limit of detection. Illumina accurately detected all MVs in the control library at ≥0.5% and 7/10 MVs expected at 0.1%. 454 sequencing failed to detect any MVs at 0.1% with 5 false positive calls. For MVs detected in the patient samples by both 454 and Illumina, the correlation in the detected variant frequencies was high (R² = 0.92, P<0.001). Illumina sequencing detected 2.4-fold greater nucleotide MVs and 2.9-fold greater amino acid MVs compared to 454. The only raltegravir-resistant MV detected was an E138K mutation in one participant by Illumina sequencing, but not by 454.

CONCLUSIONS: In participants of A5262 with raltegravir resistance at virologic failure, baseline raltegravir-resistant MVs were rarely detected. At comparable costs to 454 sequencing, Illumina demonstrated greater depth of coverage, increased sensitivity for detecting HIV MVs, and fewer false positive variant calls.

FANTOM Consortium and the RIKEN PMI and CLST, Alistair R R Forrest, Hideya Kawaji, Michael Rehli, Kenneth Baillie, Michiel J L de Hoon, Vanja Haberle, Timo Lassmann, Ivan V Kulakovskiy, Marina Lizio, Masayoshi Itoh, Robin Andersson, Christopher J Mungall, Terrence F Meehan, Sebastian Schmeier, Nicolas Bertin, Mette Jørgensen, Emmanuel Dimont, Erik Arner, Christian Schmidl, Ulf Schaefer, Yulia A Medvedeva, Charles Plessy, Morana Vitezic, Jessica Severin, Colin A Semple, Yuri Ishizu, Robert S Young, Margherita Francescatto, Intikhab Alam, Davide Albanese, Gabriel M Altschuler, Takahiro Arakawa, John A C Archer, Peter Arner, Magda Babina, Sarah Rennie, Piotr J Balwierz, Anthony G Beckhouse, Swati Pradhan-Bhatt, Judith A Blake, Antje Blumenthal, Beatrice Bodega, Alessandro Bonetti, James Briggs, Frank Brombacher, Maxwell Burroughs, Andrea Califano, Carlo V Cannistraci, Daniel Carbajo, Yun Chen, Marco Chierici, Yari Ciani, Hans C Clevers, Emiliano Dalla, Carrie A Davis, Michael Detmar, Alexander D Diehl, Taeko Dohi, Finn Drabløs, Albert S B Edge, Matthias Edinger, Karl Ekwall, Mitsuhiro Endoh, Hideki Enomoto, Michela Fagiolini, Lynsey Fairbairn, Hai Fang, Mary C Farach-Carson, Geoffrey J Faulkner, Alexander V Favorov, Malcolm E Fisher, Martin C Frith, Rie Fujita, Shiro Fukuda, Cesare Furlanello, Masaaki Furino, Jun-ichi Furusawa, Teunis B Geijtenbeek, Andrew P Gibson, Thomas Gingeras, Daniel Goldowitz, Julian Gough, Sven Guhl, Reto Guler, Stefano Gustincich, Thomas J Ha, Masahide Hamaguchi, Mitsuko Hara, Matthias Harbers, Jayson Harshbarger, Akira Hasegawa, Yuki Hasegawa, Takehiro Hashimoto, Meenhard Herlyn, Kelly J Hitchens, Shannan J Ho Sui, Oliver M Hofmann, Ilka Hoof, Fumi Hori, Lukasz Huminiecki, Kei Iida, Tomokatsu Ikawa, Boris R Jankovic, Hui Jia, Anagha Joshi, Giuseppe Jurman, Bogumil Kaczkowski, Chieko Kai, Kaoru Kaida, Ai Kaiho, Kazuhiro Kajiyama, Mutsumi Kanamori-Katayama, Artem S Kasianov, Takeya Kasukawa, Shintaro Katayama, Sachi Kato, Shuji Kawaguchi, Hiroshi Kawamoto, Yuki I Kawamura, Tsugumi Kawashima, Judith S Kempfle, Tony J Kenna, Juha Kere, Levon M Khachigian, Toshio Kitamura, Peter Klinken, Alan J Knox, Miki Kojima, Soichi Kojima, Naoto Kondo, Haruhiko Koseki, Shigeo Koyasu, Sarah Krampitz, Atsutaka Kubosaki, Andrew T Kwon, Jeroen F J Laros, Weonju Lee, Andreas Lennartsson, Kang Li, Berit Lilje, Leonard Lipovich, Alan Mackay-Sim, Ri-ichiroh Manabe, Jessica C Mar, Benoit Marchand, Anthony Mathelier, Niklas Mejhert, Alison Meynert, Yosuke Mizuno, David A de Lima Morais, Hiromasa Morikawa, Mitsuru Morimoto, Kazuyo Moro, Efthymios Motakis, Hozumi Motohashi, Christine L Mummery, Mitsuyoshi Murata, Sayaka Nagao-Sato, Yutaka Nakachi, Fumio Nakahara, Toshiyuki Nakamura, Yukio Nakamura, Kenichi Nakazato, Erik van Nimwegen, Noriko Ninomiya, Hiromi Nishiyori, Shohei Noma, Tadasuke Noazaki, Soichi Ogishima, Naganari Ohkura, Hiroko Ohimiya, Hiroshi Ohno, Mitsuhiro Ohshima, Mariko Okada-Hatakeyama, Yasushi Okazaki, Valerio Orlando, Dmitry A Ovchinnikov, Arnab Pain, Robert Passier, Margaret Patrikakis, Helena Persson, Silvano Piazza, James G D Prendergast, Owen J L Rackham, Jordan A Ramilowski, Mamoon Rashid, Timothy Ravasi, Patrizia Rizzu, Marco Roncador, Sugata Roy, Morten B Rye, Eri Saijyo, Antti Sajantila, Akiko Saka, Shimon Sakaguchi, Mizuho Sakai, Hiroki Sato, Suzana Savvi, Alka Saxena, Claudio Schneider, Erik A Schultes, Gundula G Schulze-Tanzil, Anita Schwegmann, Thierry Sengstag, Guojun Sheng, Hisashi Shimoji, Yishai Shimoni, Jay W Shin, Christophe Simon, Daisuke Sugiyama, Takaai Sugiyama, Masanori Suzuki, Naoko Suzuki, Rolf K Swoboda, Peter A C ’t Hoen, Michihira Tagami, Naoko Takahashi, Jun Takai, Hiroshi Tanaka, Hideki Tatsukawa, Zuotian Tatum, Mark Thompson, Hiroo Toyodo, Tetsuro Toyoda, Elvind Valen, Marc van de Wetering, Linda M van den Berg, Roberto Verado, Dipti Vijayan, Ilya E Vorontsov, Wyeth W Wasserman, Shoko Watanabe, Christine A Wells, Louise N Winteringham, Ernst Wolvetang, Emily J Wood, Yoko Yamaguchi, Masayuki Yamamoto, Misako Yoneda, Yohei Yonekura, Shigehiro Yoshida, Susan E Zabierowski, Peter G Zhang, Xiaobei Zhao, Silvia Zucchelli, Kim M Summers, Harukazu Suzuki, Carsten O Daub, Jun Kawai, Peter Heutink, Winston Hide, Tom C Freeman, Boris Lenhard, Vladimir B Bajic, Martin S Taylor, Vsevolod J Makeev, Albin Sandelin, David A Hume, Piero Carninci, and Yoshihide Hayashizaki. 2014. “A Promoter-Level Mammalian Expression Atlas.” Nature 507(7493):462-70. doi: 10.1038/nature13182.

Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly 'housekeeping', whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.