Publications
2015
The FANTOM5 project investigates transcription initiation activities in more than 1,000 human and mouse primary cells, cell lines and tissues using CAGE. Based on manual curation of sample information and development of an ontology for sample classification, we assemble the resulting data into a centralized data resource (http://fantom.gsc.riken.jp/5/). This resource contains web-based tools and data-access points for the research community to search and extract data related to samples, genes, promoter activities, transcription factors and enhancers across the FANTOM5 atlas.
Next-generation sequencing platforms for measuring digital expression such as RNA-Seq are displacing traditional microarray-based methods in biological experiments. The detection of differentially expressed genes between groups of biological conditions has led to the development of numerous bioinformatics tools, but so far, few exploit the expanded dynamic range afforded by the new technologies. We present edgeRun, an R package that implements an unconditional exact test that is a more powerful version of the exact test in edgeR. This increase in power is especially pronounced for experiments with as few as two replicates per condition, for genes with low total expression and with large biological coefficient of variation. In comparison with a panel of other tools, edgeRun consistently captures functionally similar differentially expressed genes.
Speed is of the essence in combating Ebola; thus, computational approaches should form a significant component of Ebola research. As for the development of any modern drug, computational biology is uniquely positioned to contribute through comparative analysis of the genome sequences of Ebola strains and three-dimensional protein modeling. Other computational approaches to Ebola may include large-scale docking studies of Ebola proteins with human proteins and with small-molecule libraries, computational modeling of the spread of the virus, computational mining of the Ebola literature and creation of a curated Ebola database. Taken together, such computational efforts could significantly accelerate traditional scientific approaches. In recognition of the need for important and immediate solutions from the field of computational biology against Ebola, the International Society for Computational Biology (ISCB) announces a prize for an important computational advance in fighting the Ebola virus. ISCB will confer the ISCB Fight against Ebola Award, along with a prize of US$2000, at its July 2016 annual meeting (ISCB Intelligent Systems for Molecular Biology 2016, Orlando, FL).
CONTACT: dkovats@iscb.org or rost@in.tum.de.
Sonic hedgehog (Shh) signaling patterns the vertebrate spinal cord by activating a group of transcriptional repressors in distinct neural progenitors of somatic motor neuron and interneuron subtypes. To identify the action of this network, we performed a genome-wide analysis of the regulatory actions of three key ventral determinants in mammalian neural tube patterning: Nkx2.2, Nkx6.1 and Olig2. Previous studies have demonstrated that each factor acts predominantly as a transcriptional repressor, at least in part, to inhibit alternative progenitor fate choices. Here, we reveal broad and direct repression of multiple alternative fates as a general mechanism of repressor action. Additionally, the repressor network targets multiple Shh signaling components, providing negative feedback to ongoing Shh signaling. Analysis of chromatin organization around Nkx2.2-, Nkx6.1- and Olig2-bound regions, together with co-analysis of engagement of the transcriptional activator Sox2, indicates that repressors bind to, and probably modulate the action of, neural enhancers. Together, the data suggest a model for neural progenitor specification downstream of Shh signaling, in which Nkx2.2 and Olig2 direct repression of alternative neural progenitor fate determinants, an action augmented by the overlapping activity of Nkx6.1 in each cell type. Integration of repressor and activator inputs, notably activator inputs mediated by Sox2, is probably a key mechanism in achieving cell type-specific transcriptional outcomes in mammalian neural progenitor fate specification.
PURPOSE: To study the genomic evolution of primary uveal melanoma.
METHODS: Primary uveal melanoma genomic DNA was assayed on the Illumina Human660W-Quad v1.0 DNA Analysis BeadChip. Raw signal intensity data were quantile normalized to estimate copy number aberration with the Genome Alteration Print algorithm. Distance between samples was calculated as the Manhattan distance between the copy number profiles of the tumors. From the distance matrix, a phylogenetic network (evolutionary relationship inference) was estimated using SplitsTree4.
RESULTS: Of the 57 tumors, one (1.8%) was discarded because of a failed assay, and seven (12.3%) were revealed to be mixtures of several cell populations that could not be resolved. Three clades of tumor were identified (A [59.2%], B [32.7%], and C [6.1%]), each following a distinct evolutionary path and each associated with metastatic status (P = 0.01). One tumor (2.0%) did not fit into any clade. From a normal diploid melanocyte, a few tumors (clade C) lose a large portion of chromosome 6q, but do not develop any mutations on 8q. In an alternate path, the vast majority of tumors (clade A and clade B [91.9%]) gain a copy of the telomeric half of 8q. A majority of these tumors (clade A) subsequently lose a copy of chromosome 3, as well as gain the centromeric half of 8q. The other tumors (clade B) gain copies of 6p, as well as regions on 11p and 22q.
CONCLUSIONS: Our data suggest that there is little overlap in the subtypes of uveal melanoma after divergence (identified as clades A and B) and that these distinct subtypes are not likely to cross over or transform from one major clade to another.
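The distance step described in the METHODS above can be illustrated with a minimal sketch. It assumes each tumor is summarized as a list of copy number values over the same probes or segments; the sample names and values are hypothetical toy data, not results from the study, and the code is not the authors' pipeline. The resulting matrix is the kind of input a phylogenetic network tool such as SplitsTree4 would take.

```python
from itertools import combinations


def manhattan(a, b):
    """Sum of absolute per-position differences between two copy number profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same positions")
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(profiles):
    """Symmetric pairwise Manhattan distances as a nested dict keyed by sample name."""
    names = list(profiles)
    dist = {n: {m: 0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = manhattan(profiles[n], profiles[m])
        dist[n][m] = dist[m][n] = d
    return dist


# Hypothetical toy profiles: one copy number value per genomic segment.
profiles = {
    "tumor_1": [2, 2, 3, 1],
    "tumor_2": [2, 2, 3, 2],
    "tumor_3": [2, 1, 2, 2],
}
dist = distance_matrix(profiles)
for name in profiles:
    print(name, [dist[name][other] for other in profiles])
```

Manhattan distance treats each gained or lost copy at each position as one unit of difference, so tumors that share most aberrations end up close together in the matrix.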
2014
Clinical adoption of human genome sequencing requires methods that output genotypes with known accuracy at millions or billions of positions across a genome. Because of substantial discordance among calls made by existing sequencing methods and algorithms, there is a need for a highly accurate set of genotypes across a genome that can be used as a benchmark. Here we present methods to make high-confidence, single-nucleotide polymorphism (SNP), indel and homozygous reference genotype calls for NA12878, the pilot genome for the Genome in a Bottle Consortium. We minimize bias toward any method by integrating and arbitrating between 14 data sets from five sequencing technologies, seven read mappers and three variant callers. We identify regions for which no confident genotype call could be made, and classify them into different categories based on reasons for uncertainty. Our genotype calls are publicly available on the Genome Comparison and Analytic Testing website to enable real-time benchmarking of any method.
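As a rough illustration of what arbitrating between data sets can mean, the sketch below accepts a genotype at one position only when a large enough share of the data sets that made a call agree on it, and otherwise labels the position with a reason for uncertainty. This is a simplified concordance rule with a hypothetical threshold and hypothetical data set names; it is not the consortium's published method, which is considerably more involved.

```python
from collections import Counter


def arbitrate(calls, min_agreement=0.75):
    """Pick a genotype for one position from per-data-set calls.

    calls maps a data set name to a genotype string, or None for no call.
    Returns (genotype, status), where status is 'confident', 'discordant',
    or 'no_call'. The 0.75 default is an illustrative assumption.
    """
    observed = [g for g in calls.values() if g is not None]
    if not observed:
        return None, "no_call"
    genotype, votes = Counter(observed).most_common(1)[0]
    if votes / len(observed) >= min_agreement:
        return genotype, "confident"
    return None, "discordant"


# Hypothetical calls at two positions from five data sets.
agreeing = {"ds1": "0/1", "ds2": "0/1", "ds3": "0/1", "ds4": "0/1", "ds5": "1/1"}
split = {"ds1": "0/1", "ds2": "1/1", "ds3": "0/1", "ds4": "1/1", "ds5": None}

print(arbitrate(agreeing))
print(arbitrate(split))
```

Returning a status alongside the genotype mirrors the idea in the abstract of recording why a position could not be called confidently rather than silently dropping it.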
Acute kidney injury (AKI) promotes an abrupt loss of kidney function that results in substantial morbidity and mortality. Considerable effort has gone toward identification of diagnostic biomarkers and analysis of AKI-associated molecular events; however, most studies have adopted organ-wide approaches and have not elucidated the interplay among different cell types involved in AKI pathophysiology. To better characterize AKI-associated molecular and cellular events, we developed a mouse line that enables the identification of translational profiles in specific cell types. This strategy relies on CRE recombinase-dependent activation of an EGFP-tagged L10a ribosomal protein subunit, which allows translating ribosome affinity purification (TRAP) of mRNA populations in CRE-expressing cells. Combining this mouse line with cell type-specific CRE-driver lines, we identified distinct cellular responses in an ischemia reperfusion injury (IRI) model of AKI. Twenty-four hours following IRI, distinct translational signatures were identified in the nephron, kidney interstitial cell populations, vascular endothelium, and macrophages/monocytes. Furthermore, TRAP captured known IRI-associated markers, validating this approach. Biological function annotation, canonical pathway analysis, and in situ analysis of identified response genes provided insight into cell-specific injury signatures. Our study provides a deep, cell-based view of early injury-associated molecular events in AKI and documents a versatile, genetic tool to monitor cell-specific and temporal-specific biological processes in disease modeling.
BACKGROUND: The impact of raltegravir-resistant HIV-1 minority variants (MVs) on raltegravir treatment failure is unknown. Illumina sequencing offers greater throughput than 454, but sequence analysis tools for viral sequencing are needed. We evaluated Illumina and 454 for the detection of HIV-1 raltegravir-resistant MVs.
METHODS: A5262 was a single-arm study of raltegravir and darunavir/ritonavir in treatment-naïve patients. Pre-treatment plasma was obtained from 5 participants with raltegravir resistance at the time of virologic failure. A control library was created by pooling integrase clones at predefined proportions. Multiplexed sequencing was performed with Illumina and 454 platforms at comparable costs. Illumina sequence analysis was performed with the novel snp-assess tool and 454 sequencing was analyzed with V-Phaser.
RESULTS: Illumina sequencing resulted in significantly higher sequence coverage and a 0.095% limit of detection. Illumina accurately detected all MVs in the control library at ≥0.5% and 7/10 MVs expected at 0.1%. 454 sequencing failed to detect any MVs at 0.1% with 5 false positive calls. For MVs detected in the patient samples by both 454 and Illumina, the correlation in the detected variant frequencies was high (R² = 0.92, P<0.001). Illumina sequencing detected 2.4-fold more nucleotide MVs and 2.9-fold more amino acid MVs than 454. The only raltegravir-resistant MV detected was an E138K mutation in one participant by Illumina sequencing, but not by 454.
CONCLUSIONS: In participants of A5262 with raltegravir resistance at virologic failure, baseline raltegravir-resistant MVs were rarely detected. At comparable costs to 454 sequencing, Illumina demonstrated greater depth of coverage, increased sensitivity for detecting HIV MVs, and fewer false positive variant calls.
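To make the notion of a detection limit concrete, the sketch below computes per-base frequencies at a single position from read counts and keeps non-consensus bases at or above a cutoff. The cutoff default reuses the 0.095% limit of detection reported above; the function names and read counts are hypothetical. This is a plain frequency filter, not the snp-assess or V-Phaser algorithms, which are not described here.

```python
def variant_frequencies(base_counts, consensus):
    """Frequency of each non-consensus base observed at one position."""
    depth = sum(base_counts.values())
    if depth == 0:
        return {}
    return {
        base: count / depth
        for base, count in base_counts.items()
        if base != consensus and count > 0
    }


def call_minority_variants(base_counts, consensus, min_freq=0.00095):
    """Keep non-consensus bases whose frequency reaches min_freq (0.095% by default)."""
    freqs = variant_frequencies(base_counts, consensus)
    return {base: f for base, f in freqs.items() if f >= min_freq}


# Hypothetical read counts at one position with 20,000-fold coverage.
counts = {"A": 19960, "G": 30, "C": 10, "T": 0}
calls = call_minority_variants(counts, consensus="A")
print(calls)
```

With these toy counts, G is present at 0.15% and passes the cutoff, while C at 0.05% falls below it. Deeper coverage is what makes such low cutoffs meaningful, since a frequency of 0.1% corresponds to only a handful of reads at modest depth.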
Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly 'housekeeping', whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.