In this study, we performed an in-depth characterization of the male infant urinary proteome by parallel proteomic analysis of normal healthy adult (n=6) and infant (n=6) males and comparison to available published data. A total of 1584 protein groups were identified. Of these, 708 proteins were identified in samples from both cohorts. Although present in both cohorts, 136 of these common proteins were significantly enriched in urine from adults and 94 proteins were significantly enriched in urine from infants. Using Gene Ontology, we found that the infant-enriched or infant-specific subproteome (743 proteins) had an overrepresentation of proteins that are involved in translation and transcription, cellular growth and metabolic processes. In contrast, the adult-enriched or adult-specific subproteome (364 proteins) showed an overrepresentation of proteins involved in immune response and cell adhesion. This study demonstrates that the non-diseased male urinary proteome is quantitatively affected by age, has age-specific subproteomes, and identifies a common subproteome with no age-dependent abundance variations. These findings highlight the importance of age-matching in urinary proteomics. This article is part of a Special Issue entitled: Biomarkers: A Proteomic Challenge.
Publications
2014
Alternate promoter usage is an important molecular mechanism for generating RNA and protein diversity. Cap Analysis Gene Expression (CAGE) is a powerful approach for revealing the multiplicity of transcription start site (TSS) events across experiments and conditions. An understanding of the dynamics of TSS choice across these conditions requires both sensitive quantification and comparative visualization. We have developed CAGExploreR, an R package to detect and visualize changes in the use of specific TSS in wider promoter regions in the context of changes in overall gene expression when comparing different CAGE samples. These changes provide insight into the modification of transcript isoform generation and regulatory network alterations associated with cell types and conditions. CAGExploreR is based on the FANTOM5 and MPromDb promoter set definitions but can also work with user-supplied regions. The package compares multiple CAGE libraries simultaneously. Supplementary Materials describe methods in detail, and a vignette demonstrates a workflow with a real data example.
AVAILABILITY AND IMPLEMENTATION: The package is freely available under the MIT license from CRAN (http://cran.r-project.org/web/packages/CAGExploreR).
CONTACT: edimont@mail.harvard.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Identifying microRNA (miRNA)-regulated genes is key to understanding miRNA function. However, many miRNA recognition elements (MREs) do not follow canonical "seed" base-pairing rules, making identification of bona fide targets challenging. Here, we apply an unbiased sequencing-based systems approach to characterize miR-522, a member of the oncogenic primate-specific chromosome 19 miRNA cluster, highly expressed in poorly differentiated cancers. To identify miRNA targets, we sequenced full-length transcripts captured by a biotinylated miRNA mimic. Within these targets, mostly noncanonical MREs were identified by sequencing RNase-resistant fragments. miR-522 overexpression reduced mRNA, protein levels, and luciferase activity of >70% of a random list of candidate target genes and MREs. Bioinformatic analysis suggested that miR-522 regulates cell proliferation, detachment, migration, and epithelial-mesenchymal transition. miR-522 induces G1 cell-cycle arrest and causes cells to detach without anoikis, become invasive, and express mesenchymal genes. Thus, our method provides a simple but effective technique for identifying miRNA-regulated genes and biological function.
LIN28 function is fundamental to the activity and behavior of human embryonic stem cells (hESCs) and induced pluripotent stem cells. Its main roles in these cell types are the regulation of translational efficiency and let-7 miRNA maturation. However, LIN28-associated mRNA cargo shifting and resultant regulation of translational efficiency upon the initiation of differentiation remain unknown. An RNA-immunoprecipitation and microarray analysis protocol, eRIP, that has high specificity and sensitivity was developed to test endogenous LIN28-associated mRNA cargo shifting. A combined eRIP and polysome analysis of early stage differentiation of hESCs with two distinct differentiation cues revealed close similarities between the dynamics of LIN28 association and translational modulation of genes involved in the Wnt signaling, cell cycle, RNA metabolism and proteasomal pathways. Our data demonstrate that change in translational efficiency is a major contributor to early stages of differentiation of hESCs, in which LIN28 plays a central role. This implies that eRIP analysis of LIN28-associated RNA cargoes may be used for rapid functional quality control of pluripotent stem cells under manufacture for therapeutic applications.
2013
It is acknowledged that some obesity trajectories are set early in life, and that rapid weight gain in infancy is a risk factor for later development of obesity. Identifying modifiable factors associated with early rapid weight gain is a prerequisite for curtailing the growing worldwide obesity epidemic. Recently, much attention has been given to findings indicating that gut microbiota may play a role in obesity development. We aimed to identify how the development of early gut microbiota is associated with expected infant growth. We developed a novel procedure that allows for the identification of longitudinal gut microbiota patterns (corresponding to the developing gut ecosystem) that are associated with an outcome of interest, while appropriately controlling for the false discovery rate. Our method identified developmental pathways of Staphylococcus species and Escherichia coli that were associated with expected growth, and traditional methods indicated that the detection of Bacteroides species at day 30 was associated with growth. Our method should have wide future applicability for studying gut microbiota, and is particularly important for translational considerations, as it is critical to understand the timing of microbiome transitions prior to attempting to manipulate gut microbiota in early life.
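For readers unfamiliar with false discovery rate control, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR procedure the authors used, so this is shown only as a common default and not as their method; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    by the Benjamini-Hochberg step-up procedure at FDR level alpha.

    Illustrative only: the abstract above does not specify its FDR method.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Reject every hypothesis whose rank is at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, one per candidate microbiota pattern.
example_p_values = [0.039, 0.001, 0.5, 0.008]
example_rejections = benjamini_hochberg(example_p_values, alpha=0.05)
```

Because the procedure is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value further down the ranking meets its threshold.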
Comparisons of stem cell experiments at both molecular and semantic levels remain challenging due to inconsistencies in results, data formats, and descriptions among biomedical research discoveries. The Harvard Stem Cell Institute (HSCI) has created the Stem Cell Commons (stemcellcommons.org), an open, community-based approach to data sharing. Experimental information is integrated using the Investigation-Study-Assay tabular format (ISA-Tab) used by over 30 organizations (ISA Commons, isacommons.org). The early adoption of this format permitted the novel integration of three independent systems to facilitate stem cell data storage, exchange and analysis: the Blood Genomics Repository, the Stem Cell Discovery Engine, and the new Refinery platform that links the Galaxy analytical engine to data repositories.
RATIONALE: Endotoxin is a near ubiquitous environmental exposure that has been associated with both asthma and chronic obstructive pulmonary disease (COPD). These obstructive lung diseases have a complex pathophysiology, making them difficult to study comprehensively in the context of endotoxin. Genome-wide gene expression studies have been used to identify a molecular snapshot of the response to environmental exposures. Identification of differentially expressed genes shared across all published murine models of chronic inhaled endotoxin will provide insight into the biology underlying endotoxin-associated lung disease.
METHODS: We identified three published murine models with gene expression profiling after repeated low-dose inhaled endotoxin. All array data from these experiments were re-analyzed, annotated consistently, and tested for shared genes found to be differentially expressed. Additional functional comparison was conducted by testing for significant enrichment of differentially expressed genes in known pathways. The importance of this gene signature in smoking-related lung disease was assessed using hierarchical clustering in an independent experiment where mice were exposed to endotoxin, smoke, and endotoxin plus smoke.
RESULTS: A 101-gene signature was detected in three murine models, more than expected by chance. The three model systems exhibit additional similarity beyond shared genes when compared at the pathway level, with increasing enrichment of inflammatory pathways associated with longer duration of endotoxin exposure. Genes and pathways important in both asthma and COPD were shared across all endotoxin models. Mice exposed to endotoxin, smoke, and smoke plus endotoxin were accurately classified with the endotoxin gene signature.
CONCLUSIONS: Despite the differences in laboratory, duration of exposure, and strain of mouse used in three experimental models of chronic inhaled endotoxin, surprising similarities in gene expression were observed. The endotoxin component of tobacco smoke may play an important role in disease development.
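The claim that a shared gene signature is larger "than expected by chance" can be illustrated with a simple simulation. The sketch below estimates an empirical p-value for a three-way overlap by drawing random gene lists from a common universe. It is not the authors' test, which the abstract does not describe; the function name, universe size and list sizes are all hypothetical placeholders.

```python
import random


def three_way_overlap_pvalue(universe_size, list_sizes, observed_overlap,
                             n_permutations=2000, seed=0):
    """Empirical p-value for seeing at least `observed_overlap` genes shared
    by three gene lists drawn uniformly at random from a common universe.

    Illustrative only: all inputs are assumptions, not values from the study.
    """
    rng = random.Random(seed)
    universe = range(universe_size)
    hits = 0
    for _ in range(n_permutations):
        # Draw three independent random gene lists of the given sizes.
        a = set(rng.sample(universe, list_sizes[0]))
        b = set(rng.sample(universe, list_sizes[1]))
        c = set(rng.sample(universe, list_sizes[2]))
        if len(a & b & c) >= observed_overlap:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical example: 1000 genes on the array, 100 differentially
# expressed genes per model, 20 genes shared by all three models.
p_shared = three_way_overlap_pvalue(1000, (100, 100, 100), 20,
                                    n_permutations=200)
```

Uniform random sampling ignores gene-gene correlation and platform differences, so a real analysis would likely need a more careful null model.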
New strategies to combat complex human disease require systems approaches to biology that integrate experiments from cell lines, primary tissues and model organisms. We have developed Pathprint, a functional approach that compares gene expression profiles in a set of pathways, networks and transcriptionally regulated targets. It can be applied universally to gene expression profiles across species. Integration of large-scale profiling methods and curation of the public repository overcomes platform, species and batch effects to yield a standard measure of functional distance between experiments. We show that pathprints combine mouse and human blood developmental lineage, and can be used to identify new prognostic indicators in acute myeloid leukemia. The code and resources are available at http://compbio.sph.harvard.edu/hidelab/pathprint.
Basal-like triple-negative breast cancers (TNBCs) have poor prognosis. To identify basal-like TNBC dependencies, a genome-wide siRNA lethality screen compared two human breast epithelial cell lines transformed with the same genes: basal-like BPLER and myoepithelial HMLER. Expression of the screen's 154 BPLER dependency genes correlated with poor prognosis in breast, but not lung or colon, cancer. Proteasome genes were overrepresented hits. Basal-like TNBC lines were selectively sensitive to proteasome inhibitor drugs relative to normal epithelial, luminal, and mesenchymal TNBC lines. Proteasome inhibition reduced growth of established basal-like TNBC tumors in mice and blocked tumor-initiating cell function and macrometastasis. Proteasome addiction in basal-like TNBCs was mediated by NOXA and linked to MCL-1 dependence.