Publications

2011

Wang, Kenneth C, Anthony Jeanmenne, Griffin M Weber, Shrey K Thawait, and John A Carrino. 2011. “An Online Evidence-Based Decision Support System for Distinguishing Benign from Malignant Vertebral Compression Fractures by Magnetic Resonance Imaging Feature Analysis.” Journal of Digital Imaging 24 (3): 507-15. https://doi.org/10.1007/s10278-010-9316-3.

Decision support systems have been used to promote the practice of evidence-based medicine. Computer-assisted diagnosis can serve as one element of evidence-based radiology. One area where such tools may provide benefit is analysis of vertebral compression fractures (VCFs), which can be a challenge in MRI interpretation. VCFs may be benign or malignant in etiology, and several MRI features may help to make this important distinction. We describe a web-based decision support system for discriminating benign from malignant VCFs as a prototype for a more general diagnostic decision support framework for radiologists. The system has three components: a feature checklist with an image gallery derived from proven reference cases, a prediction model, and a reporting mechanism. The website allows users to input the findings for a case to be interpreted using a structured feature checklist. The image gallery complements the checklist, for clarity and training purposes. The input from the checklist is then used to calculate the likelihood of malignancy by a logistic regression prediction model. Standardized report text is generated that summarizes pertinent positive and negative findings. This computer-assisted diagnosis system demonstrates the integration of three areas where diagnostic decision support can aid radiologists: first, in image interpretation, through feature checklists and illustrative image galleries; second, in feature-based prediction modeling; and third, in structured reporting. We present a diagnostic decision support tool that provides radiologists with evidence-based guidance for discriminating benign from malignant VCF. This model may be useful in other difficult-diagnosis situations and requires further clinical testing.
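The checklist-to-prediction-to-report flow described in the abstract above can be illustrated with a minimal Python sketch. The feature names, intercept, and weights are hypothetical placeholders, not values from the paper, and the report wording is an assumption.

```python
import math

# Hypothetical intercept and weights, for illustration only (not from the paper).
INTERCEPT = -1.0
WEIGHTS = {"feature_a": 1.5, "feature_b": -2.0, "feature_c": 0.75}


def malignancy_probability(findings):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum of weights of present features)))."""
    z = INTERCEPT + sum(w for name, w in WEIGHTS.items() if findings.get(name, False))
    return 1.0 / (1.0 + math.exp(-z))


def report_text(findings):
    """Summarize positive and negative checklist findings plus the model output."""
    positive = [n for n in WEIGHTS if findings.get(n, False)]
    negative = [n for n in WEIGHTS if not findings.get(n, False)]
    p = malignancy_probability(findings)
    return (
        f"Positive findings: {', '.join(positive) or 'none'}. "
        f"Negative findings: {', '.join(negative) or 'none'}. "
        f"Estimated probability of malignancy: {p:.0%}."
    )


if __name__ == "__main__":
    print(report_text({"feature_a": True, "feature_c": True}))
```

Keeping the weights in one dictionary means the checklist, the model, and the report all read from the same feature list, which mirrors the three-component design the abstract describes.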

Weber, Griffin M, William Barnett, Mike Conlon, David Eichmann, Warren Kibbe, Holly Falk-Krzesinski, Michael Halaas, et al. 2011. “Direct2Experts: a Pilot National Network to Demonstrate Interoperability Among Research-Networking Platforms.” Journal of the American Medical Informatics Association : JAMIA 18 (Suppl 1): i157-60. https://doi.org/10.1136/amiajnl-2011-000200.

Research-networking tools use data-mining and social networking to enable expertise discovery, matchmaking and collaboration, which are important facets of team science and translational research. Several commercial and academic platforms have been built, and many institutions have deployed these products to help their investigators find local collaborators. Recent studies, though, have shown the growing importance of multiuniversity teams in science. Unfortunately, the lack of a standard data-exchange model and resistance of universities to share information about their faculty have presented barriers to forming an institutionally supported national network. This case report describes an initiative, which, in only 6 months, achieved interoperability among seven major research-networking products at 28 universities by taking an approach that focused on addressing institutional concerns and encouraging their participation. With this necessary groundwork in place, the second phase of this effort can begin, which will expand the network's functionality and focus on the end users.

2010

Milo, Ron, Paul Jorgensen, Uri Moran, Griffin Weber, and Michael Springer. 2010. “BioNumbers–the Database of Key Numbers in Molecular and Cell Biology.” Nucleic Acids Research 38 (Database issue): D750-3. https://doi.org/10.1093/nar/gkp889.

BioNumbers (http://www.bionumbers.hms.harvard.edu) is a database of key numbers in molecular and cell biology–the quantitative properties of biological systems of interest to computational, systems and molecular cell biologists. Contents of the database range from cell sizes to metabolite concentrations, from reaction rates to generation times, from genome sizes to the number of mitochondria in a cell. While always of importance to biologists, having numbers in hand is becoming increasingly critical for experimenting, modeling, and analyzing biological systems. BioNumbers was motivated by an appreciation of how long it can take to find even the simplest number in the vast biological literature. All numbers are taken directly from a literature source and that reference is provided with the number. BioNumbers is designed to be highly searchable and queries can be performed by keywords or browsed by menus. BioNumbers is a collaborative community platform where registered users can add content and make comments on existing data. All new entries and commentary are curated to maintain high quality. Here we describe the database characteristics and implementation, demonstrate its use, and discuss future directions for its development.

Murphy, Shawn N, Griffin Weber, Michael Mendis, Vivian Gainer, Henry C Chueh, Susanne Churchill, and Isaac Kohane. 2010. “Serving the Enterprise and Beyond With Informatics for Integrating Biology and the Bedside (i2b2).” Journal of the American Medical Informatics Association : JAMIA 17 (2): 124-30. https://doi.org/10.1136/jamia.2009.000893.

Informatics for Integrating Biology and the Bedside (i2b2) is one of seven projects sponsored by the NIH Roadmap National Centers for Biomedical Computing (http://www.ncbcs.org). Its mission is to provide clinical investigators with the tools necessary to integrate medical record and clinical research data in the genomics age, a software suite to construct and integrate the modern clinical research chart. i2b2 software may be used by an enterprise's research community to find sets of interesting patients from electronic patient medical record data, while preserving patient privacy through a query tool interface. Project-specific mini-databases ("data marts") can be created from these sets to make highly detailed data available on these specific patients to the investigators on the i2b2 platform, as reviewed and restricted by the Institutional Review Board. The current version of this software has been released into the public domain and is available at the URL: http://www.i2b2.org/software.

2009

Weber, Griffin M, Shawn N Murphy, Andrew J McMurry, Douglas MacFadden, Daniel J Nigrin, Susanne Churchill, and Isaac S Kohane. 2009. “The Shared Health Research Information Network (SHRINE): A Prototype Federated Query Tool for Clinical Data Repositories.” Journal of the American Medical Informatics Association : JAMIA 16 (5): 624-30. https://doi.org/10.1197/jamia.M3191.

The authors developed a prototype Shared Health Research Information Network (SHRINE) to identify the technical, regulatory, and political challenges of creating a federated query tool for clinical data repositories. Separate Institutional Review Boards (IRBs) at Harvard's three largest affiliated health centers approved use of their data, and the Harvard Medical School IRB approved building a Query Aggregator Interface that can simultaneously send queries to each hospital and display aggregate counts of the number of matching patients. Our experience creating three local repositories using the open source Informatics for Integrating Biology and the Bedside (i2b2) platform can be used as a road map for other institutions. The authors are actively working with the IRBs and regulatory groups to develop procedures that will ultimately allow investigators to obtain identified patient data and biomaterials through SHRINE. This will guide us in creating a future technical architecture that is scalable to a national level, compliant with ethical guidelines, and protective of the interests of the participating hospitals.
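The federated pattern described in the abstract above, where one query goes to every site and only aggregate counts come back, can be sketched as follows. This is an illustrative toy and not the SHRINE or i2b2 API: the site names, the `make_site` helper, and the in-memory patient records are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_site(patients):
    """Stand-in for one hospital repository: it answers a query with a count only."""
    def count_matching(predicate):
        return sum(1 for p in patients if predicate(p))
    return count_matching


def aggregate_counts(sites, predicate):
    """Send the same query to every site in parallel; return per-site counts and a total."""
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        futures = {name: pool.submit(site, predicate) for name, site in sites.items()}
        per_site = {name: f.result() for name, f in futures.items()}
    return per_site, sum(per_site.values())


if __name__ == "__main__":
    # Hypothetical toy data; patient-level records never leave make_site.
    sites = {
        "site_1": make_site([{"age": 70}, {"age": 40}]),
        "site_2": make_site([{"age": 65}]),
    }
    print(aggregate_counts(sites, lambda p: p["age"] >= 65))
```

The aggregator sees only integers, which is the property that lets each site keep control of its own patient-level data.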

2008

Shen, Lucy Q, Angie Child, Griffin M Weber, Judah Folkman, and Lloyd Paul Aiello. 2008. “Rosiglitazone and Delayed Onset of Proliferative Diabetic Retinopathy.” Archives of Ophthalmology (Chicago, Ill. : 1960) 126 (6): 793-9. https://doi.org/10.1001/archopht.126.6.793.

OBJECTIVE: To evaluate whether rosiglitazone maleate, an oral peroxisome proliferator-activated receptor gamma agonist and oral insulin sensitizing agent with potential antiangiogenic activity, delays onset of proliferative diabetic retinopathy (PDR).

METHODS: Longitudinal medical record review of all patients treated with rosiglitazone receiving both medical and ophthalmic care at the Joslin Diabetes Center from May 1, 2002, to May 31, 2003 (N = 124), and matched control patients not taking a glitazone drug (N = 158). The mean duration of follow-up was 2.8 years (range, 0.3-9.0 years).

RESULTS: Baseline characteristics and final hemoglobin A1c values (7.6% and 7.8%, respectively) were similar in the rosiglitazone and control groups (P = .10). In eyes with severe nonproliferative diabetic retinopathy at baseline (rosiglitazone group, 14 eyes; control group, 24 eyes), progression to PDR over 3 years occurred in 19.2% in the rosiglitazone group and 47.4% in the control group, representing a 59% relative risk reduction (Wilcoxon, P = .045; log-rank, P = .059). Fewer eyes in the rosiglitazone group experienced 3 or more lines of visual acuity loss (P = .03). The incidence of diabetic macular edema was similar in both groups.

CONCLUSIONS: Rosiglitazone may delay the onset of PDR, possibly because of its antiangiogenic activity. Future clinical investigations should consider analysis of this potential benefit along with ongoing evaluation of potential cardiac risk in studies where the risk-benefit profiles are deemed appropriate.
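The 59% relative risk reduction reported in the results above is consistent with the two 3-year progression rates, assuming the figure is the simple ratio of those rates (the abstract does not state how it was derived). A short Python check:

```python
def relative_risk_reduction(treated_rate, control_rate):
    """RRR = 1 - (event rate in treated group / event rate in control group)."""
    return 1.0 - treated_rate / control_rate


if __name__ == "__main__":
    # Progression to PDR over 3 years: 19.2% (rosiglitazone) vs 47.4% (control).
    rrr = relative_risk_reduction(0.192, 0.474)
    print(f"{rrr:.1%}")
```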

2006

Weber, Griffin, Lucila Ohno-Machado, and Stuart Shieber. 2006. “Representation in Stochastic Search for Phylogenetic Tree Reconstruction.” Journal of Biomedical Informatics 39 (1): 43-50.

Phylogenetic tree reconstruction is a process in which the ancestral relationships among a group of organisms are inferred from their DNA sequences. For all but trivial sized data sets, finding the optimal tree is computationally intractable. Many heuristic algorithms exist, but the branch-swapping algorithm used in the software package PAUP* is the most popular. This method performs a stochastic search over the space of trees, using a branch-swapping operation to construct neighboring trees in the search space. This study introduces a new stochastic search algorithm that operates over an alternative representation of trees, namely as permutations of taxa giving the order in which they are processed during stepwise addition. Experiments on several data sets suggest that this algorithm for generating an initial tree, when followed by branch-swapping, can produce better trees for a given total amount of time.
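The idea of searching over taxon orderings rather than over trees directly can be illustrated with a generic stochastic local search. This sketch is not the paper's algorithm: the swap move, the acceptance rule, and the `toy_cost` function are assumptions, and a real `cost` would build a tree by stepwise addition in the given order and return its score.

```python
import random


def permutation_search(taxa, cost, iterations=1000, seed=0):
    """Stochastic local search over orderings: swap two positions, keep non-worse results."""
    rng = random.Random(seed)
    best = list(taxa)
    rng.shuffle(best)
    best_cost = cost(best)
    for _ in range(iterations):
        i, j = rng.sample(range(len(best)), 2)  # requires at least two taxa
        candidate = best[:]
        candidate[i], candidate[j] = candidate[j], candidate[i]
        candidate_cost = cost(candidate)
        if candidate_cost <= best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost


def toy_cost(order):
    """Placeholder score (number of inversions); stands in for a tree-building step."""
    n = len(order)
    return sum(1 for a in range(n) for b in range(a + 1, n) if order[a] > order[b])


if __name__ == "__main__":
    print(permutation_search(["A", "B", "C", "D", "E"], toy_cost))
```

Because the search only ever calls `cost(order)`, the tree-construction and scoring step can be swapped without touching the search loop.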

2004

Col, Nananda F, Griffin Weber, Anne Stiggelbout, John Chuo, Ralph D’Agostino, and Phaedra Corso. 2004. “Short-Term Menopausal Hormone Therapy for Symptom Relief: An Updated Decision Model.” Archives of Internal Medicine 164 (15): 1634-40.

BACKGROUND: Hormone therapy (HT) provides the most effective relief of menopausal symptoms. This therapy is associated with a decreased risk of osteoporosis and colorectal cancer but increased risks of cardiovascular disease (CVD), venous thrombosis, and breast cancer. Our objective was to identify which women should benefit from short-term HT by exploring the trade-off between symptom relief and risks of inducing disease.

METHODS: A Markov model simulates the effect of short-term (2 years) estrogen and progestin HT on life expectancy and quality-adjusted life expectancy (QALE) among 50-year-old menopausal women with intact uteri, using findings from the Women's Health Initiative. Quality-of-life (QOL) utility scores were derived from the literature. We assumed HT affected QOL only during perimenopause, when it reduced symptoms by 80%.

RESULTS: Among asymptomatic women, short-term HT was associated with net losses in life expectancy and QALE of 1 to 3 months, depending on CVD risk. Women with mild or severe menopausal symptoms gained 3 to 4 months or 7 to 8 months of QALE, respectively. Among women at low risk for CVD, HT extended QALE if menopausal symptoms lowered QOL by as little as 4%. Among women at elevated CVD risk, HT extended QALE only if symptoms lowered QOL by at least 12%.

CONCLUSIONS: Hormone therapy is associated with losses in survival but gains in QALE for women with menopausal symptoms. Women expected to benefit from short-term HT can be identified by the severity of their menopausal symptoms and CVD risk.

Weber, Griffin, Staal Vinterbo, and Lucila Ohno-Machado. 2004. “Multivariate Selection of Genetic Markers in Diagnostic Classification.” Artificial Intelligence in Medicine 31 (2): 155-67.

Analysis of gene expression data obtained from microarrays presents a new set of challenges to machine learning modeling. In this domain, in which the number of variables far exceeds the number of cases, identifying relevant genes or groups of genes that are good markers for a particular classification is as important as achieving good classification performance. Although several machine learning algorithms have been proposed to address the latter, identification of gene markers has not been systematically pursued. In this article, we investigate several algorithms for selecting gene markers for classification. We test these algorithms using logistic regression, as this is a simple and efficient supervised learning algorithm. We demonstrate, using 10 different data sets, that a conditionally univariate algorithm constitutes a viable choice if a researcher is interested in quickly determining a set of gene expression levels that can serve as markers for disease. We show that the classification performance of logistic regression is not very different from that of more sophisticated algorithms that have been applied in previous studies, and that the gene selection in the logistic regression algorithm is reasonable in both cases. Furthermore, the algorithm is simple, its theoretical basis is well established, and our user-friendly implementation is now freely available on the internet, serving as a benchmarking tool for the development of new algorithms.

Blackshaw, Seth, Sanjiv Harpavat, Jeff Trimarchi, Li Cai, Haiyan Huang, Winston P Kuo, Griffin Weber, et al. 2004. “Genomic Analysis of Mouse Retinal Development.” PLoS Biology 2 (9): E247.

The vertebrate retina is comprised of seven major cell types that are generated in overlapping but well-defined intervals. To identify genes that might regulate retinal development, gene expression in the developing retina was profiled at multiple time points using serial analysis of gene expression (SAGE). The expression patterns of 1,051 genes that showed developmentally dynamic expression by SAGE were investigated using in situ hybridization. A molecular atlas of gene expression in the developing and mature retina was thereby constructed, along with a taxonomic classification of developmental gene expression patterns. Genes were identified that label both temporal and spatial subsets of mitotic progenitor cells. For each developing and mature major retinal cell type, genes selectively expressed in that cell type were identified. The gene expression profiles of retinal Müller glia and mitotic progenitor cells were found to be highly similar, suggesting that Müller glia might serve to produce multiple retinal cell types under the right conditions. In addition, multiple transcripts that were evolutionarily conserved that did not appear to encode open reading frames of more than 100 amino acids in length ("noncoding RNAs") were found to be dynamically and specifically expressed in developing and mature retinal cell types. Finally, many photoreceptor-enriched genes that mapped to chromosomal intervals containing retinal disease genes were identified. These data serve as a starting point for functional investigations of the roles of these genes in retinal development and physiology.