Department of Biostatistics
Quantitative Issues in Cancer Research Working Seminar
2013 - 2014
ABSTRACT: In recent years, genetic and biological markers have been examined extensively for their potential to signal progression or risk of disease. In addition to these markers, it has often been argued that short-term outcomes may help improve the prediction of disease outcomes in clinical practice. Because the underlying disease process may differ, patients who have experienced a short-term event of interest may have very different long-term clinical outcomes from the general patient population. Most existing methods for incorporating censored short-term event information in predicting long-term survival focus on modeling the disease process and are derived under parametric models in a multi-state survival setting. In this talk, I will discuss prediction and estimation procedures that incorporate short-term event time information up to a landmark point along with baseline covariates.
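To make the landmark idea concrete, the following is a minimal sketch and not the speaker's procedure. It assumes a landmark time t0 and estimates P(T > t | T > t0) with a Kaplan-Meier estimator, separately for patients who did and did not have the short-term event by t0. Baseline covariates are ignored, and the function and variable names (kaplan_meier, landmark_survival, the group labels) are hypothetical.

```python
import numpy as np

def kaplan_meier(time, event, t):
    """Kaplan-Meier estimate of S(t) from right-censored data."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # np.unique returns the distinct observed event times in increasing order.
    for u in np.unique(time[event]):
        if u > t:
            break
        at_risk = np.sum(time >= u)
        deaths = np.sum((time == u) & event)
        surv *= 1.0 - deaths / at_risk
    return float(surv)

def landmark_survival(long_time, long_event, short_time, short_event, t0, t):
    """Estimate P(T > t | T > t0) by short-term event status at landmark t0.

    Illustrative assumption: patients are split only on whether the
    short-term event was observed by t0; no covariates are used.
    """
    long_time = np.asarray(long_time, dtype=float)
    long_event = np.asarray(long_event, dtype=bool)
    short_time = np.asarray(short_time, dtype=float)
    short_event = np.asarray(short_event, dtype=bool)

    # Landmark sample: still under observation and event-free at t0.
    at_landmark = long_time > t0
    had_short = short_event & (short_time <= t0)

    groups = {
        "short_event_by_t0": at_landmark & had_short,
        "no_short_event_by_t0": at_landmark & ~had_short,
    }
    # Reset the clock at t0 within each group; an empty group yields 1.0.
    return {
        label: kaplan_meier(long_time[mask] - t0, long_event[mask], t - t0)
        for label, mask in groups.items()
    }

# Toy data, invented purely for illustration.
result = landmark_survival(
    long_time=[2, 5, 6, 8, 9, 10],
    long_event=[1, 1, 0, 1, 1, 0],
    short_time=[1, 3, 2, 7, 4, 10],
    short_event=[1, 1, 1, 1, 0, 0],
    t0=4,
    t=8.5,
)
print(result)
```

Stratifying at the landmark avoids conditioning on information that becomes available only after t0; a fuller analysis would also use the short-term event time itself and the baseline covariates.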
ABSTRACT: High-dimensional data associated with the various genomic and other approaches are arriving at our doors frequently, whether we are biostatisticians, bioinformaticians, or computational biologists. I will review some of my experiences, and explain why I think garden-variety biostatisticians have a role to play in the analyses of these data.
BACKGROUND & OBJECTIVE: Genome-wide profiles of tumors obtained using functional genomics platforms are being deposited to the public repositories at an astronomical scale, as a result of focused efforts by individual laboratories and large projects such as the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium. Consequently, there is an urgent need for reliable tools that integrate and interpret these data in light of current knowledge and disseminate results to biomedical researchers in a user-friendly manner. We have built the canEvolve web portal to meet this need.
RESULTS: canEvolve query functionalities are designed to fulfill the most frequent analysis needs of cancer researchers with a view to generating novel hypotheses. canEvolve stores gene, microRNA (miRNA) and protein expression profiles, copy number alterations for multiple cancer types, and protein-protein interaction information. canEvolve allows querying of results of primary analysis, integrative analysis and network analysis of oncogenomics data. The querying for primary analysis includes differential gene and miRNA expression as well as changes in gene copy number measured with SNP microarrays. canEvolve provides results of integrative analysis of gene expression profiles with copy number alterations and with miRNA profiles as well as generalized integrative analysis using gene set enrichment analysis. The network analysis capability includes storage and visualization of gene co-expression, inferred gene regulatory networks and protein-protein interaction information. Finally, canEvolve provides correlations between gene expression and clinical outcomes in terms of univariate survival analysis.
CONCLUSION: At present, canEvolve provides different types of information extracted from 90 cancer genomics studies comprising more than 10,000 patients. The presence of multiple data types, novel integrative analysis for identifying regulators of oncogenesis, network analysis and the ability to query gene lists/pathways are distinctive features of canEvolve. canEvolve will facilitate integrative and meta-analysis of oncogenomics datasets.
ABSTRACT: Measurement error in time-to-event data used as a predictor will lead to inaccurate predictions. This arises in the context of self-reported family history, a time-to-event predictor often measured with error, used in Mendelian risk prediction models. Using a validation data set, we propose a method to adjust for this type of measurement error. We estimate the measurement error process using a nonparametric smoothed Kaplan-Meier estimator, and use Monte Carlo integration to implement the adjustment. We apply our method to simulated data in the context of both Mendelian risk prediction models and multivariate survival prediction models, and illustrate it with a data application for Mendelian risk prediction models. Results from simulations are evaluated using the mean squared error of prediction (MSEP), the area under the receiver operating characteristic curve (ROC-AUC), and the ratio of observed to expected number of events. These results show that our adjusted method mitigates the effects of measurement error mainly by improving calibration and total accuracy. In some scenarios discrimination is also improved.
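The three evaluation measures named above can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' code: outcomes are treated as binary indicators without censoring, the function names (msep, roc_auc, observed_to_expected) are hypothetical, and the toy data are invented.

```python
import numpy as np

def msep(y, p):
    """Mean squared error of prediction: total accuracy."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((y - p) ** 2))

def roc_auc(y, p):
    """ROC-AUC as the probability that a randomly chosen case receives a
    higher predicted risk than a randomly chosen non-case (ties count 1/2).
    Measures discrimination."""
    y = np.asarray(y)
    p = np.asarray(p, dtype=float)
    cases = p[y == 1]
    controls = p[y == 0]
    diff = cases[:, None] - controls[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def observed_to_expected(y, p):
    """Observed events divided by the sum of predicted risks.
    Values near 1 indicate good calibration in the large."""
    return float(np.sum(y) / np.sum(p))

# Toy outcomes and predicted risks, invented purely for illustration.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(msep(y, p), roc_auc(y, p), observed_to_expected(y, p))
```

The three measures answer different questions, which is why a method can improve calibration and total accuracy without improving discrimination.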
Last Update: February 20, 2014