Current Research

Risk prediction of breast cancer following diagnosis of precancerous events
(with Giovanni Parmigiani, Ph.D. and Su-Chun Cheng, Sc.D.)

1. Expansion of the BRCAPRO model to account for DCIS (ductal carcinoma in situ): evaluation of the penetrance of DCIS for individuals carrying BRCA1 or BRCA2 mutations, obtained by applying Bayes’ theorem to extend results known in the literature. Analysis of recurrence of breast cancer after a diagnosis of DCIS.

2. Risk assessment in atypical breast hyperplasia (with Kevin Hughes, M.D.): this research aims to provide a new, comprehensive and validated approach to risk assessment in women diagnosed with atypical hyperplasia of the breast (atypia), contributing to a more efficient, targeted and personalized evaluation of preventive approaches for invasive breast cancer. This will both increase patients’ awareness of their options and risks and better support a rational choice among the available treatments. Submitted for NIH funding.

Microsimulation models for time of onset of precancerous lesions
(with Giovanni Parmigiani, Ph.D.)
This work proposes MCMC techniques to estimate the age-specific density of the time spent in apparent health in a natural history model, before the onset of preclinical and clinical cancer. The same techniques can also be used to calibrate model parameters so that they agree with known data (e.g., incidence or prevalence).
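
As a rough illustration of the calibration idea only, the sketch below runs a random-walk Metropolis sampler on a toy model. Everything in it is an assumption made for the example and not the natural history model of this project: onset is given a single constant hazard, the calibration targets (TARGETS) are made-up prevalence counts by age, and the prior is flat on the log rate.

```python
import math
import random

# Hypothetical calibration targets: (age, number screened, number with onset by that age).
TARGETS = [(40, 1000, 77), (50, 1000, 95), (60, 1000, 113), (70, 1000, 131)]


def log_posterior(log_rate):
    """Binomial log-likelihood of the targets, with a flat prior on log(rate)."""
    rate = math.exp(log_rate)
    total = 0.0
    for age, n, cases in TARGETS:
        # P(onset before `age`) under a constant hazard (toy assumption).
        p = 1.0 - math.exp(-rate * age)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += cases * math.log(p) + (n - cases) * math.log(1.0 - p)
    return total


def metropolis(n_iter=5000, step=0.1, start=math.log(0.01), seed=1):
    """Random-walk Metropolis on log(rate); returns the chain on the rate scale."""
    rng = random.Random(seed)
    current, current_lp = start, log_posterior(start)
    draws = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        draws.append(math.exp(current))
    return draws


draws = metropolis()
kept = draws[1000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(f"posterior mean onset rate per year: {posterior_mean:.5f}")
```

In a realistic setting the constant hazard would be replaced by an age-specific one and the binomial term by whatever likelihood links the simulated histories to the observed incidence or prevalence.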

Methodologies for identifying diagnostic and prognostic variables
(with Giovanni Parmigiani, Ph.D. and Mahlet Tadesse, Ph.D.)
The main goal of this work is to propose an exploratory methodology for high-dimensional data that identifies variables associated with a phenotype when the association may differ across subgroups of subjects. Such relationships are often present in genomic datasets, where a marker may either be relevant for identifying subgroups or be associated with the phenotype of interest.

Two approaches were followed:
1. Quantile sliding and hypothesis testing: the diagnostic variable is selected by a grid search over its quantiles. The effect of the prognostic variable is then assessed in each of the two subgroups defined by a quantile of the diagnostic variable, using either the significance of the interaction coefficient in a saturated GLM based on the two variables, or the p-value of a t-test or a Wilcoxon test for the differential effect of the prognostic variable between the two subgroups. The “top scoring” variables are ranked by the strength of their differential effect.
2. Extension of the Bayesian CART algorithm: we propose a unified method that addresses this problem by combining ideas from model-based clustering and Bayesian CART. This is accomplished with an MCMC procedure in which the transition between configurations is made by randomly choosing between a clustering move and a tree move. For the former, a terminal node may be split into two subclusters, or two connected terminal nodes may be merged. For the latter, one of the four moves suggested in Chipman et al., JASA, 93(443), 1998, and Denison et al., Biometrika, 85(2), 1998, is proposed.
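
The quantile-sliding search in approach 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the data are simulated, the function names (slope_and_se, quantile_slide) and the quantile grid are hypothetical, and a z-type statistic for the difference between the two subgroup slopes stands in for the GLM interaction test, t-test or Wilcoxon test mentioned above.

```python
import random
import statistics


def slope_and_se(x, y):
    """Least-squares slope of y on x and its standard error."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return b, (rss / (n - 2) / sxx) ** 0.5


def quantile_slide(diag, prog, pheno, probs=(0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)):
    """Grid-search quantiles of `diag`; return (prob, cut, score) with the largest score."""
    ordered = sorted(diag)
    best = None
    for p in probs:
        cut = ordered[int(p * (len(ordered) - 1))]
        low = [(x, y) for d, x, y in zip(diag, prog, pheno) if d <= cut]
        high = [(x, y) for d, x, y in zip(diag, prog, pheno) if d > cut]
        if min(len(low), len(high)) < 5:
            continue  # too few subjects to fit a slope
        b_lo, se_lo = slope_and_se(*zip(*low))
        b_hi, se_hi = slope_and_se(*zip(*high))
        # Differential effect of the prognostic variable between the two subgroups.
        score = abs(b_hi - b_lo) / (se_lo ** 2 + se_hi ** 2) ** 0.5
        if best is None or score > best[2]:
            best = (p, cut, score)
    return best


# Simulated example: the prognostic variable matters only when diag > 0.5.
rng = random.Random(0)
n = 400
diag = [rng.random() for _ in range(n)]
prog = [rng.gauss(0.0, 1.0) for _ in range(n)]
pheno = [(2.0 * x if d > 0.5 else 0.0) + rng.gauss(0.0, 0.5)
         for d, x in zip(diag, prog)]

best_prob, best_cut, best_score = quantile_slide(diag, prog, pheno)
print(best_prob, round(best_cut, 3), round(best_score, 1))
```

In a high-dimensional setting the same search would be repeated over candidate pairs of diagnostic and prognostic variables, and the pairs ranked by their best score.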

Collaborative Biostatistician
(with Donna Neuberg, Sc.D.)

Biostatistician for the NeuroStemCell workgroup at the Dana-Farber Cancer Institute (Rosalind Segal, M.D., Ph.D., and Charles Stiles, Ph.D., PIs). Analysis of clinical trial data on secondary AML.

(with Rebecca Gelman, Ph.D.)
Biostatistician for CFAR (Harvard University Center for AIDS Research).

(with Paul Catalano, Sc.D.)
Development of educational materials for in-person and online sessions, in conjunction with the Dana-Farber Cancer Institute’s Clinical Trials Education Office.