Molin Wang

Associate Professor in the Departments of Epidemiology and Biostatistics

Department of Epidemiology

Department of Biostatistics


As a faculty member with joint appointments in Epidemiology and Biostatistics, Dr. Wang’s current methodological research focuses on statistical challenges encountered in epidemiological studies, including assessment of timing of effect in survival analysis, causal inference for clustered data, evaluation of etiological disease heterogeneity, methods for pooling biomarker data, and measurement error and missing data problems. She has also developed semi-parametric methods for reducing the impact of nuisance parameters.

She has served as lead statistician for the Nurses’ Health Study II and the Health Professionals Follow-up Study, as well as for several projects based on the Harvard cohorts. She is also the lead statistician for the Pooling Project on Diet and Cancer in Women and Men, the Circulating Biomarkers and Breast and Colorectal Cancer Consortium, the Pooling Project on Gestational Weight Gain in Low- and Middle-income Countries, and several HIV studies conducted in Uganda, Tanzania, and India. She actively works on analyses, provides input into the development of analytic procedures and their interpretation, and oversees software development for the routine implementation of advanced and novel statistical methods.

Before October 2010, she worked on statistical collaborations in various oncology projects with Harvard – Dana Farber Cancer Institute biomedical investigators and the Eastern Cooperative Oncology Group (ECOG), with a focus on the design and analysis of breast cancer clinical trials.

Here are SAS macros and R functions developed by Dr. Wang’s team.


Ph.D. in Biostatistics, Emory University

Selected First/Senior-Author Publications

          1. Wang M, Hanfelt JJ. Adjusted profile estimating function. Biometrika 2003;90:845-858.
          2. Wang M, Williamson J, Redline S. A semiparametric method for analyzing matched case-control family studies with a continuous outcome and proband sampling. Biometrics 2004;60(3):644-650.
          3. Wang M, Williamson JM. Generalization of the Mantel-Haenszel estimating function for sparse clustered binary data. Biometrics 2005; 61:973-981.
          4. Wang M, Fitzmaurice G. A simple imputation method for longitudinal studies with non-ignorable non-responses. Biom J 2006;48:302-318.
          5. Wang M, Hanfelt JJ. Orthogonal locally ancillary estimating functions for matched-pair studies and errors-in-covariates. J R Stat Soc-Series B 2007;69:411-428.
          6. Dahlberg SE, Wang M. A proportional hazards cure model for the analysis of time to event with frequently unidentifiable causes. Biometrics 2007; 63(4):1237-1244.
          7. Wang M, Hanfelt JJ. Robust modified profile estimating function with application to the generalized estimating equation. J Stat Planning Inference 2008;138:2029-2044.
          8. Wang M, Hanfelt JJ. A robust method for finely stratified familial studies with proband-based sampling. Biostatistics 2009;10:364-373. PMCID: PMC2648900
          9. Zhang JJ, Wang M. An accelerated failure time cure model for time-to-event data with masked cause of failure. Biom J 2009; 51:932-945.
          10. Wang M, Liao X, Spiegelman D. Can efficiency be gained by correcting for misclassification? Journal of Statistical Planning and Inference. 2013;143(11):1980-87. PMCID: PMC3810993
          11. Wang M, Kuchiba A, Ogino S. A Meta-Regression Method for Studying Etiologic Heterogeneity across Disease Subtypes Classified by Multiple Biomarkers. American Journal of Epidemiology 2015;182(3):263-70.
          12. Wang M, Spiegelman D, Kuchiba A, Lochhead P, Kim S, Chan AT, Poole EM, Tamimi R, Tworoger SS, Giovannucci E, Rosner B, Ogino S. Statistical methods for studying disease subtype heterogeneity. Stat Med. 2016; 35(5): 782-800.
          13. Wang M, Liao X, Laden F, Spiegelman D. Quantifying risk over the life course-latency, age-related susceptibility, and other time-varying exposure metrics: estimation and inference in prospective cohort studies. Stat Med. 2016;35(13):2283-95.
          14. Liu L*, Nevo D* (*co-first author), Nishihara R, Cao Y, Song M, Twombly TS, Chan AT, Giovannucci EL, VanderWeele TJ, Wang M†, Ogino S† (†co-senior author). Utility of inverse probability weighting in molecular pathological epidemiology. Eur J Epidemiol. 2018;33(4):381-392.
          15. Nevo D, Nishihara R, Ogino S, Wang M. The competing risks Cox model with auxiliary case covariates under weaker missing-at-random cause of failure. Lifetime Data Anal. 2018; 24(3):425-442.
          16. Aschard H, Spiegelman D, Laville V, Kraft P, Wang M. A test for gene-environment interaction in the presence of measurement error in the environmental variable. Genetic Epidemiology. 2018;42(3):250-264.
          17. Nevo D, Hamada T, Ogino S, Wang M. A novel calibration framework for survival analysis when a binary covariate is measured at sparse time points. Biostatistics. 2018. doi: 10.1093/biostatistics/kxy063. [Epub ahead of print].
          18. Pesko S, Spiegelman D, Wang M. There is no impact of exposure measurement error on latency estimation in linear models. Statistics in Medicine. 2019;38(7):1245-1261.
          19. Sloan A, Song Y, Gail M, Betensky R, Rosner B, Ziegler RG, Smith-Warner SA, Wang M. Design and analysis considerations for combining data from multiple biomarker studies. Statistics in Medicine. 2019;38(8):1303-1320.
          20. Sloan A, Smith-Warner SA, Ziegler RG, Wang M. Statistical methods for biomarker data pooled from multiple nested case-control studies. Biostatistics. 2019. doi: 10.1093/biostatistics/kxz051. [Epub ahead of print].