Software For Downloading

The development of the software provided here has been supported by the following grants: NIH ES009411, CA050597, CA081345, and CA055075.

Here are some of the people who develop and maintain this software, analyze the data, and design the studies, on the occasion of the celebration of 16 years of working with Ellen Hertzmark.

Software for measurement error and misclassification correction

%blinplus Implementing Rosner B, Spiegelman D, Willett W. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. American Journal of Epidemiology 1990; 132:734-745.

betacomp.f Implementing Spiegelman D, Rosner B. Estimation and inference for binary data with covariate measurement error and misclassification for main study/validation study designs.  Journal of the American Statistical Association, 2000; 95:51-61.

goodwin.f77 Implementing Crouch EAC, Spiegelman D. The evaluation of integrals of the form ∫ f(t) exp{-t²} dt: Application to logistic-normal models. Journal of the American Statistical Association 1990; 85:464-469.
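
For readers who want a feel for the kind of integral named in the entry above, here is a minimal Python sketch. It assumes a plain trapezoidal rule on an evenly spaced grid, which is one simple way to approximate such integrals; the step size, the truncation point, and all function names are illustrative choices, not taken from goodwin.f77.

```python
import math

def gauss_weighted_integral(f, h=0.5, k_max=20):
    """Approximate the integral of f(t) * exp(-t**2) over the real line
    with the trapezoidal rule on the grid t = k * h, k = -k_max..k_max.
    h and k_max are illustrative defaults, not values from goodwin.f77."""
    total = 0.0
    for k in range(-k_max, k_max + 1):
        t = k * h
        total += f(t) * math.exp(-t * t)
    return h * total

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def logistic_normal_mean(mu, sigma):
    """E[logistic(mu + sigma * Z)] for Z ~ N(0, 1).
    Substituting z = sqrt(2) * t turns the normal density into exp(-t**2),
    leaving a factor of 1 / sqrt(pi) outside the integral."""
    g = lambda t: logistic(mu + sigma * math.sqrt(2.0) * t)
    return gauss_weighted_integral(g) / math.sqrt(math.pi)

# With f(t) = 1 the exact value of the integral is sqrt(pi).
check = gauss_weighted_integral(lambda t: 1.0)
```

Because the integrand decays like exp(-t²), truncating the grid at |t| = 10 discards a negligible tail for smooth, bounded f such as the logistic function.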

Multsurr method Implementing Weller E, Milton D, Eisen E, Spiegelman D. Regression calibration for logistic regression with multiple surrogates for one exposure. Journal of Statistical Planning and Inference 2007; 137:449-461. An S-PLUS version is also available.

%relibpls8 Implementing Rosner B, Spiegelman D, Willett W. Correction of logistic regression relative risk estimates and confidence intervals for random within-person measurement error. American Journal of Epidemiology 1992; 136:1400-1413.

%rrc Implementing the new method developed in Liao X, Zucker D, Li Y, Spiegelman D. Survival analysis with error-prone time-varying covariates: a risk set calibration approach. Biometrics 2011 Mar; 67(1):50-58.

Software for study design/power calculation

ge_int.f Implementing Foppa I and Spiegelman D. Power and sample size calculations for case-control studies of gene-environment interactions with a polytomous exposure variable. American Journal of Epidemiology 1997; 146:596-604.

ge_trend_v2 Implementing power and sample size calculations developed in Spiegelman D and Logan R. Power and sample size for case-control studies of gene-environment interactions: a new method with comparison to old.  Submitted for publication, American Journal of Epidemiology, January, 2002.

holcroft.f77  Implementing Holcroft C, Spiegelman D.   Design of validation studies for estimating the odds ratio of exposure-disease relationships when exposure is misclassified. Biometrics, 1999; 55:1193-1201.

OPTITXS.r Implementing the sample size and power calculation methods for longitudinal (repeated measures) studies described in The Design of Observational Longitudinal Studies.

Software for studies of disease heterogeneity

%contrasttest The %contrasttest macro conducts a heterogeneity test comparing the exposure-disease associations obtained from separate subtype-specific analyses of cohort or nested case-control studies.

%icc9 Intraclass correlation coefficients (ICC) and their 95 percent confidence intervals. Hankinson SE, Manson JE, Spiegelman D, Willett WC, Longcope C, Speizer FE. Reproducibility of plasma hormone levels in postmenopausal women over a two to three year period. Cancer Epidemiology, Biomarkers and Prevention 1995; 4:649-654.
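
As a rough illustration of what an intraclass correlation coefficient measures, the Python sketch below computes the textbook one-way random-effects ICC for balanced data (every subject measured the same number of times). This is an assumed, simplified estimator for illustration only: the page does not say which ICC form %icc9 uses, and the sketch does not compute confidence intervals. The function name is hypothetical.

```python
def icc_oneway(measurements):
    """One-way random-effects ICC for balanced data.

    measurements: list of per-subject lists, each holding k >= 2 repeated
    values. Illustrative estimator only; not the %icc9 algorithm.
    """
    n = len(measurements)
    k = len(measurements[0])
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 values")

    subject_means = [sum(row) / k for row in measurements]
    grand_mean = sum(subject_means) / n

    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (x - m) ** 2
        for row, m in zip(measurements, subject_means)
        for x in row
    ) / (n * (k - 1))

    # Undefined (division by zero) if every value in the data is identical.
    return (msb - msw) / (msb + (k - 1) * msw)
```

An ICC near 1 means repeated measurements on the same person agree closely relative to the variation between people, which is the sense of reproducibility examined in the cited paper.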

%meta_subtype_trend The %meta_subtype_trend macro tests whether the exposure-subtype association has a trend across ordinal cancer subtypes. The user either runs separate Cox models (for cohort studies) or conditional logistic models (for nested case-control studies) for each subtype and then tests the heterogeneity hypothesis using the output of those models, or takes the estimates (and standard errors) from the literature and tests the heterogeneity hypothesis with them. In the subtype-specific analyses, the confounder-disease associations are allowed to differ among the subtypes.
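
The idea of testing for a trend from subtype-specific estimates and standard errors can be sketched in Python as an inverse-variance weighted regression of the estimates on an ordinal subtype score. This is a generic meta-regression sketch under the assumption that the subtype-specific estimates are independent; it is not necessarily the test the macro performs, and the function name and default scores (1, 2, 3, ...) are hypothetical.

```python
import math

def subtype_trend_test(estimates, std_errors, scores=None):
    """Weighted least-squares trend of log relative risks across subtypes.

    estimates, std_errors: one value per ordinal subtype.
    scores: ordinal scores; defaults to 1..K (an illustrative choice).
    Returns (slope, se_slope, z, two_sided_p). Assumes independent estimates.
    """
    if scores is None:
        scores = list(range(1, len(estimates) + 1))
    weights = [1.0 / se ** 2 for se in std_errors]
    w_sum = sum(weights)

    # Weighted means of the scores and of the estimates.
    s_bar = sum(w * s for w, s in zip(weights, scores)) / w_sum
    b_bar = sum(w * b for w, b in zip(weights, estimates)) / w_sum

    sxx = sum(w * (s - s_bar) ** 2 for w, s in zip(weights, scores))
    sxy = sum(
        w * (s - s_bar) * (b - b_bar)
        for w, s, b in zip(weights, scores, estimates)
    )

    slope = sxy / sxx
    se_slope = math.sqrt(1.0 / sxx)
    z = slope / se_slope
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return slope, se_slope, z, p_value
```

For example, `subtype_trend_test([0.1, 0.2, 0.3], [0.1, 0.1, 0.1])` regresses three log relative risks on the scores 1, 2, 3 and returns the slope together with its standard error, z statistic, and p-value.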


%subtype Macro to examine whether the effects of the exposure vary by subtypes of a disease. It can be applied to data from cohort studies, nested or matched case-control studies, unmatched case-control studies, and case-case studies.

Software for meta-analysis

%metaanal Produces DerSimonian-Laird estimators for fixed and random effects models in meta- and pooled analysis.
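
As background on the estimators named above, the following Python sketch computes the standard inverse-variance fixed-effects pooled estimate and the DerSimonian-Laird method-of-moments random-effects estimate from study-level estimates and variances. It illustrates the textbook formulas only; the function name and return layout are hypothetical and do not reflect the macro's interface or output.

```python
import math

def dersimonian_laird(estimates, variances):
    """Fixed-effects and DerSimonian-Laird random-effects pooled estimates.

    estimates: study-specific effect estimates (e.g. log relative risks).
    variances: their within-study variances. Needs at least two studies.
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need matching lists with at least two studies")

    # Inverse-variance fixed-effects estimate.
    w = [1.0 / v for v in variances]
    w_sum = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / w_sum

    # Cochran's Q and the moment estimate of between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = w_sum - sum(wi ** 2 for wi in w) / w_sum
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects estimate with weights 1 / (v_i + tau^2).
    w_star = [1.0 / (v + tau2) for v in variances]
    w_star_sum = sum(w_star)
    random = sum(wi * yi for wi, yi in zip(w_star, estimates)) / w_star_sum

    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / w_sum),
        "q": q,
        "tau2": tau2,
        "random": random,
        "random_se": math.sqrt(1.0 / w_star_sum),
    }
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effects weights and the two pooled estimates coincide.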

%metadose SAS macro for meta-analysis of dose-response. It is used when only limited data are available from research reports studying the same dose-response relationship with different exposure or treatment levels. It is a two-step macro: first, for each study, it uses the Greenland method (AJE, 1992) to get a single pooled estimate and its variance estimate across the different exposure or treatment levels; second, it performs a meta-analysis of all relevant studies using the pooled numbers. Submitted for publication, 2010.

tcs Implementing Takkouche B, Cadarso-Suárez C, Spiegelman D. An evaluation of old and new tests for heterogeneity in meta-analysis for epidemiologic research. American Journal of Epidemiology 1999; 150:206-215.

Software for analysis/graphics

%glmcurv9 The %GLMCURV9 macro uses SAS PROC GENMOD and restricted cubic splines to test whether there is a nonlinear relation between a continuous exposure and an outcome variable. The macro can automatically select spline variables for a model. It produces a publication-quality graph of the relationship.

%kmplot9 Makes publication-quality Kaplan-Meier plots of survival data, following JAMA guidelines. Produces numerical output of the censoring summary, as well as of tests among subgroups (e.g., log-rank).

%lefttrunc Makes publication-quality Kaplan-Meier-type curves using left-truncated data for a whole sample or for subgroups/strata.

%lgtphcurv9 Implementing Durrleman and Simon’s restricted cubic spline methodology to fit possibly non-linear exposure-response curves in Cox and logistic regression models. Publication-quality graphs are provided, and a stepwise knot selection procedure is available to enhance the flexibility of the method. Govindarajulu U, Spiegelman D, Thurston SW, Eisen EA. Comparing smoothing techniques for modeling exposure-response curves in Cox models. Statistics in Medicine 2007; 26:3735-3752.

%mediate Calculates the point and interval estimates of the percent of treatment (exposure) effect (PTE) explained by an intermediate variable.

%par Computing full and partial population attributable risks and their confidence intervals, for cohort studies.  Cancer Causes Control 2007 Jun;18(5):571-9

%relrisk9 Implementing log-binomial and log-Poisson models to get risk, prevalence and rate ratios and risk, prevalence and rate differences.  Am J Epidemiol 2005;162:199–200.

%robreg9 Computes robust (empirical) standard errors and p-values for linear regression, for situations where it is reasonable to use PROC REG. Also gives point and interval estimates of effect on the (unitless) percent change scale.

%table1 Produces a publication-quality MS Word table with a breakdown of study/cohort characteristics, typically by categories of an exposure variable.

%yoll Uses PROC PHREG to compute the time from a specific start time (or age) to an outcome (expected time after the start time to the outcome) or the time from the outcome to a specific time (or age) (expected time lost before the end time).

Miscellaneous/other software and tutorials

%int2way Makes all the 2-way interaction variables from a list of variables.
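
To show what "all the 2-way interaction variables" amounts to, here is a small Python sketch that forms the product of every unordered pair of variables in a list. The function name, the row-as-dictionary data layout, and the `_x_` naming pattern for the new variables are all hypothetical; the macro's own naming convention is not described on this page.

```python
from itertools import combinations

def make_two_way_interactions(rows, variables, sep="_x_"):
    """Add a product term for every unordered pair of the given variables.

    rows: list of dicts mapping variable name to numeric value.
    variables: names to cross; n names give n * (n - 1) / 2 new columns.
    sep: illustrative separator used to name the new columns.
    """
    out = []
    for row in rows:
        new_row = dict(row)  # leave the input rows unchanged
        for a, b in combinations(variables, 2):
            new_row[f"{a}{sep}{b}"] = row[a] * row[b]
        out.append(new_row)
    return out

example = make_two_way_interactions(
    [{"age": 2, "bmi": 3, "smoker": 1}],
    ["age", "bmi", "smoker"],
)
```

With three input variables the sketch adds three columns: `age_x_bmi`, `age_x_smoker`, and `bmi_x_smoker`.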

%pctl9  This macro is intended to make any desired number of quantiles for a list of variables. It can also make quantile indicators and median-score trend variables. A subset of the data can be used to determine the quantile boundaries.
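
The three outputs described above (quantile groups, quantile indicators, and median-score trend variables) can be sketched in Python as follows. This is an illustration under stated assumptions: cut points come from Python's `statistics.quantiles`, whose percentile definition may differ from the one SAS uses, and the function and argument names are hypothetical.

```python
import statistics

def quantile_variables(values, n_groups, boundary_values=None):
    """Build quantile groups, indicator variables, and median-score trend values.

    values: the variable to categorize.
    n_groups: number of quantile groups (4 for quartiles, 5 for quintiles).
    boundary_values: optional subset used to set the cut points; defaults
        to all of values.
    """
    source = values if boundary_values is None else boundary_values
    cuts = statistics.quantiles(source, n=n_groups)  # n_groups - 1 cut points

    # Group index 0..n_groups-1: number of cut points strictly below the value.
    groups = [sum(v > c for c in cuts) for v in values]

    # One 0/1 indicator per quantile group.
    indicators = [[1 if g == j else 0 for j in range(n_groups)] for g in groups]

    # Median-score trend variable: each value is replaced by its group's median.
    medians = {
        g: statistics.median([v for v, gv in zip(values, groups) if gv == g])
        for g in set(groups)
    }
    trend = [medians[g] for g in groups]
    return groups, indicators, trend

groups, indicators, trend = quantile_variables([1, 2, 3, 4, 5, 6, 7, 8], 4)
```

Entering the median-score variable as a single continuous term in a regression model is a common way to test for a linear trend across quantile groups.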

Tutorial: How to control finely for confounding using continuous variables that may have a non-linear association with the outcome

Tutorial: How to run a longitudinal GEE model with very large datasets in a reasonable amount of CPU time