Core C: Biostatistics

ABSTRACT

The Biostatistics Core will provide centralized statistical and analytical expertise to all Center projects. The Core Faculty are drawn from the Environmental Statistics Program within the Department of Biostatistics and the Exposure, Epidemiology, and Risk Program within the Department of Environmental Health at the Harvard School of Public Health. Core members bring expertise in the general statistical methods needed for the projects, such as linear regression and ANOVA, correlated data analysis (including longitudinal and spatial data analysis), measurement error, generalized additive models, meta-analysis, structural equation models, and Bayesian data analysis.

The Biostatistics Core will provide: 1, support for statistical analysis for all five proposed projects, including substantial design consultation and analytical work; and 2, training for investigators in both the statistical issues involved in the data analysis and the use of SAS and Splus/R software. In addition, the Core will 3, conduct mission-related methodological research when existing methodology does not fully address the scientific question of interest.

Power and sample size calculations are a critical component of all of the projects. Prospective calculations allow project investigators to be confident that the Center projects will have high power to detect meaningful differences. Core investigators have worked closely with Center investigators to: 1, determine effect sizes of interest; and 2, calculate the number of samples necessary to achieve a desired level of power, usually 80% or 90%.
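As an illustration of the kind of prospective calculation described above, the following minimal sketch computes the per-group sample size for a two-sided comparison of two group means, using the standard normal-approximation formula n = 2((z_{1-alpha/2} + z_{power}) / d)^2. The function name `n_per_group`, the two-group design, the standardized effect size d = 0.5, and the 5% significance level are assumptions chosen for illustration; they are not taken from the Center projects, whose designs would call for project-specific calculations.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, power=0.80, alpha=0.05):
    """Per-group sample size for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardized effect size (difference in means divided by
    the common standard deviation). Hypothetical helper for illustration only.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < power < 1 and 0 < alpha < 1):
        raise ValueError("power and alpha must lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Assumed standardized effect size of 0.5 at the two power levels in the text.
    for target in (0.80, 0.90):
        print(f"power={target:.0%}: n per group = {n_per_group(0.5, power=target)}")
```

With these assumed inputs the sketch gives 63 subjects per group for 80% power and 85 per group for 90% power. An exact calculation based on the t distribution would give slightly larger numbers, and designs with correlated or repeated measurements would need a different formula.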

To the extent allowable by design and outcome commonalities among the five projects, Core investigators will ensure that a unified approach to data analysis, in terms of modeling strategy and choice of data transformations, is applied to all Center data. Exploratory data analysis techniques, such as univariate explorations of the data, distributional checks, and outlier identification, will be applied to data from all projects. During model building, residual analysis and other model diagnostics will be routinely employed to confirm model fit, identify possible nonlinear relationships between predictors and outcomes, and identify highly influential data points. Once the data have been checked and modeling assumptions verified, primary analysis methods will include ANOVA and regression techniques, with the particular form and correlation structure of the data dictating the particular method. The main methods of analysis will be linear models/generalized linear models, multi-way ANOVA, semi-parametric regression modeling, and mixed/multivariate models for correlated responses.
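The residual and influence checks mentioned above can be sketched for the simplest case, a straight-line regression with one predictor. The example below fits the line by ordinary least squares and reports, for each observation, the residual, the leverage, and Cook's distance, a common summary of how much a single point influences the fit. The function name `simple_ols_diagnostics` and the toy data are hypothetical and chosen only for illustration; the Core's actual analyses would rely on established statistical software and on models suited to each project's design.

```python
def simple_ols_diagnostics(x, y):
    """Fit y = a + b*x by least squares and return per-point diagnostics.

    Returns (intercept, slope, residuals, leverages, cooks_distances).
    Hypothetical helper for illustration; requires n > 2 and non-constant x.
    """
    n = len(x)
    if n != len(y) or n <= 2:
        raise ValueError("x and y must have equal length greater than 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    p = 2                                             # intercept and slope
    s2 = sum(e ** 2 for e in residuals) / (n - p)     # residual variance
    if s2 == 0:
        raise ValueError("perfect fit: Cook's distance is undefined")

    # Leverage of point i in simple linear regression.
    leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Cook's distance combines residual size and leverage.
    cooks = [
        (e ** 2 / (p * s2)) * (h / (1 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]
    return intercept, slope, residuals, leverages, cooks


if __name__ == "__main__":
    # Toy data: nine points on the line y = 2x + 1, plus one point far from it.
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
    y = [3, 5, 7, 9, 11, 13, 15, 17, 19, 10]
    _, _, _, lev, cooks = simple_ols_diagnostics(x, y)
    worst = max(range(len(x)), key=lambda i: cooks[i])
    print(f"most influential point: index {worst}, x={x[worst]}, y={y[worst]}")
```

In the toy data the last observation has both high leverage and a large residual, so it has by far the largest Cook's distance. In practice a flagged point is a prompt for closer inspection rather than automatic removal.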

Methods developed by Core investigators with support from the current Center will play a large role in future Center investigations. These techniques include smoothing methods, distributed lag models, exposure measurement error corrections, and case-crossover analyses. Future methodological developments will focus on spatial modeling of pollution, methods to address model uncertainty, and methods to assess the health effects of complex pollution mixtures.

1. OBJECTIVES

The Biostatistics Core will provide centralized statistical and analytical expertise to all projects. The Core Faculty are drawn from the Environmental Statistics Program within the Department of Biostatistics and the Exposure, Epidemiology, and Risk Program within the Department of Environmental Health at the Harvard School of Public Health. Core members bring expertise in the general statistical methods needed for the projects, such as linear regression and ANOVA, correlated data analysis (including longitudinal and spatial data analysis), measurement error, generalized additive models, meta-analysis, structural equation models, and Bayesian data analysis. The Biostatistics Core will provide: