Department of Biostatistics
Statistical Methods in Epidemiology Working Group
2015 - 2016
ABSTRACT: We propose a network method to monitor health behaviors and identify the general conditions under which it works effectively. The method helps to identify effective informants for monitoring future health behaviors and to triangulate self-reports of sensitive health behaviors. We demonstrate the method by studying the smoking behaviors of over 4000 middle school students in China. Using students' observations of their schoolmates smoking in the past 30 days, we construct smoking detection networks and examine the patterns of smoking detection through exponential random graph models. We find that smokers, optimistic students, and popular students make better informants than their counterparts. We also find that using three to four (or the 3rd quartile of) positive peer reports can uncover a good number of under-reported smokers while not producing excessive false positives. In short, the proposed method may be used to improve future survey designs and data quality at low cost.
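The peer-report threshold described in this abstract can be illustrated with a minimal sketch. It assumes each report is an (observer, observed) pair and that a student is flagged when the number of distinct peers reporting them reaches a threshold while their own self-report is negative. The function name, the student IDs, and the toy data are hypothetical and do not come from the study.

```python
from collections import Counter

def flag_possible_underreporters(reports, self_reported_smokers, threshold=3):
    """Return students with >= threshold distinct peer reports who did not
    self-report smoking. All names and data here are illustrative."""
    # Keep one report per (observer, observed) pair and ignore self-reports.
    distinct_pairs = {(obs, target) for obs, target in reports if obs != target}
    # Count how many distinct peers reported each student.
    counts = Counter(target for _, target in distinct_pairs)
    return sorted(
        student
        for student, n in counts.items()
        if n >= threshold and student not in self_reported_smokers
    )

# Toy data: x is reported by three peers, y by two, z by four.
reports = [
    ("a", "x"), ("b", "x"), ("c", "x"),
    ("a", "y"), ("b", "y"),
    ("a", "z"), ("b", "z"), ("c", "z"), ("d", "z"),
]
self_reported = {"z"}  # z already admits smoking, so z is not flagged
flagged = flag_possible_underreporters(reports, self_reported, threshold=3)
print(flagged)
```

Raising the threshold trades fewer false positives for fewer detected under-reporters, which is the balance the abstract refers to.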
ABSTRACT: Reference charts for fetal measures are used for early detection of pregnancies that should be monitored closely. Construction of reference charts corresponds to estimation of quantiles of a distribution as a function of gestational age. Existing methods have been developed under various modeling assumptions, typically by fitting a polynomial regression to certain functionals of the distributions. We relax the assumptions of a parametric polynomial link between the distribution parameters and age and consider cubic splines and discretization of age in order to compare charts based on more flexible and simpler models, respectively. We compare the different methods using various tools and demonstrate the importance of considering performance measures calculated from age-stratified data. We compare our charts to similar charts that have been recently published and emphasize that the source of an apparent heterogeneity should be discussed.
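The simplest of the approaches mentioned, discretizing gestational age, amounts to computing empirical quantiles within each age stratum. The sketch below assumes age is recorded in completed weeks and that the chart reports the 5th, 50th, and 95th percentiles; the function name, the choice of percentiles, and the toy values are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import quantiles

def reference_chart(measurements, n=20):
    """Map each gestational week to (5th, 50th, 95th) empirical percentiles.

    measurements: iterable of (week, value) pairs (hypothetical layout).
    With n=20, quantiles() returns 19 cut points at 5%, 10%, ..., 95%.
    """
    by_week = defaultdict(list)
    for week, value in measurements:
        by_week[week].append(value)

    chart = {}
    for week, values in sorted(by_week.items()):
        if len(values) < 2:
            continue  # quantiles() needs at least two observations
        cuts = quantiles(values, n=n, method="inclusive")
        chart[week] = (cuts[0], cuts[n // 2 - 1], cuts[-1])
    return chart

# Toy data only: 21 evenly spaced values in week 20, one value in week 21.
toy = [(20, v) for v in range(1, 22)] + [(21, 5)]
chart = reference_chart(toy)
print(chart)
```

A spline or polynomial model replaces the per-week strata with a smooth function of age; the stratified version is useful as a check on such fits.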
ABSTRACT: Because of the challenges of generating a nationally representative sample of the population in South Sudan, the true sickle cell mortality rate is currently unknown. To estimate the excess risk of mortality associated with the SS genotype from an incomplete dataset, we regressed genotype (SS versus AS and AA combined) on age using a generalized estimating equation with a logit link. We used an exchangeable correlation structure within household to account for clustering of genotypes among relatives. We excluded all individuals missing either age or genotype. The age coefficient from the generalized estimating equation is the change in the log odds of having the SS genotype per one-year increase in age. Multiplying the age coefficient by negative one yields the additive increase in the hazard of mortality associated with the SS genotype. The estimated change in log odds of having the SS genotype per one-year increase in age was -0.00058 (95% CI -0.0359, 0.0242). This represents a statistically non-significant 2.9% increase in five-year mortality for individuals with the SS genotype relative to those with AS and AA genotypes.
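The step from the age coefficient to an excess hazard can be sketched as follows. The notation is not from the abstract: $\pi$ is the proportion with the SS genotype at birth (assumed constant across birth cohorts), $S_1, h_1$ and $S_0, h_0$ are the survival and hazard functions for SS and non-SS individuals, and $\delta$ is an assumed constant hazard difference.

```latex
\begin{align*}
\frac{\Pr(\mathrm{SS} \mid \text{alive at age } a)}
     {\Pr(\text{not SS} \mid \text{alive at age } a)}
  &= \frac{\pi \, S_1(a)}{(1-\pi)\, S_0(a)}, \\[4pt]
\log \operatorname{odds}(a)
  &= \operatorname{logit}(\pi) - \int_0^a \bigl\{ h_1(u) - h_0(u) \bigr\} \, du \\
  &= \operatorname{logit}(\pi) - \delta a
  \qquad \text{if } h_1(u) - h_0(u) = \delta \text{ for all } u .
\end{align*}
```

Under these assumptions the slope of the log odds in age is $-\delta$, so the negative of the age coefficient estimates the additive excess hazard per year, and the excess cumulative hazard over five years is $5\delta$.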
ABSTRACT: The missing covariate indicator method has been considered a biased approach and thus dismissed as a method for dealing with missing data on covariates that may confound the exposure-outcome association in epidemiologic studies. However, the magnitude and determinants of such bias have never been assessed. We derived the formula for the relative bias arising from use of the missing covariate indicator method for estimating the relative risk of outcome in relation to exposure. When the covariate is not a confounder but only a risk factor for the outcome, the missing covariate indicator method is unbiased. In addition, we found that the relative bias does not depend on the disease prevalence or the association between the exposure and outcome, but is a function of 5 other parameters: the prevalences of the exposure and covariate, the proportion of missingness of the covariate, the relative risk of outcome in relation to the covariate, and the relationship between the exposure and the covariate. In an extensive numerical study, we found that, over a wide range of these 5 parameters, the median relative bias was always zero across any of the parameters averaged over all the others, and the relative bias exceeded 10% in only 3% of the parameter space explored. In settings with covariate missingness of no greater than 50%, the percentage of scenarios with relative bias greater than 10% was less than 5%. In the Nurses' Health Study and the Health Professionals Follow-up Study, the proportion of missing covariate data is low (for example, <10% for established breast cancer risk factors), and the missing covariate indicator method produced results that were materially the same as those obtained by the multiple imputation method in head-to-head comparisons for a number of exposure-disease associations where reviewers requested multiple imputation.
In conclusion, the missing covariate indicator method is approximately valid in almost all settings typically encountered in epidemiology, and its continued use is recommended unless the covariate is missing in an extreme proportion or acts as a strong confounder, that is, with a relative risk for the outcome in relation to the confounder greater than 5 or with a very strong association between the exposure and confounder. Both conditions rarely occur in practice.
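For readers unfamiliar with the method, the data preparation step can be sketched in a few lines. This is a minimal illustration, assuming missing values are stored as None, that the covariate is set to a fixed reference value when missing, and that a 0/1 indicator column is added so both can enter the regression model. The function name, the column suffix, and the toy records are hypothetical.

```python
def add_missing_indicator(rows, covariate, fill_value=0.0):
    """Return copies of rows with the covariate filled and an indicator added.

    rows: list of dicts, one per subject (hypothetical layout).
    A missing covariate (None or absent) is replaced by fill_value, and
    '<covariate>_missing' is set to 1; otherwise the indicator is 0.
    """
    prepared = []
    for row in rows:
        new_row = dict(row)  # do not modify the caller's data
        is_missing = row.get(covariate) is None
        new_row[covariate] = fill_value if is_missing else row[covariate]
        new_row[covariate + "_missing"] = 1 if is_missing else 0
        prepared.append(new_row)
    return prepared

# Toy records only: the second subject has no recorded BMI.
subjects = [
    {"id": 1, "exposed": 1, "bmi": 24.5},
    {"id": 2, "exposed": 0, "bmi": None},
]
prepared = add_missing_indicator(subjects, "bmi")
print(prepared)
```

The outcome model is then fit with the exposure, the filled covariate, and the indicator as predictors, so that no subject is dropped for a missing covariate value.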
Last Update: January 6, 2016