Seminar with Forrest Crawford – 1/17

MECHANISM AND STRUCTURE: Inferential Approaches for Complex Processes in Epidemiology & Public Health

Many important inferential problems in public health arise from complex biological or behavioral processes over which investigators have little control. “Mechanistic” models capture features of dynamic data-generating processes, permitting inferences with real-world interpretations and detailed predictions. “Agnostic” approaches refrain from specifying the full joint distribution of the data, and provide inferences with desirable robustness properties. Statisticians often disagree about which of these paradigms is more useful, with competing claims about model realism, bias, and credibility of estimates. In this talk, I outline two projects that use complex, highly structured data to balance these concerns and make useful inferences for public health.

1) Social network link-tracing methods like respondent-driven sampling (RDS) are widely used for sampling members of hidden or hard-to-reach populations such as drug users, sex workers, or homeless people. RDS is not a sampling design, but researchers routinely use survey sampling assumptions to make inferences about the network and population of interest. I show formally which network features are identified by RDS studies, and which are not. A continuous-time network-structural view of RDS permits reconstruction of the induced subgraph of respondents. I describe a technique for estimating the size of a hidden population from an RDS sample.

2) Estimating the causal effect of an intervention (e.g., a vaccine) on an infectious disease outcome in an interconnected population is a profoundly difficult problem in epidemiology. Contagion produces interference between subjects because infection in one unit can be transmitted to another. I define a symmetric structural model of infectious disease transmission in continuous time, and investigate the causal features of this process recovered by traditional regression approaches. I exhibit the circumstances under which a regression coefficient in a marginal model implies an effect whose direction is opposite to that of the true treatment effect. I explain these findings in the epidemiologic language of confounding and Simpson’s paradox, and propose alternative estimation strategies.