My colleagues and I have proposed a taxonomy of biases in causal inference research. We described the structure of the biases by using causal diagrams known as directed acyclic graphs or DAGs, which we cover in our book and edX course. (It is no secret that we love big DAGs.) We distinguish three types of systematic bias: confounding, selection bias, and measurement bias.
Confounding is the bias that arises when treatment and outcome share causes because treatment was not randomly assigned. Economists refer to confounding as “selection bias” or “selection on treatment”, but that terminology is a bit confusing because there is a different type of bias that is due to the selection of individuals into the analysis. Regardless of your preferred terminology, confounding adjustment requires subject-matter knowledge, as we explain here:
- Hernán MA, Hernández-Díaz S, Werler MM, Mitchell AA. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. American Journal of Epidemiology 2002; 155(2):176–184.
Using real data, we argue that blind statistical adjustment for confounding may actually result in more bias than lack of adjustment. We have also explained how causal diagrams can be used to predict the direction of bias due to confounding:
- VanderWeele TJ, Hernán MA, Robins JM. Causal directed acyclic graphs and the direction of unmeasured confounding bias. Epidemiology 2008; 19(5): 720-728.
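The structure of confounding can be made concrete with a toy simulation (mine, not from the papers above; all variable names and parameter values are hypothetical). A shared cause L affects both treatment A and outcome Y, while A has no effect on Y; the crude association is biased away from the null, and adjusting for L recovers it:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# L is a shared cause of treatment A and outcome Y; A has NO effect on Y
L = rng.normal(size=n)
A = (L + rng.normal(size=n) > 0).astype(float)   # treatment influenced by L
Y = 2 * L + rng.normal(size=n)                   # outcome influenced by L only

# Crude (unadjusted) association: strongly biased away from the null
crude = Y[A == 1].mean() - Y[A == 0].mean()

# Adjusting for L (here via ordinary least squares) recovers the null
X = np.column_stack([np.ones(n), A, L])
adjusted = np.linalg.lstsq(X, Y, rcond=None)[0][1]

print(round(crude, 2), round(adjusted, 2))
```

Of course, this works only because L is the right variable to adjust for; the whole point of the paper above is that deciding which variables those are requires causal, not statistical, knowledge.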
We have devoted much thought to the bias that arises from the selection of individuals into the analysis, which we refer to as selection bias. The structure of selection bias under the null can be summarized as conditioning on a collider:
- Hernán MA, Hernández-Díaz S, Robins JM. A structural approach to selection bias. Epidemiology 2004; 15(5):615–625.
(By the way, the above paper also pioneered the use of causal diagrams to represent sufficient component causes.)
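Conditioning on a collider is easy to demonstrate in a quick simulation (again illustrative, not from the paper). Treatment A and outcome Y are independent, so the effect is null, but both affect selection S into the analysis; restricting to the selected opens a noncausal path between A and Y:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# A and Y are independent: the treatment effect is null
A = rng.binomial(1, 0.5, size=n).astype(float)
Y = rng.normal(size=n)

# Selection S is a collider: both A and Y affect who enters the analysis
S = (A + Y + rng.normal(size=n) > 0)

# Association in the full population: approximately zero, as it should be
full = Y[A == 1].mean() - Y[A == 0].mean()

# Association among the selected: conditioning on the collider induces a
# spurious (here negative) association between A and Y under the null
sel = Y[(A == 1) & S].mean() - Y[(A == 0) & S].mean()

print(round(full, 2), round(sel, 2))
```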
Selection bias comes in many flavors. We have discussed how selection bias may be induced by matching in case-control studies, by conditioning on mediators, and by competing events (see here for an introduction to competing events from a causal standpoint). We have also discussed how conventional estimates of the per-exposure effect in infectious disease research may be affected by selection bias. Here I consider forms of selection bias that appear even when the treatment has a non-null effect:
- Hernán MA. Selection bias without colliders. American Journal of Epidemiology 2017; 185 (11): 1048-1050.
- Hernán MA. Discussion on ‘Perils and potentials of self-selected entry to epidemiological studies and surveys’ by Keiding N, Louis TA. Journal of the Royal Statistical Society Series A 2016; 179 (Part 2): 346-347.
Using causal diagrams that jointly represent confounding and selection bias, we proposed a structural approach to represent the familial coaggregation of disorders. And here, inspired by causal diagrams but without explicitly showing them, we describe confounding and selection bias when estimating case fatality risks during outbreaks.
Finally, we have described measurement bias, that is, the bias that arises when data items are measured with error:
- Hernán MA, Cole SR. Causal diagrams and measurement bias. American Journal of Epidemiology 2009; 170(8):959-962.
- VanderWeele T, Hernán MA. Results on differential and dependent measurement error of the exposure and the outcome using signed directed acyclic graphs. American Journal of Epidemiology 2012; 175(12):1303-1310.
P.S. We have also used causal diagrams to represent compound treatments with many versions. Compound treatments are often the elephant in the room in observational research, and the reason why many causal questions are hopelessly vague, as mentioned here.