Structure of Bias

My colleagues and I have proposed a taxonomy of biases in causal inference research. We describe the structure of these biases using causal diagrams known as directed acyclic graphs (DAGs), which we cover in our book and edX course. (It is no secret that we love big DAGs.) We distinguish three types of systematic bias: confounding, selection bias, and measurement bias.

Confounding is the bias that arises when treatment and outcome share causes because treatment was not randomly assigned. Economists refer to confounding as “selection bias” or “selection on treatment”, but that terminology is a bit confusing because there is a different type of bias that is due to selection of individuals into the analysis. Regardless of your preferred terminology, confounding adjustment requires subject-matter knowledge, as we explain here:

Using real data, we argue that blind statistical adjustment for confounding may actually result in more bias than lack of adjustment. We have also explained how causal diagrams can be used to predict the direction of bias due to confounding:
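As a toy illustration of confounding and of the direction of the resulting bias, here is a minimal simulation sketch. It is not taken from the papers above: the variable names (L for the shared cause, A for treatment, Y for outcome), the probabilities, and the function names are all assumptions chosen for illustration. Treatment has no effect on the outcome, so any crude association is bias. Standardizing over L removes it in this simple setting.

```python
import random

def simulate(p_a_given_l=(0.2, 0.8), n=200_000, seed=0):
    """Draw (L, A, Y) where L causes both A and Y, and A has NO effect on Y.

    p_a_given_l = (P(A=1 | L=0), P(A=1 | L=1)); all numbers are illustrative.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = rng.random() < 0.5                      # shared cause L
        a = rng.random() < p_a_given_l[int(l)]      # arrow L -> A
        y = rng.random() < (0.6 if l else 0.2)      # arrow L -> Y, no arrow A -> Y
        rows.append((l, a, y))
    return rows

def risk(rows, a, l=None):
    """Proportion with Y=1 among rows with A == a (and L == l, if l is given)."""
    ys = [yi for li, ai, yi in rows if ai == a and (l is None or li == l)]
    return sum(ys) / len(ys)

def crude_rd(rows):
    """Unadjusted risk difference comparing A=1 with A=0."""
    return risk(rows, True) - risk(rows, False)

def standardized_rd(rows):
    """L-specific risk differences averaged with weights P(L = l)."""
    total = 0.0
    for l in (False, True):
        weight = sum(1 for li, _, _ in rows if li == l) / len(rows)
        total += weight * (risk(rows, True, l) - risk(rows, False, l))
    return total

up = simulate((0.2, 0.8))    # L raises both A and Y
down = simulate((0.8, 0.2))  # L lowers A but raises Y
print(f"L raises A: crude RD = {crude_rd(up):.3f}, standardized RD = {standardized_rd(up):.3f}")
print(f"L lowers A: crude RD = {crude_rd(down):.3f}, standardized RD = {standardized_rd(down):.3f}")
```

With these made-up probabilities, the crude risk difference is about 0.24 in expectation when L raises both A and Y, and about -0.24 when L lowers A but raises Y, while the standardized risk difference is about 0 in both cases. The sign of the bias follows the signs of the two arrows out of L. That rule of thumb is only shown here for this simple binary example.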

We have devoted much thought to the bias that arises from the selection of individuals into the analysis, which we refer to as selection bias. The structure of selection bias under the null can be summarized as conditioning on a collider:
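To see what conditioning on a collider does under the null, consider the following minimal simulation sketch. The setup is hypothetical and not drawn from the paper: A and Y are generated independently, and selection into the analysis (S) is more likely when either A or Y is present. The probabilities and function names are assumptions.

```python
import random

def simulate(n=200_000, seed=0):
    """Draw (A, Y, S): A and Y are independent; both cause selection S."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.random() < 0.5
        y = rng.random() < 0.5                          # no arrow A -> Y
        s = rng.random() < (0.9 if (a or y) else 0.1)   # collider: A -> S <- Y
        rows.append((a, y, s))
    return rows

def risk_difference(rows):
    """Risk of Y among A=1 minus risk of Y among A=0."""
    treated = [y for a, y, _ in rows if a]
    untreated = [y for a, y, _ in rows if not a]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

rows = simulate()
selected = [r for r in rows if r[2]]    # restrict the analysis to S=1

print(f"everyone:      RD = {risk_difference(rows):.3f}")
print(f"selected only: RD = {risk_difference(selected):.3f}")
```

In the full population the risk difference is about 0, as it should be under the null. Among the selected, it is about -0.4 in expectation with these numbers, even though A has no effect on Y. The association is created entirely by restricting the analysis to S=1.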

(By the way, the above paper also pioneered the use of causal diagrams to represent sufficient component causes.)

Selection bias comes in many flavors. We have discussed how selection bias may be induced by matching in case-control studies, by conditioning on mediators, and by competing events (see here for an introduction to competing events from a causal standpoint). We have also discussed how conventional estimates of per-exposure effect in infectious disease research may be affected by selection bias. Here I consider forms of selection bias that appear even when the treatment has a non-null effect:

Using causal diagrams that jointly represent confounding and selection bias, we proposed a structural approach to represent the familial coaggregation of disorders. And here, inspired by causal diagrams but without explicitly showing them, we describe confounding and selection bias when estimating case fatality risks during outbreaks.

Finally, we have described measurement bias, that is, the bias that arises when data items are measured with error:
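One simple case of measurement bias can be sketched in a few lines. The example below is an assumption-laden illustration, not the analysis from the paper: the true treatment A has an effect on Y, but we only observe a misclassified version A* whose errors do not depend on Y (nondifferential error). The sensitivity, specificity, and risks are made-up values.

```python
import random

def simulate(sensitivity=0.8, specificity=0.8, n=200_000, seed=0):
    """Draw (A, A_star, Y): A affects Y; A_star is A measured with error."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.random() < 0.5
        y = rng.random() < (0.5 if a else 0.2)      # true risk difference 0.3
        if a:
            a_star = rng.random() < sensitivity     # truly treated, recorded as treated
        else:
            a_star = rng.random() >= specificity    # truly untreated, recorded as treated
        rows.append((a, a_star, y))
    return rows

def risk_difference(rows, column):
    """Risk difference in Y by the treatment in `column` (0 = true A, 1 = measured A*)."""
    exposed = [r[2] for r in rows if r[column]]
    unexposed = [r[2] for r in rows if not r[column]]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

rows = simulate()
print(f"using true A:      RD = {risk_difference(rows, 0):.3f}")
print(f"using measured A*: RD = {risk_difference(rows, 1):.3f}")
```

With these numbers the risk difference based on the true treatment is about 0.3, while the one based on the mismeasured treatment is about 0.18 in expectation. This attenuation is specific to this simple setup. Other measurement error structures, for example errors that depend on the outcome, can bias the estimate in either direction.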

P.S. We have also used causal diagrams to represent compound treatments with many versions. Compound treatments are often the elephant in the room in observational research and the reason why many causal questions are hopelessly vague, as mentioned here.