My research explores how advances in causal inference, statistical machine learning, and computational statistics can empower discovery in the biomedical and health sciences. I focus primarily on the development of model-agnostic, assumption-lean statistical inference procedures, doing so while emphasizing a science-first, translational philosophy that stresses the rich interplay between the applied sciences and statistical methodology--how emerging questions in the former spur advances in the latter, which, in turn, help to refine the scientific discovery process. This approach leverages causal inference as a framework for the translation of scientific questions into interpretable statistical estimands, and then aims to formulate analytic methods that incorporate flexible learning strategies (i.e., machine learning) and draw upon semi-parametric efficiency theory to avoid imposing modeling restrictions that cannot be justified by domain knowledge alone. I am also interested in high-performance statistical computing and the role that open-source software/programming play in the responsible practice of applied statistics and statistical data science, especially as these relate to the promotion of transparent, reproducible, and replicable science.
My methodological work often draws upon tools and ideas from semi- and non-parametric inference, high-dimensional and large-scale inference, targeted or debiased machine learning (e.g., targeted minimum loss estimation, method of sieves), and computational statistics. Areas of recent focus include the study of (1) inference on treatment effects from data collected via biased, outcome-dependent sampling designs, including extensions to sequentially adaptive sampling schemes; (2) causal effect heterogeneity for optimal treatment regime and subgroup discovery; (3) doubly robust and propensity score approaches for evaluating dose-response phenomena; (4) causal mediation analysis (i.e., direct and indirect effects) for investigating questions of mechanism; and (5) safely drawing causal inferences from data exhibiting network dependence or interference structures.
My past substantive collaborations have spanned diverse areas of the biomedical and public health sciences, from toxicology and computational biology to environmental health and nutritional epidemiology. Recently, I've been captivated by the rich scientific and statistical problems that abound in the infectious disease sciences, especially the evaluation of investigational therapeutic and vaccine regimens via clinical trials and observational studies, as well as in public health virology/microbiology and infectious disease epidemiology. My work has contributed novel methods and insights for immune correlates analyses of vaccine efficacy trials (of HIV, COVID-19, and malaria), clinical trials of therapeutics and curatives (of COVID-19 and TB/HIV co-infection), and observational studies of the post-acute sequelae of COVID-19.
Here are a few reflections on the intertwined philosophies of science and of statistics that have shaped my own perspective:
"Far better an approximate answer to the right question, which is often vague, than the exact answer to the wrong question, which can always be made precise." --John Tukey
"Everyone is sure of this [that errors are normally distributed]...since the experimentalists believe that it is a mathematical theorem, and the mathematicians that it is an experimentally determined fact." --Henri Poincaré
"Science is the belief in the ignorance of experts." --Richard Feynman
BA, 2015, Molecular & Cell Biology, Psychology, Public Health
University of California, Berkeley, Berkeley, CA, USA
MA, 2017, Biostatistics
University of California, Berkeley, Berkeley, CA, USA
PhD, 2021, Biostatistics
University of California, Berkeley, Berkeley, CA, USA
Postdoc, 2022, Causal Inference, Machine Learning
Weill Cornell Medicine, New York, NY, USA
NSF Mathematical Sciences Postdoctoral Research Fellowship, 2021-2022
National Science Foundation
The Wallace Lowe Fellowship, 2020
UC Berkeley School of Public Health
The Eki & Nobuta Akahoshi and Seiko Baba Brodbeck Endowed Fund Scholarship, 2019
UC Berkeley School of Public Health
Tom Ten Have Memorial Award (for "exceptionally creative or skillful research in causal inference"), 2019
American Causal Inference Conference
The Wellness Scholarship in Honor of Chin Long Chiang, 2018
UC Berkeley School of Public Health
Honorable Mention for Tom Ten Have Memorial Award, 2017
American Causal Inference Conference
NIH BD2K Biomedical Big Data Training Program Fellowship, 2017-2018
UC Berkeley