Program Faculty

With over 50 faculty members in the Department, students have access to a large group of interdisciplinary leaders at the top of their fields.

Learn more about a few members of our program faculty below.


Dr. Jukka-Pekka Onnela

Director, Master’s Program in Health Data Science
Associate Professor of Biostatistics
Department of Biostatistics
Harvard T.H. Chan School of Public Health

Dr. Onnela’s research spans two interrelated themes. In statistical network science, the study of network representations of physical, biological, and social phenomena, he focuses on social and biological networks and their connection to human health. In digital phenotyping, the moment-by-moment quantification of the individual-level human phenotype using data from digital devices, he focuses on psychiatric disorders. Please see Dr. Onnela’s lab webpage for more details.

Dr. Onnela teaches BIO261 – Data Science II. This course presents a set of tools for modeling and understanding complex datasets, with an emphasis on practical regression and tree-based techniques for big data. Topics include linear model selection and regularization (LASSO, principal component regression, and partial least squares); tree-based methods (decision trees, bagging, random forests, and boosting); and unsupervised learning (principal components analysis and cluster analysis).

Research Keywords: Digital Phenotyping, Large-scale Social Networks, Network Inference, Cluster Randomized Trials, Mental Health

Dr. Heather Mattie

Executive Director, Master’s Program in Health Data Science
Instructor of Data Science
Department of Biostatistics
Harvard T.H. Chan School of Public Health

Dr. Mattie’s current research focuses on the estimation and prediction of tie strength and on node and link imputation in large-scale social networks. She is particularly interested in how the strength of social relationships affects health, as well as in health disparities in underrepresented populations.

Research Keywords: Tie Strength, Link Prediction, Imputation, Health Disparities

Dr. Christine Choirat

Chief Innovation Officer,
Swiss Data Science Center, ETH Zürich and EPFL
Adjunct Lecturer,
Department of Biostatistics, Harvard T.H. Chan School of Public Health


Dr. Choirat is the Chief Innovation Officer of the Swiss Data Science Center (SDSC), an initiative to accelerate the use of data science and machine learning techniques within academic disciplines and the industrial sector, in Switzerland and internationally.  She was trained as a statistician (PhD, Paris Dauphine).  At SDSC, she provides leadership over the lifecycle of sponsored projects and partnerships in the domains of environmental science, health IT, health science and technology, personalized medicine, and open science.  She also fosters engagement with partners to facilitate the adoption of FAIR and reproducible data science with the Renku platform.

Dr. Choirat teaches BST262 – Computing for Big Data.  The course offers a critical presentation of theoretical approaches and software implementations of tools to collect, store and process data at scale.

Research Keywords: Health data science, Reproducible research, Statistical software, Environmental policy, Health policy

Dr. Christoph Lange

Professor of Biostatistics
Harvard T.H. Chan School of Public Health


Dr. Lange’s research interests are at the intersection of biostatistical methodology, numerical analysis, and computer science. Formally trained in all three areas, Dr. Lange headed the analysis of the first genome-wide association study for complex diseases, the longitudinal analysis of BMI in the Framingham Heart Study. To address the challenges of this first wave of “big data” in genetics, Dr. Lange’s group developed statistical methodology that maximizes power in such studies while minimizing the effects of the multiple testing problem in GWAS, and developed massively parallel software implementations of the analysis that scale efficiently with study size (the PBAT software package). Dr. Lange’s group then successfully commercialized the package in collaboration with GoldenHelix.

Currently, Dr. Lange’s group focuses on the statistical and computational challenges posed by whole-genome sequencing studies, e.g., powerful analysis methodology, clustering approaches for the genome, approaches to permutation testing, and storage algorithms.

Dr. Lange is also involved in collaborative research in Asthma Genetics, COPD Genetics and Alzheimer Genetics.

Dr. Jeffrey Miller

Assistant Professor of Biostatistics
Harvard T.H. Chan School of Public Health


The overall theme of Dr. Miller’s research is developing practical methods for finding structure in complex systems. His work has focused on robustness to model misspecification, nonparametric Bayesian models, frequentist analysis of Bayesian methods, and efficient algorithms for inference in complex models. He is particularly interested in understanding and combating diseases of aging, and is currently engaged in collaborative work using large genomics data sets to better understand Alzheimer’s disease and cancer.

Dr. Miller will teach Applied Regression and Machine Learning. In this course, students will apply their knowledge to real-world problems such as those posed in Kaggle competitions. In the context of these problems, students will learn about and implement modern methods for classification and regression. Topics covered may include penalized regression, LDA, CART, random forests, neural networks, deep learning, boosting, and ensembles.

Research Keywords: Statistical Methodology, Alzheimer’s Disease, Cancer Genomics, Aging, Model Misspecification