Program Faculty

With over 50 faculty members in the Department, students have access to a large group of interdisciplinary leaders at the top of their fields.

Learn more about a few members of our program faculty below.


Dr. Jukka-Pekka Onnela

Co-Director, Master’s Program in Health Data Science
Professor of Biostatistics
Department of Biostatistics
Harvard T.H. Chan School of Public Health

Dr. Onnela’s research involves two interrelated themes. In statistical network science, the study of network representations of physical, biological, and social phenomena, he focuses on social and biological networks and their connection to human health. In digital phenotyping, the moment-by-moment quantification of the individual-level human phenotype using data from digital devices, he focuses on psychiatric disorders. Please see Dr. Onnela’s lab webpage for more details.

Dr. Onnela teaches BIO261 – Data Science II. This course presents a set of tools for modeling and understanding complex datasets, with an emphasis on practical regression and tree-based techniques for big data. Specific topics include linear model selection and regularization (LASSO, principal component regression, and partial least squares); tree-based methods (decision trees, bagging, random forests, and boosting); and unsupervised learning (principal components analysis and cluster analysis).

Research Keywords: Digital Phenotyping, Large-scale Social Networks, Network Inference, Cluster Randomized Trials, Mental Health

Dr. Heather Mattie

Co-Director, Master’s Program in Health Data Science
Lecturer on Biostatistics
Department of Biostatistics
Harvard T.H. Chan School of Public Health


Dr. Mattie’s research focuses on the intersection of biostatistics, data science, and network science. Specifically, she has used network science and machine learning to study interactions in communities, and has studied the development and application of artificial intelligence in healthcare research. Her research also addresses algorithmic fairness, in particular how an algorithm can compound inequities that work against underrepresented or disadvantaged groups in society. Her work has found links between unhealthy weight control behaviors and the use of mobile dating applications, particularly among racial and ethnic minorities. She has developed methods that predict tie strength in a network, which assists in modeling the spread of disease and information. Additionally, her work has examined the potential for artificial intelligence to improve inference from data for care and population health, as well as the challenges related to bias and scalability in such models.

Dr. Mattie is involved in the diversity and inclusion efforts of the Department of Biostatistics and enjoys mentoring students.

Research Keywords: Algorithmic Bias, Network Science, Tie Strength, Link Prediction, Health Disparities

Dr. Tamar Sofer

Associate Biostatistician and Director of Biostatistics Core in Sleep Epidemiology
Brigham and Women’s Hospital
Assistant Professor
Harvard Medical School


Dr. Sofer is a biostatistician studying genetic and ‘omics determinants and signatures of sleep, cognitive decline, and related phenotypes. She develops statistical and computational approaches for assessing genetic associations in challenging scenarios, especially when the target population is diverse, genetically and in other ways. Recently, she has been extending her research program into ‘omics (DNA methylation, gene expression, metabolomics, etc.). Dr. Sofer strongly believes that statistical methods should be developed and applied to address specific scientific questions, and that good application of statistics comes from being immersed in the scientific research. Therefore, she is always eager to collaborate and work as an integral part of the scientific team.

Research Keywords: Statistical genetics, Omics integration, Genetic risk prediction, Diverse populations, Hispanics/Latinos

Dr. Santiago Romero-Brufau

Instructor of Biostatistics
Department of Biostatistics
Harvard T.H. Chan School of Public Health
Assistant Professor of Medicine and Healthcare Systems Engineering and Principal Data Scientist
Department of Medicine
Mayo Clinic

Santiago Romero-Brufau, MD, PhD, is an assistant professor of medicine and healthcare systems engineering at the Mayo Clinic, where he also serves as principal data scientist for the Department of Medicine. His work focuses on the development of machine-learning clinical decision support tools and their implementation in clinical practice.

Research Keywords: Clinical Decision Support, Machine Learning, Implementation, Early Warning Scores, Healthcare Systems Engineering

Dr. Christine Choirat

Chief Innovation Officer,
Swiss Data Science Center, ETH Zürich and EPFL
Adjunct Lecturer,
Department of Biostatistics, Harvard T.H. Chan School of Public Health


Dr. Choirat is the Chief Innovation Officer of the Swiss Data Science Center (SDSC), an initiative to accelerate the use of data science and machine learning techniques within academic disciplines and the industrial sector, in Switzerland and internationally. She was trained as a statistician (PhD, Paris Dauphine). At SDSC, she provides leadership over the lifecycle of sponsored projects and partnerships in the domains of environmental science, health IT, health science and technology, personalized medicine, and open science. She also fosters engagement with partners to facilitate the adoption of FAIR and reproducible data science with the Renku platform.

Dr. Choirat teaches BST262 – Computing for Big Data. The course offers a critical presentation of theoretical approaches and software implementations of tools to collect, store, and process data at scale.

Research Keywords: Health data science, Reproducible research, Statistical software, Environmental policy, Health policy

Dr. Christoph Lange

Professor of Biostatistics
Harvard T.H. Chan School of Public Health



Dr. Lange’s research interests are at the intersection of biostatistical methodology, numerical analysis, and computer science. Formally trained in all three areas, Dr. Lange headed the analysis of the first genome-wide association study (GWAS) for complex diseases, the longitudinal analysis of BMI in the Framingham Heart Study. To address the challenges of this first wave of “big data” in genetics, Dr. Lange’s group developed statistical methodology that maximizes power in such studies while minimizing the effects of the multiple testing problem in GWAS, and developed massively parallel software implementations of the analysis that scale efficiently with study size (the PBAT software package). Dr. Lange’s group then successfully commercialized the package in collaboration with GoldenHelix.

Currently, Dr. Lange’s group focuses on the statistical and computational challenges posed by whole-genome sequencing studies, e.g., powerful analysis methodology, clustering approaches for the genome, approaches to permutation testing, and storage algorithms.

Dr. Lange is also involved in collaborative research in asthma genetics, COPD genetics, and Alzheimer’s genetics.