Program

Extracting Knowledge From Data

The new master’s degree program in Health Data Science will focus on the interface of statistics, computational science, and software engineering in public health, medical, and basic biology applications. The program will train students to extract knowledge from data and to communicate and share this knowledge.

The first year will consist of case-based training in the areas of statistical inference, machine learning, and computing, as well as training in public health and biomedical sciences. Through this case-based approach, students will learn the computing skills necessary to manage and analyze health science data while gaining experience in answering scientific questions with data. These skills will be further developed in an intensive, semester-long course (7.5 credits) in the third semester that focuses on project-based work in health data science. This culminating research experience will allow students to integrate what they have learned by analyzing health data generated by current research projects involving program faculty.

Students will receive training in quantitative methods, including:

  • Data wrangling
  • Data visualization and exploratory data analysis
  • R, Python, and shell programming
  • Probability and statistical inference
  • Applied linear regression and machine learning
  • Computing for big data

Course Requirements:

Core Curriculum

A total of 60 credits of coursework is required for the MS in Health Data Science. This includes a 25-credit, ordinally graded core curriculum consisting of:

BST 260    Introduction to Data Science (5 credits)
BST 261    Data Science II (2.5 credits)
BST 222    Basics of Statistical Inference (5 credits)
BIO xxx    Applied Regression and Machine Learning (5 credits)
EPI 201    Introduction to Epidemiology: Methods 1 (2.5 credits)
BIO xxx    Computing for Big Data (5 credits)

Computing Requirement

The program is designed to produce strong programmers. To that end, students will be required to take an additional 5 credits of coursework in computer science, choosing from the following:

BST 234     Introduction to Data Structures and Algorithms
BMI 713     Computational Statistics for Biomedical Science
CS 105      Privacy and Technology
CS 124      Data Structures and Algorithms
CS 164      Software Engineering
CS 165      Data Systems
CS 171      Visualization
CS 181      Machine Learning
CS 187      Computational Linguistics
STAT 171    Introduction to Stochastic Processes

Project-Based Research Course

The program will provide a culminating research experience that tests all competencies through a hands-on semester-long project-based research course (7.5 credits). This course will allow students to immerse themselves in five health data science projects in public health and biomedical science.

BIO xxx       Health Data Science Practice (7.5 credits)

Elective Courses

A minimum of 22.5 additional credits will come from the following list of elective courses offered by the departments of biostatistics, biomedical informatics, computer science, statistics, and epidemiology. These are in addition to the computer science courses listed under the computing requirement, which can also be counted as electives once the 5-credit requirement has been met. Students can choose from the following:

BST 212       Survey Research Methods in Community Health
BST 223       Applied Survival Analysis
BST 226       Applied Longitudinal Analysis
BST 247       Advanced Statistical Genetics
BST/BIST 282  Introduction to Computational Biology and Bioinformatics
BST/BIST 290  Advanced Computational Biology and Bioinformatics
BST/BIST 267  Introduction to Social and Biological Networks
EPI 202       Elements of Epidemiologic Research: Methods 2
EPI 203       Study Design in Epidemiologic Research
EPI 204       Analysis of Case-Control and Cohort Studies
EPI 271       Propensity Score Analysis
EPI 288       Data Mining and Predictive Modeling
EPI 289       Models for Causal Inference
EPI 515       Measurement Error and Misclassification
ID 271        Advanced Regression for Environmental Epidemiology
BMI XXX       Data Visualization
BMI XXX       Precision Medicine I: Integrating Clinical and Genomics Data
BMI XXX       Precision Medicine II: Genomic Medicine
BMI 701       Foundations in Biomedical Informatics I
BMI 726       Big Data Innovations in Population Health
ME 530        Clinical Informatics
CS 187        Computational Linguistics