Kelly Cho

Kelly Cho, PhD, MPH
Director of Clinical Data Science and Analytics, Division of Population Health and Data Science, MAVERIC, VA Boston Healthcare System
Assistant Professor of Medicine, Brigham and Women’s Hospital, Harvard Medical School

Big Data Challenges for Phenomic Science: Experiences from the US Department of Veterans Affairs’ 24M Population EHR and the Million Veteran Program (MVP)

In the era of precision medicine, with advanced healthcare systems generating “big data”, challenges remain in making the best use of the diverse and widely available health data around us. A major effort has focused on developing methods for handling genomic data. Equally important, and perhaps more challenging, is capturing consistent, high-quality phenotypic information from various data sources for clinical research. Electronic health record (EHR) databases are among the largest and most widely used phenotypic data banks. In particular, the EHR database of the US Veterans Affairs (VA) Healthcare System represents the largest single-payer system, with over 16 years of longitudinal data on approximately 24 million users overall. These EHR phenotypic data have quickly grown into high-dimensional big databases organized in many layers of data domains, requiring innovative management and analytical approaches. In 2011, the VA Office of Research and Development launched the Million Veteran Program (MVP), a mega-biobank cohort, and began to establish, in partnership with Veterans, a national database of genetic, military exposure, lifestyle, and health information that combines data from survey instruments, the EHR, genomics, and biospecimens. In August 2016, MVP reached the 500,000-enrollment milestone, and as of October 2018 it had 702,656 participants. MVP is currently engaged in numerous test projects (alpha, beta, gamma) and is expanding its computing environment through a collaboration with the Department of Energy to enhance computational efficiency and scalability. We share our experience of advancing phenomic science through innovative analytics and the development of a metadata knowledgebase library.