Three concepts permeate all aspects of biological life: genotype, phenotype, and environment. The phenotype of an organism is the collection of its observable traits, which arise jointly from the organism's genetics and its environment. Among human phenotypic categories, which range from morphology to biochemical properties, behavior has been especially challenging to study because of its temporal and contextual dependence, and traditional approaches have relied on surveys to learn about it. Smartphones are now ubiquitous and can be harnessed to offer a wealth of data on social, behavioral, and cognitive phenotypes. This approach has enormous potential, but data alone are not enough: to be fully leveraged, the data need to be coupled with appropriate statistical learning techniques that are specific to the given domain and driven by specific scientific questions. Only then can the data be transformed into clinically valid and useful information, the kind of information that can transform research and discovery and, ultimately, patient health.
We have coined the term digital phenotyping to refer to the moment-by-moment quantification of the individual-level human phenotype, in situ, using data from personal digital devices, in particular smartphones. Our research in this area includes but is not limited to behavioral patterns, sleep, social interactions, physical mobility, gross motor activity, cognitive functioning, and speech and language production. As part of our efforts in this area, our group has developed and maintains the open-source Beiwe research platform for high-throughput smartphone-based digital phenotyping. As an alternative to deploying the open-source version, we make it possible for anyone to use this platform through our Beiwe Service Center.
Unlike the large majority of commercially available smartphone applications, Beiwe is intended for biomedical research, and the most important aspect of the front-end application is the collection of raw sensor and phone use data. Our primary activity in this area is the development of statistical learning methods, their implementation in software, and their incorporation in the AWS-based Beiwe back-end system.
Statistical Network Science
Many systems of scientific and societal interest consist of large numbers of interacting components. The structure of these systems can be represented as networks, where network nodes represent the components and network edges represent the interactions among them. Network analysis is used to study how pathogens, behaviors, and information spread in social networks, which has important implications for our understanding of epidemics and the planning of effective interventions. In a biological context, at a molecular level, network analysis is applied to gene regulation networks, signal transduction networks, and protein interaction networks. Network science is an interdisciplinary field that draws heavily from mathematics, statistics, statistical physics, computer science, and the social sciences. Two major paradigms for modeling networked systems are the physics-based approach and the statistics-based approach: the former involves modeling network growth and evolution from microscopic mechanisms, whereas the latter involves statistical modeling of data that arrive in the form of a network. Much of our methodological work focuses on bridging some of the gaps that exist between these two approaches.
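The two paradigms described above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a description of our group's methods: the growth rule (degree-proportional attachment), the independent-edge model used for estimation, and all function names are hypothetical choices made for this example.

```python
import random


def grow_network(n_nodes, seed=0):
    """Physics-style view: build a network from a microscopic growth rule.

    Assumed rule (illustrative only): each new node links to one existing
    node chosen with probability proportional to that node's degree.
    Requires n_nodes >= 2. Returns a set of (u, v) edges with u < v.
    """
    rng = random.Random(seed)
    edges = {(0, 1)}
    endpoints = [0, 1]  # each node appears once per incident edge
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)  # degree-proportional choice
        edges.add((target, new))
        endpoints += [target, new]
    return edges


def to_adjacency(n_nodes, edges):
    """Represent the network as nodes mapped to their sets of neighbors."""
    adjacency = {node: set() for node in range(n_nodes)}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def estimate_edge_probability(n_nodes, edges):
    """Statistics-style view: treat the observed network as data.

    Assumed model (illustrative only): every pair of nodes is linked
    independently with probability p. The maximum-likelihood estimate
    of p is the fraction of possible pairs that are actually linked.
    """
    possible_pairs = n_nodes * (n_nodes - 1) // 2
    return len(edges) / possible_pairs


if __name__ == "__main__":
    n = 50
    observed = grow_network(n, seed=1)
    adjacency = to_adjacency(n, observed)
    print("edges:", len(observed))
    print("largest degree:", max(len(nb) for nb in adjacency.values()))
    print("estimated p:", estimate_edge_probability(n, observed))
```

A single edge probability says nothing about the uneven degrees that the growth rule produces, which is one simple way to see the kind of gap between the two approaches.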