Connecting Disparate Data through Computational Biology
A study showing how genes and proteins interact over time as a cell matures into a macrophage demonstrates the potential power of computational biology to connect disparate sets of data, such as those generated by different researchers at HSPH, said co-author Winston Hide, scientific director of the School’s Bioinformatics Core.
“It’s clear that the genetic factors that control the behavior of genes work less as obedient servants of master conductors and more as large highly communicative teams,” said Hide, a visiting professor of bioinformatics in the Department of Biostatistics.
The study by Hide and his co-authors is one of a trio of related papers published online in Nature Genetics by members of the international FANTOM (Functional Annotation of Mouse) consortium based in Japan. The three large studies reveal new genetic elements and a network of coordinated interactions that seem to control how cells transform and evolve from precursors to mature cells.
The finding of extensively coordinated team activity among two dozen transcription factors contrasts with the prevailing hierarchical model in which the process is controlled by a small number of “master regulators,” said co-author Oliver Hofmann, associate director of the HSPH Bioinformatics Core.
This model of transcriptional teamwork emerged from a computational feat. Hide, Hofmann, and their colleagues digitized and integrated mountains of data from different research teams using different technologies to study the same transforming cells.
The experimental data came from leukemia cells chemically tweaked to switch from uncontrolled growth to differentiation into more normal, specialized cells. Other research teams measured thousands of data points over time as the cells transformed. The different technological platforms included in-depth sequencing techniques to look at genome-wide changes in gene expression and a method for pinpointing the exact position of transcription start sites.
The results may advance studies into the therapeutic potential of stem cells to regenerate tissues and organs and, in Hide’s research focus, to understand how the genetic network goes wrong in cancer cells.
“Linking experimental insights back to the behavior of genes is hard to do,” he said. “This study gives the next level of understanding of how phenotype relates to genotype by complex regulatory mechanisms.”
Researchers working in a clinical paradigm increasingly use genome technology to seek new therapies and drug targets, Hide said. He hopes the same tools will generate original insights and meaningful context in the hands of public health researchers.
Computational biology can bring together raw data from different disciplines in fresh ways for fundamentally new discoveries and endow existing knowledge with additional meaning and context, he said.
One big challenge is finding a common digital language, Hide said. Scientists in different areas use different terms to speak about the same thing. The latest study, for example, employed techniques of computational biology and bioinformatics to join the thousands of genetic languages—a virtual “Tower of Babel”—into one song.
“It moved from a raucous noise into a chorus,” he said.
Likewise, Hide sees unlimited potential in the connections that can be made among the mountains of data generated by seemingly incongruent disciplines, from molecular biology to epidemiology, working at scales ranging from proteins to society. The HSPH Bioinformatics Core supports more than a dozen research projects across the School.
“Public health today is not a domain of simple cohort studies and clinical trials and statistical techniques,” Hide said. “It’s being inundated by a tsunami of genomic data at a volume we cannot begin to comprehend. It is essential to take advantage of molecular information that gives insight into how it’s all related.”
-- Carol Cruzan Morton
HPH NOW