Harvard Public Health NOW

HSPH's Biostatisticians Crunch Data Vital to AIDS Research, Genetics

These days, a genetics researcher’s best friend is a biostatistician. That’s because advanced technologies and the mapping of the human genome have created immense amounts of genetic data available for analysis. Sifting through the data is nothing short of a Herculean task, but HSPH’s biostatisticians are meeting the challenge.

"We did not choose to concentrate on genetics, but that is where the field is going, and it’s very interesting work," said Stephen Lagakos, chair of the Department of Biostatistics.

With more than 50 faculty members, Lagakos’ department is one of the largest at the school. Many of the department’s members are working on some kind of genetics-related research.

Broadly defined, statistical genetics is the development of methods to analyze DNA. In recent years, the term has been more specifically applied to gene mapping, or the search for locations of genes related to diseases, and to the analysis of drug therapies.

Mapping Genes

Nan Laird, a professor in the Department of Biostatistics, knows a thing or two about gene mapping. In 1998, she and colleagues identified a gene as a risk factor for Alzheimer’s disease, using a new statistical genetics method they had developed.

"Family-based association" is an analysis tool that compares the genes of affected individuals to those of closely related family members, often parents and children. With Alzheimer’s and other late-onset diseases, DNA from long-deceased parents may not be available. In an inspired stroke, Laird and her colleagues developed a method that uses siblings instead of parents. The affected siblings hold the genetic mutation. The unaffected siblings become the statistical equivalent of a random sample, representing the population at large.

By comparing the two groups, Laird can piece together the genetic material each parent must have had and identify differences between genes.
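
The logic of comparing affected siblings with their unaffected brothers and sisters can be sketched with a simple sign test. This is not the method Laird and her colleagues developed, only a much-simplified illustration of the idea. The function name `sib_pair_sign_test` and the allele counts are hypothetical. Each pair records how many copies of a candidate allele the affected sibling carries and how many the unaffected sibling carries.

```python
from math import comb


def sib_pair_sign_test(pairs):
    """Exact two-sided sign test on discordant sibling pairs.

    Each pair is (copies in affected sibling, copies in unaffected sibling)
    of one candidate allele. If the allele is unrelated to the disease, the
    affected sibling is equally likely to carry more or fewer copies.
    """
    more = sum(a > u for a, u in pairs)
    fewer = sum(a < u for a, u in pairs)
    n = more + fewer  # tied pairs carry no information and are dropped
    k = max(more, fewer)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return more, fewer, min(1.0, 2 * tail)


# Made-up data for ten sibling pairs (illustration only).
pairs = [(2, 1), (1, 0), (2, 0), (1, 1), (1, 0),
         (2, 1), (0, 1), (1, 0), (2, 1), (1, 0)]
more, fewer, p_value = sib_pair_sign_test(pairs)
print(more, fewer, round(p_value, 4))
```

In this invented sample, the affected sibling carries more copies in eight of the nine informative pairs, a split that would be unlikely if the allele had nothing to do with the disease.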

Seven years ago, Laird spotted the potential in the statistical genetics specialty and took a sabbatical to learn the subject. She has not regretted the move. Currently, she is in the early stages of creating methods that would test gene-environment interactions, a monumental challenge given the number of environmental factors that could influence disease development. She has experienced some preliminary success with a study of Attention Deficit/Hyperactivity Disorder (ADHD) in collaboration with researchers at the Massachusetts General Hospital. Together, they have identified a gene implicated in ADHD.

Laird is also looking at the links between genes and asthma and between genes and mental illnesses, such as bipolar disorder.

"Gene mapping is not new. It’s been around since the 1920s," said Laird, "but as a result of new information about the human genome, there has been an explosion in the field, and researchers are now being encouraged to map genes of complex disorders that would have been very difficult just a decade ago."

Individualized Medicine and HIV

Statistical genetics plays a significant role in what some experts predict to be the future of pharmaceutical therapy: individualized medicine.

"The idea is that you would monitor a patient regularly to gain a sense of when a drug treatment has become ineffective," explained Lagakos. "You would then know when to change a drug and what would be a good replacement based on a genetic profile of the patient."

Although applicable to diseases such as cancer, individualized medicine holds much promise in treating AIDS, where HIV notoriously mutates to defy therapies.

Several major issues complicate HIV treatment. Currently, there are more than a dozen HIV drugs grouped by classes. Standard therapy regimens require patients to take combinations of small numbers of the drugs. Patients usually need to be treated for the remainder of their lives, creating concerns about long-term toxicities that may sicken people.

A broader worry shadows AIDS treatments. Unlike many other infections, HIV tends to become resistant to drugs very quickly because the virus replicates frequently and often with errors that produce mutant strains.
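
The consequence of that rapid, error-prone replication can be illustrated with a toy model. The sketch below is not drawn from any HSPH analysis, and every number in it is an assumption chosen for illustration: a large drug-sensitive population, a handful of resistant mutants, and a drug that blocks 99 percent of sensitive replication while leaving resistant virus untouched.

```python
def simulate_selection(sensitive=1_000_000.0, resistant=10.0,
                       growth=2.0, drug_block=0.99, generations=20):
    """Return the resistant share of the virus population per generation."""
    history = []
    for _ in range(generations):
        # The drug suppresses replication of sensitive virus only.
        sensitive *= growth * (1.0 - drug_block)
        # Resistant virus replicates as if no drug were present.
        resistant *= growth
        history.append(resistant / (sensitive + resistant))
    return history


fractions = simulate_selection()
for generation, share in enumerate(fractions[:5], start=1):
    print(generation, f"{share:.4f}")
```

Under these assumed rates, the resistant strain starts as a tiny minority and makes up nearly the whole population within a few generations.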

"Then interesting things happen within the body," said Lagakos. "Imagine that some of the virus becomes resistant to a drug. However, the drug continues to successfully prevent the replication of virus that was not resistant. What survives and continues to replicate is the resistant virus. It dominates."

Drug-resistant strains threaten all HIV-infected people because they are transmissible and because only a limited number of treatments are available. Also, some drugs are chemically very similar, so resistance to one confers resistance to several. Physicians must carefully prescribe combinations so as not to run out of suitable medications. If infections become resistant to too many drugs, then patients are flat out of pharmaceutical options. A study presented at the American Society for Microbiology conference in Chicago on December 19 indicated that drug resistance is a larger problem in the United States than previously thought. More than 75 percent of the 209,000 HIV patients in the study had developed resistance to at least one of the 15 anti-AIDS drugs on the market, and more than 40 percent were unaffected by all protease inhibitors, one of the most commonly prescribed classes of anti-AIDS drugs.

Victor DeGruttola, a professor in the Department of Biostatistics, directs the Statistical and Data Analysis Center for the AIDS Clinical Trials Group at HSPH. The group designs and analyzes most of the federally funded clinical trials for new AIDS treatments across the country. About 70 people work in the group at HSPH and elsewhere. At any given time, dozens of trials and hundreds of sub-studies may be going on.

DeGruttola is searching within reams of data for patterns of HIV mutations and drug resistance. "We would like to predict the patient response to treatment given the genetic sequences of the swarm of viruses infecting a person," he said.

Considering the propensity of HIV to mutate, DeGruttola may look at hundreds of variations of a virus. He cited one study involving 75 HIV-infected people in which 72 of them had different mutated forms of HIV.

To organize the data, DeGruttola uses several statistical approaches. He starts off by asking how he can cluster the mutations so that similar genetic sequences are grouped together. He then uses a mathematical model to figure out the relationship of the cluster to drug resistance, looking for patterns.
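
A minimal sketch of that two-step idea follows. It is not DeGruttola's actual procedure: it assumes single-linkage grouping by Hamming distance (the number of positions at which two sequences differ) and uses a plain resistance rate per cluster as a stand-in for the mathematical model. The sequences and resistance labels are invented.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def cluster_sequences(seqs, max_distance=1):
    """Group sequences linked by a chain of pairs within max_distance."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if hamming(seqs[i], seqs[j]) <= max_distance:
            parent[find(i)] = find(j)

    groups = {}
    for i, seq in enumerate(seqs):
        groups.setdefault(find(i), []).append(seq)
    return sorted(sorted(group) for group in groups.values())


# Invented four-residue sequences and resistance labels (illustration only).
resistant = {"MKVL": False, "MKVI": False, "MRVI": True,
             "TQAF": True, "TQAY": True}
clusters = cluster_sequences(list(resistant))
rates = [sum(resistant[s] for s in c) / len(c) for c in clusters]
for cluster, rate in zip(clusters, rates):
    print(cluster, f"{rate:.2f}")
```

With this made-up data, the sequences fall into two clusters, and the second cluster shows a much higher resistance rate than the first, which is the kind of pattern the approach is meant to surface.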

DeGruttola also employs a "tree-based" method that begins with a large group of factors predictive of drug resistance. He teases out a predictor, such as the length of time on a therapeutic regimen, and then groups data related to that factor. He chooses a second predictor and narrows the field, and so on, until he has more specific information about the correlation between a drug and resistance.
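
The narrowing process described above resembles recursive partitioning, which can be sketched in a few lines. This is a generic illustration, not the group's software: it assumes Gini impurity as the splitting rule, and the predictor `prior_drugs` and all patient values are hypothetical. Only the time-on-regimen predictor comes from the description above.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 resistance labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels, predictors):
    """Return (impurity, predictor, threshold) for the purest two-way split."""
    best = None
    for name in predictors:
        for threshold in sorted({row[name] for row in rows}):
            low = [y for row, y in zip(rows, labels) if row[name] <= threshold]
            high = [y for row, y in zip(rows, labels) if row[name] > threshold]
            if not low or not high:
                continue
            score = (len(low) * gini(low) + len(high) * gini(high)) / len(labels)
            if best is None or score < best[0]:
                best = (score, name, threshold)
    return best


def grow_tree(rows, labels, predictors, depth=2):
    """Split on one predictor at a time until groups are pure or depth runs out."""
    rate = sum(labels) / len(labels)
    split = best_split(rows, labels, predictors) if depth > 0 and 0 < rate < 1 else None
    if split is None:
        return {"resistant_rate": rate, "n": len(labels)}
    _, name, threshold = split
    low = [i for i, row in enumerate(rows) if row[name] <= threshold]
    high = [i for i, row in enumerate(rows) if row[name] > threshold]
    return {
        "predictor": name,
        "threshold": threshold,
        "low": grow_tree([rows[i] for i in low], [labels[i] for i in low],
                         predictors, depth - 1),
        "high": grow_tree([rows[i] for i in high], [labels[i] for i in high],
                          predictors, depth - 1),
    }


# Invented patient records: 1 means resistance was observed.
months = [3, 5, 8, 14, 18, 24, 30, 36]
prior = [1, 4, 3, 1, 3, 4, 1, 5]
labels = [0, 0, 0, 0, 1, 1, 0, 1]
rows = [{"months_on_regimen": m, "prior_drugs": p} for m, p in zip(months, prior)]
tree = grow_tree(rows, labels, ["months_on_regimen", "prior_drugs"])
print(tree)
```

On this invented data, the first split falls on time on the regimen and the second on the number of prior drugs, leaving small groups of patients with sharply different resistance rates.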

DeGruttola compares the data from tree-based studies to cluster studies to see if the results are similar. He has found that clustering indeed appears to be an important method. He feels that by using clustering, researchers one day may be able to group people, based on their genes, into classes of sensitivities to drugs, virtually customizing therapeutic regimens.

Challenges remain. The large number of new AIDS drugs under development will create more data to be researched. It remains unknown whether the work done on the most common subtype of HIV found in the United States and Europe will apply to subtypes elsewhere in the world, such as sub-Saharan Africa, where infection rates remain dismally high. But DeGruttola remains hopeful: "I am fairly confident that given enough information, we can understand well the patterns of mutations that confer resistance."


Wong Combines Powerful Computer Programming with Biostatistical Methods

The laboratory of Wing Wong, professor of computational biology in the Department of Biostatistics, is unlike any other at HSPH. It is made up of powerful, sleek-looking computers tucked amid the school’s servers in the basement of the Kresge Building. Computational biology is the creation and application of computer and biostatistical methods to study biological, behavioral and social systems. For Wong, the emerging specialty may hold the answers to better treatments for diseases, captured in bytes and binary code.

"We are trying to develop methods, algorithms and software to help biologists analyze and interpret large amounts of data coming from the Human Genome Project and other genome and biology studies," said Wong, whose background is in mathematics and statistics.

The field of biology is shifting in scale from looking at single genes to researching entire genomes. Automated, high-volume technologies generate huge amounts of information to be analyzed.

"You cannot organize this amount of data in an Excel spreadsheet," observed Wong.

Instead, he and scientists in his laboratory have created several software packages that researchers use to crunch information. An advocate of sharing research, Wong has made the packages available for free on a web site.

He spends much of his time analyzing digitized data that other researchers send him and specifically works with information derived from microarrays, slides that visually show which genes are "turned on," or expressed, in the presence of sample tissue. Knowing the expression of genes helps scientists figure out their functions.

"You’d have to have been born after 1970 to understand a lot of the technologies Wing uses," joked Stephen Lagakos, chair of the Department of Biostatistics.

Wong collaborates with scientists in different biological specialties, but much of his work involves cancer research. He is currently assisting scientists at the Dana-Farber Cancer Institute, who have large collections of tissue samples from cancer patients dating back many years. The scientists have used traditional methods to analyze the samples, and now they are running the samples through microarrays to get a more detailed look at cells.

Wong interprets the information, sometimes working with thousands of data points about the genes of cancer patients. He runs the data through the software he and his colleagues have developed to find clear "signals" or patterns that may be used to classify the cancer cases or help predict responses to a therapy.

Wong hopes that refining the samples will allow information from the gene expression profiles provided by the microarrays to be correlated retrospectively with factors such as patient history, illness duration and outcomes. Making such correlations may help physicians in the future predict a cancer patient’s disease course based on genetics and available treatments.

"High throughput genomic and proteomic technologies allow scientists to generate immense amounts of data on all aspects of human biology," said Wong. "Future advances in biomedical research will depend on whether we can extract and make effective use of the information embedded in these data. Computational biology will play a central role in this regard."


Harvard Public Health NOW is published biweekly by the
Office of Communications
Harvard School of Public Health
665 Huntington Ave., SPH 1-1204
Boston, Massachusetts 02115
617-432-6052
Editor and Layout: Christina Roache
Photo Credits: Richard Chase, Christina Roache, Wing Wong, Harvard AIDS Institute, National Institutes of Health



Copyright, 2007,  President and Fellows of Harvard College
