Harvard School of Public Health
Telephone: (617) 432-1056
Department of Biostatistics
FAX: (617) 739-1781
During the last decade, the technology for measuring cellular (and even molecular) level biologic phenomena has rapidly improved. In addition to recording and analyzing clinical outcomes following treatment for cancer, it is now possible to measure the way genes and proteins are expressed in tumors and, in some instances, to measure the genomic or proteomic response to tumors following therapy. The promise of this technology is that it will allow for the development of therapies specifically targeted to the genetic irregularities that are the cause of growing and metastasizing malignancies.
Research on the cellular expressions of cancer is data-intensive, and the Department of Biostatistical Science at the Dana-Farber Cancer Institute (DFCI) is directly involved in much of this new work. Statistics faculty, post-doctoral fellows and graduate students are making fundamental contributions to methodology in this area in addition to collaborating with DFCI scientists on ground-breaking projects. On the methodologic side, teams at the DFCI have designed and built two of the most widely used software systems for computational biology. The development of BioConductor (www.bioconductor.org) is led by Professor Robert Gentleman and is based on the successful open source statistical language R, for which Professor Gentleman was the principal architect. The program dChip (www.dchip.org), developed and maintained by Professor Cheng Li, is a menu-oriented system for analyzing gene expression levels that has been designed for direct use by biologists.
Professor Li's work on dChip and other projects is focused on the low-level and high-level analysis issues of Affymetrix oligonucleotide microarrays. Such microarrays are used to simultaneously interrogate the expression levels of tens of thousands of genes, or the genotypes of thousands of polymorphic markers. His research work is centered around issues such as proper adjustment of multiple arrays so that they are comparable, computation of summary statistics better reflecting the mRNA abundance of samples, visualization of expression values or marker genotypes along the chromosome and cytobands, and sample clustering techniques to correlate with biological knowledge such as GeneOntology.
During the summer of 2002, Dana-Farber opened a Microarray Core Facility (MACF) to perform gene expression profiling and SNP analyses using Affymetrix microarrays for DFCI investigators. Many of the investigators are new to microarray analysis, and so making sense of the plethora of data can be a formidable task. To address this need, the doctoral students Yu Guo, Ryung Suk Kim, Denise Scholtens, and Bin Zhang worked as research assistants (under the supervision of Professor Cheng Li) in the MACF during the past year to educate investigators about approaches and tools for microarray data analysis. Using software such as dChip and GeneCluster, the RAs assisted in the analysis of microarray data arising from a variety of settings including factorial designed cell line experiments, time course studies, and patient tissue samples. A description of the services provided by the MACF, as well as instructional material introducing some basics of microarray data analysis, is available at http://chip.dfci.harvard.edu.
The investigators in Dr. J. Dirk Iglehart's lab in the DFCI Department of Cancer Biology have studied the pathogenesis and pathophysiology of human breast cancer. One recent study, led by Dr. Alexander Miron, measured gene expression in an estrogen receptor positive breast cancer cell line after the application of various treatments in a factorial designed experiment. The main goal of the experiment was to uncover both direct and downstream target genes of estrogen. Robert Gentleman and Denise Scholtens worked with Dr. Miron to apply a linear model to the microarray data, and then interpret the parameters within the context of the living, dynamic cellular system. The upstream regions of the primary target genes currently show some evidence of sequence similarity, indicating a possible common transcriptional regulatory mechanism. Both the primary and secondary target genes are also being used in conjunction with patient tumor sample data to increase the understanding of different breast cancer subgroups seen in patients at a molecular level.
Professor Xiaole (Shirley) Liu is the newest faculty member in the Department working within the intersection of statistics, computation and biology. Her research interests are computational biology methodology for genomics, microarray and cancer research. Her current focus is on sequence analysis, especially finding transcription factor binding motifs and binding sites in gene promoter regions using computational approaches. The challenging issues include estimating the collaborative binding of several transcription factors on the regulation of one gene, and using comparative genomics methods to find transcription factor motifs from higher eukaryotes.
All of the work described above is computationally intensive, and the Department of Biostatistical Science has doubled its computing power in the last three years, and will likely redouble computational facilities in the next 2-3 years. Recently the department increased the CPU power on each desktop, added a multi-processor cluster, and increased storage capacity by an order of magnitude. The next year will see the addition of another cluster, high speed Unix, Linux and MS Windows workstations, and more disk space.
Statistical genetics is an exciting and rapidly expanding area for departmental research. This new research area reflects in part the rapid expansion of the use of genetic data in medical and public health studies. Ongoing collaborations between department members and colleagues in the Harvard medical community encompass genetic studies of asthma, COPD, bipolar disorder, nicotine addiction, behavioral inhibition, Alzheimer's disorder, and birth defects, among others.
The discipline of statistical genetics is nearly as old as statistics itself. Fisher, Pearson and Galton made early important contributions to the field. The general objective of the discipline is development of statistical methods for understanding how traits are inherited, and for locating genes that influence specific traits or disorders. Without the ability to obtain DNA data, such a task is understandably difficult, and success in humans was previously limited to disorders which were easily identified and exclusively genetic, such as Duchenne muscular dystrophy.
Two recent technological developments have radically changed the character of research in statistical genetics: the advent of modern technologies that allow us to obtain DNA data (also known as marker data) relatively inexpensively, and the Human Genome Project, which has used data on families to locate these markers on a particular chromosome and to determine their relationship to other markers on that chromosome. Having access to a wealth of DNA data on individuals has led researchers to undertake studies to determine the genetic basis of many common, complex disorders, which may be influenced by more than one genetic defect, and may have non-genetic causes as well. For example, searches are now underway for genes influencing cancers, asthma, COPD, obesity, diabetes, hypertension, and numerous psychiatric disorders.
Note that genes are comprised of several thousands of base pairs of DNA; typically there will be many different possible variants in the DNA of a gene, at different locations within the gene. We use the term marker to refer to data at one particular location (also called locus); even if we study only one gene, we may have data on many different DNA markers at different loci in that gene. Marker data can refer to information just on the value of a particular base pair, or information on a deletion of several base pairs at a locus, or the number of repeats at a locus. Sometimes we speak of 'genes causing disease'. In fact this is a bit of a misnomer; it is actually one or more mutations at specific loci within or flanking the gene that cause susceptibility to disease. It can be a long process between showing that a gene is associated with the disease and finding exactly what mutation in the gene is responsible.
As technology has improved and costs have dropped, it has become feasible to do large-scale genetic studies involving hundreds or even thousands of individuals. In addition, there may be hundreds or thousands of DNA markers at different locations throughout the genome on each person. There are two major types of studies that use these types of data sets to find genes: linkage and association. Linkage studies require DNA marker data and trait data on affected subjects and their affected relatives. By analyzing inheritance patterns of the DNA markers, they enable one to determine if a marker is "linked", or physically near, to a gene influencing the trait. Linkage studies do not require that markers be very close to the suspected disease gene, so one can scan the entire genome with only 300 markers or so. These studies have been very successful in determining the specific chromosomal regions where one should look for genes, but the region can still be very broad, encompassing hundreds or even thousands of genes. As a result, linkage studies are just the first step; much additional work has to be done to find the specific genes associated with disease within the region of linkage. This, and the fact that affected relatives are required, has led to the popularity of association studies.
Genetic association studies are much like ordinary association studies in statistics; the objective is to show that a particular DNA marker is associated with the disease. They will only be successful at locating genes if the DNA marker is "close", or tightly linked, to the suspected disease gene. This is both a blessing and a curse. From the perspective of giving better information about where the putative disease gene is, the association study is preferred over linkage. However, the markers must be placed closer to the putative disease gene, so many more markers may be required, especially if the location is completely unknown. A recent "whole genome scan" association study reportedly used over 800,000 markers on each individual. This is clearly infeasible in most settings where genotype costs are still around 50 cents per marker. More commonly, association studies are used in two more limited settings: scanning regions shown to be linked to the disease, and candidate gene studies. Candidate genes are those with biological/chemical properties that implicate them in the disease process. One can usually cover all the variable sites in a single gene with 20-30 markers. As we continue to identify more genes and understand their function, we can expect to see many more candidate gene studies. Likewise, as the cost of genotyping decreases, the feasibility of using large numbers of markers in association studies increases.
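To make the cost comparison concrete, the short sketch below multiplies out the marker counts mentioned above at roughly 50 cents per marker. The function name and the scenario labels are ours, chosen only for illustration; the figures are per individual and ignore every cost other than genotyping.

```python
# Illustrative arithmetic only: per-person genotyping cost at about
# 50 cents per marker, using the marker counts quoted in the text.
COST_PER_MARKER_CENTS = 50  # approximate figure from the text


def genotyping_cost_dollars(n_markers, n_subjects=1):
    """Return the genotyping cost in dollars for the given design."""
    return n_markers * n_subjects * COST_PER_MARKER_CENTS / 100


# Hypothetical scenario labels; marker counts come from the text.
scenarios = {
    "whole-genome association scan": 800_000,
    "genome-wide linkage scan": 300,
    "candidate gene (upper end)": 30,
}

for name, n_markers in scenarios.items():
    cost = genotyping_cost_dollars(n_markers)
    print(f"{name}: {n_markers:,} markers -> ${cost:,.2f} per person")
```

At these prices the whole-genome association scan costs $400,000 per person, against $150 for a linkage scan and $15 or less for a single candidate gene.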
Association studies can be carried out with two very different study designs, and there is some controversy about the desirability of each design. The first is an ordinary case-control design. The distribution of marker data is compared in independent samples of unrelated diseased and non-diseased subjects; in the simplest case, an ordinary chi-square test can be used for the analysis. Case-control studies have many attractive features: no relatives are needed for the study, it may be easy to recruit suitable controls, they are powerful, and the analysis extends easily to more complex settings, for example testing for gene-environment interaction. The biggest drawback of a case-control study is that it only tests for association and not for linkage. Confounding, often due to genetic stratification or population admixture, can lead to spurious results, and there are many unreplicated findings of association from case-control studies. A well known example of confounding occurred in a large (>5,000) study of Native Americans of the Pima and Papago tribes. Without adjusting for the extent of Caucasian ancestry, there was a very strong association between a variant of the Gm gene (part of the system of human immunoglobulin G) and type-2 diabetes mellitus. However, after estimating the degree of Caucasian ancestry of each subject and stratifying on that, the association vanished.
The second study design is family-based. Family-based designs are popular because they test for both linkage and association; this makes them immune to confounding by admixture. Thus many investigators regard family-based designs as worth the extra effort required to find relatives willing to participate.
The simplest and most popular family-based design obtains data on the diseased offspring and his/her parents. With data on parents, one can use Mendel's laws under H0 to calculate the expected distribution of genotypes among affected offspring. If the observed distribution deviates from the expected, this is evidence for association between disease and the marker. There are many extensions to this basic design and a number of analytic issues that arise with more complex designs. How does one compute the null distribution of offspring genotypes when parental genotype data are missing? What about association with quantitative or time-to-onset traits? What is the appropriate distribution if there are multiple offspring and it is already known that linkage is present? How does one test for gene-environment interactions?
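The way stratification can manufacture an association is easy to see in a small numerical sketch. The counts below are invented for illustration and are not taken from the Pima and Papago study: two hypothetical ancestry strata differ in both variant frequency and disease frequency, while within each stratum the variant is unrelated to disease. The code computes the ordinary Pearson chi-square statistic for a 2x2 table, first on the pooled data and then within each stratum.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]]: rows = cases/controls, columns = carrier/non-carrier."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_1df_pvalue(x):
    """Upper tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical counts: (case carriers, case non-carriers,
#                       control carriers, control non-carriers)
strata = {
    "stratum A (variant common, disease common)": (160, 40, 80, 20),
    "stratum B (variant rare, disease rare)": (20, 80, 40, 160),
}

# Pool the strata, as an unadjusted case-control analysis would.
pooled = tuple(sum(counts) for counts in zip(*strata.values()))
pooled_chi2 = chi2_2x2(*pooled)
print("pooled:", pooled, "chi2 =", pooled_chi2,
      "p =", chi2_1df_pvalue(pooled_chi2))

# Analyze each stratum separately.
for name, table in strata.items():
    print(name, "chi2 =", chi2_2x2(*table))
```

In this constructed example the pooled table gives a chi-square of 24 on one degree of freedom, while the statistic is exactly zero in each stratum, mirroring the pattern described above. A real analysis would combine the strata with a method such as the Mantel-Haenszel test rather than inspect them one at a time.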
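For the basic trio design, the comparison of observed with expected can be sketched with the classical transmission/disequilibrium test (TDT). We use it here as a simple stand-in; it is not necessarily the statistic computed by any particular software package. Under H0, Mendel's laws imply that a heterozygous parent transmits a given allele to an affected child with probability 1/2, so the number of transmissions b and non-transmissions c from heterozygous parents should be about equal, and (b - c)^2 / (b + c) is approximately chi-square with one degree of freedom. The genotype coding (number of copies of allele A) and the trio counts below are invented for illustration.

```python
import math


def tdt_counts(trios):
    """Count transmissions (b) and non-transmissions (c) of allele A from
    heterozygous parents. Each trio is (father, mother, child), with
    genotypes coded as the number of copies of allele A (0, 1 or 2)."""
    b = c = 0
    for father, mother, child in trios:
        parents = (father, mother)
        n_het = sum(1 for g in parents if g == 1)
        n_hom_a = sum(1 for g in parents if g == 2)
        # Copies of A in the child that must have come from heterozygous
        # parents, since each AA parent always transmits one copy.
        from_het = child - n_hom_a
        if not 0 <= from_het <= n_het:
            raise ValueError(f"Mendelian inconsistency in trio {parents}, {child}")
        b += from_het
        c += n_het - from_het
    return b, c


def tdt_statistic(b, c):
    """Classical TDT statistic, approximately chi-square (1 df) under H0."""
    if b + c == 0:
        raise ValueError("no heterozygous parents, so no informative transmissions")
    return (b - c) ** 2 / (b + c)


def chi2_1df_pvalue(x):
    """Upper tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical trios of affected children and their parents.
trios = ([(1, 0, 1)] * 30 + [(1, 0, 0)] * 15 + [(1, 1, 2)] * 10
         + [(1, 1, 1)] * 10 + [(1, 1, 0)] * 5 + [(2, 0, 1)] * 20)

b, c = tdt_counts(trios)
stat = tdt_statistic(b, c)
print("transmitted:", b, "not transmitted:", c)
print("TDT statistic:", stat, "p-value:", chi2_1df_pvalue(stat))
```

The 20 trios with two homozygous parents contribute nothing to the test, which illustrates why only heterozygous parents are informative in this design.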
The department's Family-Based Association Test (FBAT) project was initiated in 1998 with a grant funded by the National Institute of Mental Health. The specific aims of the project were to develop a comprehensive methodology for testing for association with family based designs that can accommodate data on trios, nuclear families and pedigrees, with possibly missing parents, can handle different types of traits, different genetic models, and tests of linkage as well as tests of association in the presence of linkage.
Much of the basic methodological work has been completed, and an overview of the methodology and some references can be found at our websites (http://www.biostat.harvard.edu/~fbat/default.html or http://www.biostat.harvard.edu/~fbat/pbat.html). The computer package FBAT (at the fbat default website) provides the software to analyze the data from any type of family-based study design. Recent work has focused on the development of power and sample size calculations (contained in PBAT, available at both websites), the construction of optimal test statistics, the analysis of different types of traits (dichotomous, continuous, time-to-onset) and multiple traits, gene-environment interactions, pedigrees, Monte Carlo and exact tests for small samples, tests for multiple tightly linked markers (haplotypes) in a narrow region, and methods for estimating effect sizes.
Department members collaborate in asthma studies with the Genetic Epidemiology Program at the Channing. One of the major goals of this collaboration is the statistical analysis of the genetic data from the Childhood Asthma Management Project (CAMP). CAMP was a clinical trial in which children with mild to moderate asthma were assigned to one of three treatment arms. During the follow-up period of five to seven years, asthma-related phenotypes (genetic traits are often referred to as phenotypes) were repeatedly measured and recorded. In an ancillary genetic study, the DNA samples from most children enrolled in CAMP and their parents were collected.
The diagnosis of asthma, like the diagnosis of many other complex diseases, is made from different combinations of abnormalities in sets of disease related phenotypes. Family-based association tests using simply the affection status often fail to find significant associations. Alternatively, the phenotypes can be tested individually. However, standard adjustments for multiple testing may become too severe to detect associations that reach over-all significance.
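A small sketch shows how severe a standard adjustment can be. It assumes the Bonferroni correction as the adjustment and uses invented p-values for eight phenotypes tested against a single marker; neither the choice of correction nor the numbers come from the CAMP analysis.

```python
# Hypothetical p-values from testing one marker against eight phenotypes.
p_values = [0.012, 0.030, 0.048, 0.20, 0.35, 0.51, 0.64, 0.90]
alpha = 0.05

# Bonferroni: compare each p-value with alpha divided by the number of tests.
bonferroni_threshold = alpha / len(p_values)

significant_unadjusted = [p for p in p_values if p < alpha]
significant_bonferroni = [p for p in p_values if p < bonferroni_threshold]

print("per-test threshold after Bonferroni:", bonferroni_threshold)
print("nominally significant:", len(significant_unadjusted))
print("significant after Bonferroni:", len(significant_bonferroni))
```

Three of the eight tests are nominally significant at the 0.05 level, but none survives the adjusted threshold of 0.00625, which is the difficulty described above.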
Our analysis of the CAMP genetic data was designed to take this complex definition of asthma into account. We extended the FBAT methodology so that multiple phenotypes which form natural symptom groups can be tested simultaneously, thereby reducing the number of comparisons. In order to avoid the requirement of a priori grouping, we further developed an algorithm that first finds the combination of phenotypes that gives the most powerful test statistic. Then, in the second step of the algorithm, only the selected phenotypes are tested. The algorithm substantially reduces the number of multiple comparisons and dramatically outperforms all other methods for multiple comparisons.
By applying these new methods to the CAMP data set, we have been able to find many significant associations between a series of candidate genes and asthma-related phenotypes that would have been missed otherwise. The application of our new methods has been formalized and implemented in the PBAT program. All new family-based genetic data sets that are collected in the Channing are now screened routinely for potential associations with PBAT.
A second collaborative project seeks to understand the genetic basis of bipolar disorder. Subjects in this study are being recruited from a large clinical trial, Systematic Treatment Enhancement Program-Bipolar (STEP-BP), based at Mass General and the University of Pittsburgh. There have been many linkage studies of bipolar disorder, but no conclusive findings that have been consistently replicated. The STEP-BP is a very large trial (approximately 5,000 bipolar subjects enrolled), and offers a unique opportunity for genetic sub-studies. Linkage studies are not feasible with this population because they require multiple affected relatives, and traditional case-control studies are not possible because it is not feasible to obtain the necessary information for psychiatric diagnosis of potential controls. Thus a family-based design using a single known affected offspring is ideal; it does not require making the diagnosis on any other participants. Because the population of subjects is 18 years and older, it is expected that many subjects will not have available parents, and siblings with unknown affection status will be sought for family controls. Part of the motivation for the PBAT software was to allow one to calculate the power of the study when siblings with unknown affection status are used in the family-based design. With rare disorders, such as bipolar disorder, the power loss relative to using unaffected siblings is negligible. Assuming only 2,500 patients agree to participate, we were still able to show that this study will have over 90% power using standard assumptions about disease genes and their effect sizes.
Several members of our group also collaborate with investigators at MGH on ongoing studies of Alzheimer's disease. Alzheimer's disease patients rarely have living parents, but often data are available on siblings. Using methods developed by our group for analyzing data when parents are not available, we found that a mutation in the alpha-2 macroglobulin (A2M) gene was strongly and significantly associated with Alzheimer's disease. The finding was exciting, since previous linkage studies had failed to show any connection. The finding remains controversial because it has not been consistently replicated in other studies. It has, however, continued to be a strong finding in the MGH data when additional families and markers are added.
Currently, we are working on extensions of the FBAT-approach to large pedigrees and on extensions that allow for more sophisticated disease models. In linkage-studies for complex diseases, often a small number of very large pedigrees is collected. After a region of linkage has been identified in such a study, it is most cost and time efficient to use the data on the extended pedigrees also in a family-based association analysis.
For example, for a linkage study of COPD in Costa Rica, our collaborators at the Channing will collect about 20 to 25 large pedigrees with up to 50 family members in an isolated valley region. In order to have sufficient power for the FBAT analysis and also to ensure that the asymptotic theory for FBATs holds, we will have to incorporate the entire pedigree structure in the test statistics.
In a collaboration with the Royal Twin Registry in London, we will analyze pairs of female twins with at least one sibling diagnosed with osteoporosis. As with asthma, osteoporosis and its severity are much better described by several quantitative measurements than by the affection status or by an aggregated measure. For osteoporosis, the most relevant quantitative phenotypes are a set of bone density measurements at well defined positions in the skeleton. The strength of the genetic effect is likely to depend upon age and menopause status. A meaningful FBAT analysis should incorporate these potential interactions between the gene and the covariates. At the moment we are working on a unified approach that includes such interactions and covariates in the FBAT statistics.
Many department members past and present have contributed to the development of the project, including Chris Corcoran, Steve Horvath, Ming-Chih Kao, Nan Laird, Steve Lake, Christoph Lange, Kathy Lunetta, Dan Rabinowitz, Mei-Chiung Shih, Kristel van Steen and Marcia Wilcox. Xin Xu (Assistant Professor, Department of Environmental Health Sciences, HSPH) developed the FBAT software and Christoph Lange developed the PBAT software.
The challenge of the future for family-based association studies will be to keep up with the fast-growing knowledge about the genetics of complex diseases and with rapid technological development. As the genetic components of diseases become better understood, FBATs will remain a powerful tool only as long as this growing knowledge can be incorporated in the analysis. Adding this knowledge to the FBAT approach while keeping its key advantages, robustness against population admixture and simplicity of the test statistic, will be the major methodological problem to be solved.
The department awarded eleven doctoral degrees this June. Minhee Kang received the department's first Doctor of Philosophy degree. The Doctor of Science degree was awarded to Elizabeth Brown, Meredith Goldwasser, Karen Han, Andy Houseman, Jie Huang, Amy Stubbendick, Mahlet Tadesse, Chien-Cheng Tseng, Theodore Whitfield and Zi-Fan Yu.
Fourteen students were awarded Master of Science degrees. Lesego Gabaitiri, Tom LaFramboise, Chelu Mfalila and Ann Thomas have moved on to employment or further education outside HSPH. Amy Carey, Andrea Cook, Laura Forsberg, Melody Goodman, Beth Ann Griffin, Meg Karamitis, Bill King, Kevin Rader, Brisa Sanchez and Suzanne Szwarc are continuing in our department as doctoral students.
Congratulations to all of our graduates and best wishes for continued success!
Graduating Students from left to right: (back row) Elizabeth Brown, Amy Stubbendick, Andres Houseman, Jie (Jenny) Huang, Theodore (Hatch) Whitfield; (front row) Meredith Goldwasser, Zi-Fan Yu, Karen Han
We asked graduates who will no longer be students in our department to comment on their experiences here. The following is a selection of their responses.
1) What are you doing now?
Minhee: I will start my position as a Biostatistician at Massachusetts General Hospital in September, with a faculty position (as an instructor) at the Harvard Medical School.
Mahlet: I am currently a Bioinformatics post-doctoral trainee in the department of Statistics at Texas A&M University. My official title, however, is research assistant professor.
Andres: I have a research associate position here at HSPH.
Amy: Working as a biostatistician for Biogen. I am working on Phase IV trials involving Avonex, a drug for the treatment of multiple sclerosis.
Tom: Starting my postdoc with Vince Carey and Victor DeGruttola on HIV genomics.
2) To which areas, if any, do you wish you had received more exposure as a student?
Minhee: I've been very happy with the program. I enjoyed TAing different classes each year. TAing in Tokyo and a summer internship at Merck were excellent experiences. I got everything I was looking for in a doctoral program.
Mahlet: I wish I had had the opportunity to closely collaborate with biologists and clinicians.
Andres: I would have preferred more exposure to domain knowledge in areas of application. For example, the opportunity to take more classes in non-biostat subjects.
Amy: Applied areas--clinical trial protocols, stat plans, collaboration with investigators.
3) What was the job search process like?
Minhee: It was short for me. My strategy was to start with academic positions as I was still working on my thesis, and then look into industry if I didn't find a good match in academia. I applied for two academic positions and found a good match right away. Making the commitment was the most difficult part--because I didn't expect to get offers so quickly. Timing is very important.
Mahlet: It was pretty straightforward for me. I wanted the position at Texas A&M and it's essentially the only one I applied for and got.
Andres: My job search was very limited because, once it was clear that it was feasible to finish within four years, I was under some pressure to finish quickly.
Amy: Not too bad, just lots of self-initiating stuff --searching on the web, sending out resumes, trying to find contacts, talking to recruiters, etc.
4) What were employers looking for in terms of skills, qualities, coursework and applied experience?
Minhee: In my case, I think that previous work experience in the ACTG helped a lot. Generally speaking, because the work of a biostatistician usually involves teamwork with experts of various disciplines, I assume that the employers look for candidates with good communication skills. And to communicate the concepts well to others, a solid coursework that provides a good understanding is important.
Mahlet: I'm in a training program in Bioinformatics, so the employers were looking for statisticians who are interested in getting involved in the field and have good collaborative skills. I had the advantage that my thesis work was in the same area of research, which I believe was a plus.
Andres: Hard to say, since I didn't go through any formal interviews. For the HSPH research associate position, I suspect that professional reputation and personality played a big role.
Amy: Everyone wanted applied experience. Good writing and speaking skills as well. SAS programming is a plus. Coursework in survival, multivariate, GEE, ANOVA, the basics.
5) What advice would you offer to newer students in the program?
Minhee: Take advantage of all that the department has to offer, rather than just working on courses! The department provides wonderful opportunities for internships at first-rate pharmaceutical companies, TAing in foreign countries and other summer opportunities abroad. Enjoy being a graduate student. Having worked for four years prior to entering the program, I fully appreciated the student life. (But now it's time for me to move on...)
Mahlet: Get as much exposure as possible to different areas of research by taking various classes, and most importantly try to acquire some practical and collaborative experience by working closely with biologists or clinicians.
Andres: Decide early whether you want to be more focused on statistical research or on practical applications. If the latter, find a way to become educated in at least one non-biostat area. To this end, use the minor requirements wisely. Also, spend time developing good working relationships with faculty.
Amy: If you're interested in an industry job, I would suggest a summer internship.
Tom: Start before your first year. Spend the summer going through as much of, for example, Casella & Berger as you can. Reinforce your linear algebra skills as preparation for Regression and ANOVA.
Chelu Mfalila, Steve Lagakos and Lesego Gabaitiri with Chelu's niece and nephew at the post-graduation reception.
Marvin Zelen, Lesego Gabaitiri, Liliana Orellana, Jarek Harezlak and Chelu Mfalila (with her niece and nephew) at the post-graduation reception.
By Stephen W. Lagakos
This has been a busy year in the Department. Some of these activities are described in this issue of the newsletter, and many others are detailed on our web page (www.biostat.harvard.edu), which I encourage you to browse. I would like to focus on three significant events: our new acting chair, adoption of the PhD degree, and the establishment of a departmental Alum award.
New Chair: I am pleased that Professor L.J. Wei will be serving as acting chair of the department from July 1, 2003 until June 30, 2004, while I am on sabbatical leave. LJ's many outstanding attributes, including his broad vision of our field, his interest in students and education, and his concern for the future of our department, make him an ideal choice for this position. We all wish LJ the best in his new role.
PhD Degree: For quite some time, the biostatistics faculty has discussed the possibility of adopting the 'PhD' as our doctoral degree. The decision to offer the PhD had nothing to do with differences in the nature or level (in terms of coursework) between the training programs, but rather involved other considerations, including:
Alum Award: In order to maintain closer ties with our 'Alums', and to recognize their contributions, the department has initiated several activities:
With best wishes, Steve Lagakos
The Department of Biostatistics at the Harvard School of Public Health has named Dr. Wayne A. Fuller, Professor, Departments of Statistics and Economics, Iowa State University, as the recipient of the 2003 Marvin Zelen Leadership Award in Statistical Science. The lecture, "Analytic Studies with Complex Survey Data", was given on Friday, May 30, immediately following the Schering-Plough Workshop at Harvard.
Dr. Fuller is Distinguished Professor Emeritus at Iowa State University. He has cooperative research projects with the NRCS, the USDA National Agricultural Statistics Service, Westat, and the Census Bureau. His research interests include time series, particularly estimation for autoregressive and unit root processes, measurement error models and survey sampling. Current research activities include nonlinear measurement error models, estimation for small areas, estimation for two phase samples, and imputation for survey samples.
In his lecture, Dr. Fuller spoke about the use of data collected under a complex survey design. He identified some of the implications of complex designs for model specification and estimation and discussed alternative superpopulation models. He illustrated that survey designs pose challenges for the analyst, but also offer opportunities for testing of the subject matter model and for the development of expanded subject matter models.
This annual award, supported by colleagues, friends and family, was established to honor Dr. Marvin Zelen's long and distinguished career as a statistician and his major role in shaping the field of biostatistics. The award recognizes an individual in government, industry or academia, who by virtue of his/her outstanding leadership, has greatly impacted the theory and practice of statistical science.
LJ Wei, Wayne Fuller and Louise Ryan
Dr. Danyu Lin, Dennis Gillings Distinguished Professor, Department of Biostatistics, University of North Carolina School of Public Health was this year's winner of the Myrto Lefkopoulou Distinguished Lecturer award. This annual award was initiated in 1993 in memory of Myrto Lefkopoulou, a former beloved faculty member and student in the Department of Biostatistics. Dr. Lefkopoulou tragically died of cancer in 1992 at the age of 34 after a courageous two-year battle.
Dr. Lin presented the Myrto Lefkopoulou Distinguished Lecture on "Selection and Assessment of Regression Models" on September 19, 2002 in the Snyder Auditorium. In the talk, Dr. Lin presented objective and informative strategies for model selection and assessment based on the cumulative sums of residuals over certain coordinates (e.g., covariates or fitted values) or some related aggregates of residuals (e.g., moving averages and kernel smoothers). He showed how the distributions of these stochastic processes under the assumed model can be approximated by the distributions of certain zero-mean Gaussian processes whose realizations can be easily generated by computer simulation. Each observed residual pattern can then be compared, both graphically and numerically, with a number of realizations from the null distribution. Such comparisons enable one to assess objectively whether a specific aspect of the model (e.g., the functional form of a covariate, the link function or the proportional hazards assumption) has been correctly specified. They also provide helpful hints on how to obtain an appropriate model. Finally, he applied his approach to a wide variety of statistical models and data structures, and provided illustrations with several clinical and epidemiologic studies. The lecture was followed by a presentation of a plaque and a reception in Dr. Lin's honor.
The winner of the 2003 Myrto Lefkopoulou Distinguished Lecturer award is Marie Davidian, Ph.D., of North Carolina State University. Dr. Davidian will give her lecture on Thursday, September 18, 2003 at Harvard School of Public Health.
Dr. Danyu Lin
This year's Harvard/Schering-Plough Workshop was held on May 29-30, 2003. The theme was the impact of statistics on the development and approval of oncology drug products. Over 200 participants from academia, industry and government attended the workshop, including a large number of former students. The first session, chaired by Donna Neuberg (HSPH), provided an overview of statistical challenges in oncology drug development, such as new study designs, testing non-inferiority hypotheses, time to progression as a cancer drug endpoint, and barriers to translating basic research to drug products for patient care. Presenters in this session included Lee Nadler (Harvard Medical School and Dana-Farber Cancer Institute), Renzo Canetta (Bristol-Myers Squibb), Gang Chen (FDA), Grant Williams (FDA), and Stephen George (Duke University Medical Center). The second session, chaired by Stephen Lagakos (HSPH), focused on survival analysis. Terry Therneau (Mayo Clinic) discussed the use of mixed effects Cox models, with an application to incorporating familial genetic effects in a breast cancer family study. LJ Wei (HSPH) presented a review of recent developments in survival analysis as well as areas where further research is needed. The third session, chaired by Robert Gray (HSPH), addressed statistical issues in observational studies and causality. Donald Rubin (Harvard University) discussed causal inference with surrogate outcomes, with a case study of human and macaque trials of anthrax vaccine. Andrea Rotnitzky (HSPH) discussed how to assess treatment effects in the presence of censoring by death. The workshop concluded with a final session on experiences and expectations, chaired by Marcello Pagano (HSPH). A variety of statistical issues were discussed and illustrated in applications, such as design and analysis of two-stage stratified sampling designs, adaptive randomization in the evaluation of multi-stage therapeutic strategies, decision analysis, and stratified randomization.
Presenters in this session included Norman Breslow (Fred Hutchinson Cancer Research Center), Peter Thall (M.D. Anderson Cancer Center), Milton Weinstein (HSPH), and Steven Piantadosi (Johns Hopkins School of Medicine). Also in this session, Yasuhiro Fujiwara (National Cancer Center Hospital, Japan) presented an overview of oncology drug development and approval in Japan. The workshop was followed by the presentation of the year 2003 Marvin Zelen Leadership Award in Statistical Science to Professor Wayne A. Fuller (Iowa State University).
Nan Laird and Samuel Heft
Our Panel of Distinguished Presenters
The Department of Biostatistics is pleased to announce the establishment of the Distinguished Alum Award. This award is being initiated to honor the exemplary achievements of a former graduate of the department.
Each year the Distinguished Alum Award will be given to a former student who has achieved a high level of accomplishment in any or all of the following areas: collaborating in health research, developing statistical methods or theory, organizational responsibility, or teaching. The award will be open to all who have earned a degree through the department, regardless of length of time since graduation or type of degree. The award recipient will be invited to deliver a lecture on their career and life beyond the Department at the Harvard School of Public Health. The recipient will also be presented with a plaque. A committee of alumni was formed to develop the criteria for the award and to make the selection.
Nominations for next year's award should be sent to the Distinguished Alum Committee, Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115. Nominations should include a letter describing the contributions of the candidate, specifically highlighting the criteria for the award, and a curriculum vitae. Supporting letters and materials are welcome but not required. All nominations must be received by January 31, 2004. The first award will be given in May 2004.
This past January was the first run of the School of Public Health's newly created WinterSession. The aim of the WinterSession, as given on the School's website, is to provide both students and faculty with an opportunity for creativity and innovation in learning and teaching.
Within the Department of Biostatistics, funded students were required to participate in activities equivalent to the full workload of a 2.5 credit course, about 80 hours of real-time work. Students choosing not to take courses were required to work on appropriate projects, as approved by their advisor. Complete WinterSession guidelines for Biostatistics students are detailed on the Department's website.
Courses offered by the Department of Biostatistics during this past WinterSession most relevant to its students were BIO 229, Practical Pattern Recognition of Genomic Data, and BIO 286, Introduction to Genomics. Biostatistics students also took courses outside of the Department, including ID 285, Environment Health Risk: Cases and Concepts.
In an attempt to evaluate the newly created WinterSession from a student's perspective, the 2003 Curriculum Committee of the Department of Biostatistics provided each Biostatistics student who took a WinterSession course with a course-evaluation form. The form was designed by the Curriculum Committee and Donna Neuberg, and consisted of eleven questions. One item in particular that the Curriculum Committee wanted to investigate was whether the short time frame of the WinterSession caused courses to be too rushed.
Overall, most responses were positive. As BIO 229 presented so many interesting topics, some students felt that they could have used more time to study this area in more depth. Of particular note was the general enthusiasm for the discussion-based format of BIO 286. Most students thought that this approach was ideal for the short time frame of the WinterSession. When asked if they had ideas for other types of courses to be offered in future WinterSessions, several students suggested a computing course, such as programming in R or C/C++.
This summer, the June program for minority college undergraduates was reorganized to become a broad introduction to research methods and topics in public health. In addition to introducing students to options in public health careers, we hope to recruit them to our doctoral programs at HSPH in Biostatistics; Epidemiology; and Society, Health and Human Development. To that end, Math and non-Math majors were accepted, and the Summer Program in Biostatistics became the Summer Program in Quantitative Sciences for Public Health Research. This summer, we had an excellent group of students who came from all over the United States, from universities ranging from the University of Texas, San Antonio and the University of Puerto Rico, Río Piedras, to Hampton University and Harvard.
Most of the session consisted of lectures and classwork. Students attended a morning class in introductory Biostatistics, taught by Dr. Scarlett Bellamy, a graduate of the department and currently Assistant Professor at the University of Pennsylvania School of Medicine. In the afternoons, students attended modules of lectures in Epidemiology; Society, Health and Human Development; and Applying to Graduate School. Friendly competition between the other departments was noted, with Epidemiology bringing forward excellent professors like Marc Lipsitch and Julie Buring to excite the students about Infectious Disease Epidemiology and Epidemiology. SHHD chair Lisa Berkman wowed the students with a great presentation on Population Health and the impacts of society on health. Several students were considering work in these areas as a result of the excellent lectures.
Students also chose group research projects, and the students with strong math backgrounds were able to pursue their interests by learning new statistical methods. One of the groups, which was composed of three biostatistics majors, worked with Dr. Gregory DiRienzo and graduate student Yannis Jemiai to correlate CD4 and Viral Load counts with the efficacy of the drug Indinavir from a clinical trial. The students learned about survival analysis methods and programmed in R, and now all three plan to apply to graduate programs in biostatistics. Other student projects included: Gene Expression and Endometrial Cancer Risk, Utilization of Dental Services at Children's Hospital, and the Correlation between BMI and Postnatal Depression in Pregnant Women.
In addition, students had many opportunities to meet with faculty from the departments outside of lectures to discuss research interests, including a faculty lunch convened by Epidemiology and a trip to dim sum with Dr. LJ Wei from Biostatistics. During one meeting, several alumni of the Biostatistics department were able to spend time with students and talk about academic careers, including Dr. DeJuran Richardson and Dr. Stuart Lipsitz. The graduate student perspective was also well represented by the graduate student mentors who supervised the research projects. Informal lunches and meetings convened by all three departments added to student-faculty interactions.
Summer Program students with LJ Wei, Louise Ryan and Farzaneh Behroozi (center) in Chinatown.
Thanks to the efforts of Masahiro Takeuchi, who graduated from our department in 1991, we have a partnership with the Department of Biostatistics at Kitasato University, Tokyo. Over the years faculty from the HSPH Department of Biostatistics have given courses at Kitasato. Students from our department have accompanied these faculty members as teaching assistants. We wondered about the TA experience of our students in a foreign country where English was not the primary language, so we asked some of them to share their experiences with us. What comes through is the extraordinary hospitality extended to our students by their colleagues at Kitasato.
In November of 2000, I had the good fortune to assist Steve Lagakos in teaching the Analysis of Failure Time Data course to the students at Kitasato University's Department of Biostatistics in Tokyo, Japan. It was indeed a fantastic experience. The department was fairly new at the time, but had already attracted approximately a dozen enthusiastic and talented students, mostly professionals working in the field at pharmaceutical companies in Japan. Many students commuted from cities as far away as Osaka, up to two hours by "bullet" train, to participate in the course. I was highly impressed by the level of courtesy the faculty (headed by HSPH grad Masahiro Takeuchi), staff and students showed Steve and me; we were treated with VIP status. This included first class accommodations, catered lunches, and dinners in some of Tokyo's finest restaurants.
The weekend between our two-week stint was no exception. Three of the students volunteered to accompany Steve and me on a guided hike on Japan's Mt. Tanigawa, located a couple of hours north of Tokyo, with views of some of the most beautiful landscapes I have seen. I took a break the next day and was treated to a guided tour of the city, which included many fascinating temples and museums. The day ended with an authentic Japanese meal at the home of Yasuyo Kodaka, the department's administrative director. Steve, our never-tiring leader, went on another challenging hike closer to Mt. Fuji, also with stunning views.
The course was a huge success. All of the students worked so hard and mastered the material so well for such an intensive course (not to mention the fact that it was offered in English!). It was the first of what I hope to be many visits to Japan, and I consider the experience one of the highlights of my years spent in the Department of Biostatistics at HSPH.
Karen Eckstein Han (Sc.D. 2003)
Because I was used to TAing courses where the students were vocal, what struck me the most was how quiet the students were. The first few sections were pretty quiet, and I wondered how much the students understood, since English is not their primary language. I tried to speak more slowly in proper English and avoid using slang and words such as "like" and "you know", but I found myself returning to my normal way of speaking after about 10 minutes! I wasn't sure at first if the students were not understanding me well or if they were simply shy. After the first couple of days, I asked a few students whom I happened to see at a restaurant. They told me that it is better to pick students to answer questions. I suppose that the students didn't want to look like they were showing off by raising their hands and answering questions! So I tried the Socratic method, and it worked fine. Also, Garrett and I started a contest where the students who speak the most in the class would get prizes! After a few classes, I got to know the students better and the classes and sections became more fun. And yes, we kept our promise and gave out prizes to the top three vocal students at the end of the session!
The politeness and helpfulness of the students and the staff also left strong impressions on me. As an example, I wish I had kept the detailed itinerary that a student made for me, when I asked him for recommendations on visiting Kyoto. I was so surprised when the student returned a few days later and handed me a 5- or 6-page itinerary of how I could enjoy Kyoto! He had made copies of maps and drew arrows so that I could find various places, wrote down suggestions for restaurants, pleasant walks, places for shopping - you name it, he had it all! Also, the department secretary, Yasuyo, was most thoughtful in every way. Who, but Yasuyo, would think of carefully penciling in the English translation of the information on the train tickets? She thought of everything and was so efficient in her ways.
In other words, I'd be ready to pack up my bags to TA in Tokyo again any time! I hope the TA opportunities for the department students continue.
Minhee Kang (Ph.D. 2003)
The two weeks while I was TAing at Kitasato University were full of fun and pleasant memories. I'm still very grateful for the hospitality of the faculty, staff and students at Kitasato University. The students were very polite and quiet during class, but they were really fun to talk to and to hang out with after class. There is one student who is from Osaka but knows almost all the best restaurants and fun places in Tokyo, and we called him "Master of Dining". Of course we followed Master of Dining's recommendations and tasted some really good and authentic Japanese foods in Tokyo. I'd also like to thank another student, also from Osaka, who took the time to show me around Kyoto. Without his guidance, I would have gotten lost in this beautiful city. He was so considerate that he wrote down all the places we had visited in Kyoto in both Japanese and English. I still have this piece of paper, which always reminds me of the happy days I had in Japan. It's really my pleasure to know such a class of wonderful students in Japan.
Yihua (Mary) Zhao
David Wypij Leads Biostatistics High Growth Venture at Children's Hospital in Boston
From left to right front (Henry Feldman, Clarissa Valim) and rear (Mei-Chiung Shih, Armando Teixeira-Pinto, Leslie Kalish, David Wypij)
Did you know that there is an expanding group of current and former faculty and students of the Department of Biostatistics located in the Clinical Research Program (CRP) at Children's Hospital Boston? The role of biostatistics was greatly expanded when David Wypij (current HSPH faculty) became the Director of Biostatistics in the CRP, which has experienced substantial biostatistical personnel growth during a relatively brief period. Since his arrival, David has been joined by several biostatistical professionals with ties to HSPH's Biostatistics department, including Mei-Chiung Shih (current HSPH faculty), Henry Feldman (former HSPH faculty '78-'89), Leslie Kalish (ScD '86 and faculty '87-'93), Clarissa Valim (SM '02 and ScD '03 in Immunology and Infectious Diseases) and Armando Teixeira-Pinto (current doctoral student). Ming Lin (current doctoral student) has also previously worked at the CRP. These individuals and several other statisticians have joined together to create an exciting new opportunity for Biostatistical research and collaboration within the Longwood Medical Area.
Formed in 1998 to enhance the quality of clinical research at Children's Hospital, the CRP includes biostatisticians, epidemiologists, physicians, and specialists in survey design, protocol development and review, database and web design, and data management. "It has been challenging to promote the visibility of statistical science at Children's over the last few years, but also very rewarding to see the corresponding increase in respect for our contributions to research," says David Wypij. The CRP provides methodological and biostatistical support for clinical research throughout the hospital. This includes consultation on study design, protocol development and grant writing; review of protocols and other study proposals; database design and management; design and development of randomization and blinding procedures, data collection forms and interim monitoring procedures; statistical analyses; and collaboration on study publications and presentations.
The collaborative research that the CRP participates in includes a very wide variety of research designs, levels of research experience of medical investigators, and topics studied. Biostatistical involvement in any particular project might range from a single brief consult to a long-term collaboration. One of the major collaborations in the CRP is with the Glaser Pediatric Research Network (GPRN), a national network of investigators studying pediatric illnesses, funded by the Elizabeth Glaser Pediatric Research Foundation. The CRP serves as data coordinating center for the GPRN. The CRP is also active in training and education. Dr. Wypij is co-director of the "Introduction to Clinical Research" course, offered twice a year for junior clinical investigators at the hospital. Henry Feldman also gives several lectures in this course, as well as a separate course aimed at residents. In addition, CRP statisticians serve as mentors to individual medical fellows and junior faculty, helping them design and analyze their research projects. Within the CRP, the statisticians meet regularly for informal discussions of statistical design and analysis problems they encounter in their work.
Funding for the CRP comes from many sources. The hospital provides some core funding as part of its support for clinical research and training. Additional funding comes from the hospital's General Clinical Research Center (GCRC) and from many federal and foundation research grants. "We are happy to provide limited consulting support to investigators, but if the project requires more than a few hours of our time, we try to get specific funding," says David Wypij. Having added two new statisticians to its staff in just the past few months, and searching for another, the CRP is a healthy and growing organization with great academic investment potential.
Zelen honored with Diploma of Doctor Honoris Causa
The Department is very proud to announce that Dr. Marvin Zelen has been nominated to obtain the Diploma of Doctor Honoris Causa from the University Victor Segalen Bordeaux 2 in France. The official ceremony has been tentatively scheduled for September 2003.
Lagakos elected Member of the Institute of Medicine
The Department extends congratulations to Dr. Steve Lagakos who has been elected a member of the Institute of Medicine of the National Academies.
Betensky and Fitzmaurice Elected Fellows of ASA
Rebecca Betensky and Garrett Fitzmaurice, HSPH Biostatistics faculty members, have been elected as fellows of the American Statistical Association. The awards ceremony for new fellows will be held during the coming Joint Statistical Meetings in August 2003. The department congratulates Rebecca and Garrett on this well-deserved honor.
Wing Wong Elected to AAAS
The Department is happy to announce that Professor Wing Wong has been elected a fellow of the American Association for the Advancement of Science. Our congratulations to Wing!
Biostatistics Alumni Reception
A Biostatistics Alumni Reception will be held at the 2003 Joint Statistical Meetings in San Francisco, California on August 4, 2003 from 5:30 PM to 7:00 PM at the Hilton San Francisco. All alumni and friends are welcome to join us at the reception. More information may be found online at http://www.amstat.org/meetings/jsm/2003/onlineprogram/.
Department & Schering-Plough Receive Partnership Award
In recognition of our long-standing and successful partnership with Schering-Plough Research Institute, Harvard and Schering-Plough have been selected by the ASA to receive the 2003 SPAIG (Statistical Partnerships among Academia, Industry, and Government) Award. The award will be presented during the Presidential Address session of the annual meeting of the ASA in San Francisco on August 5, 2003.
New Faculty join the Department
We welcome Dr. Christoph Lange and Dr. Peter Kraft to the Department of Biostatistics! As a result of a recent Junior Faculty Search, Christoph Lange, formerly a post doc in the department, has joined us as an Assistant Professor in Biostatistics. Dr. Kraft has been appointed to a joint position with Epidemiology.
Senior Faculty Search Ends
Congratulations to Dr. Robert Gray, of the Dana-Farber Cancer Institute and the Department of Biostatistics, on his promotion to tenured Professor!
The following is recent news from our alums:
Thanks to C. Ralph Buncher (ScD '67), who tells us that there are many measures of the quality of the graduates of Harvard statistics programs. One measure of success is Fellowship in the American Statistical Association, which is only conferred on about 5 to 10% of members. Many worthwhile and successful careers do not result in Fellowship, so this is only one measure. In the 1960s (actually the data refer to 1962 through 1973), the major program was the Cambridge PhD program, and the smaller program was in Biostatistics. Of the five ScD graduates in biostatistics, four have become fellows. Of the approximately 30 PhD graduates in the same period, 20 have become fellows. This is a truly impressive achievement for the institution and the teachers of these students. (email@example.com)
Sheldon Fishman (MS '73) is working as Webmaster for Swidler Berlin Shereff Friedman, a large law firm in Washington, DC. He is also an instructor at Johns Hopkins University. After many years in the US Public Health Service, his current contribution to PHS is genetic--his daughter works in the Food and Drug Administration's General Counsel office. In June, his four children and son-in-law gave him and his wife a 30th wedding anniversary party. At the party, there were 38 couples who had been married for at least 20 years for a total of 1207 years of marriage. No estimates of sex acts were undertaken. (firstname.lastname@example.org)
Mike Parzen (ScD '93) and his wife Andrea just had their first baby boy, Zeke Parzen. Mike will be leaving the University of Chicago for Emory University this fall. The Bris in Chicago was attended by Stuart Lipsitz (ScD '88) and DeJuran Richardson (Adjunct Associate Professor of Biostatistics). (email@example.com)
Nick Horton (ScD '99) has accepted a position as Assistant Professor in the Mathematics Department at Smith College, joining Katherine Taylor Halvorsen (ScD '84), Professor of Mathematics. Nick will have lots of opportunities to teach statistics at Smith. The Mathematics Department offers students a minor in Applied Statistics, and the Mathematics major includes a Statistics track.
John Spritzler (ScD '92) had his book on World War II published in May. It challenges the "good war" story of World War II. The book, called "The People As Enemy: The Leaders' Hidden Agenda in World War II," published by Black Rose Books, is available at the Coop. (firstname.lastname@example.org)
Jonathan French (ScD '00) and his wife Alexandra had a baby girl on June 10, 2003. Her name is Caroline Camille French. All are doing well. (email@example.com)
Christl Donnelly (ScD '92) and Ben Hambly had a baby boy, Evan Matthew Donnelly Hambly on June 5, 2003 in Oxford, England. (firstname.lastname@example.org)
Amy Herring (ScD '00) married David Dunson on June 14. She is an Assistant Professor at UNC Biostatistics, and David is Senior Investigator, National Institute of Environmental Health Sciences, and Adjunct Associate Professor, Institute of Statistics and Decision Sciences, Duke University. (email@example.com)
Phuong Dang (MS '95) is currently a Sr. Supervisor in Medical Affairs at Genentech. (firstname.lastname@example.org)
Wendy Leisenring (ScD '92) is still enjoying her work in Seattle, where she is an Associate Member at the Fred Hutchinson Cancer Research Center. She is also kept busy and happy by her two active kids: Annie, who is almost 5, and Cameron, who is 2.
We would love to hear from you too! Please send your news to email@example.com.
Our NIH-funded IMSD grant (Initiative for Minority Student Development) was successfully renewed last year for another four years. The grant supports 10 predoctoral students at HSPH, as well as our Summer Program and an annual workshop. This year, our webpage was used as a model for the school-wide diversity webpage design. Our IMSD students continue to do well: three more this year (Renee Boynton-Jarrett, Melody Goodman and Larry León) successfully applied for their own individual NIH grants, and Jennifer James received a Ford Foundation Fellowship. In addition, several students from the group will graduate this year: Kevin Roberts and Cassandra Arroyo from the Department of Biostatistics, Maleeka Glover from the Department of Health and Social Behavior, and Cheryl Clark from Stanford Medical School and the Department of Health and Social Behavior. Kevin will go on to a post-doc in biostatistics at Columbia, Cassandra will take a position at Morehouse, Cheryl is a resident in internal medicine at Brigham and Women's Hospital in Boston, and Maleeka will join the Epidemic Intelligence Service of the CDC.
The Community-Based Research Seminar, which is affiliated with our NIH IMSD Training Grant, hosted monthly talks, including a discussion of the Baltimore lead chelation trial with Dean James Ware, the study design of the Project on Human Development in Chicago Neighborhoods with Dr. Steve Buka, and a lively presentation by Dr. Alba Cruz and Toni Williams on The Family Van, a community clinic service of Harvard Medical School. The seminar group continues to thrive as a social support for students, as a forum to present practice talks, discuss career plans, and get help on leads for summer research projects, and as a sounding board for student concerns such as what to do about the dreaded Boston winter blues. A highlight of the year was a lunch visit with Dr. Kenneth Olden, the director of NIEHS, who spoke to the students about their career plans and his own experiences at Harvard and Washington.
This year, we have expanded what was formerly known as the "Summer Program in Biostatistics" to become the "Summer Program in Quantitative Methods". The expansion reflects efforts to attract students to other quantitative fields in public health besides biostatistics, namely epidemiology. This year's program was a great success, with some excellent presentations from faculty in the Epidemiology Department and the Department of Health and Social Behavior (which has now become the Department of Society, Health and Human Development). Several faculty from the biostatistics department (Greg DiRienzo, Mei-Chiung Shih) served as faculty mentors for the program, while several of our students served as grad student mentors (Yannis Jemiai, Eric Tchetgen). Dr. Scarlett Bellamy (a former department grad, now on the faculty at U Penn) was our summer program instructor.
We are lucky this year to have a great program coordinator, Farzaneh Behroozi, to help keep things running well. Farzaneh comes to us with great experience, having been a research assistant at HMS and having worked in the community with City Year. When she is not coordinating the program, Farzaneh is a graduate student in public health and medicine at Boston University.
We are looking forward to welcoming new students in the fall, and want to wish our graduates the best of luck! We definitely encourage anyone interested in participating in our diversity program (as a speaker, mentor, whatever!) to contact us.