PQG Seminar Series

2009-2010

Fall seminars will be held from 12:00-1:30 PM in FXB G13. Spring seminars will be held from 12:30-2:00 PM in Kresge G2.

The Program in Quantitative Genomics at the Harvard School of Public Health will again host a regular monthly seminar series starting in the fall. The seminar series alternates on a bi-weekly basis with the Bioinformatics Forum.

The mission of the HSPH PQG is to improve health through the interdisciplinary study of genetics, behavior, environment, and medicine, by developing and applying quantitative methods, especially for high-dimensional data, and by training quantitative genomic scientists.

 

Upcoming Seminar



Tuesday, December 1, 2009
12:00-1:30 PM, FXB G13

A pizza lunch will be provided.

David Reich, Ph.D.
Associate Professor
Harvard Medical School, Department of Genetics
Associate Member, Broad Institute

"Reconstructing Indian Population History and Implications for South Asian Gene Discovery"

India has been underrepresented in genome-wide surveys of human variation. Here I describe an analysis of patterns of variation in 25 diverse Indian groups to provide strong evidence for two ancient populations, genetically divergent, that are ancestral to most groups today. One, the "Ancestral North Indians" (ANI), is genetically close to Middle Easterners, Central Asians, and Europeans, while the other, the "Ancestral South Indians" (ASI), is not close to any group outside the subcontinent. By introducing methods that can estimate ancestry without accurate ancestral populations, we show that ANI ancestry ranges from 39-71%, and is higher in traditionally upper caste and Indo-European speakers. Groups with only ASI ancestry may no longer exist in mainland India. However, the Andamanese are an ASI-related group without ANI-related ancestry, showing that the peopling of the islands must have occurred before ANI-ASI gene flow on the mainland. Allele frequency differences between groups in India are larger than in Europe, which we show reflects strong founder effects whose genetic signatures have been maintained for thousands of years due to endogamy. There are two key medical implications. First, our observations predict that there will be an excess of recessive diseases in India, different in each group, whose risk variants should be easy to identify using standard genetic methods. Second, the genetic risk factors that are only present in the ASI will be very difficult to discover without building specific genetic variation resources for South Asians.



Additional Seminar Dates


*** New room and time in the spring ***

February 2, 2010
12:30-2:00 PM, Kresge G2
Shamil Sunyaev, Brigham & Women's Hospital, Harvard Medical School

March 2, 2010
12:30-2:00 PM, Kresge G2
Hakon Hakonarson, Children's Hospital of Philadelphia

April 6, 2010
12:30-2:00 PM, Kresge G2
Nick Patterson, The Broad Institute

May 4, 2010
12:30-2:00 PM, Kresge G2
Tianxi Cai, Harvard School of Public Health

Past Seminars


Tuesday, November 3, 2009
12:00-1:30 PM, FXB G13

Giovanni Parmigiani, Ph.D.
Chair, Department of Biostatistics & Computational Biology, DFCI
Professor, Department of Biostatistics, HSPH

"Cross-study Differential Gene Expression"

In this lecture I will present statistical issues associated with combining microarray data across studies. I will focus on the role of hierarchical Bayesian models in constructing useful rules to shrink across both genes and studies, and to classify genes based on the patterns of concordance across studies. I will describe in detail a model we call XDE, and evaluate its performance in a comprehensive fashion, using both artificial data and a "split sample" validation approach that provides an agnostic assessment of the model's behavior not only under the null hypothesis but also under realistic alternatives. Compared to a more direct combination of t- or SAM-statistics, the 1 - AUC values for the Bayesian model are roughly half of the corresponding values for the t- and SAM-statistics. In small studies, XDE generally outperforms other methods when evaluated by AUC, FDR, and MDR across a range of simulation parameters, and this difference diminishes for larger sample sizes in the individual studies. Finally, I will illustrate our model using four breast cancer studies employing different technologies (cDNA and Affymetrix) to estimate differential expression in estrogen receptor positive tumors versus negative ones. Software and data for reproducing our analysis are publicly available.

A technical report can be obtained from: http://www.bepress.com/jhubiostat/paper158/

 


Tuesday, October 6, 2009
12:00-1:30 PM, FXB G13

Peter Park, Ph.D.
Assistant Professor of Pediatrics
HMS Center for Biomedical Informatics
Children's Hospital Informatics Program

"ChIP-sequencing: Data Analysis and Applications"

ChIP-seq combines chromatin immunoprecipitation (ChIP) with next-generation sequencing to identify protein-DNA interactions on a genome-wide scale. After a brief introduction to next-generation sequencing, a number of practical issues in analysis of ChIP-seq data will be discussed, including experimental design, detection of binding sites, and determination of whether a sufficient depth of sequencing has been achieved. Application of ChIP-seq to the study of X-chromosome dosage compensation in Drosophila and nucleosome positioning will be described. If time allows, updates from the Cancer Genome Atlas and the model organism ENCODE projects will be given.

 


Tuesday, September 29, 2009
12:00-1:30 PM, FXB G13

Mitchell Gail, M.D., Ph.D.
Senior Investigator, Biostatistics Branch
Division of Cancer Epidemiology and Genetics
National Cancer Institute

"The value of adding single nucleotide polymorphism data to a model that predicts breast cancer risk"

Seven single nucleotide polymorphisms (SNPs) have recently been confirmed to be associated with breast cancer. I assessed the value of adding these SNPs to the Breast Cancer Risk Assessment Tool (BCRAT), which is based on ages at menarche and at first live birth, family history of breast cancer, and history of breast biopsy examinations. The model with these SNPs (BCRATplus7) had an area under the receiver operating characteristic curve (AUC) of 0.632, compared to 0.607 for BCRAT. This improvement is less than that obtained by adding mammographic density to BCRAT. I also assessed how much BCRATplus7 reduced expected losses in deciding whether a woman should take tamoxifen to prevent breast cancer and in deciding whether a woman should have a mammogram. In addition, I examined whether BCRATplus7 was more effective than BCRAT in allocating a scarce public health resource, such as access to mammography, based on ranking women by their breast cancer risk and allocating the resource to those at highest risk. In none of these applications did BCRATplus7 perform substantially better than BCRAT. A cross-classification of risk by the two models indicated that some women would change risk categories, depending on the risk threshold, if BCRATplus7 were used instead of BCRAT, but it is not known whether BCRATplus7 is well calibrated. These results were hardly changed when three additional, very recently identified SNPs were added. I conclude that the available SNPs do not improve the performance of models to estimate breast cancer risk enough to warrant their use outside the research setting.

Please feel free to contact us with any comments or questions at: sandelma@hsph.harvard.edu