Working Group Series

The PQG will host another monthly working group series for the 2013-2014 school year. The list of dates and speakers below will continue to be updated.

The PQG organizes the less formal working group seminar for all local students, postdocs, and faculty. The goal is to provide the opportunity to present and participate in the discussion of works-in-progress, focused on the methods and analysis of high-dimensional data in genetics and genomics.

Co-organizers: Bjarni Vilhjalmsson & Sasha Gusev

Please direct any logistical questions to Shaina Andelman


Upcoming Working Group


Tuesday, April 22, 2014
12:30-2:30, Building 2-Room 426
A pizza lunch will be provided.

Kaitlin Samocha
Analytic and Translational Genetics Unit | MGH

Identification of a Set of Highly Constrained Genes from Exome Sequencing Data

A major challenge of medical genetics is to determine which variants, if any, contribute to disease in a patient. While variants can be prioritized based on their predicted deleteriousness, information about the disrupted gene can also be used to highlight those variants that are more likely to contribute to disease. For example, damaging variants in genes expressed in the relevant tissue might be prioritized over variants in genes that are not expressed in the tissue. Another potential way to prioritize variants is by the evolutionary constraint of the gene.

We developed a sequence-context-based model of de novo variation to create per-gene probabilities of synonymous, missense, and loss-of-function mutations. We noticed a high correlation (0.94) between the probability of a synonymous mutation in a gene and the number of rare, synonymous variants identified in that same gene using the NHLBI’s Exome Sequencing Project data. We predicted the number of variants that we would expect to see in the dataset and, in order to quantify deviations from those expected values, created a Z score of the chi-squared difference between the observed and expected variation. While the distribution of these Z scores for the synonymous variants was normal, there was a marked shift in the missense distribution towards having fewer variants than predicted.
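
As a rough illustration of the deviation score described above, the sketch below computes a signed Z score from observed and expected per-gene variant counts. It is not the speaker's implementation: the function name signed_z, the gene labels, the counts, and the sign convention (positive when fewer variants are observed than expected) are all assumptions made for the example.

```python
import math


def signed_z(observed, expected):
    """Signed square root of the chi-squared term (observed - expected)^2 / expected.

    Assumed convention: positive values mean fewer variants were observed
    than expected, i.e. the gene looks constrained.
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    chi_sq = (observed - expected) ** 2 / expected
    sign = 1.0 if observed < expected else -1.0
    return sign * math.sqrt(chi_sq)


# Hypothetical (observed, expected) missense counts for three made-up genes.
genes = {
    "GENE_A": (40, 100.0),   # far fewer variants than expected
    "GENE_B": (98, 100.0),   # close to expectation
    "GENE_C": (130, 100.0),  # more variants than expected
}

scores = {name: signed_z(obs, exp) for name, (obs, exp) in genes.items()}

# Rank genes from most to least constrained under the assumed convention.
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: Z = {scores[name]:.2f}")
```

Under this convention, a gene with far fewer missense variants than predicted receives a large positive score, which matches the shift in the missense distribution noted in the abstract.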

We identified a list of excessively constrained genes representing roughly 5% of all genes. This set showed enrichment for entries in the Online Mendelian Inheritance in Man (OMIM) database and, in particular, for those with a dominant inheritance pattern. Using published data, we found that de novo loss-of-function variants identified in patients with autism and intellectual disability fell in constrained genes more often than expected (p < 0.0001 for both). This trend did not hold for genes with a de novo loss-of-function variant in a control (p = 0.66), indicating that this approach can effectively prioritize genes in which mutations can strongly predispose to disease.

2014 Working Groups

  • April 22, 2014
    12:30-2:00, Building 2-Room 426
    Kaitlin Samocha, Analytic and Translational Genetics Unit | MGH


Working Group Archive


Please feel free to contact us with any comments or questions at: