Sarah Gagliano

University of Michigan School of Public Health

Dense imputation of the UK Biobank genetic data reveals disease-associated rare loss of function variation

SA Gagliano, D Taliun, W Zhou, J Nielsen, J LeFaive, R Dey, S Das, GR Abecasis

Loss of function (LoF) variants, such as those that introduce a premature stop or shift the reading frame, eliminate or greatly diminish protein action.

To pinpoint disease-associated LoF variants, we imputed variants into the UK Biobank cohort using 60K deeply sequenced individuals from the multi-ethnic TOPMed project as the reference. This expanded the number of variants characterized from 39M to 178M. The vast majority (94%) of the imputed variants are rare (MAF<0.5%), and 0.03% of these rare variants (49,892) are predicted LoF. We then conducted single-variant and gene-burden tests for >1,400 traits.
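To make the gene-burden idea concrete, the following is a minimal sketch of one common way to build a burden score: keep only predicted-LoF variants below the MAF<0.5% cutoff quoted above and sum their dosages per individual. It is an illustration under stated assumptions, not the pipeline used in this study; the function names, the dosage encoding (0 to 2 alternate alleles), and the toy data are all hypothetical.

```python
from typing import Dict, List

# Rarity cutoff taken from the abstract (MAF < 0.5%).
RARE_MAF_THRESHOLD = 0.005


def minor_allele_frequency(dosages: List[float]) -> float:
    """Return the MAF given per-individual alternate-allele dosages (0..2)."""
    alt_freq = sum(dosages) / (2 * len(dosages))
    return min(alt_freq, 1 - alt_freq)


def burden_scores(
    genotypes: Dict[str, List[float]], lof_variants: List[str]
) -> List[float]:
    """Sum the dosages of rare predicted-LoF variants for each individual.

    `genotypes` maps a (hypothetical) variant ID to one dosage per individual;
    `lof_variants` lists the variant IDs in one gene that are predicted LoF.
    """
    rare = [
        v for v in lof_variants
        if minor_allele_frequency(genotypes[v]) < RARE_MAF_THRESHOLD
    ]
    n_individuals = len(next(iter(genotypes.values())))
    return [sum(genotypes[v][i] for v in rare) for i in range(n_individuals)]


# Toy example: 200 individuals, two rare LoF variants and one common variant.
n = 200
toy = {
    "var_a": [1] + [0] * (n - 1),        # one heterozygous carrier
    "var_b": [0, 1] + [0] * (n - 2),     # a different single carrier
    "var_common": [1] * n,               # MAF 0.5, excluded by the filter
}
scores = burden_scores(toy, ["var_a", "var_b", "var_common"])
```

In a real analysis the per-individual score would then be tested against each trait, typically in a regression model with covariates; that step is omitted here.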

In the single-variant analyses, we identified five rare LoF variants (not found in the 39M dataset) associated with disease. These included a frameshift in CHEK2 and a gained stop in PALB2, both associated with breast cancer. Although these variants are present in ClinVar, this is the first time they have been characterized in a generally healthy adult population.

Beyond single-variant analyses, we found burden signals in genes previously implicated in familial disease but for which no single LoF variant reached significance, such as the association between USH2A LoF burden and hereditary retinal dystrophies.

We demonstrate that association studies in biobanks can yield pathogenic findings previously detected only in clinical cases or difficult-to-collect family cohorts.