Manuscript: Efficient Cross-Trait Penalized Regression Increases Prediction Accuracy in Large Cohorts using Secondary Phenotypes (Chung et al., 2019)
Download: CTPR GitHub
Description: The CTPR software was developed for multi-trait polygenic risk prediction in large cohorts. It uses multiple secondary traits, drawing on individual-level genotypes and/or summary statistics from large-scale GWAS, to improve prediction accuracy. Building on penalized least squares methods, we propose a novel cross-trait penalty function, combined with the Lasso and the minimax concave penalty (MCP), that incorporates the genetic effects shared across multiple traits, and we implement it for large-sample GWAS data. Our approach extracts information from the secondary traits that is beneficial for predicting the primary trait and down-weights information that is not. Our novel implementation of a distributed-memory parallel computing algorithm makes it feasible to apply our methods to biobank-scale GWAS data. We compared our multi-trait methods with existing methods such as MTGBLUP and MTAG and showed that our approach outperforms them in predictive performance.
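As a rough illustration of the penalty terms named above, the following sketch evaluates the Lasso and MCP penalties for a single coefficient, plus a simple cross-trait term for one SNP's effects across traits. The Lasso and MCP formulas are the standard ones. The cross-trait term (a quadratic penalty on pairwise differences between traits) and all function and parameter names here are assumptions made for illustration; the exact CTPR penalty is defined in the manuscript and is not reproduced on this page.

```python
def lasso_penalty(beta, lam):
    """Standard Lasso penalty for one coefficient: lam * |beta|."""
    return lam * abs(beta)


def mcp_penalty(beta, lam, gamma):
    """Standard minimax concave penalty (MCP) for one coefficient.

    Equals lam*|beta| - beta^2 / (2*gamma) while |beta| <= gamma*lam,
    and is constant at gamma*lam^2 / 2 beyond that point.
    """
    t = abs(beta)
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def cross_trait_penalty(betas, lam2):
    """Hypothetical cross-trait term for ONE SNP (an assumption, not CTPR's code).

    betas holds that SNP's effect on each trait. The term grows when the
    effects disagree across traits, which pulls shared effects together.
    """
    total = 0.0
    for i in range(len(betas)):
        for j in range(i + 1, len(betas)):
            total += (betas[i] - betas[j]) ** 2
    return 0.5 * lam2 * total


if __name__ == "__main__":
    # One SNP with effects on a primary trait and two secondary traits.
    effects = [0.20, 0.25, 0.05]
    sparsity = sum(mcp_penalty(b, lam=0.1, gamma=3.0) for b in effects)
    sharing = cross_trait_penalty(effects, lam2=0.5)
    print(sparsity + sharing)
```

Unlike the Lasso, MCP stops increasing once a coefficient is large, so strong effects are not shrunk further. In this sketch the second tuning parameter, `lam2`, controls how strongly effects are encouraged to agree across traits.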
Manuscripts: A Genome-Wide Cross-Trait Analysis from UK Biobank Highlights the Shared Genetic Architecture of Asthma and Allergic Diseases (Zhu et al., 2018); Integrative Approaches for Large-scale Transcriptome-Wide Association Studies (Gusev et al., 2016); and Heritability and Genomics of Gene Expression in Peripheral Blood (Wright et al., 2014)
Description: To estimate twin-based heritability and conduct eQTL analysis, we used the standard ACE model, which includes additive genetic (A), common environmental (C), and unique environmental (E) effects. For each transcript, the model yields estimates of the twin-based heritability and the shared environmental effect. The ACE model can be fit using variance-component maximization with the restricted profile likelihood.
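To make the three ACE components concrete, the sketch below uses Falconer's moment estimates from monozygotic (MZ) and dizygotic (DZ) twin correlations. This is a simpler, swapped-in technique and not the restricted-likelihood fit described above; the function name and the example correlations are hypothetical.

```python
def falconer_ace(r_mz, r_dz):
    """Falconer's moment estimates of the ACE variance proportions.

    r_mz, r_dz: trait correlations within MZ and DZ twin pairs.
    Returns (a2, c2, e2): additive genetic, common environmental, and
    unique environmental proportions. The three always sum to 1.
    """
    a2 = 2.0 * (r_mz - r_dz)  # MZ pairs share twice the additive genetic variance of DZ pairs
    c2 = 2.0 * r_dz - r_mz    # similarity not explained by additive genetics
    e2 = 1.0 - r_mz           # whatever makes MZ twins differ
    return a2, c2, e2


if __name__ == "__main__":
    # Illustrative correlations for one transcript (made-up numbers).
    a2, c2, e2 = falconer_ace(r_mz=0.6, r_dz=0.4)
    print(a2, c2, e2)
```

These moment estimates can fall outside the range 0 to 1 in small samples, which is one reason likelihood-based variance-component fitting is generally preferred in practice.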
Manuscript: Mixed Effects Models for GAW18 Longitudinal Blood Pressure Data (Chung et al., 2014)
Description: This software is implemented on top of the widely used R packages R/qtl (Broman et al., 2003) and R/qtlbim (Yandell et al., 2007). The MCMC algorithm (written in C) and the data manipulation procedures (written in R) were modified to handle longitudinal data. To help choose the optimal number of grid points, gridbayes reports both DIC and simplified BPIC scores for our Bayesian model.
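The model-selection scores mentioned above can be sketched from MCMC output as follows. DIC is computed in its standard form: posterior mean deviance plus the effective number of parameters pD. The simplified BPIC is taken here to be the mean deviance plus 2 * pD, which is an assumption about the form used; the function and argument names are hypothetical and do not come from gridbayes.

```python
def dic_and_bpic(deviance_samples, deviance_at_posterior_mean):
    """Compute DIC and a simplified BPIC from MCMC deviance draws.

    deviance_samples: deviance, -2 * log-likelihood, at each MCMC draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior mean
        of the parameters.
    """
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean  # effective number of parameters
    dic = d_bar + p_d
    bpic = d_bar + 2.0 * p_d  # assumed simplified form: heavier complexity penalty
    return {"d_bar": d_bar, "p_d": p_d, "dic": dic, "bpic": bpic}


if __name__ == "__main__":
    # Made-up deviance draws for one candidate number of grid points.
    scores = dic_and_bpic([10.0, 12.0, 14.0], deviance_at_posterior_mean=10.0)
    print(scores)
```

Under this sketch, one would compute the scores for each candidate number of grid points and prefer the setting with the smallest value.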
Manuscript: Bayesian Parametric and Nonparametric Methods for Multiple QTL Mapping and SNP-Set Analysis (Chung et al., 2013)
Download: GridGP GitHub
Description: This software is implemented on top of the C code developed by Zou et al. (2010) for univariate trait mapping. The MCMC algorithm and data manipulation procedures were modified to handle longitudinal data. To help choose the optimal number of grid points, gridgp reports both DIC and simplified BPIC scores.