SKAT (SNP Set/Sequence Association Test)
(1) Association tests between a set of common and rare SNPs and continuous and dichotomous (case-control) phenotypes using kernel machine methods for data from GWAS and genome-wide sequencing association studies
(2) Sample size and power calculations for sequencing association studies.
- Lee, Seunggeun, et al. (2012). Optimal Unified Approach for Rare-Variant Association Testing with Application to Small-Sample Case-Control Whole-Exome Sequencing Studies. The American Journal of Human Genetics, 91.2, 224-237.
- Lee, S., Wu, M.C. and Lin, X. (2012). Optimal tests for rare variant effects in sequencing association studies. Biostatistics, 13.4, 762-775. Supplementary Materials.
- Wu, M. C., Lee, S., Cai, T., Li, Y., Boehnke, M. and Lin, X. (2011) Rare Variant Association Testing for Sequencing Data Using the Sequence Kernel Association Test (SKAT). American Journal of Human Genetics, 89.1, 82-93.
- Wu, M. C., Kraft, P., Epstein, M. P., Taylor, D. M., Chanock, S. J., Hunter, D. J., and Lin, X. (2010) Powerful SNP Set Analysis for Case-Control Genome-Wide Association Studies. American Journal of Human Genetics, 86, 929-942.
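As a rough illustration of the kernel machine idea behind SKAT, the sketch below computes a weighted variance-component score statistic for a continuous phenotype with an intercept-only null model. It is a language-neutral toy in Python, not the SKAT package's API: the function names are hypothetical, the Beta(1, 25) weights on minor allele frequency are an assumption based on the weighting described in Wu et al. (2011), and the p-value comes from a simple phenotype permutation rather than the analytic null distribution the package uses.

```python
import random


def beta_1_25_weight(maf):
    """Beta(1, 25) density at the minor allele frequency: 25 * (1 - maf)**24."""
    return 25.0 * (1.0 - maf) ** 24


def skat_q(genotypes, phenotype):
    """Q = sum_j (w_j * sum_i g_ij * (y_i - mean(y)))**2.

    genotypes: list of rows (one per subject) of 0/1/2 allele counts.
    phenotype: list of continuous outcomes, one per subject.
    Assumes an intercept-only null model (no covariates).
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    resid = [y - mean_y for y in phenotype]
    q = 0.0
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        maf = sum(col) / (2.0 * n)  # allele frequency of variant j
        score = sum(g * r for g, r in zip(col, resid))
        q += (beta_1_25_weight(maf) * score) ** 2
    return q


def permutation_p_value(genotypes, phenotype, n_perm=999, seed=0):
    """Permutation p-value for Q (a stand-in for the analytic p-value)."""
    rng = random.Random(seed)
    observed = skat_q(genotypes, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if skat_q(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Rare variants receive larger weights than common ones under this weighting, so a set dominated by rare variants can still contribute to Q even when each variant is carried by only a few subjects.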
MetaSKAT (Meta-analysis for multiple markers)
MetaSKAT is an R package for multiple marker meta-analysis across studies. It can carry out meta-analysis of SKAT, SKAT-O and burden tests with individual-level genotype data or gene-level summary statistics.
- Lee, S., Teslovich, T.M., Boehnke, M. and Lin, X. (2013) General framework for meta-analysis of rare variants in sequencing association studies, American Journal of Human Genetics, in press.
CEPSKAT (Continuous Extreme Phenotype SKAT)
CEPSKAT extends the SKAT framework to the setting of continuous extreme phenotype samples. You can download the R package for CEPSKAT here. For Windows, download the compiled binary version instead. Consult the help files in the package for instructions and examples of usage.
- Barnett, I., Lee, S., Lin, X. (2012) Detecting Rare Variant Effects Using Extreme Phenotype Sampling in Sequencing Association Studies. Genetic Epidemiology. In press.
coxKM (Cox Kernel Machine)
coxKM (Cox Kernel Machine) is an R package for conducting SNP-set association tests for right-censored survival outcomes based on the kernel machine Cox regression framework. It tests for association between a SNP-set and a right-censored survival outcome, and is meant for common genetic variants only. Software download, manual download.
- Lin X, Cai T, Wu M, Zhou Q, Liu G, Christiani D and Lin X. 2011. Survival Kernel Machine SNP-set Analysis for Genome-wide Association Studies. Genetic Epidemiology 35:620-31. doi: 10.1002/gepi.20610
- Cai T, Tonini G and Lin X. 2011. Kernel machine approach to testing the significance of multiple genetic markers for risk prediction. Biometrics, 67:975-86. doi:10.1111/j.1541-0420.2010.01544.x
gSKAT (family based association test)
gskat is an R package that implements a family-based association test via the GEE Kernel Machine (KM) score test. It has functions to perform both the burden test and SKAT with family members as well as unrelated individuals. The package allows for both continuous and discrete traits in the association test. Software download
User groups: Feel free to join the group to ask questions about, discuss, or comment on the package in the forum.
- Wang X, Lee S, Zhu X, Redline S, and Lin X. (2013) GEE-Based SNP Set Association Test for Continuous and Discrete Traits in Family-Based Association Studies. Genet Epidemiol. 37:778-86.
- Lin, X., Lee, S., Wu, M., Wang, C., Chen, H., Li, Z. and Lin, X. Test for rare variants by environment interactions in sequencing association studies. Biometrics, in press.
- Lin, X., Lee, S., Christiani, D. C., and Lin, X. (2013). Test for the Interaction between a Genetic Marker Set and Environment in Generalized Linear Models. Biostatistics, 14: 667-681. doi:10.1093/biostatistics/kxt006.
GMMAT (Generalized linear Mixed Model Association Test)
GMMAT is an R package for performing genetic association tests for outcomes with distribution in the exponential family (e.g. binary outcomes) based on the generalized linear mixed model. It can be used to analyze genetic data from individuals with population structure and relatedness. GMMAT fits a generalized linear mixed model under the null hypothesis of no genetic association, and then performs a score test for each individual genetic variant.
- Breslow NE and Clayton DG. 1993. Approximate Inference in Generalized Linear Mixed Models. Journal of the American Statistical Association 88: 9-25.
- Chen H, Wang C, Conomos MP, Stilp AM, Li Z, Sofer T, Szpiro AA, Chen W, Brehm JM, Celedon JC, Redline SS, Papanicolaou GJ, Thornton TA, Laurie CC, Rice K and Lin X. Control for Population Structure and Relatedness for Binary Traits in Genetic Association Studies Using Logistic Mixed Models. Submitted.
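The "fit the null model once, then score each variant" workflow described above can be illustrated with a deliberately simplified sketch. The Python below runs a single-variant score test for a binary outcome under an intercept-only logistic null model. It omits the random effects for population structure and relatedness that are the point of GMMAT, so it is only the fixed-effects special case, and `score_test` is a hypothetical name rather than a function from the package.

```python
import math


def score_test(genotype, outcome):
    """Score test for one variant under an intercept-only logistic null.

    genotype: list of allele counts (0/1/2), one per subject.
    outcome:  list of 0/1 case-control labels, one per subject.
    Returns (chi-square statistic with 1 df, p-value).
    Assumption: no covariates and no random effects (unlike GMMAT).
    """
    n = len(outcome)
    p0 = sum(outcome) / n  # fitted probability under the null
    g_bar = sum(genotype) / n
    # Score: covariance-like sum between genotype and null residuals.
    u = sum(g * (y - p0) for g, y in zip(genotype, outcome))
    # Variance of the score after projecting out the intercept.
    v = p0 * (1.0 - p0) * sum((g - g_bar) ** 2 for g in genotype)
    if v == 0.0:
        return 0.0, 1.0  # monomorphic variant or constant outcome
    stat = u * u / v
    # Upper tail of a chi-square with 1 df: erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

Because the null model does not involve the variant being tested, `p0` only needs to be estimated once and can be reused for every variant, which is what makes score tests attractive for genome-wide scans.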
SMAT (Scaled Multiple-phenotype Association Test)
The current version of the R package is 0.98. Please download the source .tar.gz file or the .zip file for installation. Please download the manual PDF here. Some example files are also available for download.
- Schifano, E.D., Li, L., Christiani, D.C., and Lin, X. (2012) Genome-wide Association Analysis for Multiple Continuous Secondary Phenotypes. (in revision)
- Roy, J., Lin, X., and Ryan, L. (2003). Scaled Marginal Models For Multiple Continuous Outcomes. Biostatistics, 4, 371-384.
- Lee, S., Epstein, M.P., Duncan, R. and Lin, X. (2012) Sparse principal component analysis for identifying ancestry-informative markers in genome-wide association studies. Genetic Epidemiology, 36.4, 293-302.
sLDA Pathway Test
Logistic Kernel Machine
- Wu, M. C., Zhang, L., Wang, Z., Christiani, D. C., and Lin, X. (2009) Sparse linear discriminant analysis for simultaneous gene set/pathway significance test and gene selection. Bioinformatics, 25, 1145-1151.
- Liu, D., Ghosh, D. and Lin, X. (2008) Estimation and Testing for the Effect of a Genetic Pathway on a Disease Outcome Using Logistic Kernel Machine Regression via Logistic Mixed Models. BMC Bioinformatics, 9, 292.
- Liu, D., Lin, X. and Ghosh, D. (2007) Semiparametric Regression of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines and Linear Mixed Models. Biometrics, 63, 1079-1088.
SAS Macro SPMM
SAS Macro Spline_Mixed
SAS Macro GAMM1
- Zhang D., Lin X., Raz J., and Sowers M. (1998). Semiparametric stochastic mixed models for longitudinal data, Journal of the American Statistical Association, 93, 710-719.
- Lin X. and Zhang D. (1999). Inference in generalized additive mixed models using smoothing splines, Journal of the Royal Statistical Society, Series B, 61, 381-400.
- Zhang D., Lin X. and Sowers M. (2000). Periodic semiparametric regression for longitudinal hormone data from multiple menstrual cycles. Biometrics, 56, 31-39.