Xihong Lin
Professor of Biostatistics
Department of Biostatistics
Education
PhD, 1994, University of Washington
Research
Dr. Lin’s major statistical research interests lie in developing statistical methods for high-dimensional and correlated data. Examples of high-dimensional data include genomic and proteomic data in basic, population and clinical sciences. Examples of correlated data include longitudinal data, clustered data, hierarchical data and spatial data. She is particularly interested in developing statistical and computational methods for “omics” data in population-based studies, such as genetic epidemiology, genetic environmental sciences and clinical studies. She currently serves as the coordinating director of the Program of Quantitative Genomics of the Harvard School of Public Health (http://www.hsph.harvard.edu/pqg). Her statistical research is funded by the MERIT award from the National Cancer Institute on “Statistical Methods for Correlated and High-Dimensional Biomedical Data” (http://www.cancer.gov/researchandfunding/MERIT/Lin).
Dr. Lin’s specific areas of statistical research include statistical learning methods for high-dimensional data, dimension reduction, variable selection, nonparametric and semiparametric regression models, measurement error, mixed (frailty) models, estimating equations and missing data.
Dr. Lin’s areas of application include cancer, genetic epidemiology, gene and environment, genome-wide association studies, genomics in population science, biomarkers and proteomics.
Publications
Liu, D., Lin, X., Ghosh, D. (2007) Semiparametric regression for multi-dimensional genomic pathway data: Least square kernel machines and linear mixed models. Biometrics, 63, 1079-1088.
Kwee, L., Liu, D., Lin, X., Ghosh, D., and Epstein, M. (2008) A powerful and flexible multilocus association test for quantitative traits. American Journal of Human Genetics, 82, 386-397.
Harezlak, J., Wu, M. C., Wang, M., Schwartzman, A., Christiani, D. C., and Lin, X. (2008) Biomarker discovery for arsenic exposure using functional data analysis and feature learning of mass spectrometry proteomic data. Journal of Proteome Research, 7, 217-224.
Liu, D., Ghosh, D. and Lin, X. (2008) Estimation and Testing for the Effect of a Genetic Pathway on a Disease Outcome Using Logistic Kernel Machine Regression via Logistic Mixed Models. BMC Bioinformatics, 9, 292.
Wu, M. C., Zhang, L., Wang, Z., Christiani, D. C., Lin, X. (2008) Sparse linear discriminant analysis for simultaneous gene set/pathway significance test and gene selection. Bioinformatics, in press.
Yu, Z. and Lin, X. (2008) Nonparametric regression using local kernel estimating equations for correlated failure time data. Biometrika, 95, 123-137.
Li, Y., Prentice, R. L. and Lin, X. (2008) Maximum likelihood estimation in semiparametric normal transformation models for bivariate survival data. Biometrika, 95, 947-960.
Long, Q., Little, R. J., Lin, X. (2008) Causal inference in hybrid intervention trials involving treatment choice. Journal of the American Statistical Association, 103, 474-484.