Jill Moore
University of Massachusetts Medical School

Systematic Evaluation of Methods for Linking Enhancers with Target Genes

Jill Moore, Kyle Garnick, Michael Purcaro, Henry Pratt, and Zhiping Weng

During the third phase of the ENCODE project, we developed the Registry of candidate Regulatory Elements (cREs), a collection of over 1.3 million human and 430 thousand mouse putative regulatory regions. The target genes for the majority of these cREs are unknown, as 75% have enhancer-like signatures (ELS) and are distal from transcription start sites (TSSs). While many labs have developed computational methods for predicting enhancer targets, these methods are trained and tested on different datasets, making comparisons difficult. Therefore, to evaluate these methods, we developed a benchmark of ELS-gene pairs using Hi-C, ChIA-PET, and eQTL datasets. Using this benchmark, we evaluated unsupervised (e.g., signal correlation) and supervised target gene prediction methods. Overall, correlation-based approaches performed poorly, while most supervised machine learning methods had comparably high performance. TargetFinder, developed by the Pollard lab, consistently had the best performance across all benchmark datasets. We then modified TargetFinder to run with fewer input features and, using this model, identified a novel multiple sclerosis GWAS risk gene. Our work establishes a pipeline for generating a benchmark of enhancer-gene pairs, which we hope to use to develop and evaluate new target gene prediction methods.